part of shnell – a source to source compiler enhancement tool
© Jens Gustedt, 2018
Generate sed expression to replace a meta-variable
This produces a regular expression to evaluate different forms and surroundings of meta-variables given as ${NAME}:
- #${NAME} is stringification of the content
- ## ${NAME} joins the content to the left
- ## ${NAME} ## joins the content to the left and to the right
- ${NAME} ## joins the content to the right
- ${NAME} without any surrounding # or ## operators is simple replacement with the contents
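The forms listed above can be approximated with a handful of sed substitutions. This is only a minimal sketch, not the expression that shnell actually generates. The function name expand_var is hypothetical, values are assumed to be plain alphanumeric text (no /, & or \), and each call handles one meta-variable. It covers neither the ### case nor the token separation between directly adjacent meta-variables.

```sh
# expand_var NAME VALUE  (hypothetical helper, not part of shnell)
# Reads source text on stdin and expands ${NAME} in its different forms.
# Assumption: VALUE contains no '/', '&' or '\' characters.
expand_var() {
    name=$1
    value=$2
    # Extended regular expression matching the literal text ${NAME}.
    var="[\$][{]${name}[}]"
    # Substitutions, in order:
    #  1. a single # directly in front of the variable: stringify
    #  2. ## to the left: drop the operator and surrounding white space
    #  3. ## to the right: drop the operator and surrounding white space
    #  4. whatever remains is simple replacement
    sed -E \
        -e "s/(^|[^#])#${var}/\\1\"${value}\"/g" \
        -e "s/[[:space:]]*##[[:space:]]*(${var})/\\1/g" \
        -e "s/(${var})[[:space:]]*##[[:space:]]*/\\1/g" \
        -e "s/${var}/${value}/g"
}

printf '%s\n' 'long a = ${A}##L;'    | expand_var A 5       # long a = 5L;
printf '%s\n' 'var ## ${A}'          | expand_var A 5       # var5
printf '%s\n' 'char s[] = #${NAME};' | expand_var NAME top  # char s[] = "top";
```

Several variables can be expanded by chaining calls in a pipeline, one call per variable.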
String literal and character literal prefixes
Sometimes we might have ${NAME} representing a string or character prefix (such as L) as in ${NAME} ## "something" or ${NAME} ## 'z'. The prefix is then joined to the string literal to form a single string token L"something" or L'z'.
Otherwise, remember that C string literals don’t need explicit concatenation with ##: two string literals that directly follow each other are joined in a later translation phase.
Precedence
In theory, ## binds to the meta-variable that was defined first. E.g., ${HEI} ## ${HOI} binds to the variable HOI if HOI was defined in a scope further out than HEI, or earlier in the same bind directive. Under normal circumstances you should not notice this: the contents of the two variables are simply glued together.
There is an ambiguity for ${HEI}###${HOI}. This stringifies ${HOI} regardless of the order of definition and then glues ${HEI} to it as a string prefix.
There should never be more than three # in a row.
Tokens and white space
Generally, shnell
pragmas are broken into tokens by white space. Sometimes it may be convenient to also include spaces or tabs into words, e.g if we want to assign a whole list of tokens to a meta-variable, or if some data naturally contains spaces. This can be achieved by placing a backspace character \
in front of a space such as in L\ u\ U\ u8
for a list of string prefixes, or as in Elster,\ Frau
for a combination of data that belongs to a single item.
Escaped spaces behave differently depending on how the meta-variable is expanded:
If expanded directly, without stringification, the result is a list of tokens that are split at the escaped spaces.
When stringified, the resulting string is still one single token.
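The two cases can be illustrated with a small shell sketch. It only mimics the described behavior and is not taken from shnell. The helper names unescape_spaces, count_tokens and stringify are hypothetical, and the value is assumed to contain no double quotes.

```sh
# Hypothetical helpers for illustration; none of these names come from shnell.

# Turn each escaped space "\ " into a plain space.
unescape_spaces() {
    printf '%s\n' "$1" | sed 's/\\ / /g'
}

# Direct expansion: count the tokens obtained by splitting at the spaces.
count_tokens() {
    set -f                          # no pathname expansion while splitting
    set -- $(unescape_spaces "$1")  # unquoted on purpose: split into words
    set +f
    echo "$#"
}

# Stringification: the same text, kept together as one string literal.
stringify() {
    printf '"%s"\n' "$(unescape_spaces "$1")"
}

FN='Elster\ ,\ Frau'
unescape_spaces "$FN"   # Elster , Frau
count_tokens "$FN"      # 3
stringify "$FN"         # "Elster , Frau"
```

The same value thus yields three separate tokens when expanded directly, but a single token when stringified.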
Examples
With ${A} containing 5, ${NAME} containing top, ${EXT} containing U, and ${FN} containing Elster\ ,\ Frau, we obtain
source | replacement | remark
---|---|---
int a = ${A}; | int a = 5; | simple token
long a = ${A}L; | long a = 5L; | variable and alnum, 1 token
long a = ${A}##L; | long a = 5L; | same with ##, 1 token
unsigned u = ${A}${EXT}; | unsigned u = 5 U; | invalid C, 2 tokens
unsigned u = ${A}##${EXT}; | unsigned u = 5U; | ## necessary, 1 token
var${A} | var5 | alnum and variable, 1 token
var ## ${A} | var5 | same with ##, 1 token
var ${A} | var 5 | invalid C, 2 tokens
${NAME}${A} | top 5 | invalid C, 2 tokens
${NAME} ${A} | top 5 | invalid C, 2 tokens
${NAME} ## ${A} | top5 | ## necessary, 1 token
char s[] = #${NAME}; | char s[] = "top"; | stringification
wchar_t s[] = L#${NAME}; | wchar_t s[] = L"top"; | string prefix, 1 token
char32_t su[] = ${EXT}## #${NAME}; | char32_t su[] = U"top"; | ## necessary, 1 token
char s[] = #${FN}; | char s[] = "Elster , Frau"; | stringification, 1 token
enum { ${FN} }; | enum { Elster , Frau }; | 3 tokens, including comma
Coding and configuration
The following code is needed to enable the sh-module framework.

    SRC="$_" . "${0%%/${0##*/}}/import.sh"
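The nested parameter expansion in that line locates import.sh in the directory of the running script. As a minimal sketch of how it evaluates, the function below applies the same expansion to an arbitrary path instead of $0. The name script_dir and the example path are made up for illustration.

```sh
# script_dir PATH  (hypothetical helper for illustration only)
# Mirrors the expansion ${0%%/${0##*/}} applied to PATH instead of $0.
script_dir() {
    path=$1
    base=${path##*/}                   # strip everything up to the last '/'
    printf '%s\n' "${path%%/${base}}"  # strip the trailing "/<base>"
}

script_dir /opt/tools/bin/example.sh   # /opt/tools/bin
```

If the path contains no '/', nothing is stripped and the path is returned unchanged.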
Imports
The following sh-modules are imported: