Modern compilers and tool chains are sometimes confronted with very large projects that may require substantial compilation times. Compared to the overall processing time, preprocessing should have a negligible share. Nevertheless, preprocessing, as much as any other computing effort, should not waste resources.
To stay within reasonable bounds, before even discussing implementation aspects, we have to look at the inherent complexity of preprocessing: if we don't take care of complexity, any effort to keep preprocessing scalable is doomed. A good measure for complexity here is IO cost, namely the amount of data that is read *In* (basically the combined size of the source files) and written *Out* (the size of the produced token stream). Preprocessing should not take more than O(In + Out) resources of any kind. In particular, it should not take more than linear time (total CPU or wall clock), memory or power.
Standard C macros are not well-equipped for linear processing. This is due to a combination of two features that dominate macro processing of unbounded argument lists:
With C23, C gains a conditional macro operator, `__VA_OPT__(XYZ)`. It only expands to the tokens that are given in the parentheses, here `XYZ`, if the variable argument list of the invoking macro is not empty. It has been proposed to use this mechanism in a future version of C (and C++) for recursion. EĿlipsis has tools to emulate such a feature by defining stacks of macros, see `ellipsis-recursive.dirs`, so you may experiment whether this would be a suitable tool for you.
With the proposed syntax, a typical recursive C macro could look similar to the following.
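A minimal sketch, assuming the proposed feature would allow the self-reference directly inside `__VA_OPT__` (the exact spelling of the proposal may differ):

```c
/* Sketch only: the recursive reference is guarded by __VA_OPT__, so it is
   kept as long as the variable argument list is not empty. */
#define RECURSIVE(X, ...)   #X __VA_OPT__(, RECURSIVE(__VA_ARGS__))
```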
When invoked, this has the following effects:

- `X` is taken out of the argument list and stringified.
- If the remaining variable argument list is not empty, `RECURSIVE` is invoked again on that remainder.

In the example, this would take a comma-separated list of words (say, the constants of an enumeration type) and produce a list of strings that contain the names. This could, for example, be convenient for a print function for the enumeration type.
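As a hedged illustration of that use case (the enumeration, the array and their names are made up):

```c
enum color { red, green, blue };

/* With the proposed recursion feature (or its eĿlipsis emulation), the macro
   would generate the string literals "red", "green", "blue", which a print
   function for enum color could then index by the constant's value. */
static char const *const color_name[] = { RECURSIVE(red, green, blue) };
```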
But contrary to what the compact notation suggests, this call mechanism has a high complexity: for each such call the whole argument list is expanded. This is because argument expansion is performed before any other token of the replacement list is interpreted. In particular, the closing parenthesis that marks the end of the invocation of `RECURSIVE` is only seen once all tokens in `__VA_ARGS__` have been expanded.
Indeed, if this macro receives a list of N elements, it alone (without accounting for the recursive call) will touch 2N−1 tokens, namely the list elements themselves and the commas between them. So, in total, this recursive approach will touch (2N−1) + … + 5 + 3 + 1 = N² tokens. For example, a three-element list leads to successive calls that touch 5, 3 and 1 tokens, that is 9 = 3² in total.
Let's have a look at what could be typical lists that we might want to handle:
| list | elements | touched tokens | verdict |
|---|---|---|---|
| small | 10 | 100 | ok |
| Fibonacci numbers < 2⁶⁴ | 100 | 10,000 | ~ |
| XML entities | 2,200 | 4,840,000 | limit |
| Unicode character values | 150,000 | 22,500,000,000 | intractable |
For a small enumeration, this is certainly a suitable approach. For a list such as the Fibonacci numbers it is still tractable, but recursion already shows a substantial overhead. Treating all XML entities would already be challenging for the recursion itself (a depth of 2200!) and the complexity of touching more than 4 million tokens is clearly disproportionate. Handling lists of Unicode characters becomes prohibitive for the recursion depth alone and completely intractable from a complexity point of view.
With the current expansion rule in place, C macro recursion is not an adequate tool to process long lists.
So unless we invent a tool that avoids list expansion on all levels, recursion is not the right tool to process even moderately long lists.
Besides macro recursion, include recursion is another approach to perform iterative tasks. EĿlipsis has several extensions that ensure linearity for these and that keep the recursion bounded to some reasonable depth, say less than 10.
The first of these tools is for iteration with arithmetic. A predefined header file `ellipsis-do.dirs` and predefined operators such as `__EXPAND_DEC__` provide domain iteration within linear time.
Processing argument lists (such as all constants of an enumeration) would have the same quadratic complexity as recursion, simply because each recursion level again would have linear cost in the size of the lists that are processed. To face this difficulty, eĿlipsis proposes directives and builtins as extensions:
- `__ANY__(ID)` checks if identifier `ID` is either not a macro, or if it would expand to a non-empty token list if it were expanded.

With these extension directives, recursion can be replaced by iteration that has a cost of O(1) per element, and thus an overall cost that is linear. For example, when using a tool such as `ellipsis-loop.dirs`, the above could be coded by putting a handful of such directives in a file named by the macro `LOOP_DIRECTIVES`. Since none of these directives expands `LIST`, processing even a long list is still linear.
The new C23 feature `__VA_OPT__` has the advantage that it allows querying the `__VA_ARGS__` list without expanding it. So it is a first tool that allows us to stay within linear complexity, as long as `__VA_ARGS__` is not passed on into other macros.
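A minimal sketch of that property in plain C23 (the macro name `HAS_ARGS` is made up for illustration):

```c
/* __VA_OPT__ reacts to whether the variable argument list is empty; the
   arguments themselves never appear in the replacement, so they are not
   expanded. */
#define HAS_ARGS(...)  (0 __VA_OPT__(+ 1))

int empty    = HAS_ARGS();      /* expands to (0)     */
int nonempty = HAS_ARGS(1, 2);  /* expands to (0 + 1) */
```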
We provide an extension that builds upon that idea, `__VA_TAIL__`. A call of the form `__VA_TAIL__(ID)` (or `__VA_TAIL__()`) makes a tail call into the macro `ID` and passes the `__VA_ARGS__` list to it without further expansion. A macro that uses this must respect the following constraints.
A `__VA_TAIL__` construct is only inspected at the very end of processing of the macro call. Then, it is expected to have been resolved to one of the valid forms: `__VA_TAIL__(ID)`, where `ID` is the name of a function-like macro, or `__VA_TAIL__()`.

This ensures that when such a tail recursion goes deeper, the variable argument list gets shorter and shorter as the initial arguments are assigned to the named parameters along with the calls. Once the variable argument list is empty, the bottom of the recursion is reached and that last `__VA_TAIL__` construct is ignored.
The `__VA_TAIL__` construct guarantees linear complexity for macro calls (including recursion) in the size of the input list.
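A hedged sketch of how the stringification example from above might look with this extension; the name `STRINGS` is made up and the exact spelling of an eĿlipsis solution may differ:

```c
/* Each call stringifies its first argument and tail-calls itself on the
   remaining __VA_ARGS__, which are handed over without expansion.  When the
   variable argument list is empty, the final __VA_TAIL__ is ignored and the
   recursion stops, so every list element is touched only O(1) times. */
#define STRINGS(X, ...)  #X __VA_OPT__(,) __VA_TAIL__(STRINGS)

/* STRINGS(red, green, blue) would produce "red", "green", "blue" */
```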
Besides ensuring the overall linear complexity of preprocessing, our implementation of eĿlipsis tries to be efficient where possible, such that constant factors in the cost of processing are reduced. Preliminary tests have shown that eĿlipsis is about 5 to 10 times slower than, for example, the preprocessing phase of `gcc`. In view of what is said above (preprocessing contributing only a small fraction to the overall cost of compilation) we think this is adequate for such a young project.
If and when eĿlipsis becomes a bottleneck in some development process, we'd certainly have to look into the implementation for possible improvements. Among the places to look at are the calls to `malloc` and `realloc` for the allocation of tokens (see `ellipsis‿token`) and strings (mostly `ellipsis‿str32`). If this turns out to be a limiting factor, a better application-level recycling of such items could be considered.