export

part of shnell – a source to source compiler enhancement tool

© Jens Gustedt, 2019

A tool to introduce prefix naming conventions.

Usage:

#pragma CMOD amend export [ LPREFIX ] [ : [ PREFIX0 PREFIX1 ... ] ]

with the following definitions:

All of these have suitable defaults and this directive is best used indirectly together with the implicit directive through the TRADE dialect or the trade compiler prefix.

Defaults

None of the identifiers used above may be strictly reserved; that is, none of

LPREFIX PREFIX0 PREFIX1 ...
 shall start with an underscore that is followed by a capital letter
 or a second underscore.
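
The rule above can be illustrated with a small shell check. This is only a sketch: the function name is_reserved is hypothetical and is not part of this module, whose own check (checkNames) may apply further conditions.

```shell
# is_reserved NAME (hypothetical helper): succeed if NAME starts with an
# underscore followed by a capital letter or by a second underscore.
is_reserved () {
    case "$1" in
        _[[:upper:]]*|__*) return 0 ;;
        *)                 return 1 ;;
    esac
}

# Candidate prefixes; only "string" and "_local" pass the rule.
for name in string _Strong __type _local; do
    if is_reserved "$name"; then
        echo "$name: strictly reserved, not usable as a prefix"
    else
        echo "$name: ok"
    fi
done
```

A single leading underscore followed by a lower case letter is not rejected by this particular rule.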

Either of the two parts above may be empty:

So if neither LPREFIX nor any PREFIXx is given (directly or via alias), the filename must be known and is used to determine all naming conventions.

Linkage of identifiers

This tool does not handle objects, functions and types (including tag names) that are only used but neither declared nor defined. This holds in particular for types that are only forward declared (as in struct toto;). For them we assume that naming issues are taken care of by whichever TU defines them.

There are several categories of local identifiers, that is identifiers that are defined in the current TU:

If an identifier could have internal or external linkage by (1) or (2), internal linkage prevails. This is so that you may make an identifier private that otherwise would have external linkage. To simplify the use of such an internal identifier you may still use the short form; it is rewritten to the internal form.

All identifiers that have internal linkage by these rules are “private”, those that have external linkage are “public”. Those with no linkage in (0) are in a gray zone, but since we force privacy on all macros, there should be no bad interaction between foreign macros and such local identifiers.

Composite identifiers

There are several types of identifiers that are dealt with by this tool:

A unique choice of prefixes for the TU within the same project guarantees that such an identifier can never clash with another one in the same project.

As mentioned, the separator that is used for joining external name components is SHNELL_SEPARATOR. In the case of the special token :: the output names are also mangled. Mangling is done according to the mangle shnell-module. In particular, the prefix manglePrefix is used from there. There are special conventions:

Evidently you’d have to be careful that your identifiers fit with the convention you are using. For C and P only components that start with a lower case letter and do not contain an underscore are transformed. In particular, components that start with an underscore are left alone.

SHNELL_SEPARATOR defaults to “::”, but can be set through the environment variable of the same name.

export SHNELL_SEPARATOR="${SHNELL_SEPARATOR:-::}"
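
The effect of the default expansion can be tried in isolation. The helper effective_separator below is hypothetical and only repeats the assignment above in a subshell; note that with the :- form an empty value is treated like an unset one.

```shell
# effective_separator (hypothetical helper): print the separator that the
# assignment above would yield for the current setting of the variable.
effective_separator () {
    ( export SHNELL_SEPARATOR="${SHNELL_SEPARATOR:-::}"
      printf '%s\n' "$SHNELL_SEPARATOR" )
}

( unset SHNELL_SEPARATOR; effective_separator )   # unset: default applies
( SHNELL_SEPARATOR=;      effective_separator )   # empty: default applies
( SHNELL_SEPARATOR=_;     effective_separator )   # explicit choice is kept
```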

main is special

As for traditional C, the entry point main can be treated specially. If you name a function stdc::main (which you will probably do if you also use implicit) the following strategy is applied:

All of this allows you to have one entry point per TU without creating conflicts. This is particularly useful to implement a test program for your TU in place.

Header files

To determine the identifiers that are defined in the program, a header file is produced and stored in a directory named by the SHNELL_INCLUDE environment variable, if any:

Examples:

Without mangling and a simple universal prefix

SHNELL_SEPARATOR equal to “_” and

#pragma CMOD amend export : string : b

This defines a TU where the naming convention is independent of the filename. All global, non-static variable and function names are made externally visible with a string_ prefix, if they don’t have one yet. If, in addition, we have three identifiers (EMPTY, INIT, and GET) that are forced to be public (e.g. by using string::EMPTY, string::INIT, and string::GET in their definition), they have external names (string_EMPTY, string_INIT, and string_GET), but within this TU the local names (EMPTY, INIT, and GET) or short names may be used just the same.

In addition to all statically defined identifiers there is the identifier b that is forced to be internal. Again, within this TU it can be accessed as string::b or b, but externally it has a name that is something weird, hidden by name obfuscation.

One additional identifier is special, string itself. With the setting as described here, it is left alone. This identifier should generally be reserved for the principal feature of this TU, such as the central data type or function that is defined in the TU. It is always external.
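
For this unmangled setting, the mapping from the composite form to the external name amounts to replacing each :: by the separator. The following is a minimal sketch of that mapping only, not the module's implementation; the function external_name and its argument order are assumptions made for illustration.

```shell
# external_name SEP NAME (hypothetical helper): replace every "::" in NAME
# by SEP. SEP is assumed to consist of identifier characters only.
external_name () {
    sep=$1 name=$2
    printf '%s\n' "$name" | sed "s/::/$sep/g"
}

# The three public identifiers of the example, with "_" as separator.
for id in EMPTY INIT GET; do
    external_name _ "string::$id"
done
```

The sketch does not cover identifiers that are forced to be internal, such as b, whose external names are obfuscated by the tool.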

Without mangling and a short and long prefix

SHNELL_SEPARATOR equal to “_” and

#pragma CMOD amend export string : strong type : b

Again, this defines a TU where the naming convention is independent of the filename. All global, non-static variable and function names are augmented to internally have a strong::type:: prefix and externally a strong_type_ prefix, if they don’t have one yet. The three identifiers (EMPTY, INIT, and GET) are forced to have external names as above, namely strong_type_EMPTY, strong_type_INIT, and strong_type_GET, but within this TU the long names (strong::type::EMPTY, strong::type::INIT, and strong::type::GET), short names (string::EMPTY, string::INIT, and string::GET) and local names (EMPTY, INIT, and GET) may be used just the same.

As above, in addition to all statically defined identifiers there is the identifier b that is forced to be internal. Again, within this TU it can be accessed as string::b or b, but externally it has a name that is something weird, but distinguishable from the other weird form that b would have in the previous setting.

Again, string itself is special. Within the TU, it is left alone, but to the outside it is visible as strong::type, and that name can also be used internally just as string.

This naming scheme can also be made dependent on the source filename. If the file were named strong-type.c, the strong type part above could be omitted.
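
How a filename such as strong-type.c could yield the components strong and type can be sketched as follows. The splitting at hyphens is inferred from this single example and is an assumption, as is the function name prefix_from_filename; the module may apply different or additional rules.

```shell
# prefix_from_filename FILE (hypothetical helper): derive prefix components
# from a source filename. Assumption: components are separated by "-".
prefix_from_filename () {
    base=${1##*/}      # drop leading directories
    base=${base%.*}    # drop the extension
    printf '%s\n' "$base" | sed 's/-/ /g'
}

prefix_from_filename src/strong-type.c
```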

With mangling

If SHNELL_SEPARATOR is equal to “::” the internal names are exactly the same as above. The outside visible forms are mangled by using the PREFIXx components. For the strong type example this would result in something like _ZN2_C6strong4type5EMPTYE, _ZN2_C6strong4type4INITE, and _ZN2_C6strong4type3GETE, but you should not care much. Within this TU the short, long and local names (EMPTY, INIT, and GET) may be used just the same as above.
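
The mangled names shown above follow a length-prefixed pattern: each component is preceded by its length, and the whole is wrapped in _ZN ... E. The sketch below reproduces only that pattern. The leading component _C is copied from the example output and the function name mangle_sketch is hypothetical; the actual rules, including the special conventions, are those of the mangle shnell-module.

```shell
# mangle_sketch COMPONENT... (hypothetical helper): prefix each component
# with its length and wrap the result in "_ZN" ... "E".
# Assumption: "_C" is always the first component, as in the example above.
mangle_sketch () {
    out=_ZN
    for comp in _C "$@"; do
        out=$out${#comp}$comp
    done
    printf '%sE\n' "$out"
}

mangle_sketch strong type EMPTY
mangle_sketch strong type INIT
mangle_sketch strong type GET
```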

Implementation considerations

We use sorting to have unique lists of symbols. Therefore we must ensure that the collating sequence of the locale is ignored.

export LC_COLLATE=C
SORT="${SORT:-sort}"
SORTUNIQ="${SORTUNIQ:--u}"
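
A possible way to use these settings for a list of symbols is sketched below. How the module actually invokes the two variables is not shown here, so the helper unique_symbols and its invocation are assumptions. Note also that a set LC_ALL takes precedence over LC_COLLATE.

```shell
export LC_COLLATE=C
SORT="${SORT:-sort}"
SORTUNIQ="${SORTUNIQ:--u}"

# unique_symbols (hypothetical helper): read one symbol per line from
# standard input and print each distinct symbol once, in sorted order.
unique_symbols () {
    "$SORT" "$SORTUNIQ"
}

# The duplicate entry string_b is printed only once.
printf '%s\n' string_b string_GET string_b string_EMPTY | unique_symbols
```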

Coding and configuration

The following code is needed to enable the sh-module framework.

SRC="$_" . "${0%%/${0##*/}}/import.sh"

Imports

The following sh-modules are imported:

Details

 

Declared functions

checkNames: check if the identifier components are conforming

compile: The actual work of this shnell-module is done here. (It can’t have the name export because that is a shell keyword.)