By default, KAI C++ automatically instantiates templates. This
means that KAI C++ keeps track of what templates need to be instantiated
without further help from the programmer. KAI C++ usually has
no problem compiling Cfront codes that use templates (with the
option --implicit_include
enabled). Better yet, KAI C++ can compile templates that Cfront
cannot (e.g., the Standard Template Library). KAI C++ also links
programs containing templates much faster than Cfront.
If your templates are simple and compile with Cfront, you may want to try compiling your code with KAI C++ before deciding whether to read the rest of this chapter. If Cfront made you swear off templates because linking was too slow, you should read Section 5.2, and give templates a second chance with KAI C++. Otherwise, if you make ambitious use of templates, you will want to understand exactly how KAI C++ deals with templates.
If you are writing lots of new templates, you should read about reducing template bloat.
There are two parts to a template entity: its declaration and its definition. The declaration declares the interface to the template. The definition is what the template actually does. For example, here is a declaration that declares two template entities:
template<class T>
class Iter {
private:
    T * my_ptr;
public:
    T * at();
    void operator++();
};
The two template entities are the member functions at
and operator++
. Here is the corresponding template
definition for at
:
template<class T> T * Iter<T>::at() { return my_ptr; }
The declaration and definition can be combined:
template<class T>
class Iter {
private:
    T * my_ptr;
public:
    T * at() {return my_ptr;}
    void operator++() {++my_ptr;}
};
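As a minimal sketch of the separate form, the following puts both member definitions outside the class body and exercises them. The constructor and the function `iter_demo` are not part of the original example; they are added here only so that the sketch can be compiled and run.

```cpp
#include <cassert>

// Declaration: the interface of the template.
template<class T>
class Iter {
private:
    T* my_ptr;
public:
    explicit Iter(T* p) : my_ptr(p) {}  // added for this sketch only
    T* at();
    void operator++();
};

// Definitions: note the Iter<T>:: qualifier on each member.
template<class T>
T* Iter<T>::at() { return my_ptr; }

template<class T>
void Iter<T>::operator++() { ++my_ptr; }

// Hypothetical client code. Using Iter<int> here is what requires
// Iter<int>::at and Iter<int>::operator++ to be instantiated.
void iter_demo() {
    int values[2] = {10, 20};
    Iter<int> it(values);
    assert(*it.at() == 10);
    ++it;
    assert(*it.at() == 20);
}
```

Because the declaration, the definitions, and the use all sit in one translation unit here, the instantiation can be done without any help from other files.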
In KAI C++, a template entity can be instantiated only if three things come together in the same translation unit:
A translation unit is a source file and the files it
#include
s, directly or indirectly, and either explicitly or by
implicit inclusion.
The definitions
and their use must be brought together in at least one of the
translation units linked into the full application. Wise programmers
usually hold to the rule that if a template is used in a translation
unit, the definition appears in that unit too.
There are three different ways in which you can write templates
so that KAI C++ can find the definition:
explicit inclusion,
implicit inclusion
and the
ad-hoc approach.
The three methods are described
below. We recommend explicit inclusion.
Each .h file that declares a template entity also contains
either the definition of the template entity or includes another
file containing the definition. Codes for Borland C++ and GNU
C++ frequently use this convention. For example, a common way
to do this is to partition a set of template declarations and
definitions across two files, "Foo.h" and "Foo.C", and make "Foo.h"
include "Foo.C":
/* Near end of file "Foo.h" */
#include "Foo.C"
Notice that this is the exact opposite of the usual
use of #include directives. It is important that "Foo.C"
not contain any non-template definitions, otherwise problems might
arise from having multiple definitions.
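The layout can be sketched in a single listing, with comments marking where each file would begin and end. The class `Foo`, its member `twice`, and the client function `use_foo` are hypothetical names chosen for illustration; in a real project the middle part would live in "Foo.C" and be pulled in by the #include directive near the end of "Foo.h".

```cpp
#include <cassert>

// ---- Foo.h (hypothetical): declarations ----
#ifndef FOO_H
#define FOO_H

template<class T>
class Foo {
public:
    explicit Foo(const T& v) : value_(v) {}
    T twice() const;          // declared here, defined in Foo.C
private:
    T value_;
};

// Near the end of Foo.h, the real header would contain:
//     #include "Foo.C"
// The contents of Foo.C are shown inline so the sketch is one file.

// ---- Foo.C (hypothetical): template definitions only ----
template<class T>
T Foo<T>::twice() const { return value_ + value_; }
// ---- end of Foo.C ----

#endif  // FOO_H

// ---- client.C (hypothetical): includes only Foo.h ----
int use_foo() {
    Foo<int> f(21);
    return f.twice();         // the definition is visible, so this can be instantiated
}
```

Any file that includes "Foo.h" sees the definitions as well, which is the property that explicit inclusion is meant to guarantee.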
Implicit inclusion is useful for making existing Cfront codes compile with minimal intervention. It is turned off by default in KAI C++ 3.3 and later. To turn it on, use the option:
--implicit_include
It is off by default because the following disadvantages weigh against it for modern C++:
-M
.
<foo.h>
declares a template entity, and
the corresponding source file foo.C
(or foo.cc
etc.) defines non-template functions or variables.
#include
directives.
The rest of this section explains the issues.
Implicit inclusion gives KCC license to find template definitions by implicitly including source files corresponding to headers. The rule is that when the following three conditions apply:
.h
file,
and
then KCC implicitly includes the corresponding file with a
.C
or other suffix, on the presumption that the template
definition is there. The suffixes checked are listed below:
.c .C .cpp .CPP .cxx .CXX .cc
For example, if class Iter<T>
is declared
in file foo.h
and instantiated in file bar.C
,
implicit inclusion causes KAI C++ to act as though file foo.C
(or foo.cpp
, etc.) were included at the end of file
bar.C
.
If you use implicit inclusion, the .C
(or other
suffix) file should contain template definitions only. Otherwise,
implicit inclusion can lead to problems. For example, if foo.C
defines a static variable, and is implicitly included by two other
.C
files, there will be two copies of the
static variable. Worse yet, if foo.C
defines a non-template
non-static file-scope entity (function or variable), then implicit
inclusion causes the entity to be defined twice, and a ``multiply
defined symbol'' error results at link time. In fact, the same
problem can happen with Cfront.
In summary, our recommended coding practices are:
--no_implicit_include
. Have each header
file explicitly include the required template definitions.
The programmer simply makes sure that whenever a template entity needs to be instantiated, the template definition and all necessary types are defined in the file. This method usually requires that the programmer write code (possibly with pragmas) that forces the compiler to instantiate template entities in the right places.
Explicit inclusion, implicit inclusion, and the "Ad Hoc Approach" are ways of letting KAI C++ find a template definition. This is half of the template instantiation issue. The other half is how KAI C++ chooses where to instantiate a template. Automatic instantiation works via a prelinker. For each translation unit, say foo.C, that uses templates, there are two corresponding files maintained by KCC:
ti_files/foo.ti
- This file communicates information from the parser to the prelinker. It tells the prelinker what templates can be instantiated and what template instances are required.
ti_files/foo.ii
- This file communicates information from the prelinker to the parser. When foo.C is recompiled, the compiler reads this file to determine what templates should be instantiated in the code for foo.C.
Initially, no templates are assigned to any translation units.
When a program is first linked, the prelinker examines all object
files and assigns template instances to appropriate translation
units. If a template is instantiatable in more than one, the prelinker
arbitrarily assigns the template to one of the units. When the
prelinker assigns a new template instance to a translation unit,
it records the assignment in the .ii
file and automatically
recompiles the file:
C++ prelinker: Recompiling foo.C
This may sound like templates make linking slow, since files
must be recompiled at link time. In fact, it does make the first
link slow. However, once you have a complete link, subsequent
links are much faster. This is because whenever KAI C++
compiles foo.C
, it will consult ti_files/foo.ii
and produce the assigned template instantiations without having
to be told to by the prelinker. Subsequent links only require
recompilation effort when a source-code change requires that template
instantiations be reassigned. KAI C++ automatically checks for
such changes and recompiles as necessary.
The C++ notion of templates challenges the cherished notion of code reuse in a roundabout way. An automatic instantiation of a template has no prescribed home among the object files of an application. The code produced for it must either be attached to the definition of some routine in whose context it can be instantiated, or else be split off into a separate object file which must later be added into the final link.
As described above, KCC's prelinker can automatically locate the right source
code host for each template instantiation needed by an application. This is
called "closing" the application or library.
Closing an application records its template assignment information in the ti_files
directory, alongside the compilation command lines and environment needed when
adding new instantiations to each source file.
The drawback to making these records is that the template assignment
is a function of the total application, not just the one file to which it ends up
assigned. The same assignment won't necessarily occur if you close another
application that shares the same source file.
If two applications share foo.C
, but compile it with different options, they
cannot usually share each other's foo.o
. In addition to whatever effect they
have on foo.o
, these options are recorded
in ti_files/foo.ti
, and only one set will survive. But let's suppose that
the compilation options are identical. Can foo.o
be reused? The answer is
"Not yet, but we're working on it."
The challenge arises when closure of application 1 assigns a template instantiation to foo.C
.
As noted above, future compilations of foo.C
also compile the template instantiations
needed for a closure of application 1. These may be
unneeded or harmful within application 2. At the very least building and closing application 2 will involve
extra effort to undo the assignments appropriate only to application 1. In practice this activity
is very fragile.
Releases 3.3 and later of KAI C++ assume that the ti_files
in the current directory describe the one application context for which every -c
compilation
is being prepared. Different applications that share source files should be built in separate
directories. The one possible exception is shared and archive forms of the same library code,
provided other compilation options remain identical.
By default, the prelinker announces just the fact that it is recompiling a translation unit. You can get quieter or louder output with the following KCC option sequences:
--COMPDO_pl -q
- No announcements of recompiles or template assignments.
--COMPDO_pl -v
- Announces each individual template assignment and prints the complete command line of each recompilation. This was the default with KCC 3.2, but the heavy use of templates in modern C++ made it too noisy for most users' tastes. This mode is the prelinker's contribution to the full verbose output of
KCC -v
.
This is for those who occasionally "cheat" on builds to save
time. An example of cheating is changing timestamps on files to
fool make
because you know rebuilding those files
is pointless. To turn off prelinking, use the option --no_prelink.
For example, if you have already linked your program once, the
correct .ii
files have been built. If subsequent
changes to your program are minor enough so that no new template
instances are required, then there is no need to spend time prelinking.
You may want to consider using --no_prelink during
the edit-compile-debug cycle. At worst, you will get an unresolved
use of a template and know that you need to turn the prelinker
back on for a cycle. The author cheats all the time with make
and --no_prelink.
The C++ notion of templates breaks cherished notions of separate compilation in a fundamental way, because a template cannot be instantiated unless its definition is known. This is a quite different situation from ordinary functions, which can be called merely by knowing their interface. Thus ordinary library tools such as ar and ld should not be used to build libraries. Instead KCC must be used to build libraries.
To build a shared library and do automatic template instantiation, run KCC as you normally would to produce an executable, but specify a shared library as the output file. E.g.:
KCC -o libfoo.so foo1.o foo2.o
KCC recognizes the platform's standard suffixes (.so
or .sl
) for shared libraries.
To build an archive (aka static) library while still doing automatic template instantiation, run KCC as you normally would to produce an executable, but specify an archive library as the output file. E.g.:
KCC -o libfoo.a foo1.o foo2.o
When building an archive, KCC 3.2 behaves like the UNIX command
``ar r libfoo.a foo1.o foo2.o ...
'' in that if libfoo.a
already exists, the .o files are added to (or replace) what is
already in libfoo.a
. This feature is removed in KCC
3.3 because it can require recompiling other objects within the
archive, but the foo*.C
and foo*.ti
files from which this must be done are not kept in the archive.
You probably keep them alongside the extracted .o files in the
directory where you build the archive. On KCC 3.3, just repeat
the original KCC link command for the archive adding any new .o
files to the original list. KCC 3.3 will remove an existing
archive library before assembling a new one of the same name.
A problem with most UNIX archives (IBM's AIX excepted) is that
if any function or data item in a .o file is required by a link,
then the entire .o file is pulled in. This behavior can cause
duplicate definitions of template instances when multiple libraries
are present. For example, suppose library libfoo.a
has an object file foo1.o and library libbar.a
has
an object file bar1.o, and that both were assigned the sample
template chomper<int>
by the prelinker. If
at link time, both foo1.o and bar1.o were pulled in for other
reasons, then both definitions of chomper<int>
would be pulled in, resulting in a duplicate definition.
The solution is to make KCC create separate .o files for chomper<int>
rather than bundle it into both foo1.o and bar1.o, so that only
one of those separate template instances will be loaded. Unfortunately,
KCC -c
cannot know how a .o file is going to be used
later, so you have to tell KCC when compiling with -c
.
The command line option is --one_instantiation_per_object
.
E.g.:
KCC --one_instantiation_per_object -c foo1.C foo2.C
KCC -o libfoo.a foo1.o foo2.o
KCC --one_instantiation_per_object -c bar1.C bar2.C
KCC -o libbar.a bar1.o bar2.o
KCC automatically keeps track of the extra .o files (they are
hiding in the ti_files/
directory). You do not need
to mention them on your linkage command lines; in fact, doing so
can cause problems.
One_instantiation_per_object is on by default when KCC's output file is an archive library. So the following commands are equivalent to the above, except that they will not leave the .o files conveniently extracted for a quick re-link.
KCC -o libfoo.a foo1.C foo2.C
KCC -o libbar.a bar1.C bar2.C
Compiling with one_instantiation_per_object does not dedicate
a .o file to being used only inside archive libraries. It is okay
to use .o files generated with --one_instantiation_per_object
in executables and shared libraries also. The only reason not
to do this is the extra compilation overhead incurred to generate
multiple .o files.
The UNIX notion of libraries presumes a traditional model of separate compilation that templates inherently subvert (some would say transcend). Hence using templates in a UNIX environment entails certain limitations.
In particular, once a library is built, it is considered frozen and no new template instances can be added. Thus when a client of a library uses a new instance not in the library, the client must include a header that defines the template. Otherwise the use will be unresolved.
Thus creating libraries containing templates requires some
care. When KCC creates a library, it ``closes'' the library by
first running the prelinker so that all uses of templates defined
in the library or accessible headers are satisfied. However, when
multiple libraries are involved, you may want to do a bit more
to avoid redundancy. Consider building a library libfoo.a
that you expect to use in conjunction with a libbar.a
,
and the latter already has template instances that libfoo.a
can use. To inform the prelinker that it should not instantiate
instances in libfoo.a
, simply put libbar.a
on the command line used to build libfoo.a
like this:
KCC -o libfoo.a foo1.o foo2.o libbar.a
The library libbar.a
does not become part of libfoo.a
,
but is used by the prelinker during automatic instantiation in
accounting for satisfied template uses. Similar considerations
apply when both libraries are .so
files. However
beware that when a .a
archive is linked into a .so
shared library, selections from the .a
are
usually incorporated into the .so
.
Some programmers find it simpler to forgo
automatic instantiation altogether in libraries. This can be done
portably by using the ISO
syntax for explicit instantiation and otherwise using templates
only for entities with internal linkage, for example by using only
inline functions in templates. Entities with internal linkage
do not require prelinking, because such entities are instantiated
in each .o
file where required. To build the library,
specify .a
, .so
, or .sl
as the executable file. The choice of .so
or .sl
is whichever is normally used for shared libraries on the platform.
For instance, the two commands shown below build a shared library
and an archive library.
KCC --no_prelink -o libfoo.so foo1.o foo2.o
KCC --no_prelink -o libfoo.a foo1.o foo2.o
The option --no_prelink
turns off the prelinker. If you are building an archive library,
you can also use a plain ar command. However, if you are building
a shared library, you must use KCC and not ld.
The reason is that on some systems ld
is not "smart
enough" to ensure that constructors for global objects are
called when the library is executed.
Undisciplined use of templates can lead to excessive compilation time and code bloat. The reason is that distinct code is generated for each instance of a template class member or template function.
A common and effective technique for reducing template bloat is to make distinct instances of a template share a common implementation. For example, naive implementations of STL base their map, multimap, set, and multiset classes on a template class rb_tree for red-black trees that takes many parameters:
template<class Key, class Value, class KeyOfValue, class Compare, class Allocator> class rb_tree;
With this sort of implementation, there will be separate instances (and unshared code) for each kind of tree. But much of the tree logic is really independent of the template parameters. For example, the ``rebalancing'' operation on red-black trees does not require knowing any of the template parameters. Furthermore, many of the operations such as searching need to know only the Key parameter.
When a piece of non-trivial functionality is independent of a template parameter, it can be factored out of the template into a separate piece of code (or template without the parameter). Typically, the independent part is not quite type safe. E.g., it may use typeless (void*) pointers instead of typed (T*) pointers. The template part then uses small inline functions that cast the typeless entities to typed ones seen by the client of the template. The separation can be done in many ways:
Each technique has advantages. Below are some examples.
Making the shared code a base class is useful when the shared code needs access to information about the template parameters. The basic idea is to put the parameter-independent part in a base class, and derive the template from it. Then the template can invoke the shared implementation by calling on the base, and the base class can invoke parameter-dependent code by calling virtual functions that the derived template defines. The KAI implementation of the STL associative containers is an example. The relevant part of its hierarchy looks like this:
__kai::rb_tree_base            (red-black tree)
    __kai::rb_tree             (+ key comparison)
        __kai::map_base        (+ allocator)
            map                (+ uniqueness)
            multimap           (+ nonuniqueness)
        __kai::set_base        (+ allocator)
            set                (+ uniqueness)
            multiset           (+ nonuniqueness)
Successively deeper layers in the hierarchy add template parameters. The algorithms have been written so that much code can be pushed up the hierarchy and suffer less duplication. Here is a detailed description of the layering for class map.
__kai::rb_tree_base
make_node
and destroy_node
that do
the type-dependent work. Note that it is hidden in a separate
namespace since it is not to be used by ordinary clients.
__kai::rb_tree
__kai::map_base
make_node
and destroy_node
are finally defined.
map
This layer adds the knowledge that the keys
are unique.
The layering is a bit complex to implement, but pays back in
greatly reduced compilation times and code size. In particular,
all the STL associative containers share the same convoluted red-black
tree rebalancing algorithm. Since it is not a template, it can
be (and is) part of KCC's run-time library in binary form. Furthermore,
all STL associative containers for the same type of key (a common
situation) share the same code for searching such trees.
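The base-class technique can be sketched on a much smaller structure than a red-black tree. The following is not the KAI implementation; `ListBase`, `List`, and `list_demo` are hypothetical names, and a singly linked list stands in for the tree. The non-template base owns the link manipulation, and the derived template supplies the type-dependent `destroy_node` as a virtual function.

```cpp
#include <cassert>
#include <cstddef>

// Parameter-independent part: compiled once, shared by every List<T>.
// Copying is not handled; this is a sketch, not a complete container.
class ListBase {
public:
    std::size_t size() const { return size_; }
protected:
    struct Link { Link* next; };

    ListBase() : head_(0), size_(0) {}
    virtual ~ListBase() {}

    // Type-dependent work is delegated to the derived template.
    virtual void destroy_node(Link* n) = 0;

    void push_link(Link* n) { n->next = head_; head_ = n; ++size_; }
    void clear_links() {
        while (head_) {
            Link* n = head_;
            head_ = n->next;
            destroy_node(n);
        }
        size_ = 0;
    }

    Link* head_;
    std::size_t size_;
};

// Parameter-dependent part: small inline functions that add the types.
template<class T>
class List : public ListBase {
    struct Node : Link {
        T value;
        Node(const T& v) : value(v) {}
    };
public:
    ~List() { clear_links(); }
    void push_front(const T& v) { push_link(new Node(v)); }
    const T& front() const { return static_cast<const Node*>(head_)->value; }
private:
    void destroy_node(Link* n) { delete static_cast<Node*>(n); }
};

void list_demo() {
    List<int> a;
    a.push_front(1);
    a.push_front(2);
    assert(a.size() == 2 && a.front() == 2);
}
```

Only the thin wrappers are duplicated for each `T`; `push_link` and `clear_links` exist once, which is the same effect the layered containers aim for on a larger scale.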
Another way to separate common code from a template is to put
it in global functions, and pass parameter-dependent information
to the functions. The KAI implementation of std::bitset<N>
is an example. The algorithms for the shift operators (<<,
>>, <<=, and >>=) could have been coded entirely
within the template class. However, this would have led to wasteful
duplication of code. So instead, these operators call global functions
(__kai::shift_left and __kai::shift_right) that can be shared.
Information dependent upon the value of N is passed to these functions.
Of course for small bitsets that fit in a single word, the call would in fact take more code than doing the operation inline. So the KAI implementations of the shift operators check the value of N and use inline code for the one-word case. Since N is a compile-time constant, only the code for the relevant case shows up in the optimized executable.
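The following sketch illustrates the idea for a left shift. It is not the KAI source: the namespace `sketch`, the class `Bits`, and the function `sketch::shift_left` are hypothetical stand-ins for `std::bitset<N>` and `__kai::shift_left`, whose actual signatures are not given here. The multi-word case calls one shared non-template function, while the one-word case is handled inline.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

namespace sketch {

const std::size_t word_bits = sizeof(unsigned long) * CHAR_BIT;

// Shared, non-template worker: one copy serves every Bits<N>.
// The information that depends on N arrives as the nwords argument.
void shift_left(unsigned long* w, std::size_t nwords, std::size_t count) {
    const std::size_t word_shift = count / word_bits;
    const std::size_t bit_shift = count % word_bits;
    for (std::size_t i = nwords; i-- > 0; ) {   // high word first
        unsigned long v = 0;
        if (i >= word_shift) {
            v = w[i - word_shift] << bit_shift;
            if (bit_shift != 0 && i > word_shift)
                v |= w[i - word_shift - 1] >> (word_bits - bit_shift);
        }
        w[i] = v;
    }
}

template<std::size_t N>
class Bits {
    static const std::size_t nwords = (N + word_bits - 1) / word_bits;
    unsigned long w_[nwords];

    // Clear any bits above position N-1 in the top word.
    void trim() {
        const std::size_t extra = N % word_bits;
        if (extra != 0) w_[nwords - 1] &= ~0UL >> (word_bits - extra);
    }
public:
    Bits() { for (std::size_t i = 0; i < nwords; ++i) w_[i] = 0; }
    void set(std::size_t pos) { w_[pos / word_bits] |= 1UL << (pos % word_bits); }
    bool test(std::size_t pos) const {
        return ((w_[pos / word_bits] >> (pos % word_bits)) & 1UL) != 0;
    }
    Bits& operator<<=(std::size_t count) {
        if (nwords == 1)   // compile-time constant: the other branch can be discarded
            w_[0] = (count < word_bits) ? (w_[0] << count) : 0UL;
        else
            shift_left(w_, nwords, count);
        trim();
        return *this;
    }
};

void bits_demo() {
    Bits<100> b;
    b.set(1);
    b <<= 70;
    assert(b.test(71) && !b.test(1));
}

}  // namespace sketch
```

The design choice is that `operator<<=` stays tiny in every instantiation, so the cost of many distinct values of N is a few instructions each rather than a full copy of the shifting loop.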
Inline try blocks and inline throws are particularly expensive.
Therefore the KAI implementation of std::bitset<N>
factors these operations out into global functions that are neither
inline nor templates:
void throw_invalid_value_in_string();
void throw_bitset_overflow();
Subscript checking is another common candidate for factoring out.
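A minimal sketch of factoring out a subscript check is shown below. The names `throw_subscript_out_of_range`, `FixedArray`, and `at_throws` are hypothetical and chosen for illustration; the point is only that the throw lives in one function that is neither inline nor a template, so each instantiation of `at` contains a compare and a call rather than its own copy of the throwing code.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Neither inline nor a template: compiled exactly once.
void throw_subscript_out_of_range() {
    throw std::out_of_range("subscript out of range");
}

template<class T, std::size_t N>
class FixedArray {
    T data_[N];
public:
    // Every instantiation shares the single throwing function above.
    T& at(std::size_t i) {
        if (i >= N) throw_subscript_out_of_range();
        return data_[i];
    }
};

// Hypothetical helper: reports whether indexing a 4-element array at i throws.
bool at_throws(std::size_t i) {
    FixedArray<int, 4> a;
    try {
        a.at(i) = 1;
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```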
This section describes template features that are deprecated
or removed in KCC 3.3.
Guiding declarations
are disallowed in --strict
mode.
--split Removed
The feature of ``splitting'' libraries with --split
when building libraries has been replaced by the much faster feature
--one_instantiation_per_object.
The option "--parallel_build n
" still works in KCC 3.3, but
does not parallelize prelink recompilations. The 3.3 prelinking
scheme in KCC has a serializing bottleneck. The good news is that the
3.3 algorithm greatly reduces the number of prelinker iterations when compared
to the algorithm used by KCC 3.2. The reduction
in iterations somewhat makes up for the loss of parallelism during
linking. However, there is still a lot of natural parallelism in the prelink
work, and future KCC releases will begin to exploit this again.
The default mode of automatic instantiation works for most programs. The other modes are deprecated. They are of interest only in rare situations when trying to construct cases for bug reports. Here's a summary of the modes.
Automatic (-tnone)
- Instantiate templates according to ti_files/*.ii files generated by the prelinker. This is the default.
Used (-tused)
- Instantiate any templates used by a translation unit. This mode causes duplication of template instantiations when more than one translation unit uses the same template. The linker may give warnings or errors about the duplicates. This is handy if you are trying to generate a small test case for the compiler and need to force instantiation of all templates in the test case.
All (-tall)
- Instantiate all templates. This instantiates all member functions of template classes that were used, and instantiates other template functions even if the only reference was a declaration. It is a vestige left over from previous incarnations of KCC. The only reason for documenting it is to tell you to not use it.
The template instantiation pragmas are deprecated. They are provided strictly for compatibility with old versions of KCC. New codes should use the explicit instantiation syntax defined by the ISO C++ Draft Working Paper.
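As a sketch of the ISO explicit instantiation syntax, the listing below repeats the Iter example and then requests two instantiations explicitly. The constructor and the function `explicit_demo` are additions for this sketch only. The first directive asks for a single member function, and the second asks for every member of a class specialization, which is roughly what the `#pragma instantiate` forms request.

```cpp
#include <cassert>

template<class T>
class Iter {
private:
    T* my_ptr;
public:
    explicit Iter(T* p) : my_ptr(p) {}  // added for this sketch only
    T* at();
    void operator++();
};

template<class T>
T* Iter<T>::at() { return my_ptr; }

template<class T>
void Iter<T>::operator++() { ++my_ptr; }

// Explicit instantiation of one member function.
template char* Iter<char>::at();

// Explicit instantiation of all members of Iter<int>.
template class Iter<int>;

void explicit_demo() {
    char text[] = "ab";
    Iter<char> it(text);
    assert(*it.at() == 'a');
    ++it;
    assert(*it.at() == 'b');
}
```

An explicit instantiation must appear where the template definition is visible, so these directives normally go in a source file that includes the definitions.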
Pragmas can be used for fine-grain control of template instantiation.
For instance, the following pragma forces instantiation of member
function Iter<char>::at
.
#pragma instantiate Iter<char>::at
All members in class Iter
can be instantiated
with:
#pragma instantiate Iter<char>
It is also possible to prohibit a specific template instantiation as shown below.
#pragma do_not_instantiate Iter<char>::at
This is useful if you are providing a specific definition (specialization) for a template entity. It is also handy for excluding certain member functions. For example,
#pragma instantiate Iter<char>
#pragma do_not_instantiate Iter<char>::operator++
says to instantiate all members of class Iter
except operator++
.
The argument to the instantiation pragmas can be:
A template class name               A<int>
A member function name              A<int>::f
A static data member name           A<int>::i
A member function declaration       void A<int>::f(int,char)
A template function declaration     char*f(int,float)
There is also a third kind of pragma that tells the prelinker that it can (but does not have to) instantiate a template entity in an object file. For example,
#pragma can_instantiate Iter<char>::at
tells the prelinker to consider the current object file as
a candidate for the instantiation of method Iter<char>::at
.
The argument to an instantiation pragma may not be a compiler-generated
function, an inline function, or a pure virtual function. For
instance, our definition of class Iter
does not define
a copy-constructor, and thus it becomes a compiler-generated function.
Consequently, "Iter::Iter(const Iter&)
"
cannot be an argument to an instantiation pragma.