What the Compiler Books never teach you


Using Templates


By default, KAI C++ automatically instantiates templates. This means that KAI C++ keeps track of what templates need to be instantiated without further help from the programmer. KAI C++ usually has no problem compiling Cfront codes that use templates (with the option --implicit_include enabled). Better yet, KAI C++ can compile templates that Cfront cannot (e.g., the Standard Template Library). KAI C++ also links programs containing templates much faster than Cfront.

If your templates are simple and compile with Cfront, you may want to try compiling your code with KAI C++ before deciding whether to read the rest of this chapter. If Cfront made you swear off templates because linking was too slow, you should read Section 5.2, and give templates a second chance with KAI C++. Otherwise, if you make ambitious use of templates, you will want to understand exactly how KAI C++ deals with templates.

If you are writing lots of new templates, you should read about reducing template bloat.

5.1 Template Declaration vs. Definition

There are two parts to a template entity: its declaration and its definition. The declaration declares the interface to the template. The definition is what the template actually does. For example, here is a declaration that declares two template entities:

    template<class T> class Iter {
        T * my_ptr;
        T * at();
        void operator++();
    };

The two template entities are the member functions at and operator++. Here is the corresponding template definition for at:

    template<class T> T * Iter<T>::at() {
        return my_ptr;
    }


The declaration and definition can be combined:

    template<class T> class Iter {
        T * my_ptr;
        T * at() {return my_ptr;}
        void operator++() {++my_ptr;}
    };
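
As a minimal sketch of the split between declaration and definition, the following file declares Iter, defines its members separately, and then uses them. The public: label, the constructor, and the helper second_element are additions made only so that the sketch can be compiled and called; they are not part of the examples above.

```cpp
#include <cassert>

// Declaration: the interface of the template.
// The public: label and the constructor are assumptions added for usability.
template<class T> class Iter {
    T* my_ptr;
public:
    explicit Iter(T* p) : my_ptr(p) {}
    T* at();
    void operator++();
};

// Definitions: each member is qualified with Iter<T>::.
template<class T> T* Iter<T>::at() {
    return my_ptr;
}

template<class T> void Iter<T>::operator++() {
    ++my_ptr;
}

// Hypothetical helper: steps past the first element and returns the second.
// Using Iter<int> here is what requires Iter<int>::at and
// Iter<int>::operator++ to be instantiated.
int second_element(int* first) {
    assert(first != 0);
    Iter<int> it(first);
    ++it;
    return *it.at();
}
```

Because the definitions are visible in the same translation unit as the use, both member functions can be instantiated there.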

5.2 Automatic Template Instantiation

In KAI C++, a template entity can be instantiated only if three things come together in the same translation unit:

A translation unit is a source file and the files it #includes, directly or indirectly, and either explicitly or by implicit inclusion. The definitions and their use must be brought together in at least one of the translation units linked into the full application. Wise programmers usually hold to the rule that if a template is used in a translation unit, the definition appears in that unit too.

There are three different ways in which you can write templates so that KAI C++ can find the definition: explicit inclusion, implicit inclusion and the ad-hoc approach. The three methods are described below. We recommend explicit inclusion.

5.2.1 Explicit Inclusion

Each .h file that declares a template entity also contains either the definition of the template entity or includes another file containing the definition. Codes for Borland C++ and GNU C++ frequently use this convention. For example, a common way to do this is to partition a set of template declarations and definitions across two files, "Foo.h" and "Foo.C", and make "Foo.h" include "Foo.C".

	/* Near end of file "Foo.h" */
	#include "Foo.C"

Notice that this is the exact opposite of the usual use of #include directives. It is important that "Foo.C" not contain any non-template definitions; otherwise, every translation unit that includes "Foo.h" gets its own copy of them, and multiple-definition problems can arise.
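
The layout can be sketched in a single file as follows. The class Foo, its members, and the function foo_demo are hypothetical; in the real two-file layout the marked region would live in "Foo.C" and be pulled in by the #include near the end of "Foo.h".

```cpp
#include <cassert>

// ---- Foo.h (hypothetical contents) ----
#ifndef FOO_H
#define FOO_H

template<class T> class Foo {
    T my_value;
public:
    explicit Foo(const T& v);
    const T& get() const;
};

// In the two-file layout, the next line would be:
//     #include "Foo.C"
// Here the contents of Foo.C are pasted inline to keep the sketch in one file.

// ---- Foo.C (template definitions only, nothing else) ----
template<class T> Foo<T>::Foo(const T& v) : my_value(v) {}

template<class T> const T& Foo<T>::get() const {
    return my_value;
}

#endif // FOO_H

// ---- client code: it only ever includes Foo.h ----
int foo_demo() {
    Foo<int> f(42);
    assert(f.get() == 42);   // the definition was visible, so this instantiates
    return f.get();
}
```

The client never names "Foo.C"; it sees the definitions because the header brings them along.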

5.2.2 Implicit Inclusion

Implicit inclusion is useful for making existing Cfront codes compile with minimal intervention. It is turned off by default in KAI C++ 3.3 and later. To turn it on, use the option:

    --implicit_include

It is off by default because the following disadvantages weigh against it for modern C++:

The rest of this section explains the issues.

Implicit inclusion gives KCC license to find template definitions by implicitly including source files corresponding to headers. The rule is that when the following three conditions apply:

  1. implicit inclusion is enabled, and
  2. KCC sees a template declaration in a .h file, and
  3. KCC has not yet found the corresponding definition

then KCC implicitly includes the corresponding file with a .C or other suffix, on the presumption that the template definition is there. The suffixes checked are listed below:

    .c .C .cpp .CPP .cxx .CXX .cc

For example, if class Iter<T> is declared in file foo.h and instantiated in file bar.C, implicit inclusion causes KAI C++ to act as though file foo.C (or foo.cpp, etc.) were included at the end of file bar.C.

If you use implicit inclusion, the .C (or other suffix) file should contain template definitions only. Otherwise, implicit inclusion can lead to problems. For example, if foo.C defines a static variable, and is implicitly included by two other .C files, there will be two copies of the static variable. Worse yet, if foo.C defines a non-template non-static file-scope entity (function or variable), then implicit inclusion causes the entity to be defined twice, and a "multiply defined symbol" error results at link time. In fact, the same problem can happen with Cfront.

In summary, our recommended coding practices are:

5.2.3 Ad Hoc Approach

The programmer simply makes sure that whenever a template entity needs to be instantiated, the template definition and all necessary types are defined in the file. This method usually requires that the programmer write code (possibly with pragmas) that forces the compiler to instantiate template entities in the right places.

5.2.4 Prelinker

Explicit inclusion, implicit inclusion, and the "Ad Hoc Approach" are ways of letting KAI C++ find a template definition. This is half of the template instantiation issue. The other half is how KAI C++ chooses where to instantiate a template. Automatic instantiation works via a prelinker. For each translation unit, say foo.C, that uses templates, there are two corresponding files maintained by KCC:

ti_files/foo.ti
This file communicates information from the parser to the prelinker. It tells the prelinker what templates can be instantiated and what template instances are required.
ti_files/foo.ii
This file communicates information from the prelinker to the parser. When foo.C is recompiled, the compiler reads this file to determine what templates should be instantiated in the code for foo.C.

Initially, no templates are assigned to any translation units. When a program is first linked, the prelinker examines all object files and assigns template instances to appropriate translation units. If a template is instantiatable in more than one, the prelinker arbitrarily assigns the template to one of the units. When the prelinker assigns a new template instance to a translation unit, it records the assignment in the .ii file and automatically recompiles the file:

  	C++ prelinker: Recompiling foo.C

This may sound like templates make linking slow, since files must be recompiled at link time. In fact, it does make the first link slow. However, once you have a complete link, subsequent links are much faster. This is because whenever KAI C++ compiles foo.C, it will consult ti_files/foo.ii and produce the assigned template instantiations without having to be told to by the prelinker. Subsequent links only require recompilation effort when a source-code change requires that template instantiations be reassigned. KAI C++ automatically checks for such changes and recompiles as necessary.
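
The assignment step can be pictured with a toy model. The sketch below is not KCC's algorithm or file format; the Unit structure, its field names, and the functions assign_instances and relink_demo are all hypothetical. It only illustrates why the first link triggers recompilation and a repeated link does not.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model of one translation unit as a prelinker might see it.
// Every name here is an assumption made for illustration.
struct Unit {
    std::string name;                       // e.g. "foo.C"
    std::set<std::string> can_instantiate;  // instances whose definitions are visible
    std::set<std::string> required;         // instances the unit uses
    std::set<std::string> assigned;         // plays the role of the .ii record
};

// Give every required instance a home in one unit that can instantiate it.
// Returns the names of the units whose assignments changed, that is, the
// units that would be recompiled. Instances that no unit can instantiate
// stay unassigned and would show up as unresolved at link time.
std::vector<std::string> assign_instances(std::vector<Unit>& units) {
    // Instances that already have a home from an earlier link.
    std::set<std::string> placed;
    for (std::size_t i = 0; i < units.size(); ++i)
        placed.insert(units[i].assigned.begin(), units[i].assigned.end());

    std::set<std::string> changed;
    for (std::size_t i = 0; i < units.size(); ++i) {
        const std::set<std::string>& req = units[i].required;
        for (std::set<std::string>::const_iterator r = req.begin();
             r != req.end(); ++r) {
            if (placed.count(*r)) continue;
            // Take the first capable unit; the choice is arbitrary.
            for (std::size_t j = 0; j < units.size(); ++j) {
                if (units[j].can_instantiate.count(*r)) {
                    units[j].assigned.insert(*r);
                    placed.insert(*r);
                    changed.insert(units[j].name);
                    break;
                }
            }
        }
    }
    return std::vector<std::string>(changed.begin(), changed.end());
}

// Two units use the same instance; link twice and report how many
// units the second link would recompile.
std::size_t relink_demo() {
    std::vector<Unit> units(2);
    units[0].name = "foo.C";
    units[1].name = "bar.C";
    for (std::size_t i = 0; i < units.size(); ++i) {
        units[i].can_instantiate.insert("Iter<int>::at");
        units[i].required.insert("Iter<int>::at");
    }
    std::vector<std::string> first = assign_instances(units);
    assert(first.size() == 1 && first[0] == "foo.C");  // one home only
    std::vector<std::string> second = assign_instances(units);
    return second.size();
}
```

In this model the recorded assignments survive between links, which is the property that makes later links cheap.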


Code Reuse by Related Applications

The C++ notion of templates challenges the cherished notion of code reuse in a roundabout way. An automatic instantiation of a template has no prescribed home among the object files of an application. The code produced for it must either be attached to the definition of some routine in whose context it can be instantiated, or else be split off into a separate object file which must later be added into the final link.

As described above, KCC's prelinker can automatically locate the right source code host for each template instantiation needed by an application. This is called "closing" the application or library. Closing an application records its template assignment information in the ti_files directory, alongside the compilation command lines and environment needed when adding new instantiations to each source file. The drawback to making these records is that the template assignment is a function of the total application, not just the one file to which it ends up assigned. The same assignment won't necessarily occur if you close another application that shares the same source file.

If two applications share foo.C, but compile it with different options, they cannot usually share each other's foo.o. In addition to whatever effect they have on foo.o, these options are recorded in ti_files/foo.ti, and only one set will survive. But let's suppose that the compilation options are identical. Can foo.o be reused? The answer is "Not yet, but we're working on it."

The challenge arises when closure of application 1 assigns a template instantiation to foo.C. As noted above, future compilations of foo.C also compile the template instantiations needed for a closure of application 1. These may be unneeded or harmful within application 2. At the very least building and closing application 2 will involve extra effort to undo the assignments appropriate only to application 1. In practice this activity is very fragile.

Releases 3.3 and later of KAI C++ assume that the ti_files in the current directory describe the one application context for which every -c compilation is being prepared. Different applications that share source files should be built in separate directories. The one possible exception is shared and archive forms of the same library code, provided other compilation options remain identical.


Quieter/Louder Prelinking

By default, the prelinker announces just the fact that it is recompiling a translation unit. You can get quieter or louder output with the following KCC option sequences:

--COMPDO_pl -q
No announcements of recompiles or template assignments.
--COMPDO_pl -v
Announces each individual template assignment and prints the complete command line of each recompilation. This was the default with KCC 3.2, but the heavy use of templates in modern C++ made it too noisy for most users' tastes. This mode is the prelinker's contribution to the full verbose output of KCC -v.


Turning Off Prelinking

This is for those who occasionally "cheat" on builds to save time. An example of cheating is changing timestamps on files to fool make because you know rebuilding those files is pointless. To turn off prelinking, use the option --no_prelink. For example, if you have already linked your program once, the correct .ii files have been built. If subsequent changes to your program are minor enough so that no new template instances are required, then there is no need to spend time prelinking. You may want to consider using --no_prelink during the edit-compile-debug cycle. At worst, you will get an unresolved use of a template and know that you need to turn the prelinker back on for a cycle. The author cheats all the time with make and --no_prelink.

5.3 Building Libraries that Depend on Automatic Instantiation

The C++ notion of templates breaks cherished notions of separate compilation in a fundamental way, because a template cannot be instantiated unless its definition is known. This is a quite different situation from ordinary functions, which can be called merely by knowing their interface. Thus ordinary library tools such as ar and ld should not be used to build libraries. Instead KCC must be used to build libraries.

5.3.1 Shared Libraries

To build a shared library and do automatic template instantiation, run KCC as you normally would to produce an executable, but specify a shared library as the output file. E.g.:

    KCC -o libfoo.so foo1.o foo2.o

KCC recognizes the platform's standard suffixes (.so or .sl) for shared libraries.

5.3.2 Archives

To build an archive (aka static) library while still doing automatic template instantiation, run KCC as you normally would to produce an executable, but specify an archive library as the output file. E.g.:

    KCC -o libfoo.a foo1.o foo2.o


KCC 3.2 to 3.3 migration note:

When building an archive, KCC 3.2 behaves like the UNIX command "ar r libfoo.a foo1.o foo2.o ..." in that if libfoo.a already exists, the .o files are added to (or replace) what is already in libfoo.a. This feature is removed in KCC 3.3 because it can require recompiling other objects within the archive, but the foo*.C and foo*.ti files from which this must be done are not kept in the archive. You probably keep them alongside the extracted .o files in the directory where you build the archive. On KCC 3.3, just repeat the original KCC link command for the archive, adding any new .o files to the original list. KCC 3.3 will remove an existing archive library before assembling a new one of the same name.

5.3.4 One Instantiation Per Object

A problem with most UNIX archives (IBM's AIX excepted) is that if any function or data item in a .o file is required by a link, then the entire .o file is pulled in. This behavior can cause duplicate definitions of template instances when multiple libraries are present. For example, suppose library libfoo.a has an object file foo1.o and library libbar.a has an object file bar1.o, and that both were assigned the sample template chomper<int> by the prelinker. If at link time, both foo1.o and bar1.o were pulled in for other reasons, then both definitions of chomper<int> would be pulled in, resulting in a duplicate definition.

The solution is to make KCC create separate .o files for chomper<int> rather than bundle it into both foo1.o and bar1.o, so that only one of those separate template instances will be loaded. Unfortunately, KCC -c cannot know how a .o file is going to be used later, so you have to tell KCC when compiling with -c. The command line option is --one_instantiation_per_object. E.g.:

	KCC --one_instantiation_per_object -c foo1.C foo2.C
	KCC -o libfoo.a foo1.o foo2.o
	KCC --one_instantiation_per_object -c bar1.C bar2.C
	KCC -o libbar.a bar1.o bar2.o

KCC automatically keeps track of the extra .o files (they are hiding in the ti_files/ directory). You do not need to mention them on your linkage command lines; in fact, doing so can cause problems.

The option --one_instantiation_per_object is on by default when KCC's output file is an archive library. So the following commands are equivalent to the above, except that they will not leave the .o files conveniently extracted for a quick re-link.

	KCC -o libfoo.a foo1.C foo2.C
	KCC -o libbar.a bar1.C bar2.C

Compiling with --one_instantiation_per_object does not dedicate a .o file to being used only inside archive libraries. It is okay to use .o files generated with --one_instantiation_per_object in executables and shared libraries also. The only reason not to do this is the extra compilation overhead incurred to generate multiple .o files.

5.3.5 Closing Libraries

The UNIX notion of libraries presumes a traditional model of separate compilation that templates inherently subvert (some would say transcend). Hence using templates in a UNIX environment entails certain limitations.

In particular, once a library is built, it is considered frozen and no new template instances can be added. Thus when a client of a library uses a new instance not in the library, the client must include a header that defines the template. Otherwise the use will be unresolved.

Thus creating libraries containing templates requires some care. When KCC creates a library, it "closes" the library by first running the prelinker so that all uses of templates defined in the library or accessible headers are satisfied. However, when multiple libraries are involved, you may want to do a bit more to avoid redundancy. Consider building a library libfoo.a that you expect to use in conjunction with a libbar.a, and the latter already has template instances that libfoo.a can use. To inform the prelinker that it should not instantiate those instances in libfoo.a, simply put libbar.a on the command line used to build libfoo.a like this:

      KCC -o libfoo.a foo1.o foo2.o libbar.a

The library libbar.a does not become part of libfoo.a, but is used by the prelinker during automatic instantiation in accounting for satisfied template uses. Similar considerations apply when both libraries are .so files. However, beware that when a .a archive is linked into a .so shared library, selections from the .a are usually incorporated into the .so.

5.4 Building Libraries that Do Not Depend on Automatic Instantiation

Some programmers find it simpler to forgo automatic instantiation altogether in libraries. This can be done portably by using the ISO syntax for explicit instantiation and otherwise using templates only for entities with internal linkage, for example by using only inline functions in templates. Entities with internal linkage do not require prelinking, because such entities are instantiated in each .o file where required. To build the library, give the output file a name ending in .a, .so, or .sl. The choice of .so or .sl is whichever is normally used for shared libraries on the platform. For instance, the two commands shown below build a shared library and an archive library.


    KCC --no_prelink -o libfoo.so foo1.o foo2.o 

    KCC --no_prelink -o libfoo.a foo1.o foo2.o

The option --no_prelink turns off the prelinker. If you are building an archive library, you can also use a plain ar command. However, if you are building a shared library, you must use KCC and not ld. The reason is that on some systems ld is not "smart enough" to ensure that constructors for global objects are called when the library is executed.
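
For reference, the ISO explicit instantiation syntax mentioned in this section looks like the following minimal sketch. It reuses the Iter example from Section 5.1; the public: label, the constructor, and the helper first_char are additions made only so that the sketch can be compiled and exercised.

```cpp
#include <cassert>

template<class T> class Iter {
    T* my_ptr;
public:
    explicit Iter(T* p) : my_ptr(p) {}
    T* at();
    void operator++();
};

template<class T> T* Iter<T>::at() { return my_ptr; }
template<class T> void Iter<T>::operator++() { ++my_ptr; }

// Explicit instantiation of every member of Iter<char>.
template class Iter<char>;

// Explicit instantiation of one member function only.
template int* Iter<int>::at();

// Hypothetical client code that relies on an explicitly instantiated member.
char first_char(char* s) {
    assert(s != 0);
    Iter<char> it(s);
    return *it.at();
}
```

A given explicit instantiation should appear in only one translation unit of the library, so that each instance is defined exactly once.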

5.5 Structuring Code to Reduce Template Bloat

Undisciplined use of templates can lead to excessive compilation time and code bloat. The reason is that distinct code is generated for each instance of a template class member or template function.

A common and effective technique for reducing template bloat is to make distinct instances of a template share a common implementation. For example, naive implementations of STL base their map, multimap, set, and multiset classes on a template class rb_tree for red-black trees that takes many parameters:

    template<class Key,
             class Value,
             class KeyOfValue,
             class Compare,
             class Allocator>
    class rb_tree;

With this sort of implementation, there will be separate instances (and unshared code) for each kind of tree. But much of the tree logic is really independent of the template parameters. For example, the "rebalancing" operation on red-black trees does not require knowing any of the template parameters. Furthermore, many of the operations such as searching need to know only the Key parameter.

When a piece of non-trivial functionality is independent of a template parameter, it can be factored out of the template into a separate piece of code (or template without the parameter). Typically, the independent part is not quite type safe. E.g., it may use typeless (void*) pointers instead of typed (T*) pointers. The template part then uses small inline functions that cast the typeless entities to typed ones seen by the client of the template. The separation can be done in many ways:

Each technique has advantages. Below are some examples.
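
Before the KAI-specific examples, here is a minimal sketch of the void* factoring just described. The classes PtrStackBase and PtrStack and the function ptr_stack_demo are hypothetical and are not taken from the KAI library; the fixed capacity is an assumption that keeps the sketch short.

```cpp
#include <cassert>
#include <cstddef>

// Parameter-independent part: stores untyped pointers.
// In a library, these members would normally be compiled once in a .C file.
class PtrStackBase {
    enum { capacity = 16 };            // assumed fixed size, for brevity
    void* my_items[capacity];
    std::size_t my_size;
protected:
    PtrStackBase() : my_size(0) {}
    void push_raw(void* p) {
        assert(my_size < std::size_t(capacity));
        my_items[my_size++] = p;
    }
    void* pop_raw() {
        assert(my_size > 0);
        return my_items[--my_size];
    }
public:
    std::size_t size() const { return my_size; }
};

// Parameter-dependent part: small inline wrappers that restore the type.
template<class T> class PtrStack : private PtrStackBase {
public:
    void push(T* p) { push_raw(p); }
    T* pop() { return static_cast<T*>(pop_raw()); }
    using PtrStackBase::size;
};

// Pushes two pointers and returns the value behind the last one pushed.
int ptr_stack_demo() {
    int a = 1, b = 2;
    PtrStack<int> s;
    s.push(&a);
    s.push(&b);
    return *s.pop();
}
```

Every PtrStack<T> shares the same push_raw and pop_raw; only the one-line casts are generated per instance.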

5.5.1 Example: Sharing via Inheritance

Making the shared code a base class is useful when the shared code needs access to information about the template parameters. The basic idea is to put the parameter-independent part in a base class, and derive the template from it. Then the template can invoke the shared implementation by calling on the base, and the base class can invoke parameter-dependent code by calling virtual functions that the derived template defines. The KAI implementation of the STL associative containers is an example. The relevant part of its hierarchy looks like this:

    __kai::rb_tree_base                 (red-black tree)
        __kai::rb_tree                  (+ key comparison)
            __kai::map_base             (+ allocator)
                map                     (+ uniqueness)
                multimap                (+ nonuniqueness)
            __kai::set_base             (+ allocator)
                set                     (+ uniqueness)
                multiset                (+ nonuniqueness)

Successively deeper layers in the hierarchy add template parameters. The algorithms have been written so that much code can be pushed up the hierarchy and suffer less duplication. Here is a detailed description of the layering for class map.

__kai::rb_tree_base
This is non-templated code that performs basic algorithms on red-black trees that need only know the tree structure. It has methods for traversing, copying, and destroying red-black trees. Some of these operations indirectly need to know the exact node types involved. When they do, they invoke the virtual functions make_node and destroy_node, which do the type-dependent work. Note that it is hidden in a separate namespace, since it is not to be used by ordinary clients.
__kai::rb_tree
This layer adds three pieces of knowledge: the key type, where the key is in a node, and how to compare keys.
__kai::map_base
This layer adds knowledge about how to create and destroy tree nodes of specific types, and that keys have associated values. This is the layer in which the virtual functions make_node and destroy_node are finally defined.
map
This layer adds the knowledge that the keys are unique.

The layering is a bit complex to implement, but pays back in greatly reduced compilation times and code size. In particular, all the STL associative containers share the same convoluted red-black tree rebalancing algorithm. Since it is not a template, it can be (and is) part of KCC's run-time library in binary form. Furthermore, all STL associative containers for the same type of key (a common situation) share the same code for searching such trees.
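
The same idea can be sketched on a much smaller structure. The code below is not the KAI implementation: it uses a singly linked list instead of a red-black tree, and the names NodeBase, ListBase, List, link_front, and list_demo are hypothetical. Only the roles of make_node and destroy_node follow the description above.

```cpp
#include <cassert>
#include <cstddef>

struct NodeBase {
    NodeBase* next;
    NodeBase() : next(0) {}
};

// Non-template layer: knows only the link structure.
class ListBase {
protected:
    NodeBase* my_head;
    ListBase() : my_head(0) {}
    virtual ~ListBase() {}
    // Type-dependent work is delegated to the derived template.
    virtual NodeBase* make_node(const NodeBase* src) = 0;
    virtual void destroy_node(NodeBase* n) = 0;
    void link_front(NodeBase* n) { n->next = my_head; my_head = n; }
    void clear() {
        while (my_head) {
            NodeBase* n = my_head;
            my_head = n->next;
            destroy_node(n);
        }
    }
    void copy_from(const ListBase& other) {
        clear();
        NodeBase** tail = &my_head;
        for (const NodeBase* p = other.my_head; p; p = p->next) {
            NodeBase* n = make_node(p);   // derived class copies the payload
            n->next = 0;
            *tail = n;
            tail = &n->next;
        }
    }
public:
    std::size_t size() const {
        std::size_t n = 0;
        for (const NodeBase* p = my_head; p; p = p->next) ++n;
        return n;
    }
};

// Template layer: adds the element type and defines the virtual functions.
template<class T> class List : public ListBase {
    struct Node : NodeBase {
        T value;
        explicit Node(const T& v) : value(v) {}
    };
    NodeBase* make_node(const NodeBase* src) {
        return new Node(static_cast<const Node*>(src)->value);
    }
    void destroy_node(NodeBase* n) { delete static_cast<Node*>(n); }
public:
    List() {}
    List(const List& other) : ListBase() { copy_from(other); }
    List& operator=(const List& other) {
        if (this != &other) copy_from(other);
        return *this;
    }
    ~List() { clear(); }
    void push_front(const T& v) { link_front(new Node(v)); }
    const T& front() const {
        assert(my_head != 0);
        return static_cast<const Node*>(my_head)->value;
    }
};

// Copies a two-element list; the copy loop itself runs in ListBase.
int list_demo() {
    List<int> a;
    a.push_front(1);
    a.push_front(2);
    List<int> b(a);
    return b.front() * 10 + static_cast<int>(b.size());
}
```

The traversal, copy, and destroy loops exist once in ListBase, however many element types are used; each List<T> contributes only its node type and the two short virtual functions.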

5.5.2 Example: Sharing via Global Functions

Another way to separate common code from a template is to put it in global functions, and pass parameter-dependent information to the functions. The KAI implementation of std::bitset<N> is an example. The algorithms for the shift operators (<<, >>, <<=, and >>=) could have been coded entirely within the template class. However, this would have led to wasteful duplication of code. So instead, these operators call global functions (__kai::shift_left and __kai::shift_right) that can be shared. Information dependent upon the value of N is passed to these functions.

Of course for small bitsets that fit in a single word, the call would in fact take more code than doing the operation inline. So the KAI implementations of the shift operators check the value of N and use inline code for the one-word case. Since N is a compile-time constant, only the code for the relevant case shows up in the optimized executable.

Inline try blocks and inline throws are particularly expensive. Therefore the KAI implementation of std::bitset<N> factors these operations out into global functions that are neither inline nor templates:

    void throw_invalid_value_in_string();
    void throw_bitset_overflow ();

Subscript checking is another common candidate for factoring out.
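
A minimal sketch of factoring out a subscript check follows. The class FixedArray and the functions throw_subscript_out_of_range and subscript_demo are hypothetical names chosen for illustration; in a library, the throwing function would be defined once in a .C file rather than in a header.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Non-template helper: one copy serves every instance of FixedArray.
// The name is an assumption, modeled on the helpers listed above.
void throw_subscript_out_of_range() {
    throw std::out_of_range("subscript out of range");
}

template<class T, std::size_t N> class FixedArray {
    T my_items[N];
public:
    // The inline part is only a compare and a call; the throw expression
    // is not repeated for each instance.
    T& at(std::size_t i) {
        if (i >= N) throw_subscript_out_of_range();
        return my_items[i];
    }
};

// Returns true if an out-of-range subscript is reported as an exception.
bool subscript_demo() {
    FixedArray<int, 4> a;
    a.at(0) = 7;
    assert(a.at(0) == 7);
    try {
        a.at(4);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Each instance of FixedArray then carries a comparison and a call instead of its own copy of the exception-throwing code.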


5.6 Template Features that are Deprecated or Removed

This section describes template features that are deprecated or removed in KCC 3.3.

5.6.1 Guiding Declarations

Guiding declarations are disallowed in --strict mode.

5.6.2 Option --split Removed

The feature of "splitting" libraries with --split when building libraries has been replaced by the much faster option --one_instantiation_per_object.

5.6.3 Parallel Prelinking is Moot

The option "--parallel_build n" still works in KCC 3.3, but does not parallelize prelink recompilations. The 3.3 prelinking scheme in KCC has a serializing bottleneck. The good news is that the 3.3 algorithm greatly reduces the number of prelinker iterations when compared to the algorithm used by KCC 3.2. The reduction in iterations somewhat makes up for the loss of parallelism during linking. However, there is still a lot of natural parallelism in the prelink work, and future KCC releases will begin to exploit this again.

5.6.4 Other Template Instantiation Modes

The default mode of automatic instantiation works for most programs. The other modes are deprecated. They are of interest only in rare situations when trying to construct cases for bug reports. Here's a summary of the modes.

Automatic (-tnone)
Instantiate templates according to ti_files/*.ii files generated by prelinker. This is the default.
Used (-tused)
Instantiate any templates used by a translation unit. This mode causes duplication of template instantiations when more than one translation unit uses the same template. The linker may give warnings or errors about the duplicates. This is handy if you are trying to generate a small test case for the compiler and need to force instantiation of all templates in the test case.
All (-tall)
Instantiate all templates. This instantiates all member functions of template classes that were used, and instantiates other template functions even if the only reference was a declaration. It is a vestige left over from previous incarnations of KCC. The only reason for documenting it is to tell you to not use it.

5.6.5 Template Instantiation Pragmas

The template instantiation pragmas are deprecated. They are provided strictly for compatibility with old versions of KCC. New codes should use the explicit instantiation syntax defined by the ISO C++ Draft Working Paper.

Pragmas can be used for fine-grain control of template instantiation. For instance, the following pragma forces instantiation of member function Iter<char>::at.

    #pragma instantiate Iter<char>::at

All members in class Iter can be instantiated with:

    #pragma instantiate Iter<char>

It is also possible to prohibit a specific template instantiation as shown below.

    #pragma do_not_instantiate Iter<char>::at

This is useful if you are providing a specific definition (specialization) for a template entity. It is also handy for excluding certain member functions. For example,

    #pragma instantiate Iter<char>
    #pragma do_not_instantiate Iter<char>::operator++

says to instantiate all members of class Iter<char> except operator++.

The argument to the instantiation pragmas can be:

There is also a third kind of pragma that tells the prelinker that it can (but does not have to) instantiate a template entity in an object file. For example,

    #pragma can_instantiate Iter<char>::at

tells the prelinker to consider the current object file as a candidate for the instantiation of method Iter<char>::at.

Restrictions on Pragmas

The argument to an instantiation pragma may not be a compiler-generated function, an inline function, or a pure virtual function. For instance, our definition of class Iter does not define a copy-constructor, and thus it becomes a compiler-generated function. Consequently, "Iter::Iter(const Iter&)" cannot be an argument to an instantiation pragma.

Copyright © 1996-1999. All rights reserved.

This file last updated on 25 March 1999.