2.3.2 External Templates

What Was the Problem

External templates are one of my favorite features from C++11.

Before C++11, when we wrote template code such as a function template, every compilation unit that included the code and used it made the compiler instantiate all the symbols it used. Consider for example the header below:

// factorial-98.hpp 
#pragma once 
 
template<typename T> 
T factorial(T n) 
{ 
  T r = 1; 
 
  for (T i = 1; i <= n; ++i) 
    r *= i;
 
  return r; 
}

If this header is included in two .cpp files, and its function is actually called in both, then its compiled code is present in the object file of each .cpp; something we can check with nm.

Following the example, let’s compile the two files below:

// foo-98.cpp 
#include "factorial-98.hpp" 
 
int foo(int i) 
{ 
  return factorial(i); 
}

// bar-98.cpp 
#include "factorial-98.hpp" 
 
int bar(int i) 
{ 
  return factorial(i); 
}

Then nm would print:

$ nm bar-98.cpp.o foo-98.cpp.o 
 
bar-98.cpp.o: 
000000000000001b T bar(int) 
0000000000000036 W int factorial<int>(int) 
 
foo-98.cpp.o: 
000000000000001b T foo(int) 
0000000000000036 W int factorial<int>(int)

Not only does this consume space (54 bytes per file here), it also costs the compiler and the linker a lot of work. All of this adds up, even for medium-sized projects. For each function or class template, the compiler parses the code, compiles it, and writes all these bytes to disk; the linker then reads all these bytes back and sorts through the duplicate symbols to keep only one.

At the end of this process, all the work the compiler spent on the discarded copies has been useless. Wouldn’t it have been better not to do it in the first place? The linker then spent more time removing the crap than actually linking, and the programmer was left wondering whether to get a more powerful computer. Again.

How the Problem is Solved

Thankfully the extern template from C++11 allows us to skip all this useless work. First we have to tell the compiler not to instantiate the template for a subset of valid types by adding a single line to our header:

// factorial-11.hpp 
#pragma once 
 
template<typename T> 
T factorial(T n) 
{ 
  T r = 1; 
 
  for (T i = 1; i <= n; ++i) 
    r *= i;
 
  return r; 
}

extern template int factorial<int>(int);

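This line tells the compiler not to implicitly instantiate factorial() when T = int in the files that include the header; the compiled code must then come from somewhere else. So we add a .cpp file where we explicitly ask for the instantiation: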

// factorial-11.cpp 
#include "factorial-11.hpp" 
 
template int factorial<int>(int);

That’s it. Let’s check with nm:

$ nm bar-11.cpp.o foo-11.cpp.o factorial-11.cpp.o 
 
bar-11.cpp.o: 
000000000000001b T bar(int) 
 
foo-11.cpp.o: 
000000000000001b T foo(int) 
 
factorial-11.cpp.o: 
0000000000000036 W int factorial<int>(int)

The template function is indeed compiled in a single file, and absent from the others [2].
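
The same mechanism applies to class templates. The listing below is only a sketch under a few assumptions: small_stack and the file names in its comments are hypothetical and do not come from the example above, and the three files are merged into a single listing so that it compiles on its own. The member functions are defined outside the class body on purpose, so that they are not implicitly inline.

```cpp
#include <cassert>

// small_stack.hpp (hypothetical header)
template<typename T>
class small_stack
{
public:
  small_stack();

  void push(T v);
  T pop();
  int size() const;

private:
  static const int capacity = 8;

  T m_values[capacity];
  int m_size;
};

template<typename T>
small_stack<T>::small_stack()
  : m_size(0)
{}

template<typename T>
void small_stack<T>::push(T v)
{
  assert(m_size < capacity);
  m_values[m_size] = v;
  ++m_size;
}

template<typename T>
T small_stack<T>::pop()
{
  assert(m_size > 0);
  --m_size;
  return m_values[m_size];
}

template<typename T>
int small_stack<T>::size() const
{
  return m_size;
}

// Last line of the header: files that include it should not implicitly
// instantiate the members of small_stack<int>.
extern template class small_stack<int>;

// small_stack.cpp (hypothetical implementation file): the members of
// small_stack<int> are instantiated here.
template class small_stack<int>;

// client.cpp (hypothetical client code)
int sum_of_pushed(int a, int b)
{
  small_stack<int> s;
  s.push(a);
  s.push(b);
  return s.pop() + s.pop();
}
```

In a real project each part would live in its own file, and running nm on the object files should show the small_stack<int> members only in the object file built from the implementation file.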

Guideline

If you are writing template code for which you know some or all instantiations, then add an extern template line for them in the header, and explicitly instantiate them in an implementation file. This is not always feasible, but when it can be done it should be done.
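
As a sketch of the "some instantiations" case, suppose (hypothetically) that we know our project calls factorial() with int and long most of the time. The file names in the comments and the three wrapper functions are assumptions made for this illustration, and everything is merged into one listing so that it compiles on its own. Types that are not listed, such as long long here, still work through the usual implicit instantiation.

```cpp
#include <cassert>

// factorial-multi.hpp (hypothetical header)
template<typename T>
T factorial(T n)
{
  assert(n >= 0);

  T r = 1;

  for (T i = 1; i <= n; ++i)
    r *= i;

  return r;
}

// Known instantiations: declared here, compiled in one implementation file.
extern template int factorial<int>(int);
extern template long factorial<long>(long);

// factorial-multi.cpp (hypothetical implementation file)
template int factorial<int>(int);
template long factorial<long>(long);

// client.cpp (hypothetical client code)

// These two rely on the explicit instantiations above.
int factorial_int(int n)
{
  return factorial(n);
}

long factorial_long(long n)
{
  return factorial(n);
}

// No extern template line exists for long long, so this call is
// implicitly instantiated in the file that makes it, as before.
long long factorial_long_long(long long n)
{
  return factorial(n);
}
```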

[2] We can push even further in this case, by having the whole body of factorial() in the .cpp file and only its signature in the header. It would not only avoid the parsing of the code but, more importantly, allow us to remove from the header all include directives related to implementation details. Note that in this case the extern template line can even be omitted: without the body, the compiler cannot implicitly instantiate the template anyway.
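
A minimal sketch of this variant could look as follows. The file names in the comments are hypothetical, and the three files are merged into one listing so that it compiles on its own. Note how the include directive needed by the body moves out of the header along with the body itself.

```cpp
// factorial-decl.hpp (hypothetical header): signature only, no body and no
// include directive.
template<typename T>
T factorial(T n);

// client.cpp (hypothetical client code): in a real client file only the
// declaration above is visible, so the call is left as a reference for the
// linker to resolve.
int foo(int i)
{
  return factorial(i);
}

// factorial-decl.cpp (hypothetical implementation file): the include is an
// implementation detail and stays here.
#include <cassert>

template<typename T>
T factorial(T n)
{
  assert(n >= 0);

  T r = 1;

  for (T i = 1; i <= n; ++i)
    r *= i;

  return r;
}

// The explicit instantiation makes factorial<int> available to clients.
template int factorial<int>(int);
```

The price of this layout is that clients can only use the types instantiated in the implementation file; calling factorial() with any other type would fail at link time.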