Metaprogramming consists of “programming a program.” In other words, we lay out code that the programming system executes to generate new code that implements the functionality we really want. Usually, the term metaprogramming implies a reflexive attribute: The metaprogramming component is part of the program for which it generates a bit of code (i.e., an additional or different bit of the program).
Why would metaprogramming be desirable? As with most other programming techniques, the goal is to achieve more functionality with less effort, where effort can be measured as code size, maintenance cost, and so forth. What characterizes metaprogramming is that some user-defined computation happens at translation time. The underlying motivation is often performance (things computed at translation time can frequently be optimized away) or interface simplicity (a metaprogram is generally shorter than what it expands to) or both.
Metaprogramming often relies on the concepts of traits and type functions, as developed in Chapter 19. We therefore recommend becoming familiar with that chapter prior to delving into this one.
C++ metaprogramming techniques evolved over time (the Afternotes at the end of this chapter survey some milestones in this area). Let’s discuss and categorize the various approaches to metaprogramming that are in common use in modern C++.
In the first edition of this book, we were limited to the features introduced in the original C++ standard (published in 1998, with minor corrections in 2003). In that world, composing simple compile-time (“meta-”) computations was a minor challenge. We therefore devoted a good chunk of this chapter to doing just that; one fairly advanced example computed the square root of an integer value at compile time using recursive template instantiations. As introduced in Section 8.2 on page 125, C++11 and, especially, C++14 removed most of that challenge with the introduction of constexpr
functions.1 For example, since C++14, a compile-time function to compute a square root is easily written as follows:
meta/sqrtconstexpr.hpp
template<typename T>
constexpr T sqrt(T x)
{
  // handle cases where x and its square root are equal as a special case to simplify
  // the iteration criterion for larger x:
  if (x <= 1) {
    return x;
  }
  // repeatedly determine in which half of a [lo, hi] interval the square root of x is located,
  // until the interval is reduced to just one value:
  T lo = 0, hi = x;
  for (;;) {
    auto mid = (hi+lo)/2, midSquared = mid*mid;
    if (lo+1 >= hi || midSquared == x) {
      // mid must be the square root:
      return mid;
    }
    // continue with the higher/lower half-interval:
    if (midSquared < x) {
      lo = mid;
    }
    else {
      hi = mid;
    }
  }
}
This algorithm searches for the answer by repeatedly halving an interval known to contain the square root of x
(the roots of 0 and 1 are treated as special cases to keep the convergence criterion simple). This sqrt()
function can be evaluated at compile or run time:
static_assert(sqrt(25) == 5, ""); //OK (evaluated at compile time)
static_assert(sqrt(40) == 6, ""); //OK (evaluated at compile time)
std::array<int, sqrt(40)+1> arr; //declares array of 7 elements (compile time)
long long l = 53478;
std::cout << sqrt(l) << '\n'; //prints 231 (evaluated at run time)
This function’s implementation may not be the most efficient at run time (where exploiting peculiarities of the machine often pays off), but because it is meant to perform compile-time computations, absolute efficiency is less important than portability. Note that no advanced “template magic” is in sight in that square root example, only the usual template argument deduction for a function template. The code is “plain C++” and is not particularly challenging to read.
Value metaprogramming (i.e., programming the computation of compile-time values) as we did above is occasionally quite useful, but there are two additional kinds of metaprogramming that can be performed with modern C++ (say, C++14 and C++17): type metaprogramming and hybrid metaprogramming.
We already encountered a form of type computation in our discussion of certain traits templates in Chapter 19, which take a type as input and produce a new type from it. For example, our RemoveReferenceT
class template computes the underlying type of a reference type. However, the examples we developed in Chapter 19 computed only fairly elementary type operations. By relying on recursive template instantiation—a mainstay of template-based metaprogramming—we can perform type computations that are considerably more complex.
Consider the following small example:
meta/removeallextents.hpp
// primary template: in general we yield the given type:
template<typename T>
struct RemoveAllExtentsT {
  using Type = T;
};

// partial specializations for array types (with and without bounds):
template<typename T, std::size_t SZ>
struct RemoveAllExtentsT<T[SZ]> {
  using Type = typename RemoveAllExtentsT<T>::Type;
};
template<typename T>
struct RemoveAllExtentsT<T[]> {
  using Type = typename RemoveAllExtentsT<T>::Type;
};

template<typename T>
using RemoveAllExtents = typename RemoveAllExtentsT<T>::Type;
Here, RemoveAllExtents
is a type metafunction (i.e., a computational device that produces a result type) that will remove an arbitrary number of top-level “array layers” from a type.2 You can use it as follows:
RemoveAllExtents<int[]>        // yields int
RemoveAllExtents<int[5][10]>   // yields int
RemoveAllExtents<int[][10]>    // yields int
RemoveAllExtents<int(*)[5]>    // yields int(*)[5]
The metafunction performs its task by having the partial specialization that matches the top-level array case recursively “invoke” the metafunction itself.
Computing with values would be very limited if all that was available to us were scalar values. Fortunately, just about any programming language has at least one container of values construct that greatly magnifies the power of that language (and most languages have a variety of container kinds, such as arrays/vectors, hash tables, etc.). The same is true of type metaprogramming: Adding a “container of types” construct greatly increases the applicability of the discipline. Fortunately, modern C++ includes mechanisms that enable the development of such a container. Chapter 24 develops a Typelist<…>
class template, which is exactly such a container of types, in great detail.
With value metaprogramming and type metaprogramming we can compute values and types at compile time. Ultimately, however, we’re interested in run-time effects, so we use these metaprograms in run time code in places where types and constants are expected. Metaprogramming can do more than that, however: We can programmatically assemble at compile time bits of code with a run-time effect. We call that hybrid metaprogramming.
To illustrate this principle, let’s start with a simple example: computing the dot-product of two std::array
values. Recall that std::array
is a fixed-length container template declared as follows:
namespace std {
  template<typename T, size_t N> struct array;
}
where N
is the number of elements (of type T
) in the array. Given two objects of the same array type, their dot-product can be computed as follows:
template<typename T, std::size_t N>
auto dotProduct(std::array<T, N> const& x, std::array<T, N> const& y)
{
  T result{};
  for (std::size_t k = 0; k<N; ++k) {
    result += x[k]*y[k];
  }
  return result;
}
A straightforward compilation of the for
-loop will produce branching instructions that on some machines may cause some overhead compared to a straight-line execution of
result += x[0]*y[0];
result += x[1]*y[1];
result += x[2]*y[2];
result += x[3]*y[3];
…
Fortunately, modern compilers will optimize the loop into whichever form is most efficient for the target platform. For the sake of discussion, however, let’s rewrite our dotProduct()
implementation in a way that avoids the loop:3
template<typename T, std::size_t N>
struct DotProductT {
  static inline T result(T const* a, T const* b) {
    return *a * *b + DotProductT<T, N-1>::result(a+1, b+1);
  }
};

// partial specialization as end criterion
template<typename T>
struct DotProductT<T, 0> {
  static inline T result(T const*, T const*) {
    return T{};
  }
};

template<typename T, std::size_t N>
auto dotProduct(std::array<T, N> const& x,
                std::array<T, N> const& y)
{
  return DotProductT<T, N>::result(x.begin(), y.begin());
}
This new implementation delegates the work to a class template DotProductT
. That enables us to use recursive template instantiation with class template partial specialization to end the recursion. Note how each instantiation of DotProductT
produces the sum of one term of the dot-product and the dot-product of the remaining components of the array. For values of type std::array<T,N>
there will therefore be N
instances of the primary template and one instance of the terminating partial specialization. For this to be efficient, it is critical that the compiler inlines every invocation of the static member functions result()
. Fortunately, that is generally true when even a moderate level of compiler optimizations is enabled.4
The central observation about this code is that it blends a compile-time computation (achieved here through recursive template instantiation) that determines the overall structure of the code with a run-time computation (calling result()
) that determines the specific run-time effect.
We mentioned earlier that type metaprogramming is greatly enhanced by the availability of a “container of types.” We’ve already seen that in hybrid metaprogramming a fixed-length array type can be useful. Nonetheless, the true “hero container” of hybrid metaprogramming is the tuple. A tuple is a sequence of values, each with a selectable type. The C++ standard library includes a std::tuple
class template that supports that notion. For example,
std::tuple<int, std::string, bool> tVal{42, "Answer", true};
defines a variable tVal
that aggregates three values of types int
, std::string
, and bool
(in that specific order). Because of the tremendous importance of tuple-like containers for modern C++ programming, we develop one in detail in Chapter 25. The type of tVal
above is very similar to a simple struct
type like:
struct MyTriple {
int v1;
std::string v2;
bool v3;
};
Given that in std::array
and std::tuple
we have flexible counterparts to array types and (simple) struct
types, it is natural to wonder whether a counterpart to simple union
types would also be useful for hybrid computation. The answer is “yes.” The C++ standard library introduced a std::variant
template for this purpose in C++17, and we develop a similar component in Chapter 26.
Because std::tuple
and std::variant
, like struct
types, are heterogeneous types, hybrid metaprogramming that uses such types is sometimes called heterogeneous metaprogramming.
Another example demonstrating the power of hybrid computing is libraries that compute with values of different unit types. The value computation is performed at run time, but the resulting units are determined at compile time.
Let’s illustrate this with a highly simplified example. We are going to keep track of units in terms of their ratio (fraction) of a principal unit. For example, if the principal unit for time is a second, a millisecond is represented with ratio 1/1000 and a minute with ratio 60/1. The key, then, is to define a ratio type where each value has its own type:
template<unsigned N, unsigned D = 1>
struct Ratio {
  static constexpr unsigned num = N; // numerator
  static constexpr unsigned den = D; // denominator
  using Type = Ratio<num, den>;
};
Now we can define compile-time computations such as adding two units:
meta/ratioadd.hpp
// implementation of adding two ratios:
template<typename R1, typename R2>
struct RatioAddImpl
{
 private:
  static constexpr unsigned den = R1::den * R2::den;
  static constexpr unsigned num = R1::num * R2::den + R2::num * R1::den;
 public:
  using Type = Ratio<num, den>;
};

// using declaration for convenient usage:
template<typename R1, typename R2>
using RatioAdd = typename RatioAddImpl<R1, R2>::Type;
This allows us to compute the sum of two ratios at compile time:
using R1 = Ratio<1,1000>;
using R2 = Ratio<2,3>;
using RS = RatioAdd<R1,R2>;                     //RS has type Ratio<2003,3000>
std::cout << RS::num << '/' << RS::den << '\n'; //prints 2003/3000
using RA = RatioAdd<Ratio<2,3>,Ratio<5,7>>;     //RA has type Ratio<29,21>
std::cout << RA::num << '/' << RA::den << '\n'; //prints 29/21
We can now define a class template for durations, parameterized with an arbitrary value type and a unit type that is an instance of Ratio<>
:
meta/duration.hpp
// duration type for values of type T with unit type U:
template<typename T, typename U = Ratio<1>>
class Duration {
 public:
  using ValueType = T;
  using UnitType = typename U::Type;
 private:
  ValueType val;
 public:
  constexpr Duration(ValueType v = 0)
   : val(v) {
  }
  constexpr ValueType value() const {
    return val;
  }
};
The interesting part is the definition of an operator+
to add two Duration
s:
meta/durationadd.hpp
// adding two durations where unit type might differ:
template<typename T1, typename U1, typename T2, typename U2>
auto constexpr operator+(Duration<T1, U1> const& lhs,
                         Duration<T2, U2> const& rhs)
{
  // resulting type is a unit with 1 as numerator and
  // the denominator that results from adding both unit type fractions:
  using VT = Ratio<1,RatioAdd<U1,U2>::den>;
  // resulting value is the sum of both values
  // converted to the resulting unit type:
  auto val = lhs.value() * VT::den / U1::den * U1::num +
             rhs.value() * VT::den / U2::den * U2::num;
  return Duration<decltype(val), VT>(val);
}
We allow the arguments to have different unit types, U1
and U2
. And we use these unit types to compute the resulting duration to have a unit type that is the corresponding unit fraction (a fraction where the numerator is 1). With all that in place, we can compile the following code:
int x = 42;
int y = 77;
auto a = Duration<int, Ratio<1,1000>>(x); // x milliseconds
auto b = Duration<int, Ratio<2,3>>(y); // y 2/3 seconds
auto c = a + b; //computes resulting unit type 1/3000 seconds
//and generates run-time code for c = a*3 + b*2000
The key “hybrid” effect is that for the sum c
the compiler determines the resulting unit type Ratio<1,3000>
at compile time and generates code to compute at run time the resulting value, which is adjusted for the resulting unit type.
Because the value type is a template parameter, we can use class Duration
with value types other than int
or even use heterogeneous value types (as long as adding the values of these types is defined):
auto d = Duration<double, Ratio<1,3>>(7.5); // 7.5 1/3 seconds
auto e = Duration<int, Ratio<1>>(4); // 4 seconds
auto f = d + e; //computes resulting unit type 1/3 seconds
// and generates code for f = d + e*3
In addition, the compiler can even perform the value computation at compile-time if the values are known at compile time, because operator+
for durations is constexpr
.
The C++ standard library class template std::chrono::duration
uses this approach with several refinements, such as using predefined units (e.g., std::chrono::milliseconds
), supporting duration literals (e.g., 10ms
), and dealing with overflow.
Previously, we described value metaprogramming based on constexpr
evaluation and type metaprogramming based on recursive template instantiation. These two options, both available in modern C++, clearly involve different methods driving the computation. It turns out that value metaprogramming can also be driven in terms of recursive template instantiation, and, prior to the introduction of constexpr
functions in C++11, that was the mechanism of choice. For example, the following code computes a square root of an integer using recursive instantiation:
meta/sqrt1.hpp
// primary template to compute sqrt(N)
template<int N, int LO=1, int HI=N>
struct Sqrt {
  // compute the midpoint, rounded up
  static constexpr auto mid = (LO+HI+1)/2;
  // search a not too large value in a halved interval
  static constexpr auto value = (N<mid*mid) ? Sqrt<N,LO,mid-1>::value
                                            : Sqrt<N,mid,HI>::value;
};

// partial specialization for the case when LO equals HI
template<int N, int M>
struct Sqrt<N,M,M> {
  static constexpr auto value = M;
};
This metaprogram uses much the same algorithm as our integer square root constexpr
function in Section 23.1.1 on page 529, successively halving an interval known to contain the square root. However, the input to the metafunction is a nontype template argument instead of a function argument, and the “local variables” tracking the bounds to the interval are also recast as nontype template arguments. Clearly, this is a far less friendly approach than the constexpr
function, but we will nevertheless analyze this code later on to examine how it consumes compiler resources.
In any case, we can see that the computational engine of metaprogramming could potentially have many options. That is, however, not the only dimension in which such options may be considered. Instead, we like to think that a comprehensive metaprogramming solution for C++ must make choices along three dimensions:
• Computation
• Reflection
• Generation
Reflection is the ability to programmatically inspect features of the program. Generation refers to the ability to generate additional code for the program.
We have seen two options for computation: recursive instantiation and constexpr
evaluation. For reflection, we have found a partial solution in type traits (see Section 19.6.1 on page 431). Although available traits enable quite a few advanced template techniques, they are far from covering all that is wanted from a reflection facility in the language. For example, given a class type, many applications would like to programmatically explore the members of that class. Our current traits are based on template instantiation, and C++ could conceivably provide additional language facilities or “intrinsic” library components5 to produce class template instances that contain the reflected information at compile time. Such an approach is a good match for computations based on recursive template instantiations. Unfortunately, class template instances consume a lot of compiler storage and that storage cannot be released until the end of the compilation (trying to do otherwise would result in taking significant more compilation time). An alternative option, which is expected to pair well with the constexpr
evaluation option for the “computation” dimension, is to introduce a new standard type to represent reflected information. Section 17.9 on page 363 discusses this option (which is now under active investigation by the C++ standardization committee).
Section 17.9 on page 363 also shows a potential future approach to powerful code generation. Creating a flexible, general, and programmer-friendly code generation mechanism within the existing C++ language remains a challenge that is being investigated by various parties. However, it is also true that instantiating templates has always been a “code generation” mechanism of sorts. In addition, compilers have become reliable enough at expanding calls to small functions in-line that that mechanism can be used as a vehicle for code generation. Those observations are exactly what underlies our DotProductT
example above and, combined with more powerful reflection facilities, existing techniques can already achieve remarkable metaprogramming effects.
Let’s analyze the Sqrt<>
template introduced in Section 23.2 on page 537. The primary template is the general recursive computation that is invoked with the template parameter N
(the value for which to compute the square root) and two other optional parameters. These optional parameters represent the minimum and maximum values the result can have. If the template is called with only one argument, we know that the square root is at least 1 and at most the value itself.
The recursion then proceeds using a binary search technique (often called method of bisection in this context). Inside the template, we compute whether the square root of N lies in the first or the second half of the range between LO and HI. This case differentiation is done using the conditional operator ?:. If mid² is greater than N, we continue the search in the first half; if mid² is less than or equal to N, we use the same template for the second half again.
The partial specialization ends the recursive process when LO
and HI
have the same value M
, which is our final value
.
Template instantiations are not cheap: Even relatively modest class templates can allocate over a kilobyte of storage for every instance, and that storage cannot be reclaimed until compilation has completed. Let’s therefore examine the details of a simple program that uses our Sqrt
template:
meta/sqrt1.cpp
#include <iostream>
#include "sqrt1.hpp"

int main()
{
  std::cout << "Sqrt<16>::value = " << Sqrt<16>::value << '\n';
  std::cout << "Sqrt<25>::value = " << Sqrt<25>::value << '\n';
  std::cout << "Sqrt<42>::value = " << Sqrt<42>::value << '\n';
  std::cout << "Sqrt<1>::value = " << Sqrt<1>::value << '\n';
}
The expression
Sqrt<16>::value
is expanded to
Sqrt<16,1,16>::value
Inside the template, the metaprogram computes Sqrt<16,1,16>::value
as follows:
mid = (1+16+1)/2
= 9
value = (16<9*9) ? Sqrt<16,1,8>::value
: Sqrt<16,9,16>::value
= (16<81) ? Sqrt<16,1,8>::value
: Sqrt<16,9,16>::value
= Sqrt<16,1,8>::value
Thus, the result is computed as Sqrt<16,1,8>::value
, which is expanded as follows:
mid = (1+8+1)/2
= 5
value = (16<5*5) ? Sqrt<16,1,4>::value
: Sqrt<16,5,8>::value
= (16<25) ? Sqrt<16,1,4>::value
: Sqrt<16,5,8>::value
= Sqrt<16,1,4>::value
Similarly, Sqrt<16,1,4>::value
is decomposed as follows:
mid = (1+4+1)/2
= 3
value = (16<3*3) ? Sqrt<16,1,2>::value
: Sqrt<16,3,4>::value
= (16<9) ? Sqrt<16,1,2>::value
: Sqrt<16,3,4>::value
= Sqrt<16,3,4>::value
Finally, Sqrt<16,3,4>::value
results in the following:
mid = (3+4+1)/2
= 4
value = (16<4*4) ? Sqrt<16,3,3>::value
: Sqrt<16,4,4>::value
= (16<16) ? Sqrt<16,3,3>::value
: Sqrt<16,4,4>::value
= Sqrt<16,4,4>::value
and Sqrt<16,4,4>::value
ends the recursive process because it matches the explicit specialization that catches equal high and low bounds. The final result is therefore
value = 4
Our analysis above followed the significant instantiations that compute the square root of 16. However, when a compiler evaluates the expression
(16<9*9) ? Sqrt<16,1,8>::value
         : Sqrt<16,9,16>::value
it instantiates not only the templates in the positive branch but also those in the negative branch (Sqrt<16,9,16>
). Furthermore, because the code attempts to access a member of the resulting class type using the ::
operator, all the members inside that class type are also instantiated. This means that the full instantiation of Sqrt<16,9,16>
results in the full instantiation of Sqrt<16,9,12>
and Sqrt<16,13,16>
. When the whole process is examined in detail, we find that dozens of instantiations end up being generated. The total number is almost twice the value of N
.
Fortunately, there are techniques to reduce this explosion in the number of instantiations. To illustrate one such important technique, we rewrite our Sqrt
metaprogram as follows:
meta/sqrt2.hpp
#include "ifthenelse.hpp"

// primary template for main recursive step
template<int N, int LO=1, int HI=N>
struct Sqrt {
  // compute the midpoint, rounded up
  static constexpr auto mid = (LO+HI+1)/2;
  // search a not too large value in a halved interval
  using SubT = IfThenElse<(N<mid*mid),
                          Sqrt<N,LO,mid-1>,
                          Sqrt<N,mid,HI>>;
  static constexpr auto value = SubT::value;
};

// partial specialization for end of recursion criterion
template<int N, int S>
struct Sqrt<N, S, S> {
  static constexpr auto value = S;
};
The key change here is the use of the IfThenElse
template, which was introduced in Section 19.7.1 on page 440. Remember, the IfThenElse
template is a device that selects between two types based on a given Boolean constant. If the constant is true, the first type is type-aliased to Type; otherwise, Type stands for the second type. At this point it is important to remember that defining a type alias for a class template instance does not cause a C++ compiler to instantiate the body of that instance. Therefore, when we write
using SubT = IfThenElse<(N<mid*mid),
                        Sqrt<N,LO,mid-1>,
                        Sqrt<N,mid,HI>>;
neither Sqrt<N,LO,mid-1>
nor Sqrt<N,mid,HI>
is fully instantiated. Whichever of these two types ends up being a synonym for SubT
is fully instantiated when looking up SubT::value
. In contrast to our first approach, this strategy leads to a number of instantiations that is proportional to log2(N
): a very significant reduction in the cost of metaprogramming when N
gets moderately large.
Our Sqrt<>
example demonstrates that a template metaprogram can contain:
• State variables: The template parameters
• Loop constructs: Through recursion
• Execution path selection: By using conditional expressions or specializations
• Integer arithmetic
If there are no limits to the amount of recursive instantiations and the number of state variables that are allowed, it can be shown that this is sufficient to compute anything that is computable. However, it may not be convenient to do so using templates. Furthermore, because template instantiation requires substantial compiler resources, extensive recursive instantiation quickly slows down a compiler or even exhausts the resources available. The C++ standard recommends but does not mandate that 1024 levels of recursive instantiations be allowed as a minimum, which is sufficient for most (but certainly not all) template metaprogramming tasks.
Hence, in practice, template metaprograms should be used sparingly. There are a few situations, however, when they are irreplaceable as a tool to implement convenient templates. In particular, they can sometimes be hidden in the innards of more conventional templates to squeeze more performance out of critical algorithm implementations.
Consider the following recursive template:
template<typename T, typename U>
struct Doublify {
};

template<int N>
struct Trouble {
  using LongType = Doublify<typename Trouble<N-1>::LongType,
                            typename Trouble<N-1>::LongType>;
};

template<>
struct Trouble<0> {
  using LongType = double;
};

Trouble<10>::LongType ouch;
The use of Trouble<10>::LongType
not only triggers the recursive instantiation of Trouble<9>
, Trouble<8>
, …, Trouble<0>
, but it also instantiates Doublify
over increasingly complex types. Table 23.1 illustrates how quickly it grows.
Type Alias             Underlying Type
Trouble<0>::LongType   double
Trouble<1>::LongType   Doublify<double,double>
Trouble<2>::LongType   Doublify<Doublify<double,double>,
                                Doublify<double,double>>
Trouble<3>::LongType   Doublify<Doublify<Doublify<double,double>,
                                         Doublify<double,double>>,
                                Doublify<Doublify<double,double>,
                                         Doublify<double,double>>>
As can be seen from Table 23.1, the complexity of the type description of the expression Trouble<N>::LongType
grows exponentially with N
. In general, such a situation stresses a C++ compiler even more than do recursive instantiations that do not involve recursive template arguments. One of the problems here is that a compiler keeps a representation of the mangled name for the type. This mangled name encodes the exact template specialization in some way, and early C++ implementations used an encoding that is roughly proportional to the length of the template-id. These compilers then used well over 10,000 characters for Trouble<10>::LongType
.
Newer C++ implementations take into account the fact that nested template-ids are fairly common in modern C++ programs and use clever compression techniques to reduce considerably the growth in name encoding (e.g., a few hundred characters for Trouble<10>::LongType
). These newer compilers also avoid generating a mangled name if none is actually needed because no low-level code is actually generated for the template instance. Still, all other things being equal, it is probably preferable to organize recursive instantiation in such a way that template arguments need not also be nested recursively.
In the early days of C++, enumeration values were the only mechanism to create “true constants” (called constant-expressions) as named members in class declarations. With them, we could, for example, define a Pow3
metaprogram to compute powers of 3 as follows:
meta/pow3enum.hpp
// primary template to compute 3 to the Nth
template<int N>
struct Pow3 {
  enum { value = 3 * Pow3<N-1>::value };
};

// full specialization to end the recursion
template<>
struct Pow3<0> {
  enum { value = 1 };
};
The standardization of C++98 introduced the concept of in-class static constant initializers, so that our Pow3
metaprogram could look as follows:
meta/pow3const.hpp
// primary template to compute 3 to the Nth
template<int N>
struct Pow3 {
  static int const value = 3 * Pow3<N-1>::value;
};

// full specialization to end the recursion
template<>
struct Pow3<0> {
  static int const value = 1;
};
However, there is a drawback with this version: Static constant members are lvalues (see Appendix B). So, if we have a declaration such as
void foo(int const&);
and we pass it the result of a metaprogram:
foo(Pow3<7>::value);
a compiler must pass the address of Pow3<7>::value
, and that forces the compiler to instantiate and allocate the definition for the static member. As a result, the computation is no longer limited to a pure “compile-time” effect.
Enumeration values aren’t lvalues (i.e., they don’t have an address). So, when we pass them by reference, no static memory is used. It’s almost exactly as if we passed the computed value as a literal. The first edition of this book therefore preferred the use of enumerator constants for this kind of application.
C++11, however, introduced constexpr
static data members, and those are not limited to integral types. They do not solve the address issue raised above, but in spite of that shortcoming they are now a common way to produce results of metaprograms. They have the advantage of having a correct type (as opposed to an artificial enum type), and that type can be deduced when the static member is declared with the auto
type specifier. C++17 added inline
static data members, which do solve the address issue raised above, and can be used in conjunction with constexpr
.
The earliest documented example of a metaprogram was by Erwin Unruh, then representing Siemens on the C++ standardization committee. He noted the computational completeness of the template instantiation process and demonstrated his point by developing the first metaprogram. He used the Metaware compiler and coaxed it into issuing error messages that would contain successive prime numbers. Here is the code that was circulated at a C++ committee meeting in 1994 (modified so that it now compiles on standard conforming compilers):6
meta/unruh.cpp
// prime number computation
// (modified from original from 1994 by Erwin Unruh)
template<int p, int i>
struct is_prime {
  enum { pri = (p==2) || ((p%i) && is_prime<(i>2?p:0),i-1>::pri) };
};

template<>
struct is_prime<0,0> {
  enum { pri = 1 };
};

template<>
struct is_prime<0,1> {
  enum { pri = 1 };
};

template<int i>
struct D {
  D(void*);
};

template<int i>
struct CondNull {
  static int const value = i;
};

template<>
struct CondNull<0> {
  static void* value;
};
void* CondNull<0>::value = 0;

template<int i>
struct Prime_print {      // primary template for loop to print prime numbers
  Prime_print<i-1> a;
  enum { pri = is_prime<i,i-1>::pri };
  void f() {
    D<i> d = CondNull<pri ? 1 : 0>::value; // 1 is an error, 0 is fine
    a.f();
  }
};

template<>
struct Prime_print<1> {   // full specialization to end the loop
  enum { pri = 0 };
  void f() {
    D<1> d = 0;
  };
};
#ifndef LAST
#define LAST 18
#endif
int main()
{
Prime_print<LAST> a;
a.f();
}
If you compile this program, the compiler will print error messages when, in Prime_print::f()
, the initialization of d
fails. This happens when the initial value is 1
because there is only a constructor for void*
, and only 0
has a valid conversion to void*
. For example, on one compiler, we get (among several other messages) the following errors:7
unruh.cpp:39:14: error: no viable conversion from 'const int' to 'D<17>'
unruh.cpp:39:14: error: no viable conversion from 'const int' to 'D<13>'
unruh.cpp:39:14: error: no viable conversion from 'const int' to 'D<11>'
unruh.cpp:39:14: error: no viable conversion from 'const int' to 'D<7>'
unruh.cpp:39:14: error: no viable conversion from 'const int' to 'D<5>'
unruh.cpp:39:14: error: no viable conversion from 'const int' to 'D<3>'
unruh.cpp:39:14: error: no viable conversion from 'const int' to 'D<2>'
The concept of C++ template metaprogramming as a serious programming tool was first made popular (and somewhat formalized) by Todd Veldhuizen in his paper Using C++ Template Metaprograms (see [VeldhuizenMeta95]). Todd’s work on Blitz++ (a numeric array library for C++, see [Blitz++]) also introduced many refinements and extensions to metaprogramming (and to expression template techniques, introduced in Chapter 27).
Both the first edition of this book and Andrei Alexandrescu’s “Modern C++ Design” (see [AlexandrescuDesign]) contributed to an explosion of C++ libraries exploiting template-based metaprogramming by cataloging some of the basic techniques that are still in use today. The Boost project (see [Boost]) was instrumental in bringing order to this explosion. Early on, it introduced the MPL (meta-programming library), which defined a consistent framework for type metaprogramming made popular also through Abrahams and Gurtovoy’s book “C++ Template Metaprogramming” (see [Boost-MPL]).
Additional important advances have been made by Louis Dionne in making metaprogramming syntactically more accessible, particularly through his Boost.Hana library (see [BoostHana]). Louis, along with Andrew Sutton, Herb Sutter, David Vandevoorde, and others are now spearheading efforts in the standardization committee to give metaprogramming first-class support in the language. An important basis for that work is the exploration of what program properties should be available through reflection; Matúš Chochlík, Axel Naumann, and David Sankel are principal contributors in that area.
In [BartonNackman] John J. Barton and Lee R. Nackman illustrated how to keep track of dimensional units when performing computations. The SIunits library was a more comprehensive library for dealing with physical units developed by Walter Brown ([BrownSIunits]). The std::chrono
component in the standard library, which we used as an inspiration for Section 23.1.4 on page 534, only deals with time and dates, and was contributed by Howard Hinnant.
1 The C++11 constexpr
capabilities were sufficient to solve many common challenges, but the programming model was not always pleasant (e.g., no loop statements were available, so iterative computation had to exploit recursive function calls; see Section 23.2 on page 537). C++14 enabled loop statements and various other constructs.
2 The C++ standard library provides a corresponding type trait std::remove_all_extents
. See Section D.4 on page 730 for details.
3 This is known as loop unrolling. We generally recommend against explicitly unrolling loops in portable code since the details that determine the best unrolling strategy depend highly on the target platform and the loop body; the compiler usually does a much better job of taking those considerations into account.
4 We specify the inline
keyword explicitly here because some compilers (notably Clang) take this as hint to try a little harder to inline calls. From a language point of view, these functions are implicitly inline
because they are defined in the body of their enclosing class.
5 Some of the traits provided in the C++ standard library already rely on some cooperation from the compiler (via nonstandard “intrinsic” operators). See Section 19.10 on page 461.
6 Thanks to Erwin Unruh for providing the code for this book. You can find the original example at [UnruhPrimeOrig].
7 As error handling in compilers differs, some compilers might stop after printing the first error message.