Chapter 8. The Type Erasure Design Pattern

Separation of concerns and value semantics are two of the essential takeaways from this book that I have mentioned a couple of times by now. In this chapter, these two are beautifully combined into one of the most interesting modern C++ design patterns: Type Erasure. Since this pattern can be considered one of the hottest irons in the fire, in this chapter I will give you a very thorough, in-depth introduction to all aspects of Type Erasure. This, of course, includes all design-specific aspects and a lot of specifics about implementation details.

In “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure”, I will introduce you to Type Erasure and give you an idea why this design pattern is such a great combination of dependency reduction and value semantics. I will also give you a walkthrough of a basic, owning Type Erasure implementation.

“Guideline 33: Be Aware of the Optimization Potential of Type Erasure” is an exception: despite the fact that in this book I primarily focus on dependencies and design aspects, in this one guideline I will entirely focus on performance-related implementation details. I will show you how to apply the Small Buffer Optimization (SBO) and how to implement a manual virtual dispatch to speed up your Type Erasure implementation.

In “Guideline 34: Be Aware of the Setup Costs of Owning Type Erasure Wrappers”, we will investigate the setup costs of the owning Type Erasure implementation. We will find that there is a cost associated with value semantics that sometimes we may not be willing to pay. For this reason, we dare to take a step into the realm of reference semantics and implement a form of nonowning Type Erasure.

Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure

There are a couple of recurring pieces of advice throughout this book:

Minimize dependencies.
Separate concerns.
Prefer composition to inheritance.
Prefer nonintrusive solutions.
Prefer value semantics over reference semantics.

Used on their own, all of these have very positive effects on the quality of your code. In combination, however, these guidelines prove to be so much better. This is what you have experienced in our discussion about the External Polymorphism design pattern in “Guideline 31: Use External Polymorphism for Nonintrusive Runtime Polymorphism”. Extracting the polymorphic behavior turned out to be extremely powerful and unlocked an unprecedented level of loose coupling. Still, probably disappointingly, the demonstrated implementation of External Polymorphism did not strike you as a very modern way of solving things. Instead of following the advice to prefer value semantics, the implementation was firmly built on reference semantics: many pointers, many manual allocations, and manual lifetime management.¹ Hence, the missing detail you’re waiting for is a value semantics–based implementation of the External Polymorphism design pattern. And I will not keep you waiting anymore: the resulting solution is commonly called Type Erasure.²

The History of Type Erasure

Before I give you a detailed introduction, let’s quickly talk about the history of Type Erasure. “Come on,” you argue. “Is this really necessary? I’m dying to finally see how this stuff works.” Well, I promise to keep it short. But yes, I feel this is a necessary detail of this discussion for two reasons. First, to demonstrate that we as a community, aside from the circle of the most experienced C++ experts, may have overlooked and ignored this technique for too long. And second, to give some well-deserved credit to the inventor of the technique.

The Type Erasure design pattern is very often attributed to one of the first and therefore most famous presentations of this technique. At the GoingNative 2013 conference, Sean Parent gave a talk called “Inheritance Is the Base Class of Evil.”³ In this talk, he recapped his experiences with the development of Photoshop and talked about the dangers and disadvantages of inheritance-based implementations. However, he also presented a solution to the inheritance problem, which later came to be known as Type Erasure.

Despite Sean’s talk being one of the first recorded, and for that reason probably the most well-known resource about Type Erasure, the technique was used long before that. For instance, Type Erasure was used in several places in the Boost libraries, for example, by Douglas Gregor for boost::function. Still, to the best of my knowledge, the technique was first discussed in a paper by Kevlin Henney in the July-August 2000 edition of the C++ Report.⁴ In this paper, Kevlin demonstrated Type Erasure with a code example that later evolved into what we today know as C++17’s std::any. Most importantly, he was the first to elegantly combine several design patterns to form a value semantics–based implementation around a collection of unrelated, nonpolymorphic types.

Since then, a lot of common types have acquired the technique to provide value types for various applications. Some of these types have even found their way into the Standard Library. For instance, we have already seen std::function, which represents a value-based abstraction of a callable.⁵ I’ve already mentioned std::any, which represents an abstract container-like value for virtually anything (hence the name) but without exposing any functionality:

#include <any>
#include <cstdlib>
#include <string>
using namespace std::string_literals;

int main()
{
   std::any a;          // Creating an empty 'any'
   a = 1;               // Storing an 'int' inside the 'any'
   a = "some string"s;  // Replacing the 'int' with a 'std::string'

   // There is nothing we can do with the 'any' except for getting the value back
   std::string s = std::any_cast<std::string>( a );

   return EXIT_SUCCESS;
}

And then there is std::shared_ptr, which uses Type Erasure to store the assigned deleter:

#include <cstdlib>
#include <memory>

int main()
{
   {
      // Creating a 'std::shared_ptr' with a custom deleter
      //   Note that the deleter is not part of the type!
      std::shared_ptr<int> s{ new int{42}, [](int* ptr){ delete ptr; } };
   }
   // The 'std::shared_ptr' is destroyed at the end of the scope,
   //   deleting the 'int' by means of the custom deleter.

   return EXIT_SUCCESS;
}

“It appears to be simpler to just provide a second template parameter for the deleter as std::unique_ptr does. Why isn’t std::shared_ptr implemented in the same way?” you inquire. Well, the designs of std::shared_ptr and std::unique_ptr are different for very good reasons. The philosophy of std::unique_ptr is to represent nothing but the simplest possible wrapper around a raw pointer: it should be as fast as a raw pointer, and it should have the same size as a raw pointer. For that reason, it is not desirable to store the deleter alongside the managed pointer. Consequently, std::unique_ptr is designed such that for stateless deleters, any size overhead can be avoided. However, unfortunately, this second template parameter is easily overlooked and causes artificial restrictions:

// This function takes only unique_ptrs that use the default deleter,
//   and thus is artificially restricted
template< typename T >
void func1( std::unique_ptr<T> ptr );

// This function does not care about the way the resource is cleaned up,
//   and thus is truly generic
template< typename T, typename D >
void func2( std::unique_ptr<T,D> ptr );

This kind of coupling is avoided in the design of std::shared_ptr. Since std::shared_ptr has to store many more data items in its so-called control block (that includes the reference count, the weak count, etc.), it has the opportunity to use Type Erasure to literally erase the type of the deleter, removing any kind of possible dependency.
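The difference between the two smart pointers can be made visible with a few type checks. The following is only a minimal sketch: the deleter types FirstDeleter and SecondDeleter and the function countSharedPtrDeleterCalls() are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Two hypothetical, stateless deleters (assumptions for this sketch)
struct FirstDeleter  { void operator()( int* ptr ) const { delete ptr; } };
struct SecondDeleter { void operator()( int* ptr ) const { delete ptr; } };

// For 'std::unique_ptr', the deleter is part of the type:
//   different deleters result in different, incompatible types
static_assert( !std::is_same_v< std::unique_ptr<int,FirstDeleter>
                              , std::unique_ptr<int,SecondDeleter> > );

// Returns how often the lambda deleter has been called
inline int countSharedPtrDeleterCalls()
{
   int calls{ 0 };
   {
      // For 'std::shared_ptr', the deleter type is erased: both pointers have
      //   the same type, although they use different deleters
      std::shared_ptr<int> s1{ new int{1}, [&calls]( int* ptr ){ ++calls; delete ptr; } };
      std::shared_ptr<int> s2{ new int{2}, FirstDeleter{} };
      static_assert( std::is_same_v< decltype(s1), decltype(s2) > );

      // The assignment compiles because the types match; the 'int' formerly
      //   owned by 's1' is released by means of the lambda deleter
      s1 = s2;
      assert( calls == 1 );
   }
   // The shared 'int{2}' has been released by 'FirstDeleter', not by the lambda
   return calls;
}
```

A function taking a std::shared_ptr<int> therefore accepts both pointers, whereas a function taking a std::unique_ptr<int,FirstDeleter> would reject a std::unique_ptr<int,SecondDeleter>.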

The Type Erasure Design Pattern Explained

“Wow, that truly sounds intriguing. This makes me even more excited to learn about Type Erasure.” OK then, here we go. However, please don’t expect any magic or revolutionary new ideas. Type Erasure is nothing but a compound design pattern, meaning that it is a very clever and elegant combination of three other design patterns. The three design patterns of choice are External Polymorphism (the key ingredient for achieving the decoupling effect and the nonintrusive nature of Type Erasure; see “Guideline 31: Use External Polymorphism for Nonintrusive Runtime Polymorphism”), Bridge (the key to creating a value semantics–based implementation; see “Guideline 28: Build Bridges to Remove Physical Dependencies”), and (optionally) Prototype (required to deal with the copy semantics of the resulting values; see “Guideline 30: Apply Prototype for Abstract Copy Operations”). These three design patterns form the core of Type Erasure, but of course, keep in mind that different interpretations and implementations exist, mainly to adapt to specific contexts. The point of combining these three design patterns is to create a wrapper type, which represents a loosely coupled, nonintrusive abstraction.

The Type Erasure Compound Design Pattern

Intent: “Provide a value-based, non-intrusive abstraction for an extendable set of unrelated, potentially non-polymorphic types with the same semantic behavior.”

The purpose of this formulation is to be as short as possible, and as precise as necessary. However, every detail of this intent carries meaning. Thus, it may be helpful to elaborate:

Value-based: The intent of Type Erasure is to create value types that may be copyable, movable, and most importantly, easily reasoned about. However, such a value type is not of the same quality as a regular value type; there are some limitations. In particular, Type Erasure works best for unary operations but has its limits for binary operations.
Nonintrusive: The intent of Type Erasure is to create an external, nonintrusive abstraction based on the example set by the External Polymorphism design pattern. All types providing the behavior expected by the abstraction are automatically supported, without the need to apply any modifications to them.

Extendable, unrelated set of types: Type Erasure is firmly based on object-oriented principles, i.e., it enables you to add types easily. These types, though, should not be connected in any way. They do not have to share common behavior via some base class. Instead, it should be possible to add any fitting type, without any intrusive measure, to this set of types.
Potentially nonpolymorphic: As demonstrated with the External Polymorphism design pattern, types should not have to buy into the set by inheritance. They should also not have to provide virtual functionality on their own, but they should be decoupled from their polymorphic behavior. However, types with base classes or virtual functions are not excluded.
Same semantic behavior: The goal is not to provide an abstraction for all possible types but to provide a semantic abstraction for a set of types that provide the same operations (including same syntax) and adhere to some expected behavior, according to the LSP (see “Guideline 6: Adhere to the Expected Behavior of Abstractions”). If possible, for any type that does not provide the expected functionality, a compile-time error should be created.

With this formulation of the intent in mind, let’s take a look at the dependency graph of Type Erasure (see Figure 8-1). The graph should look very familiar, as the structure of the pattern is dominated by the inherent structure of the External Polymorphism design pattern (see Figure 7-8). The most important difference and addition is the Shape class on the highest level of the architecture. This class serves as a wrapper around the external hierarchy introduced by External Polymorphism. Primarily, since this external hierarchy will not be used directly anymore, but also to reflect the fact that ShapeModel is storing, or “owning,” a concrete type, the name of the class template has been adapted to OwningShapeModel.

Figure 8-1. The dependency graph for the Type Erasure design pattern.

An Owning Type Erasure Implementation

OK, but now, with the structure of Type Erasure in mind, let’s take a look at its implementation details. Still, despite the fact that you’ve seen all the ingredients in action before, the implementation details are not particularly beginner-friendly and are not for the fainthearted. And that is despite the fact that I have picked the simplest Type Erasure implementation I’m aware of. Therefore, I will try to keep everything at a reasonable level and not stray too much into the realm of implementation details. Among other things, this means that I won’t try to squeeze out every tiny bit of performance. For instance, I won’t use forwarding references or avoid dynamic memory allocations. Also, I will favor readability and code clarity. While this may be a disappointment to you, I believe that will save us a lot of headache. However, if you want to dig deeper into the implementation details and optimization options, I recommend taking a look at “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”.

We again start with the Circle and Square classes:

//---- <Circle.h> ----------------

class Circle
{
 public:
   explicit Circle( double radius )
      : radius_( radius )
   {}

   double radius() const { return radius_; }
   /* Several more getters and circle-specific utility functions */

 private:
   double radius_;
   /* Several more data members */
};


//---- <Square.h> ----------------

class Square
{
 public:
   explicit Square( double side )
      : side_( side )
   {}

   double side() const { return side_; }
   /* Several more getters and square-specific utility functions */

 private:
   double side_;
   /* Several more data members */
};

These two classes have not changed since we last encountered them in the discussion of External Polymorphism. But it still pays off to stress again that these two are completely unrelated, do not know about each other, and, most importantly, are nonpolymorphic, meaning that they do not inherit from any base class or introduce virtual functions of their own.

We have also seen the ShapeConcept and OwningShapeModel classes before, the latter under the name ShapeModel:

//---- <Shape.h> ----------------

#include <memory>
#include <utility>

namespace detail {

class ShapeConcept  
{
 public:
   virtual ~ShapeConcept() = default;
   virtual void draw() const = 0;  
   virtual std::unique_ptr<ShapeConcept> clone() const = 0;  
};

template< typename ShapeT
        , typename DrawStrategy >
class OwningShapeModel : public ShapeConcept  
{
 public:
   explicit OwningShapeModel( ShapeT shape, DrawStrategy drawer )  
      : shape_{ std::move(shape) }
      , drawer_{ std::move(drawer) }
   {}

   void draw() const override { drawer_(shape_); }  

   std::unique_ptr<ShapeConcept> clone() const override
   {
      return std::make_unique<OwningShapeModel>( *this );  
   }

 private:
   ShapeT shape_;  
   DrawStrategy drawer_;  
};

} // namespace detail

Besides the name change, there are a couple of other, important differences. For instance, both classes have been moved to the detail namespace. The name of the namespace indicates that these two classes are now becoming implementation details, i.e., they are not intended for direct use anymore.⁶ The ShapeConcept class still introduces the pure virtual function draw() to represent the requirement for drawing a shape. In addition, ShapeConcept now also introduces a pure virtual clone() function. “I know what this is, this is the Prototype design pattern!” you exclaim. Yes, correct. The name clone() is very strongly connected to Prototype and is a strong indication of this design pattern (but not a guarantee). However, although the choice of the function name is very reasonable and canonical, allow me to point out explicitly that the choice of the function name for clone(), and also for draw(), is our own: these names are now implementation details and do not have any relationship to the names that we require from our ShapeT types. We could as well name them do_draw() and do_clone(), and it would not have any consequence on the ShapeT types. The real requirement on the ShapeT types is defined by the implementation of the draw() and clone() functions.

As ShapeConcept is again the base class for the external hierarchy, the draw() function, the clone() function, and the destructor represent the set of requirements for all kinds of shapes. This means that all shapes must provide some drawing behavior, must be copyable, and must be destructible. Note that these three functions are only requirement choices for this example. In particular, copyability is not a general requirement for all implementations of Type Erasure.
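These requirements are checked only implicitly, at the point where the model class template is instantiated. As a minimal sketch of how they could be spelled out explicitly, the following uses standard type traits. The variable template satisfiesShapeRequirements and the types CircleDrawer and NotDrawable are hypothetical names that are not part of the implementation in this chapter.

```cpp
#include <type_traits>

class Circle
{
 public:
   explicit Circle( double radius ) : radius_( radius ) {}
   double radius() const { return radius_; }
 private:
   double radius_;
};

// Hypothetical helper: true if 'ShapeT' and 'DrawStrategy' together fulfill
//   the three requirements chosen for this example
template< typename ShapeT, typename DrawStrategy >
inline constexpr bool satisfiesShapeRequirements =
   std::is_invocable_v<DrawStrategy const&, ShapeT const&> &&  // needed for draw()
   std::is_copy_constructible_v<ShapeT> &&                     // needed for clone()
   std::is_copy_constructible_v<DrawStrategy> &&               // needed for clone()
   std::is_destructible_v<ShapeT>;                             // needed for the destructor

// Hypothetical types for demonstration purposes
struct CircleDrawer { void operator()( Circle const& ) const {} };
struct NotDrawable {};

static_assert(  satisfiesShapeRequirements<Circle,CircleDrawer> );
static_assert( !satisfiesShapeRequirements<NotDrawable,CircleDrawer> );
```

A static_assert on such a trait, placed in the constructor of the model, would be one possible way to turn a long template error into a single, readable compile-time message.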

The OwningShapeModel class again represents the one and only implementation of the ShapeConcept class. As before, OwningShapeModel takes a concrete shape type and a drawing Strategy in its constructor and uses these to initialize its two data members. Since OwningShapeModel inherits from ShapeConcept, it must implement the two pure virtual functions. The draw() function is implemented by applying the given drawing Strategy, while the clone() function is implemented to return an exact copy of the corresponding OwningShapeModel.

Note

If you’re right now thinking, “Oh no, std::make_unique(). That means dynamic memory. Then I can’t use that in my code!”—don’t worry. std::make_unique() is merely an implementation detail, a choice to keep the example simple. In “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”, you will see how to avoid dynamic memory with the SBO.

“I’m pretty unimpressed so far. We’ve barely moved beyond the implementation of the External Polymorphism design pattern.” I completely understand the criticism. However, we are just one step away from turning External Polymorphism into Type Erasure, just one step away from switching from reference semantics to value semantics. All we need is a value type, a wrapper around the external hierarchy introduced by ShapeConcept and OwningShapeModel, that handles all the details that we don’t want to perform manually: the instantiation of the OwningShapeModel class template, managing pointers, performing allocations, and dealing with lifetime. This wrapper is given in the form of the Shape class:

//---- <Shape.h> ----------------

// ...

class Shape
{
 public:
   template< typename ShapeT
           , typename DrawStrategy >
   Shape( ShapeT shape, DrawStrategy drawer )  
   {
      using Model = detail::OwningShapeModel<ShapeT,DrawStrategy>;  
      pimpl_ = std::make_unique<Model>( std::move(shape)  
                                      , std::move(drawer) );
   }

   // ...

 private:
   // ...

   std::unique_ptr<detail::ShapeConcept> pimpl_;  
};

The first, and perhaps most important, detail about the Shape class is the templated constructor. As the first argument, this constructor takes any kind of shape (called ShapeT), and as the second argument, the desired DrawStrategy. To simplify the instantiation of the corresponding detail::OwningShapeModel class template, it proves to be helpful to use a convenient type alias. This alias is used to instantiate the required model by std::make_unique(). Both the shape and the drawing Strategy are passed to the new model.

The newly created model is used to initialize the one data member of the Shape class: the pimpl_. “I recognize this one, too; this is a Bridge!” you happily announce. Yes, correct again. This is an application of the Bridge design pattern. In the construction, we create a concrete OwningShapeModel based on the actual given types ShapeT and DrawStrategy, but we store it as a pointer to ShapeConcept. By doing this, we create a Bridge to the implementation details, a Bridge to the real shape type. However, after the initialization of pimpl_, after the constructor is finished, Shape doesn’t remember the actual type. Shape does not have a template parameter or any member function that would reveal the concrete type it stores, and there is no data member that remembers the given type. All it holds is a pointer to the ShapeConcept base class. Thus, its memory of the real shape type has been erased. Hence the name of the design pattern: Type Erasure.

The only thing missing in our Shape class is the functionality required for a true value type: the copy and move operations. Luckily, due to the application of std::unique_ptr, our effort is pretty limited. Since the compiler-generated destructor and the two move operations will work, we only need to deal with the two copy operations:

//---- <Shape.h> ----------------

// ...

class Shape
{
 public:
   // ...

   Shape( Shape const& other )  
      : pimpl_( other.pimpl_->clone() )
   {}

   Shape& operator=( Shape const& other )  
   {
      // Copy-and-Swap Idiom
      Shape copy( other );
      pimpl_.swap( copy.pimpl_ );
      return *this;
   }

   ~Shape() = default;
   Shape( Shape&& ) = default;
   Shape& operator=( Shape&& ) = default;

 private:
   friend void draw( Shape const& shape )  
   {
      shape.pimpl_->draw();
   }

   // ...
};

The copy constructor could be a very difficult function to implement, since we do not know the concrete type of shape stored in the other Shape. However, by providing the clone() function in the ShapeConcept base class, we can ask for an exact copy without needing to know anything about the concrete type. The shortest, most painless, and most convenient way to implement the copy assignment operator is to build on the Copy-and-Swap idiom.

In addition, the Shape class provides a so-called hidden friend called draw(). This friend function is called a hidden friend, since although it’s a free function, it is defined within the body of the Shape class. As a friend, it’s granted full access to the private data member. It becomes a member of the surrounding namespace, but it can only be found by argument-dependent lookup (ADL).

“Didn’t you say that friends are bad?” you ask. I admit, that’s what I said in “Guideline 4: Design for Testability”. However, I also explicitly stated that hidden friends are OK. In this case, the draw() function is an integral part of the Shape class and definitely a real friend (almost part of the family). “But then it should be a member function, right?” you argue. Indeed, that would be a valid alternative. If you like this better, go for it. In this case, my preference is to use a free function, since one of our goals was to reduce dependencies by extracting the draw() operation. This goal should also be reflected in the Shape implementation. However, since the function requires access to the pimpl_ data member, and in order to not increase the overload set of draw() functions, I implement it as a hidden friend.

This is it. All of it. Let’s take a look at how beautifully the new functionality works:

//---- <Main.cpp> ----------------

#include <Circle.h>
#include <Square.h>
#include <Shape.h>
#include <cstdlib>

int main()
{
   // Create a circle as one representative of a concrete shape type
   Circle circle{ 3.14 };

   // Create a drawing strategy in the form of a lambda
   auto drawer = []( Circle const& c ){ /*...*/ };

   // Combine the shape and the drawing strategy in a 'Shape' abstraction
   // This constructor call will instantiate a 'detail::OwningShapeModel' for
   // the given 'Circle' and lambda types
   Shape shape1( circle, drawer );

   // Draw the shape
   draw( shape1 );  

   // Create a copy of the shape by means of the copy constructor
   Shape shape2( shape1 );

   // Drawing the copy will result in the same output
   draw( shape2 );  

   return EXIT_SUCCESS;
}

We first create shape1 as an abstraction for a Circle and an associated drawing Strategy. This feels easy, right? There’s no need to manually allocate and no need to deal with pointers. With the draw() function, we’re able to draw this Shape. Directly afterward, we create a copy of the shape. A real copy, a “deep copy,” not just the copy of a pointer. Drawing the copy with the draw() function will result in the same output. Again, this feels good: you can rely on the copy operations of the value type (in this case, the copy constructor), and you do not have to clone() manually.

Pretty amazing, right? And definitely much better than using External Polymorphism manually. I admit that after all these implementation details, it may be a little hard to see it right away, but if you step through the jungle of implementation details, I hope you realize the beauty of this approach: you no longer have to deal with pointers, there are no manual allocations, and you don’t have to deal with inheritance hierarchies anymore. All of these details are there, yes, but all evidence is nicely encapsulated within the Shape class. Still, you didn’t lose any of the decoupling benefits: you are still able to easily add new types, and the concrete shape types are still oblivious about the drawing behavior. They are only connected to the desired functionality via the Shape constructor.
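To see the value semantics at work with more than one shape type, the pieces shown so far can be assembled into a single file. The following is a condensed sketch of the code from this guideline, not a new implementation. The function drawAllShapes() and the lambdas that "draw" by appending to a string are hypothetical stand-ins for real drawing code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class Circle
{
 public:
   explicit Circle( double radius ) : radius_( radius ) {}
   double radius() const { return radius_; }
 private:
   double radius_;
};

class Square
{
 public:
   explicit Square( double side ) : side_( side ) {}
   double side() const { return side_; }
 private:
   double side_;
};

namespace detail {

class ShapeConcept
{
 public:
   virtual ~ShapeConcept() = default;
   virtual void draw() const = 0;
   virtual std::unique_ptr<ShapeConcept> clone() const = 0;
};

template< typename ShapeT, typename DrawStrategy >
class OwningShapeModel : public ShapeConcept
{
 public:
   explicit OwningShapeModel( ShapeT shape, DrawStrategy drawer )
      : shape_{ std::move(shape) }, drawer_{ std::move(drawer) }
   {}

   void draw() const override { drawer_(shape_); }

   std::unique_ptr<ShapeConcept> clone() const override
   {
      return std::make_unique<OwningShapeModel>( *this );
   }

 private:
   ShapeT shape_;
   DrawStrategy drawer_;
};

} // namespace detail

class Shape
{
 public:
   template< typename ShapeT, typename DrawStrategy >
   Shape( ShapeT shape, DrawStrategy drawer )
      : pimpl_{ std::make_unique< detail::OwningShapeModel<ShapeT,DrawStrategy> >(
                   std::move(shape), std::move(drawer) ) }
   {}

   Shape( Shape const& other ) : pimpl_( other.pimpl_->clone() ) {}
   Shape& operator=( Shape const& other )
   {
      Shape copy( other );
      pimpl_.swap( copy.pimpl_ );
      return *this;
   }
   ~Shape() = default;
   Shape( Shape&& ) = default;
   Shape& operator=( Shape&& ) = default;

 private:
   friend void draw( Shape const& shape ) { shape.pimpl_->draw(); }

   std::unique_ptr<detail::ShapeConcept> pimpl_;
};

// Hypothetical demonstration: "drawing" appends to a log instead of rendering
inline std::string drawAllShapes()
{
   std::string log;

   // Unrelated shape types are stored by value in the same container
   std::vector<Shape> shapes;
   shapes.emplace_back( Circle{2.3}, [&log]( Circle const& ){ log += "circle;"; } );
   shapes.emplace_back( Square{1.2}, [&log]( Square const& ){ log += "square;"; } );

   // Copying the vector performs a deep copy of every stored shape
   std::vector<Shape> const copies( shapes );

   for( auto const& shape : shapes ) { draw( shape ); }
   for( auto const& shape : copies ) { draw( shape ); }

   assert( log == "circle;square;circle;square;" );
   return log;
}
```

Note that the container code contains no pointer, no call to new, and no call to clone(); all of this happens inside Shape.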

“I’m wondering,” you begin to ask, “Couldn’t we make this much easier? I envision a main() function that looks like this”:

//---- <YourMain.cpp> ----------------

int main()
{
   // Create a circle as one representative of a concrete shape type
   Circle circle{ 3.14 };

   // Bind the circle to some drawing functionality
   auto drawingCircle = [=]() { myCircleDrawer(circle); };

   // Type-erase the circle equipped with drawing behavior
   Shape shape( drawingCircle );

   // Drawing the shape
   draw( shape );

   // ...

   return EXIT_SUCCESS;
}

That is a great idea. Remember, you are in charge of all the implementation details of the Type Erasure wrapper and how to bring together types and their operation implementation. If you like this form better, go for it! However, please do not forget that in our Shape example, for the sake of simplicity and code brevity, I have deliberately used only a single functionality with external dependencies (drawing). There could be more functions that introduce dependencies, such as the serialization of shapes. In that case, the lambda approach would not work, as you would need multiple, named functions (e.g., draw() and serialize()). So, ultimately, it depends. It depends on what kind of abstraction your Type Erasure wrapper represents. But whatever implementation you prefer, just make sure that you do not introduce artificial dependencies between the different pieces of functionality and/or code duplication. In other words, remember “Guideline 2: Design for Change”! That is the reason I favored the solution based on the Strategy design pattern, which you, however, shouldn’t consider the true and only solution. On the contrary, you should strive to fully exploit the potential of the loose coupling of Type Erasure.
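For illustration, here is a minimal sketch of how the wrapper could look for the lambda-based form, under the assumption that drawing is the only required operation. In this variant the model stores a single callable instead of a shape and a Strategy. The two-parameter signature of myCircleDrawer() and the function drawViaLambda() are assumptions made to keep the sketch observable.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

class Circle
{
 public:
   explicit Circle( double radius ) : radius_( radius ) {}
   double radius() const { return radius_; }
 private:
   double radius_;
};

// Hypothetical drawing function: records the radius instead of rendering
inline void myCircleDrawer( Circle const& circle, std::vector<double>& canvas )
{
   canvas.push_back( circle.radius() );
}

namespace detail {

class ShapeConcept
{
 public:
   virtual ~ShapeConcept() = default;
   virtual void draw() const = 0;
   virtual std::unique_ptr<ShapeConcept> clone() const = 0;
};

// The model now owns one callable that already knows its shape
template< typename Drawable >
class OwningShapeModel : public ShapeConcept
{
 public:
   explicit OwningShapeModel( Drawable drawable )
      : drawable_{ std::move(drawable) }
   {}

   void draw() const override { drawable_(); }

   std::unique_ptr<ShapeConcept> clone() const override
   {
      return std::make_unique<OwningShapeModel>( *this );
   }

 private:
   Drawable drawable_;
};

} // namespace detail

class Shape
{
 public:
   // Taking the callable by value means this template can never be
   //   instantiated as a copy constructor
   template< typename Drawable >
   explicit Shape( Drawable drawable )
      : pimpl_{ std::make_unique< detail::OwningShapeModel<Drawable> >(
                   std::move(drawable) ) }
   {}

   Shape( Shape const& other ) : pimpl_( other.pimpl_->clone() ) {}
   Shape& operator=( Shape const& other )
   {
      Shape copy( other );
      pimpl_.swap( copy.pimpl_ );
      return *this;
   }
   ~Shape() = default;
   Shape( Shape&& ) = default;
   Shape& operator=( Shape&& ) = default;

 private:
   friend void draw( Shape const& shape ) { shape.pimpl_->draw(); }

   std::unique_ptr<detail::ShapeConcept> pimpl_;
};

inline std::vector<double> drawViaLambda()
{
   std::vector<double> canvas;
   Circle circle{ 3.14 };

   // Bind the circle to some drawing functionality
   auto drawingCircle = [circle,&canvas]() { myCircleDrawer( circle, canvas ); };

   Shape shape( drawingCircle );  // Type-erase the equipped circle
   draw( shape );

   Shape copy( shape );           // The copy constructor is still selected
   draw( copy );

   assert( canvas.size() == 2U );
   return canvas;
}
```

The price of this brevity is that the wrapper can no longer see the shape itself, so a second operation such as serialization would need a second callable or a different design.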

Analyzing the Shortcomings of the Type Erasure Design Pattern

Despite the beauty of Type Erasure and the large number of benefits that you acquire, especially from a design perspective, I don’t pretend that there are no downsides to this design pattern. No, it wouldn’t be fair to keep potential disadvantages from you.

The first, and probably most obvious, drawback for you might be the implementation complexity of this pattern. As stated before, I have explicitly kept the implementation details at a reasonable level, which hopefully helped you to get the idea. I hope I have also given you the impression that it is not so difficult after all: a basic implementation of Type Erasure can be realized within approximately 30 lines of code. Still, you might feel that it is too complex. Also, as soon as you start to go beyond the basic implementation and consider performance, exception safety, etc., the implementation details indeed become quite tricky very quickly. In these cases, your safest and most convenient option is to use a third-party library instead of dealing with all of these details yourself. Possible libraries include the dyno library from Louis Dionne, the zoo library from Eduardo Madrid, the erasure library from Gašper Ažman, and the Boost Type Erasure library from Steven Watanabe.

In the explanation of the intent of Type Erasure, I mentioned the second disadvantage, which is much more important and limiting: although we are now dealing with values that can be copied and moved, using Type Erasure for binary operations is not straightforward. For instance, it is not easy to perform an equality comparison on these values, as you would expect from regular values:

int main()
{
   // ...

   if( shape1 == shape2 ) { /*...*/ }  // Does not compile!

   return EXIT_SUCCESS;
}

The reason is that, after all, Shape is only an abstraction from a concrete shape type and only stores a pointer-to-base. As you would deal with exactly the same problem if you used External Polymorphism directly, this is definitely not a new problem in Type Erasure, and you might not even count this as a real disadvantage. Still, while equality comparison is not an expected operation when you’re dealing with pointers-to-base, it usually is an expected operation on values.

Comparing Two Type Erasure Wrappers

“Isn’t this just a question of exposing the necessary functionality in the interface of Shapes?” you wonder. “For instance, we could simply add an area() function to the public interface of shapes and use it to compare two items”:

bool operator==( Shape const& lhs, Shape const& rhs )
{
   return lhs.area() == rhs.area();
}

“This is easy to do. So what am I missing?” I agree that this might be all you need: if two objects count as equal whenever some public properties are equal, then this operator will work for you. In general, however, the answer would have to be “it depends.” In this particular case, it depends on the semantics of the abstraction that the Shape class represents. The question is: when are two Shapes equal? Consider the following example with a Circle and a Square:

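#include <Circle.h>
#include <Square.h>
#include <Shape.h>
#include <cstdlib>

int main()
{
   // The drawing strategies are irrelevant here, but the 'Shape' constructor requires one
   Shape shape1( Circle{3.14}, []( Circle const& ){ /*...*/ } );
   Shape shape2( Square{2.71}, []( Square const& ){ /*...*/ } );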

   if( shape1 == shape2 ) { /*...*/ }

   return EXIT_SUCCESS;
}

When are these two Shapes equal? Are they equal if their areas are equal, or are they equal if the instances behind the abstraction are equal, meaning that both Shapes are of the same type and have the same properties? It depends. In the same spirit, I could ask the question, when are two Persons equal? Are they equal if their first names are equal? Or are they equal if all of their characteristics are equal? It depends on the desired semantics. And while the first comparison is easily done, the second one is not. In a general case, I assume that the second situation is far more likely to be the desired semantics, and therefore I argue that using Type Erasure for equality comparison and more generally for binary operations is not straightforward.

Note, however, that I did not say that equality comparison is impossible. Technically, you can make it work, although it turns out to be a rather ugly solution. Therefore, you have to promise not to tell anyone that you got this idea from me. “You just made me even more curious,” you smile whimsically. OK, so here it is:

//---- <Shape.h> ----------------

// ...

namespace detail {

class ShapeConcept
{
 public:
   // ...
   virtual bool isEqual( ShapeConcept const* c ) const = 0;
};

template< typename ShapeT
        , typename DrawStrategy >
class OwningShapeModel : public ShapeConcept
{
 public:
   // ...

   bool isEqual( ShapeConcept const* c ) const override
   {
      using Model = OwningShapeModel<ShapeT,DrawStrategy>;
      auto const* model = dynamic_cast<Model const*>( c );  
      return ( model && shape_ == model->shape_ );
   }

 private:
   // ...
};

} // namespace detail


class Shape
{
   // ...

 private:
   friend bool operator==( Shape const& lhs, Shape const& rhs )
   {
      return lhs.pimpl_->isEqual( rhs.pimpl_.get() );
   }

   friend bool operator!=( Shape const& lhs, Shape const& rhs )
   {
      return !( lhs == rhs );
   }

   // ...
};


//---- <Circle.h> ----------------

class Circle
{
   // ...
};

bool operator==( Circle const& lhs, Circle const& rhs )
{
   return lhs.radius() == rhs.radius();
}


//---- <Square.h> ----------------

class Square
{
   // ...
};

bool operator==( Square const& lhs, Square const& rhs )
{
   return lhs.side() == rhs.side();
}

To make equality comparison work, you could use a dynamic_cast. However, this implementation of equality comparison has two severe disadvantages. First, as you saw in “Guideline 18: Beware the Performance of Acyclic Visitor”, a dynamic_cast most certainly does not count as a fast operation. Hence, you would have to pay a considerable runtime cost for every comparison. Second, in this implementation, you can only successfully compare two Shapes if they are equipped with the same DrawStrategy. While this might be reasonable in one context, it might also be considered an unfortunate limitation in another context. The only solution I am aware of is to return to std::function to store the drawing Strategy, which, however, would result in another performance penalty.⁷ In summary, depending on the context, equality comparison may be possible, but it’s usually neither easy nor cheap to accomplish. This is evidence for my earlier statement that Type Erasure doesn’t support binary operations.
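To make the second limitation tangible, here is a minimal, self-contained sketch of the approach. Please read it as an illustration under assumptions, not as the complete Shape from this guideline: the Circle and Square structs, the two empty strategies ConsoleDrawer and NullDrawer, the nested Concept and Model names, and the demoEquality() function are simplified stand-ins that I introduce only for this example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Circle { double radius; };
struct Square { double side; };

bool operator==( Circle const& lhs, Circle const& rhs ) { return lhs.radius == rhs.radius; }
bool operator==( Square const& lhs, Square const& rhs ) { return lhs.side == rhs.side; }

// Two hypothetical, empty drawing strategies (assumptions for this sketch)
struct ConsoleDrawer { template< typename T > void operator()( T const& ) const {} };
struct NullDrawer    { template< typename T > void operator()( T const& ) const {} };

class Shape
{
 public:
   template< typename ShapeT, typename DrawStrategy >
   Shape( ShapeT shape, DrawStrategy drawer )
      : pimpl_( std::make_unique< Model<ShapeT,DrawStrategy> >( std::move(shape)
                                                              , std::move(drawer) ) )
   {}

 private:
   struct Concept
   {
      virtual ~Concept() = default;
      virtual bool isEqual( Concept const* c ) const = 0;
   };

   template< typename ShapeT, typename DrawStrategy >
   struct Model : public Concept
   {
      Model( ShapeT shape, DrawStrategy drawer )
         : shape_( std::move(shape) ), drawer_( std::move(drawer) )
      {}

      bool isEqual( Concept const* c ) const override
      {
         // The cast succeeds only if 'c' stores the same ShapeT
         // and the same DrawStrategy as this model
         auto const* model = dynamic_cast<Model const*>( c );
         return ( model && shape_ == model->shape_ );
      }

      ShapeT shape_;
      DrawStrategy drawer_;
   };

   friend bool operator==( Shape const& lhs, Shape const& rhs )
   {
      return lhs.pimpl_->isEqual( rhs.pimpl_.get() );
   }

   std::unique_ptr<Concept> pimpl_;
};

void demoEquality()
{
   Shape a( Circle{ 2.0 }, ConsoleDrawer{} );
   Shape b( Circle{ 2.0 }, ConsoleDrawer{} );
   Shape c( Square{ 2.0 }, ConsoleDrawer{} );
   Shape d( Circle{ 2.0 }, NullDrawer{} );

   assert(    a == b   );  // Same shape type, same radius, same strategy
   assert( !( a == c ) );  // Different shape type
   assert( !( a == d ) );  // Same circle, but a different DrawStrategy
}
```

In this sketch, the same circle paired with a different strategy type compares unequal, because the dynamic_cast to the other Model instantiation fails. Whether that is the behavior you want is, again, a question of the desired semantics.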

Interface Segregation of Type Erasure Wrappers

“What about the Interface Segregation Principle (ISP)?” you ask. “While using External Polymorphism, it was easy to separate concerns in the base class. It appears we’ve lost this ability, right?” Excellent question. So you remember my example with the JSONExportable and Serializable base classes in “Guideline 31: Use External Polymorphism for Nonintrusive Runtime Polymorphism”. Indeed, with Type Erasure we are no longer able to use the hidden base class, only the abstracting value type. Therefore, it may appear as if the ISP is out of reach:

class Document  // Type-erased 'Document'
{
 public:
   // ...
   void exportToJSON( /*...*/ ) const;
   void serialize( ByteStream& bs, /*...*/ ) const;
   // ...
};

// Artificial coupling to 'ByteStream', although only the JSON export is needed
void exportDocument( Document const& doc )
{
   // ...
   doc.exportToJSON( /* pass necessary arguments */ );
   // ...
}

However, fortunately, this impression is incorrect. You can easily adhere to the ISP by providing several type-erased abstractions:⁸

Document doc = /*...*/;  // Type-erased 'Document'
doc.exportToJSON( /* pass necessary arguments */ );
doc.serialize( /* pass necessary arguments */ );

JSONExportable jdoc = doc;  // Type-erased 'JSONExportable'
jdoc.exportToJSON( /* pass necessary arguments */ );

Serializable sdoc = doc;  // Type-erased 'Serializable'
sdoc.serialize( /* pass necessary arguments */ );

Before considering this, take a look at “Guideline 34: Be Aware of the Setup Costs of Owning Type Erasure Wrappers”.
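The following sketch shows one way such a pair of type-erased abstractions could fit together. It rests on several assumptions that are not part of the original example: both member functions return a std::string instead of taking arguments, the wrappers share an immutable model via std::shared_ptr to keep the code short (instead of cloning it), and Invoice and demoInterfaceSegregation() are hypothetical names made up for the illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

class JSONExportable  // Type-erased abstraction for the JSON export only
{
 public:
   template< typename T >
   JSONExportable( T value )
      : pimpl_( std::make_shared< Model<T> >( std::move(value) ) )
   {}

   std::string exportToJSON() const { return pimpl_->exportToJSON(); }

 private:
   struct Concept
   {
      virtual ~Concept() = default;
      virtual std::string exportToJSON() const = 0;
   };

   template< typename T >
   struct Model : public Concept
   {
      explicit Model( T value ) : value_( std::move(value) ) {}
      std::string exportToJSON() const override { return value_.exportToJSON(); }
      T value_;
   };

   std::shared_ptr<Concept const> pimpl_;  // Assumption: shared, immutable model
};

class Document  // Type-erased abstraction offering both operations
{
 public:
   template< typename T >
   Document( T value )
      : pimpl_( std::make_shared< Model<T> >( std::move(value) ) )
   {}

   std::string exportToJSON() const { return pimpl_->exportToJSON(); }
   std::string serialize() const { return pimpl_->serialize(); }

 private:
   struct Concept
   {
      virtual ~Concept() = default;
      virtual std::string exportToJSON() const = 0;
      virtual std::string serialize() const = 0;
   };

   template< typename T >
   struct Model : public Concept
   {
      explicit Model( T value ) : value_( std::move(value) ) {}
      std::string exportToJSON() const override { return value_.exportToJSON(); }
      std::string serialize() const override { return value_.serialize(); }
      T value_;
   };

   std::shared_ptr<Concept const> pimpl_;
};

struct Invoice  // Hypothetical concrete document type
{
   int number;
   std::string exportToJSON() const { return "{\"invoice\":" + std::to_string(number) + "}"; }
   std::string serialize() const { return "invoice:" + std::to_string(number); }
};

// Depends on 'JSONExportable' only; nothing about serialization is visible here
std::string exportDocument( JSONExportable const& doc )
{
   return doc.exportToJSON();
}

void demoInterfaceSegregation()
{
   Document doc = Invoice{ 42 };
   JSONExportable jdoc = doc;  // Wraps a copy of the type-erased 'Document'

   assert( exportDocument( jdoc ) == doc.exportToJSON() );
   assert( doc.serialize() == "invoice:42" );
}
```

Note that every conversion from Document to JSONExportable creates another wrapper with another model, which is exactly the kind of setup cost that deserves a second look.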

“Apart from the implementation complexity and the restriction to unary operations, there seem to be no disadvantages. Well, then, I have to say this is amazing stuff indeed! The benefits clearly outweigh the drawbacks.” Well, of course it always depends, meaning that in a specific context some of these issues might cause some pain. But I agree that, altogether, Type Erasure proves to be a very valuable design pattern. From a design perspective, you’ve gained a formidable level of decoupling, which will definitely lead to less pain when changing or extending your software. However, although this is already fascinating, there’s more. I’ve mentioned performance a couple of times but haven’t yet shown any performance numbers. So let’s take a look at the performance results.

Performance Benchmarks

Before showing you the performance results for Type Erasure, let me remind you about the benchmark scenario that we also used to benchmark the Visitor and Strategy solutions (see Table 4-2 in “Guideline 16: Use Visitor to Extend Operations” and Table 5-1 in “Guideline 23: Prefer a Value-Based Implementation of Strategy and Command”). This time I have extended the benchmark with a Type Erasure solution based on the OwningShapeModel implementation. For the benchmark, we are still using four different kinds of shapes (circles, squares, ellipses, and rectangles). And again, I’m running 25,000 translate operations on 10,000 randomly created shapes. I use both GCC 11.1 and Clang 11.1, and for both compilers, I’m adding only the -O3 and -DNDEBUG compilation flags. The platform I’m using is macOS Big Sur (version 11.4) on an 8-Core Intel Core i7 with 3.8 GHz, 64 GB of main memory.

Table 8-1 shows the performance numbers. For your convenience, I reproduced the performance results from the Strategy benchmarks. After all, the Strategy design pattern is the solution that is aiming at the same design space. The most interesting line, though, is the last line. It shows the performance result for the Type Erasure design pattern.

Table 8-1. Performance results for the Type Erasure implementations
Type Erasure implementation	GCC 11.1	Clang 11.1
Object-oriented solution	1.5205 s	1.1480 s
`std::function`	2.1782 s	1.4884 s
Manual implementation of `std::function`	1.6354 s	1.4465 s
Classic Strategy	1.6372 s	1.4046 s
Type Erasure	1.5298 s	1.1561 s

“Looks very interesting. Type Erasure seems to be pretty fast. Apparently only the object-oriented solution is faster.” Yes. For both GCC and Clang, the performance of the object-oriented solution is a little better. But only a little. However, please remember that the object-oriented solution does not decouple anything: the draw() function is implemented as a virtual member function in the Shape hierarchy, and thus you experience heavy coupling to the drawing functionality. While this may come with little performance overhead, from a design perspective, this is a worst-case scenario. Taking this into account, the performance numbers of Type Erasure are truly marvelous: it performs between 6% and 20% better than any Strategy implementation. Thus, Type Erasure not only provides the strongest decoupling but also performs better than all the other attempts to reduce coupling.⁹

A Word About Terminology

In summary, Type Erasure is an amazing approach to achieve both efficient and loosely coupled code. While it may have a few limitations and disadvantages, the one thing you probably cannot ignore easily is the complex implementation details. For that reason, many people, including me and Eric Niebler, feel that Type Erasure should become a language feature:¹⁰

If I could go back in time and had the power to change C++, rather than adding virtual functions, I would add language support for type erasure and concepts. Define a single-type concept, automatically generate a type-erasing wrapper for it.

There is more to be done, though, to establish Type Erasure as a real design pattern. I have introduced Type Erasure as a compound design pattern built from External Polymorphism, Bridge, and Prototype. I’ve introduced it as a value-based technique for providing strong decoupling of a set of types from their associated operations. However, unfortunately, you might see other “forms” of Type Erasure: over time, the term Type Erasure has been misused and abused for all kinds of techniques and concepts. For instance, sometimes people refer to a void* as Type Erasure. Rarely, you also hear about Type Erasure in the context of inheritance hierarchies, or more specifically a pointer-to-base. And finally, you also might hear about Type Erasure in the context of std::variant.¹¹

The std::variant example especially demonstrates how deeply flawed this overuse of the term Type Erasure really is. While External Polymorphism, the main design pattern behind Type Erasure, is about enabling you to add new types, the Visitor design pattern and its modern implementation as std::variant are about adding new operations (see “Guideline 15: Design for the Addition of Types or Operations”). From a software design perspective, these two solutions are completely orthogonal to each other: while Type Erasure truly decouples from concrete types and erases type information, the template arguments of std::variant reveal all possible alternatives and therefore make you depend on these types. Using the same term for both of them results in exactly zero information conveyed when using the term Type Erasure and generates these types of comments: “I would suggest we use Type Erasure to solve this problem.” “Could you please be more specific? Do you want to add types or operations?” As such, the term would not fulfill the qualities of a design pattern; it wouldn’t carry any intent. Therefore, it would be useless.

To give Type Erasure its well-earned place in the hall of design patterns and to give it any meaning, consider using the term only for the intent discussed in this guideline.

Guideline 33: Be Aware of the Optimization Potential of Type Erasure

The primary focus of this book is software design. Therefore, all this talk about structuring software, about design principles, about tools for managing dependencies and abstractions, and, of course, all the information on design patterns is at the center of interest. Still, I’ve mentioned a few times that performance is important. Very important! After all, C++ is a performance-centric programming language. Therefore, I now make an exception: this guideline is devoted to performance. Yes, I’m serious: no talk about dependencies, (almost) no examples for separation of concerns, no value semantics. Just performance. “Finally, some performance stuff—great!” you cheer. However, be aware of the consequences: this guideline is pretty heavy on implementation details. And as it is in C++, mentioning one detail requires you to also deal with two more details, and so you are pretty quickly sucked into the realm of implementation details. To avoid that (and to keep my publisher happy), I will not elaborate on every implementation detail or demonstrate all the alternatives. I will, however, give additional references that should help you to dig deeper.¹²

In “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure”, you saw great performance numbers for our basic, unoptimized Type Erasure implementation. However, since we are now in possession of a value type and a wrapper class, not just a pointer, we have gained a multitude of opportunities to speed up performance. This is why we will take a look at two options to improve performance: the SBO and manual virtual dispatch.

Small Buffer Optimization

Let’s start our quest to speed up the performance of our Type Erasure implementation. One of the first things that usually comes to mind when talking about performance is optimizing memory allocations. This is because acquiring and freeing dynamic memory can be very slooowww and nondeterministic. And for real: optimizing memory allocations can make all the difference between slow and lightning fast.

However, there is a second reason to look into memory. In “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure”, I might have accidentally given you the impression that we need dynamic memory to pull off Type Erasure. Indeed, one of the initial implementation details in our first Shape class was the unconditional dynamic memory allocation in the constructor and clone() function, independent of the size of the given object, so for both small and large objects, we would always perform a dynamic memory allocation with std::make_unique(). This choice is limiting, not just because of performance, in particular for small objects, but also because in certain environments dynamic memory is not available. Therefore, I should demonstrate to you that there’s a lot you can do with respect to memory. In fact, you are in full control of memory management! Since you are using a value type, a wrapper, you can deal with memory in any way you see fit. One of the many options is to completely rely on in-class memory and emit a compile-time error if objects are too large. Alternatively, you might switch between in-class and dynamic memory, depending on the size of the stored object. Both of these are made possible by the SBO.

To give you an idea of how SBO works, let’s take a look at a Shape implementation that never allocates dynamically but uses only in-class memory:

#include <array>
#include <cstddef>
#include <memory>

template< size_t Capacity = 32U, size_t Alignment = alignof(void*) >  
class Shape
{
 public:
   // ...

 private:
   // ...

   Concept* pimpl()  
   {
      return reinterpret_cast<Concept*>( buffer_.data() );
   }

   Concept const* pimpl() const  
   {
      return reinterpret_cast<Concept const*>( buffer_.data() );
   }

   alignas(Alignment) std::array<std::byte,Capacity> buffer_;  
};

This Shape class does not store std::unique_ptr anymore, but instead owns an array of properly aligned bytes.¹³ To give users of Shape the flexibility to adjust both the capacity and the alignment of the array, you can provide the two nontype template parameters, Capacity and Alignment, to the Shape class.¹⁴ While this improves the flexibility to adjust to different circumstances, the disadvantage of that approach is that this turns the Shape class into a class template. As a consequence, all functions that use this abstraction will likely turn into function templates. This may be undesirable, for instance, because you might have to move code from source files into header files. However, be aware that this is just one of many possibilities. As stated before, you are in full control.

To conveniently work with the std::byte array, we add a pair of pimpl() functions (named based on the fact that this still realizes the Bridge design pattern, just using in-class memory). “Oh no, a reinterpret_cast!” you say. “Isn’t this super dangerous?” You are correct; in general, a reinterpret_cast should be considered potentially dangerous. However, in this particular case, we are backed up by the C++ standard, which explains that what we are doing here is perfectly safe.

As you probably expect by now, we also need to introduce an external inheritance hierarchy based on the External Polymorphism design pattern. This time we realize this hierarchy in the private section of the Shape class. Not because this is better or more suited for this Shape implementation, but for the sole reason to show you another alternative:

template< size_t Capacity = 32U, size_t Alignment = alignof(void*) >
class Shape
{
 public:
   // ...

 private:
   struct Concept
   {
      virtual ~Concept() = default;
      virtual void draw() const = 0;
      virtual void clone( Concept* memory ) const = 0;  
      virtual void move( Concept* memory ) = 0;  
   };

   template< typename ShapeT, typename DrawStrategy >
   struct OwningModel : public Concept
   {
      OwningModel( ShapeT shape, DrawStrategy drawer )
         : shape_( std::move(shape) )
         , drawer_( std::move(drawer) )
      {}

      void draw() const override
      {
         drawer_( shape_ );
      }

      void clone( Concept* memory ) const override  
      {
         std::construct_at( static_cast<OwningModel*>(memory), *this );

         // or:
         // auto* ptr =
         //    const_cast<void*>(static_cast<void const volatile*>(memory));
         // ::new (ptr) OwningModel( *this );
      }

      void move( Concept* memory ) override  
      {
         std::construct_at( static_cast<OwningModel*>(memory), std::move(*this) );

         // or:
         // auto* ptr =
         //    const_cast<void*>(static_cast<void const volatile*>(memory));
         // ::new (ptr) OwningModel( std::move(*this) );
      }

      ShapeT shape_;
      DrawStrategy drawer_;
   };

   // ...

   alignas(Alignment) std::array<std::byte,Capacity> buffer_;
};

The first interesting detail in this context is the clone() function. As clone() carries the responsibility of creating a copy, it needs to be adapted to the in-class memory. So instead of creating a new Model via std::make_unique(), it creates a new Model in place via std::construct_at(). Alternatively, you could use a placement new to create the copy at the given memory location.¹⁵

“Wow, wait a second! That’s a pretty tough piece of code to swallow. What’s with all these casts? Are they really necessary?” I admit, these lines are a little challenging. Therefore, I should explain them in detail. The good old approach to creating an instance in place is via placement new. However, using new always carries the danger of someone (inadvertently or maliciously) providing a replacement for the class-specific new operator. To avoid any kind of problem and reliably construct an object in place, the given address is first converted to void const volatile* via a static_cast and then to void* via a const_cast. The resulting address is passed to the global placement new operator. Indeed, not the most obvious piece of code. Therefore, it is advisable to use the C++20 algorithm std::construct_at(): it provides you with exactly the same functionality but with a significantly nicer syntax.
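To see why the detour through the global placement new matters, consider the following small sketch. Everything in it is hypothetical and made up for the illustration: the Tracked type, which deliberately provides its own class-specific placement new, the constructInPlace() helper, and the demoPlacementNew() function. It is not part of the Shape implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical type with a class-specific placement 'operator new'
struct Tracked
{
   explicit Tracked( int v ) : value( v ) {}

   static void* operator new( std::size_t, void* where ) noexcept
   {
      ++placementCalls;  // Records that the class-specific overload was used
      return where;
   }
   static void operator delete( void*, void* ) noexcept {}

   int value;
   static inline int placementCalls = 0;
};

// Constructs a T at the given location via the global placement new,
// independent of any 'operator new' that T itself may provide
template< typename T, typename... Args >
T* constructInPlace( T* location, Args&&... args )
{
   auto* ptr = const_cast<void*>( static_cast<void const volatile*>( location ) );
   return ::new (ptr) T( std::forward<Args>( args )... );
}

void demoPlacementNew()
{
   alignas(Tracked) std::byte buffer[ sizeof(Tracked) ];
   auto* const location = reinterpret_cast<Tracked*>( buffer );
   int const before = Tracked::placementCalls;

   // Unqualified placement new: picks up Tracked::operator new
   Tracked* first = new (location) Tracked( 1 );
   assert( Tracked::placementCalls == before + 1 );
   first->~Tracked();

   // Qualified '::new' on a 'void*': bypasses the class-specific overload
   Tracked* second = constructInPlace( location, 2 );
   assert( Tracked::placementCalls == before + 1 );
   assert( second->value == 2 );
   second->~Tracked();
}
```

The unqualified form silently goes through whatever the class provides, whereas the qualified form with the explicit conversion to void* reaches the global placement new.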

However, we need one more function: clone() is concerned only with copy operations. It doesn’t apply to move operations. For that reason, we extend the Concept with a pure virtual move() function and consequently implement it in the OwningModel class template.

“Is this really necessary? We’re using in-class memory, which cannot be moved to another instance of Shape. What’s the point of that move()?” Well, you are correct that we can’t move the memory itself from one object to another, but we can still move the shape stored inside. Thus, the move() function moves an OwningModel from one buffer to another instead of copying it.

The clone() and move() functions are used in the copy constructor, the copy assignment operator, the move constructor, and the move assignment operator of Shape:

template< size_t Capacity = 32U, size_t Alignment = alignof(void*) >
class Shape
{
 public:
   // ...

   Shape( Shape const& other )
   {
      other.pimpl()->clone( pimpl() );  
   }

   Shape& operator=( Shape const& other )
   {
      // Copy-and-Swap Idiom
      Shape copy( other );  
      buffer_.swap( copy.buffer_ );
      return *this;
   }

   Shape( Shape&& other ) noexcept
   {
      other.pimpl()->move( pimpl() );  
   }

   Shape& operator=( Shape&& other ) noexcept
   {
      // Copy-and-Swap Idiom
      Shape copy( std::move(other) );  
      buffer_.swap( copy.buffer_ );
      return *this;
   }

   ~Shape()  
   {
      std::destroy_at( pimpl() );
      // or: pimpl()->~Concept();
   }

 private:
   // ...

   alignas(Alignment) std::array<std::byte,Capacity> buffer_;
};

Definitely noteworthy is the destructor of Shape. Since we manually create an OwningModel within the byte buffer by std::construct_at() or a placement new, we are also responsible for explicitly calling a destructor. The easiest and most elegant way of doing that is to use the C++17 algorithm std::destroy_at(). Alternatively, you can explicitly call the Concept destructor.

The last, but essential, detail of Shape is the templated constructor:

template< size_t Capacity = 32U, size_t Alignment = alignof(void*) >
class Shape
{
 public:
   template< typename ShapeT, typename DrawStrategy >
   Shape( ShapeT shape, DrawStrategy drawer )
   {
      using Model = OwningModel<ShapeT,DrawStrategy>;

      static_assert( sizeof(Model) <= Capacity, "Given type is too large" );
      static_assert( alignof(Model) <= Alignment, "Given type is misaligned" );

      std::construct_at( static_cast<Model*>(pimpl())
                       , std::move(shape), std::move(drawer) );
      // or:
      // auto* ptr =
      //    const_cast<void*>(static_cast<void const volatile*>(pimpl()));
      // ::new (ptr) Model( std::move(shape), std::move(drawer) );
   }

   // ...

 private:
   // ...
};

After a pair of compile-time checks that the required OwningModel fits into the in-class buffer and adheres to the alignment restrictions, an OwningModel is instantiated into the in-class buffer by std::construct_at().
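To show how these pieces fit together, here is a condensed, compilable sketch of the in-class Shape. It deviates from the snippets above in a few places, and these deviations are assumptions of the sketch: it uses placement new instead of std::construct_at() so that it also builds as C++17, clone() and move() take a void* to the raw buffer, the pointer to the buffer is additionally passed through std::launder as a conservative measure, and the assignment operators are left out. Circle, CountingDrawer, and demoInClassShape() are hypothetical helpers for the demonstration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

struct Circle { double radius; };

// Hypothetical strategy that counts how often it is invoked
struct CountingDrawer
{
   int* calls;
   template< typename T > void operator()( T const& ) const { ++*calls; }
};

template< std::size_t Capacity = 32U, std::size_t Alignment = alignof(void*) >
class Shape
{
 public:
   template< typename ShapeT, typename DrawStrategy >
   Shape( ShapeT shape, DrawStrategy drawer )
   {
      using Model = OwningModel<ShapeT,DrawStrategy>;
      static_assert( sizeof(Model) <= Capacity, "Given type is too large" );
      static_assert( alignof(Model) <= Alignment, "Given type is misaligned" );
      ::new ( rawMemory() ) Model( std::move(shape), std::move(drawer) );
   }

   Shape( Shape const& other ) { other.pimpl()->clone( rawMemory() ); }
   Shape( Shape&& other ) noexcept { other.pimpl()->move( rawMemory() ); }
   Shape& operator=( Shape const& ) = delete;  // Assignment omitted in this sketch
   ~Shape() { std::destroy_at( pimpl() ); }

   void draw() const { pimpl()->draw(); }

 private:
   struct Concept
   {
      virtual ~Concept() = default;
      virtual void draw() const = 0;
      virtual void clone( void* memory ) const = 0;
      virtual void move( void* memory ) = 0;
   };

   template< typename ShapeT, typename DrawStrategy >
   struct OwningModel : public Concept
   {
      OwningModel( ShapeT shape, DrawStrategy drawer )
         : shape_( std::move(shape) ), drawer_( std::move(drawer) )
      {}

      void draw() const override { drawer_( shape_ ); }
      void clone( void* memory ) const override { ::new (memory) OwningModel( *this ); }
      void move( void* memory ) override { ::new (memory) OwningModel( std::move(*this) ); }

      ShapeT shape_;
      DrawStrategy drawer_;
   };

   void* rawMemory() { return buffer_.data(); }

   Concept* pimpl()
   {
      return std::launder( reinterpret_cast<Concept*>( buffer_.data() ) );
   }
   Concept const* pimpl() const
   {
      return std::launder( reinterpret_cast<Concept const*>( buffer_.data() ) );
   }

   alignas(Alignment) std::array<std::byte,Capacity> buffer_;
};

void demoInClassShape()
{
   int calls = 0;
   Shape<> original( Circle{ 2.0 }, CountingDrawer{ &calls } );
   Shape<> copy( original );          // clone() into the buffer of 'copy'
   Shape<> moved( std::move(copy) );  // move() into the buffer of 'moved'

   original.draw();
   moved.draw();
   assert( calls == 2 );
}
```

Not a single dynamic allocation happens in this sketch: all three Shape objects keep their model inside their own buffer.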

With this implementation in hand, we now adapt and rerun the performance benchmark from “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure”. We run exactly the same benchmark, this time without allocating dynamic memory inside Shape and without fragmenting the memory with many, tiny allocations. As expected, the performance results are impressive (see Table 8-2).

Table 8-2. Performance results for the Type Erasure implementations with SBO
Type Erasure implementation	GCC 11.1	Clang 11.1
Object-oriented solution	1.5205 s	1.1480 s
`std::function`	2.1782 s	1.4884 s
Manual implementation of `std::function`	1.6354 s	1.4465 s
Classic Strategy	1.6372 s	1.4046 s
Type Erasure	1.5298 s	1.1561 s
Type Erasure (SBO)	1.3591 s	1.0348 s

“Wow, this is fast. This is…well, let me do the math…amazing, roughly 20% faster than the fastest Strategy implementation, and even faster than the object-oriented solution.” It is, indeed. Very impressive, right? Still, you should remember that these are the numbers that I got on my system. Your numbers will be different, almost certainly. But even though your numbers might not be the same, the general takeaway is that there is a lot of potential to optimize performance by dealing with memory allocations.

However, while the performance is extraordinary, we’ve lost a lot of flexibility: only OwningModel instantiations that are smaller than or equal to the specified Capacity can be stored inside Shape. Bigger models are excluded. This brings me back to the idea that we could switch between in-class and dynamic memory depending on the size of the given shape: small shapes are stored inside an in-class buffer, while large shapes are allocated dynamically. You could now go ahead and update the implementation of Shape to use both kinds of memory. However, at this point it’s probably a good idea to point out one of our most important design principles again: separation of concerns. Instead of squeezing all logic and functionality into the Shape class, it would be easier and (much) more flexible to separate the implementation details and implement Shape with policy-based design (see “Guideline 19: Use Strategy to Isolate How Things Are Done”):

template< typename StoragePolicy >
class Shape;

The Shape class template is rewritten to accept a StoragePolicy. Via this policy, you would be able to specify from outside how the class should acquire memory. And of course, you would perfectly adhere to SRP and OCP. One such storage policy could be the DynamicStorage policy class:

#include <utility>

struct DynamicStorage
{
   template< typename T, typename... Args >
   T* create( Args&&... args ) const
   {
      return new T( std::forward<Args>( args )... );
   }

   template< typename T >
   void destroy( T* ptr ) const noexcept
   {
      delete ptr;
   }
};

As the name suggests, DynamicStorage would acquire memory dynamically, for instance via new. Alternatively, if you have stronger requirements, you could build on std::aligned_alloc() or similar functionality to provide dynamic memory with a specified alignment. Similarly to DynamicStorage, you could provide an InClassStorage policy:

#include <array>
#include <cstddef>
#include <memory>
#include <utility>

template< size_t Capacity, size_t Alignment >
struct InClassStorage
{
   template< typename T, typename... Args >
   T* create( Args&&... args ) const
   {
      static_assert( sizeof(T) <= Capacity, "The given type is too large" );
      static_assert( alignof(T) <= Alignment, "The given type is misaligned" );

      T* memory = const_cast<T*>(reinterpret_cast<T const*>(buffer_.data()));
      return std::construct_at( memory, std::forward<Args>( args )... );

      // or:
      // void* const memory = static_cast<void*>(buffer_.data());
      // return ::new (memory) T( std::forward<Args>( args )... );
   }

   template< typename T >
   void destroy( T* ptr ) const noexcept
   {
      std::destroy_at(ptr);
      // or: ptr->~T();
   }

   alignas(Alignment) std::array<std::byte,Capacity> buffer_;
};

All of these policy classes provide the same interface: a create() function to instantiate an object of type T and a destroy() function to do whatever is necessary to clean up. This interface is used by the Shape class to trigger construction and destruction, for instance, in its templated constructor¹⁶ and in the destructor:

template< typename StoragePolicy >
class Shape
{
 public:
   template< typename ShapeT >
   Shape( ShapeT shape )
   {
      using Model = OwningModel<ShapeT>;
      pimpl_ = policy_.template create<Model>( std::move(shape) );
   }

   ~Shape() { policy_.destroy( pimpl_ ); }  

   // ... All other member functions, in particular the
   //     special member functions, are not shown

 private:
   // ...
   [[no_unique_address]] StoragePolicy policy_{};  
   Concept* pimpl_{};
};

The last detail that should not go unnoticed is the data members: the Shape class now stores an instance of the given StoragePolicy and, do not be alarmed, a raw pointer to its Concept. Indeed, there is no need to store std::unique_ptr anymore, since we are manually destroying the object in our own destructor again. You might also notice the [[no_unique_address]] attribute on the storage policy. This C++20 feature gives you the opportunity to save the memory for the storage policy. If the policy is empty, the compiler is now allowed to not reserve any memory for the data member. Without this attribute, it would be necessary to reserve at least a single byte for policy_, but likely more bytes due to alignment restrictions.
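The following sketch puts the two storage policies and the policy-based Shape into one compilable unit. It is meant as an illustration under a few assumptions: the model again stores a DrawStrategy, the create() and destroy() functions are non-const, copy and move operations are simply deleted, and the [[no_unique_address]] attribute is left out so that the code also builds as C++17. Circle, CountingDrawer, and demoStoragePolicies() are hypothetical helpers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

struct DynamicStorage
{
   template< typename T, typename... Args >
   T* create( Args&&... args ) { return new T( std::forward<Args>( args )... ); }

   template< typename T >
   void destroy( T* ptr ) noexcept { delete ptr; }
};

template< std::size_t Capacity, std::size_t Alignment >
struct InClassStorage
{
   template< typename T, typename... Args >
   T* create( Args&&... args )
   {
      static_assert( sizeof(T) <= Capacity, "The given type is too large" );
      static_assert( alignof(T) <= Alignment, "The given type is misaligned" );
      void* const memory = static_cast<void*>( buffer_.data() );
      return ::new (memory) T( std::forward<Args>( args )... );
   }

   template< typename T >
   void destroy( T* ptr ) noexcept { std::destroy_at( ptr ); }

   alignas(Alignment) std::array<std::byte,Capacity> buffer_;
};

template< typename StoragePolicy >
class Shape
{
 public:
   template< typename ShapeT, typename DrawStrategy >
   Shape( ShapeT shape, DrawStrategy drawer )
   {
      using Model = OwningModel<ShapeT,DrawStrategy>;
      pimpl_ = policy_.template create<Model>( std::move(shape), std::move(drawer) );
   }

   ~Shape() { policy_.destroy( pimpl_ ); }

   // Copy and move operations are left out of this sketch
   Shape( Shape const& ) = delete;
   Shape& operator=( Shape const& ) = delete;

   void draw() const { pimpl_->draw(); }

 private:
   struct Concept
   {
      virtual ~Concept() = default;
      virtual void draw() const = 0;
   };

   template< typename ShapeT, typename DrawStrategy >
   struct OwningModel : public Concept
   {
      OwningModel( ShapeT shape, DrawStrategy drawer )
         : shape_( std::move(shape) ), drawer_( std::move(drawer) )
      {}
      void draw() const override { drawer_( shape_ ); }
      ShapeT shape_;
      DrawStrategy drawer_;
   };

   StoragePolicy policy_{};
   Concept* pimpl_{};
};

struct Circle { double radius; };

// Hypothetical strategy that counts how often it is invoked
struct CountingDrawer
{
   int* calls;
   template< typename T > void operator()( T const& ) const { ++*calls; }
};

void demoStoragePolicies()
{
   int calls = 0;
   Shape<DynamicStorage> onHeap( Circle{ 1.0 }, CountingDrawer{ &calls } );
   Shape< InClassStorage<32U,alignof(void*)> > inClass( Circle{ 2.0 }, CountingDrawer{ &calls } );

   onHeap.draw();
   inClass.draw();
   assert( calls == 2 );
}
```

The Shape class itself does not change between the two instantiations; only the policy decides where the model lives.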

In summary, SBO is an effective optimization, and one of the most interesting ones, for a Type Erasure implementation. For that reason, many standard types, such as std::function and std::any, use some form of SBO. Unfortunately, the C++ Standard Library specification doesn’t require the use of SBO. This is why you can only hope that SBO is used; you can’t count on it. However, because performance is so important and because SBO plays such a decisive role, there are already proposals out there that also suggest standardizing the types inplace_function and inplace_any. Time will tell if these find their way into the Standard Library.

Manual Implementation of Function Dispatch

“Wow, this will prove useful. Is there anything else I can do to improve the performance of my Type Erasure implementation?” you ask. Oh yes, you can do more. There is a second potential performance optimization. This time we try to improve the performance of the virtual functions. And yes, I’m talking about the virtual functions that are introduced by the external inheritance hierarchy, i.e., by the External Polymorphism design pattern.

“How should we be able to optimize the performance of virtual functions? Isn’t this something that is completely up to the compiler?” Absolutely, you’re correct. However, I am not talking about fiddling with backend, compiler-specific implementation details, but about replacing the virtual functions with something more efficient. And that is indeed possible. Remember that a virtual function is nothing but a function pointer that is stored inside a virtual function table. Every type with at least one virtual function has such a virtual function table. However, there is only one virtual function table for each type. In other words, this table is not stored inside every instance. So in order to connect the virtual function table with every instance of that type, the class stores an additional, hidden data member, which we commonly call the vptr and which is a raw pointer to the virtual function table.

When you call a virtual function, you first go through the vptr to fetch the virtual function table. Once you’re there, you can grab the corresponding function pointer from the virtual function table and call it. Therefore, in total, a virtual function call entails two indirections: the vptr and the pointer to the actual function. For that reason, roughly speaking, a virtual function call is twice as expensive as a regular, noninline function call.
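If you would like to see these two indirections spelled out, the following sketch emulates the mechanism by hand. Please take it as a simplified model of what typical implementations do, not as a description of any particular compiler: the names VTable, ShapeBase, vptr, the two area functions, and demoTwoIndirections() are all made up for this illustration.

```cpp
#include <cassert>

struct ShapeBase;

// One table per type, holding the function pointers
struct VTable
{
   double (*area)( ShapeBase const& );
};

// The pointer that the compiler normally hides is spelled out here
struct ShapeBase
{
   VTable const* vptr;
};

struct Square    : ShapeBase { double side; };
struct Rectangle : ShapeBase { double width; double height; };

double squareArea( ShapeBase const& s )
{
   auto const& square = static_cast<Square const&>( s );
   return square.side * square.side;
}

double rectangleArea( ShapeBase const& s )
{
   auto const& rectangle = static_cast<Rectangle const&>( s );
   return rectangle.width * rectangle.height;
}

// Exactly one table per type, shared by all instances of that type
constexpr VTable squareTable   { &squareArea };
constexpr VTable rectangleTable{ &rectangleArea };

Square makeSquare( double side )
{
   return Square{ { &squareTable }, side };
}

Rectangle makeRectangle( double width, double height )
{
   return Rectangle{ { &rectangleTable }, width, height };
}

double area( ShapeBase const& shape )
{
   VTable const* const table = shape.vptr;  // First indirection: fetch the table
   auto* const function = table->area;      // Second indirection: fetch the function
   return function( shape );
}

void demoTwoIndirections()
{
   Square const square = makeSquare( 3.0 );
   Rectangle const rectangle = makeRectangle( 2.0, 5.0 );

   assert( area( square ) == 9.0 );
   assert( area( rectangle ) == 10.0 );
}
```

Each instance pays for one additional pointer, while the table itself exists only once per type.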

These two indirections provide us with the opportunity for optimization: we can in fact reduce the number of indirections to just one. To achieve that, we will employ an optimization strategy that works fairly often: we’ll trade space for speed. What we will do is implement the virtual dispatch manually by storing the virtual function pointers inside the Shape class. The following code snippet already gives you a pretty good idea of the details:

//---- <Shape.h> ----------------

#include <cstddef>
#include <memory>
#include <utility>

class Shape
{
 public:
   // ...

 private:
   // ...

   template< typename ShapeT
           , typename DrawStrategy >
   struct OwningModel  
   {
      OwningModel( ShapeT value, DrawStrategy drawer )
         : shape_( std::move(value) )
         , drawer_( std::move(drawer) )
      {}

      ShapeT shape_;
      DrawStrategy drawer_;
   };

   using DestroyOperation = void(void*);   
   using DrawOperation    = void(void*);   
   using CloneOperation   = void*(void*);  

   std::unique_ptr<void,DestroyOperation*> pimpl_;  
   DrawOperation*  draw_ { nullptr };               
   CloneOperation* clone_{ nullptr };               
};

Since we are replacing all virtual functions, even the virtual destructor, there’s no need for a Concept base class anymore. Consequently, the external hierarchy is reduced to just the OwningModel class template, which still acts as storage for a specific kind of shape (ShapeT) and DrawStrategy. Still, it meets the same fate: all virtual functions are removed. The only remaining details are the constructor and the data members.

The virtual functions are replaced by manual function pointers. Since the syntax for function pointers is not the most pleasant to use, we add a couple of function type aliases for our convenience:¹⁷ DestroyOperation represents the former virtual destructor, DrawOperation represents the former virtual draw() function, and CloneOperation represents the former virtual clone() function. DestroyOperation is used to configure the Deleter of the pimpl_ data member (and yes, as such it acts as a Strategy). The latter two, DrawOperation and CloneOperation, are used for the two additional function pointer data members, draw_ and clone_.

“Oh no, void*s! Isn’t that an archaic and super dangerous way of doing things?” you gasp. OK, I admit that without explanation it looks very suspicious. However, stay with me, I promise that everything will be perfectly fine and type safe. The key to making this work now lies in the initialization of these function pointers. They are initialized in the templated constructor of the Shape class:

//---- <Shape.h> ----------------

// ...

class Shape
{
 public:
   template< typename ShapeT
           , typename DrawStrategy >
   Shape( ShapeT shape, DrawStrategy drawer )
      : pimpl_(   
            new OwningModel<ShapeT,DrawStrategy>( std::move(shape)
                                                , std::move(drawer) )
          , []( void* shapeBytes ){  
               using Model = OwningModel<ShapeT,DrawStrategy>;
               auto* const model = static_cast<Model*>(shapeBytes);  
               delete model;  
            } )
      , draw_(  
            []( void* shapeBytes ){
               using Model = OwningModel<ShapeT,DrawStrategy>;
               auto* const model = static_cast<Model*>(shapeBytes);
               model->drawer_( model->shape_ );
            } )
      , clone_(  
            []( void* shapeBytes ) -> void* {
               using Model = OwningModel<ShapeT,DrawStrategy>;
               auto* const model = static_cast<Model*>(shapeBytes);
               return new Model( *model );
            } )
   {}

   // ...

 private:
   // ...
};

Let’s focus on the pimpl_ data member. It is initialized both by a pointer to the newly instantiated OwningModel and by a stateless lambda expression. You may remember that a stateless lambda is implicitly convertible to a function pointer. This language guarantee is what we use to our advantage: we directly pass the lambda as the deleter to the constructor of unique_ptr, force the compiler to apply the implicit conversion to a DestroyOperation*, and thus bind the lambda function to the std::unique_ptr.

“OK, I get the point: the lambda can be used to initialize the function pointer. But how does it work? What does it do?” Well, also remember that we are creating this lambda inside the templated constructor. That means that at this point we are fully aware of the actual type of the passed ShapeT and DrawStrategy. Thus, the lambda is generated with the knowledge of which type of OwningModel is instantiated and stored inside the pimpl_. Eventually it will be called with a void*, i.e., with the address of some OwningModel. However, based on its knowledge about the actual type of OwningModel, it can first of all perform a static_cast from void* to OwningModel<ShapeT,DrawStrategy>*. While in most other contexts this kind of cast would be suspicious and would likely be a wild guess, in this context it is perfectly type safe: we can be certain about the correct type of OwningModel. Therefore, we can use the resulting pointer to trigger the correct cleanup behavior.

The initialization of the draw_ and clone_ data members is very similar. The only difference is, of course, the action performed by the lambdas: they perform the correct actions to draw the shape and to create a copy of the model, respectively.

I know, this may take some time to digest. But we are almost done; the only missing detail is the special member functions. For the destructor and the two move operations, we can again ask for the compiler-generated default. However, we have to deal with the copy constructor and copy assignment operator ourselves:

//---- <Shape.h> ----------------

// ...

class Shape
{
 public:
   // ...

   Shape( Shape const& other )
      // Use other.clone_: the members of *this are not yet initialized here
      : pimpl_( other.clone_( other.pimpl_.get() ), other.pimpl_.get_deleter() )
      , draw_ ( other.draw_ )
      , clone_( other.clone_ )
   {}

   Shape& operator=( Shape const& other )
   {
      // Copy-and-Swap Idiom
      using std::swap;
      Shape copy( other );
      swap( pimpl_, copy.pimpl_ );
      swap( draw_, copy.draw_ );
      swap( clone_, copy.clone_ );
      return *this;
   }

   ~Shape() = default;
   Shape( Shape&& ) = default;
   Shape& operator=( Shape&& ) = default;

 private:
   // ...
};

This is all we need to do, and we’re ready to try this out. So let’s put this implementation to the test. Once again we update the benchmark from “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure” and run it with our manual implementation of virtual functions. I have even combined the manual virtual dispatch with the previously discussed SBO. Table 8-3 shows the performance results.

Table 8-3. Performance results for the Type Erasure implementations with manual virtual dispatch
Type Erasure implementation	GCC 11.1	Clang 11.1
Object-oriented solution	1.5205 s	1.1480 s
std::function	2.1782 s	1.4884 s
Manual implementation of std::function	1.6354 s	1.4465 s
Classic Strategy	1.6372 s	1.4046 s
Type Erasure	1.5298 s	1.1561 s
Type Erasure (SBO)	1.3591 s	1.0348 s
Type Erasure (manual virtual dispatch)	1.1476 s	1.1599 s
Type Erasure (SBO + manual virtual dispatch)	1.2538 s	1.2212 s

The performance improvement for the manual virtual dispatch is extraordinary for GCC. On my system, I get down to 1.1476 seconds, which is an improvement of 25% in comparison to the basic, unoptimized implementation of Type Erasure (1.5298 seconds). Clang, on the other hand, does not show any improvement in comparison to the basic, unoptimized implementation (1.1599 seconds versus 1.1561 seconds). Although this may be a little disappointing, the runtime is, of course, still remarkable.

Unfortunately the combination of SBO and manual virtual dispatch does not lead to an even better performance. While GCC shows a small improvement in comparison to the pure SBO approach (which might be interesting for environments without dynamic memory), on Clang this combination does not work as well as you might have hoped for.

In summary, there is a lot of potential for optimizing the performance for Type Erasure implementations. If you’ve been skeptical before about Type Erasure, this gain in performance should give you a strong incentive to investigate for yourself. While this is amazing and without doubt is pretty exciting, it is important to remember where this is coming from: only due to separating the concerns of virtual behavior and encapsulating the behavior into a value type have we gained these optimization opportunities. We wouldn’t have been able to achieve this if all we had was a pointer-to-base.

Guideline 34: Be Aware of the Setup Costs of Owning Type Erasure Wrappers

In “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure” and “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”, I guided you through the thicket of implementation details for a basic Type Erasure implementation. Yes, that was tough, but definitely worth the effort: you have emerged stronger, wiser, and with a new, efficient, and strongly decoupling design pattern in your toolbox. Great!

However, we have to go back into the thicket. I see you are rolling your eyes, but there is more. And I have to admit: I lied. At least a little. Not by telling you something incorrect, but by omission. There is one more disadvantage of Type Erasure that you should know of. A big one. One that you might not like at all. Sigh.

The Setup Costs of an Owning Type Erasure Wrapper

Assume for a second that Shape is a base class again, and Circle one of many deriving classes. Then passing a Circle to a function expecting a Shape const& would be easy and cheap:

#include <cstdlib>

class Shape { /*...*/ };  // Classic base class

class Circle : public Shape { /*...*/ };  // Deriving class

void useShape( Shape const& shape )
{
   shape.draw( /*...*/ );
}

int main()
{
   Circle circle{ 3.14 };

   // Automatic and cheap conversion from 'Circle const&' to 'Shape const&'
   useShape( circle );  

   return EXIT_SUCCESS;
}

Although the Type Erasure Shape abstraction is a little different (for instance, it always requires a drawing Strategy), this kind of conversion is still possible:

#include <cstdlib>

class Circle { /*...*/ };  // Nonpolymorphic geometric primitive

class Shape { /*...*/ };  // Type erasure wrapper class as shown before

void useShape( Shape const& shape )
{
   draw(shape);
}

int main()
{
   Circle circle{ 3.14 };
   auto drawStrategy = []( Circle const& c ){ /*...*/ };

   // Creates a temporary 'Shape' object, involving
   //   a copy operation and a memory allocation
   useShape( { circle, drawStrategy } );  

   return EXIT_SUCCESS;
}

Unfortunately, it is no longer cheap. On the contrary, based on our previous implementations, which include both the basic one and optimized ones, the call to the useShape() function would involve a couple of potentially expensive operations:

To convert a Circle into a Shape, the compiler creates a temporary Shape using the non-explicit, templated Shape constructor.
The call of the constructor results in a copy operation of the given shape (not expensive for Circle, but potentially expensive for other shapes) and the given draw Strategy (essentially free if the Strategy is stateless, but potentially expensive, depending on what is stored inside the object).
Inside the Shape constructor, a new shape model is created, involving a memory allocation (hidden in the call to std::make_unique() in the Shape constructor and definitely expensive).
The temporary (rvalue) Shape is passed by reference-to-const to the useShape() function.

It is important to point out that this is not a specific problem of our Shape implementation. The same problem will hit you if, for instance, you use std::function as a function argument:

#include <cstdlib>
#include <functional>

int compute( int i, int j, std::function<int(int,int)> op )
{
   return op( i, j );
}

int main()
{
   int const i = 17;
   int const j = 10;

   int const sum = compute( i, j, [offset=15]( int x, int y ) {
      return x + y + offset;
   } );

   return EXIT_SUCCESS;
}

In this example, the given lambda is converted into the std::function instance. This conversion will involve a copy operation and might involve a memory allocation. It entirely depends on the size of the given callable and on the implementation of std::function. For that reason, std::function is a different kind of abstraction than, for instance, std::string_view and std::span. std::string_view and std::span are nonowning abstractions that are cheap to copy because they consist of only a pointer to the first element and a size. Because these two types perform a shallow copy, they are perfectly suited as function parameters. std::function, on the other hand, is an owning abstraction that performs a deep copy. Therefore, it is not the perfect type to be used as a function parameter. Unfortunately, the same is true for our Shape implementation.¹⁸

“Oh my, I don’t like this. Not at all. That is terrible! I want my money back!” you exclaim. I have to agree that this may be a severe issue in your codebase. However, you understand that the underlying problem is the owning semantics of the Shape class: on the basis of its value semantics background, our current Shape implementation will always create a copy of the given shape and will always own the copy. While this is perfectly in line with all the benefits discussed in “Guideline 22: Prefer Value Semantics over Reference Semantics”, in this context it results in a pretty unfortunate performance penalty. However, stay calm—there is something we can do: for such a context, we can provide a nonowning Type Erasure implementation.

A Simple Nonowning Type Erasure Implementation

Generally speaking, the value semantics–based Type Erasure implementation is beautiful and perfectly adheres to the spirit of modern C++. However, performance is important. It might be so important that sometimes you might not care about the value semantics part, but only about the abstraction provided by Type Erasure. In that case, you might want to reach for a nonowning implementation of Type Erasure, despite the disadvantage that this pulls you back into the realm of reference semantics.

The good news is that if you desire only a simple Type Erasure wrapper, a wrapper that represents a reference-to-base, that is nonowning and trivially copyable, then the required code is fairly simple. That is particularly true because you have already seen how to manually implement the virtual dispatch in “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”. With this technique, a simple, nonowning Type Erasure implementation is just a matter of a few lines of code:

//---- <Shape.h> ----------------

#include <memory>

class ShapeConstRef
{
 public:
   template< typename ShapeT, typename DrawStrategy >
   ShapeConstRef( ShapeT& shape, DrawStrategy& drawer )  
      : shape_{ std::addressof(shape) }
      , drawer_{ std::addressof(drawer) }
      , draw_{ []( void const* shapeBytes, void const* drawerBytes ){
           auto const* shape = static_cast<ShapeT const*>(shapeBytes);
           auto const* drawer = static_cast<DrawStrategy const*>(drawerBytes);
           (*drawer)( *shape );
        } }
   {}

 private:
   friend void draw( ShapeConstRef const& shape )
   {
      shape.draw_( shape.shape_, shape.drawer_ );
   }

   using DrawOperation = void( void const*,void const* );

   void const* shape_{ nullptr };    
   void const* drawer_{ nullptr };   
   DrawOperation* draw_{ nullptr };  
};

As the name suggests, the ShapeConstRef class represents a reference to a const shape type. Instead of storing a copy of the given shape, it only holds a pointer to it in the form of a void*. In addition, it holds a void* to the associated DrawStrategy, and as the third data member, a function pointer to the manually implemented virtual draw() function (see “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”).

ShapeConstRef takes its two arguments, the shape and the drawing Strategy, both possibly cv qualified, by reference-to-non-const.¹⁹ In this form, it is not possible to pass rvalues to the constructor, which prevents any kind of lifetime issue with temporary values. This unfortunately does not protect you from all possible lifetime issues with lvalues but still provides a very reasonable protection.²⁰ If you want to allow rvalues, you should reconsider. And if you’re really, really willing to risk lifetime issues with temporaries, then you can simply take the argument(s) by reference-to-const. Just remember that you did not get this advice from me!

This is it. This is the complete nonowning implementation. It is efficient, short, simple, and can be even shorter and simpler if you do not need to store any kind of associated data or Strategy object. With this functionality in place, you are now able to create cheap shape abstractions. This is demonstrated in the following code example by the useShapeConstRef() function. This function enables you to draw any kind of shape (Circles, Squares, etc.) with any possible drawing implementation by simply using a ShapeConstRef as the function argument. In the main() function, we call useShapeConstRef() with a concrete shape and a concrete drawing Strategy (in this case, a lambda):

//---- <Main.cpp> ----------------

#include <Circle.h>
#include <Shape.h>
#include <cstdlib>

void useShapeConstRef( ShapeConstRef shape )
{
   draw( shape );
}

int main()
{
   // Create a circle as one representative of a concrete shape type
   Circle circle{ 3.14 };

   // Create a drawing strategy in the form of a lambda
   auto drawer = []( Circle const& c ){ /*...*/ };

   // Draw the circle directly via the 'ShapeConstRef' abstraction
   useShapeConstRef( { circle, drawer } );  

   return EXIT_SUCCESS;
}

This call triggers the desired effect, notably without any memory allocation or expensive copy operation, but only by wrapping polymorphic behavior around a set of pointers to the given shape and drawing Strategy.

A More Powerful Nonowning Type Erasure Implementation

Most of the time, this simple nonowning Type Erasure implementation should prove to be enough and fulfill all your needs. Sometimes, however, and only sometimes, it might not be enough. Sometimes, you might be interested in a slightly different form of Shape reference:

#include <Circle.h>
#include <Shape.h>
#include <cstdlib>

int main()
{
   // Create a circle as one representative of a concrete shape type
   Circle circle{ 3.14 };

   // Create a drawing strategy in the form of a lambda
   auto drawer = []( Circle const& c ){ /*...*/ };

   // Combine the shape and the drawing strategy in a 'Shape' abstraction
   Shape shape1( circle, drawer );

   // Draw the shape
   draw( shape1 );

   // Create a reference to the shape
   // Works already, but the shape reference will store a pointer
   // to the 'shape1' instance instead of a pointer to the 'circle'.
   ShapeConstRef shaperef( shape1 );  

   // Draw via the shape reference, resulting in the same output
   // This works, but only by means of two indirections!
   draw( shaperef );  

   // Create a deep copy of the shape via the shape reference
   // This is _not_ possible with the simple nonowning implementation!
   // With the simple implementation, this creates a copy of the 'shaperef'
   // instance. 'shape2' itself would act as a reference and there would be
   // three indirections... sigh.
   Shape shape2( shaperef );  

   // Drawing the copy will again result in the same output
   draw( shape2 );

   return EXIT_SUCCESS;
}

Assuming that you have a type-erased circle called shape1, you might want to convert this Shape instance to a ShapeConstRef. With the current implementation, this works, but the shaperef instance would hold a pointer to the shape1 instance, instead of a pointer to the circle. As a consequence, any use of the shaperef would result in two indirections (one via the ShapeConstRef, and one via the Shape abstraction). Furthermore, you might also be interested in converting a ShapeConstRef instance to a Shape instance. In that case, you might expect that a full copy of the underlying Circle is created and that the resulting Shape abstraction contains and represents this copy. Unfortunately, with the current implementation, the Shape would create a copy of the ShapeConstRef instance, and thus introduce a third indirection. Sigh.

If you need a more efficient interaction between owning and nonowning Type Erasure wrappers, and if you need a real copy when copying a nonowning wrapper into an owning wrapper, then I can offer you a working solution. Unfortunately, it is more involved than the previous implementation(s), but fortunately it isn’t overly complex. The solution builds on the basic Type Erasure implementation from “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure”, which includes the ShapeConcept and OwningShapeModel classes in the detail namespace, and the Shape Type Erasure wrapper. You will see that it just requires a few additions, all of which you have already seen before.

The first addition happens in the ShapeConcept base class:

//---- <Shape.h> ----------------

#include <memory>
#include <utility>

namespace detail {

class ShapeConcept
{
 public:
   // ...
   virtual void clone( ShapeConcept* memory ) const = 0;  
};

// ...

} // namespace detail

The ShapeConcept class is extended with a second clone() function. Instead of returning a newly instantiated copy of the corresponding model, this function is passed the address of the memory location where the new model needs to be created.

The second addition is a new model class, the NonOwningShapeModel:

//---- <Shape.h> ----------------

// ...

namespace detail {

// ...

template< typename ShapeT
        , typename DrawStrategy >
class NonOwningShapeModel : public ShapeConcept
{
 public:
   NonOwningShapeModel( ShapeT& shape, DrawStrategy& drawer )
      : shape_{ std::addressof(shape) }
      , drawer_{ std::addressof(drawer) }
   {}

   void draw() const override { (*drawer_)(*shape_); }  

   std::unique_ptr<ShapeConcept> clone() const override  
   {
      using Model = OwningShapeModel<ShapeT,DrawStrategy>;
      return std::make_unique<Model>( *shape_, *drawer_ );
   }

   void clone( ShapeConcept* memory ) const override  
   {
      std::construct_at( static_cast<NonOwningShapeModel*>(memory), *this );

      // or:
      // auto* ptr =
      //    const_cast<void*>(static_cast<void const volatile*>(memory));
      // ::new (ptr) NonOwningShapeModel( *this );
   }

 private:
   ShapeT* shape_{ nullptr };  
   DrawStrategy* drawer_{ nullptr };  
};

// ...

} // namespace detail

The NonOwningShapeModel is very similar to the OwningShapeModel implementation, but, as the name suggests, it does not store copies of the given shape and strategy. Instead, it stores only pointers. Thus, this class represents the reference semantics version of the OwningShapeModel class. Also, NonOwningShapeModel needs to override the pure virtual functions of the ShapeConcept class: draw() again forwards the drawing request to the given drawing Strategy, while the clone() functions perform a copy. The first clone() function is implemented by creating a new OwningShapeModel and copying both the stored shape and drawing Strategy. The second clone() function is implemented by creating a new NonOwningShapeModel at the specified address by std::construct_at().

In addition, the OwningShapeModel class needs to provide an implementation of the new clone() function:

//---- <Shape.h> ----------------

// ...

namespace detail {

template< typename ShapeT
        , typename DrawStrategy >
class OwningShapeModel : public ShapeConcept
{
 public:
   // ...

   void clone( ShapeConcept* memory ) const override
   {
      using Model = NonOwningShapeModel<ShapeT const,DrawStrategy const>;

      std::construct_at( static_cast<Model*>(memory), shape_, drawer_ );

      // or:
      // auto* ptr =
      //    const_cast<void*>(static_cast<void const volatile*>(memory));
      // ::new (ptr) Model( shape_, drawer_ );
   }
};

// ...

} // namespace detail

The clone() function in OwningShapeModel is implemented similarly to the implementation in the NonOwningShapeModel class by creating a new instance of a NonOwningShapeModel by std::construct_at().

The next addition is the corresponding wrapper class that acts as a wrapper around the external hierarchy ShapeConcept and NonOwningShapeModel. This wrapper should take on the same responsibilities as the Shape class (i.e., the instantiation of the NonOwningShapeModel class template and the encapsulation of all pointer handling) but should merely represent a reference to a const concrete shape, not a copy. This wrapper is again given in the form of the ShapeConstRef class:

//---- <Shape.h> ----------------

#include <array>
#include <cstddef>
#include <memory>

// ...

class ShapeConstRef
{
 public:
   // ...

 private:
   // ...

   // Expected size of a model instantiation:
   //     sizeof(ShapeT*) + sizeof(DrawStrategy*) + sizeof(vptr)
   static constexpr std::size_t MODEL_SIZE = 3U*sizeof(void*);

   alignas(void*) std::array<std::byte,MODEL_SIZE> raw_;  
};

As you will see, the ShapeConstRef class is very similar to the Shape class, but there are a few important differences. The first noteworthy detail is the use of a raw_ storage in the form of a properly aligned std::byte array. That indicates that ShapeConstRef does not allocate dynamically, but firmly builds on in-class memory. In this case, however, this is easily possible, because we can predict the size of the required NonOwningShapeModel to be equal to the size of three pointers (assuming that the pointer to the virtual function table, the vptr, has the same size as any other pointer).

The private section of ShapeConstRef also contains a couple of member functions:

//---- <Shape.h> ----------------

// ...

class ShapeConstRef
{
 public:
   // ...

 private:
   friend void draw( ShapeConstRef const& shape )
   {
      shape.pimpl()->draw();
   }

   detail::ShapeConcept* pimpl()
   {
      return reinterpret_cast<detail::ShapeConcept*>( raw_.data() );
   }

   detail::ShapeConcept const* pimpl() const
   {
      return reinterpret_cast<detail::ShapeConcept const*>( raw_.data() );
   }

   // ...
};

We also add a draw() function as a hidden friend and, just as in the SBO implementation in “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”, we add a pair of pimpl() functions. This will enable us to work conveniently with the in-class std::byte array.

The second noteworthy detail is the signature function of every Type Erasure implementation, the templated constructor:

//---- <Shape.h> ----------------

// ...

class ShapeConstRef
{
 public:
   // Type 'ShapeT' and 'DrawStrategy' are possibly cv qualified;
   // lvalue references prevent references to rvalues
   template< typename ShapeT
           , typename DrawStrategy >
   ShapeConstRef( ShapeT& shape
                , DrawStrategy& drawer )  
   {
      using Model =
         detail::NonOwningShapeModel<ShapeT const,DrawStrategy const>;  
      static_assert( sizeof(Model) == MODEL_SIZE, "Invalid size detected" );  
      static_assert( alignof(Model) == alignof(void*), "Misalignment detected" );

      // Pass the constructor arguments 'shape' and 'drawer' on to the model
      std::construct_at( static_cast<Model*>(pimpl()), shape, drawer );

      // or:
      // auto* ptr =
      //    const_cast<void*>(static_cast<void const volatile*>(pimpl()));
      // ::new (ptr) Model( shape, drawer );
   }

   // ...

 private:
   // ...
};

Again, you have the choice to accept the arguments by reference-to-non-const to prevent lifetime issues with temporaries (very much recommended!). Alternatively, you accept the arguments by reference-to-const, which would allow you to pass rvalues but puts you at risk of experiencing lifetime issues with temporaries. Inside the constructor, we again first use a convenient type alias for the required type of model, before checking the actual size and alignment of the model. If it does not adhere to the expected MODEL_SIZE or pointer alignment, we create a compile-time error. Then we construct the new model inside the in-class memory by std::construct_at(). The remaining constructors and the special member functions look as follows:

//---- <Shape.h> ----------------

// ...

class ShapeConstRef
{
 public:
   // ...

   ShapeConstRef( Shape& other )       { other.pimpl_->clone( pimpl() ); }  
   ShapeConstRef( Shape const& other ) { other.pimpl_->clone( pimpl() ); }

   ShapeConstRef( ShapeConstRef const& other )
   {
      other.pimpl()->clone( pimpl() );
   }

   ShapeConstRef& operator=( ShapeConstRef const& other )
   {
      // Copy-and-swap idiom
      ShapeConstRef copy( other );
      raw_.swap( copy.raw_ );
      return *this;
   }

   ~ShapeConstRef()
   {
      std::destroy_at( pimpl() );
      // or: pimpl()->~ShapeConcept();
   }

   // Move operations explicitly not declared  

 private:
   // ...
};

In addition to the templated ShapeConstRef constructor, ShapeConstRef offers two constructors to enable a conversion from Shape instances. While these are not strictly required, as we could also create an instance of a NonOwningShapeModel for a Shape, these constructors directly create a NonOwningShapeModel for the corresponding, underlying shape type, and thus shave off one indirection, which contributes to better performance. Note that to make these constructors work, ShapeConstRef needs to become a friend of the Shape class. Don’t worry, though, as this is a good example for friendship: Shape and ShapeConstRef truly belong together, work hand in hand, and are even provided in the same header file.

The last noteworthy detail is the fact that the two move operations are neither explicitly declared nor deleted. Since we have explicitly defined the two copy operations, the compiler neither creates nor deletes the two move operations, thus they are gone. Completely gone in the sense that these two functions never participate in overload resolution. And yes, this is different from explicitly deleting them: if they were deleted, they would participate in overload resolution, and if selected, they would result in a compilation error. But with these two functions gone, when you try to move a ShapeConstRef, the copy operations would be used instead, which are cheap and efficient, since ShapeConstRef only represents a reference. Thus, this class deliberately implements the Rule of 3.

We are almost finished. The last detail is one more addition, one more constructor in the Shape class:

//---- <Shape.h> ----------------

// ...

class Shape
{
 public:
   // ...

   Shape( ShapeConstRef const& other )
      : pimpl_{ other.pimpl()->clone() }
   {}

 private:
   // ...
};

Via this constructor, an instance of Shape creates a deep copy of the shape stored in the passed ShapeConstRef instance. Without this constructor, Shape stores a copy of the ShapeConstRef instance and thus acts as a reference itself.

In summary, both nonowning implementations, the simple and the more complex one, give you all the design advantages of the Type Erasure design pattern but at the same time pull you back into the realm of reference semantics, with all its deficiencies. Hence, utilize the strengths of this nonowning form of Type Erasure, but also be aware of the usual lifetime issues. Consider it on the same level as std::string_view and std::span. All of these serve as very useful tools for function arguments, but do not use them to store anything for a longer period, for instance in the form of a data member. The danger of lifetime-related issues is just too high.

¹ Yes, I consider the manual use of std::unique_ptr manual lifetime management. But of course it could be much worse if we would not reach for the power of RAII.

² The term Type Erasure is heavily overloaded, as it is used in different programming languages and for many different things. Even within the C++ community, you hear the term being used for various purposes: you might have heard it being used to denote void*, pointers-to-base, and std::variant. In the context of software design, I consider this a very unfortunate issue. I will address this issue at the end of this guideline.

³ Sean Parent, “Inheritance Is the Base Class of Evil,” GoingNative 2013, YouTube.

⁴ Kevlin Henney, “Valued Conversions,” C++ Report, July-August 2000, CiteSeer.

⁵ For an introduction to std::function, see “Guideline 23: Prefer a Value-Based Implementation of Strategy and Command”.

⁶ The placement of ShapeConcept and OwningShapeModel in a namespace is purely an implementation detail of this example implementation. Still, as you will see in “Guideline 34: Be Aware of the Setup Costs of Owning Type Erasure Wrappers”, this choice will come in pretty handy. Alternatively, these two classes can be implemented as nested classes. You will see examples of this in “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”.

⁷ Refer to “Guideline 31: Use External Polymorphism for Nonintrusive Runtime Polymorphism” for the implementation based on std::function.

⁸ Many thanks to Arthur O’Dwyer for providing this example.

⁹ Again, please don’t consider these performance numbers the perfect truth. These are the performance results on my machine and my implementation. Your results will differ for sure. However, the takeaway is that Type Erasure performs really well and might perform even better if we take the many optimization options into account (see “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”).

¹⁰ Eric Niebler on Twitter, June 19, 2020.

¹¹ For an introduction of std::variant, see “Guideline 17: Consider std::variant for Implementing Visitor”.

¹² You should avoid going too deep, though, as you probably remember what happened to the dwarves of Moria who dug too deep…

¹³ Alternatively, you could use an array of bytes, e.g., std::byte[Capacity] or std::aligned_storage. The advantage of std::array is that it enables you to copy the buffer (if that is applicable!).

¹⁴ Note that the choice for the default arguments for Capacity and Alignment are reasonable but still arbitrary. You can, of course, use different defaults that best fit the properties of the expected actual types.

¹⁵ You might not have seen a placement new before. If that’s the case, rest assured that this form of new doesn’t perform any memory allocation, but only calls a constructor to create an object at the specified address. The only syntactic difference is that you provide an additional pointer argument to new.

¹⁶ As a reminder, since you might not see this syntax often: the template keyword in the constructor is necessary because we are trying to call a function template on a dependent name (a name whose meaning depends on a template parameter). Therefore, you have to make it clear to the compiler that the following is the beginning of a template argument list and not a less-than comparison.

¹⁷ Some people consider function pointers to be the best feature of C++. In his lightning talk, “The Very Best Feature of C++”, James McNellis demonstrates their syntactic beauty and enormous flexibility. Please do not take this too seriously, though, but rather as a humorous demonstration of a C++ imperfection.

¹⁸ At the time of writing, there is an active proposal for the std::function_ref type, a nonowning version of std::function.

¹⁹ The term cv qualified refers to the const and volatile qualifiers.

²⁰ For a reminder about lvalues and rvalues, refer to Nicolai Josuttis’s book on move semantics: C++ Move Semantics - The Complete Guide.