Separation of concerns and value semantics are two of the essential takeaways from this book that I have mentioned a couple of times by now. In this chapter, these two are beautifully combined into one of the most interesting modern C++ design patterns: Type Erasure. Since this pattern can be considered one of the hottest irons in the fire, in this chapter I will give you a very thorough, in-depth introduction to all aspects of Type Erasure. This, of course, includes all design-specific aspects and a lot of specifics about implementation details.
In “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure”, I will introduce you to Type Erasure and give you an idea why this design pattern is such a great combination of dependency reduction and value semantics. I will also give you a walkthrough of a basic, owning Type Erasure implementation.
“Guideline 33: Be Aware of the Optimization Potential of Type Erasure” is an exception: despite the fact that in this book I primarily focus on dependencies and design aspects, in this one guideline I will entirely focus on performance-related implementation details. I will show you how to apply the Small Buffer Optimization (SBO) and how to implement a manual virtual dispatch to speed up your Type Erasure implementation.
In “Guideline 34: Be Aware of the Setup Costs of Owning Type Erasure Wrappers”, we will investigate the setup costs of the owning Type Erasure implementation. We will find that there is a cost associated with value semantics that sometimes we may not be willing to pay. For this reason, we dare to take a step into the realm of reference semantics and implement a form of nonowning Type Erasure.
There are a couple of recurring pieces of advice throughout this book:
Minimize dependencies.
Separate concerns.
Prefer composition to inheritance.
Prefer nonintrusive solutions.
Prefer value semantics over reference semantics.
Used on their own, all of these have very positive effects on the quality of your code. In combination, however, these guidelines prove to be so much better. This is what you have experienced in our discussion about the External Polymorphism design pattern in “Guideline 31: Use External Polymorphism for Nonintrusive Runtime Polymorphism”. Extracting the polymorphic behavior turned out to be extremely powerful and unlocked an unprecedented level of loose coupling. Still, probably disappointingly, the demonstrated implementation of External Polymorphism did not strike you as a very modern way of solving things. Instead of following the advice to prefer value semantics, the implementation was firmly built on reference semantics: many pointers, many manual allocations, and manual lifetime management.1 Hence, the missing detail you’re waiting for is a value semantics–based implementation of the External Polymorphism design pattern. And I will not keep you waiting anymore: the resulting solution is commonly called Type Erasure.2
Before I give you a detailed introduction, let’s quickly talk about the history of Type Erasure. “Come on,” you argue. “Is this really necessary? I’m dying to finally see how this stuff works.” Well, I promise to keep it short. But yes, I feel this is a necessary detail of this discussion for two reasons. First, to demonstrate that we as a community, aside from the circle of the most experienced C++ experts, may have overlooked and ignored this technique for too long. And second, to give some well-deserved credit to the inventor of the technique.
The Type Erasure design pattern is very often attributed to one of the first and therefore most famous presentations of this technique. At the GoingNative 2013 conference, Sean Parent gave a talk called “Inheritance Is the Base Class of Evil.”3 In this talk, he recapped his experiences with the development of Photoshop and talked about the dangers and disadvantages of inheritance-based implementations. However, he also presented a solution to the inheritance problem, which later came to be known as Type Erasure.
Despite Sean’s talk being one of the first recorded, and for that reason probably the most well-known resource about Type Erasure, the technique was used long before that. For instance, Type Erasure was used in several places in the Boost libraries, for example, by Douglas Gregor for boost::function. Still, to my best knowledge, the technique was first discussed in a paper by Kevlin Henney in the July-August 2000 edition of the C++ Report.4 In this paper, Kevlin demonstrated Type Erasure with a code example that later evolved into what we today know as C++17’s std::any. Most importantly, he was the first to elegantly combine several design patterns to form a value semantics–based implementation around a collection of unrelated, nonpolymorphic types.
Since then, a lot of common types have acquired the technique to provide value types for various applications. Some of these types have even found their way into the Standard Library. For instance, we have already seen std::function, which represents a value-based abstraction of a callable.5
I’ve already mentioned std::any, which represents an abstract container-like value for virtually anything (hence the name) but without exposing any functionality:
#include <any>
#include <cstdlib>
#include <string>

using namespace std::string_literals;

int main()
{
   std::any a;          // Creating an empty 'any'
   a = 1;               // Storing an 'int' inside the 'any';
   a = "some string"s;  // Replacing the 'int' with a 'std::string'

   // There is nothing we can do with the 'any' except for getting the value back
   std::string s = std::any_cast<std::string>( a );

   return EXIT_SUCCESS;
}
And then there is std::shared_ptr, which uses Type Erasure to store the assigned deleter:
#include <cstdlib>
#include <memory>

int main()
{
   {
      // Creating a 'std::shared_ptr' with a custom deleter
      //   Note that the deleter is not part of the type!
      std::shared_ptr<int> s{ new int{42}, [](int* ptr){ delete ptr; } };
   }
   // The 'std::shared_ptr' is destroyed at the end of the scope,
   //   deleting the 'int' by means of the custom deleter.

   return EXIT_SUCCESS;
}
“It appears to be simpler to just provide a second template parameter for the deleter as std::unique_ptr does. Why isn’t std::shared_ptr implemented in the same way?” you inquire. Well, the designs of std::shared_ptr and std::unique_ptr are different for very good reasons. The philosophy of std::unique_ptr is to represent nothing but the simplest possible wrapper around a raw pointer: it should be as fast as a raw pointer, and it should have the same size as a raw pointer. For that reason, it is not desirable to store the deleter alongside the managed pointer. Consequently, std::unique_ptr is designed such that for stateless deleters, any size overhead can be avoided. However, unfortunately, this second template parameter is easily overlooked and causes artificial restrictions:
// This function takes only unique_ptrs that use the default deleter,
//   and thus is artificially restricted
template< typename T >
void func1( std::unique_ptr<T> ptr );

// This function does not care about the way the resource is cleaned up,
//   and thus is truly generic
template< typename T, typename D >
void func2( std::unique_ptr<T,D> ptr );
This kind of coupling is avoided in the design of std::shared_ptr. Since std::shared_ptr has to store many more data items in its so-called control block (that includes the reference count, the weak count, etc.), it has the opportunity to use Type Erasure to literally erase the type of the deleter, removing any kind of possible dependency.
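To make the difference tangible, here is a small sketch that contrasts the two smart pointers. The CountingDeleter type and the destroyWithCustomDeleter() helper are hypothetical and exist only for this illustration; the sketch assumes C++17.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stateless deleter that counts how often it is invoked
struct CountingDeleter
{
   void operator()( int* ptr ) const { ++calls(); delete ptr; }
   static int& calls() { static int n = 0; return n; }
};

// For 'std::unique_ptr', the deleter is part of the type:
// these are two different, unrelated types
static_assert( !std::is_same_v< std::unique_ptr<int>
                              , std::unique_ptr<int,CountingDeleter> > );

// For 'std::shared_ptr', the deleter is erased: both pointers below have the
// same type and can be assigned to each other.
// Returns how often the custom deleter was called.
int destroyWithCustomDeleter()
{
   int const before = CountingDeleter::calls();
   {
      std::shared_ptr<int> a{ new int{1} };                     // default deletion
      std::shared_ptr<int> b{ new int{2}, CountingDeleter{} };  // custom deleter
      a = b;  // 'a' releases its 'int' via plain 'delete' and now shares 'b's resource
   }
   // Both pointers are gone; the shared 'int' was cleaned up by 'CountingDeleter'
   return CountingDeleter::calls() - before;
}
```

The assignment a = b compiles only because the deleter is not part of the type of std::shared_ptr. The shared resource is released exactly once, by the deleter that was passed when it was first handed to a std::shared_ptr.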
“Wow, that truly sounds intriguing. This makes me even more excited to learn about Type Erasure.” OK then, here we go. However, please don’t expect any magic or revolutionary new ideas. Type Erasure is nothing but a compound design pattern, meaning that it is a very clever and elegant combination of three other design patterns. The three design patterns of choice are External Polymorphism (the key ingredient for achieving the decoupling effect and the nonintrusive nature of Type Erasure; see “Guideline 31: Use External Polymorphism for Nonintrusive Runtime Polymorphism”), Bridge (the key to creating a value semantics–based implementation; see “Guideline 28: Build Bridges to Remove Physical Dependencies”), and (optionally) Prototype (required to deal with the copy semantics of the resulting values; see “Guideline 30: Apply Prototype for Abstract Copy Operations”). These three design patterns form the core of Type Erasure, but of course, keep in mind that different interpretations and implementations exist, mainly to adapt to specific contexts. The point of combining these three design patterns is to create a wrapper type, which represents a loosely coupled, nonintrusive abstraction.
Intent: “Provide a value-based, non-intrusive abstraction for an extendable set of unrelated, potentially non-polymorphic types with the same semantic behavior.”
The purpose of this formulation is to be as short as possible, and as precise as necessary. However, every detail of this intent carries meaning. Thus, it may be helpful to elaborate:
The intent of Type Erasure is to create value types that may be copyable, movable, and most importantly, easily reasoned about. However, such a value type is not of the same quality as a regular value type; there are some limitations. In particular, Type Erasure works best for unary operations but has its limits for binary operations.
The intent of Type Erasure is to create an external, nonintrusive abstraction based on the example set by the External Polymorphism design pattern. All types providing the behavior expected by the abstraction are automatically supported, without the need to apply any modifications to them.
Type Erasure is firmly based on object-oriented principles, i.e., it enables you to add types easily. These types, though, should not be connected in any way. They do not have to share common behavior via some base class. Instead, it should be possible to add any fitting type, without any intrusive measure, to this set of types.
As demonstrated with the External Polymorphism design pattern, types should not have to buy into the set by inheritance. They should also not have to provide virtual functionality on their own, but they should be decoupled from their polymorphic behavior. However, types with base classes or virtual functions are not excluded.
The goal is not to provide an abstraction for all possible types but to provide a semantic abstraction for a set of types that provide the same operations (including same syntax) and adhere to some expected behavior, according to the LSP (see “Guideline 6: Adhere to the Expected Behavior of Abstractions”). If possible, for any type that does not provide the expected functionality, a compile-time error should be created.
With this formulation of the intent in mind, let’s take a look at the dependency graph of Type Erasure (see Figure 8-1). The graph should look very familiar, as the structure of the pattern is dominated by the inherent structure of the External Polymorphism design pattern (see Figure 7-8). The most important difference and addition is the Shape class on the highest level of the architecture. This class serves as a wrapper around the external hierarchy introduced by External Polymorphism. Primarily, since this external hierarchy will not be used directly anymore, but also to reflect the fact that ShapeModel is storing, or “owning,” a concrete type, the name of the class template has been adapted to OwningShapeModel.
OK, but now, with the structure of Type Erasure in mind, let’s take a look at its implementation details. Still, despite the fact that you’ve seen all the ingredients in action before, the implementation details are not particularly beginner-friendly and are not for the fainthearted. And that is despite the fact that I have picked the simplest Type Erasure implementation I’m aware of. Therefore, I will try to keep everything at a reasonable level and not stray too much into the realm of implementation details. Among other things, this means that I won’t try to squeeze out every tiny bit of performance. For instance, I won’t use forwarding references or avoid dynamic memory allocations. Also, I will favor readability and code clarity. While this may be a disappointment to you, I believe that will save us a lot of headache. However, if you want to dig deeper into the implementation details and optimization options, I recommend taking a look at “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”.
We again start with the Circle and Square classes:
//---- <Circle.h> ----------------

class Circle
{
 public:
   explicit Circle( double radius )
      : radius_( radius )
   {}

   double radius() const { return radius_; }
   /* Several more getters and circle-specific utility functions */

 private:
   double radius_;
   /* Several more data members */
};


//---- <Square.h> ----------------

class Square
{
 public:
   explicit Square( double side )
      : side_( side )
   {}

   double side() const { return side_; }
   /* Several more getters and square-specific utility functions */

 private:
   double side_;
   /* Several more data members */
};
These two classes have not changed since we last encountered them in the discussion of External Polymorphism. But it still pays off to again stress that these two are completely unrelated, do not know about each other, and, most importantly, are nonpolymorphic, meaning that they do not inherit from any base class or introduce virtual functions on their own.
We have also seen the ShapeConcept and OwningShapeModel classes before, the latter under the name ShapeModel:
//---- <Shape.h> ----------------

#include <memory>
#include <utility>

namespace detail {

class ShapeConcept
{
 public:
   virtual ~ShapeConcept() = default;
   virtual void draw() const = 0;
   virtual std::unique_ptr<ShapeConcept> clone() const = 0;
};

template< typename ShapeT
        , typename DrawStrategy >
class OwningShapeModel : public ShapeConcept
{
 public:
   explicit OwningShapeModel( ShapeT shape, DrawStrategy drawer )
      : shape_{ std::move(shape) }
      , drawer_{ std::move(drawer) }
   {}

   void draw() const override { drawer_(shape_); }

   std::unique_ptr<ShapeConcept> clone() const override
   {
      return std::make_unique<OwningShapeModel>(*this);
   }

 private:
   ShapeT shape_;
   DrawStrategy drawer_;
};

} // namespace detail
Apart from the name change, there are a couple of other important differences. For instance, both classes have been moved to the detail namespace. The name of the namespace indicates that these two classes are now becoming implementation details, i.e., they are not intended for direct use anymore.6 The ShapeConcept class still introduces the pure virtual function draw() to represent the requirement for drawing a shape. In addition, ShapeConcept now also introduces a pure virtual clone() function.
“I know what this is, this is the Prototype design pattern!” you exclaim. Yes, correct. The name clone() is very strongly connected to Prototype and is a strong indication of this design pattern (but not a guarantee). However, although the choice of the function name is very reasonable and canonical, allow me to point out explicitly that the choice of the function name for clone(), and also for draw(), is our own: these names are now implementation details and do not have any relationship to the names that we require from our ShapeT types. We could as well name them do_draw() and do_clone(), and it would not have any consequence on the ShapeT types. The real requirement on the ShapeT types is defined by the implementation of the draw() and clone() functions.
As ShapeConcept is again the base class for the external hierarchy, the draw() function, the clone() function, and the destructor represent the set of requirements for all kinds of shapes. This means that all shapes must provide some drawing behavior and must be copyable and destructible. Note that these three functions are only requirement choices for this example. In particular, copyability is not a general requirement for all implementations of Type Erasure.
The OwningShapeModel class again represents the one and only implementation of the ShapeConcept class. As before, OwningShapeModel takes a concrete shape type and a drawing Strategy in its constructor and uses these to initialize its two data members (shape_ and drawer_). Since OwningShapeModel inherits from ShapeConcept, it must implement the two pure virtual functions. The draw() function is implemented by applying the given drawing Strategy, while the clone() function is implemented to return an exact copy of the corresponding OwningShapeModel.
If you’re right now thinking, “Oh no, std::make_unique(). That means dynamic memory. Then I can’t use that in my code!”, don’t worry. std::make_unique() is merely an implementation detail, a choice to keep the example simple. In “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”, you will see how to avoid dynamic memory with the SBO.
“I’m pretty unimpressed so far. We’ve barely moved beyond the implementation of the External Polymorphism design pattern.” I completely understand the criticism. However, we are just one step away from turning External Polymorphism into Type Erasure, just one step away from switching from reference semantics to value semantics. All we need is a value type, a wrapper around the external hierarchy introduced by ShapeConcept and OwningShapeModel, that handles all the details that we don’t want to perform manually: the instantiation of the OwningShapeModel class template, managing pointers, performing allocations, and dealing with lifetime. This wrapper is given in the form of the Shape class:
//---- <Shape.h> ----------------

// ...

class Shape
{
 public:
   template< typename ShapeT
           , typename DrawStrategy >
   Shape( ShapeT shape, DrawStrategy drawer )
   {
      using Model = detail::OwningShapeModel<ShapeT,DrawStrategy>;
      pimpl_ = std::make_unique<Model>( std::move(shape)
                                      , std::move(drawer) );
   }

   // ...

 private:
   // ...

   std::unique_ptr<detail::ShapeConcept> pimpl_;
};
The first, and perhaps most important, detail about the Shape class is the templated constructor. As the first argument, this constructor takes any kind of shape (called ShapeT), and as the second argument, the desired DrawStrategy. To simplify the instantiation of the corresponding detail::OwningShapeModel class template, it proves to be helpful to use a convenient type alias. This alias is used to instantiate the required model by std::make_unique(). Both the shape and the drawing Strategy are passed to the new model. The newly created model is used to initialize the one data member of the Shape class: the pimpl_.
“I recognize this one, too; this is a Bridge!” you happily announce. Yes, correct again. This is an application of the Bridge design pattern. In the construction, we create a concrete OwningShapeModel based on the actual given types ShapeT and DrawStrategy, but we store it as a pointer to ShapeConcept. By doing this you create a Bridge to the implementation details, a Bridge to the real shape type. However, after the initialization of pimpl_, after the constructor is finished, Shape doesn’t remember the actual type. Shape does not have a template parameter or any member function that would reveal the concrete type it stores, and there is no data member that remembers the given type. All it holds is a pointer to the ShapeConcept base class. Thus, its memory of the real shape type has been erased. Hence the name of the design pattern: Type Erasure.
The only thing missing in our Shape class is the functionality required for a true value type: the copy and move operations. Luckily, due to the application of std::unique_ptr, our effort is pretty limited. Since the compiler-generated destructor and the two move operations will work, we only need to deal with the two copy operations:
//---- <Shape.h> ----------------

// ...

class Shape
{
 public:
   // ...

   Shape( Shape const& other )
      : pimpl_( other.pimpl_->clone() )
   {}

   Shape& operator=( Shape const& other )
   {
      // Copy-and-Swap Idiom
      Shape copy( other );
      pimpl_.swap( copy.pimpl_ );
      return *this;
   }

   ~Shape() = default;
   Shape( Shape&& ) = default;
   Shape& operator=( Shape&& ) = default;

 private:
   friend void draw( Shape const& shape )
   {
      shape.pimpl_->draw();
   }

   // ...
};
The copy constructor could be a very difficult function to implement, since we do not know the concrete type of shape stored in the other Shape. However, by providing the clone() function in the ShapeConcept base class, we can ask for an exact copy without needing to know anything about the concrete type. The shortest, most painless, and most convenient way to implement the copy assignment operator is to build on the Copy-and-Swap idiom.
In addition, the Shape class provides a so-called hidden friend called draw(). This friend function is called a hidden friend, since although it’s a free function, it is defined within the body of the Shape class. As a friend, it’s granted full access to the private data member and will be injected into the surrounding namespace.
“Didn’t you say that friends are bad?” you ask. I admit, that’s what I said in “Guideline 4: Design for Testability”. However, I also explicitly stated that hidden friends are OK. In this case, the draw() function is an integral part of the Shape class and definitely a real friend (almost part of the family). “But then it should be a member function, right?” you argue. Indeed, that would be a valid alternative. If you like this better, go for it. In this case, my preference is to use a free function, since one of our goals was to reduce dependencies by extracting the draw() operation. This goal should also be reflected in the Shape implementation. However, since the function requires access to the pimpl_ data member, and in order to not increase the overload set of draw() functions, I implement it as a hidden friend.
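In case the lookup rules for hidden friends are new to you, the following sketch may help. The geo namespace, the Widget class, and the inspect() and inspectViaAdl() functions are hypothetical names chosen only for this illustration.

```cpp
#include <cassert>

namespace geo {  // hypothetical namespace for this illustration

class Widget
{
 public:
   explicit Widget( int value ) : value_( value ) {}

 private:
   // Hidden friend: a free function that is defined inside the class body.
   // It may access 'value_', but it is only found by argument-dependent
   // lookup (ADL), i.e., when one of the arguments is a 'Widget'.
   friend int inspect( Widget const& w ) { return w.value_; }

   int value_;
};

} // namespace geo

int inspectViaAdl()
{
   geo::Widget w{ 42 };
   return inspect( w );  // OK: found via ADL because 'w' is a 'geo::Widget'
   // 'geo::inspect( w )' would not compile: qualified name lookup does not
   // find a friend that is declared only inside the class body
}
```

Because the function participates in overload resolution only for calls that involve a Widget, it does not enlarge the overload set seen by unrelated calls.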
This is it. All of it. Let’s take a look at how beautifully the new functionality works:
//---- <Main.cpp> ----------------

#include <Circle.h>
#include <Square.h>
#include <Shape.h>
#include <cstdlib>

int main()
{
   // Create a circle as one representative of a concrete shape type
   Circle circle{ 3.14 };

   // Create a drawing strategy in the form of a lambda
   auto drawer = []( Circle const& c ){ /*...*/ };

   // Combine the shape and the drawing strategy in a 'Shape' abstraction
   // This constructor call will instantiate a 'detail::OwningShapeModel' for
   // the given 'Circle' and lambda types
   Shape shape1( circle, drawer );

   // Draw the shape
   draw( shape1 );

   // Create a copy of the shape by means of the copy constructor
   Shape shape2( shape1 );

   // Drawing the copy will result in the same output
   draw( shape2 );

   return EXIT_SUCCESS;
}
We first create shape1 as an abstraction for a Circle and an associated drawing Strategy. This feels easy, right? There’s no need to manually allocate and no need to deal with pointers. With the draw() function, we’re able to draw this Shape. Directly afterward, we create a copy of the shape. A real copy, a “deep copy,” not just the copy of a pointer. Drawing the copy with the draw() function will result in the same output. Again, this feels good: you can rely on the copy operations of the value type (in this case, the copy constructor), and you do not have to clone() manually.
Pretty amazing, right? And definitely much better than using External Polymorphism manually. I admit that after all these implementation details, it may be a little hard to see it right away, but if you step through the jungle of implementation details, I hope you realize the beauty of this approach: you no longer have to deal with pointers, there are no manual allocations, and you don’t have to deal with inheritance hierarchies anymore. All of these details are there, yes, but all evidence is nicely encapsulated within the Shape class. Still, you didn’t lose any of the decoupling benefits: you are still able to easily add new types, and the concrete shape types are still oblivious about the drawing behavior. They are only connected to the desired functionality via the Shape constructor.
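To see that different shape types really can be treated as values of one type, here is a condensed, self-contained sketch that stores a Circle and a Square in one std::vector<Shape>. The classes repeat the ones shown above in compact form. The drawAll() helper, and the idea of recording a name in a log instead of actually drawing, are assumptions made only to keep the example observable.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class Circle {
 public:
   explicit Circle( double radius ) : radius_( radius ) {}
   double radius() const { return radius_; }
 private:
   double radius_;
};

class Square {
 public:
   explicit Square( double side ) : side_( side ) {}
   double side() const { return side_; }
 private:
   double side_;
};

namespace detail {

class ShapeConcept {
 public:
   virtual ~ShapeConcept() = default;
   virtual void draw() const = 0;
   virtual std::unique_ptr<ShapeConcept> clone() const = 0;
};

template< typename ShapeT, typename DrawStrategy >
class OwningShapeModel : public ShapeConcept {
 public:
   explicit OwningShapeModel( ShapeT shape, DrawStrategy drawer )
      : shape_{ std::move(shape) }, drawer_{ std::move(drawer) } {}
   void draw() const override { drawer_(shape_); }
   std::unique_ptr<ShapeConcept> clone() const override {
      return std::make_unique<OwningShapeModel>(*this);
   }
 private:
   ShapeT shape_;
   DrawStrategy drawer_;
};

} // namespace detail

class Shape {
 public:
   template< typename ShapeT, typename DrawStrategy >
   Shape( ShapeT shape, DrawStrategy drawer )
      : pimpl_{ std::make_unique< detail::OwningShapeModel<ShapeT,DrawStrategy> >(
                   std::move(shape), std::move(drawer) ) } {}
   Shape( Shape const& other ) : pimpl_( other.pimpl_->clone() ) {}
   Shape& operator=( Shape const& other ) {
      Shape copy( other );
      pimpl_.swap( copy.pimpl_ );
      return *this;
   }
   ~Shape() = default;
   Shape( Shape&& ) = default;
   Shape& operator=( Shape&& ) = default;
 private:
   friend void draw( Shape const& shape ) { shape.pimpl_->draw(); }
   std::unique_ptr<detail::ShapeConcept> pimpl_;
};

// Hypothetical helper: "draws" every shape by recording its name in a log
std::vector<std::string> drawAll()
{
   std::vector<std::string> log;
   std::vector<Shape> shapes;

   // Two unrelated types, each paired with its own drawing strategy
   shapes.emplace_back( Circle{ 2.3 }, [&log]( Circle const& ){ log.push_back( "circle" ); } );
   shapes.emplace_back( Square{ 1.2 }, [&log]( Square const& ){ log.push_back( "square" ); } );

   // A deep copy of the first shape, created by the copy constructor
   Shape copy( shapes.front() );
   shapes.push_back( std::move(copy) );

   for( auto const& shape : shapes ) {
      draw( shape );
   }
   return log;
}
```

Neither Circle nor Square knows anything about drawing, and supporting another shape type would only require another call to the Shape constructor.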
“I’m wondering,” you begin to ask, “Couldn’t we make this much easier? I envision a main() function that looks like this”:
//---- <YourMain.cpp> ----------------

int main()
{
   // Create a circle as one representative of a concrete shape type
   Circle circle{ 3.14 };

   // Bind the circle to some drawing functionality
   auto drawingCircle = [=]() { myCircleDrawer(circle); };

   // Type-erase the circle equipped with drawing behavior
   Shape shape( drawingCircle );

   // Drawing the shape
   draw( shape );

   // ...

   return EXIT_SUCCESS;
}
That is a great idea. Remember, you are in charge of all the implementation details of the Type Erasure wrapper and how to bring together types and their operation implementation. If you like this form better, go for it! However, please do not forget that in our Shape example, for the sake of simplicity and code brevity, I have deliberately used only a single functionality with external dependencies (drawing). There could be more functions that introduce dependencies, such as the serialization of shapes. In that case, the lambda approach would not work, as you would need multiple, named functions (e.g., draw() and serialize()). So, ultimately, it depends. It depends on what kind of abstraction your Type Erasure wrapper represents. But whatever implementation you prefer, just make sure that you do not introduce artificial dependencies between the different pieces of functionality and/or code duplication. In other words, remember “Guideline 2: Design for Change”! That is the reason I favored the solution based on the Strategy design pattern, which you, however, shouldn’t consider the true and only solution. On the contrary, you should strive to fully exploit the potential of the loose coupling of Type Erasure.
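If you want to experiment with the lambda-based form, one possible sketch of such a wrapper looks like this. It assumes that the wrapper accepts any copyable callable that takes no arguments. The nested Concept and Model types, the additional std::string parameter of myCircleDrawer(), and the drawBoundCircle() helper are hypothetical and exist only to make the example self-contained and observable.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

class Circle {
 public:
   explicit Circle( double radius ) : radius_( radius ) {}
   double radius() const { return radius_; }
 private:
   double radius_;
};

// Hypothetical stand-in for a real drawing function: it only records a name
void myCircleDrawer( Circle const&, std::string& out ) { out += "circle"; }

class Shape
{
 public:
   // Accepts any copyable callable without parameters (assumption of this sketch)
   template< typename DrawableT >
   explicit Shape( DrawableT drawable )
      : pimpl_{ std::make_unique< Model<DrawableT> >( std::move(drawable) ) }
   {}

   Shape( Shape const& other ) : pimpl_( other.pimpl_->clone() ) {}
   Shape& operator=( Shape const& other )
   {
      Shape copy( other );
      pimpl_.swap( copy.pimpl_ );
      return *this;
   }
   ~Shape() = default;
   Shape( Shape&& ) = default;
   Shape& operator=( Shape&& ) = default;

 private:
   struct Concept
   {
      virtual ~Concept() = default;
      virtual void draw() const = 0;
      virtual std::unique_ptr<Concept> clone() const = 0;
   };

   template< typename DrawableT >
   struct Model final : Concept
   {
      explicit Model( DrawableT drawable ) : drawable_{ std::move(drawable) } {}
      void draw() const override { drawable_(); }  // The bound callable does all the work
      std::unique_ptr<Concept> clone() const override
      {
         return std::make_unique<Model>(*this);
      }
      DrawableT drawable_;
   };

   friend void draw( Shape const& shape ) { shape.pimpl_->draw(); }

   std::unique_ptr<Concept> pimpl_;
};

// Hypothetical helper mirroring the 'main()' function envisioned above
std::string drawBoundCircle()
{
   std::string out;
   Circle circle{ 3.14 };

   // Bind the circle to some drawing functionality
   auto drawingCircle = [circle, &out]() { myCircleDrawer( circle, out ); };

   Shape shape( drawingCircle );  // Type-erase the bound callable
   Shape copy( shape );           // Copies the lambda, including the bound circle
   draw( copy );

   return out;
}
```

Note that in this form the wrapper no longer knows anything about shapes at all: the circle is hidden inside the lambda, so every further operation would need its own bound callable.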
Despite the beauty of Type Erasure and the large number of benefits that you acquire, especially from a design perspective, I don’t pretend that there are no downsides to this design pattern. No, it wouldn’t be fair to keep potential disadvantages from you.
The first, and probably most obvious, drawback for you might be the implementation complexity of this pattern. As stated before, I have explicitly kept the implementation details at a reasonable level, which hopefully helped you to get the idea. I hope I have also given you the impression that it is not so difficult after all: a basic implementation of Type Erasure can be realized within approximately 30 lines of code. Still, you might feel that it is too complex. Also, as soon as you start to go beyond the basic implementation and consider performance, exception safety, etc., the implementation details indeed become quite tricky very quickly. In these cases, your safest and most convenient option is to use a third-party library instead of dealing with all of these details yourself. Possible libraries include the dyno library from Louis Dionne, the zoo library from Eduardo Madrid, the erasure library from Gašper Ažman, and the Boost Type Erasure library from Steven Watanabe.
In the explanation of the intent of Type Erasure, I mentioned the second disadvantage, which is much more important and limiting: although we are now dealing with values that can be copied and moved, using Type Erasure for binary operations is not straightforward. For instance, it is not easily possible to do an equality comparison on these values, as you would expect from regular values:
int main()
{
   // ...

   if( shape1 == shape2 ) { /*...*/ }  // Does not compile!

   return EXIT_SUCCESS;
}
The reason is that, after all, Shape is only an abstraction from a concrete shape type and only stores a pointer-to-base. As you would deal with exactly the same problem if you used External Polymorphism directly, this is definitely not a new problem in Type Erasure, and you might not even count this as a real disadvantage. Still, while equality comparison is not an expected operation when you’re dealing with pointers-to-base, it usually is an expected operation on values.
“Isn’t this just a question of exposing the necessary functionality in the interface of Shapes?” you wonder. “For instance, we could simply add an area() function to the public interface of shapes and use it to compare two items”:
bool operator==( Shape const& lhs, Shape const& rhs )
{
   return lhs.area() == rhs.area();
}
“This is easy to do. So what am I missing?” I agree that this might be all you need: if two objects count as equal whenever some public properties are equal, then this operator will work for you. In general, the answer would have to be “it depends.” In this particular case, it depends on the semantics of the abstraction that the Shape class represents. The question is: when are two Shapes equal? Consider the following example with a Circle and a Square:
#include <Circle.h>
#include <Square.h>
#include <cstdlib>

int main()
{
   Shape shape1( Circle{3.14} );
   Shape shape2( Square{2.71} );

   if( shape1 == shape2 ) { /*...*/ }

   return EXIT_SUCCESS;
}
When are these two Shapes equal? Are they equal if their areas are equal, or are they equal if the instances behind the abstraction are equal, meaning that both Shapes are of the same type and have the same properties? It depends. In the same spirit, I could ask the question, when are two Persons equal? Are they equal if their first names are equal? Or are they equal if all of their characteristics are equal? It depends on the desired semantics. And while the first comparison is easily done, the second one is not. In the general case, I assume that the second situation is far more likely to be the desired semantics, and therefore I argue that using Type Erasure for equality comparison and more generally for binary operations is not straightforward.
Note, however, that I did not say that equality comparison is impossible. Technically, you can make it work, although it turns out to be a rather ugly solution. Therefore, you have to promise not to tell anyone that you got this idea from me. “You just made me even more curious,” you smile whimsically. OK, so here it is:
//---- <Shape.h> ----------------

// ...

namespace detail {

class ShapeConcept
{
 public:
   // ...
   virtual bool isEqual( ShapeConcept const* c ) const = 0;
};

template< typename ShapeT
        , typename DrawStrategy >
class OwningShapeModel : public ShapeConcept
{
 public:
   // ...

   bool isEqual( ShapeConcept const* c ) const override
   {
      using Model = OwningShapeModel<ShapeT,DrawStrategy>;
      auto const* model = dynamic_cast<Model const*>( c );
      return ( model && shape_ == model->shape_ );
   }

 private:
   // ...
};

} // namespace detail


class Shape
{
   // ...

 private:
   friend bool operator==( Shape const& lhs, Shape const& rhs )
   {
      return lhs.pimpl_->isEqual( rhs.pimpl_.get() );
   }

   friend bool operator!=( Shape const& lhs, Shape const& rhs )
   {
      return !( lhs == rhs );
   }

   // ...
};


//---- <Circle.h> ----------------

class Circle
{
   // ...
};

bool operator==( Circle const& lhs, Circle const& rhs )
{
   return lhs.radius() == rhs.radius();
}


//---- <Square.h> ----------------

class Square
{
   // ...
};

bool operator==( Square const& lhs, Square const& rhs )
{
   return lhs.side() == rhs.side();
}
To make equality comparison work, you could use a dynamic_cast. However, this implementation of equality comparison holds two severe disadvantages. First, as you saw in “Guideline 18: Beware the Performance of Acyclic Visitor”, a dynamic_cast does most certainly not count as a fast operation. Hence, you would have to pay a considerable runtime cost for every comparison. Second, in this implementation, you can only successfully compare two Shapes if they are equipped with the same DrawStrategy. While this might be reasonable in one context, it might also be considered an unfortunate limitation in another context. The only solution I am aware of is to return to std::function to store the drawing Strategy, which, however, would result in another performance penalty.7 In summary, depending on the context, equality comparison may be possible, but it’s usually neither easy nor cheap to accomplish. This is evidence for my earlier statement that Type Erasure doesn’t support binary operations.
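To see the dynamic_cast approach in action, here is a minimal, self-contained sketch. It is deliberately simplified and not the Shape from this guideline: I have dropped the DrawStrategy, turned Circle and Square into plain structs, and added three hypothetical helper functions (sameCircles(), differentCircles(), circleVsSquare()) purely for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical minimal shapes; only what the comparison needs.
struct Circle { double radius; };
struct Square { double side; };

bool operator==( Circle const& lhs, Circle const& rhs ) { return lhs.radius == rhs.radius; }
bool operator==( Square const& lhs, Square const& rhs ) { return lhs.side == rhs.side; }

class Shape
{
   struct Concept
   {
      virtual ~Concept() = default;
      virtual bool isEqual( Concept const* c ) const = 0;
   };

   template< typename ShapeT >
   struct Model : public Concept
   {
      explicit Model( ShapeT shape ) : shape_( std::move(shape) ) {}

      bool isEqual( Concept const* c ) const override
      {
         // The cast succeeds only if 'c' holds exactly the same model type.
         auto const* model = dynamic_cast<Model const*>( c );
         return ( model && shape_ == model->shape_ );
      }

      ShapeT shape_;
   };

   std::unique_ptr<Concept> pimpl_;

 public:
   template< typename ShapeT >
   Shape( ShapeT shape )
      : pimpl_( std::make_unique<Model<ShapeT>>( std::move(shape) ) )
   {}

   friend bool operator==( Shape const& lhs, Shape const& rhs )
   {
      return lhs.pimpl_->isEqual( rhs.pimpl_.get() );
   }
};

// Illustration-only helpers.
bool sameCircles()      { return Shape{ Circle{2.0} } == Shape{ Circle{2.0} }; }
bool differentCircles() { return Shape{ Circle{2.0} } == Shape{ Circle{3.0} }; }
bool circleVsSquare()   { return Shape{ Circle{2.0} } == Shape{ Square{2.0} }; }
```

Note that a circle and a square never compare equal here, even with identical numbers: the dynamic_cast fails and the comparison short-circuits to false. With a second template parameter for the drawing Strategy, the same would happen for two equal circles that carry different Strategy types.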
“What about the Interface Segregation Principle (ISP)?” you ask. “While using
External Polymorphism, it was easy to separate concerns in the base class. It appears
we’ve lost this ability, right?” Excellent question. So you remember my example with
the JSONExportable
and Serializable
base classes in
“Guideline 31: Use External Polymorphism for Nonintrusive Runtime Polymorphism”.
Indeed, with Type Erasure we are no longer able to use the hidden base class,
only the abstracting value type. Therefore, it may appear as if the ISP is out
of reach:
class Document  // Type-erased 'Document'
{
 public:
   // ...
   void exportToJSON( /*...*/ ) const;
   void serialize( ByteStream& bs, /*...*/ ) const;
   // ...
};

// Artificial coupling to 'ByteStream', although only the JSON export is needed
void exportDocument( Document const& doc )
{
   // ...
   doc.exportToJSON( /* pass necessary arguments */ );
   // ...
}
However, fortunately, this impression is incorrect. You can easily adhere to the ISP by providing several type-erased abstractions:8
Document doc = /*...*/;  // Type-erased 'Document'
doc.exportToJSON( /* pass necessary arguments */ );
doc.serialize( /* pass necessary arguments */ );

JSONExportable jdoc = doc;  // Type-erased 'JSONExportable'
jdoc.exportToJSON( /* pass necessary arguments */ );

Serializable sdoc = doc;  // Type-erased 'Serializable'
sdoc.serialize( /* pass necessary arguments */ );
Before considering this, take a look at “Guideline 34: Be Aware of the Setup Costs of Owning Type Erasure Wrappers”.
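If you wonder what such segregated abstractions might look like, consider the following minimal sketch. Everything in it is an assumption made for illustration: the concrete type Note, the choice to return std::string instead of writing to a ByteStream, and the two free functions exportDocument() and storeDocument(). The wrappers follow the same basic owning structure as in this guideline, just without copy support.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Type-erased abstraction for anything that offers 'exportToJSON() const'.
class JSONExportable
{
   struct Concept
   {
      virtual ~Concept() = default;
      virtual std::string exportToJSON() const = 0;
   };

   template< typename T >
   struct Model : public Concept
   {
      explicit Model( T value ) : value_( std::move(value) ) {}
      std::string exportToJSON() const override { return value_.exportToJSON(); }
      T value_;
   };

   std::unique_ptr<Concept> pimpl_;

 public:
   template< typename T >
   JSONExportable( T value )
      : pimpl_( std::make_unique<Model<T>>( std::move(value) ) )
   {}

   std::string exportToJSON() const { return pimpl_->exportToJSON(); }
};

// Type-erased abstraction for anything that offers 'serialize() const'.
class Serializable
{
   struct Concept
   {
      virtual ~Concept() = default;
      virtual std::string serialize() const = 0;
   };

   template< typename T >
   struct Model : public Concept
   {
      explicit Model( T value ) : value_( std::move(value) ) {}
      std::string serialize() const override { return value_.serialize(); }
      T value_;
   };

   std::unique_ptr<Concept> pimpl_;

 public:
   template< typename T >
   Serializable( T value )
      : pimpl_( std::make_unique<Model<T>>( std::move(value) ) )
   {}

   std::string serialize() const { return pimpl_->serialize(); }
};

// Hypothetical document type that happens to support both operations.
struct Note
{
   std::string text;
   std::string exportToJSON() const { return "{\"text\":\"" + text + "\"}"; }
   std::string serialize() const { return "N:" + text; }
};

// Each function depends only on the one abstraction it actually needs.
std::string exportDocument( JSONExportable const& doc ) { return doc.exportToJSON(); }
std::string storeDocument( Serializable const& doc ) { return doc.serialize(); }
```

exportDocument() no longer knows anything about serialization, which is exactly what the ISP asks for. The price is that every conversion into one of the wrappers creates a copy and performs an allocation.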
“Apart from the implementation complexity and the restriction to unary operations, there seem to be no disadvantages. Well, then, I have to say this is amazing stuff indeed! The benefits clearly outweigh the drawbacks.” Well, of course it always depends, meaning that in a specific context some of these issues might cause some pain. But I agree that, altogether, Type Erasure proves to be a very valuable design pattern. From a design perspective, you’ve gained a formidable level of decoupling, which will definitely lead to less pain when changing or extending your software. However, although this is already fascinating, there’s more. I’ve mentioned performance a couple of times but haven’t yet shown any performance numbers. So let’s take a look at the performance results.
Before showing you the performance results for Type Erasure, let me remind you about the
benchmark scenario that we also used to benchmark the Visitor and Strategy solutions
(see Table 4-2 in
“Guideline 16: Use Visitor to Extend Operations” and Table 5-1 in
“Guideline 23: Prefer a Value-Based Implementation of Strategy and Command”).
This time I have extended the benchmark with a Type Erasure solution based on the
OwningShapeModel
implementation. For the benchmark, we are still using four different kinds
of shapes (circles, squares, ellipses, and rectangles). And again, I’m running 25,000 translate
operations on 10,000 randomly created shapes. I use both GCC 11.1 and Clang 11.1, and for both
compilers, I’m adding only the -O3
and -DNDEBUG
compilation flags. The platform I’m using
is macOS Big Sur (version 11.4) on an 8-Core Intel Core i7 with 3.8 GHz, 64 GB of main memory.
Table 8-1 shows the performance numbers. For your convenience, I reproduced the performance results from the Strategy benchmarks. After all, the Strategy design pattern is the solution that is aiming at the same design space. The most interesting line, though, is the last line. It shows the performance result for the Type Erasure design pattern.
| Type Erasure implementation | GCC 11.1 | Clang 11.1 |
|---|---|---|
| Object-oriented solution | 1.5205 s | 1.1480 s |
| std::function | 2.1782 s | 1.4884 s |
| Manual implementation of std::function | 1.6354 s | 1.4465 s |
| Classic Strategy | 1.6372 s | 1.4046 s |
| Type Erasure | 1.5298 s | 1.1561 s |
“Looks very interesting. Type Erasure seems to be pretty fast. Apparently only the
object-oriented solution is faster.” Yes. For Clang, the performance of the
object-oriented solution is a little better. But only a little. However, please remember
that the object-oriented solution does not decouple anything: the
draw()
function is implemented as a virtual member function in the Shape
hierarchy,
and thus you experience heavy coupling to the drawing functionality. While this may come
with little
performance overhead, from a design perspective, this is a worst-case scenario.
Taking this into account, the performance numbers of Type Erasure are truly marvelous: it
performs between 6% and 20% better than any Strategy implementation. Thus, Type Erasure
not only provides the strongest decoupling but also performs better than all the other
attempts to reduce coupling.9
In summary, Type Erasure is an amazing approach to achieve both efficient and loosely coupled code. While it may have a few limitations and disadvantages, the one thing you probably cannot ignore easily is the complex implementation details. For that reason, many people, including me and Eric Niebler, feel that Type Erasure should become a language feature:10
If I could go back in time and had the power to change C++, rather than adding virtual functions, I would add language support for type erasure and concepts. Define a single-type concept, automatically generate a type-erasing wrapper for it.
There is more to be done, though, to establish Type Erasure as a real design pattern.
I have introduced Type Erasure as a compound design pattern built
from External Polymorphism, Bridge, and Prototype. I’ve introduced it as a
value-based technique for providing strong decoupling of a set of types from their
associated operations. However, unfortunately, you might see other “forms” of Type
Erasure: over time, the term Type Erasure has been misused and abused for all kinds
of techniques and concepts. For instance, sometimes people refer to a void*
as
Type Erasure. Rarely, you also hear about Type Erasure in the context of inheritance
hierarchies, or more specifically a pointer-to-base. And finally, you also might hear
about Type Erasure in the context of std::variant
.11
The std::variant
example especially demonstrates how deeply flawed this overuse of
the term Type Erasure really is. While External Polymorphism, the main design pattern
behind Type Erasure, is about enabling you to add new types, the Visitor design
pattern and its modern implementation as std::variant
are about adding new
operations (see “Guideline 15: Design for the Addition of
Types or Operations”). From a
software design perspective, these two solutions are completely orthogonal to each
other: while Type Erasure truly decouples from concrete types and erases type
information, the template arguments of std::variant
reveal all possible alternatives
and therefore make you depend on these types. Using the same term for both of them
results in exactly zero information conveyed when using the term Type Erasure and generates these types of comments: “I
would suggest we use Type Erasure to solve this problem.” “Could you please be more
specific? Do you want to add types or operations?” As such, the term would not fulfill
the qualities of a design pattern; it wouldn’t carry any intent. Therefore, it would
be useless.
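The difference becomes tangible in a few lines of code. The following sketch uses a hypothetical pair of shapes and a hypothetical Name visitor to show the std::variant side of the story; none of these names stem from the earlier examples.

```cpp
#include <cassert>
#include <string>
#include <variant>

struct Circle { double radius; };
struct Square { double side; };

// The closed set of alternatives is part of the type: everyone who uses
// 'Shape' depends on both 'Circle' and 'Square'.
using Shape = std::variant<Circle,Square>;

// A new operation can be added without touching 'Circle', 'Square', or 'Shape'.
struct Name
{
   std::string operator()( Circle const& ) const { return "circle"; }
   std::string operator()( Square const& ) const { return "square"; }
};

std::string nameOf( Shape const& shape )
{
   return std::visit( Name{}, shape );
}
```

Adding an operation is a local change, but adding a third shape means changing the Shape alias and every visitor. With Type Erasure, it is the other way around: the wrapper never names a concrete shape type, but the set of operations is fixed in its interface.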
To give Type Erasure its well-earned place in the hall of design patterns and to give it any meaning, consider using the term only for the intent discussed in this guideline.
The primary focus of this book is software design. Therefore, all this talk about structuring software, about design principles, about tools for managing dependencies and abstractions, and, of course, all the information on design patterns is at the center of interest. Still, I’ve mentioned a few times that performance is important. Very important! After all, C++ is a performance-centric programming language. Therefore, I now make an exception: this guideline is devoted to performance. Yes, I’m serious: no talk about dependencies, (almost) no examples for separation of concerns, no value semantics. Just performance. “Finally, some performance stuff—great!” you cheer. However, be aware of the consequences: this guideline is pretty heavy on implementation details. And as it is in C++, mentioning one detail requires you to also deal with two more details, and so you are pretty quickly sucked into the realm of implementation details. To avoid that (and to keep my publisher happy), I will not elaborate on every implementation detail or demonstrate all the alternatives. I will, however, give additional references that should help you to dig deeper.12
In “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure”, you saw great performance numbers for our basic, unoptimized Type Erasure implementation. However, since we are now in possession of a value type and a wrapper class, not just a pointer, we have gained a multitude of opportunities to speed up performance. This is why we will take a look at two options to improve performance: the SBO and manual virtual dispatch.
Let’s start our quest to speed up the performance of our Type Erasure implementation. One of the first things that usually comes to mind when talking about performance is optimizing memory allocations. This is because acquiring and freeing dynamic memory can be very slooowww and nondeterministic. And for real: optimizing memory allocations can make all the difference between slow and lightning fast.
However, there is a second reason to look into memory. In
“Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure”,
I might have accidentally given you the impression that we need dynamic memory to pull off
Type Erasure. Indeed, one of the initial implementation details in our first Shape
class
was the unconditional dynamic memory allocation in the constructor and clone()
function,
independent of the size of the given object, so for both small and large objects, we would
always perform a dynamic memory allocation with std::make_unique()
. This choice
is limiting, not just because of performance, in particular for small objects, but also
because in certain environments dynamic memory is not available.
Therefore, I should demonstrate to you that there’s a lot you can do with respect
to memory. In fact, you are in full control of memory management! Since you are using a
value type, a wrapper, you can deal with memory in any way you see fit. One of the many
options is to completely rely on in-class memory and emit a compile-time error if objects
are too large. Alternatively, you might switch between in-class and dynamic memory, depending
on the size of the stored object. Both of these are made possible by the SBO.
To give you an idea of how SBO works, let’s take a look at a Shape
implementation
that never allocates dynamically but uses only in-class memory:
#include <array>
#include <cstddef>  // for std::byte
#include <memory>

template< size_t Capacity = 32U, size_t Alignment = alignof(void*) >
class Shape
{
 public:
   // ...

 private:
   // ...

   Concept* pimpl()
   {
      return reinterpret_cast<Concept*>( buffer_.data() );
   }

   Concept const* pimpl() const
   {
      return reinterpret_cast<Concept const*>( buffer_.data() );
   }

   alignas(Alignment) std::array<std::byte,Capacity> buffer_;
};
This Shape class does not store std::unique_ptr anymore, but instead owns an array of properly aligned bytes.13 To give users of Shape the flexibility to adjust both the capacity and the alignment of the array, you can provide the two nontype template parameters, Capacity and Alignment, to the Shape class.14 While this improves the flexibility to adjust to different circumstances, the disadvantage of that approach is that this turns the Shape class into a class template. As a consequence, all functions that use this abstraction will likely turn into function templates. This may be undesirable, for instance, because you might have to move code from source files into header files. However, be aware that this is just one of many possibilities. As stated before, you are in full control.
To conveniently work with the std::byte array, we add a pair of pimpl() functions (named based on the fact that this still realizes the Bridge design pattern, just using in-class memory). “Oh no, a reinterpret_cast!” you say. “Isn’t this super dangerous?” You are correct; in general, a reinterpret_cast should be considered potentially dangerous. However, in this particular case, we are backed up by the C++ standard, which explains that what we are doing here is perfectly safe.
As you probably expect by now, we also need to introduce an external inheritance hierarchy
based on the External Polymorphism design pattern. This time we realize this hierarchy
in the private
section of the Shape
class. Not because this is better or more suited
for this Shape
implementation, but for the sole reason to show you another alternative:
template< size_t Capacity = 32U, size_t Alignment = alignof(void*) >
class Shape
{
 public:
   // ...

 private:
   struct Concept
   {
      virtual ~Concept() = default;
      virtual void draw() const = 0;
      virtual void clone( Concept* memory ) const = 0;
      virtual void move( Concept* memory ) = 0;
   };

   template< typename ShapeT, typename DrawStrategy >
   struct OwningModel : public Concept
   {
      OwningModel( ShapeT shape, DrawStrategy drawer )
         : shape_( std::move(shape) )
         , drawer_( std::move(drawer) )
      {}

      void draw() const override
      {
         drawer_( shape_ );
      }

      void clone( Concept* memory ) const override
      {
         std::construct_at( static_cast<OwningModel*>(memory), *this );

         // or:
         // auto* ptr =
         //    const_cast<void*>(static_cast<void const volatile*>(memory));
         // ::new (ptr) OwningModel( *this );
      }

      void move( Concept* memory ) override
      {
         std::construct_at( static_cast<OwningModel*>(memory), std::move(*this) );

         // or:
         // auto* ptr =
         //    const_cast<void*>(static_cast<void const volatile*>(memory));
         // ::new (ptr) OwningModel( std::move(*this) );
      }

      ShapeT shape_;
      DrawStrategy drawer_;
   };

   // ...

   alignas(Alignment) std::array<std::byte,Capacity> buffer_;
};
The first interesting detail in this context is the clone() function. As clone() carries the responsibility of creating a copy, it needs to be adapted to the in-class memory. So instead of creating a new Model via std::make_unique(), it creates a new Model in place via std::construct_at(). Alternatively, you could use a placement new to create the copy at the given memory location.15
“Wow, wait a second! That’s a pretty tough piece of code to swallow. What’s with all these
casts? Are they really necessary?” I admit, these lines are a little challenging. Therefore,
I should explain them in detail. The good old approach to creating an instance in place is via
placement new
. However, using new
always carries the danger of someone (inadvertently
or maliciously) providing a replacement for the class-specific new
operator. To
avoid any kind of problem and reliably construct an object in place, the given address
is first converted to void const volatile*
via a static_cast
and then to void*
via a const_cast
. The resulting address is passed to the global placement new
operator.
Indeed, not the most obvious piece of code. Therefore, it is advisable to use the
C++20 algorithm std::construct_at()
: it provides you with exactly the same
functionality but with a significantly nicer syntax.
However, we need one more function: clone() is concerned only with copy operations. It doesn’t apply to move operations. For that reason, we extend the Concept with a pure virtual move() function and consequently implement it in the OwningModel class template.
“Is this really necessary? We’re using in-class memory, which cannot be moved to another
instance of Shape
. What’s the point of that move()
?” Well, you are correct that we can’t
move the memory itself from one object to another, but we can still move the shape stored
inside. Thus, the move()
function moves an OwningModel
from one buffer to another instead
of copying it.
The clone() and move() functions are used in the copy constructor, the copy assignment operator, the move constructor, and the move assignment operator of Shape:
template< size_t Capacity = 32U, size_t Alignment = alignof(void*) >
class Shape
{
 public:
   // ...

   Shape( Shape const& other )
   {
      other.pimpl()->clone( pimpl() );
   }

   Shape& operator=( Shape const& other )
   {
      // Copy-and-Swap Idiom
      Shape copy( other );
      buffer_.swap( copy.buffer_ );
      return *this;
   }

   Shape( Shape&& other ) noexcept
   {
      other.pimpl()->move( pimpl() );
   }

   Shape& operator=( Shape&& other ) noexcept
   {
      // Copy-and-Swap Idiom
      Shape copy( std::move(other) );
      buffer_.swap( copy.buffer_ );
      return *this;
   }

   ~Shape()
   {
      std::destroy_at( pimpl() );
      // or: pimpl()->~Concept();
   }

 private:
   // ...

   alignas(Alignment) std::array<std::byte,Capacity> buffer_;
};
Definitely noteworthy is the destructor of Shape. Since we manually create an OwningModel within the byte buffer by std::construct_at() or a placement new, we are also responsible for explicitly calling a destructor. The easiest and most elegant way of doing that is to use the C++17 algorithm std::destroy_at(). Alternatively, you can explicitly call the Concept destructor.
The last, but essential, detail of Shape
is the templated constructor:
template< size_t Capacity = 32U, size_t Alignment = alignof(void*) >
class Shape
{
 public:
   template< typename ShapeT, typename DrawStrategy >
   Shape( ShapeT shape, DrawStrategy drawer )
   {
      using Model = OwningModel<ShapeT,DrawStrategy>;

      static_assert( sizeof(Model) <= Capacity, "Given type is too large" );
      static_assert( alignof(Model) <= Alignment, "Given type is misaligned" );

      std::construct_at( static_cast<Model*>(pimpl())
                       , std::move(shape), std::move(drawer) );

      // or:
      // auto* ptr =
      //    const_cast<void*>(static_cast<void const volatile*>(pimpl()));
      // ::new (ptr) Model( std::move(shape), std::move(drawer) );
   }

   // ...

 private:
   // ...
};
After a pair of compile-time checks that the required OwningModel fits into the in-class buffer and adheres to the alignment restrictions, an OwningModel is instantiated into the in-class buffer by std::construct_at().
With this implementation in hand, we now adapt and rerun the performance benchmark from
“Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure”. We run exactly the same
benchmark, this time without allocating dynamic memory inside Shape
and without
fragmenting the memory with many, tiny allocations. As expected, the performance results
are impressive (see Table 8-2).
| Type Erasure implementation | GCC 11.1 | Clang 11.1 |
|---|---|---|
| Object-oriented solution | 1.5205 s | 1.1480 s |
| std::function | 2.1782 s | 1.4884 s |
| Manual implementation of std::function | 1.6354 s | 1.4465 s |
| Classic Strategy | 1.6372 s | 1.4046 s |
| Type Erasure | 1.5298 s | 1.1561 s |
| Type Erasure (SBO) | 1.3591 s | 1.0348 s |
“Wow, this is fast. This is…well, let me do the math…amazing, roughly 20% faster than the fastest Strategy implementation, and even faster than the object-oriented solution.” It is, indeed. Very impressive, right? Still, you should remember that these are the numbers that I got on my system. Your numbers will be different, almost certainly. But even though your numbers might not be the same, the general takeaway is that there is a lot of potential to optimize performance by dealing with memory allocations.
However, while the performance is extraordinary, we’ve lost a lot of flexibility: only OwningModel instantiations that are smaller than or equal to the specified Capacity can be stored inside Shape. Bigger models are excluded. This brings me back to the idea that we could switch between in-class and dynamic memory depending on the size of the given shape: small shapes are stored inside an in-class buffer, while large shapes are allocated dynamically.
You could now go ahead and update the implementation of Shape
to use both kinds of
memory. However, at this point it’s probably a good idea to point out one of our most
important design principles again: separation of concerns. Instead of squeezing all logic
and functionality into the Shape
class, it would be easier and (much) more flexible to
separate the implementation details and implement Shape
with policy-based design
(see “Guideline 19: Use Strategy to Isolate How Things Are Done”):
template< typename StoragePolicy >
class Shape;
The Shape
class template is rewritten to accept a StoragePolicy
. Via this policy, you
would be able to specify from outside how the class should acquire memory. And of course,
you would perfectly adhere to SRP and OCP. One such storage policy could be the
DynamicStorage
policy class:
#include <utility>

struct DynamicStorage
{
   template< typename T, typename... Args >
   T* create( Args&&... args ) const
   {
      return new T( std::forward<Args>( args )... );
   }

   template< typename T >
   void destroy( T* ptr ) const noexcept
   {
      delete ptr;
   }
};
As the name suggests, DynamicStorage would acquire memory dynamically, for instance via new. Alternatively, if you have stronger requirements, you could build on std::aligned_alloc() or similar functionality to provide dynamic memory with a specified alignment. Similarly to DynamicStorage, you could provide an InClassStorage policy:
#include <array>
#include <cstddef>
#include <memory>
#include <utility>

template< size_t Capacity, size_t Alignment >
struct InClassStorage
{
   template< typename T, typename... Args >
   T* create( Args&&... args ) const
   {
      static_assert( sizeof(T) <= Capacity, "The given type is too large" );
      static_assert( alignof(T) <= Alignment, "The given type is misaligned" );

      T* memory = const_cast<T*>(reinterpret_cast<T const*>(buffer_.data()));
      return std::construct_at( memory, std::forward<Args>( args )... );

      // or:
      // void* const memory = static_cast<void*>(buffer_.data());
      // return ::new (memory) T( std::forward<Args>( args )... );
   }

   template< typename T >
   void destroy( T* ptr ) const noexcept
   {
      std::destroy_at( ptr );
      // or: ptr->~T();
   }

   alignas(Alignment) std::array<std::byte,Capacity> buffer_;
};
All of these policy classes provide the same interface: a create() function to instantiate an object of type T and a destroy() function to do whatever is necessary to clean up. This interface is used by the Shape class to trigger construction and destruction, for instance, in its templated constructor16 and in the destructor:
template< typename StoragePolicy >
class Shape
{
 public:
   template< typename ShapeT >
   Shape( ShapeT shape )
   {
      using Model = OwningModel<ShapeT>;
      pimpl_ = policy_.template create<Model>( std::move(shape) );
   }

   ~Shape()
   {
      policy_.destroy( pimpl_ );
   }

   // ... All other member functions, in particular the
   //     special member functions, are not shown

 private:
   // ...
   [[no_unique_address]] StoragePolicy policy_{};
   Concept* pimpl_{};
};
The last detail that should not be left unnoticed is the data members: the Shape class now stores an instance of the given StoragePolicy and, do not be alarmed, a raw pointer to its Concept. Indeed, there is no need to store std::unique_ptr anymore, since we are manually destroying the object in our own destructor again. You might also notice the [[no_unique_address]] attribute on the storage policy. This C++20 feature gives you the opportunity to save the memory for the storage policy. If the policy is empty, the compiler is now allowed to not reserve any memory for the data member. Without this attribute, it would be necessary to reserve at least a single byte for policy_, but likely more bytes due to alignment restrictions.
In summary, SBO is an effective and one of the most interesting optimizations for
a Type Erasure implementation. For that reason, many standard types, such as std::function
and std::any
, use some form of SBO. Unfortunately, the C++ Standard Library
specification doesn’t require the use of SBO. This is why you can only hope that
SBO is used; you can’t count on it. However, because performance is so important and
because SBO plays such a decisive role, there are already proposals out there that
also suggest standardizing the types inplace_function
and inplace_any
. Time will
tell if these find their way into the Standard Library.
“Wow, this will prove useful. Is there anything else I can do to improve the performance of my Type Erasure implementation?” you ask. Oh yes, you can do more. There is a second potential performance optimization. This time we try to improve the performance of the virtual functions. And yes, I’m talking about the virtual functions that are introduced by the external inheritance hierarchy, i.e., by the External Polymorphism design pattern.
“How should we be able to optimize the performance of virtual functions? Isn’t this
something that is completely up to the compiler?” Absolutely, you’re correct. However,
I am not talking about fiddling with backend, compiler-specific implementation details,
but about replacing the virtual functions with something more efficient. And that is
indeed possible. Remember that a virtual function is nothing but a function pointer
that is stored inside a virtual function table. Every type with at least one virtual
function has such a virtual function table. However, there is only one virtual function
table for each type. In other words, this table is not stored inside every instance. So
in order to connect the virtual function table with every instance of that type, the class
stores an additional, hidden data member, which we commonly call the vptr
and which is
a raw pointer to the virtual function table.
When you call a virtual function, you first go through the vptr
to fetch the virtual
function table. Once you’re there, you can grab the corresponding function pointer from the
virtual function table and call it. Therefore, in total, a virtual function call entails
two indirections: the vptr
and the pointer to the actual function. For
that
reason, roughly speaking, a virtual function call is twice as expensive as a regular,
noninline function call.
These two indirections provide us with the opportunity for optimization: we can in fact
reduce the number of indirections to just one. To achieve that, we will employ an
optimization strategy that works fairly often: we’ll trade space for speed. What we will do
is implement the virtual dispatch manually by storing the virtual function pointers
inside the Shape
class. The following code snippet already gives you a pretty good idea
of the details:
//---- <Shape.h> ----------------

#include <cstddef>
#include <memory>

class Shape
{
 public:
   // ...

 private:
   // ...

   template< typename ShapeT
           , typename DrawStrategy >
   struct OwningModel
   {
      OwningModel( ShapeT value, DrawStrategy drawer )
         : shape_( std::move(value) )
         , drawer_( std::move(drawer) )
      {}

      ShapeT shape_;
      DrawStrategy drawer_;
   };

   using DestroyOperation = void(void*);
   using DrawOperation    = void(void*);
   using CloneOperation   = void*(void*);

   std::unique_ptr<void,DestroyOperation*> pimpl_;
   DrawOperation*  draw_ { nullptr };
   CloneOperation* clone_{ nullptr };
};
Since we are replacing all virtual functions, even the virtual destructor, there’s no need for a Concept base class anymore. Consequently, the external hierarchy is reduced to just the OwningModel class template, which still acts as storage for a specific kind of shape (ShapeT) and DrawStrategy. Still, it meets the same fate: all virtual functions are removed. The only remaining details are the constructor and the data members.
The virtual functions are replaced by manual function pointers. Since the syntax for function pointers is not the most pleasant to use, we add a couple of function type aliases for our convenience:17 DestroyOperation represents the former virtual destructor, DrawOperation represents the former virtual draw() function, and CloneOperation represents the former virtual clone() function. DestroyOperation is used to configure the Deleter of the pimpl_ data member (and yes, as such it acts as a Strategy). The latter two, DrawOperation and CloneOperation, are used for the two additional function pointer data members, draw_ and clone_.
“Oh no, void*s! Isn’t that an archaic and super dangerous way of doing things?” you gasp. OK, I admit that without explanation it looks very suspicious. However, stay with me, I promise that everything will be perfectly fine and type safe. The key to making this work now lies in the initialization of these function pointers. They are initialized in the templated constructor of the Shape class:
//---- <Shape.h> ----------------

// ...

class Shape
{
 public:
   template< typename ShapeT
           , typename DrawStrategy >
   Shape( ShapeT shape, DrawStrategy drawer )
      : pimpl_(
            new OwningModel<ShapeT,DrawStrategy>( std::move(shape)
                                                , std::move(drawer) )
          , []( void* shapeBytes ){
               using Model = OwningModel<ShapeT,DrawStrategy>;
               auto* const model = static_cast<Model*>(shapeBytes);
               delete model;
            } )
      , draw_(
            []( void* shapeBytes ){
               using Model = OwningModel<ShapeT,DrawStrategy>;
               auto* const model = static_cast<Model*>(shapeBytes);
               // 'drawer_' is stored by value, so it is called directly
               model->drawer_( model->shape_ );
            } )
      , clone_(
            []( void* shapeBytes ) -> void* {
               using Model = OwningModel<ShapeT,DrawStrategy>;
               auto* const model = static_cast<Model*>(shapeBytes);
               return new Model( *model );
            } )
   {}

   // ...

 private:
   // ...
};
Let’s focus on the pimpl_ data member. It is initialized both by a pointer to the newly instantiated OwningModel and by a stateless lambda expression. You may remember that a stateless lambda is implicitly convertible to a function pointer. This language guarantee is what we use to our advantage: we directly pass the lambda as the deleter to the constructor of unique_ptr, force the compiler to apply the implicit conversion to a DestroyOperation*, and thus bind the lambda function to the std::unique_ptr.
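If you would like to try this conversion in isolation, the following sketch reduces it to the bare minimum. The makeErased() function template, the deletions counter, and destroyString() are hypothetical names that I introduce only for this demonstration; the counter exists solely to make the call of the deleter observable.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

using DestroyOperation = void(void*);

// Counts deleter calls (illustration only).
inline int deletions = 0;

template< typename T >
std::unique_ptr<void,DestroyOperation*> makeErased( T value )
{
   return std::unique_ptr<void,DestroyOperation*>(
      new T( std::move(value) ),
      []( void* bytes ) {
         // 'T' is known at this point, so the cast back is type safe.
         delete static_cast<T*>( bytes );
         ++deletions;
      } );  // The stateless lambda converts to 'DestroyOperation*'
}

int destroyString()
{
   deletions = 0;
   {
      auto erased = makeErased( std::string( "circle" ) );
      assert( erased != nullptr );
   }  // 'erased' goes out of scope and calls the stored function pointer
   return deletions;
}
```

The std::unique_ptr itself has no idea that it holds a std::string. All type knowledge lives in the function that the lambda was converted to.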
“OK, I get the point: the lambda can be used to initialize the function pointer. But how does it work? What does it do?” Well, also remember that we are creating this lambda inside the templated constructor. That means that at this point we are fully aware of the actual type of the passed ShapeT and DrawStrategy. Thus, the lambda is generated with the knowledge of which type of OwningModel is instantiated and stored inside the pimpl_. Eventually it will be called with a void*, i.e., with the address of some OwningModel. However, based on its knowledge about the actual type of OwningModel, it can first of all perform a static_cast from void* to OwningModel<ShapeT,DrawStrategy>*. While in most other contexts this kind of cast would be suspicious and would likely be a wild guess, in this context it is perfectly type safe: we can be certain about the correct type of OwningModel. Therefore, we can use the resulting pointer to trigger the correct cleanup behavior.
The initialization of the draw_
and clone_
data members is very similar
( and
).
The only difference is, of course, the action performed by the lambdas: they perform the
correct actions to draw the shape and to create a copy of the model, respectively.
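To make the mechanics of the lambda-as-deleter trick more tangible, here is a minimal sketch outside of the Shape class. The names Widget, ErasedPtr, makeErased(), and countAfterScope() are hypothetical and exist only for this illustration; the sketch assumes C++17 and nothing beyond the standard library.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type used only for this illustration
struct Widget
{
   Widget()  { ++liveWidgets; }
   ~Widget() { --liveWidgets; }
   static inline int liveWidgets = 0;  // Number of currently alive instances
};

using DestroyOperation = void(void*);
using ErasedPtr = std::unique_ptr<void,DestroyOperation*>;

// Inside the template the concrete type 'T' is known;
// the returned pointer no longer mentions it
template< typename T >
ErasedPtr makeErased()
{
   return ErasedPtr{ new T{}
                   , []( void* bytes ) {
                        // Stateless lambda, implicitly converted to 'DestroyOperation*'.
                        // The cast is safe: 'bytes' always refers to a 'T'.
                        delete static_cast<T*>( bytes );
                     } };
}

int countAfterScope()
{
   {
      ErasedPtr p = makeErased<Widget>();
      assert( Widget::liveWidgets == 1 );
   }  // The stored function pointer runs here and calls '~Widget()'

   return Widget::liveWidgets;
}
```

Since the lambda is created inside the function template, it can cast the void* back to the one type it was generated for, which is the same reasoning that makes the static_cast in the Shape constructor safe.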
I know, this may take some time to digest. But we are almost done; the only missing detail is the special member functions. For the destructor and the two move operations, we can again ask for the compiler-generated default. However, we have to deal with the copy constructor and copy assignment operator ourselves:
//---- <Shape.h> ----------------

// ...

class Shape
{
 public:
   // ...

   Shape( Shape const& other )
      : pimpl_( clone_( other.pimpl_.get() ), other.pimpl_.get_deleter() )
      , draw_ ( other.draw_ )
      , clone_( other.clone_ )
   {}

   Shape& operator=( Shape const& other )
   {
      // Copy-and-Swap Idiom
      using std::swap;
      Shape copy( other );
      swap( pimpl_, copy.pimpl_ );
      swap( draw_, copy.draw_ );
      swap( clone_, copy.clone_ );
      return *this;
   }

   ~Shape() = default;
   Shape( Shape&& ) = default;
   Shape& operator=( Shape&& ) = default;

 private:
   // ...
};
This is all we need to do, and we’re ready to try this out. So let’s put this implementation to the test. Once again we update the benchmark from “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure” and run it with our manual implementation of virtual functions. I have even combined the manual virtual dispatch with the previously discussed SBO. Table 8-3 shows the performance results.
Type Erasure implementation | GCC 11.1 | Clang 11.1 |
---|---|---|
Object-oriented solution | 1.5205 s | 1.1480 s |
 | 2.1782 s | 1.4884 s |
Manual implementation of | 1.6354 s | 1.4465 s |
Classic Strategy | 1.6372 s | 1.4046 s |
Type Erasure | 1.5298 s | 1.1561 s |
Type Erasure (SBO) | 1.3591 s | 1.0348 s |
Type Erasure (manual virtual dispatch) | 1.1476 s | 1.1599 s |
Type Erasure (SBO + manual virtual dispatch) | 1.2538 s | 1.2212 s |
The performance improvement for the manual virtual dispatch is extraordinary for GCC. On my system, I get down to 1.1476 seconds, which is an improvement of 25% in comparison to the basic, unoptimized implementation of Type Erasure. Clang, on the other hand, does not show any improvement in comparison to the basic, unoptimized implementation. Although this may be a little disappointing, the runtime is, of course, still remarkable.
Unfortunately the combination of SBO and manual virtual dispatch does not lead to an even better performance. While GCC shows a small improvement in comparison to the pure SBO approach (which might be interesting for environments without dynamic memory), on Clang this combination does not work as well as you might have hoped for.
In summary, there is a lot of potential for optimizing the performance for Type Erasure implementations. If you’ve been skeptical before about Type Erasure, this gain in performance should give you a strong incentive to investigate for yourself. While this is amazing and without doubt is pretty exciting, it is important to remember where this is coming from: only due to separating the concerns of virtual behavior and encapsulating the behavior into a value type have we gained these optimization opportunities. We wouldn’t have been able to achieve this if all we had was a pointer-to-base.
In “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure” and “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”, I guided you through the thicket of implementation details for a basic Type Erasure implementation. Yes, that was tough, but definitely worth the effort: you have emerged stronger, wiser, and with a new, efficient, and strongly decoupling design pattern in your toolbox. Great!
However, we have to go back into the thicket. I see you are rolling your eyes, but there is more. And I have to admit: I lied. At least a little. Not by telling you something incorrect, but by omission. There is one more disadvantage of Type Erasure that you should know of. A big one. One that you might not like at all. Sigh.
Assume for a second that Shape is a base class again, and Circle one of many deriving classes. Then passing a Circle to a function expecting a Shape const& would be easy and cheap:
#include <cstdlib>

class Shape { /*...*/ };  // Classic base class

class Circle : public Shape { /*...*/ };  // Deriving class

void useShape( Shape const& shape )
{
   shape.draw( /*...*/ );
}

int main()
{
   Circle circle{ 3.14 };

   // Automatic and cheap conversion from 'Circle const&' to 'Shape const&'
   useShape( circle );

   return EXIT_SUCCESS;
}
Although the Type Erasure Shape abstraction is a little different (for instance, it always requires a drawing Strategy), this kind of conversion is still possible:
#include <cstdlib>

class Circle { /*...*/ };  // Nonpolymorphic geometric primitive

class Shape { /*...*/ };  // Type erasure wrapper class as shown before

void useShape( Shape const& shape )
{
   draw( shape );
}

int main()
{
   Circle circle{ 3.14 };

   auto drawStrategy = []( Circle const& c ){ /*...*/ };

   // Creates a temporary 'Shape' object, involving
   // a copy operation and a memory allocation
   useShape( { circle, drawStrategy } );

   return EXIT_SUCCESS;
}
Unfortunately, it is no longer cheap. On the contrary, based on our previous implementations, which include both the basic one and optimized ones, the call to the useShape() function would involve a couple of potentially expensive operations:
To convert a Circle into a Shape, the compiler creates a temporary Shape using the non-explicit, templated Shape constructor.
The call of the constructor results in a copy operation of the given shape (not expensive for Circle, but potentially expensive for other shapes) and the given draw Strategy (essentially free if the Strategy is stateless, but potentially expensive, depending on what is stored inside the object).
Inside the Shape constructor, a new shape model is created, involving a memory allocation (hidden in the call to std::make_unique() in the Shape constructor and definitely expensive).
The temporary (rvalue) Shape is passed by reference-to-const to the useShape() function.
It is important to point out that this is not a specific problem of our Shape implementation. The same problem will hit you if, for instance, you use std::function as a function argument:
#include <cstdlib>
#include <functional>

int compute( int i, int j, std::function<int(int,int)> op )
{
   return op( i, j );
}

int main()
{
   int const i = 17;
   int const j = 10;

   int const sum = compute( i, j, [offset=15]( int x, int y ) {
      return x + y + offset;
   } );

   return EXIT_SUCCESS;
}
In this example, the given lambda is converted into the std::function instance. This conversion will involve a copy operation and might involve a memory allocation. It entirely depends on the size of the given callable and on the implementation of std::function. For that reason, std::function is a different kind of abstraction than, for instance, std::string_view and std::span. std::string_view and std::span are nonowning abstractions that are cheap to copy because they consist of only a pointer to the first element and a size. Because these two types perform a shallow copy, they are perfectly suited as function parameters. std::function, on the other hand, is an owning abstraction that performs a deep copy. Therefore, it is not the perfect type to be used as a function parameter. Unfortunately, the same is true for our Shape implementation.18
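The copy performed by std::function can be made visible with a small experiment. The following sketch is only an illustration under my own assumptions: CountingAdder and copiesForOneCall() are hypothetical names, and the exact number of copies depends on the standard library implementation, which is why the sketch only relies on there being at least one.

```cpp
#include <cassert>
#include <functional>

// Hypothetical callable that counts how often it is copied
struct CountingAdder
{
   CountingAdder() = default;
   CountingAdder( CountingAdder const& ) { ++copies; }
   CountingAdder( CountingAdder&& ) noexcept {}

   int operator()( int x, int y ) const { return x + y; }

   static inline int copies = 0;
};

int compute( int i, int j, std::function<int(int,int)> op )
{
   return op( i, j );
}

int copiesForOneCall()
{
   CountingAdder::copies = 0;
   CountingAdder adder{};

   // 'adder' is an lvalue, so the 'std::function' parameter has to
   // store its own copy of it (owning, deep-copy semantics)
   int const sum = compute( 17, 10, adder );
   assert( sum == 27 );

   return CountingAdder::copies;
}
```

A std::string_view or std::span parameter would only copy a pointer and a size in the same situation, which is the difference between an owning and a nonowning abstraction described above.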
“Oh my, I don’t like this. Not at all. That is terrible! I want my money back!” you exclaim. I have to agree that this may be a severe issue in your codebase. However, you understand that the underlying problem is the owning semantics of the Shape class: on the basis of its value semantics background, our current Shape implementation will always create a copy of the given shape and will always own the copy. While this is perfectly in line with all the benefits discussed in “Guideline 22: Prefer Value Semantics over Reference Semantics”, in this context it results in a pretty unfortunate performance penalty. However, stay calm. There is something we can do: for such a context, we can provide a nonowning Type Erasure implementation.
Generally speaking, the value semantics–based Type Erasure implementation is beautiful and perfectly adheres to the spirit of modern C++. However, performance is important. It might be so important that sometimes you might not care about the value semantics part, but only about the abstraction provided by Type Erasure. In that case, you might want to reach for a nonowning implementation of Type Erasure, despite the disadvantage that this pulls you back into the realm of reference semantics.
The good news is that if you desire only a simple Type Erasure wrapper, a wrapper that represents a reference-to-base, that is nonowning and trivially copyable, then the required code is fairly simple. That is particularly true because you have already seen how to manually implement the virtual dispatch in “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”. With this technique, a simple, nonowning Type Erasure implementation is just a matter of a few lines of code:
//---- <Shape.h> ----------------

#include <memory>

class ShapeConstRef
{
 public:
   template< typename ShapeT
           , typename DrawStrategy >
   ShapeConstRef( ShapeT& shape
                , DrawStrategy& drawer )
      : shape_{ std::addressof(shape) }
      , drawer_{ std::addressof(drawer) }
      , draw_{ []( void const* shapeBytes, void const* drawerBytes ){
           auto const* shape = static_cast<ShapeT const*>(shapeBytes);
           auto const* drawer = static_cast<DrawStrategy const*>(drawerBytes);
           (*drawer)( *shape );
        } }
   {}

 private:
   friend void draw( ShapeConstRef const& shape )
   {
      shape.draw_( shape.shape_, shape.drawer_ );
   }

   using DrawOperation = void(void const*,void const*);

   void const* shape_{ nullptr };
   void const* drawer_{ nullptr };
   DrawOperation* draw_{ nullptr };
};
As the name suggests, the ShapeConstRef class represents a reference to a const shape type. Instead of storing a copy of the given shape, it only holds a pointer to it in the form of a void*. In addition, it holds a void* to the associated DrawStrategy, and as the third data member, a function pointer to the manually implemented virtual draw() function (see “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”).
ShapeConstRef takes its two arguments, the shape and the drawing Strategy, both possibly cv qualified, by reference-to-non-const.19 In this form, it is not possible to pass rvalues to the constructor, which prevents any kind of lifetime issue with temporary values. This unfortunately does not protect you from all possible lifetime issues with lvalues but still provides a very reasonable protection.20 If you want to allow rvalues, you should reconsider. And if you’re really, really willing to risk lifetime issues with temporaries, then you can simply take the argument(s) by reference-to-const. Just remember that you did not get this advice from me!
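The effect of taking the arguments by reference-to-non-const can be checked at compile time. The following sketch uses a stripped-down stand-in, MiniShapeConstRef, together with hypothetical Circle and CircleDrawer types; none of these names are part of the implementation above, and the draw_ function pointer is left out because it does not influence the result.

```cpp
#include <memory>
#include <type_traits>

struct Circle { double radius; };  // Hypothetical shape type

struct CircleDrawer                // Hypothetical drawing Strategy
{
   void operator()( Circle const& ) const {}
};

// Stand-in for 'ShapeConstRef', reduced to the two pointers
class MiniShapeConstRef
{
 public:
   template< typename ShapeT, typename DrawStrategy >
   MiniShapeConstRef( ShapeT& shape, DrawStrategy& drawer )
      : shape_{ std::addressof(shape) }
      , drawer_{ std::addressof(drawer) }
   {}

 private:
   void const* shape_{ nullptr };
   void const* drawer_{ nullptr };
};

// Lvalues are accepted, whether const or not ('ShapeT' is deduced as 'Circle const')
static_assert( std::is_constructible_v<MiniShapeConstRef,Circle&,CircleDrawer&> );
static_assert( std::is_constructible_v<MiniShapeConstRef,Circle const&,CircleDrawer const&> );

// A non-const temporary cannot bind to 'ShapeT&'
constexpr bool rejectsTemporaries()
{
   return !std::is_constructible_v<MiniShapeConstRef,Circle,CircleDrawer&>;
}
static_assert( rejectsTemporaries() );

// Two raw pointers: copying the reference wrapper is as cheap as it gets
static_assert( std::is_trivially_copyable_v<MiniShapeConstRef> );
```

Changing both constructor parameters to reference-to-const would make rejectsTemporaries() return false, which is exactly the trade-off described above.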
This is it. This is the complete nonowning implementation. It is efficient, short, simple, and can be even shorter and simpler if you do not need to store any kind of associated data or Strategy object. With this functionality in place, you are now able to create cheap shape abstractions. This is demonstrated in the following code example by the useShapeConstRef() function. This function enables you to draw any kind of shape (Circles, Squares, etc.) with any possible drawing implementation by simply using a ShapeConstRef as the function argument. In the main() function, we call useShapeConstRef() by a concrete shape and a concrete drawing Strategy (in this case, a lambda):
//---- <Main.cpp> ----------------

#include <Circle.h>
#include <Shape.h>
#include <cstdlib>

void useShapeConstRef( ShapeConstRef shape )
{
   draw( shape );
}

int main()
{
   // Create a circle as one representative of a concrete shape type
   Circle circle{ 3.14 };

   // Create a drawing strategy in the form of a lambda
   auto drawer = []( Circle const& c ){ /*...*/ };

   // Draw the circle directly via the 'ShapeConstRef' abstraction
   useShapeConstRef( { circle, drawer } );

   return EXIT_SUCCESS;
}
This call triggers the desired effect, notably without any memory allocation or expensive copy operation, but only by wrapping polymorphic behavior around a set of pointers to the given shape and drawing Strategy.
Most of the time, this simple nonowning Type Erasure implementation should prove to be enough and fulfill all your needs. Sometimes, however, and only sometimes, it might not be enough. Sometimes, you might be interested in a slightly different form of Shape reference:
#include <Circle.h>
#include <Shape.h>
#include <cstdlib>

int main()
{
   // Create a circle as one representative of a concrete shape type
   Circle circle{ 3.14 };

   // Create a drawing strategy in the form of a lambda
   auto drawer = []( Circle const& c ){ /*...*/ };

   // Combine the shape and the drawing strategy in a 'Shape' abstraction
   Shape shape1( circle, drawer );

   // Draw the shape
   draw( shape1 );

   // Create a reference to the shape
   // Works already, but the shape reference will store a pointer
   // to the 'shape1' instance instead of a pointer to the 'circle'.
   ShapeConstRef shaperef( shape1 );

   // Draw via the shape reference, resulting in the same output
   // This works, but only by means of two indirections!
   draw( shaperef );

   // Create a deep copy of the shape via the shape reference
   // This is _not_ possible with the simple nonowning implementation!
   // With the simple implementation, this creates a copy of the 'shaperef'
   // instance. 'shape2' itself would act as a reference and there would be
   // three indirections... sigh.
   Shape shape2( shaperef );

   // Drawing the copy will again result in the same output
   draw( shape2 );

   return EXIT_SUCCESS;
}
Assuming that you have a type-erased circle called shape1, you might want to convert this Shape instance to a ShapeConstRef. With the current implementation, this works, but the shaperef instance would hold a pointer to the shape1 instance, instead of a pointer to the circle. As a consequence, any use of the shaperef would result in two indirections (one via the ShapeConstRef, and one via the Shape abstraction). Furthermore, you might also be interested in converting a ShapeConstRef instance to a Shape instance. In that case, you might expect that a full copy of the underlying Circle is created and that the resulting Shape abstraction contains and represents this copy. Unfortunately, with the current implementation, the Shape would create a copy of the ShapeConstRef instance, and thus introduce a third indirection. Sigh.
If you need a more efficient interaction between owning and nonowning Type Erasure wrappers, and if you need a real copy when copying a nonowning wrapper into an owning wrapper, then I can offer you a working solution. Unfortunately, it is more involved than the previous implementation(s), but fortunately it isn’t overly complex. The solution builds on the basic Type Erasure implementation from “Guideline 32: Consider Replacing Inheritance Hierarchies with Type Erasure”, which includes the ShapeConcept and OwningShapeModel classes in the detail namespace, and the Shape Type Erasure wrapper. You will see that it just requires a few additions, all of which you have already seen before.
The first addition happens in the ShapeConcept base class:
//---- <Shape.h> ----------------

#include <memory>
#include <utility>

namespace detail {

class ShapeConcept
{
 public:
   // ...
   virtual void clone( ShapeConcept* memory ) const = 0;
};

// ...

} // namespace detail
The ShapeConcept class is extended with a second clone() function. Instead of returning a newly instantiated copy of the corresponding model, this function is passed the address of the memory location where the new model needs to be created.
The second addition is a new model class, the NonOwningShapeModel:
//---- <Shape.h> ----------------

// ...

namespace detail {

// ...

template< typename ShapeT
        , typename DrawStrategy >
class NonOwningShapeModel : public ShapeConcept
{
 public:
   NonOwningShapeModel( ShapeT& shape, DrawStrategy& drawer )
      : shape_{ std::addressof(shape) }
      , drawer_{ std::addressof(drawer) }
   {}

   void draw() const override { (*drawer_)(*shape_); }

   std::unique_ptr<ShapeConcept> clone() const override
   {
      using Model = OwningShapeModel<ShapeT,DrawStrategy>;
      return std::make_unique<Model>( *shape_, *drawer_ );
   }

   void clone( ShapeConcept* memory ) const override
   {
      std::construct_at( static_cast<NonOwningShapeModel*>(memory), *this );

      // or:
      // auto* ptr =
      //    const_cast<void*>(static_cast<void const volatile*>(memory));
      // ::new (ptr) NonOwningShapeModel( *this );
   }

 private:
   ShapeT* shape_{ nullptr };
   DrawStrategy* drawer_{ nullptr };
};

// ...

} // namespace detail
The NonOwningShapeModel is very similar to the OwningShapeModel implementation, but, as the name suggests, it does not store copies of the given shape and strategy. Instead, it stores only pointers. Thus, this class represents the reference semantics version of the OwningShapeModel class. Also, NonOwningShapeModel needs to override the pure virtual functions of the ShapeConcept class: draw() again forwards the drawing request to the given drawing Strategy, while the clone() functions perform a copy. The first clone() function is implemented by creating a new OwningShapeModel and copying both the stored shape and drawing Strategy. The second clone() function is implemented by creating a new NonOwningShapeModel at the specified address by std::construct_at().
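The idea of cloning a model into caller-provided memory can be tried out in isolation. The following is a minimal sketch under several assumptions of mine: Concept, IntRefModel, and cloneIntoBuffer() are hypothetical names, the clone() function takes a void* and returns a pointer to the new object (a deliberate simplification compared to the ShapeConcept* signature above), and it uses the placement new spelling that the listing mentions as an alternative to std::construct_at().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical miniature of the external hierarchy
class Concept
{
 public:
   virtual ~Concept() = default;
   virtual int value() const = 0;

   // Creates a copy of the model at 'memory' and returns a pointer to it
   virtual Concept* clone( void* memory ) const = 0;
};

// Nonowning model: stores only a pointer to the referenced 'int'
class IntRefModel final : public Concept
{
 public:
   explicit IntRefModel( int const& i ) : ptr_{ std::addressof(i) } {}

   int value() const override { return *ptr_; }

   Concept* clone( void* memory ) const override
   {
      return ::new (memory) IntRefModel( *this );  // No allocation, only construction
   }

 private:
   int const* ptr_;
};

int cloneIntoBuffer()
{
   int number = 42;
   IntRefModel const original{ number };

   // In-class style storage: properly sized and aligned raw bytes
   alignas(IntRefModel) std::byte buffer[sizeof(IntRefModel)];
   Concept* const copy = original.clone( buffer );
   assert( copy->value() == 42 );

   number = 7;  // The copy is still a reference, so it observes this change
   int const result = copy->value();

   std::destroy_at( copy );  // Ends the lifetime of the object in the buffer
   return result;
}
```

The copy created in the buffer refers to the same int as the original model, which mirrors how a cloned NonOwningShapeModel keeps referring to the original shape and drawing Strategy.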
In addition, the OwningShapeModel class needs to provide an implementation of the new clone() function:
//---- <Shape.h> ----------------

// ...

namespace detail {

template< typename ShapeT
        , typename DrawStrategy >
class OwningShapeModel : public ShapeConcept
{
 public:
   // ...

   void clone( ShapeConcept* memory ) const override
   {
      using Model = NonOwningShapeModel<ShapeT const,DrawStrategy const>;

      std::construct_at( static_cast<Model*>(memory), shape_, drawer_ );

      // or:
      // auto* ptr =
      //    const_cast<void*>(static_cast<void const volatile*>(memory));
      // ::new (ptr) Model( shape_, drawer_ );
   }
};

// ...

} // namespace detail
The clone() function in OwningShapeModel is implemented similarly to the implementation in the NonOwningShapeModel class by creating a new instance of a NonOwningShapeModel by std::construct_at().
The next addition is the corresponding wrapper class that acts as a wrapper around the external hierarchy ShapeConcept and NonOwningShapeModel. This wrapper should take on the same responsibilities as the Shape class (i.e., the instantiation of the NonOwningShapeModel class template and the encapsulation of all pointer handling) but should merely represent a reference to a const concrete shape, not a copy. This wrapper is again given in the form of the ShapeConstRef class:
//---- <Shape.h> ----------------

#include <array>
#include <cstddef>
#include <memory>

// ...

class ShapeConstRef
{
 public:
   // ...

 private:
   // ...

   // Expected size of a model instantiation:
   //     sizeof(ShapeT*) + sizeof(DrawStrategy*) + sizeof(vptr)
   static constexpr std::size_t MODEL_SIZE = 3U*sizeof(void*);

   alignas(void*) std::array<std::byte,MODEL_SIZE> raw_;
};
As you will see, the ShapeConstRef class is very similar to the Shape class, but there are a few important differences. The first noteworthy detail is the use of a raw_ storage in the form of a properly aligned std::byte array. That indicates that ShapeConstRef does not allocate dynamically, but firmly builds on in-class memory. In this case, however, this is easily possible, because we can predict the size of the required NonOwningShapeModel to be equal to the size of three pointers (assuming that the pointer to the virtual function table, the vptr, has the same size as any other pointer).
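The size prediction can be checked with a few lines of code. The following sketch is an illustration only: Concept, RefModel, and fitsBuffer() are hypothetical stand-ins for the classes discussed here, and the result is implementation dependent. It holds on the common ABIs where the vptr occupies one pointer at the start of the object, but the C++ standard does not guarantee this layout.

```cpp
#include <cstddef>

// Hypothetical stand-in for 'ShapeConcept'
class Concept
{
 public:
   virtual ~Concept() = default;
   virtual void draw() const = 0;
};

// Hypothetical stand-in for 'NonOwningShapeModel': two pointers plus the vptr
template< typename ShapeT, typename DrawStrategy >
class RefModel final : public Concept
{
 public:
   void draw() const override {}

 private:
   ShapeT* shape_{ nullptr };
   DrawStrategy* drawer_{ nullptr };
};

inline constexpr std::size_t MODEL_SIZE = 3U*sizeof(void*);

// True if a model instantiation fits the in-class buffer exactly
template< typename ShapeT, typename DrawStrategy >
constexpr bool fitsBuffer()
{
   using Model = RefModel<ShapeT,DrawStrategy>;
   return sizeof(Model) == MODEL_SIZE && alignof(Model) == alignof(void*);
}

// The pointee types do not matter: a pointer to 'int' is as large as
// a pointer to any other object type on these platforms
static_assert( fitsBuffer<int const,double const>() );
```

Because the model stores only pointers, the check is independent of the size of the referenced shape and Strategy, which is what makes a fixed-size buffer feasible in the first place.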
The private section of ShapeConstRef also contains a couple of member functions:
//---- <Shape.h> ----------------

// ...

class ShapeConstRef
{
 public:
   // ...

 private:
   friend void draw( ShapeConstRef const& shape )
   {
      shape.pimpl()->draw();
   }

   // 'ShapeConcept' lives in the 'detail' namespace
   detail::ShapeConcept* pimpl()
   {
      return reinterpret_cast<detail::ShapeConcept*>( raw_.data() );
   }

   detail::ShapeConcept const* pimpl() const
   {
      return reinterpret_cast<detail::ShapeConcept const*>( raw_.data() );
   }

   // ...
};
We also add a draw() function as a hidden friend and, just as in the SBO implementation in “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”, we add a pair of pimpl() functions. This will enable us to work conveniently with the in-class std::byte array.
The second noteworthy detail is the signature function of every Type Erasure implementation, the templated constructor:
//---- <Shape.h> ----------------

// ...

class ShapeConstRef
{
 public:
   // Type 'ShapeT' and 'DrawStrategy' are possibly cv qualified;
   // lvalue references prevent references to rvalues
   template< typename ShapeT
           , typename DrawStrategy >
   ShapeConstRef( ShapeT& shape
                , DrawStrategy& drawer )
   {
      using Model =
         detail::NonOwningShapeModel<ShapeT const,DrawStrategy const>;
      static_assert( sizeof(Model) == MODEL_SIZE, "Invalid size detected" );
      static_assert( alignof(Model) == alignof(void*), "Misalignment detected" );

      // Pass the constructor parameters; this class has no 'shape_'/'drawer_' members
      std::construct_at( static_cast<Model*>(pimpl()), shape, drawer );

      // or:
      // auto* ptr =
      //    const_cast<void*>(static_cast<void const volatile*>(pimpl()));
      // ::new (ptr) Model( shape, drawer );
   }

   // ...

 private:
   // ...
};
Again, you have the choice to accept the arguments by reference-to-non-const to prevent lifetime issues with temporaries (very much recommended!). Alternatively, you accept the arguments by reference-to-const, which would allow you to pass rvalues but puts you at risk of experiencing lifetime issues with temporaries. Inside the constructor, we again first use a convenient type alias for the required type of model, before checking the actual size and alignment of the model. If it does not adhere to the expected MODEL_SIZE or pointer alignment, we create a compile-time error. Then we construct the new model inside the in-class memory by std::construct_at():
//---- <Shape.h> ----------------

// ...

class ShapeConstRef
{
 public:
   // ...

   ShapeConstRef( Shape& other )
   {
      other.pimpl_->clone( pimpl() );
   }

   ShapeConstRef( Shape const& other )
   {
      other.pimpl_->clone( pimpl() );
   }

   ShapeConstRef( ShapeConstRef const& other )
   {
      other.pimpl()->clone( pimpl() );
   }

   ShapeConstRef& operator=( ShapeConstRef const& other )
   {
      // Copy-and-swap idiom
      ShapeConstRef copy( other );
      raw_.swap( copy.raw_ );
      return *this;
   }

   ~ShapeConstRef()
   {
      std::destroy_at( pimpl() );
      // or: pimpl()->~ShapeConcept();
   }

   // Move operations explicitly not declared

 private:
   // ...
};
In addition to the templated ShapeConstRef constructor, ShapeConstRef offers two constructors to enable a conversion from Shape instances. While these are not strictly required, as we could also create an instance of a NonOwningShapeModel for a Shape, these constructors directly create a NonOwningShapeModel for the corresponding, underlying shape type, and thus shave off one indirection, which contributes to better performance. Note that to make these constructors work, ShapeConstRef needs to become a friend of the Shape class. Don’t worry, though, as this is a good example for friendship: Shape and ShapeConstRef truly belong together, work hand in hand, and are even provided in the same header file.
The last noteworthy detail is the fact that the two move operations are neither explicitly declared nor deleted. Since we have explicitly defined the two copy operations, the compiler neither creates nor deletes the two move operations, thus they are gone. Completely gone in the sense that these two functions never participate in overload resolution. And yes, this is different from explicitly deleting them: if they were deleted, they would participate in overload resolution, and if selected, they would result in a compilation error. But with these two functions gone, when you try to move a ShapeConstRef, the copy operations would be used instead, which are cheap and efficient, since ShapeConstRef only represents a reference. Thus, this class deliberately implements the Rule of 3.
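The difference between move operations that are not declared and move operations that are deleted can be verified with type traits. In the following sketch, CopyOnlyByOmission and MoveDeleted are hypothetical types that exist only for this comparison; the first follows the Rule of 3 like ShapeConstRef, the second deletes its move operations explicitly.

```cpp
#include <type_traits>

// Rule of 3: copy operations and destructor are user-defined,
// the move operations are not declared at all
struct CopyOnlyByOmission
{
   CopyOnlyByOmission() = default;
   CopyOnlyByOmission( CopyOnlyByOmission const& ) {}
   CopyOnlyByOmission& operator=( CopyOnlyByOmission const& ) { return *this; }
   ~CopyOnlyByOmission() {}
};

// Same copy operations, but the move operations are explicitly deleted
struct MoveDeleted
{
   MoveDeleted() = default;
   MoveDeleted( MoveDeleted const& ) = default;
   MoveDeleted& operator=( MoveDeleted const& ) = default;
   MoveDeleted( MoveDeleted&& ) = delete;
   MoveDeleted& operator=( MoveDeleted&& ) = delete;
};

// Without declared move operations, an rvalue binds to the copy operations
constexpr bool movesFallBackToCopies()
{
   return std::is_move_constructible_v<CopyOnlyByOmission> &&
          std::is_move_assignable_v<CopyOnlyByOmission>;
}

// Deleted move operations still take part in overload resolution,
// are selected for rvalues, and make the operation ill-formed
constexpr bool deletedMovesAreSelected()
{
   return !std::is_move_constructible_v<MoveDeleted> &&
          !std::is_move_assignable_v<MoveDeleted>;
}

static_assert( movesFallBackToCopies() );
static_assert( deletedMovesAreSelected() );
```

For a type that only represents a reference, the silent fallback to copying is the desired behavior, since a copy is as cheap as a move could ever be.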
We are almost finished. The last detail is one more addition, one more constructor in the Shape class:
//---- <Shape.h> ----------------

// ...

class Shape
{
 public:
   // ...

   Shape( ShapeConstRef const& other )
      : pimpl_{ other.pimpl()->clone() }
   {}

 private:
   // ...
};
Via this constructor, an instance of Shape creates a deep copy of the shape stored in the passed ShapeConstRef instance. Without this constructor, Shape stores a copy of the ShapeConstRef instance and thus acts as a reference itself.
In summary, both nonowning implementations, the simple and the more complex one, give you all the design advantages of the Type Erasure design pattern but at the same time pull you back into the realm of reference semantics, with all its deficiencies. Hence, utilize the strengths of this nonowning form of Type Erasure, but also be aware of the usual lifetime issues. Consider it on the same level as std::string_view and std::span. All of these serve as very useful tools for function arguments, but do not use them to store anything for a longer period, for instance in the form of a data member. The danger of lifetime-related issues is just too high.
1 Yes, I consider the manual use of std::unique_ptr manual lifetime management. But of course it could be much worse if we would not reach for the power of RAII.
2 The term Type Erasure is heavily overloaded, as it is used in different programming languages and for many different things. Even within the C++ community, you hear the term being used for various purposes: you might have heard it being used to denote void*, pointers-to-base, and std::variant. In the context of software design, I consider this a very unfortunate issue. I will address this issue at the end of this guideline.
3 Sean Parent, “Inheritance Is the Base Class of Evil,” GoingNative 2013, YouTube.
4 Kevlin Henney, “Valued Conversions,” C++ Report, July-August 2000, CiteSeer.
5 For an introduction to std::function, see “Guideline 23: Prefer a Value-Based Implementation of Strategy and Command”.
6 The placement of ShapeConcept and OwningShapeModel in a namespace is purely an implementation detail of this example implementation. Still, as you will see in “Guideline 34: Be Aware of the Setup Costs of Owning Type Erasure Wrappers”, this choice will come in pretty handy. Alternatively, these two classes can be implemented as nested classes. You will see examples of this in “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”.
7 Refer to “Guideline 31: Use External Polymorphism for Nonintrusive Runtime Polymorphism” for the implementation based on std::function.
8 Many thanks to Arthur O’Dwyer for providing this example.
9 Again, please don’t consider these performance numbers the perfect truth. These are the performance results on my machine and my implementation. Your results will differ for sure. However, the takeaway is that Type Erasure performs really well and might perform even better if we take the many optimization options into account (see “Guideline 33: Be Aware of the Optimization Potential of Type Erasure”).
10 Eric Niebler on Twitter, June 19, 2020.
11 For an introduction of std::variant, see “Guideline 17: Consider std::variant for Implementing Visitor”.
12 You should avoid going too deep, though, as you probably remember what happened to the dwarves of Moria who dug too deep…
13 Alternatively, you could use an array of bytes, e.g., std::byte[Capacity] or std::aligned_storage. The advantage of std::array is that it enables you to copy the buffer (if that is applicable!).
14 Note that the choice for the default arguments for Capacity and Alignment are reasonable but still arbitrary. You can, of course, use different defaults that best fit the properties of the expected actual types.
15 You might not have seen a placement new before. If that’s the case, rest assured that this form of new doesn’t perform any memory allocation, but only calls a constructor to create an object at the specified address. The only syntactic difference is that you provide an additional pointer argument to new.
16 As a reminder, since you might not see this syntax often: the template keyword in the constructor is necessary because we are trying to call a function template on a dependent name (a name whose meaning depends on a template parameter). Therefore, you have to make it clear to the compiler that the following is the beginning of a template argument list and not a less-than comparison.
17 Some people consider function pointers to be the best feature of C++. In his lightning talk, “The Very Best Feature of C++”, James McNellis demonstrates their syntactic beauty and enormous flexibility. Please do not take this too seriously, though, but rather as a humorous demonstration of a C++ imperfection.
18 At the time of writing, there is an active proposal for the std::function_ref type, a nonowning version of std::function.
19 The term cv qualified refers to the const and volatile qualifiers.
20 For a reminder about lvalues and rvalues, refer to Nicolai Josuttis’s book on move semantics: C++ Move Semantics - The Complete Guide.