Cippi reasons about the rule of zero, five, or six.
A class is a user-defined type for which the programmer can specify the representation, the operations, and the interface. Class hierarchies are used to organize related structures.
The C++ Core Guidelines have about a hundred rules for user-defined types.
The guidelines start with summary rules before they dive into the special rules for
Concrete types
The eight summary rules provide the background for the special rules.
The summary rules are quite short and don’t go into much detail. They provide broad but valuable insight into classes.
How can the interface of draw be improved?
void draw(int fromX, int fromY, int toX, int toY);
It is not obvious what the ints stand for. Consequently, you may invoke the function with the arguments in the wrong order. Compare the previous function draw with the new one:
void draw(Point from, Point to);
By putting related elements together into a structure, the function signature becomes self-documenting and is, therefore, less error prone than the previous one.
C.2 | Use class if the class has an invariant; use struct if the data members can vary independently |
A class invariant is an invariant used for constraining instances of a class. Member functions have to preserve this invariant. The invariant constrains the possible values for the instances of a class.
This is a common question in C++: When do I have to use a class or a struct? The C++ Core Guidelines give the following recommendation. Use a class if the class has an invariant. A class invariant can be that (y, m, d) together represent a valid calendar date.
struct Pair {    // the members can vary independently
    string name;
    int volume;
};

class Date {
public:
    // validate that {yy, mm, dd} is a valid date and initialize
    Date(int yy, Month mm, char dd);
    // ...
private:
    int y;
    Month m;
    char d;    // day
};
The class invariant is established and checked in the constructor. The data type Pair has no invariant because all values for name and volume are valid. Pair is a simple data holder and needs no explicitly provided constructor.
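To make the idea of an invariant-establishing constructor concrete, here is a minimal sketch. It is not the guidelines' Date: it assumes plain int parameters instead of Month and char, and the helper names isLeapYear, daysInMonth, isValid, isConstructible, and usageExample are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>

class Date {
public:
    // Establishes the invariant: throws if (y, m, d) is not a valid calendar date.
    Date(int y, int m, int d) : y_{y}, m_{m}, d_{d} {
        if (!isValid(y, m, d)) throw std::invalid_argument{"invalid date"};
    }
    int year() const { return y_; }
    int month() const { return m_; }
    int day() const { return d_; }

private:
    static bool isLeapYear(int y) {
        return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
    }
    static int daysInMonth(int y, int m) {
        switch (m) {
            case 2: return isLeapYear(y) ? 29 : 28;
            case 4: case 6: case 9: case 11: return 30;
            default: return 31;
        }
    }
    static bool isValid(int y, int m, int d) {
        return m >= 1 && m <= 12 && d >= 1 && d <= daysInMonth(y, m);
    }

    int y_;
    int m_;
    int d_;
};

// Reports whether a Date can be constructed from the given values.
bool isConstructible(int y, int m, int d) {
    try {
        Date date{y, m, d};
        return date.day() == d;
    } catch (const std::invalid_argument&) {
        return false;
    }
}

void usageExample() {
    assert(isConstructible(2024, 2, 29));     // leap year
    assert(!isConstructible(2023, 2, 29));    // not a leap year
    assert(!isConstructible(2023, 13, 1));    // there is no 13th month
}
```

Because the members are private and every constructor validates its arguments, no member function ever has to deal with an invalid date.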
C.3 | Represent the distinction between an interface and an implementation using a class |
The public member functions of a class are the interface of a class, and the private part is the implementation.
class Date {
public:
    Date();
    // validate that {yy, mm, dd} is a valid date and initialize
    Date(int yy, Month mm, char dd);

    int day() const;
    Month month() const;
    // ...
private:
    // ... some representation
};
From a maintainability perspective, the implementation of the class Date can be changed without affecting the user of the class.
C.4 | Make a function a member only if it needs direct access to the representation of a class |
If a function needs no access to the internals of the class, it should not be a member. Hence, you get loose coupling, and a change of the internals of the class will not affect the helper functions.
class Date {
    // ... relatively small interface ...
};

// helper functions:
Date next_weekday(Date);
bool operator == (Date, Date);
The operators =, (), [], and -> have to be members.
C.5 | Place helper functions in the same namespace as the class they support |
A helper function should be in the namespace of the class because it is part of the interface to the class. In contrast to a member function, a helper function does not need direct access to the representation of the class.
namespace Chrono {    // here we keep time-related services

    class Date { /* ... */ };

    // helper functions:
    bool operator == (Date, Date);
    Date next_weekday(Date);
    // ...
}
...
if (date1 == date2) { ...    // (1)
Thanks to argument-dependent lookup (ADL), the comparison date1 == date2 will additionally look for the equality operator in the Chrono namespace. ADL is particularly important for overloaded operators such as the output operator <<.
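The following sketch shows ADL at work for both the equality operator and the output operator. The simple aggregate Date and the functions describe, sameDay, and usageExample are assumptions made for illustration; they are not part of the guidelines' example.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace Chrono {
    struct Date {
        int year;
        int month;
        int day;
    };

    // Helper functions live in the same namespace as the class they support.
    bool operator==(const Date& a, const Date& b) {
        return a.year == b.year && a.month == b.month && a.day == b.day;
    }

    std::ostream& operator<<(std::ostream& out, const Date& d) {
        return out << d.year << '-' << d.month << '-' << d.day;
    }
}

// Code outside of Chrono: no using-directive and no qualified operator call.
std::string describe(const Chrono::Date& d) {
    std::ostringstream out;
    out << d;        // operator<< is found in Chrono through ADL
    return out.str();
}

bool sameDay(const Chrono::Date& a, const Chrono::Date& b) {
    return a == b;   // operator== is found in Chrono through ADL
}

void usageExample() {
    Chrono::Date first{2011, 8, 12};
    Chrono::Date second{2011, 8, 12};
    assert(sameDay(first, second));
    assert(describe(first) == "2011-8-12");
}
```

If the two operators were declared in a different namespace, the unqualified calls in describe and sameDay would not compile without extra qualification.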
C.7 | Don’t define a class or enum and declare a variable of its type in the same statement |
Defining a class and declaring a variable of its type in the same statement is confusing and should, therefore, be avoided.
// bad
struct Data { /*...*/ } data { /*...*/ };

// good
struct Data { /*...*/ };
Data data{ /*...*/ };
C.8 | Use class rather than struct if any member is non-public |
When your user-defined type has nonpublic members, you probably want to protect their invariants from the outside. It is the job of the constructor to establish the invariants. Accordingly, you should use a class instead of a struct.
C.9 | Minimize exposure of members |
Data hiding and encapsulation are cornerstones of object-oriented class design. You encapsulate the members in the class and allow access only via public member functions. You should think about two interfaces to your class: a public interface for the outside in general and a protected interface for derived classes. The remaining members should be private.
This section has only two rules but introduces the terms concrete and regular type.
A concrete type is “the simplest kind of a class” according to the C++ Core Guidelines. It is often called a value type and is not part of a type hierarchy.
A regular type is a type that “behaves like an int” and has, therefore, to support copy and assignment, equality, and order. To be more formal, a regular type X behaves like an int and supports the following operations:
Default constructor: X()
Copy constructor: X(const X&)
Copy assignment: operator = (const X&)
Move constructor: X(X&&)
Move assignment: operator = (X&&)
Destructor: ~X()
Swap function: swap(X&, X&)
Equality operator: operator == (const X&, const X&)
If you do not have a use case for a class hierarchy, use a concrete type. A concrete type is much easier to implement, smaller, and faster. You do not have to worry about inheritance, virtuality, references, or pointers, including memory allocation and deallocation. There is no virtual dispatch and, therefore, no run-time overhead.
To make a long story short: Apply the KISS principle (keep it simple, stupid). Your type behaves like a value.
Regular types (ints) are easier to understand. They are per se intuitive. This means that if you have a concrete type, think about upgrading it to a regular type.
The built-in types such as int or double are regular, but so are user-defined types such as std::string or containers such as std::vector or std::unordered_map.
C++20 supports the concept std::regular.
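The list of operations above can be checked at compile time. The following sketch approximates the idea behind the C++20 concept with C++17 type traits, so it also works without the concepts library; the names is_equality_comparable, is_regular_v, and NoEquality are hypothetical and not part of the standard library.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Primary template: T has no usable operator ==.
template <typename T, typename = void>
struct is_equality_comparable : std::false_type {};

// Chosen when the expression a == b compiles for two const T objects.
template <typename T>
struct is_equality_comparable<
    T, std::void_t<decltype(std::declval<const T&>() == std::declval<const T&>())>>
    : std::true_type {};

// True if T supports all operations listed for a regular type.
template <typename T>
constexpr bool is_regular_v =
    std::is_default_constructible<T>::value &&
    std::is_copy_constructible<T>::value &&
    std::is_copy_assignable<T>::value &&
    std::is_move_constructible<T>::value &&
    std::is_move_assignable<T>::value &&
    std::is_destructible<T>::value &&
    std::is_swappable<T>::value &&
    is_equality_comparable<T>::value;

struct NoEquality {    // copyable and default constructible, but not comparable
    int value;
};

static_assert(is_regular_v<int>, "int is regular");
static_assert(is_regular_v<std::string>, "std::string is regular");
static_assert(is_regular_v<std::vector<int>>, "std::vector<int> is regular");
static_assert(!is_regular_v<NoEquality>, "without operator == the type is not regular");
```

With a C++20 compiler, a check such as static_assert(std::regular<int>) from the header <concepts> expresses the same intent in one line.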
This section about constructors, assignments, and destructors has by far the most rules about classes and class hierarchies. They control the life cycle of objects: creation, copy, move, and destruction. In short, we call them the big six. Here are the six special member functions:
Default constructor: X()
Copy constructor: X(const X&)
Copy assignment: operator = (const X&)
Move constructor: X(X&&)
Move assignment: operator = (X&&)
Destructor: ~X()
The compiler can generate default implementations for the big six. The section starts with rules regarding default operations; continues with rules about constructors, copy and move operations, and destructors; and ends with rules for the other default operations that do not fall into the previous four categories.
Based on the declaration of the default constructor, you may have the impression that the default constructor takes no arguments. This is wrong. A default constructor can be invoked without an argument, but it may have parameters as long as each of them has a default argument.
By default, the compiler can generate the big six if needed. You can define the six special member functions but can also explicitly ask the compiler to provide them with = default or delete them with = delete.
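A short sketch of both requests follows. The class UniqueHandle is hypothetical; it asks the compiler for the default constructor, the move operations, and the destructor, and it deletes the copy operations. The type traits confirm the result at compile time.

```cpp
#include <type_traits>

// Hypothetical handle type: movable, but deliberately not copyable.
class UniqueHandle {
public:
    UniqueHandle() = default;                                 // compiler-provided
    UniqueHandle(const UniqueHandle&) = delete;               // no copy construction
    UniqueHandle& operator=(const UniqueHandle&) = delete;    // no copy assignment
    UniqueHandle(UniqueHandle&&) = default;                   // compiler-provided
    UniqueHandle& operator=(UniqueHandle&&) = default;        // compiler-provided
    ~UniqueHandle() = default;                                // compiler-provided

    int id() const { return id_; }

private:
    int id_{0};
};

static_assert(std::is_default_constructible<UniqueHandle>::value, "= default works");
static_assert(!std::is_copy_constructible<UniqueHandle>::value, "copy construction is deleted");
static_assert(!std::is_copy_assignable<UniqueHandle>::value, "copy assignment is deleted");
static_assert(std::is_move_constructible<UniqueHandle>::value, "move construction is available");
static_assert(std::is_move_assignable<UniqueHandle>::value, "move assignment is available");
```

All six special member functions are mentioned explicitly, so a reader does not have to remember which ones the compiler would have generated or suppressed.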
This rule is also known as “the rule of zero.” That means that you can avoid writing any custom constructors, copy/move constructors, assignment operators, or destructors by using types that support the appropriate copy/move semantics. This applies to the regular types such as the built-in types bool or double but also the containers of the Standard Template Library (STL) such as std::vector or std::string.
class Named_map {
public:
    // ... no default operations declared ...
private:
    std::string name;
    std::map<int, int> rep;
};

Named_map nm;          // default construct
Named_map nm2 {nm};    // copy construct
The default construction and the copy construction work because they are already defined for std::string and std::map. When the compiler auto-generates the copy constructor for a class, it invokes the copy constructor for all members and all bases of the class.
If you define or =delete any default operation, define or =delete them all |
The big six are closely related. Due to this relationship, you have to define or =delete all six. Consequently, this rule is called “the rule of six.” Sometimes you hear “the rule of five” because the default constructor is special and, therefore, sometimes excluded.
When you don’t follow this rule, you get very unintuitive objects. Here is an unintuitive example from the guidelines.
// doubleFree.cpp

#include <cstddef>

class BigArray {

public:
    BigArray(std::size_t len): len_(len), data_(new int[len]) {}

    ~BigArray(){
        delete[] data_;
    }

private:
    std::size_t len_;
    int* data_;
};

int main(){

    BigArray bigArray1(1000);

    BigArray bigArray2(1000);

    bigArray2 = bigArray1;    // (1)

}                             // (2)
Why does this program have undefined behavior? The default copy-assignment operation bigArray2 = bigArray1 (1) copies all members of bigArray1 to bigArray2. Copying means, in particular, that the pointer data_ is copied but not the data it points to. Afterward, both objects refer to the same array, and the array originally owned by bigArray2 is leaked. When the destructors of bigArray1 and bigArray2 are called (2), the same array is deleted twice, and we get undefined behavior because of the double free.
The unintuitive behavior of the example is that the compiler-generated copy-assignment operator of BigArray makes a shallow copy of BigArray, but the explicitly implemented destructor of BigArray assumes ownership of the data.
AddressSanitizer makes the undefined behavior visible (see Figure 5.2).
Figure 5.2 Double free detected with AddressSanitizer
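One way to repair BigArray is to define the destructor together with both copy and both move operations. The following is a sketch under my own assumptions, not the guidelines' solution: the member functions swap, size, and operator [] as well as the function usageExample are added for illustration, and the elements are zero-initialized to make the checks deterministic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class BigArray {
public:
    explicit BigArray(std::size_t len) : len_(len), data_(new int[len]{}) {}

    ~BigArray() { delete[] data_; }

    // Deep copy: the new object gets its own array.
    BigArray(const BigArray& other)
        : len_(other.len_), data_(new int[other.len_]) {
        std::copy(other.data_, other.data_ + other.len_, data_);
    }

    // Copy first, then swap; *this stays untouched if the copy throws.
    BigArray& operator=(const BigArray& other) {
        BigArray tmp(other);
        swap(tmp);
        return *this;
    }

    // Move: take over the array and leave the source empty but valid.
    BigArray(BigArray&& other) noexcept
        : len_(other.len_), data_(other.data_) {
        other.len_ = 0;
        other.data_ = nullptr;
    }

    BigArray& operator=(BigArray&& other) noexcept {
        BigArray tmp(std::move(other));
        swap(tmp);
        return *this;
    }

    void swap(BigArray& other) noexcept {
        std::swap(len_, other.len_);
        std::swap(data_, other.data_);
    }

    std::size_t size() const { return len_; }
    int& operator[](std::size_t i) { return data_[i]; }
    const int& operator[](std::size_t i) const { return data_[i]; }

private:
    std::size_t len_;
    int* data_;
};

void usageExample() {
    BigArray a(3);
    BigArray b(5);
    b = a;        // deep copy: b owns a new array of length 3
    b[0] = 42;    // must not affect a
    assert(b.size() == 3);
    assert(a[0] == 0);
    assert(b[0] == 42);
}
```

Each object now owns exactly one array, so each destructor deletes a different array. An even simpler fix is to replace the raw pointer with std::vector<int> and return to the rule of zero.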
This rule is related to the previous rule. If you implement the default operations with different semantics, the users of the class may become very confused. This strange behavior may also appear if you partially implement the member functions and partially request them via =default. You cannot assume that the compiler-generated special member functions have the same semantics as yours.
As an example of the odd behavior, here is the class Strange. Strange includes a pointer to int.
 1  // strange.cpp
 2
 3  #include <iostream>
 4
 5  struct Strange {
 6
 7    Strange(): p(new int(2011)) {}
 8
 9    // deep copy
10    Strange(const Strange& a) : p(new int(*a.p)) {}
11
12    // shallow copy
13    // equivalent to Strange& operator = (const Strange&) = default;
14    Strange& operator = (const Strange& a) {
15      p = a.p;
16      return *this;
17    }
18
19    int* p;
20
21  };
22
23  int main() {
24
25    std::cout << '\n';
26
27    std::cout << "Deep copy" << '\n';
28
29    Strange s1;
30    Strange s2(s1);
31
32    std::cout << "s1.p: " << s1.p << "; *s1.p: " << *s1.p << '\n';
33    std::cout << "s2.p: " << s2.p << "; *s2.p: " << *s2.p << '\n';
34
35    std::cout << "*s2.p = 2017" << '\n';
36    *s2.p = 2017;
37
38    std::cout << "s1.p: " << s1.p << "; *s1.p: " << *s1.p << '\n';
39    std::cout << "s2.p: " << s2.p << "; *s2.p: " << *s2.p << '\n';
40
41    std::cout << '\n';
42
43    std::cout << "Shallow copy" << '\n';
44
45    Strange s3;
46    s3 = s1;
47
48    std::cout << "s1.p: " << s1.p << "; *s1.p: " << *s1.p << '\n';
49    std::cout << "s3.p: " << s3.p << "; *s3.p: " << *s3.p << '\n';
50
51
52    std::cout << "*s3.p = 2017" << '\n';
53    *s3.p = 2017;
54
55    std::cout << "s1.p: " << s1.p << "; *s1.p: " << *s1.p << '\n';
56    std::cout << "s3.p: " << s3.p << "; *s3.p: " << *s3.p << '\n';
57
58    std::cout << '\n';
59
60    std::cout << "delete s1.p" << '\n';
61    delete s1.p;
62
63    std::cout << "s2.p: " << s2.p << "; *s2.p: " << *s2.p << '\n';
64    std::cout << "s3.p: " << s3.p << "; *s3.p: " << *s3.p << '\n';
65
66    std::cout << '\n';
67
68  }
The class Strange has a copy constructor (line 10) and a copy-assignment operator (line 14). The copy constructor applies deep copy, and the assignment operator applies shallow copy. By the way, the compiler-generated copy constructor or copy-assignment operator also applies shallow copy. Most of the time, you want deep copy semantics (value semantics) for your types, but you probably never want to have different semantics for these two related operations. The difference is that deep copy semantics creates new, separate storage, p(new int(*a.p)), while shallow copy semantics just copies the pointer, p = a.p. Let’s play with the Strange type. Figure 5.3 shows the output of the program.
Figure 5.3 Output of strange.cpp
Line 30 uses the copy constructor to create s2. Displaying the addresses of the pointers and changing the value that s2.p points to (line 36) shows that s1 and s2 are two distinct objects. This is not the case for s1 and s3. The copy-assignment operation in line 46 performs a shallow copy. The result is that changing the value through s3.p (line 53) also affects the value seen through s1.p because both pointers refer to the same int. In addition, the int that s3.p originally pointed to is no longer reachable and is leaked.
The fun starts if I delete the pointer s1.p (line 61). Thanks to the deep copy, nothing bad happens to s2.p, but s3.p becomes an invalid pointer. To be more precise: Dereferencing an invalid pointer such as in *s3.p (line 64) is undefined behavior.
Thirteen rules deal with the construction of objects. Roughly speaking, they fall into five categories:
Constructor with a single argument
Special constructors such as an inheriting or a delegating constructor
In the end, I have a warning. Don’t call a virtual function from a constructor. I refer to this warning in a broader context, including destructors, in the section Other Default Operations later in this chapter.
I skipped the rule “C.40: Define a constructor if a class has an invariant” because I already wrote about it in the rule “C.2: Use class if the class has an invariant; use struct if the data members can vary independently.” Therefore, two closely related guidelines are left: “C.41: A constructor should create a fully initialized object” and “C.42: If a constructor cannot construct a valid object, throw an exception.”
It is the job of the constructor to create a fully initialized object. A class having an init member function is asking for trouble.
class DiskFile {    // BAD: default constructor not sufficient
    FILE* f;    // call init() before any other function
    // ...
public:
    DiskFile() = default;
    void init();    // initialize f
    void read();    // read from f
    // ...
};

int main() {
    DiskFile file;
    file.read();    // crash or bad read!
    // ...
    file.init();    // too late
    // ...
}
The user might mistakenly invoke read before init or might just forget to invoke init. Making the member function init private and calling it from all constructors is better but not optimal. When you have common actions for all constructors of a class, use a delegating constructor.
If a constructor cannot construct a valid object, throw an exception |
According to the previous rule, throw an exception if you cannot construct a valid object. There is not much to add. If you work with an invalid object, you always have to check the state of the object before its usage. This is extremely tedious, inefficient, and in particular, error prone. Here is an example from the guidelines, violating this rule:
class DiskFile {    // BAD: constructor leaves a nonvalid object behind
    FILE* f;
    bool valid;
    // ...
public:
    explicit DiskFile(const string& name)
        :f{fopen(name.c_str(), "r")}, valid{false} {
        if (f) valid = true;
        // ...
    }

    bool is_valid() const { return valid; }
    void read();    // read from f
    // ...
};

int main() {
    DiskFile file {"Heraclides"};
    file.read();    // crash or bad read!
    // ...
    if (file.is_valid()) {
        file.read();
        // ...
    } else {
        // ... handle error ...
    }
    // ...
}
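For comparison, here is a sketch of a DiskFile that follows the rule: the constructor throws if the file cannot be opened, so the valid flag and is_valid disappear. The deleted copy operations, the helper opens, the function usageExample, and the path used in it are my assumptions for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <string>

class DiskFile {
public:
    // Either the object owns an open file afterward, or no object exists at all.
    explicit DiskFile(const std::string& name)
        : f{std::fopen(name.c_str(), "r")} {
        if (!f) throw std::runtime_error{"could not open " + name};
    }

    ~DiskFile() { std::fclose(f); }    // f is never null thanks to the constructor

    // The class owns the file handle, so copying is disabled in this sketch.
    DiskFile(const DiskFile&) = delete;
    DiskFile& operator=(const DiskFile&) = delete;

    // ... member functions such as read() can rely on f being valid ...

private:
    std::FILE* f;
};

// Reports whether a DiskFile could be constructed for the given name.
bool opens(const std::string& name) {
    try {
        DiskFile file{name};
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}

void usageExample() {
    // Assumption: this path does not exist on the machine running the example.
    assert(!opens("/nonexistent-directory/Heraclides"));
}
```

The error handling moves to the one place where the object is created, and every later use of the object needs no validity check.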
The next two rules answer the question: When does and when doesn’t a class need a default constructor?
Ensure that a copyable (value type) class has a default constructor |
Informally said, a class needs no default constructor when instances of the class have no meaningful default. For example, a human being has no meaningful default, but a type such as a bank account has one. The initial value of a bank account may be zero. Having a default constructor makes it easier to use your type. Many constructors of the STL containers rely on the fact that your type has a default constructor, for example, for the value of an ordered associative container such as std::map. If all the members of the class have a default constructor, the compiler generates one for your class if possible (read the previous section in this chapter, Dependencies between the Special Member Functions, for more details).
Now to the case where a default constructor should not be provided.
Often, code says more than a thousand words.
 1  // classMemberInitializerWidget.cpp
 2
 3  #include <iostream>
 4
 5  class Widget {
 6    public:
 7      Widget(): width(640), height(480),
 8                frame(false), visible(true) {}
 9      explicit Widget(int w): width(w), height(getHeight(w)),
10                              frame(false), visible(true) {}
11      Widget(int w, int h): width(w), height(h),
12                            frame(false), visible(true) {}
13
14      void show() const {
15        std::cout << std::boolalpha << width << "x" << height
16                  << ", frame: " << frame
17                  << ", visible: " << visible << '\n';
18      }
19    private:
20      int getHeight(int w) { return w*3/4; }
21      int width;
22      int height;
23      bool frame;
24      bool visible;
25  };
26
27  class WidgetImpro {
28    public:
29      WidgetImpro() = default;
30      explicit WidgetImpro(int w): width(w), height(getHeight(w)) {}
31      WidgetImpro(int w, int h): width(w), height(h) {}
32
33      void show() const {
34        std::cout << std::boolalpha << width << "x" << height
35                  << ", frame: " << frame
36                  << ", visible: " << visible << '\n';
37      }
38
39    private:
40      int getHeight(int w) { return w * 3 / 4; }
41      int width{640};
42      int height{480};
43      bool frame{false};
44      bool visible{true};
45  };
46
47
48  int main() {
49
50    std::cout << '\n';
51
52    Widget wVGA;
53    Widget wSVGA(800);
54    Widget wHD(1280, 720);
55
56    wVGA.show();
57    wSVGA.show();
58    wHD.show();
59
60    std::cout << '\n';
61
62    WidgetImpro wImproVGA;
63    WidgetImpro wImproSVGA(800);
64    WidgetImpro wImproHD(1280, 720);
65
66    wImproVGA.show();
67    wImproSVGA.show();
68    wImproHD.show();
69
70    std::cout << '\n';
71
72  }
The class Widget uses its three constructors (lines 7–12) exclusively to initialize its members. The refactored class WidgetImpro initializes its members directly in the class body (lines 41–44). See Figure 5.4. By moving the initialization from the constructor to the class body, the three constructors (lines 29–31) become easier to comprehend and the class easier to maintain. For example, when you add a new member to the class, you only have to add the initialization in the class body, not in every constructor. Additionally, there is no need to think about putting the initializers in the constructors in the correct order. Consequently, you cannot have a partially initialized object when you create a new object.
Of course, both objects behave identically.
Figure 5.4 Directly initializing in the class
Here is the approach that I follow when I design a new class: Define the default behavior in the class body. Use explicitly defined constructors only to vary the default behavior.
Did you notice the keyword explicit in the previous constructor taking one argument?
To say it more explicitly: A single-argument constructor without explicit is a converting constructor. A converting constructor takes an argument and makes an object of the class out of it. This behavior is often the cause of big surprises.
The program convertingConstructor.cpp uses user-defined literals.
// convertingConstructor.cpp

#include <iomanip>
#include <iostream>
#include <ostream>

namespace Distance {
    class MyDistance {
    public:
        MyDistance(double d):m(d) {}                            // (5)

        friend MyDistance operator + (const MyDistance& a,      // (2)
                                      const MyDistance& b) {
            return MyDistance(a.m + b.m);
        }
        friend std::ostream& operator << (std::ostream &out,    // (3)
                                          const MyDistance& myDist) {
            out << myDist.m << " m";
            return out;
        }
    private:
        double m;
    };

    namespace Unit{
        MyDistance operator "" _km(long double d) {             // (1)
            return MyDistance(1000*d);
        }
        MyDistance operator "" _m(long double m) {
            return MyDistance(m);
        }
        MyDistance operator "" _dm(long double d) {
            return MyDistance(d/10);
        }
        MyDistance operator "" _cm(long double c) {
            return MyDistance(c/100);
        }
    }
}

using namespace Distance::Unit;

int main() {

    std::cout << std::setprecision(7) << '\n';

    std::cout << "1.0_km + 2.0_dm + 3.0_cm: "
              << 1.0_km + 2.0_dm + 3.0_cm << '\n';
    std::cout << "4.2_km + 5.5_dm + 10.0_m + 0.3_cm: "
              << 4.2_km + 5.5 + 10.0_m + 0.3_cm << '\n';        // (4)

    std::cout << '\n';

}
A call such as 1.0_km goes to the literal operator "" _km(long double d) (1), which creates a MyDistance(1000.0) object that stands for 1000.0 meters. Additionally, MyDistance overloads the + operator (2) and the output operator (3). The main reason for user-defined literals is to define a type-safe arithmetic. Each number has its units attached. See Figure 5.5.
Figure 5.5 Converting constructor
Fine? No! I made an error and wrote 5.5 (4) instead of 5.5_dm. The converting constructor made a MyDistance object out of it. What should be a decimeter ended up being a meter. This implicit conversion from double would not have happened if the constructor (5) had been defined as explicit: explicit MyDistance(double d);.
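The effect of explicit can be verified at compile time. In the following sketch, the two classes ImplicitDistance and ExplicitDistance are hypothetical stand-ins for MyDistance with and without the keyword; the type traits report whether a bare double converts implicitly.

```cpp
#include <type_traits>

class ImplicitDistance {
public:
    ImplicitDistance(double d) : m(d) {}             // converting constructor
    double meters() const { return m; }
private:
    double m;
};

class ExplicitDistance {
public:
    explicit ExplicitDistance(double d) : m(d) {}    // no implicit conversion
    double meters() const { return m; }
private:
    double m;
};

// A bare double silently becomes an ImplicitDistance ...
static_assert(std::is_convertible<double, ImplicitDistance>::value,
              "implicit conversion from double is allowed");

// ... but not an ExplicitDistance.
static_assert(!std::is_convertible<double, ExplicitDistance>::value,
              "implicit conversion from double is rejected");

// Writing ExplicitDistance(5.5) on purpose still works.
static_assert(std::is_constructible<ExplicitDistance, double>::value,
              "explicit construction is still possible");
```

With an explicit constructor, an expression that mixes a distance and a bare double in an addition no longer compiles, so the missing unit is caught by the compiler.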
Three rules deal with the initialization of members. The first rule has the potential for a big surprise.
Define and initialize member variables in the order of member declaration |
The class members are initialized in the order of their declaration. If you initialize them in the member initialization list in a different order, you may get a surprise.
// memberDeclarationOrder.cpp

#include <iostream>

class Foo {
    int m1;
    int m2;
public:
    Foo(int x) :m2{x}, m1{++x} {    // BAD: misleading initializer order
        std::cout << "m1: " << m1 << '\n';
        std::cout << "m2: " << m2 << '\n';
    }
};

int main() {

    std::cout << '\n';
    Foo foo(1);
    std::cout << '\n';

}
Many people assume that first m2 is initialized and then m1. As a consequence, m2 would have the value 1 and m1 the value 2. In fact, m1 is declared first and is, therefore, initialized first: ++x gives m1 the value 2. Afterward, m2 is initialized with the already incremented x, so both members have the value 2.
The class members are destructed exactly in the reverse order of their initialization (see Figure 5.6).
Figure 5.6 Wrong initialization order of member variables
Prefer in-class initializers to member initializers in constructors for constant initializers |
This rule is kind of similar to the previous rule “C.45: Don’t define a default constructor that only initializes data members; use member initializers instead.” In-class initializers make it a lot easier to define the constructors. Additionally, you cannot forget to initialize a member.
class X {    // BAD
    int i;
    std::string s;
    int j;
public:
    X() :i{666}, s{"qqq"} {}        // j is uninitialized
    explicit X(int ii) :i{ii} {}    // s is "" and j is uninitialized
    // ...
};

class X2 {
    int i{0};
    std::string s{"qqq"};
    int j{0};
public:
    X2() = default;                  // all members are initialized to their defaults
    explicit X2(int ii) :i{ii} {}    // s and j initialized to their defaults
    // ...
};
While the in-class initialization establishes the default behavior of an object, the constructor allows the variation of the default behavior.
The most obvious advantages of initialization over assignment are twofold: First, you cannot forget to assign a value and use the member uninitialized; second, initialization may be faster but is never slower than an assignment. The following code snippet from the guidelines shows why.
class Bad {
    std::string s1;
public:
    Bad(const std::string& s2) { s1 = s2; }    // BAD: default constructor followed by assignment
    // ...
};
First, the default constructor of std::string is called, and second, the assignment takes place in the constructor.
In contrast, the constructor in the class Good initializes the std::string directly.
class Good {
    std::string s1;
public:
    Good(const std::string& s2): s1{s2} {}    // Good: initialization
    // ...
};
Since C++11, a constructor can delegate its work to another constructor of the same class and constructors can be inherited from the parent class. Both techniques allow the programmer to write more concise and more expressive code.
Use delegating constructors to represent common actions for all constructors of a class |
A constructor can delegate its work to another constructor of the same class. Delegating is the modern way in C++ to put common actions for all constructors into one constructor. Before C++11, a special initialization function, which was typically called init, had to be used.
#include <cmath>

class Degree {
public:
    explicit Degree(int deg) {    // (1)
        degree = deg % 360;
        if (degree < 0) degree += 360;
    }

    Degree(): Degree(0) {}    // (2)

    explicit Degree(double deg):    // (3)
        Degree(static_cast<int>(std::ceil(deg))) {}

private:
    int degree;
};
The constructors (2) and (3) of the class Degree delegate their initialization work to the constructor (1), which normalizes its argument. Invoking constructors recursively is undefined behavior.
A simplified implementation initializes degree in the class and skips the default constructor.
#include <cmath>

class Degree {
public:
    explicit Degree(int deg) {    // (1)
        degree = deg % 360;
        if (degree < 0) degree += 360;
    }

    explicit Degree(double deg):    // (3)
        Degree(static_cast<int>(std::ceil(deg))) {}

private:
    int degree = 0;
};
Reuse the constructors of the base class in the derived class if you can. This idea of reuse applies when your derived class has no members. If you don’t reuse constructors when you could, you violate the DRY (don’t repeat yourself) principle. The inherited constructors keep all characteristics from their definition in the base class, such as access specifiers or the specifiers explicit or constexpr.
class Rec {
    // ... data and lots of nice constructors ...
};

class Oper : public Rec {
    using Rec::Rec;
    // ... no data members ...
    // ... lots of nice utility functions ...
};

struct Rec2 : public Rec {
    int x;
    using Rec::Rec;
};

Rec2 r {"foo", 7};
int val = r.x;    // uninitialized
There is a danger in using inherited constructors. If your derived class, such as Rec2, has its own members, such as int x, they are not initialized unless they have in-class initializers (see “C.48: Prefer in-class initializers to member initializers in constructors for constant initializers”).
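The following sketch shows the fix with an in-class initializer. The guidelines leave the body of Rec open, so the constructor Rec(std::string, int), its getters, and the function usageExample are assumptions made to get a complete example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical base class with one of its "nice constructors".
class Rec {
public:
    Rec(std::string n, int v) : name{std::move(n)}, value{v} {}
    const std::string& getName() const { return name; }
    int getValue() const { return value; }
private:
    std::string name;
    int value;
};

struct Rec2 : public Rec {
    int x{0};          // in-class initializer: also applied by inherited constructors
    using Rec::Rec;
};

void usageExample() {
    Rec2 r{"foo", 7};  // uses the inherited constructor Rec(std::string, int)
    assert(r.getName() == "foo");
    assert(r.getValue() == 7);
    assert(r.x == 0);  // initialized, reading it is safe
}
```

The inherited constructor initializes the base part, and the in-class initializer takes care of the member that the base class knows nothing about.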
Although the C++ Core Guidelines have eight rules regarding copy and move, they boil down to three classes of rules: copy- and move-assignment operations, the semantics of copy and move, and the infamous slicing.
The two rules “C.60: Make copy assignment non-virtual, take the parameter by const&, and return by non-const&” and “C.63: Make move assignment non-virtual, take the parameter by &&, and return by non-const&” state explicitly the syntax of the copy- and move-assignment operators. std::vector follows the proposed syntax. Here is a simplified version:
// copy assignment
vector& operator = (const vector& other);

// move assignment
vector& operator = (vector&& other);             // until C++17
vector& operator = (vector&& other) noexcept;    // since C++17
The small code snippet shows that the move-assignment operator is noexcept. With C++17, the rule is quite obvious: “C.66: Make move operations noexcept.” Move operations include the move constructor and the move-assignment operator. A function declared noexcept is an optimization opportunity for the compiler. The following code snippet shows the declaration of the move operations for std::vector.
vector(vector&& other) noexcept;                 // since C++17
vector& operator = (vector&& other) noexcept;    // since C++17
Both rules address self-assignment: “C.62: Make copy assignment safe for self-assignment” and “C.65: Make move assignment safe for self-assignment.” Safe for self-assignment means that the operation x = x should not change the value of x.
Copy/move assignment of the containers of the STL, std::string, and built-in types such as int is safe for self-assignment. The automatically generated copy/move assignment operator is safe for self-assignment. The same holds for an automatically generated copy/move assignment operator that uses types that are safe for self-assignment.
The following class Foo does the right job. Self-assignment causes no problem.
class Foo {
    std::string s;
    int i;
public:
    Foo& operator = (const Foo& a) {
        s = a.s;
        i = a.i;
        return *this;
    }
    Foo& operator = (Foo&& a) noexcept {
        s = std::move(a.s);
        i = a.i;
        return *this;
    }
    // ....
};
Any redundant and expensive check for self-assignment is a pessimization in this case.
class Foo {
    std::string s;
    int i;
public:
    Foo& operator = (const Foo& a) {
        if (this == &a) return *this;    // redundant self-assignment check
        s = a.s;
        i = a.i;
        return *this;
    }
    Foo& operator = (Foo&& a) noexcept {
        if (this == &a) return *this;    // redundant self-assignment check
        s = std::move(a.s);
        i = a.i;
        return *this;
    }
    // ....
};
The two guidelines for this section sound obvious: “C.61: A copy operation should copy” and “C.64: A move operation should move and leave its source in a valid state.” What does that mean?
Copy operation
After copying (a = b), a and b must be the same: (a == b).
Copying can be deep or shallow. Deep copying means that both objects a and b are afterward independent of each other (value semantics). Shallow copying means that both objects a and b share an object afterward (reference semantics).
Move operation
The C++ standard requires that the moved-from object must be afterward in an unspecified but valid state. Often, this moved-from state is the default state of its type.
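A small sketch of what valid but unspecified means in practice: for std::unique_ptr, the standard guarantees that the source is null after the move; for std::string, the content of the source is unspecified, so the only safe assumption is that you can assign a new value to it or destroy it. The function usageExample is hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

void usageExample() {
    // std::unique_ptr: the moved-from state is specified, the source is null.
    std::unique_ptr<int> source = std::make_unique<int>(2011);
    std::unique_ptr<int> target = std::move(source);
    assert(target != nullptr);
    assert(*target == 2011);
    assert(source == nullptr);

    // std::string: the moved-from state is valid but unspecified.
    std::string text = "a string that is long enough to be heap allocated";
    std::string stolen = std::move(text);
    assert(stolen == "a string that is long enough to be heap allocated");

    // Do not rely on text being empty; assigning a new value is always fine.
    text = "reassigned";
    assert(text == "reassigned");
}
```

When you write your own move operations, leaving the source in its default state, as std::unique_ptr does, gives users the least surprising behavior.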
This rule sounds innocuous but is often the reason for undefined behavior. First of all: What is a polymorphic class?
A polymorphic class is a class that defines or inherits at least one virtual function.
Copying a polymorphic class may end in slicing. Slicing is one of the darkest parts of C++.
Now, it becomes really dangerous. Slicing kicks in when you copy a polymorphic class.
 1  // sliceVirtuality.cpp
 2
 3  #include <iostream>
 4  #include <string>
 5
 6  struct Base {
 7    virtual std::string getName() const {
 8      return "Base";
 9    }
10  };
11
12  struct Derived : Base {
13    std::string getName() const override {
14      return "Derived";
15    }
16  };
17
18  int main() {
19
20    std::cout << '\n';
21
22    Base b;
23    std::cout << "b.getName(): " << b.getName() << '\n';
24
25    Derived d;
26    std::cout << "d.getName(): " << d.getName() << '\n';
27
28    Base b1 = d;    // slicing
29    std::cout << "b1.getName(): " << b1.getName() << '\n';
30
31    Base& b2 = d;
32    std::cout << "b2.getName(): " << b2.getName() << '\n';
33
34    Base* b3 = new Derived;
35    std::cout << "b3->getName(): " << b3->getName() << '\n';
36
37    std::cout << '\n';
38
39  }
The program has a small hierarchy consisting of the Base and the Derived classes. Each object of this class hierarchy returns its name. The member function getName is virtual (line 7) and class Derived overrides it in line 13. Class Base is a polymorphic class. This means that I can use a derived object via a reference (line 31) or a pointer to a base object (line 34) to get polymorphic behavior. Under the hood, the object is of type Derived.
This behavior does not hold if I copy Derived d to Base b1 (line 28). In this case, slicing kicks in, and I have a Base object under the hood. See Figure 5.7. In the case of copying, the declared or static type is used. If you use an indirection such as a reference or a pointer, the current or dynamic type is used.
Figure 5.7 Slicing
If you want to make a deep copy, prefer a virtual clone function. Read the details about this technique in the rule “C.130: For making deep copies of polymorphic classes prefer a virtual clone function instead of copy construction/assignment.”
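Here is a minimal sketch of the clone technique, applied to a hierarchy like the one in sliceVirtuality.cpp. The return type std::unique_ptr<Base>, the virtual destructor, and the functions nameOfCopy and usageExample are my choices for this illustration, not necessarily those of the guidelines.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string getName() const { return "Base"; }

    // Each class creates a copy of its own dynamic type.
    virtual std::unique_ptr<Base> clone() const {
        return std::make_unique<Base>(*this);
    }
};

struct Derived : Base {
    std::string getName() const override { return "Derived"; }

    std::unique_ptr<Base> clone() const override {
        return std::make_unique<Derived>(*this);
    }
};

// Copies an object through a Base reference without slicing it.
std::string nameOfCopy(const Base& original) {
    std::unique_ptr<Base> copy = original.clone();
    return copy->getName();
}

void usageExample() {
    Derived d;
    Base sliced = d;                           // copy by value: slicing
    assert(sliced.getName() == "Base");
    assert(nameOfCopy(d) == "Derived");        // clone keeps the dynamic type
}
```

The copy is created through virtual dispatch, so the dynamic type decides which class is instantiated, exactly as it does for getName.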
Does my class need a destructor? I often hear this question. Most of the time, the answer is no, and you are fine with the rule of zero. Sometimes the answer is yes, and we are back to the rule of five/six. To be more precise, the C++ Core Guidelines provide seven rules for destructors. They fall into four categories: when destructors are needed, how destructors should handle pointers and references, how base class destructors should be defined, and why destructors should not fail.
The destructor of an object is automatically invoked at the end of its lifetime. To be more precise, the destructor of the object is invoked when the object goes out of scope.
Define a destructor if a class needs an explicit action at object destruction |
The question is whether the compiler-generated destructor is sufficient in your case. If you must execute extra code at the end of the lifetime of your user-defined type, you have to write a destructor. For example, your user-defined type may have to deregister itself from a registry. If you define the destructor, the rule of five/six kicks in.
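Here is a minimal sketch of such a deregistration. The names Registry, Listener, and entriesAfterScopeExit are hypothetical and serve only as an illustration. Because the destructor is user-defined, the sketch also takes a position on the remaining special member functions and simply disables copying and moving.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

class Listener;

// Hypothetical registry; it only observes its entries and does not own them.
class Registry {
public:
    void add(Listener* l) { entries.push_back(l); }
    void remove(Listener* l) {
        entries.erase(std::remove(entries.begin(), entries.end(), l),
                      entries.end());
    }
    std::size_t size() const { return entries.size(); }
private:
    std::vector<Listener*> entries;
};

class Listener {
public:
    explicit Listener(Registry& r) : registry{r} { registry.add(this); }
    ~Listener() { registry.remove(this); }  // explicit action at destruction

    // A user-defined destructor brings the rule of five into play. Copies and
    // moves would need their own registration, so this sketch disables them.
    Listener(const Listener&) = delete;
    Listener& operator=(const Listener&) = delete;
    Listener(Listener&&) = delete;
    Listener& operator=(Listener&&) = delete;
private:
    Registry& registry;
};

// Usage: the entry disappears as soon as the Listener goes out of scope.
std::size_t entriesAfterScopeExit() {
    Registry registry;
    {
        Listener listener{registry};
        assert(registry.size() == 1);
    }  // ~Listener() deregisters here
    return registry.size();
}
```

Deleting the copy and move operations is only one possible design choice; a type that should be copyable would have to register each copy as well.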
To put it the other way around, if no member of your class needs additional cleanup, there is no need to define a destructor such as in the following code snippet from the guidelines:
class Foo {    // bad; use the default destructor
public:
    // ...
    ~Foo() { s = ""; i = 0; vi.clear(); }   // clean up
private:
    std::string s;
    int i;
    std::vector<int> vi;
};
All resources acquired by a class must be released by the class’s destructor |
This rule sounds quite obvious and helps you to prevent resource leaks. Right? But you have to consider which of your class members have a full set of default operations. Now we are once more back to the rule of zero or the rule of five/six.
In the following example, the std::ifstream member has a destructor that releases its file, but the raw pointer file_ has none. If MyClass owns the File and does not release it in its own destructor, we get a resource leak when instances of MyClass go out of scope.
class MyClass {
    std::ifstream fstream;   // may own a file
    File* file_;             // may own a file
    ...
};
If your class has raw pointers or references, you have to answer the crucial question: Who is the owner?
If a class has a raw pointer (T*) or reference (T&), consider whether it might be owning |
If a class has a raw pointer or a reference, you have to be specific about ownership. This matters most in the case of the pointer. If the ownership is obscure, you may delete a pointer to an object that you do not own, or you may fail to delete a pointer that you do own. In the first case, you end up with undefined behavior because of a double delete; in the second case, you end up with a memory leak. The corresponding reasoning holds for references.
The topic of this paragraph is already thoroughly answered in the chapter on the ownership semantics of function parameters. Read the details in the section Parameter Passing: Ownership Semantics in Chapter 4.
If a class has an owning pointer member, define a destructor |
The reason for this rule is straightforward: If a class owns an object, it is responsible for its destruction. The destruction is the job of the destructor.
Admittedly, there is more to write about a class owning a pointer member. You should first answer the following question: Is the class the exclusive owner of the pointer? The answer can be yes or no. Make the class the exclusive owner by putting the pointer into a std::unique_ptr
. Otherwise, make the class the shared owner by putting the pointer into a std::shared_ptr
. Raising the abstraction level from a pointer to a smart pointer makes ownership semantics transparent and way less error prone.
What are the advantages of smart pointers over pointers? First and foremost, a smart pointer automatically manages the lifetime of the object it owns. Second, a std::shared_ptr supports the big six. This means using a std::shared_ptr in a class does not impose any restriction on the class. In contrast, a std::unique_ptr used in the class definition disables the copy semantics.
// classWithUniquePtr.cpp

#include <memory>

struct MyClass {
    std::unique_ptr<int> uniPtr = std::make_unique<int>(2011);
};

int main() {

    MyClass myClass;
    MyClass myClass2(myClass);
    MyClass myClass3;
    myClass3 = myClass;

}
Due to the std::unique_ptr
, objects of type MyClass
cannot be copied. Neither calling the copy constructor (MyClass myClass2(myClass)
) nor calling the copy-assignment operator (myClass3 = myClass
) is valid. See Figure 5.8.
Figure 5.8 A class with a std::unique_ptr
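A std::unique_ptr member disables only the copy operations; the move operations remain available. The following sketch illustrates this and shows one possible way to bring copying back with a user-defined deep copy. The types MoveOnly and DeepCopy and the function ownershipDemo are hypothetical variants of MyClass, not part of the original program.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Move-only: the compiler-generated move operations still work.
struct MoveOnly {
    std::unique_ptr<int> uniPtr = std::make_unique<int>(2011);
};

static_assert(!std::is_copy_constructible_v<MoveOnly>);
static_assert(std::is_move_constructible_v<MoveOnly>);

// Copyable again, thanks to a user-defined deep copy (an assumption of
// this sketch; sharing the int via std::shared_ptr would be another option).
struct DeepCopy {
    std::unique_ptr<int> uniPtr = std::make_unique<int>(2011);

    DeepCopy() = default;
    DeepCopy(const DeepCopy& other) : uniPtr{copyOf(other.uniPtr)} {}
    DeepCopy& operator=(const DeepCopy& other) {
        if (this != &other) uniPtr = copyOf(other.uniPtr);
        return *this;
    }
    DeepCopy(DeepCopy&&) noexcept = default;
    DeepCopy& operator=(DeepCopy&&) noexcept = default;
    ~DeepCopy() = default;

private:
    // Copies the pointee; a moved-from (empty) pointer stays empty.
    static std::unique_ptr<int> copyOf(const std::unique_ptr<int>& p) {
        if (!p) return nullptr;
        return std::make_unique<int>(*p);
    }
};

bool ownershipDemo() {
    MoveOnly a;
    MoveOnly b{std::move(a)};        // ownership is transferred
    assert(a.uniPtr == nullptr && *b.uniPtr == 2011);

    DeepCopy c;
    DeepCopy d{c};                   // deep copy: two independent ints
    *d.uniPtr = 2014;
    return *c.uniPtr == 2011 && *d.uniPtr == 2014;
}
```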
A base class destructor should be either public and virtual, or protected and non-virtual |
This rule is very interesting from the perspective of virtual functions. Let’s divide it into two parts.
Public and virtual destructor
If the base class has a public and virtual destructor, you can destroy instances of a derived class through a base class pointer. The same holds for references.
struct Base {    // no virtual destructor
    virtual void f() {};
};

struct Derived : Base {
    std::string s {"a resource needing cleanup"};
    ~Derived() { /* ... do some cleanup ... */ }
};

...

Base* b = new Derived();
delete b;
The compiler generates for Base
a nonvirtual destructor, but deleting an instance of Derived
through a Base
pointer is undefined behavior if the destructor of Base
is nonvirtual.
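For comparison, here is a minimal sketch of the corrected hierarchy with a public and virtual destructor. The bool reference destroyed and the function derivedDestructorRuns are additions for illustration only; they make the destructor call observable.

```cpp
#include <cassert>
#include <string>

struct Base {
    virtual ~Base() = default;   // public and virtual
    virtual void f() {}
};

struct Derived : Base {
    explicit Derived(bool& flag) : destroyed{flag} {}
    ~Derived() override { destroyed = true; }   // stands in for real cleanup

    std::string s{"a resource needing cleanup"};
    bool& destroyed;
};

bool derivedDestructorRuns() {
    bool destroyed = false;
    Base* b = new Derived{destroyed};
    assert(!destroyed);
    delete b;   // well defined: runs ~Derived(), then ~Base()
    return destroyed;
}
```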
Protected and nonvirtual destructor
This is quite easy to get. If the destructor of the base class is protected, you cannot destroy derived objects using a base class pointer or reference; therefore, the destructor need not be virtual.
Here are a few concluding remarks about the access specifiers for destructors of a Base
class.
If the destructor of a class Base
is private, you cannot derive from it.
If the destructor of a class Base is protected, you cannot create instances of Base directly; you can only derive a class Derived from Base and use Derived.
struct Base {
protected:
    ~Base() = default;
};

struct Derived: Base {};

int main() {
    Base b;      // Error: Base::~Base is protected within this context
    Derived d;
}
The declaration Base b;
causes an error because the destructor of Base
is inaccessible.
Two rules address the issue of failing destructors: “C.36: A destructor may not fail” and “C.37: Make destructors noexcept
.”
The remaining rules related to constructors, assignments, and destructors have a broad focus. They cover when you should use =default
and =delete
explicitly and why you should not call virtual functions from constructors and destructors. The remaining rules make the story of regular types complete. The swap
function (swap(X&, X&)
) is the first rule, followed by the equality operator (operator == (const X&)
).
=default
and =delete
This section provides guidance about when to use =default
and =delete
explicitly.
Use =default if you have to be explicit about using the default semantics |
Do you remember the rule of five? It means that if you define one of the five special member functions, you have to define them all. The five special member functions are all the special member functions excluding the default constructor.
When I define the destructor such as in the following example, I have to define the copy and move constructor and the copy- and move-assignment operators. Requesting the remaining four by =default
is the easiest way.
class Tracer {
    std::string message;
public:
    explicit Tracer(const std::string& m) : message{m} {
        std::cerr << "entering " << message << '\n';
    }
    ~Tracer() { std::cerr << "exiting " << message << '\n'; }

    Tracer(const Tracer&) = default;
    Tracer& operator = (const Tracer&) = default;
    Tracer(Tracer&&) = default;
    Tracer& operator = (Tracer&&) = default;
};
This was easy! Right? Providing your own implementation is boring and also very prone to mistakes. For example, the user-defined move constructor and move-assignment operator in the following example are not declared noexcept
.
class Tracer {
    std::string message;
public:
    explicit Tracer(const std::string& m) : message{m} {
        std::cerr << "entering " << message << '\n';
    }
    ~Tracer() { std::cerr << "exiting " << message << '\n'; }

    Tracer(const Tracer& a) : message{a.message} {}
    Tracer& operator = (const Tracer& a) {
        message = a.message;
        return *this;
    }
    Tracer(Tracer&& a) :message{a.message} {}
    Tracer& operator = (Tracer&& a) {
        message = a.message;
        return *this;
    }
};
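The difference between the two variants can be checked at compile time with the type traits library. The following sketch uses two stripped-down, hypothetical stand-ins for Tracer (Defaulted and Handwritten) without the output statements. It also exposes a second mistake in the handwritten version: Without std::move, the so-called move constructor copies the string.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Move operations requested with =default: noexcept follows from the members.
struct Defaulted {
    std::string message;
    Defaulted() = default;
    ~Defaulted() {}
    Defaulted(const Defaulted&) = default;
    Defaulted& operator=(const Defaulted&) = default;
    Defaulted(Defaulted&&) = default;
    Defaulted& operator=(Defaulted&&) = default;
};

// Handwritten move operations that forget both noexcept and std::move.
struct Handwritten {
    std::string message;
    Handwritten() = default;
    ~Handwritten() {}
    Handwritten(const Handwritten&) = default;
    Handwritten& operator=(const Handwritten&) = default;
    Handwritten(Handwritten&& a) : message{a.message} {}
    Handwritten& operator=(Handwritten&& a) {
        message = a.message;
        return *this;
    }
};

static_assert(std::is_nothrow_move_constructible_v<Defaulted>);
static_assert(std::is_nothrow_move_assignable_v<Defaulted>);
static_assert(!std::is_nothrow_move_constructible_v<Handwritten>);
static_assert(!std::is_nothrow_move_assignable_v<Handwritten>);

bool handwrittenMoveCopies() {
    Handwritten a;
    a.message = "a message long enough to live on the heap";
    Handwritten b{std::move(a)};     // calls the handwritten "move" constructor
    assert(!b.message.empty());
    return a.message == b.message;   // the source was copied, not moved from
}
```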
Use =delete when you want to disable default behavior (without wanting an alternative) |
Sometimes, you want to disable the default operations. This is where =delete comes into play. C++ eats its own dog food. The copy constructor of almost all types from the threading API is set to delete. This holds true for data types such as mutexes, locks, or futures.
You can use delete
to create strange types. Instances of Immortal
cannot be destructed.
// immortal.cpp

class Immortal {
public:
    ~Immortal() = delete;   // do not allow destruction
};

int main() {

    Immortal im;                     // (1)
    Immortal* pIm = new Immortal;
    delete pIm;                      // (2)

}
An implicit call of the destructor (1) or an explicit call of the destructor (2) causes a compile-time error. See Figure 5.9.
Figure 5.9 delete
the destructor
Don’t call virtual functions in constructors and destructors |
Calling a pure virtual function from a constructor or a destructor is undefined behavior. Calling a virtual function from a constructor or a destructor does not work the way you may expect. For protection reasons, the virtual call mechanism is disabled in the constructor or destructor, and you get a nonvirtual call.
Hence, the Base
version of the virtual function f
will be called in the following example.
// virtualCall.cpp

#include <iostream>

struct Base {
    Base() { f(); }
    virtual void f() { std::cout << "Base called" << '\n'; }
};

struct Derived: Base {
    void f() override { std::cout << "Derived called" << '\n'; }
};

int main() {

    std::cout << '\n';

    Derived d;

    std::cout << '\n';

}
Figure 5.10 shows the surprising behavior.
Figure 5.10 Calling a virtual function in the constructor
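If you need virtual behavior during initialization, one common way out is to construct the object first and make the virtual call afterward, for example inside a factory function. The following sketch assumes a hypothetical function template create and two data members that only record which version of f was called.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Base {
    Base() { fromConstructor = f(); }   // resolves to Base::f, even inside a Derived
    virtual ~Base() = default;
    virtual std::string f() const { return "Base"; }

    std::string fromConstructor;        // which f() the constructor reached
    std::string fromFactory;            // which f() the factory reached
};

struct Derived : Base {
    std::string f() const override { return "Derived"; }
};

// Hypothetical factory: construct first, then make the virtual call.
template <typename T>
std::unique_ptr<T> create() {
    auto object = std::make_unique<T>();
    object->fromFactory = object->f();  // the object is complete: dynamic dispatch
    return object;
}

std::string dispatchDemo() {
    std::unique_ptr<Base> b = create<Derived>();
    assert(b->fromConstructor == "Base");
    return b->fromFactory;
}
```

The call in the constructor remains in the sketch only to show the contrast; in production code, it would be removed.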
swap function
For a type to be a regular type, it has to support a swap function. A more informal term for a regular type is a value-like type, and this is the wording the first rule uses: “C.83: For value-like types, consider providing a noexcept swap function.” According to the next two rules, a swap should not fail (“C.84: A swap may not fail”) and should, therefore, be declared as noexcept: “C.85: Make swap noexcept.”
The data type Foo
from the C++ Core Guidelines has a swap
function.
class Foo {
public:
    void swap(Foo& rhs) noexcept {
        m1.swap(rhs.m1);
        std::swap(m2, rhs.m2);
    }
private:
    Bar m1;
    int m2;
};
For convenience reasons, you should consider supporting a nonmember swap
function based on the already implemented swap
member function.
void swap(Foo& a, Foo& b) noexcept { a.swap(b); }
If you do not provide a nonmember swap
function, then the standard library algorithms that require swapping (such as std::sort
and std::rotate
) will fall back to the std::swap
template, which is defined in terms of move construction and move assignment.
template<typename T>
void std::swap(T& a, T& b) noexcept {
    T tmp(std::move(a));
    a = std::move(b);
    b = std::move(tmp);
}
The C++ standard offers more than 40 overloads of std::swap
. You can use the swap
function as a building block for many idioms such as copy construction or move assignment. A swap
function should not fail; therefore, you should declare it as noexcept
.
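One idiom that builds on a non-failing swap is copy-and-swap for assignment. The following is only a sketch; the class Playlist and the function assignmentDemo are hypothetical. The assignment operator takes its argument by value, so everything that may fail happens while the parameter is constructed, before the object itself is modified.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Playlist {   // hypothetical value-like type
public:
    Playlist() = default;
    explicit Playlist(std::string n) : name{std::move(n)} {}
    Playlist(const Playlist&) = default;
    Playlist(Playlist&&) noexcept = default;

    void add(std::string title) { titles.push_back(std::move(title)); }
    std::size_t size() const { return titles.size(); }
    const std::string& getName() const { return name; }

    void swap(Playlist& rhs) noexcept {
        name.swap(rhs.name);
        titles.swap(rhs.titles);
    }

    // Copy-and-swap: rhs is copy- or move-constructed by the caller, which
    // may throw before *this is touched; the swap itself cannot fail.
    Playlist& operator=(Playlist rhs) noexcept {
        swap(rhs);
        return *this;
    }

private:
    std::string name;
    std::vector<std::string> titles;
};

void swap(Playlist& a, Playlist& b) noexcept { a.swap(b); }

bool assignmentDemo() {
    Playlist a{"a"};
    a.add("one");
    Playlist b{"b"};
    b = a;                    // copy assignment via copy-and-swap
    assert(a.size() == 1);    // the source is untouched
    Playlist c;
    c = std::move(b);         // move assignment via the same operator
    return c.getName() == "a" && c.size() == 1;
}
```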
When a swap function is based on copy semantics instead of move semantics, it may fail because of memory exhaustion. The following implementation contradicts the already mentioned rule “C.84: A swap may not fail.” This is the C++98 implementation of std::swap.
template<typename T>
void std::swap(T& a, T& b) {
    T tmp = a;
    a = b;
    b = tmp;
}
In this case, memory exhaustion causes a std::bad_alloc
exception.
To be regular, a data type also has to support the equality operator.
Make == symmetric with respect to operand types and noexcept |
If you don’t want to surprise your user, you should make the equality operator symmetric.
The following code snippet shows an unintuitive equality operator that is defined inside the class.
class MyInt {    // BAD: unsymmetric ==
    int num;
public:
    MyInt(int n): num(n) {};
    bool operator == (const MyInt& rhs) const noexcept {
        return num == rhs.num;
    }
};

int main() {
    MyInt(5) == 5;    // OK
    5 == MyInt(5);    // ERROR
}
The call MyInt(5) == 5
is valid because the constructor converts the int
to an instance of MyInt
. The last line (5 == MyInt(5)
) gives an error. An object of type int
cannot be compared with a MyInt
object, and there is no conversion from MyInt
to int
possible.
The elegant way to solve this asymmetry is to declare a friend operator ==
inside the class MyInt
. Here is the improved version of MyInt
.
class MyInt {
    int num;
public:
    MyInt(int n): num(n) {};
    friend bool operator == (const MyInt& lhs, const MyInt& rhs) noexcept {
        return lhs.num == rhs.num;
    }
};

int main() {
    MyInt(5) == 5;    // OK
    5 == MyInt(5);    // OK
}
If you carefully read this book, you may recall that a constructor taking one argument should be explicit (“C.46: By default, declare single-argument constructors explicit
”). Honestly, you are right.
class MyInt {
    int num;
public:
    explicit MyInt(int n): num(n) {};
    friend bool operator == (const MyInt& lhs, const MyInt& rhs) noexcept {
        return lhs.num == rhs.num;
    }
};

int main() {
    MyInt(5) == 5;    // ERROR
    5 == MyInt(5);    // ERROR
}
Making the constructor explicit
breaks the implicit conversion from int
to MyInt
. Providing two additional overloads solves the issue. One overload takes an int
as the left and the other an int
as the right argument.
// equalityOperator.cpp

class MyInt {
    int num;
public:
    explicit MyInt(int n): num(n) {};
    friend bool operator == (const MyInt& lhs, const MyInt& rhs) noexcept {
        return lhs.num == rhs.num;
    }
    friend bool operator == (int lhs, const MyInt& rhs) noexcept {
        return lhs == rhs.num;
    }
    friend bool operator == (const MyInt& lhs, int rhs) noexcept {
        return lhs.num == rhs;
    }
};

int main() {
    MyInt(5) == 5;    // OK
    5 == MyInt(5);    // OK
}
The surprises continue with the equality operator.
Writing a foolproof equality operator for a hierarchy is hard. The guidelines give a nice example of the complications involved. Here is the hierarchy.
// equalityOperatorHierarchy.cpp

#include <string>

struct Base {
    std::string name;
    int number;
    virtual bool operator == (const Base& a) const {
        return name == a.name && number == a.number;
    }
};

struct Derived: Base {
    char character;
    virtual bool operator == (const Derived& a) const {
        return name == a.name &&
               number == a.number &&
               character == a.character;
    }
};

int main() {

    Base b;
    Base& base = b;
    Derived d;
    Derived& derived = d;

    base == derived;       // compares name and number, but      (1)
                           // ignores derived's character
    derived == base;       // error: no == defined               (2)

    Derived derived2;
    derived == derived2;   // compares name, number, and character
    Base& base2 = derived2;
    base2 == derived;      // compares name and number, but      (3)
                           // ignores derived2's and derived's character

}
Comparing instances of Base or instances of Derived works. But mixing instances of Base and Derived does not work as expected. Using Base’s == operator ignores Derived’s character (1). Using Derived’s operator does not work for instances of Base (2); this line causes a compilation error. The last line (3) is quite tricky. The equality operator of Base is used. Why? Did the == operator of Derived not override the == operator of Base? No! Both operators have different signatures. One operator takes an instance of Base; the other operator takes an instance of Derived. Derived’s version does not override Base’s version.
These observations also hold for the other five comparison operators: !=, <, <=, >, and >=. This misbehavior is another facet of the slicing issue: “C.67: A polymorphic class should suppress copying.”
The C++ Core Guidelines have about thirty rules in total addressing class hierarchies.
But first, what is a class hierarchy? The C++ Core Guidelines give a clear answer. Let me rephrase it. A class hierarchy represents a set of hierarchically organized concepts. Base classes typically act as interfaces. There are two uses for interfaces. One is often named interface inheritance and the other implementation inheritance.
Interface inheritance uses public inheritance. It separates users from implementations to allow derived classes to add or change functionality of the base class without affecting the users of base classes.
For example, if you publicly derive a class Handball from a class Ball, you can use a Handball wherever a Ball is expected. A Handball is also a Ball. This principle is called the Liskov substitution principle.
Implementation inheritance often uses private inheritance. Typically, the derived class provides its functionality by adapting functionality from base classes.
A prominent example of implementation inheritance is the adapter pattern if you implement it with multiple inheritance. The idea of the adapter pattern is to adapt an existing interface to a new one. The adapter uses private
inheritance from the implementation and public
inheritance from the new interface. The new interface uses the existing implementation to provide its services to the user.
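As a minimal sketch of this class adapter, assume an existing class LegacyFormatter and a new interface Formatter. Both names, as well as FormatterAdapter and adapterDemo, are hypothetical.

```cpp
#include <cassert>
#include <string>

// Existing implementation with an interface the client does not expect.
class LegacyFormatter {
public:
    std::string wrapInBrackets(const std::string& text) const {
        return "[" + text + "]";
    }
};

// New interface the client programs against.
class Formatter {
public:
    virtual ~Formatter() = default;
    virtual std::string format(const std::string& text) const = 0;
};

// Adapter: public inheritance from the new interface,
// private inheritance from the existing implementation.
class FormatterAdapter : public Formatter, private LegacyFormatter {
public:
    std::string format(const std::string& text) const override {
        return wrapInBrackets(text);
    }
};

std::string adapterDemo() {
    FormatterAdapter adapter;
    const Formatter& formatter = adapter;   // clients see only the new interface
    assert(formatter.format("") == "[]");
    return formatter.format("text");
}
```

Due to the private inheritance, wrapInBrackets is not part of the public interface of FormatterAdapter.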
The first three rules for class hierarchies have a general focus. They provide a kind of summary for the more detailed rules for the designing of classes and the accessing of objects in class hierarchies.
The first rules describe when to use class hierarchies and introduce the idea of abstract classes.
Use class hierarchies to represent concepts with inherent hierarchical structure (only) |
This rule makes a software system intuitive and easy to comprehend. If you model something in the code that has an inherently hierarchical structure, you should use a hierarchy. Often, the easiest way to reason about code is if you have a natural match between the code and the world.
For example, your job as a software architect is to model a complex system such as a defibrillator. This system consists of many subsystems. One of these subsystems is the user interface. The requirement for the defibrillator is that different input devices such as a keyboard, a touch screen, or a few buttons could be used as a user interface. A system consisting of various subsystems such as a user interface is inherently hierarchical and should, therefore, be modeled hierarchically. The great benefit is that the complex system is now easy to explain in a top-down fashion because there is a natural match between the real hardware and the software.
Of course, the classic example of using a hierarchy is in the design of a graphical user interface (GUI). This is the example the C++ Core Guidelines use.
class DrawableUIElement {
public:
    virtual void render() const = 0;
    // ...
};

class AbstractButton : public DrawableUIElement {
public:
    virtual void onClick() = 0;
    // ...
};

class PushButton : public AbstractButton {
    void render() const override;
    void onClick() override;
    // ...
};

class Checkbox : public AbstractButton {
    // ...
};
If something is not inherently hierarchical, you should not model it in a hierarchical way. Have a look here.
template<typename T>
class Container {
public:
    // list operations:
    virtual T& get() = 0;
    virtual void put(T&) = 0;
    virtual void insert(Position) = 0;
    // ...
    // vector operations:
    virtual T& operator [] (int) = 0;
    virtual void sort() = 0;
    // ...
    // tree operations:
    virtual void balance() = 0;
    // ...
};
Why is the example terrible? Read the comments! The class template Container consists of pure virtual functions for modeling a list, a vector, and a tree. That means if you use Container as an interface, you have to implement three disjoint concepts.
If a base class is used as an interface, make it an abstract class |
An abstract class is a class that has at least one pure virtual function. A pure virtual function (virtual void function() = 0
) is a function that must be implemented by a derived class if that class should not be abstract. An abstract class cannot be instantiated.
I want to add for completeness: An abstract class can provide an implementation for a pure virtual function. A derived class can, therefore, use this implementation.
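The following sketch shows this less-known feature. The classes Logger and FileLogger and the function prefixDemo are hypothetical. The definition of the pure virtual function has to be provided outside the class body, and a derived class has to call it with its qualified name.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct Logger {
    virtual ~Logger() = default;
    virtual std::string prefix() const = 0;   // pure virtual: Logger stays abstract
};

// A pure virtual function may still have a definition, given out of class.
std::string Logger::prefix() const { return "[log] "; }

struct FileLogger : Logger {
    std::string prefix() const override {
        return Logger::prefix() + "file: ";   // explicit, qualified call
    }
};

static_assert(std::is_abstract_v<Logger>);
static_assert(!std::is_abstract_v<FileLogger>);

std::string prefixDemo() {
    FileLogger logger;
    const Logger& base = logger;
    assert(!base.prefix().empty());
    return base.prefix();
}
```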
Interfaces should usually consist of public
pure virtual functions, don’t have data members, and have a default/empty virtual destructor (virtual ~My_interface() = default
).
Abstract classes are about the separation of interface and implementation. If the client, such as in this case an application, depends only on the interface Device
, it can use different implementations during run time. Additionally, a modification in the implementation does not necessarily affect the interface and, therefore, the application.
struct Device {
    virtual void write(std::span<const char> outbuf) = 0;
    virtual void read(std::span<char> inbuf) = 0;
};

class Mouse : public Device {
    // ... data ...
    void write(std::span<const char> outbuf) override;
    void read(std::span<char> inbuf) override;
};

class TouchScreen : public Device {
    // ... different data ...
    void write(std::span<const char> outbuf) override;
    void read(std::span<char> inbuf) override;
};
The 12 rules for designing classes target the following topics: constructors for abstract classes, virtuality, access specifiers for data members, multiple inheritance, and typical traps.
Let me combine the already presented rules “C.2: Use class
if the class has an invariant; use struct
if the data members can vary independently” and “C.41: A constructor should create a fully initialized object” to get the actual rule. An invariant is a condition on a class data member that has to be established by the constructor. Conversely, an abstract base class does not have any data and needs, therefore, no declared constructor.
There are a few rules for virtual functions that you should keep in mind when designing class hierarchies.
Virtual functions should specify exactly one of virtual, override, or final |
Since C++11, we have had three keywords to control overriding.
virtual:
declares a virtual function that can be overridden in derived classes
override:
verifies that the function is virtual and overrides a virtual function of a base class
final:
verifies that the function is virtual and cannot be overridden by a member function of a derived class
According to the guidelines, the rules for the usage of the three keywords are straightforward: “Use virtual only when declaring a new virtual function. Use override only when declaring an overrider. Use final only when declaring a final overrider.”
struct Base{
    virtual void testGood() {}
    virtual void testBad() {}
};

struct Derived: Base{
    void testGood() final {}
    virtual void testBad() final override {}
};

int main() {
    Derived d;
}
The member function testBad()
in the class Derived
provides much redundant information.
You should use final
or override
only if the function is virtual
. Skip virtual
: void testBad() final override {}
.
Using the keyword final
without the virtual
keyword is valid only if the function is already virtual
; therefore, the function must override a virtual function of a base class. Skip override
: void testBad() final {}
.
This rule is a continuation of rule “C.67: A polymorphic class should suppress copying.” Rule C.67 explicitly shows that copying a polymorphic class may lead to the slicing problem. To overcome this issue, override a virtual clone
function that copies the actual type and returns an owning pointer (std::unique_ptr
) to the new object. In the derived class, return the derived type by using the so-called covariant return type.
Covariant return type: allows for an overriding member function to return a derived type of the return type of the overridden member function.
Let me illustrate this recommendation with an example.
// cloneFunction.cpp

#include <iostream>
#include <memory>
#include <string>

struct Base {    // GOOD: base class suppresses copying
    Base() = default;
    virtual ~Base() = default;
    Base(const Base&) = delete;
    Base& operator = (const Base&) = delete;
    virtual std::unique_ptr<Base> clone() {
        return std::make_unique<Base>();
    }
    virtual std::string getName() const { return "Base"; }
};

struct Derived : public Base {
    Derived() = default;
    std::unique_ptr<Base> clone() override {
        return std::make_unique<Derived>();
    }
    std::string getName() const override { return "Derived"; }
};

int main() {

    std::cout << '\n';

    auto base1 = std::make_unique<Base>();
    auto base2 = base1->clone();
    std::cout << "base1->getName(): " << base1->getName() << '\n';
    std::cout << "base2->getName(): " << base2->getName() << '\n';

    auto derived1 = std::make_unique<Derived>();
    auto derived2 = derived1->clone();
    std::cout << "derived1->getName(): " << derived1->getName() << '\n';
    std::cout << "derived2->getName(): " << derived2->getName() << '\n';

    std::cout << '\n';

}
The clone
member function returns the newly created object in a std::unique_ptr
. The ownership of the newly created objects goes, therefore, to the caller. Now the virtual dispatch happens as expected. See Figure 5.11.
Figure 5.11 A virtual clone
member function
The return type of the Derived::clone member function must be std::unique_ptr<Base> and not std::unique_ptr<Derived>, because covariant return types work only for pointers and references, not for smart pointers. When I change the return type of Derived::clone to std::unique_ptr<Derived>, the compilation fails (see Figure 5.12).
Figure 5.12 A virtual clone
member function without covariant return type
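Covariant return types are available when the virtual function returns a raw pointer or a reference. The following sketch illustrates the language feature with a hypothetical member function cloneRaw; it deliberately differs from cloneFunction.cpp, which suppresses copying and returns a std::unique_ptr. Because the returned raw pointer is owning, the caller in this sketch puts it into a std::unique_ptr right away.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual Base* cloneRaw() const { return new Base(*this); }
    virtual std::string getName() const { return "Base"; }
};

struct Derived : Base {
    // Covariant return type: Derived* instead of Base*.
    Derived* cloneRaw() const override { return new Derived(*this); }
    std::string getName() const override { return "Derived"; }
};

std::string covariantDemo() {
    Derived d;
    std::unique_ptr<Derived> viaDerived{d.cloneRaw()};   // no cast needed

    const Base& b = d;
    std::unique_ptr<Base> viaBase{b.cloneRaw()};         // still creates a Derived

    assert(viaDerived->getName() == "Derived");
    return viaBase->getName();
}
```

Returning an owning raw pointer trades the safety of the std::unique_ptr version for the more precise return type, so this is a trade-off rather than a recommendation.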
A virtual function is a feature that does not come for free.
A virtual function
Increases the run time and the object code size
Is open for errors because it can be overridden in derived classes
Typically, the access specifier for all data members of a class is the same: All data members are either public
or private
.
public
if there is no invariant on the data members. Use a struct
.
private
if there is an invariant on the data members. Use a class
.
Getters or setters are trivial if they do not provide additional semantic value to the data members. Here are two examples of trivial getters and setters from the C++ Core Guidelines:
class Point {    // Bad: verbose
public:
    Point(int xx, int yy) : x{xx}, y{yy} { }
    int get_x() const { return x; }
    void set_x(int xx) { x = xx; }
    int get_y() const { return y; }
    void set_y(int yy) { y = yy; }
    // no behavioral member functions
private:
    int x;
    int y;
};
x
and y
can have arbitrary values. This means an instance of Point
maintains no invariant on x
and y
. x
and y
are just values. Using a struct
as a collection of values is more appropriate, and x
and y
should, consequently, become public
.
struct Point { int x{0}; int y{0}; };
protected
data make your program complex and error prone. If you put protected
data into a base class, you cannot reason about derived classes in isolation and, therefore, you break encapsulation. You always have to reason about the entire class hierarchy.
This means you have to answer at least these three questions.
Do I have to implement a constructor to initialize the protected
data?
What is the actual value of the protected
data if I use them?
Who is affected if I modify the protected
data?
Answering these questions becomes more and more difficult as your class hierarchy becomes more and more complex.
To put it the other way, protected
data is a kind of global data in the scope of the class hierarchy. And you know non-const
global data is bad.
Ensure all non-const data members have the same access level |
The previous rule, C.133, stated that you should avoid protected data. Consequently, all of your non-const
data members should be either public
or private
. An object can have data members that do not represent the invariants of the object. Non-const
data members that do not represent the invariants of an object should be public
. In contrast, non-const
private
data members are used for the object invariants. As a reminder: A data member having an invariant cannot have all the values of the underlying type.
Based on this observation and the additional observation that you should not mix data members representing/not representing invariants in one class, all your non-const
data members should be either public
or private
. Imagine if you have a class with public
and private
data members that are non-const
. Now your data type is confusing. Does your data type maintain an invariant, or is it merely a collection of unrelated values?
There are two typical use cases for multiple inheritance: separating interface inheritance from implementation inheritance and implementing multiple distinct interfaces.
Interface inheritance is about the separation of interface and implementation, so that a derived class can be changed without affecting the user of the base class; implementation inheritance is the use of inheritance to support new functionality by extending existing functionality.
Pure interface inheritance is if your base class has only pure virtual functions. In contrast, if your base class has data members or implemented functions, this is implementation inheritance. Consequently, you break the previous rule “C.121: If a base class is used as an interface, make it an abstract class.” The C++ Core Guidelines give an example of mixing both concepts.
class Shape {    // BAD, mixed interface and implementation
public:
    Shape(Point ce = {0, 0}, Color co = none): cent{ce}, col {co} { /* ... */ }

    Point center() const { return cent; }
    Color color() const { return col; }

    virtual void rotate(int) = 0;
    virtual void move(Point p) { cent = p; redraw(); }

    virtual void redraw() const;

    // ...
public:
    Point cent;
    Color col;
};

class Circle : public Shape {
public:
    Circle(Point c, int r) :Shape{c}, rad{r} { /* ... */ }

    // ...
private:
    int rad;
};

class Triangle : public Shape {
public:
    Triangle(Point p1, Point p2, Point p3);   // calculate center
    // ...
};
Mixing the concepts of interface inheritance and implementation inheritance is bad. Why?
As the Shape
class evolves, it may become more and more difficult and error prone to maintain the various constructors.
The member functions of the Shape
class may never be used.
If you add data to the Shape
class, a recompilation becomes probable.
How can we get the best of those two worlds: stable interfaces with interface hierarchies and code reuse with implementation inheritance? One possible answer, which I implement in this chapter, is dual inheritance. Another answer is the PImpl idiom. PImpl stands for pointer to implementation. It moves implementation details into a separate class that can be accessed through a pointer.
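To give an impression of the PImpl idiom, here is a minimal sketch condensed into one file. The class Widget, its member function describe, and the function pimplDemo are hypothetical; the comments mark which part would typically live in the header and which in the source file.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- would live in the header ---
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                               // defined where Impl is complete
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;

    std::string describe() const;

private:
    struct Impl;                             // only declared in the header
    std::unique_ptr<Impl> pImpl;
};

// --- would live in the source file ---
struct Widget::Impl {
    std::string name;                        // implementation detail
};

Widget::Widget(std::string name) : pImpl{std::make_unique<Impl>()} {
    pImpl->name = std::move(name);
}

Widget::~Widget() = default;
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;

std::string Widget::describe() const { return "Widget " + pImpl->name; }

std::string pimplDemo() {
    Widget w{"dial"};
    assert(w.describe() == "Widget dial");
    Widget moved{std::move(w)};              // only the pointer is moved
    return moved.describe();
}
```

Changes to Widget::Impl affect only the source file, so users of the header do not have to recompile.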
Let’s continue with dual inheritance. Dual inheritance implements a quite sophisticated recipe.
Define the base Shape
of the class hierarchy as pure interface.
class Shape {
public:
virtual Point center() const = 0;
virtual Color color() const = 0;
virtual void rotate(int) = 0;
virtual void move(Point p) = 0;
virtual void redraw() const = 0;
// ...
};
Derive a pure interface Circle
from the Shape
.
class Circle : public virtual Shape {
public:
virtual int radius() = 0;
// ...
};
Provide the implementation class Impl::Shape
.
class Impl::Shape : public virtual Shape {
public:
// constructors, destructor
// ...
Point center() const override { /* ... */ }
Color color() const override { /* ... */ }
void rotate(int) override { /* ... */ }
void move(Point p) override { /* ... */ }
void redraw() const override { /* ... */ }
// ...
};
Implement the class Impl::Circle
by inheriting from the interface and the implementation.
class Impl::Circle : public Circle, public Impl::Shape {
public:
// constructors, destructor
int radius() override { /* ... */ }
// ...
};
If you want to extend the class hierarchy, you have to derive from the interface and from the implementation.
class Smiley : public Circle {
public:
// ...
};
// implementation
class Impl::Smiley : public virtual Smiley, public Impl::Circle {
public:
// constructors, destructor
// ...
};
This is the big picture of the two hierarchies.
Interface: Smiley
-> Circle
-> Shape
Implementation: Impl::Smiley
-> Impl::Circle
-> Impl::Shape
By reading the last lines, maybe you had déjà vu. You are right. This technique of multiple inheritance is similar to the adapter pattern, implemented with multiple inheritance. The adapter pattern is from the well-known Gang of Four (GoF) design pattern book, Design Patterns: Elements of Reusable Object-Oriented Software, authored by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides.
Use multiple inheritance to represent multiple distinct interfaces |
It is a good idea that your interfaces support only one aspect of your design. What does that mean? If you provide a pure interface consisting only of pure virtual functions, a concrete class has to implement all functions. If the interface is too broad, the class has to implement functions it doesn’t need or that make no sense.
An example of two distinct interfaces is istream and ostream from C++’s input and output streams library.
class iostream : public istream, public ostream {    // very simplified
    // ...
};
There are two typical traps when it comes to the design of a class hierarchy.
Create an overload set for a derived class and its bases with using
This rule holds for virtual and nonvirtual functions. If you don’t use the using declaration, member functions in the derived class hide the entire overload set of the base class. This process is also often called shadowing (see Figure 5.13). Shadowing is a behavior that contradicts the intuition of many C++ developers because an overload may be chosen that doesn’t seem like the best match.
Figure 5.13 Shadowing of member functions
// overloadSet.cpp

#include <iostream>

class Base {
public:
    void func(int i) { std::cout << "Base::func(int) \n"; }
    void func(double d) { std::cout << "Base::func(double) \n"; }
};

class Derived: public Base {    // Bad: shadowing func of Base
public:
    void func(int i) { std::cout << "Derived::func(int) \n"; }
};

int main() {

    std::cout << '\n';

    Derived der;
    der.func(2011);
    der.func(2020.5);

    std::cout << '\n';

}
The call der.func(2020.5) passes a double argument, but the int overload of class Derived is used. Consequently, a narrowing conversion from double to int happens. Most of the time, that is not the behavior you want.
To use the double overload of class Base, you have to introduce it in the scope of Derived.
class Derived: public Base {    // good: Base::func is introduced
public:
    void func(int i) { std::cout << "f(int) \n"; }
    using Base::func;    // exposes func(double)
};
Do not provide different default arguments for a virtual function and an overrider
If you provide different default arguments for a virtual function and an overrider, your class may cause lots of confusion.
// overrider.cpp

#include <iostream>

class Base {
public:
    virtual int multiply(int value, int factor = 2) = 0;
};

class Derived : public Base {    // Bad: different defaults for virtual functions
public:
    int multiply(int value, int factor = 10) override {
        return factor * value;
    }
};

int main() {

    std::cout << '\n';

    Derived d;
    Base& b = d;

    std::cout << "b.multiply(10): " << b.multiply(10) << '\n';
    std::cout << "d.multiply(10): " << d.multiply(10) << '\n';

    std::cout << '\n';

}
Figure 5.14 shows the surprising output of the program.
Figure 5.14 Different default arguments for virtual functions
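What’s happening? Both b and d call the same function. The function is virtual and, therefore, late binding happens. Late binding applies to member functions but not to default arguments (or to data members of a class). Default arguments are statically bound: the compiler takes them from the static type of the expression. Consequently, b.multiply(10) uses the default 2 of Base and returns 20, while d.multiply(10) uses the default 10 of Derived and returns 100.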
Although this section has nine rules, only about four of them are covered, for two reasons. First, the rule “C.145: Access polymorphic objects through pointers and references” adds nothing new to the rule “C.67: A polymorphic class should suppress copying.” Second, the C++ Core Guidelines dedicate an entire section to smart pointers. The section about resource management provides complete details.
The remaining rules are about the dynamic_cast and the erroneous assignment of a pointer to an array of derived class objects.
dynamic_cast
Before I write about the dynamic_cast, let me emphasize that casts, including dynamic_cast, are used way too often. The job description of the dynamic_cast, according to cppreference.com, is “Safely converts pointers and references to classes up, down, and sideways along the inheritance hierarchy.”
Let’s first start with the use case of a dynamic_cast.
Use dynamic_cast where class hierarchy navigation is unavoidable
It’s the job of a dynamic_cast to navigate in a class hierarchy.
struct Base {    // an interface
    virtual void f();
    virtual void g();
};

struct Derived : Base {    // a wider interface
    void f() override;
    virtual void h();
};

void user(Base* pb) {
    if (Derived* pd = dynamic_cast<Derived*>(pb)) {
        // ... use Derived's interface ...
    }
    else {
        // ... make do with Base's interface ...
    }
}
To detect the right type for pb during run time, a dynamic_cast is necessary: dynamic_cast<Derived*>(pb). If the cast fails, you get a null pointer.
A downcast can also be performed with static_cast, which avoids the cost of the run-time check. static_cast is only safe if the object is definitely Derived.
The following rules are two options you have for dynamic_cast.
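Use dynamic_cast to a reference type when failure to find the required class is considered an error
and
Use dynamic_cast to a pointer type when failure to find the required class is considered a valid alternative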
To make it short: You can apply a dynamic_cast to a pointer or to a reference. If the dynamic_cast fails, you get back a null pointer in the case of a pointer and a std::bad_cast exception in the case of a reference. Consequently, use a dynamic_cast to a pointer if a failure is a valid option; if a failure is not a valid option, use a reference.
The program badCast.cpp shows both cases.
// badCast.cpp

struct Base {
    virtual void f() {}
};
struct Derived : Base {};

int main() {

    Base a;

    Derived* b1 = dynamic_cast<Derived*>(&a);    // nullptr
    Derived& b2 = dynamic_cast<Derived&>(a);     // std::bad_cast

}
The g++ compiler warns about both dynamic_casts at compile time because they can never succeed. At run time, the program throws the expected exception std::bad_cast for the reference (see Figure 5.15).
Figure 5.15 dynamic_cast causes a std::bad_cast exception
Never assign a pointer to an array of derived class objects to a pointer to its base
This may not happen very often, but when it happens, the consequences are terribly bad. The result may be an invalid object access or memory corruption. The code snippet shows the invalid object access.
struct Base { int x; };
struct Derived : Base { int y; };

Derived a[] = {{1, 2}, {3, 4}, {5, 6}};
Base* p = a;    // Bad: a decays to &a[0] which is converted to a Base*
p[1].x = 7;     // overwrites a[0].y
The last assignment should update the Base member x of the second array element, but due to pointer arithmetic on a Base*, p[1] refers to the int directly after p[0].x. This happens to be the memory of a[0].y! The reason is that a Base* was assigned a pointer to an array of Derived objects. During this assignment (Base* p = a;), the array a decays to &a[0], which is converted to a Base*.
Decay is the name of an implicit conversion that applies lvalue-to-rvalue, array-to-pointer, and function-to-pointer conversions, removing const and volatile qualifiers. This means that you can call a function accepting Derived* with an array of Deriveds. Necessary information such as the length of the array of Deriveds is lost.
In the following code snippet, the function func takes its array as a pointer to the first element.
void func(Derived* d);
Derived d[] = {{1, 2}, {3, 4}, {5, 6}};
func(d);
The array-to-pointer decay is perfectly fine in this func case but causes problems in the previous p[1].x case.
You can overload functions, member functions, function templates, and operators. You cannot overload function objects, and therefore, you cannot overload lambdas.
The seven rules to overloading and overloaded operators follow one key idea: Build intuitive software systems for your users. Let me rephrase this key idea with a well-known golden rule in software development: Follow the principle of least astonishment (also known as the principle of least surprise). The principle of least astonishment essentially means that the components of a system should behave in a way that most users will expect them to behave. This principle is very important for overloading and overloaded operators because with great power comes great responsibility.
Although the seven rules address the intuitive behavior of overloading and overloaded operators, they take different perspectives. They address their conventional usage, implicit conversion operators, the equivalence of overloaded operations, and the idea that you should overload operators in the namespace of their operands.
Conventional usage means that the user should not be surprised by unexpected behavior or mysterious side effects of the operators.
Use an operator for an operation with its conventional meaning
Conventional meaning includes that you use the appropriate operator. For example, here are a few operators that we are used to:
==, !=, <, <=, >, and >=: comparison operations
+, -, *, /, and %: arithmetic operations
->, unary *, and []: access of objects
=: assignment of objects
<<, >>: input and output operations
Conventional meaning includes that your data type should behave like a number if it models a number. This rule is a kind of a generalization of the rule “C.86: Make == symmetric with respect to operand types and noexcept.”
In general, the implementation of a symmetric operator such as + inside the class is not possible.
Assume that you want to implement a type MyInt. MyInt should support the addition of MyInts and built-in ints. Let’s give it a try.
// MyInt.cpp

struct MyInt {
    MyInt(int v):val(v) {};
    MyInt operator + (const MyInt& oth) const {
        return MyInt(val + oth.val);
    }
    int val;
};

int main() {

    MyInt myFive = MyInt(2) + MyInt(3);
    MyInt myFive2 = MyInt(3) + MyInt(2);

    MyInt myTen = myFive + 5;      // OK
    MyInt myTen2 = 5 + myFive;     // ERROR

}
Due to the implicit conversion constructor (MyInt(int v):val(v)), the expression myFive + 5 is valid. A constructor taking one argument is a conversion constructor; in this concrete case, it takes an int and returns a MyInt. In contrast, the last expression 5 + myFive is not valid because the + operator for int and MyInt is not overloaded (see Figure 5.16).
Figure 5.16 Missing overload for int and MyInt
The small program has three issues:
The + operator is not symmetric.
The val variable is public.
The conversion constructor is implicit.
It’s quite easy to overcome the first two issues with a nonmember operator + that is declared in the class as a friend.
// MyInt2.cpp

class MyInt2 {
public:
    MyInt2(int v):val(v) {};
    friend MyInt2 operator + (const MyInt2& fir, const MyInt2& sec) {
        return MyInt2(fir.val + sec.val);
    }
private:
    int val;
};

int main() {

    MyInt2 myFive = MyInt2(2) + MyInt2(3);
    MyInt2 myFive2 = MyInt2(3) + MyInt2(2);

    MyInt2 myTen = myFive + 5;      // OK
    MyInt2 myTen2 = 5 + myFive;     // OK

}
Now implicit conversion from int to MyInt2 kicks in, and the variable val is private. Thanks to the implicit conversion, the 5 in the last line becomes a MyInt2(5).
According to rule “C.46: By default, declare single-argument constructors explicit,” you should not use an implicit conversion constructor.
MyInt3 has an explicit conversion constructor.
// MyInt3.cpp

class MyInt3 {
public:
    explicit MyInt3(int v):val(v) {};
    friend MyInt3 operator + (const MyInt3& fir, const MyInt3& sec) {
        return MyInt3(fir.val + sec.val);
    }
private:
    int val;
};

int main() {

    MyInt3 myFive = MyInt3(2) + MyInt3(3);
    MyInt3 myFive2 = MyInt3(3) + MyInt3(2);

    MyInt3 myTen = myFive + 5;      // ERROR
    MyInt3 myTen2 = 5 + myFive;     // ERROR

}
Making the conversion constructor explicit breaks the compilation (see Figure 5.17).
Figure 5.17 Using an explicit constructor
The general way to solve the challenge is to implement two additional + operators for MyInt4. One takes an int as the left argument, and one takes an int as the right argument.
// MyInt4.cpp

class MyInt4 {
public:
    explicit MyInt4(int v):val(v) {};
    friend MyInt4 operator + (const MyInt4& fir, const MyInt4& sec) {
        return MyInt4(fir.val + sec.val);
    }
    friend MyInt4 operator + (const MyInt4& fir, int sec) {
        return MyInt4(fir.val + sec);
    }
    friend MyInt4 operator + (int fir, const MyInt4& sec) {
        return MyInt4(fir + sec.val);
    }
private:
    int val;
};

int main() {

    MyInt4 myFive = MyInt4(2) + MyInt4(3);
    MyInt4 myFive2 = MyInt4(3) + MyInt4(2);

    MyInt4 myTen = myFive + 5;      // OK
    MyInt4 myTen2 = 5 + myFive;     // OK

}
Make a constructor taking one argument explicit. The same reason holds for the conversion operator.
If you want to have fun, overload the operator bool and make it not explicit. Making it not explicit means that integer promotion from bool to int can happen silently.
Let me design a data type MyHouse that can be bought. I implement the operator bool to easily check to see if a family has already bought the house.
 1 // implicitConversion.cpp
 2
 3 #include <iostream>
 4 #include <string>
 5
 6 struct MyHouse {
 7     MyHouse() = default;
 8     explicit MyHouse(const std::string& fam): family(fam) {}
 9
10     operator bool(){ return not family.empty(); }
11     // explicit operator bool(){ return not family.empty(); }
12
13     std::string family = "";
14 };
15
16 int main() {
17
18     std::cout << std::boolalpha << '\n';
19
20     MyHouse firstHouse;
21     if (not firstHouse) {
22         std::cout << "firstHouse is not sold." << '\n';
23     }
24
25     MyHouse secondHouse("grimm");
26     if (secondHouse) {
27         std::cout << "Grimm bought secondHouse." << '\n';
28     }
29
30     std::cout << '\n';
31
32     int myNewHouse = firstHouse + secondHouse;
33     int myNewHouse2 = (20 * firstHouse - 10 * secondHouse)
34                       / secondHouse;
35
36     std::cout << "myNewHouse: " << myNewHouse << '\n';
37     std::cout << "myNewHouse2: " << myNewHouse2 << '\n';
38
39     std::cout << '\n';
40
41 }
Now I can easily check with the operator bool (line 10) to see if no family (line 21) or a family (line 26) lives in the house. Fine. Due to the implicit operator bool, I can use objects of MyHouse in arithmetic expressions (lines 32 and 33). Supporting arithmetic was not my intention. See Figure 5.18.
Figure 5.18 Implicit operator bool
This is weird!
Since C++11, you can make a conversion operator explicit; therefore, no implicit conversion to int kicks in. If I use the explicit operator bool (line 11), the arithmetic of houses is not possible anymore, but houses can be used in logical expressions. See Figure 5.19.
Figure 5.19 Explicit operator bool
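Overload operations that are roughly equivalent
and
Overload only for operations that are roughly equivalent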
Both rules are closely related. Equivalent operations should have the same name. Or the other way around: Nonequivalent operations should not have the same name.
Here is the example from the C++ Core Guidelines.
void print(int a);
void print(const string&);
// ...
print(5);
Invoking print(5) feels like generic programming. You don’t have to care which version of print is used. This observation will not hold if the functions have different names.
void print_int(int a);
void print_string(const string&);
// ...
print_int(5);
If nonequivalent operations have the same name, the names are too general or just wrong. This is confusing and error prone.
std::string translate(const std::string& text);    // translate into English
Code translate(const Code& code);                   // compile the code
Define overloaded operators in the namespace of their operands
Have you ever wondered why the following program works and displays Test?
#include <iostream>

int main() {
    std::cout << "Test\n";
}
First of all, the compiler essentially turns the program into the following one:
#include <iostream>

int main() {
    operator << (std::cout, "Test\n");
}
The expression std::cout << "Test\n" boils down to operator << (std::cout, "Test\n"). There is no operator << in the global namespace, but argument-dependent lookup (ADL) examines the std namespace. The lookup of operator << finds std::operator << (std::ostream&, const char*) because std::cout is in the std:: namespace.
Argument-dependent lookup (ADL, also called Koenig lookup) means that for unqualified function calls, the functions in the namespaces of the function arguments are also considered by the C++ compiler.
Let me rephrase the definition of ADL using operands and operators. For operators, the C++ compiler also considers the namespaces of the operands. Consequently, you should define overloaded operators in the namespace of their operands.
A union is a special class type where all members start at the same address. A union can hold only one of its members at a time; therefore, you can save memory. A tagged union (aka discriminated union) is a union that keeps track of the type it currently holds. std::variant is a tagged union.
The C++ Core Guidelines state that the job of unions is to save memory. You should not use naked unions but tagged unions such as std::variant.
A union can hold only one type at one point in time, so you can save memory because the elements of a union share the same memory. The union will be at least as big as its biggest member.
union Value {
    int i;
    double d;
};

Value v = { 123 };             // initializes the first member with an int
std::cout << v.i << '\n';      // write 123
v.d = 987.654;                 // now v holds a double
std::cout << v.d << '\n';      // write 987.654
Value is a “naked” union. You should not use it, according to the next rule.
“Naked” unions are very error prone because you have to keep track of the underlying type.
// nakedUnion.cpp

#include <iostream>

union Value {
    int i;
    double d;
};

int main() {

    std::cout << '\n';

    Value v;
    v.d = 987.654;
    std::cout << "v.d: " << v.d << '\n';
    std::cout << "v.i: " << v.i << '\n';    // (1)

    std::cout << '\n';

    v.i = 123;
    std::cout << "v.i: " << v.i << '\n';
    std::cout << "v.d: " << v.d << '\n';    // (2)

    std::cout << '\n';

}
The union holds a double in the first section and an int value in the second section. If you read a double as an int (1), or an int as a double (2), you get undefined behavior (see Figure 5.20).
Figure 5.20 Undefined behavior with a “naked” union
To overcome this source of errors, you should use a tagged union.
Implementing a tagged union is quite sophisticated. In case you are curious, have a look at the rule “C.182: Use anonymous unions to implement tagged unions.”
To simplify the code sample below, I used the tagged union std::variant, which is part of C++17.
 1 // variant.cpp; C++17
 2
 3 #include <variant>
 4 #include <string>
 5
 6 int main() {
 7
 8     std::variant<int, float> v;
 9     std::variant<int, float> w;
10
11     int i = std::get<int>(v);    // i is 0
12
13     v = 12;                      // v contains int
14     int j = std::get<int>(v);
15
16     w = std::get<int>(v);
17     w = std::get<0>(v);          // same effect as the previous line
18     w = v;                       // same effect as the previous line
19
20
21     // std::get<double>(v);      // error: no double in [int, float]
22     // std::get<3>(v);           // error: valid index values are 0 and 1
23
24     try{
25         std::get<float>(w);      // w contains int, not float: will throw
26     }
27     catch (std::bad_variant_access&) {}
28
29     v = 5.5f;                    // switch to float
30     v = 5;                       // and back
31
32     std::variant<std::string> v2("abc");    // converting constructors ok when unambiguous
33     v2 = "def";                  // converting assignment ok when unambiguous
34
35 }
Lines 8 and 9 define the two variants v and w. Both variants can hold an int or a float value. Their initial value is 0 (line 11) because a default-constructed variant holds the default value of its first underlying type, which is int. In line 13, v gets the value 12. Thanks to std::get<int>(v), you can get the value for the underlying type. Line 16 and the following two lines show three possibilities to assign the value of the variant v to the variant w. You have to keep a few rules in mind. You can ask for the value of a variant by type or by index. The type must be unique, and the index valid (lines 21 and 22); otherwise, the code does not compile. If the variant does not currently hold the requested type, you get a std::bad_variant_access exception (lines 24 to 27). Lines 29 and 30 switch the variant v to float and back to int. If the constructor call or assignment call is unambiguous, a conversion takes place. This conversion is the reason you can construct a std::variant<std::string> with a C-string or assign a new C-string to the variant (lines 32 and 33).
I have skipped two sections from the classes and class hierarchies part of the C++ Core Guidelines. The first one is the section on containers and other resource handles; the second one is the section related to function objects and lambdas.
I skipped the six guidelines discussing containers and other resource handles because they lack content.
The four guidelines to function objects and lambdas are already part of Chapter 4, Functions, and Chapter 8, Expressions and Statements.
The rules related to smart pointers are presented in a bigger context in Chapter 7, Resource Management.