Chapter 5.2

P.10: Prefer immutable data to mutable data

The wrong defaults

C++ is the language with the “wrong” defaults. Among other things, objects are created mutable unless they are qualified with const. But why is this the wrong default?

There are two things you can do to data: read from it and write to it. In assembly languages this is also known as loading and storing. You load values from memory into registers and store values from registers into memory. It is very unusual to store a value to memory without loading from it first. Similarly, it is very unusual to write to an object without reading from it first. However, it is much more common to read from an object without subsequently writing to it.

Read-only objects are much more common than read-write objects, which are themselves much more common than write-only objects. The default should be the most common option, so we contend that objects should be immutable by default.

This would have some interesting side effects. Imagine that C++ did not have a const keyword, and that the mutable keyword was more extensively deployed. You might write functions like this:

void f1(std::string id)
{
  auto write = id_destination(); // write is immutable
  …
}

As you develop your function, there will come a point where you will need to qualify write as mutable, because you are writing to it and that would violate immutability. Imagine what would happen if you finished the function and noticed that write was still immutable. Immediately, this would demonstrate evidence of a bug. Why would you give something a name like write and never mutate it?

This bug-finding trick can be yours by simply declaring everything const and modifying that qualification only when you need to mutate the object. You can periodically requalify all the objects as const and check that everything that should indeed be written to causes the compiler to complain about its const qualification. It is a nice win for code clarity.
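A minimal sketch of this habit follows. The function count_even and its contents are hypothetical, not taken from the text, and it is marked constexpr (assuming C++14 or later) only so that its result can be checked at compile time. Every local starts out const, and the qualification is removed only from the one object that is written to.

```cpp
#include <array>
#include <cstddef>

// Hypothetical example: count the even values in a fixed-size array.
constexpr std::size_t count_even(std::array<int, 4> const& values)
{
  auto const size = values.size(); // only read from, so it stays const
  std::size_t evens = 0;           // written to in the loop, so const was removed
  for (std::size_t i = 0; i < size; ++i) {
    if (values[i] % 2 == 0) {
      ++evens;
    }
  }
  return evens;
}
```

Putting const back on evens makes the compiler reject ++evens, which is the complaint you want to see for an object that really is written to.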

However, that is not the primary reason to prefer immutable data.

I have a car, and I have sunglasses that I keep in the car. They never leave the car. I have additional sunglasses for when I am not in the car. Driving into the sun without sunglasses is enough of a danger for me to avoid it where possible. This means the sunglasses are in one of two places: on my face, or in the special pocket I have for them in the car door. I have habituated myself to simply check that they are where they should be when I open the car door. I really do not need to think about it.

My car is sadly not an electric vehicle. I hope it will be my last gas-guzzler. When I start a journey, I must decide whether I have enough fuel for the journey. The manufacturers have gone to some lengths to make this as simple as possible. There is a fuel gauge on the dashboard, with an estimated range telling me how far I can go before I need to refuel. There is a safety margin built in, so in the normal case I do not run out of fuel because I can do the arithmetic required. Sometimes things have gone wrong, though. There are long stretches of road in the UK where there are no filling stations, for example across moors. Getting stuck in a traffic jam in that sort of situation can be a tense occasion for everyone in the car if I have allowed the fuel tank to drain a little too far.

It is much easier for me to think about my sunglasses than it is to think about my fuel tank. My sunglasses live in one place, and I can get to them without thinking about it. The volume of fuel in my fuel tank is an ever-changing quantity, and it introduces cognitive load to my driving experience. It is much easier for me to reason about the consistently located sunglasses than it is for me to reason about the varyingly filled fuel tank.

Consider this function:

double gomez_radius(double cr, double x1, double x2,
                    double y1, double y2, double rh)
{
  assert(rh > 1.0f);
  auto lt1 = lovage_trychter_number(cr, x1, y1);
  auto lt2 = lovage_trychter_number(cr, x2, y2);
  auto exp = 1;
  while (lt1 > lt2) {
    lt2 *= rh;
    ++exp;
  }
  auto hc = haldwell_correction(cr, lt1, lt2);
  auto gr = (lt1 * lt2 * sqrt(hc)) / rh * exp;
  return gr;
}

You can imagine this function being copied directly from a textbook. Apprehending this function is simplified by a sprinkling of const:

double gomez_radius(double cr, double x1, double x2,
                    double y1, double y2, double rh)
{
  assert(rh > 1.0f);
  auto const lt1 = lovage_trychter_number(cr, x1, y1);
  auto lt2 = lovage_trychter_number(cr, x2, y2);
  auto exp = 1;
  while (lt1 > lt2) {
    lt2 *= rh;
    ++exp;
  }
  auto const hc = haldwell_correction(cr, lt1, lt2);
  auto const gr = (lt1 * lt2 * sqrt(hc)) / rh * exp;
  return gr;
}

The number of moving parts has been reduced to two: the second Lovage-Trychter number and the exponent.

Imagine how complicated architecture would become if the acceleration due to the earth’s gravity varied significantly around the planet. Consider the advances in twentieth-century science that emerged once the speed of light was exposed as a constant. Constants are good! Their invariant nature provides us with one less thing to think about. Where you can fix the value of something, you should do so.

constness in function declarations

Although we cannot implicitly make objects const by default, we can qualify them whenever they are created. Member functions are also mutable by default. Again, do the smart thing and qualify them as const until they need to mutate something.

Things are a little more subtle here, though. As discussed in Chapter 3.4, there are two kinds of const, logical and bitwise. When you decide that a const function needs to mutate something, you need to ask yourself whether it is the function that should not be const-qualified, or the member data that should be mutable, exempting it from the rule that member data cannot be changed in a const member function.

This is a matter of API design. The purpose of const qualification is to tell the user, “There will be no observable difference in the state of the object between consecutive calls to const functions.” If your function is in fact changing the state of the abstraction rather than the state of private implementation details, then your function should not be const-qualified.

For example, you may want to design a thread-safe queue for message passing. This is a common exercise when you learn about thread safety: rather than synchronizing thread progress by sharing memory, it is safer to synchronize by communicating between threads. This leads to the maxim referred to earlier: “Don’t communicate by sharing memory, share memory by communicating.” You can argue among yourselves about whether passing messages counts as sharing memory.

A message-passing queue might have this public interface:

template <typename T>
class message_queue {
public:
  … // definitions for iterator &c.
  void push(T);
  T pop();
  const_iterator find(const T&) const;

private:
  mutable std::mutex m_lock;
  … // other implementation details
};

Typically, you would have two threads: a producer and a consumer. The producer pushes messages and the consumer pops messages. The push and pop functions are of course going to mutate the object, but the find function, which searches for a particular object in the queue, should not. While this function is executing it will need to lock the mutex, which is a non-const operation. If find were not a const function, it could change any member data at all. Because it is const, the only value it can change is the mutex, which has been marked mutable. Simply reading the class declaration tells you this. Indeed, the only context where a mutex member datum does not need to be mutable is if there are no const member functions. Even then, it is a good habit to make mutexes mutable.
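The interplay between a const member function and a mutable mutex can be sketched in a few lines. The class below is a hypothetical, much-reduced stand-in for message_queue, not its actual implementation: it stores int values in a std::vector and offers contains() in place of find() to keep the sketch short.

```cpp
#include <algorithm>
#include <mutex>
#include <vector>

// Hypothetical reduction of message_queue, for illustration only.
class int_queue {
public:
  // Not const: pushing changes the observable state of the queue.
  void push(int value)
  {
    std::lock_guard<std::mutex> guard(m_lock);
    m_items.push_back(value);
  }

  // const: searching leaves the observable state of the queue unchanged.
  bool contains(int value) const
  {
    // Locking modifies m_lock, which is permitted only because it is mutable.
    std::lock_guard<std::mutex> guard(m_lock);
    return std::find(m_items.begin(), m_items.end(), value) != m_items.end();
  }

private:
  mutable std::mutex m_lock; // implementation detail, exempt from const
  std::vector<int> m_items;  // not mutable: contains() cannot modify it
};
```

Removing mutable from m_lock makes contains() fail to compile, because std::lock_guard requires a reference to a non-const mutex.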

Function parameters are also, by default, mutable. They behave like local variables. It is unusual for a function to mutate its arguments: they are used to inform its operation. Of course, the functions in <algorithm> which take input iterators are a notable exception to this. However, given that function parameters are also rarely declared const anyway, you may consider avoiding the const qualification of function parameters. For example:

void f(char const* const p);
void g(int const i);

might seem pedantic rather than correct. This is a matter of style and taste. There is another related matter of style and taste regarding const, and that is where to put it.

  • References cannot be const-qualified.

  • Pointers are const-qualified to the right.

  • Member functions are const-qualified to the right.

  • Types can be const-qualified to the right or to the left of the type. This is known as East const and const West.

References cannot be const-qualified, because a reference cannot be reseated; that is, its referent cannot be changed. Qualifying a reference with const would add no further information.

Pointers sit to the right of a type name. If pointers could be const-qualified to the left, then there would be potential for ambiguity:

int const * j; // is the int being qualified, or the pointer?

The same is true for member functions:

int const my_type::f1(); // is the return type const-qualified
                         // or the member function?

It is only types whose const qualification offers a choice of position. It is possibly unwise to be dogmatic about on which side const should appear; however, I favor East const, as you may have inferred from reading this text. I like the consistency, and I also find additional clarity in declaring my objects like this:

int const& a = …;    // a is an int const-reference
int const* b;        // b is a pointer-to-const int
int & c = …;         // c is a reference to an int
int const d;         // d is a const-int

English offers little help here sadly, since it places adjectives to the left of the thing being qualified, so there is a certain dissonance in writing const to the right. Having said that, consistency can improve readability, but this is not a hard-and-fast rule. My editor remarks that she is successfully training her brain to treat int as an adjective, as well as const, which describes the nature of the name. Adjective ordering is an interesting digression (why do we say “big friendly dog” rather than “friendly big dog”?) but out of scope for this text. Ultimately, always prefer readability to consistency and, more broadly, prefer pragmatism to dogmatism, especially in engineering.
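Whichever side you choose, the two placements name exactly the same types, whereas moving const across the * changes what is being qualified. A minimal sketch that checks this at compile time, using the standard <type_traits> header:

```cpp
#include <type_traits>

// East const and const West are two spellings of the same type.
static_assert(std::is_same<int const, const int>::value, "same type");
static_assert(std::is_same<int const*, const int*>::value, "same type");
static_assert(std::is_same<int const&, const int&>::value, "same type");

// const to the right of the * qualifies the pointer, not the int.
static_assert(!std::is_same<int const*, int* const>::value, "different types");
static_assert(std::is_const<int* const>::value, "the pointer itself is const");
static_assert(!std::is_const<int const*>::value, "only the pointed-to int is const");

// A reference type is never const-qualified itself.
static_assert(!std::is_const<int const&>::value, "only the referent is const");
```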

In addition to parameters, it is worth thinking about return values. When a function returns a value, it is an rvalue. Qualifying it as const is irrelevant because it is about to be assigned from, constructed from, or destroyed. This being the case, is there ever any reason to const-qualify a return type?

This leading question should tell you that, in fact, there is. Although the returned object is an rvalue, it is not destroyed immediately, and it is possible to invoke functions on it. Look at this class:

template <typename T>
class my_container {
public:
  …
  T operator[](size_t);
  T const operator[](size_t) const;
};

The subscript operator, or more precisely the bracket operator, is being used to return an object by value rather than by reference, which would be more usual. It may be the case that this container has very unstable iterators; perhaps they become invalidated by operations on another thread. In this case, returning by value is the only way of safely exporting values. There are two overloads of this operator so that the correct one will be used for [] on const and non-const objects. Consider this piece of code:

my_container<Proxies> const p = {a, b, c, d, e};
p[2].do_stuff();

If the member function Proxies::do_stuff() is overloaded by const qualification, then the correct function will be called on the rvalue returned by invoking p[2].
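The effect of the const-qualified return type on overload selection can be sketched as follows. All of the names here (called, proxy, const_returning, plain_returning) are hypothetical stand-ins, the containers hold no data, and everything is marked constexpr (assuming C++14 or later) only so that the outcome can be checked with static_assert.

```cpp
#include <cstddef>

// Records which overload of do_stuff() ran; for illustration only.
enum class called { const_overload, non_const_overload };

// Hypothetical stand-in for the Proxies type.
struct proxy {
  constexpr called do_stuff() const { return called::const_overload; }
  constexpr called do_stuff() { return called::non_const_overload; }
};

// The const overload returns a const value, as my_container does.
struct const_returning {
  constexpr proxy operator[](std::size_t) { return proxy{}; }
  constexpr proxy const operator[](std::size_t) const { return proxy{}; }
};

// Same shape, but the const overload returns a plain value.
struct plain_returning {
  constexpr proxy operator[](std::size_t) { return proxy{}; }
  constexpr proxy operator[](std::size_t) const { return proxy{}; }
};

constexpr const_returning p{};
constexpr plain_returning q{};

// The const rvalue selects the const overload of do_stuff()...
static_assert(p[2].do_stuff() == called::const_overload, "constness propagated");
// ...whereas the plain rvalue selects the non-const overload,
// even though q itself is const.
static_assert(q[2].do_stuff() == called::non_const_overload, "constness lost");
```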

We hope you are thinking to yourself that this is a rather contrived example, and we agree that this is the case. This very contrivance should convey to you that const-qualifying return types should be a deliberate and unusual step.

Finally, there is an exception to qualifying function parameters as const, and that is when they are passed by pointer or by reference. By default, you should take such arguments by reference-to-const or by pointer-to-const. This signals to your callers that you will not be changing the objects that are being passed in. This reinforces the idea of input and input-output parameters, as discussed in Chapter 4.1. The separation of type from const, reference, and pointer qualification becomes more useful here. For example:

int f1(std::vector<int> const& a, int b, std::vector<int> & c);

It is immediately clear here that the function is using two inputs, a and b, to modify an input-output parameter, c (consider which values can be mutated and propagated back to the call site). The return value is likely an error code. The separation of const& from type and identifier highlights the qualification that gives us a flavor of what part we expect the object to play in the function.
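The text gives only the declaration of f1, so the body below is purely hypothetical, as is its error convention (0 for success, 1 when there is nothing to do). It is a sketch of how the qualifications constrain what the function can do with each parameter.

```cpp
#include <vector>

// Hypothetical body: append each element of a, scaled by b, to c.
int f1(std::vector<int> const& a, int b, std::vector<int> & c)
{
  if (a.empty()) {
    return 1; // assumed error code: no input to process
  }
  for (auto const value : a) {
    c.push_back(value * b); // c is an input-output parameter
  }
  // a.push_back(b); // would not compile: a is a reference to const
  return 0;
}
```

The caller's copy of c is changed, while a is guaranteed to be left alone and b is a private copy whose changes, if any, never reach the call site.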

Summary

It is tempting to respond to this guideline by simply sprinkling const wherever you can and murmuring “job done” to yourself. A little more subtlety than that is required, though. There is no correct default for const qualification. It is certainly true that in most cases you will want to create your objects as const-qualified and then update them as development continues, but it is not simply a case of dropping a const next to the type: there are some places where const is not appropriate. Having said that, the guideline’s direction of preference is absolutely true.

  • Immutable data is easier to reason about than mutable data.

  • Make objects and member functions const wherever you can.

  • Consider whether to use East const or const West to assist readability.

  • Function parameters passed by value and returned by value do not benefit from being const-qualified, except to correctly propagate constness in some specific situations.