Chapter 3.1

I.11: Never transfer ownership by a raw pointer (T*) or reference (T&)

Using the free store

Ownership is important. It implies responsibility which, in C++, means cleaning up after yourself. If you create something, you clean it up. While this is a completely trivial matter for objects of static and automatic storage duration, it is a minefield for objects of dynamic duration which are allocated from the free store.

Once memory is allocated from the free store it can be very easily lost. It can only be returned to the free store using the pointer to which it was assigned. This pointer is the only available handle to the memory. If that pointer falls out of scope without being copied, then the memory can never be recovered. This is known as a memory leak. For example:

size_t make_a_wish(int id, std::string owner) {
  Wish* wish = new Wish(wishes[id], owner);
  return wish->size();  // wish is never deleted, so the Wish leaks
}

At the end of the function the Wish pointer falls out of scope, leaving the memory unrecoverable. We can change the function a little by returning the pointer so that the caller can take ownership and delete the object later, freeing the memory.

Wish* make_a_wish_better(int id, std::string owner) {
  Wish* wish = new Wish(wishes[id], owner);
  return wish;
}

This is perfectly well-formed code, although we would not call it modern in the idiomatic sense. Unfortunately, it carries a burden: the caller must take ownership of the object and ensure that it is destroyed via the delete operator when they are finished with it, freeing the memory. There is also a danger of deleting it before everyone has finished with the object it points to. If make_a_wish takes a pointer from another object, how should it be signaled that the other object has finished with it and no longer needs it?

Historically, this type of function led to the free store being exhausted by zombie objects whose ownership was not clearly signposted and for which the necessary deleting never took place. This signposting might take several forms. The function author might name the function allocate_a_wish, hinting to the client that an allocation had taken place and it was their responsibility now.

This is a rather weak way of signaling ownership; it is unenforceable, and it relies on the client remembering that they have this responsibility and discharging it appropriately. It also requires the author to embed the implementation in the interface. This is a bad habit since it implicitly exposes implementation details to the client and prevents you from changing them without introducing confusion.

While constrained naming may seem weak, it is not as weak as remarking on it in some documentation on a server somewhere remote, dark, and uninviting. Nor is it as weak as a note in a header file that nobody ever reads. While it is a lesser evil, it is certainly not foolproof.

Worse still is returning a value via a reference rather than a pointer. How does the caller know when the object has been destroyed? All that the caller can do with such a value is hope that another thread doesn’t destroy it in the meantime, and ensure the object is used before calling another function that may trigger its destruction. This is a lot of context to keep in mind while writing your code.

If you work with particularly elderly codebases you may see instances of std::auto_ptr. This was the first attempt at solving this problem, and it was standardized in C++98. The std::auto_ptr would contain the pointer itself and provide overloaded pointer semantics, acting as a handle to the object. The std::auto_ptr could be passed around, releasing ownership when it was copied from. When it fell out of scope while retaining ownership it would delete the contained object. However, these unusual copy semantics meant std::auto_ptr objects could not be safely contained in standard containers, and the class was deprecated in C++11 and removed in C++17.

The committee doesn’t deprecate things without replacements, though, and the introduction of move semantics made it possible to create pointer-owning objects for which containment was not a problem. As std::auto_ptr was deprecated in C++11, std::unique_ptr and std::shared_ptr were introduced. These are known as “smart” pointers or “fancy” pointers, and they let you state ownership in the type rather than in a naming convention or a comment.

When you receive a std::unique_ptr object you become the owner of what the object points to. When its name falls out of scope it deletes the contained object. However, unlike the std::auto_ptr, it does not contain a flag identifying ownership, so it can be safely contained in a standard container. It doesn’t need a flag because it cannot be copied; it can only be moved, so there is no ambiguity about who is currently responsible for the object it contains.
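As a minimal sketch of how this looks in practice, the factory function from earlier could return a std::unique_ptr instead of a raw pointer. The Wish type below is a hypothetical stand-in (the real one is not shown in this chapter), and the function takes the wish text directly rather than looking it up by id; the name make_a_wish_safely is also an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Wish type used earlier in this chapter.
struct Wish {
  Wish(std::string text, std::string owner)
      : text_(std::move(text)), owner_(std::move(owner)) {}
  std::size_t size() const { return text_.size(); }
  std::string text_;
  std::string owner_;
};

// The return type states that the caller becomes the sole owner.
std::unique_ptr<Wish> make_a_wish_safely(std::string text, std::string owner) {
  return std::make_unique<Wish>(std::move(text), std::move(owner));
}

// A std::unique_ptr can live in a standard container because it is
// moved into the container, never copied.
std::size_t total_wish_size() {
  std::vector<std::unique_ptr<Wish>> wishes;
  wishes.push_back(make_a_wish_safely("a pony", "Alice"));

  auto wish = make_a_wish_safely("world peace", "Bob");
  wishes.push_back(std::move(wish));  // ownership moves into the vector
  assert(wish == nullptr);            // the moved-from handle owns nothing

  std::size_t total = 0;
  for (const auto& w : wishes) {
    total += w->size();
  }
  return total;  // every Wish is deleted when the vector is destroyed
}
```

No delete appears anywhere: the objects are cleaned up when their single owner goes away.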

When you receive a std::shared_ptr object you gain an interest in what the object points to. When it falls out of scope that interest is withdrawn. The instant that nothing any longer has an interest in the object, it is deleted. Ownership is shared between everything that has an interest in the contained object. The object will not be destroyed until no objects remain that hold an interest in it.
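The idea of registering and withdrawing an interest can be observed through use_count(). The following is a small single-threaded sketch; the InterestCounts struct and the function name are hypothetical and exist only to report what was seen.

```cpp
#include <cassert>
#include <memory>

// Hypothetical record of the interest count at three points in time.
struct InterestCounts {
  long initial;
  long while_shared;
  long after_release;
};

InterestCounts observe_interest() {
  InterestCounts counts{};
  auto first = std::make_shared<int>(42);
  counts.initial = first.use_count();  // one interested party
  {
    std::shared_ptr<int> second = first;  // copying registers a second interest
    assert(second.get() == first.get());  // both refer to the same object
    counts.while_shared = first.use_count();
  }  // second falls out of scope and its interest is withdrawn
  counts.after_release = first.use_count();
  return counts;  // the int is deleted when first is destroyed here
}
```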

Your default choice for holding objects with dynamic storage duration should be a std::unique_ptr. You should only use std::shared_ptr where reasoning about lifetime and ownership is impossibly hard, and even then, you should treat it as a sign of impending technical debt caused by a failure to observe the appropriate abstraction. One example of this could be a Twitter viewer that organizes tweets into different columns. Tweets may contain images that make them large, and they may also be shared across columns. A tweet only needs to continue to exist while it is in view in one of the columns, but the user decides when a tweet is no longer needed by scrolling it away from all columns. You might decide to keep a container of tweets and use counts, effectively reference counting the tweets manually, but that is simply duplicating the std::shared_ptr abstraction at another level of indirection. Particularly, though, the user is making the decision about the lifetime of the tweet, not the program. It should be rare that such a situation arises.
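A rough sketch of the tweet viewer idea follows. Tweet and Column are hypothetical types invented for this illustration, and a std::weak_ptr is used purely to observe when the tweet is destroyed without itself holding an interest.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct Tweet {  // hypothetical; a real tweet would carry images and more
  std::string text;
};

// Each column shares ownership of the tweets it is displaying.
using Column = std::vector<std::shared_ptr<Tweet>>;

bool tweet_lives_until_last_column_drops_it() {
  Column home;
  Column mentions;

  auto tweet = std::make_shared<Tweet>(Tweet{"Hello, world"});
  std::weak_ptr<Tweet> watcher = tweet;  // observes without owning
  assert(!watcher.expired());

  home.push_back(tweet);
  mentions.push_back(tweet);
  tweet.reset();  // only the two columns hold an interest now

  home.clear();  // scrolled away from one column
  const bool alive_in_one_column = !watcher.expired();

  mentions.clear();  // scrolled away from the last column
  const bool destroyed_after_last = watcher.expired();

  return alive_in_one_column && destroyed_after_last;
}
```

Neither column needs to know about the other, and no manual use count is kept anywhere.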

The performance cost of smart pointers

Sometimes you might decide that you do not want to use smart pointers. Copying a std::shared_ptr is not without cost. A std::shared_ptr needs to be thread safe, and thread safety costs cycles; note that only the control block of a std::shared_ptr is thread safe, not the resource itself. It may be implemented as a pair of pointers, one of which points to the thing being contained and the other pointing to the bookkeeping mechanism. It remains cheap to copy in terms of moving memory around, but the bookkeeping mechanism must synchronize with other threads, typically by means of an atomic operation, to increment the reference count when it is copied. When it falls out of scope it must synchronize again to decrement the reference count, destroying the object if that count reaches zero.

The std::unique_ptr is a simpler, cheaper beast. Since it can only be moved, not copied, only one instance of it can own a given object, so when it falls out of scope it must delete the object it is containing. No bookkeeping is required. However, if you supply a custom deleter, such as a pointer to a function, there is still the overhead of storing it; with the default deleter a std::unique_ptr is typically no larger than a raw pointer. The std::shared_ptr stores its deleter too, as part of the bookkeeping.
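How much room the deleter takes depends on its type. The sketch below compares a std::unique_ptr using the default deleter with one that stores a pointer to a deleting function; delete_int and the other function names are hypothetical. The exact sizes are implementation details, so the code only reports them rather than assuming particular values.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical custom deleter, supplied as a plain function.
void delete_int(int* p) { delete p; }

// Size of a unique_ptr whose deleter is part of its type.
std::size_t size_with_default_deleter() {
  return sizeof(std::unique_ptr<int>);
}

// Size of a unique_ptr that must also store a function pointer.
std::size_t size_with_function_pointer_deleter() {
  return sizeof(std::unique_ptr<int, void (*)(int*)>);
}

std::unique_ptr<int, void (*)(int*)> make_with_custom_deleter(int value) {
  std::unique_ptr<int, void (*)(int*)> p(new int(value), &delete_int);
  assert(p != nullptr);  // new either succeeds or throws
  return p;              // delete_int runs when the final owner is destroyed
}
```

On mainstream implementations the first size usually equals that of a raw pointer, while the second has to hold two pointers.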

This is not something you should worry about until it turns up as a hot spot in your profiling measurements. The safety of a smart pointer is a very valuable part of your engineering effort. However, if you find that using smart pointers registers in your profile, you might look at where you are passing them and discover that sharing or transferring ownership is unnecessary. For example:

size_t measure_widget(std::shared_ptr<Widget> w) {
  return w->size(); // (We're assuming that w is non-null)
}

This function does not require any kind of ownership considerations. It is simply calling a function and returning that value. This function would work just as well:

size_t measure_widget(Widget* w) {
  return w->size(); // (We're still assuming that w is non-null)
}

Pay particular attention to what has happened to w, or rather what has not happened to w. It has not been passed on to another function, or used to initialize another object, or indeed had its lifetime extended in any way. If the function were to look like this:

size_t measure_widget(Widget* w) {
  return size(w); // (You guessed it…)
}

then that would be a different matter. You do not own w, so it is not yours to pass around. The size function may take a copy of w and cache it for later use, so unless you are certain of the implementation of that function and you are also in charge of how it might change, passing w is unsafe. If the object w points to is destroyed later, then that copy of w will be pointing to nothing and dereferencing it will be potentially disastrous.

This function takes an object by pointer and then passes it to another function. This implies ownership, which is not conveyed in the function signature. Do not transfer ownership by a raw pointer.

The correct way to implement this function is:

size_t measure_widget(std::shared_ptr<Widget> w) {
  return size(w);
}

You are now giving the size() function an interest in the std::shared_ptr. If the calling function subsequently destroys w, the size() function can still retain a copy.
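To make the point concrete, here is a sketch in which size() really does keep a copy. The Widget type and the last_measured() cache are hypothetical, invented to stand in for whatever size() might retain; the chapter does not show the real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

struct Widget {  // hypothetical stand-in for the chapter's Widget
  std::size_t size() const { return 3; }
};

// Hypothetical cache: the most recently measured widget.
std::shared_ptr<Widget>& last_measured() {
  static std::shared_ptr<Widget> cached;
  return cached;
}

std::size_t size(std::shared_ptr<Widget> w) {
  assert(w != nullptr);  // as in the text, w is assumed to be non-null
  const std::size_t result = w->size();
  last_measured() = std::move(w);  // the cache now shares ownership
  return result;
}

std::size_t measure_widget(std::shared_ptr<Widget> w) {
  return size(std::move(w));
}

bool cached_widget_survives_caller() {
  auto w = std::make_shared<Widget>();
  std::weak_ptr<Widget> watcher = w;  // observes without owning
  measure_widget(w);
  w.reset();  // the caller gives up its interest
  // The cache still holds an interest, so the Widget is still alive.
  return !watcher.expired() && last_measured()->size() == 3;
}
```

Had the cache stored a raw Widget* instead, the same sequence would have left it dangling.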

Using unadorned reference semantics

A raw pointer is not the only way to pass an object by reference rather than by value. You can also achieve this using a reference. Using a reference is the preferred mechanism of passing by reference. Consider this version of measure_widget:

size_t measure_widget(Widget& w) {
  return w.size(); // (References cannot be null without evil intent)
}

This is superior because it passes the burden of checking that the object exists to the caller. They must dereference their pointer and pay the penalty if it turns out to be null. However, the same ownership problem exists if w is passed on. If the reference is stored as part of another object, and the referent is destroyed, then that reference will no longer be valid.

The function signature should tell the caller everything they need to know about ownership. If the signature includes a T*, then the caller can pass a pointer to an object, or a null pointer, and not worry about its lifetime. The caller is simply passing an object by reference to the function and then carrying on with things. If the signature includes a T&, then the caller can pass a reference to an object, and not worry about its lifetime. The same benefits apply.

If the signature includes a std::unique_ptr<T>, then the caller must surrender ownership of the object. If the signature includes a std::shared_ptr<T>, then the caller must share ownership of the object with the function. This implies the caller cannot be sure when the object will be destroyed.
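These conventions can be put side by side in a short sketch. Widget is a hypothetical type, and the function names inspect, consume, and share are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

struct Widget {  // hypothetical stand-in for the chapter's Widget
  std::size_t size() const { return 3; }
};

// Reference: the function only looks; the caller keeps ownership.
std::size_t inspect(const Widget& w) { return w.size(); }

// std::unique_ptr by value: the caller surrenders ownership.
std::size_t consume(std::unique_ptr<Widget> w) {
  assert(w != nullptr);
  return w->size();  // the Widget is deleted when w goes out of scope
}

// std::shared_ptr by value: the function shares ownership for the call.
long share(std::shared_ptr<Widget> w) { return w.use_count(); }

bool signatures_behave_as_advertised() {
  auto unique = std::make_unique<Widget>();
  bool ok = inspect(*unique) == 3;             // still ours afterwards
  ok = ok && consume(std::move(unique)) == 3;  // ownership handed over
  ok = ok && unique == nullptr;                // nothing left to own

  auto shared = std::make_shared<Widget>();
  ok = ok && share(shared) == 2;         // caller plus the parameter
  ok = ok && shared.use_count() == 1;    // the function's interest is gone
  return ok;
}
```

The explicit std::move at the call site of consume is the caller's visible acknowledgment that ownership is leaving.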

If you deviate from these rules, you can introduce painfully subtle bugs into your codebase which will result in tiring arguments about ownership and responsibility. Objects end up being destroyed early or not at all. Do not transfer ownership by raw pointer or reference. If your function takes a pointer or a reference, do not pass it on to a constructor or another function without understanding the responsibilities of so doing.

gsl::owner

We’ve covered passing and returning values by raw pointer and by reference and seen that it is not a good idea. Users might infer ownership when they shouldn’t. Users might want ownership rights when they can’t have them. The correct course of action is to use smart pointers to indicate ownership.

Unfortunately, you may be working in legacy code that can’t be modified very much. It may be part of a dependency of other legacy code that is relying on the ABI. Exchanging pointers for smart pointers would change the layout of any objects that contain them, breaking the ABI.

It’s time to introduce properly the Guidelines Support Library (GSL). This is a small library of facilities designed to support the Core Guidelines. There are a lot of items in the Core Guidelines, some of which are very hard to enforce. The use of raw pointers is a case in point: how do you signal ownership of a pointer if you can’t use smart pointers? The GSL provides types to aid enforcement.

The GSL is divided into five parts.

  • GSL.view: these types allow the user to distinguish between owning and non-owning pointers, and between pointers to a single object and pointers to the first element of a sequence.

  • GSL.owner: these are ownership pointers, which include std::unique_ptr and std::shared_ptr as well as stack_array (a stack allocated array) and dyn_array (a heap allocated array).

  • GSL.assert: these foreshadow the contracts proposal by providing two macros, Expects and Ensures.

  • GSL.util: no library is complete without a homeless bag of useful things.

  • GSL.concept: this is a collection of type predicates.

The GSL predates C++17, and parts of the GSL, particularly the concept section, have been superseded by Standard C++. It’s available on GitHub at https://github.com/Microsoft/GSL. Simply #include <gsl/gsl> to get the full set of objects.

This chapter is most concerned with one of the view types, gsl::owner<T*>. Let’s look at an example:

#include <gsl/gsl>

gsl::owner<int*> produce()       // You will become the proud owner
{
  gsl::owner<int*> i = new int;  // You're the owner
  return i;                      // Passing ownership out of the function
}

void consume(gsl::owner<int*> i) // Taking on ownership
{
  delete i;                      // It's yours, you can destroy it
}

void p_and_c()
{
  auto i = produce();           // create…
  consume(i);                   // …and destroy
}

As you can see, enclosing the pointer with owner<> signals that ownership is being established. Let’s change things a little:

int* produce()                            // Just a raw pointer
{
  gsl::owner<int*> i = new int;
  return i;
}

What happens now?

You might be forgiven for thinking that the compiler will warn you that you are passing to an unowned pointer from an object signaling ownership. Unfortunately, this is not the case. The definition of owner is:

template <class T,
          class = std::enable_if_t<std::is_pointer<T>::value>>
using owner = T;

As you can see, there is no magic here. The definition of gsl::owner is very simple: if T is a pointer type, then gsl::owner<T> is an alias to T, otherwise it is undefined.

The purpose of this type is not to enforce ownership, but rather to hint to the user that there is a change in ownership going on. Rather than embedding this information in the function name, it is embedded in the type. While it is quite possible to create a type named owner that does all the required enforcement to correctly track and maintain ownership, there is no need: std::shared_ptr and std::unique_ptr do this job entirely adequately. The gsl::owner type is merely syntactic sugar that can be dropped into an existing codebase with no impact to the code, the ABI, or the execution, but with a large impact on readability and comprehension, as well as a contribution to the efficacy of static analyzers and code reviews.
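The claim that the alias changes nothing can be checked at compile time. So that the sketch builds without the GSL installed, it declares a local owner alias that mirrors the definition quoted above; with the real library you would use gsl::owner instead. Payload, LegacyNode, and AnnotatedNode are hypothetical types.

```cpp
#include <cassert>
#include <type_traits>

// Local stand-in mirroring the gsl::owner definition quoted above.
template <class T,
          class = std::enable_if_t<std::is_pointer<T>::value>>
using owner = T;

struct Payload {  // hypothetical
  int value;
};

struct LegacyNode {  // the original declaration
  Payload* data;
};

struct AnnotatedNode {  // the same member, now documented as owning
  owner<Payload*> data;
};

// The alias is exactly the underlying pointer type...
static_assert(std::is_same<owner<int*>, int*>::value,
              "owner<T> is T itself");
// ...so annotating a member does not change the object's size.
static_assert(sizeof(LegacyNode) == sizeof(AnnotatedNode),
              "annotation does not change the size");

bool owner_converts_silently() {
  owner<int*> owned = new int(5);
  int* raw = owned;      // compiles without any diagnostic
  assert(raw == owned);  // both names refer to the same object
  const bool same_value = (*raw == 5);
  delete owned;          // the documented owner cleans up
  return same_value;
}
```

Because the two types are identical, any enforcement has to come from a static analyzer or a reviewer rather than from the compiler.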

As the GSL becomes more widely used we might expect IDEs to learn about its types and warn about abuses of ownership through signals in the editor, such as red underlining or floating lightbulb hints. Until then, do not use gsl::owner as an enforcement type, but rather as a documentation type. Ultimately, treat gsl::owner as a last resort when you are truly unable to use higher-level ownership abstractions.

Summary

In summary:

  • Owning something means being responsible for something.

  • C++ has smart pointers for unambiguously signaling ownership.

  • Use smart pointers to signal ownership, or gsl::owner<T>.

  • Do not assume ownership from a raw pointer or a reference.