Chapter 4.3

ES.5: Keep scopes small

The nature of scope

“Scope” is another of those words that is overloaded with meaning. It comes from computer science, but each programming language puts its own little twist on it. In the specific case of C++, scope is a property of the declarations in your code, and it is where visibility and lifetime intersect.

All declarations in your program appear in one or more scopes. Scopes can nest, like namespaces, and most scopes are introduced with an identifier. Names are only visible in the scope they are declared in, but the lifetime of objects is not necessarily limited to the scope of their name. This fuzziness catches engineers out time and again when it comes to objects of dynamic storage duration.

Deterministic destruction is the preeminent feature of C++. When the name of an object of automatic storage duration falls out of scope, the object is destroyed. Its destructor is invoked, and everything is cleaned up. There is no waiting around for a garbage collector to work its magic and sweep the floor, which is usually the case with managed languages, and which can lead to unpleasant nondeterministic side effects such as running out of resources at inopportune moments, or cleanup being skipped entirely. However, when the name of a raw pointer falls out of scope, it is the pointer that is destroyed, not the object it points to, leaving the object to persist without a name and thus without a means of having its destructor invoked.
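The difference can be sketched with a small counting type. Tracer and live_after_block are hypothetical names introduced only for this illustration; the sketch assumes nothing beyond the standard library.

```cpp
#include <cassert>

// Hypothetical helper: counts how many Tracer objects are currently alive.
struct Tracer {
  explicit Tracer(int& live) : live_(live) { ++live_; }
  ~Tracer() { --live_; }
  Tracer(const Tracer&) = delete;
  Tracer& operator=(const Tracer&) = delete;
  int& live_;
};

// Returns the number of Tracer objects still alive once the inner block closes.
int live_after_block() {
  int live = 0;
  Tracer* escaped = nullptr;
  {
    Tracer automatic(live);              // automatic storage duration
    Tracer* dynamic = new Tracer(live);  // dynamic storage duration
    escaped = dynamic;                   // a second name for the same pointer value
    assert(live == 2);
  }  // automatic is destroyed; the pointer named dynamic is destroyed, its object is not
  const int still_alive = live;  // the object created with new is still alive
  delete escaped;                // without this line the object would leak
  assert(live == 0);
  return still_alive;
}
```

Calling live_after_block() returns 1: the object with automatic storage duration was destroyed at the closing brace, while the object created with new outlived the name it was first bound to.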

The lifetime of an object of dynamic storage duration is only loosely bound to the scope of its name. Indeed, the pointer can be assigned to another name and survive beyond the scope of the name it was first bound to. It can be assigned to many names and cause all manner of headaches when it is to be decided when the object should be destroyed. This is why we have the std::shared_ptr class. When the last name bound to a std::shared_ptr object falls out of scope, the object it points to is destroyed. We looked at this where we discussed Core Guideline I.11: “Never transfer ownership by a raw pointer (T*) or reference (T&).” Reference counting is used to do all the bookkeeping. The reference count is incremented when a name is bound to the std::shared_ptr object, and decremented when the name falls out of scope.
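As a minimal sketch of that bookkeeping, the reference count can be observed through std::shared_ptr::use_count. The names Counts and shared_count_demo are assumptions made for this illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical record of the reference count at three points in time.
struct Counts {
  long before;
  long inside;
  long after;
};

Counts shared_count_demo() {
  auto first = std::make_shared<int>(42);  // one name bound
  Counts counts{first.use_count(), 0, 0};
  {
    auto second = first;                   // a second name bound to the same object
    assert(second.get() == first.get());
    counts.inside = first.use_count();
  }                                        // second falls out of scope
  counts.after = first.use_count();
  return counts;
}  // first falls out of scope here and the int is destroyed
```

In a single-threaded run the three counts are 1, 2, and 1. Note that use_count is intended for illustration and debugging rather than for program logic.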

It is important to remember that names have scope, not objects. Objects have lifetime, or more precisely, storage duration. There are four types of storage duration: static, dynamic, automatic, and thread-local. How many scopes can you name? More than two? In fact, there are six. They are:

Block
Namespace
Class
Function parameter
Enumeration
Template parameter

We shall now look at them all.

Block scope

Block scope is probably the first one you thought of, although you might not know its name. A block, or compound statement, is a sequence of statements enclosed by braces, as in the following example:

if (x > 15) {
  auto y = 200;
  auto z = do_work(y);
  auto gamma = do_more_work(z);
  inform_user(z, gamma);
}

The body of the if statement is a single block scope. This is a healthy example of the form. It is a small scope. It is easy to see what is going on: y, z, and gamma are declared, and then destroyed at the end of the scope after inform_user returns. Block scopes are found after function definitions, control statements, and so on.

Scopes can be nested. Block scopes can nest by simply opening another pair of braces:

if (x > 15) {
  auto y = 200;
  auto z = do_work(y);
  {                              // New scope
    auto y = z;                  // Scope of the next y starts here
    y += mystery_factor(8);      // Still referring to the next y
    inform_user(y, 12);          // Still referring to the next y
  }                              // Scope of the next y ends here
  y = do_more_work(z);           // Scope of first y resumes here
  inform_user(z, y);
}

This example demonstrates a discontiguous scope, and there are some interesting things going on here. The symbol y has been “shadowed” in a nested scope, so there are two objects called y. The first y lives in the outer scope and is no longer in scope once the scope of the next y starts. It comes back in scope at the end of the nested scope, which means its scope is broken up, or discontiguous.

This example is not as healthy as the first one. It is entirely legal, although at best unwise, to reuse names in nested scopes. This is the sort of thing that happens when you paste code from another location and is yet another reason why pasting code is a last resort. The scope and the nested scope take up fewer than a dozen lines that are readable after a fashion, but it is only a matter of time before the code is amended and expanded. Once that happens, the likelihood of the two objects named y becoming mixed up by the author increases.

Some implementations will warn you if you mask a name in this way. Indeed, my implementation of choice introduced this and interfered quite heavily with our no-warnings policy. My stubborn refusal to disable warnings meant that the largest part of the upgrade to this new version of the compiler was fixing all instances of this, uncovering a surprising number of bugs in the process.
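The hazard can be sketched as follows. The bodies of do_work and mystery_factor are hypothetical stand-ins, since the text does not define them, and both enclosing functions are invented for this illustration. GCC and Clang report the inner declaration in the first function when -Wshadow is enabled.

```cpp
// Hypothetical stand-ins for the functions named in the example above.
constexpr int do_work(int v) { return v + 1; }
constexpr int mystery_factor(int v) { return v * 2; }

// The inner y hides the outer y, so the adjustment never reaches the outer object.
constexpr int with_shadowing() {
  int y = 200;
  int z = do_work(y);
  {
    int y = z;               // hides the outer y
    y += mystery_factor(8);  // modifies the inner y only
  }
  return y;                  // the outer y, untouched
}

// One name per object: nothing is hidden and the intent is unambiguous.
constexpr int without_shadowing() {
  int y = 200;
  int z = do_work(y);
  int adjusted = z;
  adjusted += mystery_factor(8);
  return adjusted;
}
```

With these stand-ins, with_shadowing() yields 200 and without_shadowing() yields 217. If the author of the first function meant to adjust the outer y, the compiler will not say so unless the warning is on.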

Namespace scope

Like block scope, namespace scope begins after an opening brace. Particularly, it is the brace after the namespace identifier, thus:

namespace CG30 { // Scope begins here

Any symbol declared before the matching closing brace is visible from its point of declaration. Unlike block scope, it remains visible after the closing brace in all subsequent definitions for the same namespace, for example:

namespace CG30 {
  auto y = 76; // Scope of y begins here
} // Scope of y is interrupted here

…

namespace CG30 { // Scope of y resumes here
  auto x = y;
}

While the scope of y is interrupted, you can still reference it by explicitly resolving the scope. This is done by using ::, the scope resolution operator, and prefixing with the scope you are intending to resolve to. To refer to y declared in the namespace CG30, you write CG30::y.
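A minimal sketch of both forms of lookup follows. It assumes constexpr variables so that the results can be checked at compile time, and the namespace Other is hypothetical.

```cpp
namespace CG30 {
  constexpr auto y = 76;  // scope of y begins here
}                         // scope of y is interrupted here

namespace Other {         // hypothetical namespace outside CG30
  constexpr auto copy_of_y = CG30::y;  // qualified name reaches y while its scope is interrupted
}

namespace CG30 {          // scope of y resumes here
  constexpr auto x = y;   // unqualified name finds CG30::y
}
```

Writing plain y inside Other would fail to compile, since no declaration of y is in scope there.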

There is one occasion where namespace scope does not begin after an opening brace, and that is for the outermost scope. The start of a translation unit is the beginning of a namespace called global namespace scope. Habitually, this scope gets called file scope or global scope, but those are hangovers from C. Now that we have namespaces, we have a more accurate name.

Since global namespace scope is never interrupted, symbols declared there are visible everywhere. While this is enormously convenient, it is an equally enormously terrible idea to declare anything other than namespaces at global namespace scope, with the exception of main() and operator overloads whose operand types are declared in different namespaces. Globals are bad, m’kay?

There is another special namespace called the anonymous namespace. Symbols declared in an anonymous namespace are in scope until the end of the enclosing scope and have internal linkage. For example:

namespace {
  auto a = 17; // private to the current translation unit.
}

Having mentioned linkage, we need to clearly differentiate between scope, storage duration, and linkage. We need to be clear on the difference between a name and an object. A name has a scope that determines when that name is visible without requiring scope resolution. An object has a storage duration that determines the lifetime of that object. An object binds to a name.

Objects with static or thread-local storage duration also have a linkage, which is internal or external. Internal linkage makes an object inaccessible from another translation unit. Objects with automatic storage duration have no linkage.

Objects with dynamic storage duration do not bind to a name, nor do they have linkage. They bind to a pointer that binds to a name. That indirection is what causes so many problems with memory leaks, but it is also a reason why C++ delivers superior performance: the engineer can precisely schedule the lifetime of the object rather than rely on garbage collection.

If you open an anonymous namespace at global namespace scope, then all the symbols will be in scope until the end of the translation unit. This can be particularly unpleasant if you open an anonymous namespace in a header file: the translation unit will typically end long after the end of a #include directive. Additionally, if you reuse the header file, you will end up with multiple instances of the same defined entity, which may not be what you intended. If you are going to open an anonymous namespace at global namespace scope, do not do it in a header file.

namespace CG30 {
  auto y = 76;
  namespace {
    auto x = 67;
  } // x is still in scope
  auto z = x;
} // Scope of x is interrupted here.

namespace {
  constexpr auto pi = 3.14159f;
} // pi is still in scope

The final namespace scope to consider is the inline namespace scope. Like the anonymous namespace, the scope of symbols declared in an inline namespace is not interrupted at the end of that namespace, but at the end of the enclosing namespace, like so:

namespace CG30 {
  auto y = 76;
  inline namespace version_1 {
    auto x = 67;
  } // x is still in scope
  auto z = x;
} // Scope of x is interrupted here
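Seen from outside CG30, the effect is that members of the inline namespace can be named as though they belonged to the enclosing namespace. The following sketch assumes constexpr variables for compile-time checking and adds a hypothetical non-inline version_0 for contrast.

```cpp
namespace CG30 {
  inline namespace version_1 {
    constexpr int x = 67;
  }                       // x is still in scope
  namespace version_0 {   // hypothetical older version, not inline
    constexpr int x = 50;
  }                       // this x is no longer in scope
  constexpr int z = x;    // finds version_1::x only
}

constexpr int current = CG30::x;             // resolves to CG30::version_1::x
constexpr int previous = CG30::version_0::x; // the non-inline version must be named in full
```

This is one reason inline namespaces are commonly used for versioning: callers write CG30::x and receive whichever version is currently marked inline.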

As you can see, there is a little more to namespaces than meets the eye. The global namespace scope is the largest possible scope, so keeping scopes small means not declaring any symbols in it. We can see a useful overlap of guidelines here. Also, keeping a namespace scope small makes it easier to apprehend the contents. Long, rambling namespaces lose their coherence and should be avoided.

Class scope

Class scope represents yet another variation on the block scope. The scope of a symbol declared in a class begins at the point of declaration and persists beyond the end of the class definition. It includes all default arguments for member-function parameters, exception specifications, and the member function bodies:

class RPM {
  static constexpr int LP = 33;
  static constexpr int Single = 45;

public:
  static constexpr int SeventyEight = 78;
  int RotorMax(int x, int y = LP);
}; // RotorMax, LP, Single and SeventyEight remain in scope
   // within member function bodies of RPM

int RPM::RotorMax(int x, int y)
{
  return LP + x + y;
}

While the scope of SeventyEight is interrupted, you can still reference it by explicitly resolving the scope since it is a public member of the class. This is done by, again, using ::, the scope resolution operator, and writing RPM::SeventyEight. The scope resolution operator narrows the number of places for the compiler to search for the name.

What does keeping class scope small entail? It means keeping the interface minimal and complete. As an interface grows it loses coherence.

We have all seen the epic interface: that one class in the project that has become a home for waifs and strays, with a name like Manager or Globals or Broker, and a nickname like The Blob. Badly named classes invite badly designed APIs. Classes with broad names invite broad APIs. Manager is both a bad and broad name for a class. The increasing cost of a big API is the loss of meaning and the brake that it applies to development: whenever you are forced to interact with The Blob, you need to parse screenfuls of interface and, with luck, many pages of documentation.

There are several techniques for keeping class scopes small. In fact, some of these are expressed as Core Guidelines. Both C.45: “Don’t define a default constructor that only initializes data members; use member initializers instead” and C.131: “Avoid trivial getters and setters,” discussed in the first section, have the side effect of reducing class scope size. In the case of C.45, one fewer function implies smaller scope because the class definition is smaller and there is one fewer member function definition, which is also part of the scope of the class. In the case of C.131, the same reasoning applies: fewer member functions imply smaller scope.

Core Guideline C.4: “Make a function a member only if it needs direct access to the representation of a class” reduces scope by replacing member functions with nonmember nonfriend functions where possible. These functions may be declared near the class, but they do not need to be declared in the scope of the class. Of course, this increases the size of the namespace scope, but any declaration is going to have an impact on a scope somewhere.
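As a sketch of C.4 in practice, consider a hypothetical Interval class in a hypothetical geometry namespace. Only the functions that maintain or expose the invariant are members; the rest are written against the public interface and live beside the class.

```cpp
namespace geometry {  // hypothetical namespace for this illustration

class Interval {
public:
  // Establishes the invariant regardless of argument order.
  constexpr Interval(int a, int b) : lo_(a < b ? a : b), hi_(a < b ? b : a) {}
  constexpr int lo() const { return lo_; }
  constexpr int hi() const { return hi_; }

private:
  int lo_;  // invariant: lo_ <= hi_
  int hi_;
};

// Nonmember, nonfriend functions: no access to the representation is needed,
// so they add nothing to the class scope.
constexpr int length(const Interval& i) { return i.hi() - i.lo(); }
constexpr bool contains(const Interval& i, int v) {
  return i.lo() <= v && v <= i.hi();
}

}  // namespace geometry
```

The class scope holds three names rather than five, and length and contains can be changed or extended without touching the class definition.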

Simply refusing to grow the interface of a class beyond a certain size will keep class scopes small. You might decide that, whenever a public interface exceeds 10 functions, it is time to examine the abstraction and see if there are two more inside, ready to be realized. Perhaps the class invariants can be partitioned into two, and a more refined pair of abstractions can be drawn out.

The same approach can be taken to shrinking The Blob. Checking the invariants (and often, a class will start off with a handful of invariants whose quantity grows slowly over time), gathering them, and partitioning them will highlight the set of abstractions that model the broad collection of concepts contained therein.

We hope that this overlap of guidelines is making it apparent that following the guideline of keeping scopes small is a golden rule that rewards you many times over.

Function parameter scope

Three scopes remain for discussion. Function parameter scope is a little like block scope but with the addition of the function signature. The scope of a function parameter starts at its point of declaration within the function signature and ends at the end of the function declaration or, if the function is being defined, at the end of the function body. For example:

float divide(float a, // Scope of a begins here
            float b); // Scope of a ends here

float divide(float a, // Scope of a begins here
            float b) {
  return a / b;

} // Scope of a ends here

There is another variation, which involves the use of a function try block:

float divide(float a, // Scope of a begins here
            float b)
try {
  std::cout << "Dividing\n";
  return a / b;
} catch (...) { // Scope of a continues beyond here
  std::cout << "Dividing failed, was the denominator zero?\n";
  return 0.0f; // A handler in a value-returning function must not flow off the end
} // Scope of a ends here

Function parameter scope ends at the end of the final catch clause.

This is rather like the block scope, which is to be expected since a function is like a big compound statement. Here is another overlap of guidelines. Consider F.3: “Keep functions short and simple.” This is served by keeping the function scope small. In fact, adhering to F.2: “A function should perform a single logical operation” will usually yield functions with small scopes.

Enumeration scope

The nature of the enumeration scope seems clear and keeping it small seems counterintuitive. After all, there are 56 two-letter US state abbreviations, 118 elements of which the news has come to Harvard, and 206 bones in the adult human body. These are constants, and they are independent of any edict to keep scopes small.

However, this is not the whole story. Look at these scopes:

enum US_state_abbreviation { // unscoped enumeration
  …
  VT,
  VA,
  VI,
  WA,
  WV,
  WI,
  WY
}; // Scope of VI (Virgin Islands) does not end here.

enum class Element { // scoped enumeration
  …
  Nh,
  Fl,
  Mc,
  Lv,
  Ts,
  Og
}; // Scope of Lv (Livermorium) ends here.

US_state_abbreviation southernmost = HI; // HI in scope
// Element lightweight = H;              // H not in scope
Element lightweight = Element::H;        // H in scope

While the scope of H is interrupted, you can still reference it by explicitly resolving the scope. This is done by, you guessed it, using ::, the scope resolution operator, and writing Element::H. The scope resolution operator tells the compiler “I mean this one, over here, look!”

Perhaps the name Element is itself in a namespace. Indeed, we would hope that is the case, since declaring anything other than a namespace at global namespace scope is a bad idea. In that case, you direct the compiler to the name Element by resolving its scope, for example, Chemistry::Element::H. You may draw parallels between this and the global Domain Name System, where top-level domains like .com, .net, and country codes resolve the scope; for example google.ie versus google.fr.

In the above example we have yet another overlap of guidelines. Enum.3: “Prefer class enums over “plain” enums” is motivated by the readiness with which enumerators convert to ints, leading to alarming consequences as outlined in the prior chapter. However, preferring enum classes, more properly known as scoped enumerations, will minimize the scope of an enumeration to its definition. It will keep the scope small. This is particularly advantageous for single-character identifiers, like H for hydrogen.
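Both effects, the leaked names and the implicit conversion, can be checked at compile time. In this sketch the enumerations are cut down to two enumerators, and the Legacy namespace and atomic_number function are hypothetical.

```cpp
#include <type_traits>

namespace Chemistry {
enum class Element { H = 1, He = 2 };  // scoped: H and He stay inside Element
}

namespace Legacy {  // hypothetical counterpart using an unscoped enumeration
enum Element { H = 1, He = 2 };        // unscoped: H and He spill into Legacy
}

static_assert(std::is_convertible<Legacy::Element, int>::value,
              "unscoped enumerators convert to int implicitly");
static_assert(!std::is_convertible<Chemistry::Element, int>::value,
              "scoped enumerators do not");

// With a scoped enumeration the conversion has to be spelled out.
constexpr int atomic_number(Chemistry::Element e) { return static_cast<int>(e); }

// The unscoped enumerator is reachable without naming Element at all.
constexpr int leaked = Legacy::H;
```

Chemistry::H does not compile, whereas Legacy::H does, which is exactly the difference in scope size that the guideline is after.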

Template parameter scope

For completeness, we shall review template parameter scope. The scope of a template parameter name begins at the point of declaration and ends at the end of the smallest template declaration in which it was introduced. For example:

template< typename T,            // scope of T begins
          T* p >                 // T remains in scope
class X : public std::pair<T, T> // T remains in scope
{
  …
  T m_instance;                  // T remains in scope
  …
};                               // scope of T ends

This is another scope whose size is dependent on external factors, such as the size of the class being defined. Keeping this scope small is unachievable without keeping the class scope small, so there is little to add to this part.

Scope as context

As you can see, keeping scopes small will reward you many times over. Fundamentally, scope is how we think about parts of things. The notions of scope and relevance are closely related. You can think of your code as a story made up of chapters, each chapter being a different scope. All the declarations in your code appear in a scope, ideally not the global scope, and it is an obvious step to associate related declarations in a single scope, and to pinpoint associations in minimal scopes.

Scopes are how abstractions are identified and enclosed. They are collections of names. Be it a class scope or a function scope or a namespace scope, it is the scope that contains the declarations relevant to that abstraction. They are the fundamental building blocks of your solution domain. Keeping your scopes small keeps your abstractions small as well.

Not all scopes have names, though, nor do all abstractions need names. It is perfectly acceptable to open a block scope within another scope and use it to perform some small, self-contained task; maybe it needs to be bounded by a std::scoped_lock. However, keep in mind that nested scopes carry the danger of hiding names. You may unwittingly interrupt the scope of an existing name by redeclaring it in a nested scope.
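A minimal sketch of such an unnamed block follows, assuming C++17 for std::scoped_lock. The Readings class and the usage function are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical class: a container that may be filled from several threads.
class Readings {
public:
  void add(int raw) {
    const int scaled = raw * 2;       // preparation that does not need the lock
    {
      std::scoped_lock lock(mutex_);  // lock held only for the duration of this block
      values_.push_back(scaled);
    }                                 // lock released here, deterministically
    // any further work happens without holding the lock
  }

  std::size_t size() const {
    std::scoped_lock lock(mutex_);
    return values_.size();
  }

private:
  mutable std::mutex mutex_;
  std::vector<int> values_;
};

// Exercises the class from a single thread: both entries must be present.
void usage() {
  Readings readings;
  readings.add(1);
  readings.add(2);
  assert(readings.size() == 2);
}
```

The nested block has no name and needs none. Its only job is to bound the lifetime of lock, so the mutex is held for as short a time as possible.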

Scope and duration are related but not always interchangeable. For example, names with global namespace scope are all bound to objects with static storage duration. However, names with enumeration scope are not bound to objects at all; they are merely literal constants requiring no storage.

Moving your attention from scope to scope requires a mental context switch, just as reading a book requires you to build a model of the plot and characters. Scope can be thought of as the immediate context for what is going on in your program. It partitions the solution domain, the thing that your program is achieving, into individual pieces that can be easily apprehended. By mastering the interplay of scope, context, and abstractions you can easily decompose any problem into manageable parts and make programming a joy.

Summary

In summary:

Names have scope, not objects.
Keep clear the distinction between scope and storage duration.
Readability is inversely proportional to scope size.
Keep scopes small to minimize retention of resources required by objects of automatic storage duration.
Beware of hiding names when nesting scopes.
Prefer scoped enumerations to keep scopes small.
Maintain minimal and complete interfaces to keep scopes small.
Keep scopes small to optimize abstraction.