Chapter 4.2

Enum.3: Prefer class enums over “plain” enums

Constants

Constants are great. Types are great. Constants of a specific type are really great. This is why class enums are just fantastic.

Historically, and we hope you no longer feel the need to do this, constants were defined as preprocessor macros. You might have seen something like this in a geometry-related source file:

#define PI 3.1415926535897932385

If you were unfortunate, in another file you might have seen something like this:

#define PI 3.1415926 // fine for float, insufficient for double

Or this:

#define PI 3.1415926535987932385 // mis-transcribed

Or this:

#define Pi 3.1415926535897932385 // Slightly different name

They may even have been defined in header files, just to really spoil your day. These preprocessor symbols have no type, nor scope. They are simply lexically substituted during the preprocessor phase. An early win for C++ was the realization that you could declare scoped objects of const-qualified type (please never call them const variables). A single, well-placed definition of pi was a welcome sight. In fact, since C++20 we have a standard definition of pi. You will find it in the <numbers> header, defined in the namespace std::numbers:

template <>
inline constexpr double pi_v<double> = 3.141592653589793;
inline constexpr double pi = pi_v<double>;
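As a minimal sketch of the const-qualified alternative to the macro, the following defines pi exactly once, with a type and a scope. The namespace geometry and the function circle_area are hypothetical names chosen for illustration; where C++20 is available you would probably reach for std::numbers::pi instead.

```cpp
namespace geometry {
// One typed, scoped definition instead of a macro per source file.
constexpr double pi = 3.141592653589793;

// Hypothetical helper: area of a circle with the given radius.
constexpr double circle_area(double radius) {
  return pi * radius * radius;
}
}  // namespace geometry

// The constant is usable at compile time and has type double.
static_assert(geometry::circle_area(1.0) == geometry::pi,
              "a unit circle has area pi");
```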

Some constants are important, but their values are arbitrary. Unlike PI, or E, or MONTHS_IN_YEAR, there are times when we need a handful of named values to represent some ideas. These have always been in the form of small integers, such as 1 for edit, 2 for browse, -1 to quit, and so on. There is still code out there with vast swathes of macros defining related integers. From WinUser.h, part of the Windows SDK:

#define WM_CTLCOLORSCROLLBAR            0x0137
#define WM_CTLCOLORSTATIC               0x0138
#define MN_GETHMENU                     0x01E1

#define WM_MOUSEFIRST                   0x0200
#define WM_MOUSEMOVE                    0x0200
#define WM_LBUTTONDOWN                  0x0201

Why is there a jump from 0x01E1 to 0x0200? Most likely, there was a change in domain after MN_GETHMENU and there was no guarantee that there would be no further additions. It would be easy to identify the domain by looking at the second nibble. Or maybe it was completely arbitrary. We can never know. We cannot capture this information with simple preprocessor definitions.

Enumerated types provide a way of gathering constants together, which makes them ideal for identifying, for example, error values, thus creating a specific error abstraction. Rather than declaring:

#define OK 0
#define RECORD_NOT_FOUND 1
#define TABLE_NOT_FOUND 2

you can define an enumeration instead:

enum DB_error {
  OK,
  RECORD_NOT_FOUND,
  TABLE_NOT_FOUND
};

The enumerators have the same values as the preprocessor constants since enumerations start at zero by default and increment by one for each enumerator. They are spelled using uppercase letters since they are directly replacing those preprocessor constants. This should be the exception rather than the rule and enumerators should usually be spelled using lowercase letters. This is a matter of style rather than part of the standard: following this style prevents collisions with preprocessor symbols, which are conventionally spelled using uppercase letters.
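The default numbering can be checked at compile time. This is a small sketch rather than anything prescriptive; the function is_failure is a hypothetical helper added only to show the enumeration in use.

```cpp
enum DB_error {
  OK,
  RECORD_NOT_FOUND,
  TABLE_NOT_FOUND
};

// Enumerators count up from zero unless told otherwise.
static_assert(OK == 0, "first enumerator defaults to zero");
static_assert(RECORD_NOT_FOUND == 1, "each enumerator increments by one");
static_assert(TABLE_NOT_FOUND == 2, "each enumerator increments by one");

// Hypothetical helper: anything other than OK counts as a failure.
inline bool is_failure(DB_error e) {
  return e != OK;
}
```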

Unfortunately, the enum keyword does not define a scope, nor does it define an underlying type. This can lead to some interesting problems. Consider an enumeration of two-letter codes for US states. Here is a snippet:

enum US_state {
  …
  MP, // Great quiz question…
  OH,
  OK, // Uh-oh…
  OR,
  PA,
  …
};

Since the braces do not define a scope as normally expected, OK is now an ambiguous identifier. If the enumerations themselves are defined in unrelated scopes this is not a problem. Otherwise, we need to modify the enumerators to disambiguate them. OK is obviously a hugely useful identifier, so it cannot be allowed to exist unadorned. In the pre-C++11 world you would come across enumerators like S_OK, R_OK, E_OK, and so on, leading to somewhat illegible code. Indeed, using a single letter was a luxury afforded only to the biggest players. In the above example you would be far more likely to use DBE_OK or USS_OK. Once you had adorned one enumerator, you felt obliged to adorn all the others, with the result that your code would become scattered with TLAs and underbars prefixing all your enumerators.
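To illustrate the point about unrelated scopes, here is a sketch in which two unscoped enumerations each declare OK without colliding because each lives in its own namespace. The namespace names db and us, and the three-entry State enumeration, are hypothetical stand-ins for illustration only.

```cpp
namespace db {
enum Error { OK, RECORD_NOT_FOUND, TABLE_NOT_FOUND };
}  // namespace db

namespace us {
// Hypothetical, heavily abbreviated stand-in for the full list of states.
enum State { OH, OK, OR };
}  // namespace us

// Both enumerators are called OK, but the enclosing namespace
// tells them apart.
static_assert(db::OK == 0, "first enumerator of db::Error");
static_assert(us::OK == 1, "second enumerator of us::State");
```

The cost is that the qualification comes from the namespace rather than the enumeration itself, so a using-directive can silently reintroduce the ambiguity.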

Fortunately, although ugly and inconvenient, this hindrance would manifest at compile time in the form of a simple error, easily resolved by uglifying your colliding enumerators a little bit more. The other, rather more insidious problem was that of implicit conversion. Functions could cheerfully return an enumerator like OK, or more likely DBE_OK, and that value could be freely converted to an int. The same was true the other way around: a function could take an int but be passed an enumerator. This leads to interesting bugs where you pass an enumerator from one enumeration and it might be interpreted as an enumerator from another enumeration.
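The following sketch shows how such a bug can slip through. The enumeration Door_state and the function succeeded are hypothetical, invented here only to demonstrate the effect.

```cpp
enum DB_error { DBE_OK, DBE_RECORD_NOT_FOUND, DBE_TABLE_NOT_FOUND };

// Hypothetical, unrelated enumeration.
enum Door_state { DS_OPEN, DS_CLOSED };

// Takes an int, so an enumerator from any unscoped enumeration
// is silently accepted.
inline bool succeeded(int error_code) {
  return error_code == DBE_OK;
}
```

Calling succeeded(DS_OPEN) compiles without complaint and returns true, because DS_OPEN and DBE_OK both convert to zero.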

Scoped enumerations

C++11 expanded the enum keyword and added two new features. The first of these was the scoped enumeration. This introduced an amendment to the syntax by adding the keyword struct or class to the declaration:

enum class DB_error { // Scoped, and now lowercase identifiers…
  OK, // …except for OK which is uppercase anyway.
  record_not_found,
  table_not_found
};

enum struct US_state {
  …
  MP, // Northern Mariana Islands, since you ask…
  OH,
  OK,
  OR,
  PA,
  …
};

The scoped enumeration provides the enumerators with a scope, and thus a way of disambiguating them from identically named enumerators elsewhere. When you use a scoped enumerator, you explicitly resolve the scope using the scope resolution operator, which might look something like this:

static_assert(DB_error::OK != US_state::OK);

except this will not compile since the left- and right-hand sides of the != sign are of different types, requiring an explicit operator!= overload.
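One way to make that comparison compile is sketched below. The three-entry US_state is an abbreviated stand-in for the full enumeration, and the operator!= overload, which compares the underlying integer values, is a hypothetical choice; whether comparing a database error with a US state is meaningful at all is a separate design question.

```cpp
enum class DB_error { OK, record_not_found, table_not_found };

// Abbreviated stand-in for the full enumeration of states.
enum struct US_state { OH, OK, OR };

// Hypothetical overload: compare the underlying integer values.
constexpr bool operator!=(DB_error lhs, US_state rhs) {
  return static_cast<int>(lhs) != static_cast<int>(rhs);
}

// DB_error::OK is 0 and, in this stand-in, US_state::OK is 1.
static_assert(DB_error::OK != US_state::OK, "underlying values differ");
```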

There is no guidance on when to use struct or class. Personally, I use class when I have defined other operations on the enumeration. For example, consider a days-of-the-week enumeration:

enum class Day {
  monday,
  tuesday,
  wednesday,
  thursday,
  friday,
  saturday,
  sunday
};

Perhaps you want to be able to cycle through the days of the week, so you might decide to define a pre-increment operator:

constexpr Day& operator++(Day& d) { // pre-increment conventionally returns a reference
  switch (d) {
  case Day::monday:    d = Day::tuesday;   break;
  case Day::tuesday:   d = Day::wednesday; break;
  case Day::wednesday: d = Day::thursday;  break;
  case Day::thursday:  d = Day::friday;    break;
  case Day::friday:    d = Day::saturday;  break;
  case Day::saturday:  d = Day::sunday;    break;
  case Day::sunday:    d = Day::monday;    break;
  }
  return d;
}

Day today = Day::saturday;
Day tomorrow = ++today;
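An alternative sketch computes the next day arithmetically instead of enumerating every case. It assumes the seven enumerators are contiguous and start at zero, which holds for Day as declared above but would break silently if values were ever assigned explicitly. The name next_day is hypothetical.

```cpp
enum class Day {
  monday,
  tuesday,
  wednesday,
  thursday,
  friday,
  saturday,
  sunday
};

// Assumes seven contiguous enumerators starting at zero.
constexpr Day next_day(Day d) {
  return static_cast<Day>((static_cast<int>(d) + 1) % 7);
}

static_assert(next_day(Day::saturday) == Day::sunday, "ordinary step");
static_assert(next_day(Day::sunday) == Day::monday, "wraps around");
```

The switch version is longer but makes no assumption about the enumerator values.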

Underlying type

Enumerations have an underlying type which can be defined at the point of declaration or definition. One advantage this gives is that, since C++11, enumerations can be declared without being defined, since the size of the type can be inferred. If no underlying type is specified, then a default is used. An unscoped enumeration that relies on the default cannot be forward-declared, because its underlying type is not known until its enumerators have been seen. The default underlying type depends on whether the enumeration is scoped or unscoped.
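A short sketch of declaring an enumeration before defining it follows. The names Colour and Pixel are hypothetical; the point is only that the fixed underlying type makes Colour a complete type before its enumerators are known.

```cpp
// Declared but not yet defined: the underlying type fixes the size.
enum class Colour : unsigned char;

// Hypothetical struct that stores a Colour before the enumerators
// have been seen.
struct Pixel {
  Colour colour;
  unsigned char alpha;
};

// The definition, which could live in a different header.
enum class Colour : unsigned char { red, green, blue };

static_assert(sizeof(Colour) == 1, "size follows the underlying type");
```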

If the enumeration is unscoped, then the underlying type is an implementation-defined integral type that can represent all the enumerator values. Looking at our days of the week, the enumerator values range from zero to six, so one might expect the underlying type to be char. If the enumeration is scoped, then the underlying type is int. This may seem a little wasteful. On a typical implementation, a char would be sufficient. To specify the underlying type, new syntax was added in C++11, thus:

enum class Day : char {
  monday,
  tuesday,
  wednesday,
  thursday,
  friday,
  saturday,
  sunday
};

This syntax is available to both scoped and unscoped enumerations. Specifying the underlying type should be restricted to situations where it is necessary; the three-byte saving is only going to be noticeable if you have thousands of instances of an object storing Day instances. The default is the easiest to read and write. However, specifying the type can also aid ABI compatibility.
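The difference between the default and an explicit underlying type can be observed with std::underlying_type. In this sketch the names Compact_day and Default_day are hypothetical, introduced so that both variants can sit side by side.

```cpp
#include <type_traits>

enum class Compact_day : char {
  monday, tuesday, wednesday, thursday, friday, saturday, sunday
};

enum class Default_day {
  monday, tuesday, wednesday, thursday, friday, saturday, sunday
};

// The explicit underlying type is honored...
static_assert(
    std::is_same<std::underlying_type<Compact_day>::type, char>::value,
    "explicitly char");
// ...and a scoped enumeration without one falls back to int.
static_assert(
    std::is_same<std::underlying_type<Default_day>::type, int>::value,
    "scoped enumerations default to int");

static_assert(sizeof(Compact_day) == 1, "one byte per value");
static_assert(sizeof(Default_day) == sizeof(int), "same size as int");
```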

Enumerations are also used to define power-of-two constants for bitwise masking. For example:

enum personal_quality {
  reliable = 0x00000001,
  warm = 0x00000002,
  punctual = 0x00000004,
  …
  generous = 0x40000000,
  thoughtful = 0x80000000
};

The underlying type could be int if it stopped at generous, but thoughtful requires an unsigned int. In the general case you should not specify the value of enumerators: it can lead to typing errors and it can degrade the performance of switch statements. However, this is an exception.
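As a sketch of how such masks are typically combined and tested, the following fixes the underlying type to unsigned int so that the top bit is representable. It assumes a 32-bit unsigned int, leaves out the enumerators between punctual and generous, and uses a hypothetical helper named has_quality.

```cpp
enum personal_quality : unsigned int {
  reliable   = 0x00000001,
  warm       = 0x00000002,
  punctual   = 0x00000004,
  // intermediate qualities omitted from this sketch
  generous   = 0x40000000,
  thoughtful = 0x80000000  // needs the top bit, hence unsigned
};

// Hypothetical helper: is the given bit set in the mask?
inline bool has_quality(unsigned int qualities, personal_quality q) {
  return (qualities & q) != 0;
}
```

A set of qualities is then built with bitwise or, for example reliable | thoughtful, and queried with has_quality.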

Prior to C++11 there was something of a gray area. Some implementations would permit forward declaration of enumerations by fixing their size to 32 bits unless an enumerator exceeded the maximum representable value. However, this was not guaranteed to be portable. This is the problem with implementation-defined parts of the standard: you need to be able to identify how all your target implementations define them. This is still the case: an unscoped enumeration that does not specify an underlying type defaults to an implementation-defined underlying type.

Implicit conversion

Another feature of the unscoped enumeration is that it is freely convertible to an int. A common bug of yore was to pass an enumerator to a function taking an int. This was considered perfectly acceptable and common practice, but the enumerator value may have different meanings in different scopes.

// territory.h
enum US_state { // unscoped enumeration
  …
  MP, // There are four other territories
  OH,
  OK,
  OR,
  PA,
  …
};
…
void submit_state_information(int); // Hmm, US state or nation state?

// US_reporting.cpp
submit_state_information(OH); // OH is from an unscoped enumeration

With luck, submit_state_information does indeed take a US state and not a nation state. Unfortunately, with that API, there is no way of being sure.

You are still able to convert scoped enumerations to their underlying type, but you have to do it explicitly with static_cast, naming the target type either directly or via std::underlying_type:

// territory.h
enum struct US_state { // scoped enumeration
  …
  MP,
  OH,
  OK,
  OR,
  PA,
  …
};
…
void submit_state_information(int);

// US_reporting.cpp
submit_state_information(static_cast<int>(US_state::OH));

That explicit cast demonstrates that you are personally taking responsibility for the effects of this potentially dangerous activity. In the above invocation, you have decided that submit_state_information takes a US state. Casting is always a good place to start looking for weird bugs.

Alternatively, you can construct an int from the enumeration:

// US_reporting.cpp
submit_state_information(int(US_state::OH));
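If you would rather not hard-code int at every call site, a small helper can look up the underlying type for you. This is a sketch: underlying_value is a hypothetical name, and the three-entry US_state is an abbreviated stand-in for the full enumeration. C++23 provides std::to_underlying in <utility> for the same purpose.

```cpp
#include <type_traits>

// Abbreviated stand-in for the full enumeration of states.
enum struct US_state { OH, OK, OR };

// Hypothetical helper: cast any enumeration to its underlying type.
template <typename E>
constexpr typename std::underlying_type<E>::type underlying_value(E e) noexcept {
  return static_cast<typename std::underlying_type<E>::type>(e);
}

static_assert(underlying_value(US_state::OK) == 1,
              "second enumerator of the stand-in");
static_assert(
    std::is_same<decltype(underlying_value(US_state::OH)), int>::value,
    "the result has the enumeration's underlying type");
```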

Of course, this problem is down to bad API design. The function should have been more explicit about its parameter:

void submit_state_information(US_state);
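With that signature the compiler does the checking. The sketch below uses an abbreviated stand-in for US_state, a hypothetical Nation_state enumeration, and a stub body that merely records its argument in the hypothetical variable last_submitted; the static assertions confirm that neither an int nor another scoped enumeration converts implicitly.

```cpp
#include <type_traits>

// Abbreviated stand-in for the full enumeration of states.
enum struct US_state { OH, OK, OR };

// Hypothetical second enumeration.
enum struct Nation_state { FR, DE };

// Stub for illustration: remembers the last state it was given.
static US_state last_submitted = US_state::OH;
inline void submit_state_information(US_state state) {
  last_submitted = state;
}

// None of these conversions happens implicitly, so a mismatched
// call fails to compile instead of misbehaving at run time.
static_assert(!std::is_convertible<int, US_state>::value,
              "int does not convert to US_state");
static_assert(!std::is_convertible<Nation_state, US_state>::value,
              "unrelated enumerations do not convert");
static_assert(!std::is_convertible<US_state, int>::value,
              "US_state does not convert to int");
```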

Unfortunately, occasionally it can be unwise to rely on the best efforts of your colleagues and the safest approach is to seek clarification.

Summary

Prefer class (scoped) enumerations over “plain” (unscoped) enumerations to benefit from a reliable default underlying type and to enable disambiguation of symbols.

Simply adding the keyword class to all your enum declarations is cheap and improves safety by disallowing implicit conversion, while enabling you to improve readability by removing “decoration” from enumerators that was only added to make them unique.

Postscript

Back in the last decade I arranged for some C++ training for my team to be given by Scott Meyers. This was a pretty big deal for us, and he truly delivered. One of the things that sticks in my mind was that it was the first time I heard someone say enum differently to me. I am English, and I’ve only ever heard this word pronounced to rhyme with “bee-come.” He rhymed it with “resume,” as if he were beginning to say “enumeration” but lost interest after the second syllable.