This book is all about writing good code. The first piece of advice is therefore to write in ISO Standard C++. But what exactly is that?
C++ didn’t start out as a standardized language. It started out as an extension to the C programming language, called “C with classes,” invented by Bjarne Stroustrup.1 C wasn’t a standardized language at that time either: Bjarne delivered his extension as a preprocessor called Cpre. Its features included classes and derived classes, with public/private access levels, friends, assignment operator overloading, constructors, and destructors. Inline functions and default function arguments were also included, along with type-checking of function arguments.
1. Stroustrup, B, 1995. A History of C++: 1979–1991, www.stroustrup.com/hopl2.pdf.
In 1982 he started work on a fresh effort called C++, which added further features, including virtual functions, function and operator overloading, references, constants, and dynamic allocation. In addition, he created a C++ front end for C compilers called Cfront, which took in C++ code and translated it to C. He also wrote a book called The C++ Programming Language (often known as TCPL), which was published in 1985. This served as the definitive reference on what C++ was, and commercial compilers started to appear.
While these compilers were becoming widely used, Bjarne continued work on C++, adding further features to what became C++2.0. These included multiple inheritance, abstract base classes, static and const member functions, the protected access level, as well as improvements to existing features. There was a big leap in the popularity of C++. By Bjarne’s estimates, the number of users doubled every 7.5 months.
Conferences, journals, and books emerged, and the competing implementations of the compiler demonstrated that there needed to be something more precise than TCPL. In 1989 Dmitry Lenkov of HP wrote a proposal for American National Standards Institute (ANSI) standardization of C++, identifying the need for a careful and detailed definition of each language feature to prevent the growth of dialects, and also identifying necessary additional features such as exception handling and a standard library. The ANSI C++ committee, X3J16, first met in December 1989. The Annotated Reference Manual, or ARM, written by Margaret Ellis and Bjarne and published in 1990, became the single description of the whole of C++. It was written specifically to get the ANSI C++ standardization effort off to the best possible start.
It wasn’t just an American concern, of course, and many international representatives attended. In 1991 the ISO C++ committee WG21 was convened and the two committees held joint meetings from then on. The goal was to write a draft standard for public review in four years with the hope of an official standard two years later. However, the first standard, ISO/IEC 14882:1998, was finally published in September 1998, not quite nine years after the first meeting.
This was not the end of the story, though. Work on bug fixes to the standard continued, and in 2003 C++03 was released. Further work was undertaken to add additional features to develop the language further. This included auto, constexpr, decltype, move semantics, range for, uniform initialization, lambdas, rvalue references, static assertions, variadic templates... the list went on, as did the development schedule. Eventually, the next version was shipped in 2011 before everyone forgot that C++ was a growing language.
Given that C++03 was a correction to C++98, this meant there was a 13-year gap between the first standard and C++11. It was clear that such a long period between standards was not in anybody’s interests and so the “train model” was developed: a new standard would be shipped every three years, and if a feature wasn’t ready, it would hop on the next “train” three years later. Since then, C++14, C++17, and C++20 have been shipped on schedule.
The standard has very little to say about what is required of the environment in which a C++ program executes. An operating system is not a requirement. File storage is not a requirement. A screen is not a requirement. A program written for a typical desktop environment may need mouse input and windowed output, which requires specialized code for each particular system.
Writing fully portable code for such a program is not feasible. ISO Standard C++ has a very small library compared to languages like C# and Java. The standard is a specification for implementers of ISO Standard C++. The C# and Java standard libraries are supplied by the owners of the language, but C++ does not have a funded library development organization. You need to use the individual features of each target environment to support those parts not available in the standard library. These will be offered in the form of a header file and a library file; typically, there will be many of these per system. As far as possible, hide those away behind your own interfaces. Minimize the amount of variation between codebase versions targeting different systems.
For example, you might want to know if a particular key on the user’s keyboard is being pressed. One approach might be to use the preprocessor to detect which platform you are using and execute the appropriate piece of code, like this:
#if defined WIN32
auto a_pressed = bool{(GetKeyState('A') & 0x8000) != 0};
#elif defined LINUX
auto a_pressed = /*really quite a lot of code*/
#endif
This is very ugly: it is operating at the wrong level of abstraction. The code that is specific to Windows and Linux2 should live in separate files elsewhere, exposed in a header file, so the code should look like this:
2. https://stackoverflow.com/questions/41600981/how-do-i-check-if-a-key-is-pressed-on-c
auto a_pressed = key_state('A');
The function key_state is an interface that encapsulates this extension. The implementation does the right thing for the appropriate platform, away from your flow of control and without the additional baggage of preprocessor macros. Separating each implementation into a separate file further supports that abstraction.
C++ compiler implementers must entirely and precisely support the standard if they want to announce that their compiler is standard-compliant. However, this does not tie their hands entirely, and leaves the door open for them to add additional features or extensions. For example, GCC included additional type traits such as __has_trivial_constructor and __is_abstract before they were added to the standard. These have both been present in the type traits library since C++11 under different names: std::is_trivially_constructible and std::is_abstract.
Note that __is_abstract is preceded by a double underscore: the double underscore is reserved by the standard for implementers. Implementers are NOT allowed to add new identifiers to the std namespace. This would be a very bad idea, as they might subsequently be added to the standard with a completely different meaning. What this means in practice for C++ developers is that it is possible to accidentally write code that appears to be using standard features, but is in fact using a compiler-specific feature. A good way to guard against this is to build and test your code on more than one compiler and operating system, to discover accidentally nonstandard code.
These two features were provided for good reason: they were useful metaprogramming tools. Indeed, they were so useful that they were added to the standard. Many parts of the standard, both language and library features, start life as features in popular tools and libraries. Sometimes the use of nonstandard features is inescapable.
Some library writers also add their own extensions. For example, the Qt library uses a feature called signals and slots to communicate between objects. Three symbols are added to make use of this feature: Q_SIGNALS, Q_SLOTS, and Q_EMIT. If you were to read a source file making use of these symbols, they would seem like any other language keyword. Qt supplies a tool called moc that parses these keywords to produce output that a C++ compiler can parse fully and correctly, in just the same way that Cfront would parse early C++ code so that its output could be consumed by C compilers.
The point to bear in mind is that the standard offers something that these extensions don’t: rigorously defined semantics. The ISO C++ Standard is absolutely unambiguous, which is one of the reasons why it is so hard to read. You are, of course, free to use language extensions if you are mindful of the cost to portability. Qt in particular makes heroic efforts to achieve portability across different platforms. However, those extensions are not guaranteed to be present in other implementations, nor are they guaranteed to mean the same thing.
For example, consider #pragma once. This is a simple directive that tells the compiler not to #include a file a second time. It reduces the amount of time the compiler spends compiling a translation unit. Every compiler I’ve used over the past 20 years implements this pragma directive, but what does it actually mean? Does it mean “stop parsing until you get to the end of the file”? Does it mean “don’t open this file a second time”? Although the visible effect is the same, the meaning is not precisely defined for all platforms.
You cannot assume the meaning of something will be preserved across platforms. Even if you are safe now, you cannot guarantee that you will be in the future. Relying on a feature like this is like relying on a bug. It’s dangerous and may be changed or corrected at any time (although see Hyrum’s Law4). In this case, rather than using #pragma once, the Core Guidelines recommend using header guards as described in SF.8: “Use #include guards for all .h files.” With header guards, we know exactly what will happen.
4. 2021. Available at: https://www.hyrumslaw.com/ [Accessed 16 July 2021].
Operating system implementation is not the only kind of system variation. As you may know, the width of arithmetic types like int and char is not standardized. You might think an int is 32 bits wide, but I remember a time when an int was 16 bits wide. At times I needed a type that was exactly 32 bits wide, and, fearful of making the mistake of assuming int would always be 32 bits wide (it had changed once, why not again?), I used the implementation headers to discover which type was that wide, and created an alias to that type:
typedef __int32 i32; // older way of doing this: do not use now
I introduced an identifier called i32 that was an alias of the platform’s definition of a type named __int32. I was entirely safe if the project was ported to another platform: I could find out how the target platform defined a 32-bit signed integral type and simply update the typedef definition for that platform as required.
Of course, when the next standard was released, in this case C++11, new types were introduced to the library in header <cstdint> that defined fixed-width integral types. I was able to update the definition in a way that was attractive for two reasons:
using i32 = std::int32_t;
First, I was able to use the new type to future-proof my definition: the type being aliased is part of the standard and is extremely unlikely to change, because backward compatibility is so important to the language. This declaration will remain valid through subsequent versions of the standard (indeed, nine years and three standards have passed, and this code is still valid).
Second, I was able to move to the new using keyword, which allows you to use left-to-right style for identifier and definition separated by an equals sign. You can also see this style with the auto keyword:
auto index = i32{0};
The identifier is introduced on the left of the equals sign and the definition is on the right of the equals sign.
As superior refactoring tools emerged, I took the plunge and swapped all instances of i32 for std::int32_t, for minimum ambiguity.
It should be mentioned that sometimes you simply can’t use ISO Standard C++. Not because of some lack in the library or missing language feature, but because the host environment forbids the use of certain features. This can be for regulatory reasons, or because the implementation is incomplete for the platform you are developing for.
For example, some industry sectors forbid dynamic allocation during performance-critical functions. Allocation is a nondeterministic activity that can also throw an out-of-memory exception; that is to say, there is no guarantee how long such a call will take. Throwing exceptions is also forbidden in several sectors for similar reasons, which immediately precludes dynamic allocation since std::operator new throws std::bad_alloc on failure. In situations like this, the Core Guidelines need to be extended and customized to the specific environment.
Conversely, some sectors forbid the use of libraries that have not undergone a certification performed by the sector’s regulatory body. For example, the use of Boost may be problematic in some environments. This enforces widespread use of ISO Standard C++.
It’s important to remember where this language came from, and also what motivates its development. There is code in my current project that I wrote in 2005. It looks a little peculiar to today’s programmers as it uses long-discarded paradigms, no auto, no lambdas: it’s a history lesson in source.
However, it still works. It still compiles and runs well. During your career, you will come across code of various ages. It’s important to make use of the latest standard and build with the latest compiler you can find, but it’s also important to know where the language came from and to plan for the future.
Sometimes it isn’t possible to use the latest version of the standard. In embedded development, regulation, certification of systems, or elderly infrastructure may force you to use C++11, or even C++98. C++ relies on backward compatibility. It is backward-compatible with C. It is backward-compatible with prior standards. Stability over decades is a feature. This is one of its great strengths: billions of lines of code around the world still build with a modern compiler, occasionally with a little tweaking. At some point you may be asked to maintain this code.
Conversely, write code that is built to last. At the end of the last century, a problem was unearthed in much of the world’s elderly computer software: only two digits were used to represent the year.6 Memory was at a premium, and the efficient thing to do was simply store 74 rather than 1974. The developer thought nothing of it: “This piece of software will not be running in 25 years; surely it will have been replaced.”
6. https://www.britannica.com/technology/Y2K-bug
Ah, such pessimism, or perhaps optimism, depending on your point of view. Of course, once the date rolled around to the year 2000, the year was represented as 00, spoiling time interval calculations, interest payment calculations, indeed, ANYTHING to do with the passage of time.
This was known as the Y2K bug or the millennium bug. It proved to be a bonanza for older contractors, who toured the computers of the world effecting repairs on 25-year-old systems at considerable expense. Disaster was largely averted because the problem was identified in sufficient time and there were enough engineers avail-able to put things right.
However, if the engineers had planned for the future instead, had assumed that their code would run “forever,” and were writing the code at a point in time when four-digit integers occupied the same space as two-digit integers, this would have been avoided. It would have been clear that two digits was NOT enough to represent all dates that may be required, and at least a third digit would be needed, and really a fourth digit would just be simpler all around to accommodate the turn of the millennium.
Incidentally, this is not the only date problem. Linux has a similar problem with measuring time in seconds since January 1, 1970. This was stored as a 32-bit signed integer, which means it will roll over on January 19, 2038. I say “was”: from Linux 5.6 the problem was solved.
An important pair of career skills is therefore writing code for the future and learning to read code from the past.
C++ is developing all the time. With every publication of a new standard there comes a cornucopia of new language features and library additions. There is no especial virtue in simply using the most novel features; they should be used where they give definite and concrete benefit. However, the C++ community is very fortunate to have many excellent teachers ready to unfold and reveal all these new things to us. Finding these resources is made easier in four ways.
First of all, there is isocpp.org. This is the home of C++ on the Web and is run by the Standard C++ Foundation. This is a Washington 501(c)(6) not-for-profit organization whose purpose is to support the C++ software developer community and promote the understanding and use of modern Standard C++ on all compilers and platforms. On this site you can find a tour of C++ written by Bjarne, a huge C++ FAQ, details about how to participate in the standardization process, and a regularly updated list of recent blog posts from the C++ community. From here, you can investigate other posts on these blogs.
Second, there are several conferences that take place around the world every year. It has become the habit for these conferences to record all the talks and publish them on YouTube for free public consumption. This is a truly amazing resource, and it is quite a challenge to simply keep up with them year on year.
CppCon is run by the Standard C++ Foundation. It takes place in early autumn in the US in Aurora, Colorado, and generates nearly two hundred hours of content. The Association of C and C++ Users (ACCU) holds an annual conference every spring in Bristol, UK, and occasionally also in the autumn. It focuses on C++ but also features broader programming topics and generates nearly a hundred hours of content. Meeting C++ is held in Berlin, Germany, in November, generating nearly fifty hours of content. You can afford to be quite choosy: watching one talk per day will keep you busy for most of the year, and that’s before mentioning the many other smaller conferences that happen in places like Australia, Belarus, Israel, Italy, Poland, Russia, Spain...
On top of blogs and conferences, there are many books besides this one. Some of these will appear in references throughout this text, as will quotations from conference talks.
Finally, there is day-to-day discussion available on chat servers such as Discord and Slack. The Discord server is moderated by the #include diversity and inclusion group for C++ programmers, which has a very welcoming community.
With so many resources available you should be able to keep pace with developments in Standard C++. Continuing to write ISO Standard C++ code is within everyone’s grasp. Doing so is important not just for future maintainers, whoever they may be, including yourself, but also for future clients of your code. There is broad use of C++, serving many areas of commerce, industry, and society. A stable, reliable approach to writing code is of global importance. Step up, do the right thing, and write in ISO Standard C++.