Chapter 1.6

NR.2: Don’t insist to have only a single `return`-statement in a function

Rules evolve

It astonishes me that, 20 percent of the way through the 21st century, people still argue about this. This section of the Core Guidelines is called “Non-rules and myths.” The remarkably prevalent advice that there should be only one return statement in a function falls right into this category.

Mandating a single return statement is an old, old rule. It is easy to forget the advances that have been made in programming. I bought my first computer, or rather, my parents did, back in 1981. It was a Sinclair ZX81, powered by an NEC Z80 CPU running at 3.25MHz. The operating system, supplied on 8KB of read-only memory (ROM), included a BASIC interpreter which enabled me to write simple programs in the meager 1KB of RAM.

Of course, I was 14 years old and I wanted to write games, and I discovered that the best way to do that was to bypass the BASIC interpreter entirely and write in native Z80 assembly language. With the help of Mastering Machine Code on Your ZX81 by Toni Baker,¹ the astounding Programming the Z80 by Rodnay Zaks,² and an assembler, I was able to write and sell my first games to my friends at school.

1. Baker, T, 1982. Mastering Machine Code on Your ZX81. Reston, VA: Reston Publishing Company, Inc.

2. Zaks, R, 1979. Programming the Z80. Berkeley, CA: Sybex.

I found it much harder to write things in Z80 assembly language than I did in BASIC. Particularly, BASIC had line numbers and the concept of subroutines. I could branch by executing GOTO, or I could branch by executing GOSUB, which would take me back to where I came from when the interpreter parsed the RETURN keyword. As I grew more proficient with Z80, I could see common concepts between it and BASIC and started to grasp the nature of programming languages, and I could map the idea of line numbers to the program counter, and GOTO and GOSUB to “jp” and “call.”

Z80 also allowed me to do some quite ghastly things. For example, I developed the habit of writing my code such that if there were two steps, A and B, and B was a useful piece of functionality by itself, I would put A before B so that I wouldn’t have to call B; I could simply run on into B after A had finished. This had the side effect of making it unclear how I had got to B, whether it was from A or from some other place, but that did not matter to me because I knew everything that was going on.

No, really.

Another thing I could do was alter the stack so that I could return up the call stack by an arbitrary number of callers for the price of one instruction. It was faster. Since I knew everything that was going on, I could speed up my games. These things mattered.

No, really, they did.

I moved on to a ZX Spectrum: it came with more RAM (16KB) as well as color and sound! However, as my platform increased in scope, my ambition grew along with my code, and it became increasingly hard to debug. I was unable to work out where I had come from, and what code had already been executed. I quickly realized that I was making my life very, very difficult by indulging in these shenanigans. I considered the trade-off between execution speed and code comprehension. I decided that the extra cycles gained were not worth the loss of comprehension if I could never eliminate all bugs. I learned that it is fun and quite easy to write super-fast Z80 code, but it is next to impossible to debug it: there is a middle ground between performance and legibility. This was a valuable lesson.

As a result, I changed the way I wrote code. I organized it into reusable parts and was rigorous about documenting in the source where the parts started and finished. No more decrementing the stack pointer. No more jumping about to useful parts of larger functions. Life became considerably easier.

I graduated to an Atari ST, with a Motorola 68000 CPU running at 8MHz, and a truly extraordinary 512KB of RAM. My programming style of well-organized parts kept me sane. There was one and only one place to start a piece of code, and you would always go back to where you came from. I told everyone it was the one true way: I was religious in my zeal.

It turned out I was not the only person to write code like this. FORTRAN and COBOL programmers could tie themselves in similar knots if they decided not to take the same kind of care. This led to a simple piece of wisdom: “Single Entry, Single Exit.” There should be one and only one entry point for a function. There should be only one place it returns to: where it was called from.

Single Entry, Single Exit was part of the structured programming philosophy, which emerged from Edsger Dijkstra’s letter to the editor titled “Go To Statement Considered Harmful.”³ The book Structured Programming⁴ is still an excellent read and you should consider reading both of these. They informed how programming was done for over a decade.

3. www.cs.utexas.edu/users/EWD/ewd02xx/EWD215.PDF

4. Structured Programming by Ole-Johan Dahl, Edsger W. Dijkstra, and Charles Anthony Richard Hoare.

Unfortunately, old habits die hard. Not only that, but the motivation behind those habits starts to recede from memory with the passage of time. New innovations diminish the importance of old wisdom. Functions in C++ obviate the need to pay attention to the idea of “Single Entry.” The syntax of C++ makes it impossible to jump halfway into a function. Similarly, there is no way to return to anywhere other than the call site (exception handling and coroutines aside).

You might be asking yourself what this has to do with multiple return statements. Well, unfortunately, there was a mix-up with the prepositions in the minds of the programmer community, and single exit came to mean “return FROM one place only” rather than “return TO one place only.”

Disaster.

Ensuring cleanup

Let’s look at an example demonstrating the wisdom of a single return statement. Assume this function is calling an old C library exporting functions for acquiring, manipulating, and displaying resources based on an integer identifier:

int display(int ID)
{
  auto h = get_handle_to_resource(ID);
  if (h == 0) {
    return 0;
  }
  auto j = manipulate_resource(h);
  if (j < 0) {
    release_handle(h);
    return 0;
  }
  auto k = supply_resource_to_system(h, j);
  if (k < 0) {
    return 0; /* forgot to release handle */
  }
  display_resource(h);
  release_handle(h);
  return 1;
}

In this snippet of code, we see a very common situation: a handle to a resource is acquired and must subsequently be released before the handle falls out of scope. The engineer has forgotten to release the handle at the third return statement. One solution is to have a single return statement, preceded by the release of the handle, thus:

int display(int ID)
{
  auto result = 0;
  auto h = get_handle_to_resource(ID);
  if (h != 0) {
    auto j = manipulate_resource(h);
    if (j >= 0) {
      auto k = supply_resource_to_system(h, j);
      if (k >= 0) {
        display_resource(h);
        result = 1;
      }
    }
  }
  release_handle(h);
  return result;
}

Ah, no, hang on, that’s wrong. The call to release_handle should only be made if the handle was successfully gotten. Let’s try again:

int display(int ID)
{
  auto result = 0;
  auto h = get_handle_to_resource(ID);
  if (h != 0) {
    auto j = manipulate_resource(h);
    if (j >= 0) {
      auto k = supply_resource_to_system(h, j);
      if (k >= 0) {
        display_resource(h);
        result = 1;
      }
    }
    release_handle(h);
  }
  return result;
}

This approach does not scale well to longer functions with many conditional branches, since each condition introduces a further level of indentation and reduces legibility; but functions should be small anyway, which weakens this argument. It also introduces additional state in the form of the return value. This increases the reader’s cognitive load a little when reasoning about the function and, although that is not too great a burden, it runs the risk of the value being modified after the correct result has been calculated, a risk that increases as the function grows. The approach does have the advantage that release_handle is called whatever happens, although the call must be placed within the correct if branch. This argument of reliable cleanup remains a strong case for a single return statement. It is sensible advice.

For C programmers.

The bug in the first implementation of display was that the resource handle was not released in every pathway before leaving the function. The fix was to ensure all appropriate paths through the function ended with a single call to release_handle, after which the function could safely return.

The preeminent feature of C++ is deterministic, programmer-definable cleanup, provided by destructors. You will not convince me otherwise. At a stroke, an entire class of errors was eliminated with the introduction of this feature. It is deterministic in that you know precisely when a destructor will be called. In the case of automatic objects, which are created on the stack, it is when the name falls out of scope.
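To make that determinism concrete, here is a minimal sketch. The tracer type and the leave_scope function are hypothetical names invented for this illustration; tracer simply records its name in a log when it is destroyed, so the order of destruction can be inspected afterward.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper for this sketch: records its name when destroyed.
struct tracer {
  tracer(std::string name_, std::vector<std::string>& log_)
    : name(std::move(name_)), log(log_) {}
  ~tracer() { log.push_back(name); }
  tracer(const tracer&) = delete;
  tracer& operator=(const tracer&) = delete;

private:
  std::string name;
  std::vector<std::string>& log;
};

// Returns early or late depending on the flag. Either way, every tracer
// that was constructed is destroyed, in reverse order of construction.
int leave_scope(bool early, std::vector<std::string>& log)
{
  tracer a("a", log);
  if (early) {
    return 0; // only a exists here, so only a is destroyed
  }
  tracer b("b", log);
  return 1;   // b is destroyed first, then a
}
```

Calling leave_scope(true, log) leaves a single entry, "a", in the log; calling it with false leaves "b" followed by "a". Neither return statement needs any cleanup code of its own.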

Using RAII

Rather than use flow control to ensure code is executed, it is safer to make use of the idiom known as Resource Acquisition Is Initialization, or RAII. In this case, we bundle the acquire and release functions with the handle into a single struct:

int display(int ID)
{
  struct resource {
    resource(int h_) : h(h_) {}
    ~resource() { if (h != 0) release_handle(h); } // release only a valid handle
    operator int() { return h; }

  private:
    int h;
  };

  resource r(get_handle_to_resource(ID));
  if (r == 0) {
    return 0;
  }
  auto j = manipulate_resource(r);
  if (j < 0) {
    return 0;
  }
  auto k = supply_resource_to_system(r, j);
  if (k < 0) {
    return 0;
  }
  display_resource(r);
  return 1;
}

Note that this code is not signaling errors by throwing exceptions but by using multiple return statements with different values to signal success or failure. If this were a C++ library rather than a C library, we might expect the functions to throw an exception rather than return an error value. What would our example look like then?

void display(int ID)
{
  struct resource {
    resource(int h_) : h(h_) {}
    ~resource() { release_handle(h); }
    operator int() { return h; }

  private:
    int h;
  };

  resource r(get_handle_to_resource(ID));
  auto j = manipulate_resource(r);
  supply_resource_to_system(r, j);
  display_resource(r);
}

Of course, we might also expect the functions to take user-defined types rather than ints, but please let that pass for the sake of this example.

Now we have no explicit return at all. This is to be expected; after all, this function simply does something, not even signaling success or failure. With exceptions used to signal failure there is no need for a return statement: the code assumes success and invisibly throws if it fails. It doesn’t calculate a value and return it.

This struct is so useful it should be pulled out of that function and made available to other users. Indeed, I’ve seen many codebases that interface with C libraries that contain something like this:

template <class T, class release_fn>
struct RAII
{
  RAII(T t_) : t(t_) {}
  ~RAII() { release_fn r; r(t); }
  operator T() { return t; }

private:
  T t;
};

where T is usually a built-in or other trivial type.
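A wrapper like this might be used as sketched below. Everything apart from the shape of the RAII template is an assumption made for illustration: acquire_handle, release_handle, and the live_handles counter stand in for a C library, and handle_releaser is a hypothetical function object that satisfies release_fn. The deleted copy operations are also an addition of this sketch, a precaution against the same handle being released twice.

```cpp
template <class T, class release_fn>
struct RAII
{
  RAII(T t_) : t(t_) {}
  ~RAII() { release_fn r; r(t); }
  operator T() { return t; }

  // Added in this sketch: copying would release the same handle twice.
  RAII(const RAII&) = delete;
  RAII& operator=(const RAII&) = delete;

private:
  T t;
};

// Hypothetical C-style API standing in for the library in the text.
int live_handles = 0;
int acquire_handle() { ++live_handles; return 42; }
void release_handle(int) { --live_handles; }

// release_fn must be default-constructible and callable with a T.
struct handle_releaser {
  void operator()(int h) const { release_handle(h); }
};

using handle = RAII<int, handle_releaser>;

// Returns how many handles are live while one is held inside the function.
int handles_live_during_use()
{
  handle h(acquire_handle());
  int raw = h;          // implicit conversion back to the underlying type
  (void)raw;
  return live_handles;  // h is released after this value is copied out
}
```

Because release_fn is a type rather than a value, the wrapper stores nothing beyond the handle itself; the trade-off is that the releaser cannot carry any state of its own.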

It behooves me to acknowledge that exceptions are not universally deployed in C++ codebases. Throwing an exception requires the program to unwind the stack; that is, to destroy every automatic object created between the try site and the throw site. This requires extra bookkeeping, which occupies memory. C++ is used in the broadest variety of environments, some of which are extremely sensitive to memory constraints or execution times. I have witnessed a stand-up fight in a parking lot over a 1KB buffer suddenly becoming available after some cunning optimization.

Compilers, as a matter of course, offer options to disable exception handling, which produces smaller, faster binaries. This is a dangerous thing to do. First, this will cost you in incomplete error handling. dynamic_cast throws an exception if a cast to a reference fails. The standard library throws an exception if allocation fails. Accessing a std::variant object incorrectly will generate an exception.
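Two of those exceptions can be seen in a minimal sketch. The types shape, circle, and square, the alias int_or_text, and both function names are invented for this illustration. In everyday code you would more likely use a pointer cast or std::holds_alternative; the try blocks are here only to make the throwing behavior visible.

```cpp
#include <string>
#include <typeinfo>
#include <variant>

struct shape { virtual ~shape() = default; };
struct circle : shape {};
struct square : shape {};

// A failed dynamic_cast to a reference throws std::bad_cast.
bool is_circle(shape& s)
{
  try {
    circle& c = dynamic_cast<circle&>(s);
    (void)c;
    return true;
  }
  catch (const std::bad_cast&) {
    return false;
  }
}

using int_or_text = std::variant<int, std::string>;

// std::get on an alternative the variant does not hold throws
// std::bad_variant_access.
bool holds_text(const int_or_text& v)
{
  try {
    (void)std::get<std::string>(v);
    return true;
  }
  catch (const std::bad_variant_access&) {
    return false;
  }
}
```

With exception handling disabled, neither failure has anywhere to go, which is the incomplete error handling described above.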

Second, you are not guaranteed smaller and faster binaries. The introduction of complex, explicit error-handling code may soak up all your advantage and yield additional cost. However, if it is important enough, if it is worth the trade-off, engineers will write code to accommodate the absence of exception handling. It’s not a pretty sight, but needs must when the devil drives.

However, if your codebase does permit exception handling, then single return statements carrying the successfully computed value back to the call site become the normal way of things. Multiple return statements might signal a function that is trying to do too much.
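As a small illustration of the difference, consider the same operation written both ways. Both functions are hypothetical examples, not taken from any library: the first reports failure by throwing and has a single return carrying the result, while the second reports failure through its return value and so needs two return statements and an out-parameter.

```cpp
#include <stdexcept>
#include <string>

// With exceptions: failure is thrown, and the one return carries the result.
int digit_value(char c)
{
  if (c < '0' || c > '9') {
    throw std::invalid_argument(std::string("not a digit: ") + c);
  }
  return c - '0';
}

// Without exceptions: the return value is the status, so the result has to
// travel through an out-parameter and each outcome needs its own return.
bool digit_value_noexcept(char c, int& out) noexcept
{
  if (c < '0' || c > '9') {
    return false;
  }
  out = c - '0';
  return true;
}
```

Neither form is wrong. The second is simply what the absence of exception handling tends to produce.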

Writing good functions

There are nearly fifty Core Guidelines about functions. We give two of them their own chapter, but it’s worth considering some others here in the context of multiple return statements. For example, Core Guideline F.2: “A function should perform a single logical operation.” Although this guideline talks about splitting larger functions into smaller component functions, and viewing large numbers of parameters with suspicion, one of the side effects of following this advice is that your functions are likely to have a single return instruction, which is the result of the function.

In the same vein, Core Guideline F.3: “Keep functions short and simple.” The function that acts as a counterexample stretches over 27 lines of text and includes three return statements. However, the final example, which puts some of the logic into two helper functions, is nearly a third of the size but still contains three different return instructions, decided upon by the input parameters.

Core Guideline F.8: “Prefer pure functions.” This is a tall order but excellent advice. A pure function is one that does not refer to state outside of its scope. This makes them parallelizable, easy to reason about, more amenable to optimization, and, again, likely to be short and simple.
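A minimal sketch of the distinction follows. All of the names are hypothetical. The first function depends on and modifies a variable outside its own scope, so calling it twice with the same argument gives different answers; the second depends only on its argument.

```cpp
#include <vector>

// Impure: reads and modifies state outside its own scope.
int running_total = 0;
int add_to_total(int x)
{
  running_total += x;
  return running_total;
}

// Pure: the result depends only on the argument, and nothing else changes.
int sum(const std::vector<int>& values)
{
  int total = 0;
  for (int v : values) {
    total += v;
  }
  return total;
}
```

Calls to sum can be reordered or run in parallel without changing any result, whereas calls to add_to_total cannot.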

The important point is that there are very, very few rules that are cast-iron. Rules like that tend to end up being encoded in the language itself. For example, “don’t leak resources” is encoded in the destructor feature and in the library smart pointers. A single return statement might be a sign of other good practices being observed, but it isn’t a universal rule. It’s a matter of taste and style. Consider the following function:

int categorize1(float f)
{
  int category = 0;
  if (f >= 0.0f && f < 0.1f) {
    category = 1;
  }
  else if (f >= 0.1f && f < 0.2f) {
    category = 2;
  }
  else if (f >= 0.2f && f < 0.3f) {
    category = 3;
  }
  else if (f >= 0.3f && f < 0.4f) {
    category = 4;
  }
  else {
    category = 5;
  }

  return category;
}

Now compare it with this function:

int categorize2(float f)
{
  if (f >= 0.0f && f < 0.1f) {
    return 1;
  }
  if (f >= 0.1f && f < 0.2f) {
    return 2;
  }
  if (f >= 0.2f && f < 0.3f) {
    return 3;
  }
  if (f >= 0.3f && f < 0.4f) {
    return 4;
  }
  return 5;
}

Which of these is “better”? They both do the same thing. Any compiler is likely to yield identical output for each. The second contains multiple return statements but fewer characters. The first contains extra state but clearly identifies a series of mutually exclusive conditions, rather than hiding that information away after an indentation. Depending on your programming experience, what programming languages you’ve been exposed to, what coding advice you’ve already absorbed in your professional life, and a whole host of other conditions, you will have a slight preference for one over the other. Neither is objectively better; nor is “each function should have only one return statement” a reasonable rule.

Summary

We all want hard and fast rules that we don’t have to think about, but these golden edicts are few and far between and usually end up becoming encoded within the programming language anyway. The single-return rule is old and due for retirement in the context of C++.

Understand the source of received wisdom.
Differentiate between returning results and throwing exceptions.
Identify rules that are matters of taste.
