API design is a valuable skill. As you decompose a problem into its constituent abstractions, you need to identify the abstractions and design an interface for them, giving the client clear and unambiguous usage instructions in the form of a completely obvious set of carefully named functions. There is a saying that code should be self-documenting. While this is a lofty ambition, it is in API design that you should try hardest to meet this goal.
Codebases grow. They just do. There is no getting away from it. Time passes, more abstractions are discovered and encoded, more problems are solved, and the problem domain itself expands to accommodate more use cases. This is fine and perfectly normal. It is part of the usual operation of development and engineering.
As these extra abstractions are added to the codebase, the problem of unambiguously naming things rears its ugly head. Naming is hard. That phrase will come up a lot in your programming career. Sometimes you want to let the client, which is often yourself, do the same thing but in a slightly different way.
This is where overloading may seem like a good idea. The difference between two abstractions may simply be the arguments that are passed to them; in all other respects they are semantically identical. Function overloading allows you to reuse a function name with a different set of parameters. But if they are indeed semantically identical, can you express that difference in terms of a default argument? If so, your API will be simpler to understand.
Before we start, we want to remind you of the difference between a parameter and an argument: an argument is passed to a function. A function declaration includes a parameter list, of which one or more may be supplied with a default argument. There is no such thing as a default parameter.
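The distinction can be seen in a minimal sketch. The function area and its parameter names are hypothetical and exist only for illustration:

```cpp
#include <cassert>

// width and height are parameters; 1 is the default argument for height.
int area(int width, int height = 1) {
    return width * height;
}

void demo() {
    assert(area(3) == 3);      // one argument; height uses its default argument
    assert(area(3, 4) == 12);  // two arguments; the default argument is not used
}
```

The values 3 and 4 at the call sites are arguments; the names in the declaration are parameters.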
For example, consider the following function:
office make_office(float floor_space, int staff);
This function will return an instance of an office, with a particular area in square meters specified by floor_space, and facilities for a particular number of staff. The physical construction of the office takes place on a single floor and everyone is distributed nicely, with appropriate kitchen space and bathroom facilities, along with the correct number of coffee machines, table tennis tables, and massage therapy rooms.
One day, during one of those problem domain expansion moments, it is announced that some offices will occupy their own two-floor building. This complicates matters rather, since you must ensure you have stairways in the right places, appropriate fire escape routes, much more complicated air conditioning, and of course, you now need a slide between floors, or maybe a fire station pole. You need to tell the constructor that you are trying to make a two-floor office. You can do this with a third parameter:
office make_office(float floor_space, int staff, bool two_floors);
Unfortunately, you must go through all your code and add false to all your call sites. Or you can default the final argument to false, which means that the calling code does not need to supply it. That looks like this:
office make_office(float floor_space, int staff, bool two_floors = false);
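As a rough sketch of why the existing call sites survive, assume a hypothetical office struct that simply records what it was given (the real office type is not shown in the text):

```cpp
#include <cassert>

// Hypothetical stand-in for the office type.
struct office {
    float floor_space;
    int staff;
    bool two_floors;
};

// The third parameter has a default argument, so two-argument calls
// written before the change still compile and mean "single floor".
office make_office(float floor_space, int staff, bool two_floors = false) {
    return office{floor_space, staff, two_floors};
}

void demo() {
    office old_call = make_office(1200.f, 40);        // untouched call site
    office new_call = make_office(3000.f, 90, true);  // opts in to two floors
    assert(!old_call.two_floors);
    assert(new_call.two_floors);
}
```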
One speedy recompilation later and all remains right with the world. Unfortunately, the demons of domain expansion are not finished with you yet: it turns out that the single-floor offices sometimes need the name of the building they will be situated in. You, the ever-accommodating engineer, expand your function’s parameter list once again:
office make_office(float floor_space, int staff, bool two_floors = false, std::string const& building_name = {});
You reimplement your function, but it irritates you. There are four arguments, the final argument will only be needed if the third argument is false, and it all looks messy and complicated. You decide to overload your function:
office make_office(float floor_space, int staff, bool two_floors = false);
office make_office(float floor_space, int staff, std::string const& building_name);
You now have what is known as a function overload set, and it is up to the compiler to choose which member of the set to invoke according to the arguments passed in. The client is forced to call the correct function when a building must be identified. Identification implies a single-floor office.
For example, some client code may wish to create an office with 24,000 square meters of space for 200 people. The office is situated on one floor in a building called “Eagle Heights.” The correct invocation is therefore
auto eh_office = make_office(24000.f, 200, "Eagle Heights");
Of course, you must ensure that the appropriate semantics are observed in each function, and that they do not diverge in operation. This is a maintenance burden. Perhaps providing a single function and demanding the choice be made explicitly by the caller is more appropriate.
“Hang on,” we can hear you say. “What about writing a private implementation function? I can ensure consistency of creation. I can just use one of those and all is right with the world.”
You would be right. However, two functions may be viewed with suspicion by clients. They may worry that your implementation is divergent, that not everything is quite right. An abundance of caution may instill fear within them. A single function with two default arguments to switch between algorithms is a reassuring sight.
“No, you’re being ridiculous now,” we hear you cry. “I write great code and my clients trust me. I have unit tests everywhere and everything is fine, thank you very much.”
Unfortunately, although you may indeed write great code, your client does not. Take another look at the initialization of eh_office and see whether you can spot the bug. Meanwhile, we shall consider overload resolution.
Overload resolution is a tricky beast to master. Nearly two percent of the C++20 standard is devoted to defining how overload resolution works. Here is an overview.
When the compiler encounters the invocation of a function, it must decide which function it is referring to. Prior to the encounter, the compiler will have made a list of all the identifiers that have been introduced. There may have been several functions with the same name but with different parameters, an overload set. How does the compiler choose which of these are viable functions and which should be invoked?
First, it will choose the functions from the set with the same number of parameters, or fewer parameters and an ellipsis parameter, or more parameters where the excess parameters all have default arguments. If any of the candidates has a requires clause (new to C++20), then it must be satisfied. Any rvalue argument must not correspond to a non-const lvalue reference parameter, and any lvalue argument must not correspond to an rvalue reference parameter. Each argument must be convertible to the corresponding parameter via an implicit conversion sequence.
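The reference-binding rules can be checked at compile time. The following sketch uses std::is_invocable_v (C++17) against three hypothetical functions that are declared but never defined, since only their signatures matter here:

```cpp
#include <string>
#include <type_traits>

// Hypothetical declarations, one per kind of reference parameter.
void takes_lvalue_ref(std::string&);
void takes_rvalue_ref(std::string&&);
void takes_const_ref(std::string const&);

// An rvalue argument cannot bind to a non-const lvalue reference parameter.
static_assert(!std::is_invocable_v<decltype(&takes_lvalue_ref), std::string>);
static_assert(std::is_invocable_v<decltype(&takes_lvalue_ref), std::string&>);

// An lvalue argument cannot bind to an rvalue reference parameter.
static_assert(!std::is_invocable_v<decltype(&takes_rvalue_ref), std::string&>);
static_assert(std::is_invocable_v<decltype(&takes_rvalue_ref), std::string>);

// A reference to const accepts both, and also accepts anything that
// implicitly converts to std::string, such as a char const*.
static_assert(std::is_invocable_v<decltype(&takes_const_ref), std::string>);
static_assert(std::is_invocable_v<decltype(&takes_const_ref), std::string&>);
static_assert(std::is_invocable_v<decltype(&takes_const_ref), char const*>);
```

A function that fails one of these checks for a given call is simply not viable and drops out of consideration before any ranking takes place.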
In our example, the compiler has been introduced to two versions of make_office that differ in their third parameter. One takes a bool that is defaulted to false, and one takes a std::string const&. The initialization of eh_office matches both as far as parameter count is concerned.
Neither of these functions has a requires clause. We can skip over this step. Similarly, there is nothing exotic about the reference bindings.
Finally, each argument must be convertible to the corresponding parameter. The first two arguments do not even require converting. The third argument is a string literal, which decays to a char const* and obviously converts to a std::string via the nonexplicit constructor that is part of the std::string interface. Unfortunately, we have not finished yet.
Once there is a set of functions, they are ranked by parameter to find the best viable function. A function F1 is preferred over another F2 if implicit conversions for all arguments of F1 are not worse than those of F2. In addition, there must be at least one argument of F1 whose implicit conversion is better than the corresponding implicit conversion of F2.
That word “better” is troubling. How do we rank implicit conversion sequences?
There are three types of implicit conversion sequence: standard conversion sequence, user-defined conversion sequence, and ellipsis conversion sequence.
There are three ranks for a standard conversion sequence: exact match, promotion, and conversion. Exact match means no conversion is required and is the preferred rank. It can also mean lvalue-to-rvalue conversion.
Promotion means widening the representation of the type. For example, an object of type short can be promoted to an object of type int, known as integral promotion, while an object of type float can be promoted to an object of type double, known as floating-point promotion.
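The effect of rank on overload selection can be sketched with two small hypothetical overload sets, pick and pick_fp, which report which member was chosen:

```cpp
#include <cassert>

enum class chosen { int_overload, long_overload, double_overload, long_double_overload };

chosen pick(int)  { return chosen::int_overload; }
chosen pick(long) { return chosen::long_overload; }

chosen pick_fp(double)      { return chosen::double_overload; }
chosen pick_fp(long double) { return chosen::long_double_overload; }

void demo() {
    short s = 7;
    float f = 1.5f;
    // short -> int is an integral promotion; short -> long is a conversion,
    // which ranks lower, so the int overload is the best viable function.
    assert(pick(s) == chosen::int_overload);
    // float -> double is a floating-point promotion; float -> long double
    // is a conversion, so the double overload is selected.
    assert(pick_fp(f) == chosen::double_overload);
}
```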
Conversions differ from promotions in that they may change the value, which may cost precision. For example, a floating-point value can be converted to an integer, which discards the fractional part rather than rounding. Also, integral and floating-point values, unscoped enumerations, pointers, and pointer-to-member types can be converted to bool. These three ranks are C concepts and are unavoidable if compatibility with C is to be maintained.
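A few of these conversions, written out as a small sketch with arbitrary illustrative values:

```cpp
#include <cassert>

void demo() {
    // Floating-point to integer: the fractional part is discarded,
    // so the value truncates toward zero.
    int from_positive = static_cast<int>(2.9);
    int from_negative = static_cast<int>(-2.9);
    assert(from_positive == 2);
    assert(from_negative == -2);

    // Conversions to bool: zero or null becomes false, anything else true.
    int value = 0;
    int* null_pointer = nullptr;
    int* valid_pointer = &value;
    bool from_null = null_pointer;
    bool from_valid = valid_pointer;
    bool from_int = 42;
    bool from_double = 0.0;
    assert(!from_null);
    assert(from_valid);
    assert(from_int);
    assert(!from_double);
}
```

Note that the pointer-to-bool conversion looks only at whether the pointer is null, not at what it points to.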
That partially covers standard conversion sequences. User-defined conversions take place in two ways: either through a nonexplicit constructor, or through a nonexplicit conversion operator. This is what we are expecting to happen in our example: we are expecting our char const* to convert to a std::string via the nonexplicit constructor which takes a char const*. This is as plain as the nose on your face. Why have we dragged you through this exposition on overloading?
In the above example, the client is expecting the char const* to participate in a user-defined conversion to a std::string, and for that temporary rvalue argument to be passed as a reference to const to the second function’s third parameter.
However, user-defined conversion sequences take second priority to standard conversion sequences. In the earlier paragraph on conversions, we identified a standard conversion from pointer to bool. If you have ever seen older code that passes raw pointers around the codebase, you will have seen something like
if (ptr) { ptr->do_thing(); }
The condition of the if statement is a pointer, not a bool, but a pointer converts to false if it is zero and to true otherwise. This is a brief and idiomatic way of writing
if (ptr != 0) { ptr->do_thing(); }
In these days of modern C++, we see raw pointers less frequently, but it is useful to remember that this is a perfectly normal, reasonable conversion. It is this standard conversion that has taken first place and been selected in preference to the seemingly more obvious user-defined conversion from char const* to std::string const&. The function overload that takes a bool as its third parameter is invoked, to the surprise of the client.
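The surprise can be reproduced in a few lines. This is only a sketch: the office struct below is a hypothetical stand-in that records what each overload received, and the first two parameters are left unnamed because they play no part in the outcome.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the office type.
struct office {
    bool two_floors;
    std::string building_name;
};

office make_office(float, int, bool two_floors = false) {
    return office{two_floors, {}};
}

office make_office(float, int, std::string const& building_name) {
    return office{false, building_name};
}

void demo() {
    // The literal decays to char const*. Pointer-to-bool is a standard
    // conversion and outranks the user-defined conversion to std::string,
    // so the bool overload is selected.
    auto eh_office = make_office(24000.f, 200, "Eagle Heights");
    assert(eh_office.two_floors);             // non-null pointer became true
    assert(eh_office.building_name.empty());  // the name was silently dropped
}
```

The client asked for a named single-floor office and received an unnamed two-floor one, with no diagnostic required from the compiler.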
Whose bug is this anyway: yours, or the client’s? If the client had written
auto eh_office = make_office(24000.f, 200, "Eagle Heights"s);
then there would be no error. The literal suffix signals that this object is in fact a std::string, not a char const*. So, it is obviously the client’s fault. They should know about the conversion rules.
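For completeness, a sketch of the suffixed call against a hypothetical office stand-in that records what each overload received. The s suffix requires the std::string_literals namespace to be in scope:

```cpp
#include <cassert>
#include <string>

using namespace std::string_literals;  // makes the s suffix available

// Hypothetical stand-in for the office type.
struct office {
    bool two_floors;
    std::string building_name;
};

office make_office(float, int, bool two_floors = false) {
    return office{two_floors, {}};
}

office make_office(float, int, std::string const& building_name) {
    return office{false, building_name};
}

void demo() {
    // The argument is already a std::string, which does not convert to
    // bool, so only the std::string const& overload is viable.
    auto eh_office = make_office(24000.f, 200, "Eagle Heights"s);
    assert(!eh_office.two_floors);
    assert(eh_office.building_name == "Eagle Heights");
}
```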
However, that is not a very helpful approach. You should make an interface easy to use correctly and hard to use incorrectly. Missing out a literal suffix is a very easy mistake to make. Additionally, consider what would happen if you added the function overload taking a bool AFTER you had defined the function taking a std::string const&. The client code would have behaved as expected with or without the literal suffix. Unfortunately, adding the overload introduces a better conversion, and suddenly the client code has broken.
Perhaps you remain unconvinced. You might now try replacing the bool with a better type. Perhaps you would like to define an enumeration to use in place of the bool:
enum class floors {one, two};
office make_office(float floor_space, int staff, floors floor_count = floors::one);
office make_office(float floor_space, int staff, std::string const& building_name);
We really are going to have to stop you there. You have introduced a new type simply to facilitate the correct use of an overload set. Ask yourself whether that really is clearer than this:
office make_office(float floor_space, int staff, bool two_floors = false, std::string const& building_name = {});
If you remain unconvinced, ask yourself what you will do when the next round of problem domain expansion heaves into view with the remark, “Actually, we would like to be able to name the buildings where the two-floor offices are commissioned.”
The advantage of a default argument is that any conversion is immediately apparent on inspection. You can see that a char const* is being converted to a std::string const&. There is no ambiguity about which conversion might be chosen since there is only one place for a conversion to happen.
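A sketch of the single-function form in use, where every argument is supplied in its own position. The office struct is again a hypothetical stand-in that stores what it was given:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the office type.
struct office {
    float floor_space;
    int staff;
    bool two_floors;
    std::string building_name;
};

office make_office(float floor_space, int staff, bool two_floors = false,
                   std::string const& building_name = {}) {
    return office{floor_space, staff, two_floors, building_name};
}

void demo() {
    // Both trailing parameters take their default arguments.
    auto plain = make_office(1200.f, 40);
    assert(!plain.two_floors);
    assert(plain.building_name.empty());

    // The literal sits in the fourth position, whose parameter type is
    // std::string const&, so the conversion is visible at the call site.
    auto eh_office = make_office(24000.f, 200, false, "Eagle Heights");
    assert(!eh_office.two_floors);
    assert(eh_office.building_name == "Eagle Heights");
}
```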
In addition, as alluded to earlier, a single function is more reassuring than an overload set. If your function is well named and well designed, your client should not need to know or worry about which version to call, but as the example shows, this is easier said than done. A default argument signals to a client that the function has flexibility about how it can be invoked by providing an alternative interface to the implementation and guarantees that it implements a single semantic.
A single function also avoids code replication. When you overload a function, you start with the very best of intentions. Of course you do. The overloaded function does a few things differently, and you plan to encapsulate the remaining similarities in a single function that both functions call. As time passes, though, it is very easy for the overloads to overlap as it becomes hard to tease out the actual differences. You end up with a maintenance problem as the functionality grows.
There is one limitation. Default arguments must be applied in reverse order through the parameter list of a function. For example:
office make_office(float floor_space, int staff, bool two_floors, std::string const& building_name = {});
is a legal declaration, while
office make_office(float floor_space, int staff, bool two_floors = false, std::string const& building_name);
is not. If the latter function is invoked with only three arguments, there is no way to unambiguously bind the final argument to a parameter: should it bind to two_floors or building_name?
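The legal form can be exercised as follows, with the ill-formed declaration kept as a comment so that the sketch still compiles. The office struct is a hypothetical stand-in:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the office type.
struct office {
    bool two_floors;
    std::string building_name;
};

// Legal: only the last parameter has a default argument.
office make_office(float, int, bool two_floors,
                   std::string const& building_name = {}) {
    return office{two_floors, building_name};
}

// Ill-formed if uncommented: two_floors has a default argument but the
// parameter that follows it does not.
// office make_office(float floor_space, int staff, bool two_floors = false,
//                    std::string const& building_name);

void demo() {
    auto unnamed = make_office(1200.f, 40, true);  // building_name defaulted
    assert(unnamed.two_floors);
    assert(unnamed.building_name.empty());

    auto named = make_office(1200.f, 40, false, "Eagle Heights");
    assert(named.building_name == "Eagle Heights");
}
```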
We hope you are convinced that function overloading, although cute, is not to be taken lightly. We touched only lightly on overload resolution. There is plenty more detail to be pored over if you want to truly understand which overload will be selected. You will notice we did not cover ellipsis conversion sequences, nor did we discuss what happens if there is a function template in the mix. If you are supremely confident about the use of overloads, though, we have one request: please do not mix default arguments with overloaded functions. This becomes very hard to parse and sets traps for the unwary. It is not an interface style that is either easy to use correctly or hard to use incorrectly.
Overloading functions signals to the client that a piece of functionality, an abstraction, is available in a variety of ways. One function identifier can be invoked with a variety of parameter sets. Indeed, the function overload set has been described as the fundamental building block of the API, rather than the function, as one might expect.
However, in the admittedly somewhat contrived example for this chapter, you might find that
office make_office(float floor_space, int staff, floors floor_count);
office make_office(float floor_space, int staff, std::string const& building_name);
is not as clear as
office make_office_by_floor_count(float floor_space, int staff, floors floor_count);
office make_office_by_building_name(float floor_space, int staff, std::string const& building_name);
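With distinct names there is no overload resolution to get wrong. A sketch, assuming a hypothetical office struct and trivial function bodies:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

enum class floors { one, two };

// Hypothetical stand-in for the office type.
struct office {
    floors floor_count;
    std::string building_name;
};

office make_office_by_floor_count(float, int, floors floor_count) {
    return office{floor_count, {}};
}

office make_office_by_building_name(float, int, std::string const& building_name) {
    return office{floors::one, building_name};
}

// A string literal cannot reach the floor-count function by accident:
// there is no implicit conversion from char const* to a scoped enumeration.
static_assert(!std::is_invocable_v<decltype(&make_office_by_floor_count),
                                   float, int, char const*>);

void demo() {
    auto eh_office = make_office_by_building_name(24000.f, 200, "Eagle Heights");
    assert(eh_office.floor_count == floors::one);
    assert(eh_office.building_name == "Eagle Heights");

    auto tower = make_office_by_floor_count(24000.f, 200, floors::two);
    assert(tower.floor_count == floors::two);
}
```

The name chosen by the caller, rather than the type of the third argument, now decides which function runs.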
Function overloading is a great tool, but you should use it sparingly. It is a very shiny hammer and sometimes you actually need to peel an orange. The symbols are yours to define, and you should specify them as tightly as you can.
There is more to cover regarding overloading, for example the rather long list of tie-breaks to ranking the best viable function; were this a textbook we would go into complete detail. However, it suffices to say that overloading should not be undertaken lightly.
The guideline starts with the phrase “Where there is a choice.” There are some places where you cannot provide an alternatively named function.
For example, there is only one constructor identifier, so if you want to construct a class in a variety of ways, you must provide constructor overloads.
Similarly, operators have a singular meaning that is very valuable to your clients. If you have, for some reason, written your own string class and you want to concatenate two strings together, your clients will much prefer writing
new_string = string1 + string2;
to
new_string = concatenate(string1, string2);
The same is true for comparison operators. It is unlikely, however, that you would want a default argument when overloading operators.
The standard provides the customization point std::swap, where you are expected to overload the function optimally for your class. Indeed, Core Guideline C.83: “For value-like types, consider providing a noexcept swap function” suggests this explicitly. Again, it is highly unlikely that you would want a default argument when overloading this function.
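One common way to provide such an overload is a noexcept friend function that is found by argument-dependent lookup. The sketch below uses a hypothetical roster class in a hypothetical demo_lib namespace; it shows one possible arrangement, not the only one:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

namespace demo_lib {  // hypothetical namespace

class roster {
public:
    roster(std::string team, std::vector<int> ids)
        : team_(std::move(team)), ids_(std::move(ids)) {}

    std::string const& team() const { return team_; }
    std::size_t size() const { return ids_.size(); }

    // Overload of swap for roster: swaps member by member and cannot throw.
    friend void swap(roster& lhs, roster& rhs) noexcept {
        using std::swap;
        swap(lhs.team_, rhs.team_);
        swap(lhs.ids_, rhs.ids_);
    }

private:
    std::string team_;
    std::vector<int> ids_;
};

}  // namespace demo_lib

void demo() {
    demo_lib::roster a{"red", {1, 2, 3}};
    demo_lib::roster b{"blue", {4}};

    using std::swap;  // fall back to std::swap for types without an overload
    swap(a, b);       // unqualified call selects the roster overload

    assert(a.team() == "blue" && a.size() == 1);
    assert(b.team() == "red" && b.size() == 3);
}
```

The overload takes exactly two parameters, and there is nothing here that a default argument could sensibly replace.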
Of course, sometimes there simply is no default argument available. So, when you MUST overload, do so consciously, and, to reiterate, do NOT mix default arguments with overloading. This falls into the chainsaw-juggling category of API design style.
We considered the growth of code and its impact on API design, looked at a simple example of overloading, and saw where it could go subtly wrong. We looked at the subtleties of overloading, skimmed the surface of the rules on how the compiler prefers one function over another, and used those rules to highlight where the example function call had not been the call we expected. In particular, the bug was caused by offering a bool parameter with a default argument in a function overload: bool is very generous in what can be converted to it. We used that to demonstrate that a default argument should be preferred to an overloaded function where possible, and that mixing the use of function overloads with default arguments is a very risky venture.
The example was of course a straw man, but the fact remains that dangers lurk around overload sets for the unwary engineer. You can mitigate this danger through judicious use of a default argument and delay the introduction of a function overload. You can research the full consequences of overload resolution at your favorite online resource, and we advise you to do so if ever you feel like ignoring this particular guideline.