I didn’t become a programmer until a year after college.
By that point I had been writing programs for a decade, majored in computer science, and worked at a small software start-up for a year. Unbeknownst to me, that was just practice for my final test.
The fateful sequence of events began when my manager called me into his office. My company, Dendrite Americas, was writing software that allowed representatives of pharmaceutical companies, armed with laptops, to plan their sales meetings with doctors. Every night they would dial in to our central computer to upload notes they had gathered during the day and then download updates to their database of doctors—quite advanced for the late 1980s. In certain cases, in no discernible pattern, the street address of one doctor was being replaced with that of a different doctor. Nobody else had been able to figure out what was going on, and my manager wanted me to give it a try.
Being selected for this assignment, from among the fifteen or so programmers at the company, was a compliment. The sheriff in an old Western movie was choosing his posse and telling me, “Tex, you’re the best shot we’ve got.” Nonetheless, I felt my stomach sink—a feeling that I had never felt when facing a programming task. The challenging part in situations like this is not so much fixing the problem as it is finding it, and I was nervous about whether I would be able to find it.
Bugs in software are described by repro steps: the sequence that the user follows to reproduce the bug, as in, “Run spell-check, then try to save the document, and you’ll get an error.” They can be broadly divided into two categories: bugs that happen every time, and bugs that happen only sometimes, despite following the same repro steps. Those that happen every time are vastly preferable, at least from the point of view of a programmer trying to solve them, because if you can make the bug happen reliably, you can eventually narrow down where the problem is. Like the annoying rattle in your car that goes away when the mechanic is listening, software bugs that occur intermittently make you pull your hair out. In reality, even intermittent bugs do happen “every time”; it’s just that a certain set of factors have to come together, and the repro steps, as best as they are known, are not detailed enough to always trigger the exact situation.
There is another way to divide software bugs: bugs in a program that you wrote yourself, and bugs in somebody else’s program. When the bug is in somebody else’s program, you know nothing about the details, so you are starting at square one, or line one. The problem I was being tapped to fix was in the worst quadrant: an intermittent bug in somebody else’s program. This was a new experience for me. Previously, in high school and college, I had rarely worked with code written by somebody else. And even when I had, the data I was working with was small enough, and the programs I was working with were simple enough, that any bug was easy to reproduce.
Throw in the pressure of being put on the spot, with paying customers waiting for a fix while their salespeople wandered aimlessly around New Jersey, and this was the moment when I was going to earn my stripes as a programmer.
If you watch home improvement shows, you have no doubt seen the knob-and-tube reveal, in which the contractor informs the homeowners, “I have bad news,” and then after a commercial break is seen ripping off a piece of the wall to uncover the dreaded knob-and-tube system. This archaic method for transmitting electricity inside a house is a potential fire hazard, such that when upgrades are made to a house, the knob and tube has to be replaced if it is deemed unsafe (or possibly if the plot of the show is deemed to be lacking in dramatic tension).
Finding electric problems is a bit like debugging, and clearly, since knob and tube has been obsolete for seventy-five years, it falls in the category of a bug in somebody else’s work. The difference is that knob and tube, despite being hidden behind a wall, is easy to locate: start with the plug on the wall and track it back from there. When I was called in to debug this mysterious problem in our software, I had no idea what I was looking for. Was it knobs? Was it tubes? Which metaphoric wall was I supposed to look behind? And if I found the right place, would the problem occur while I was watching?
I’ll give away the ending: after a couple days of excavation, I found the bug and was rewarded with a bottle of champagne as well as the respect of my peers and a blissful return to more mundane tasks—until the next time I had to track down a flaky bug in somebody else’s program. But let’s take a detour to consider exactly how programmers approach writing and debugging software.
We will need something to debug, so below are two lines from a program in C#. The purpose of this code is to show the user an error message stored in a variable named ErrorMessage, of type string. This is a snippet of a program, so we assign a specific value to ErrorMessage (the text string “This is my error message”), as opposed to having it determined by an actual error:
string ErrorMessage = "This is my error message";
MessageBox.Show(ErrorMessage);
Look past the slightly backward syntax and arbitrary-looking punctuation; you may correctly infer that this code will cause the computer to show a message box—one of those pop-up windows that hovers over the screen—containing the text of the error message, as contained in the variable ErrorMessage. On my Windows 10 system, the message box—not the prettiest of message boxes, but you can see the connection between the code and result—looks like this:1
MessageBox.Show is an API, similar to what we saw in Fortran and BASIC, although in the argot of C# it is also known as a method. You might say, “My code calls the MessageBox.Show method.” Or since programmers tend to talk of their code as an extension of themselves, you could assert, “I call the MessageBox.Show method.” Or most commonly, since code that doesn’t work yields the richest bounty of conversational fodder, you would be complaining, “I call the MessageBox.Show method and I can’t figure out why it isn’t working properly.”
Given that software is built up in layers, this code is at a layer above MessageBox.Show, calling down into it. A similar concept exists in the physical world—the roof of a house may be built on prefabricated girders, which are themselves made of wood, steel, and nails. The electric appliances in your house are layered above the electric system in your walls, relying on it to provide power when needed. But these layers don’t go that deep; the knobs and tubes are only one layer removed from what is visible to the homeowner. Some quick work with a claw hammer and all is laid bare for the camera. The MessageBox.Show method contains its own code, which in turn calls other code in a stack that is dozens of levels deep. Which means the knobs and tubes may be buried so deeply that you will never discover them until your software metaphorically catches fire.
In the code snippet above, ErrorMessage is passed as a parameter to MessageBox.Show. In C#, as in many modern programming languages, parameters are specified in a comma-separated list enclosed in parentheses, so we’ll follow the convention that C# method names are followed with empty opening and closing parentheses in order to distinguish them from other programming constructs. MessageBox.Show will henceforth be styled as MessageBox.Show().
The code in a program generally involves figuring out the desired parameters to a method, calling that method, and using the information returned from that method to decide what to do next. As code gets more complicated, the number of layers increases, and method calls are the glue that holds those layers together. A lot of code examples show only one layer, but that is unusual in real programs; code rarely proceeds for more than five lines without calling a method.
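That pattern of choosing parameters, calling a method, and acting on what comes back can be seen in a minimal sketch. It uses the real int.TryParse() method from .NET, but the BuildErrorMessage() helper and its wording are hypothetical, invented here for illustration, and the code is written as a console program with top-level statements (C# 9 or later) rather than with a message box:

```csharp
using System;

// Hypothetical helper, invented for this sketch.
static string BuildErrorMessage(string errorCodeText)
{
    // Decide on the parameters, then call a method one layer down.
    bool isNumber = int.TryParse(errorCodeText, out int errorCode);

    // Use what the method returned to decide what to do next.
    if (!isNumber)
    {
        return "Unknown error";
    }
    return "Error number " + errorCode;
}

Console.WriteLine(BuildErrorMessage("404"));  // Error number 404
Console.WriteLine(BuildErrorMessage("oops")); // Unknown error
```

Even this short helper sits on top of at least two other layers: int.TryParse() and Console.WriteLine() each contain code of their own.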
Let’s change our code to display the error message in uppercase so as to be extra memorable. Uppercasing is easy using a method named ToUpper(); in C#, you call a method on a variable—in this case, ErrorMessage—by using dot notation, as shown below. We’ll also add another string variable named EM_Upper to hold the uppercased string:
string ErrorMessage = "This is my error message";
string EM_Upper = ErrorMessage.ToUpper();
MessageBox.Show(EM_Upper);
The second line sets EM_Upper to hold the result of calling the ErrorMessage.ToUpper() method, and we then pass that as a parameter to MessageBox.Show(), instead of the original ErrorMessage. The error message is now displayed in uppercase, like this:
Now we add one more twist to this code: in addition to having the error message displayed in the message box, we will show a title at the top. We’ll use the title “ERROR!,” which we pass as a second parameter to MessageBox.Show():
string ErrorMessage = "This is my error message";
string EM_Upper = ErrorMessage.ToUpper();
MessageBox.Show(EM_Upper, "ERROR!");
The difference is in the third line of code; MessageBox.Show() now has two parameters, separated by a comma. The string “ERROR!” is displayed as the title of the message box, where previously there had been no title:
Both those parameters to MessageBox.Show() are of type string (the second parameter is not the name of a variable but instead contains the actual text of the string, surrounded by double quotes). There are multiple ways you can call MessageBox.Show() with different sets of parameters, because the people who wrote MessageBox.Show() (in this case at Microsoft, which invented C# and wrote the collection of methods that allow C# programs to do things such as show message boxes) decided it would be useful to offer the choice. The C# compiler knows the type of the parameters and their order, which form a signature of sorts; this call to MessageBox.Show() has the signature “first parameter is a string, second parameter is a string.”
What if you accidentally got those backward, and specified the first parameter as the title of the message box, and the second as the text? Here is an example:
string ErrorMessage = "This is my error message";
string EM_Upper = ErrorMessage.ToUpper();
MessageBox.Show("ERROR!", EM_Upper);
It seems obvious to us what the intent is since we have been thinking about the code, and know that “ERROR!” is the title and the message is in EM_Upper. But the compiler doesn’t know that, since the call still matches the method signature; one string is treated like any other string, and it lets the incorrect code through unchallenged:
Fred Brooks, who managed both hardware and software teams at IBM, and later founded the Department of Computer Science at the University of North Carolina, once compared the creativity in programming to that in poetry, but noted, “The program construct, unlike the poet’s words, is real in the sense that it moves and works, producing visible outputs separate from the construct itself. … The magic of myth and legend has come true in our time. One types the correct incantation on a keyboard, and a display screen comes to life, showing things that never were nor could be.” But he cautioned, “One must perform it perfectly. The computer resembles the magic of legend in that respect, too. If one character, one pause, of the incantation is not strictly in proper form, the magic doesn’t work. Human beings are not accustomed to being perfect, and few areas of human activity require it.”2
The things that make us swear at computers often involve them perfectly executing a set of instructions that don’t do what we want them to do. In his book I Sing the Body Electronic, which chronicles a year embedded with a Microsoft development team in the 1990s, longtime Seattle observer Fred Moody recounts a discussion with a programmer:
Developers like to highlight the difference between the world of computing and the world outside of computing by citing the common directions on a bottle of shampoo—“Lather. Rinse. Repeat.” … In the everyday world, common sense tells you not to keep lathering and rinsing forever. In the world of computing, where there is no common sense and where everything must be rigorously defined, such an instruction is careless and dangerous.3
The joke is that the instructions don’t tell you when to stop repeating; a computer would keep doing this until it ran out of shampoo (or possibly longer, if it neglected to stop when the shampoo bottle was empty).
Reversing two parameters to a method is the sort of mistake that can be hard for a programmer to catch, because our human commonsense filter kicks in when reading the code—the same common sense that makes us interpret the shampoo directions the way they were intended. The correct code looks maddeningly similar to the incorrect version; you see the title and message being passed to MessageBox.Show(), so what could be wrong?
Luckily there is documentation available for MessageBox.Show() explaining which parameter goes where, and even if you don’t read the documentation and get the parameters backward, the error will be apparent if you run the program. In this case, you can quickly solve the problem: you know where the call to MessageBox.Show() is in your code, it’s pretty obvious that you switched the two parameters, and presto, it is fixed. Now imagine that the method you were calling was written by another programmer at your company and did not have clear documentation—a much more typical situation when debugging. Or imagine that the mistake did not result in a visually obvious, and otherwise harmless, transposition of two strings.
Although you can fix this problem by changing your code, you can’t pin the blame squarely on your code or on the code in MessageBox.Show(); you just had a different understanding of how the parameters worked. The actual problem involves an interaction between your code and the details of how MessageBox.Show() orders its parameters; the bug is somewhere in the gap between them.
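A minimal sketch of that gap follows. It uses a hypothetical FormatMessageBox() method, invented here as a console stand-in for any method that takes a message and a title; it is not the real MessageBox.Show(). Both orderings compile, because each call matches the signature of two strings. C# named arguments are one way, among others, to make the intended order visible at the call site:

```csharp
using System;

// Hypothetical stand-in: text first, caption second.
static string FormatMessageBox(string text, string caption)
{
    return "[" + caption + "] " + text;
}

string EM_Upper = "This is my error message".ToUpper();

// Intended order: message first, title second.
Console.WriteLine(FormatMessageBox(EM_Upper, "ERROR!"));

// Swapped by mistake: still two strings, so the compiler accepts it.
Console.WriteLine(FormatMessageBox("ERROR!", EM_Upper));

// Named arguments spell out which value is which, in either order.
Console.WriteLine(FormatMessageBox(caption: "ERROR!", text: EM_Upper));
```

The first and third calls produce the same result; the second quietly puts the message where the title belongs. Named arguments help the next reader of the code, but they only work if the caller already knows which parameter is which.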
I want to emphasize this point: method calls hold together all the layers of software, and miscommunication across these layers is a major cause of unexpected problems. A collection of methods at a layer boundary is an API; I’ll use that term to refer in an abstract sense to both a single method and a collection of methods.
So to restate: APIs hold together all the layers of software, and miscommunication across these layers is a major cause of unexpected problems. This miscommunication can take a variety of forms, from not realizing the proper value that the API expects in a parameter, to not realizing how the API will interpret that parameter in guiding its internal logic, to misunderstanding when and in what form the API returns values. Many debugging sessions end with a programmer, after having examined their own code with a fine-tooth comb and found nothing amiss, cracking open the documentation, slapping their forehead, and exclaiming, “Oh, I didn’t realize that the API that I was calling worked that way.”
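Two of those forms can be sketched with real .NET string methods. ToUpper() returns a new string rather than changing the one it is called on, and the second parameter of Substring() is a length, not an ending position. The variable names assumedStartEnd and actualStartLength are made up for illustration:

```csharp
using System;

string ErrorMessage = "This is my error message";

// Misunderstanding the return value: ToUpper() hands back a new string
// and leaves the original alone, so this line changes nothing.
ErrorMessage.ToUpper();
Console.WriteLine(ErrorMessage);        // This is my error message

// Misunderstanding a parameter: the signature is Substring(startIndex, length).
// A caller who assumes (start, end) and passes 5 and 7, hoping for "is",
// gets seven characters instead.
string assumedStartEnd = ErrorMessage.Substring(5, 7);
string actualStartLength = ErrorMessage.Substring(5, 2);
Console.WriteLine(assumedStartEnd);     // is my e
Console.WriteLine(actualStartLength);   // is
```

In neither case does the compiler object; the code does exactly what the API says it does, which is not what the caller believed.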
The decisions here are in the hands of the programmer writing the code inside the API; the caller of the API is stuck with whatever that other person decided. Often you cannot see the code in the API you are calling; it is provided to you only as compiled code, with no source code available. Unfortunately, people who write code that supplies APIs to other people generally don’t spend a lot of time worrying about the external clarity of their API but instead get bogged down in the internal implementation of the API. This is because for any given code you want to write, there are likely multiple ways to write it, and not a lot of wisdom about which one to choose.
The book The Paradox of Choice, by the psychologist Barry Schwartz, argues (in a section of the book titled “Why We Suffer”) that having more choices does not make people happier:
Freedom and autonomy are critical to our well-being, and choice is critical to freedom and autonomy. Nonetheless, though modern Americans have more choice than any group of people ever has had before, and thus, presumably, more freedom and autonomy, we don’t seem to be benefiting from it psychologically.4
He then goes on to explain that too much choice can sometimes be a burden on us. Programmers are frequently victims of this; there are so many ways to write even trivial code like the stuff above, it is hard to know what the “right” way is.
Getting back to our code from before, here is the correct version (without the swapped parameters to MessageBox.Show()) as we last saw it:
string ErrorMessage = "This is my error message";
string EM_Upper = ErrorMessage.ToUpper();
MessageBox.Show(EM_Upper, "ERROR!");
Observe that in the second line, we declare a new variable, EM_Upper, to hold the uppercase version. This seems reasonable; although we don’t need the original mixed-case error message, we may want it at some point in the future, so we allocate a second variable to hold the uppercase version and keep the original in ErrorMessage.
Hang on. Do we need to retain the original value of ErrorMessage? Sure, we might need it in a future version of this code, but right now we don’t. And if we don’t need it, we can avoid EM_Upper entirely and instead replace ErrorMessage with its uppercase version and pass that to MessageBox.Show():
string ErrorMessage = "This is my error message";
ErrorMessage = ErrorMessage.ToUpper();
MessageBox.Show(ErrorMessage, "ERROR!");
Come to think of it, since all we do with the uppercase error message is pass it to MessageBox.Show()—we don’t use the original or the uppercase version in any later code—we can combine the last two lines of code into one:
string ErrorMessage = "This is my error message";
MessageBox.Show(ErrorMessage.ToUpper(), "ERROR!");
Except—and I’m just pointing this out to be helpful, you understand—now we haven’t saved the uppercase version anywhere. We passed it to MessageBox.Show() and then it disappears into the ether. Does this matter? If the code later needs the uppercase value a second time, we would need to call ToUpper() again, which seems wasteful, so shouldn’t we save the output of ToUpper() somewhere just in case? And if we do decide to save it, should we stash it back in ErrorMessage or create EM_Upper as a separate variable, thus retaining the original version as well but using a little more of the computer’s memory?
These last few paragraphs of discussion are all theoretical. The code works right now, so why complicate it? Why are you worrying about saving both versions or calling ToUpper() twice when the code needs to do neither of those things?
Sure, replies the devil on the other shoulder, but maybe if you set yourself up for the future now, when you are familiar with the code, you will have less chance of making a mistake later and save time overall. A well-prepared devil might point out that one study of software maintenance noted, “There is a unique maintenance aspect called ‘knowledge recovery’ or ‘program understanding.’ It becomes a major cost component as software ages (assume 50% of both enhancements and defect fixing).”5 Half your future maintenance costs will be spent relearning the details of your program that you will have forgotten in the meantime! Surely it is better to make those changes now, when the code is fresh in your mind.
Keep in mind we’re talking about three lines of code here.
This problem is not unique to software; many tasks can be accomplished in multiple ways. What’s different is the ease with which you can change your mind and update your code, and the lack of criteria for determining which approach will be most useful in the long term. If you are building a bridge, you will know the distance it is supposed to span and weight it is supposed to hold, and it is understood that the current design is based on those factors. Nobody is going to assume that the same bridge design, with just a few modifications, will handle twice as much distance or weight, or that it will be simple to accommodate such a change when the bridge is halfway built. Changing software is so easy—a few keystrokes, a wave of the compiler, and you are done—that the temptation is always there, and the incentive to figure everything out ahead of time is less, for the same reason. The end result is that almost every piece of software written eventually winds up being modified to solve a different problem than what it was designed for.
“The computer’s flexibility is unique,” points out the researcher John Shore in his essay “Myths of Correctness”:
No other kind of machine can be changed so much without physical modifications. Moreover, drastic modifications are as easy to make as minor ones, which is unfortunate, since drastic modifications are more likely to cause problems. With other kinds of machines, drastic modifications are correspondingly harder to make than minor ones. This fact provides natural constraints to modification that are absent in the case of computer software.6
André van der Hoek and Marian Petre, editors of the book Software Designers in Action, observe, “Almost any product can be changed, in some way, after it is delivered. What makes software unusual is the expectation of the customer, the user, and other stakeholders, that it will change.”7 Since software almost always has a potential alternate future ahead of it, there is no statute of limitations on suggested improvements. And you never know, unless you can see the future, which ones will result in real savings and which ones will be needless complications.
While the choices we’re discussing here—what API to call, or whether to use a variable or not—seem innocuous in this situation, these are precisely the sort of choices that can, if made incorrectly, lead to software that has real problems, that crashes, hangs, or allows other users to steal your files. The electricians who installed knob-and-tube wiring back in the 1930s were following correct, state-of-the-art practices for the time; it was only determined in retrospect that they were actually creating a large, expensive, and potentially life-threatening problem for future generations to deal with. Is that API choice you are making a clever decision, or will it be determined by a future programmer, as they slander your name, to be horribly misguided?
Still, you can’t dither forever, so eventually you choose a spot on the continuum of present versus future gratification, and write your code to match. Are you done now?
No, unless you are programming all by yourself, which usually only happens during the sunny idyll known as college (or high school). You are now a professional programmer, working with other programmers, so what comes next is an opportunity for all your coworkers to give their opinions through an activity with the seemingly auspicious name of code review.
A code review is where other people offer constructive criticism of your code. This sounds like a good idea, like grabbing a second electrician to give your work the once-over. That’s what you would expect people to do as they graduate from do-it-yourself wiring projects in their own home to becoming professional electricians: find somebody else to point out your knobs and tubes before they can cause any fires! The language used gives this analogy a helpful shove in the wrong direction, because electricians are governed by an electric code (that word again). The electric code is where it is written down, “Thou shalt not use knob-and-tube wiring for new homes,” and more important, “When evaluating existing knob-and-tube wiring, these are the things you look for to determine if it is safe.” The phrase “code review” implies that your fellow programmers are noting deviations from accepted practices and standards, comparing your code to a “professional programmer’s [the other kind of] code.”
Unfortunately there is no equivalent of an electric code for software, and the books available to programmers, although they offer advice in some of these areas, are not backed up by any empirical studies; they tend to add fuel to both sides of a debate. A code review is really about other programmers giving their personal opinions on how they would have written the code, backed up by nothing more than their own experiences. And since your peers know all about how flexible code is, they are likely to feel that their suggestions should be adopted, no matter how late in the game it is, because the game never ends. This makes them more likely to suggest changes and more likely to pooh-pooh your code if you don’t agree with them.
The most likely feedback from a code review is other programmers’ opinions on the same questions that you noodled over before you even sent the code out for review: Should you modify your code now in anticipation of future needs, of which you are currently unaware?
Consider variable names—a favorite topic. Our code above has two variables, ErrorMessage and EM_Upper. Those names are inconsistent; the second one has an explanation of the meaning of the variable (it’s in uppercase) that is absent from the first one, while the first one spells out the purpose of the variable (it’s an error message), but the second one uses initials. We, the author of the code, know how it evolved to this point: we started with only one variable, ErrorMessage, and added the second one later. Until we added the second one, we didn’t know what was going to distinguish it from the first, so plain-old ErrorMessage seemed reasonable. Meanwhile, when inventing EM_Upper, we decided to save a bit of typing and shorten the first part.
Now once we did add the second variable and the case—mixed versus upper—became the distinguishing factor, we could have renamed all uses of the original ErrorMessage to be ErrorMessageMixed, but really we should then go change EM_Upper to ErrorMessageUpper, or alternately change ErrorMessageMixed to EM_Mixed. And come to think of it, the initial error message might already be in uppercase; we can’t assume it is mixed case, so perhaps ErrorMessageOriginalCase would be a better name. Which would mean ErrorMessageUpper should become ErrorMessageUpperCase. Meanwhile we’re still not 100 percent sure we need two variables, so are we willing to commit to all that typing? Looming over this is the nonzero chance of making a mistake (in particular, accidentally replacing one of the uses of ErrorMessage with ErrorMessageUpper instead of ErrorMessageMixed, which would compile fine, but botch everything when you ran it).
A second programmer arriving on the scene for a code review knows none of this history, nor do they appreciate your inner struggle. What they see is the inconsistency between the names ErrorMessage and EM_Upper, which they will likely point out. By the way, variable names go away when the program is compiled, so changing the variable name has no effect on how the program runs or the user’s experience running it. This is just programmers arguing about readability for the sake of future programmers who encounter the code.
Believe it or not, one of the most contentious questions, which you can’t see at all when reading code in a book, is this: When indenting code, which happens a fair bit, do you type a series of spaces or a single tab character? Some people like to see code indented four spaces each time, and some like to see it indented eight spaces; using tab characters means each person can see the indent level they like by adjusting the tab settings on their own computer, but some people consider that heresy and think the original programmer should be able to control exactly how the indenting is seen. Worse, a file with a mix of tabs and spaces can devolve into a visual torment. When I worked on the first versions of Windows NT (the precursor to today’s Windows) back in the early 1990s, there was a strict rule that there would be no tab characters in the source code, on pain of baleful stares from your coworkers (once they got finished removing the tabs from your code). To this day, if you want to see an eye roll from a programmer, utter the magic phrase “tabs versus spaces.”8
So you’ve got the variable names and indenting, and further discussion topics such as whether you should put a space before the equal sign when assigning to a variable, before the opening parenthesis of a method call, after the comma that separates method parameters, before the semicolon at the end of a line, or really anywhere there is or isn’t a space (which is almost anywhere, since most programming languages ignore extra spaces in the code), or whether blank lines in your code are a sinful waste or glorious luxury.
Harlan Mills once wrote about a similar situation, maintaining, “Since there was no mathematical rigor to inhibit these discussions, some became quite vehement.”9 Vehement is an understatement; these arguments are often called “religious” because they rely entirely on faith, not demonstrated evidence.
Despite the energy expended, code reviews rarely turn up real user-visible bugs. They are more about enforcing local norms such as “this is how we name our variables.” Really bad bugs, security issues, or potential crashes usually involve a series of mistakes, each small enough to go unremarked in isolation, acting in unfortunate concert with just the wrong set of data. Oftentimes they are a misunderstanding between the programmers responsible for two adjacent layers of code at the API boundary. Brooks recognized this problem forty years ago: “The most pernicious and subtle bugs are system bugs arising from mismatched assumptions made by the authors of various components” (the loosely defined terms component and module are often used to denote “a bundle of code that provides a set of related APIs”).10 Code reviewers do try to anticipate future modifications to the code, but when reviewing code that provides an API, they rarely try to predict future misunderstandings by callers of the API, especially since that code hasn’t been written yet. When your perspective is from the inside of an API upward, it all seems perfectly logical.
As a result, code reviews almost never get into the issue of the usability and clarity of the APIs being exported to a layer above. The API name and parameter list is the box that holds the code being reviewed, normally accepted as fact while your eye slides over it to get to the meaty algorithmic parts inside. In a way that makes sense, since the internals are the “hidden” part that may never get looked at again, while the API external surface will be seen by any other programmers who call it, but the latter will have a disproportionate effect on whether code from two different programmers will work together as planned. And once the code providing an API is judged complete, programmers are even less likely to change the API name and parameters than they are to rename an unclear variable; for one thing, they would also need to change any code that is now calling that API.
Having said that, I do applaud code reviewers for worrying about the readability and maintainability of the code, because while it may not have much effect on clarity across API calls, it relates directly to another important source of bugs, which is code handoff across time: the situation where another programmer needs to modify the code for future use (or where you, as mentioned earlier, come back to the code some time later; the amount of time it takes your own code to become foreign to you is depressingly short). The code reviewer is a good simulation of that future person since they themselves do not know the code either. The problem is that while the code reviews are well intentioned, it’s not at all clear that the problems they point out will have an effect on future maintainability; it’s frequently a “he said, she said” sort of argument (or unfortunately, often a “he said, he said” one).
At one point in Microsoft’s history, there was a push to write code using Hungarian notation, where variable names were prepended with little duodenums that described their type—such as a number, string, and so on (the “and so on” exists because programmers can create their own types, built up from collections of strings and numbers). For example, all variables that were strings would start with sz, so Hungarian-styled code was peppered with variable names like szUsername and szAddress. The theory was that it was useful to know at a glance, without knowing anything else about how the code worked, if a particular variable was a string or number, to prevent you from accidentally using one where the other was called for. This led to extreme all-sizzle-no-steak examples like szA, where sz informed the masses that this was a string, but the A part was a flashback to the days of BASIC, which did nothing to tell you what the string was used for.
Hungarian was a source of contention in the halls of Microsoft; the Office team adopted it, but the Windows NT team thought it was silly, so I thought it was silly by osmosis (would the discussion above, about EM_Upper versus ErrorMessageUpperCase, have been meaningfully impacted if it instead were about szEM_Upper versus szErrorMessageUpperCase?). The Windows NT naming style tended toward long intercapped names, such as MaximumBufferLength, a style known as “camel casing” because the capital letters look like a multihumped Dr. Seuss camel; the name told you a lot about the purpose of the variable but kept mum on its type.
(It will be important to certain readers for me to clarify that camel case actually has the initial letter in lowercase, as in exampleVariableName, and the ones with the first letter capitalized that we used in Windows NT were called “Pascal case,” but camel case is a much more evocative phrase.11 Anyway, back to our story.)
Proponents of Hungarian in turn derided these long, not-quite-camel-case names as being error prone as well as wasteful of keystrokes and disk storage (two things that had historically been in short supply for programmers, although truthfully no longer were at that moment in history). Which led to the counterargument that it is clear enough that the variable CurrentlyLoggedInUserName is a string and no extra characters at the beginning are needed to indicate that. In addition, by the early 1990s compilers were better at recognizing when code was passing the wrong type of variable around (enforcing that your method call matched the method signature, such as not using a string when a number was called for, was an innovation in compiler technology whose absence had in the past caused all sorts of entertaining bugs). This made Hungarian less necessary than it was in, say, 1986: if the compiler is going to catch a type mismatch, then you don’t need Hungarian; and the more interesting mistakes do not involve bollixing up strings and numbers but are instead about using one more complicated type where a different more complicated type is needed, and those complicated types won’t have easily recognized Hungarian prefixes to guide you.
Since the Office and Windows NT teams had their own separate piles of source code, pro- and anti-Hungarian arguments could be lobbed back and forth with no ground given; each side had the other’s worst-case offenders to parade around the public square, with camel casers chuckling at pwszA and Hungarian advocates snorting at SomeReallyLongVariableName. If there was ever a thought of compromise (how about long camel-cased names with Hungarian prefixes also?), I never heard about it. This was serious business, with no time for such foolish ideas! Besides, anything other than complete capitulation by the other side would have meant fixing up a lot of variable names in your own code—a daunting prospect that nobody wanted to tackle. The few programmers who were brave enough to switch teams were quickly assimilated into their new culture, and the twain never met.
To add fuel to this bonfire of whataboutism, the two sides weren’t arguing about the same thing. The original Hungarian system, which became known as Apps Hungarian because it was used in the division that wrote applications such as Office, prescribed prefixes that were more informative than just the type of a variable; you might distinguish a variable that held a row number from one that held a column number by using the prefix row or col.12 Somehow (the blame is generally placed on the team that wrote the documentation for the Windows API, apparently following a misguided impulse to simplify the notation) it made its public debut in a form known as Systems Hungarian, in which the variable name prefixes only identified the type—as in number versus string—which is much less useful (although the more your Hungarian prefixes resemble real words, the more the difference between Hungarian and Windows-NT-style boils down to the capitalization of the first letter—still fertile ground for religious argument, of course).13 Thus the Apps Hungarian that was venerated by the Office team was different from the Systems Hungarian that was used as a punching bag by the Windows team.
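To make the distinction concrete, here is a minimal sketch of the two flavors side by side. Everything in it is invented for illustration: the CellLabel() helper and the variable names nFirst, nSecond, rowSelected, and colSelected come from me, not from any Microsoft code. The point is only that a type prefix tells you nothing the compiler doesn’t already know, while a role prefix makes a swapped argument visible to a human reader.

```csharp
using System;

public static class HungarianDemo
{
    // Hypothetical helper: turns a zero-based row and column into a
    // spreadsheet-style label, e.g. row 1, column 2 becomes "C2".
    public static string CellLabel(int row, int col)
    {
        char columnLetter = (char)('A' + col);
        return $"{columnLetter}{row + 1}";
    }

    public static void Main()
    {
        // Systems Hungarian style: the "n" prefix only repeats the type.
        // Both values are ints, so nothing about this call looks wrong,
        // whichever order the arguments are in.
        int nFirst = 1;
        int nSecond = 2;
        Console.WriteLine(CellLabel(nSecond, nFirst));

        // Apps Hungarian style: the prefix records what the number means.
        int rowSelected = 1;
        int colSelected = 2;
        Console.WriteLine(CellLabel(rowSelected, colSelected)); // C2

        // This compiles too (both are ints), but the mismatch between
        // the prefixes and the parameter order is easy to spot.
        Console.WriteLine(CellLabel(colSelected, rowSelected)); // B3
    }
}
```

Notice that the compiler is equally happy with all three calls; only the reader can tell that the last one is suspicious, and only because of the names.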
By good fortune, a writer named G. Pascal Zachary wrote a book about that Windows NT project and recorded his impressions of the Hungarian battle raging in Redmond, Washington, at the time:
Some disputes, however, involved what programmers call “religious differences.” The points at stake seem important only to zealots; a neutral party might say that both sides are right. But zealots—unable to silence their opponents with logical arguments—hurl insults.
One of the oddest disputes, which brought out the worst in zealots, involved the notational system used to write instructions in C, one of the most popular computer languages. Over the years Microsoft had adopted its own conventions, called Hungarian, after its creator, Budapest-born Charles Simonyi. … [I]t lacked the ready familiarity of conventional notation, which relied largely on English words rather than opaque abbreviations. The differences between the two styles spawned many arguments, whose merits were lost on outsiders.14
One of the programmers on the team (not me) is quoted describing Hungarian as “the stupidest thing I’d ever seen,” although it’s unclear if he is referring to Apps Hungarian or Systems Hungarian. He adds the quasiwise summary, “Coding style wars are a waste of valuable resources, although the confusion caused by Hungarian probably wastes more time.”15 And if his arguments sound reasonable, remember he was part of the crew that obsessed over tabs versus spaces for indenting source code. The same programmer can have a perfectly rational, “live and let live” attitude about, say, spaces between method parameters, but go into conniptions at the sight of an unneeded blank line in the source code. For that matter, I wouldn’t have described the variable names used in Windows NT as “conventional notation”; they seemed oddly long to me when I first joined the team, being used to “opaque abbreviations” sans Hungarian prefixes.
Luckily the code reviewers will eventually stop commenting, or you will get tired of listening, and you can update your code to reflect the feedback you choose to heed. Of course, any change to your code is an opportunity to make new mistakes that will in turn require their own debugging; it is a particularly numbing experience to decide that today is the day you are going to do your civic duty and rename that obscure variable, only to discover that you have accidentally broken something while making the change, and the compiler is now complaining that “an expression tree lambda may not contain a coalescing operator with a null literal left-hand side”—an actual C# compiler error, albeit one that is unlikely to be caused by a typo in a variable name.16
Let’s make one more change to our code: have it only display the error message if that message contains the word “JavaScript” in it (JavaScript is another programming language). Since we have uppercased the message, we can check if it contains the capital word “JAVASCRIPT” using the Contains() method (I’ve removed the first line, where it explicitly sets ErrorMessage to “This is my error message,” because it would make the code look slightly ridiculous; clearly that string does not contain “JAVASCRIPT,” so there’s no reason to check):
string EM_Upper = ErrorMessage.ToUpper();
if (EM_Upper.Contains("JAVASCRIPT")) {
    MessageBox.Show(EM_Upper, "ERROR!");
}
The line that reads
if (EM_Upper.Contains("JAVASCRIPT")) {
performs a test; if EM_Upper contains the string “JAVASCRIPT” anywhere within itself, then the code between the { } runs, and otherwise it doesn’t. The word if is a recognized keyword in the C# language; for notational convenience, I am going to write such keywords in capital letters, even in languages that traditionally are written in lowercase, so it will be referred to as an IF statement.
As conscientious programmers, we are aware that our code can run in multiple countries, where the error messages might be translated into a different language, but we have been assured that the term JavaScript, being the name of a programming language, won’t be translated.
Is this correct? Well, the basic idea is correct, but it does have a bug, and you might not realize that for a while, because it depends on a detail of the implementation of a method that you—and many experienced programmers—are likely completely unaware of.
As an example of nonobvious method implementation details, consider the internals of MessageBox.Show(), the actual code that shows a message box. An important aspect of writing that code was deciding what behavior made sense to callers of the method and how that behavior should be accessible via the parameters.
You may have noticed, from the screenshots earlier in this chapter, that in addition to showing the message and title, the message box will display a button labeled “OK” that the user can click.
This is perfectly reasonable, noted in the documentation, and apparent when you run the program, but not clear from knowing the method name and parameter list.
The situation appears benign: MessageBox.Show() will show a button that the caller didn’t explicitly request, but is that harmful? The answer is “no” in this case. In fact, by passing extra parameters, you can have some control over what buttons are shown—assuming you know that the method supports this. Yet methods with unknown side effects are the cause of many bugs.
That was exactly the situation with the bug at my first job out of college, where the street address of certain doctors was being replaced. When I was enlisted to help, it was quickly apparent that the problem was in the API that retrieved the doctor’s information from the database on the computer (as opposed to, say, retrieving the right address but messing up the code to display it). The exact name of that API escapes my memory, but it doesn’t matter; we’ll call it GetDatabaseRecord(). This API presumably took parameters specifying which doctor to retrieve, although those don’t matter either. Of course GetDatabaseRecord() was itself built on other API calls, which were built on other API calls, and so on. My task was to paw through these underlying layers of code and find out why they were occasionally misbehaving.
After some investigation, I discovered that another programmer had modified a section of the program to calculate and display extra data about the doctors in the database. In certain cases, this required them to load other doctor records out of the database (I don’t recall the details, but let’s say that in the case where two doctors had attended medical school together, the database had this noted somewhere—something that would only be true in certain instances, and didn’t jump out as an obvious difference between the doctors showing a corrupted address and the others, so it was not noted in the repro steps). Because the street address of a doctor, somewhat uniquely among all the other data fields in the database, was a string that could be of wildly varying length, we stored the “street address of the last doctor loaded from the database” in a specific variable in memory that had enough room for any reasonably sized address. This is the sort of optimization you made to save memory on the underpowered computers of the day.
In the feature that the other programmer was adding, the street address wasn’t needed, so this change didn’t matter, but it meant that sometimes the address in that special “street address of the last doctor we loaded from the database” variable wasn’t what we thought it was, because while loading information for one doctor, we proceeded to load information for his medical school buddy, and this updated the “street address of the last doctor we loaded from the database” variable. This was all happening several layers below the code that called GetDatabaseRecord(); that code hadn’t changed, but the internal behavior of GetDatabaseRecord()—or more precisely and vexingly, the internal behavior of an API that was itself called several layers below GetDatabaseRecord()—had changed. This wasn’t maliciousness on the part of the other programmer, yet it was subtle enough that even she herself, when the “wrong address” problem was first being looked into, didn’t realize that her earlier change was causing the problem.
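The shape of that bug can be sketched in a few lines of C#, with the loud caveat that this is a reconstruction for illustration and not the original code (which was not written in C#, and whose details I have already admitted to forgetting). The Doctor class, the LoadDoctor() method, the sample addresses, and the “classmate” link are all hypothetical; only the name GetDatabaseRecord() and the single shared “last street address” variable come from the story above.

```csharp
using System;
using System.Collections.Generic;

public static class AddressBugDemo
{
    // Hypothetical record layout: an address plus an optional link
    // to a doctor who attended the same medical school.
    sealed class Doctor
    {
        public string Address = "";
        public int? ClassmateId;
    }

    // Stand-in for the database.
    static readonly Dictionary<int, Doctor> doctors = new Dictionary<int, Doctor>
    {
        [1] = new Doctor { Address = "12 Elm Street", ClassmateId = 2 },
        [2] = new Doctor { Address = "900 Harbor Road" },
        [3] = new Doctor { Address = "5 Oak Avenue" },
    };

    // The single shared "street address of the last doctor loaded" variable.
    static string lastStreetAddress = "";

    // Low-level load: every call overwrites the shared variable.
    static void LoadDoctor(int id)
    {
        Doctor record = doctors[id];
        lastStreetAddress = record.Address;

        // The later modification: load the classmate to compute extra data.
        // As a side effect, this overwrites lastStreetAddress again.
        if (record.ClassmateId.HasValue)
        {
            LoadDoctor(record.ClassmateId.Value);
        }
    }

    // The caller's view: unchanged code that trusts the shared variable.
    public static string GetDatabaseRecord(int id)
    {
        LoadDoctor(id);
        return lastStreetAddress;
    }

    public static void Main()
    {
        Console.WriteLine(GetDatabaseRecord(3)); // 5 Oak Avenue (correct)
        Console.WriteLine(GetDatabaseRecord(1)); // 900 Harbor Road (the classmate's address)
    }
}
```

Doctors without a classmate link come back with the right address, which is why the repro steps never pointed at the cause.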
Once again you are at the mercy of whoever wrote the API you are calling—not only to define the parameters in a logical order, but also to document all the intended side effects, even if it isn’t obvious why they would matter. When calling an API, you have precious little information about the details of its implementation and how reliable it is—just a name and a parameter list, as a thin line of glue holding together your software.
I don’t remember exactly how I fixed the problem, but it was easy once I had found it; the solution can be left as an exercise for the reader. I might have added an extra parameter to GetDatabaseRecord() to tell it “don’t load extra doctor information” and made that parameter “true” in this specific case. If I was feeling motivated, I could have rewritten the code that used the single “street address of the last doctor we loaded from the database” variable so that it instead used multiple variables as needed. This second way would have been more “correct,” but it also would have been a larger change, delaying my champagne reward, with more risk of breaking something else while making the fix. On the plus side, it would have meant that a future caller of the GetDatabaseRecord() API had less to understand about the implementation, which would make things less error prone. Just as people argue, during code reviews, about the correct way to write a piece of code, they also argue about the correct way to fix a bug once the cause is found, typically involving this sort of trade-off between “less immediately risky but somewhat ugly” and “more complicated but more elegant for the long term.”
Let’s return, finally, to our code that checks if the error message contains “JAVASCRIPT,” which I claimed a few pages ago had a real bug in it: a monolingual English speaker may be unaware that the concept of uppercasing is subject to regional interpretation. English has a lowercase i with a dot on top and an uppercase I with no dot on it; for your reference, I have included several examples of both in this sentence. Most languages written in the same alphabet have the same lowercase i and uppercase I. In Turkish, however, there is a lowercase dotted i and uppercase dotted İ, and a lowercase undotted ı and uppercase undotted I. And the dotted-or-not aspect doesn’t change when you capitalize, so the capital of i is İ, not I as it is in English. When you uppercase a word with an i in it, such as the string “JavaScript,” the capital in English is “JAVASCRIPT” and the capital in Turkish is “JAVASCRİPT” (notice there is a dot above the uppercase I). And despite what a common sense–laden human might think, to a computer those are most definitely not the same thing.
If you call ToUpper() with no parameters, as we did, the implementation uppercases based on the language setting configured on the computer, which the user can choose. If your user is on a computer configured for Turkish, the uppercasing of “JavaScript” will be different than if the machine is configured for English, and the Contains() method won’t match it as expected. You might have thought it was clever to have your code make the comparison against the uppercased version of the error message, but this could cause hard-to-diagnose problems, especially if you try to reproduce the bug on a machine configured for English.
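You don’t need a Turkish machine to see this happen. Here is a minimal sketch that passes the culture to ToUpper() explicitly instead of relying on the computer’s setting; the MentionsJavaScript() helper and the sample message are my own inventions, and the sketch assumes the runtime has Turkish culture data available (a standard .NET installation does, but a stripped-down “invariant globalization” configuration does not).

```csharp
using System;
using System.Globalization;

public static class TurkishUpperDemo
{
    // Same check as our code above, except that the culture is passed in
    // rather than taken from the computer's configuration.
    public static bool MentionsJavaScript(string errorMessage, CultureInfo culture)
    {
        string upper = errorMessage.ToUpper(culture);
        return upper.Contains("JAVASCRIPT");
    }

    public static void Main()
    {
        string message = "JavaScript error on line 10";

        // English rules: i uppercases to I, so the check succeeds.
        Console.WriteLine(MentionsJavaScript(message, new CultureInfo("en-US")));

        // Turkish rules: i uppercases to the dotted capital (U+0130),
        // so "JAVASCRIPT" is no longer found in the uppercased text.
        Console.WriteLine(MentionsJavaScript(message, new CultureInfo("tr-TR")));
    }
}
```

Under those assumptions the first call reports True and the second reports False, even though the message is identical.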
The fix to our Turkish uppercasing problem is simple once you know about it; change your ToUpper() call from17
EM_Upper = ErrorMessage.ToUpper();
to
EM_Upper = ErrorMessage.ToUpper(InvariantCulture);
As with the MessageBox.Show() method, ToUpper() has multiple versions that take different parameters. The simplest version assumes that it should use the culture (the preferred term since it encompasses more than just language, extending to areas such as currency symbols) that the computer is configured for. This is normally right, but not in the case of comparing uppercased strings; passing InvariantCulture as a parameter to ToUpper() tells it to uppercase in a way that is guaranteed to be the same on all computers (here’s an insider tip: the secret is “always do it like they do in English”; it doesn’t have to be politically correct, just consistent).
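For anyone who wants to try the fix, here is a small self-contained sketch. In the actual .NET library the value is spelled CultureInfo.InvariantCulture and lives in the System.Globalization namespace. The class and method names below are mine, and the second method shows one possible alternative (a case-insensitive ordinal search that skips uppercasing entirely) rather than anything this chapter depends on. The Main method sets the current culture to Turkish to simulate the user’s machine, again assuming Turkish culture data is installed.

```csharp
using System;
using System.Globalization;

public static class InvariantUpperDemo
{
    // The fix described above: uppercase with the invariant culture,
    // so the result is the same on every computer.
    public static bool MentionsJavaScript(string errorMessage)
    {
        string upper = errorMessage.ToUpper(CultureInfo.InvariantCulture);
        return upper.Contains("JAVASCRIPT");
    }

    // An alternative: do not uppercase at all, and ask for a
    // culture-independent, case-insensitive search instead.
    public static bool MentionsJavaScriptOrdinal(string errorMessage)
    {
        return errorMessage.IndexOf("JavaScript", StringComparison.OrdinalIgnoreCase) >= 0;
    }

    public static void Main()
    {
        // Pretend this program is running on a machine configured for Turkish.
        CultureInfo.CurrentCulture = new CultureInfo("tr-TR");

        string message = "JavaScript error on line 10";
        Console.WriteLine(MentionsJavaScript(message));        // True
        Console.WriteLine(MentionsJavaScriptOrdinal(message)); // True
    }
}
```

Neither method consults the computer’s culture setting, which is exactly the property we want when the comparison is against a fixed string.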
This works fine, but just like wanting MessageBox.Show() to display something other than an “OK” button, you have to know to do it. The parameterless “use current culture” version of ToUpper() is more convenient to call (with convenience defined as less typing by the programmer) and easier to discover, but its existence allows the calling code to be unaware of the notion of cultural differences in uppercasing, which is bad. The problem is not just that the default local versus invariant culture choice made by ToUpper() is a hidden choice made by somebody else; it’s that the existence of such a choice might be unknown to the caller, either because a programmer doesn’t know about cultures at all or they don’t realize that the “tell me which culture to use” version exists.
One of the tricky aspects of designing a method is deciding what to make a required parameter (which always has to be passed in, such as the title of the message box or string you want to uppercase) versus an optional parameter (the buttons to display in a message box or culture to use when uppercasing), the related question of what the default behavior should be if the optional parameters are not specified, and lastly, what is not even specified via a parameter (such as the font to use in a message box) and therefore gives the caller no choice at all.
As usual there is no one right answer, despite many brain waves having been expended on code reviews of these questions. Whatever decisions are made, ignorance of the default behavior of a method is a common problem. Viewed through that lens, the existence of the default-culture version of ToUpper() is not a convenience but rather a tragic mistake, source of needless bugs, and missed opportunity to educate programmers about regional differences—all for the sake of saving a little bit of typing! Default parameters, while generally considered a useful convenience, likely do more harm than good. They are essentially like allowing a bridge builder to say, “Give me some steel to build my bridge,” rather than requiring them to always specify the exact properties of the steel that they need. Programmers unknowingly think the simplest call is the right one—until they get an irate call from Ankara.
Brooks explained the difference between a program—“complete in itself, ready to be run by the author on the system on which it was developed”—and a programming systems product—“the intended product of most systems programming efforts.”18 To get from the former to the latter, you introduce complications in two dimensions. The first complication is going from a single author to a program that can be “run, tested, repaired, and extended by anybody.” The second complication is going from a single program to “a collection of interacting programs, coordinated in function and disciplined in format, so that the assemblage constitutes an entire facility for large tasks.”19
Back in high school and college, I was working on plain-old programs, but in industry I was working on programming systems products. As Brooks noted, the two big new complications in making this transition are communication between programmers across time and communication between components across API boundaries.20 These are two areas where my self-taught education had left me severely lacking.
Brooks added that “this then is programming, both a tar pit in which many efforts have foundered and a creative activity with joys and woes all its own.”21 Don’t programmers learn how to deal with these problems correctly? They may eventually. But given their self-taught beginnings, they tend to be focused on another aspect of their software, which I’ll discuss in the next chapter.