2 
The Education of a Programmer

Informed by knowledge of what is taught to prospective doctors, lawyers, or accountants, you could not be faulted for picturing my computer science education at Princeton as devoted to instruction on how to design software, reporting on experiments with different languages and methodologies, relaying tips for corralling elusive bugs and hard-to-pinpoint slowdowns, and in general having the faculty impart their combined wisdom on software engineering to the eager assemblage of students.

Before I get into details on why it wasn’t quite like that, I want to cover a bit of terminology. People who write software programs are called programmers. They are also called developers or software developers as well as software engineers, software development engineers, or sometimes software design engineers. I called myself a programmer when I was younger, while at Microsoft we were informally called developers (often shortened to “devs”), but my title was software development engineer. One of the questions in this book is whether the word engineer belongs there. But for now, consider them all to be interchangeable.

Meanwhile, people who go to college to study programming usually major in computer science—another two-word phrase that may not yet have earned its second word—but they may instead major in software engineering. Some claim there is a distinction between those two, with computer science more focused on theory and software engineering more concerned with the application of that theory, yet there is no agreement on the difference, or whether it exists at all, so treat those two as equivalent too.

In any case, off I went in 1984, home-brewed programming experience in hand, to Princeton University to study computer science. Princeton’s was a typical high-level computer science program: good facilities, smart students, and professors who were recognized as leading authorities in their areas of research. But the professors’ areas of research were primarily related to theoretical computer science, essentially the study of algorithms (Princeton had a reputation for being a bit more algorithm-focused than other schools,1 although in my observation of the graduates of various other schools, when I was later interviewing them for jobs at Microsoft, I did not notice any difference in the training they had received). There was only one class that centered on how to write software: the introductory computer science class that I took my first year, where we were taught a language called Pascal.

The big advance that Pascal had over early 1980s’ BASIC was that it supported passing parameters to its version of subroutines, which it called procedures. (Pascal made a somewhat-unnecessary distinction between procedures, which did not return a value, and functions, which did. I’ll use procedures here; both would fall under the more language-independent term API.) In addition, any variables declared inside a procedure were local variables, which means you didn’t need to worry about whether they had the same name as a variable declared outside the procedure. This made it possible to call a procedure written by somebody else without knowing the details of how it was implemented—the basis for building up code in layers. This is essentially impossible in IBM PC BASIC, whose subroutines are identified only by line number, take no parameters, and operate on variables that are all global.
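
To make the contrast concrete, here is a small illustration of my own (not something from the Princeton course) of a Pascal procedure that takes parameters and declares a local variable; the local average leaves the global variable of the same name untouched, which is exactly the guarantee that BASIC could not make:

var
    average: integer;        { a global, used elsewhere in the program }

procedure PrintAverage(total, count: integer);
var
    average: real;           { a local; it hides, but never touches, the global }
begin
    average := total / count;
    writeln('The average is ', average:5:2)
end;

begin
    average := 17;
    PrintAverage(30, 4);     { the caller passes parameters and never sees the local }
    writeln(average)         { still prints 17 }
end.

The caller never needs to know that the procedure uses a variable named average at all, which is what makes building up code in layers possible.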

My sister, who is a year older than me, also took a Pascal course in college. I once got into a debate with her about whether it was important to be able to define procedures with parameters, as Pascal allowed, or whether BASIC’s support for unnamed, unparameterized subroutines was sufficient. This latter position is hopelessly naive in retrospect, but nonetheless it was the side I chose. At this point my sister’s programming experience in Pascal consisted of the typical short assignments (sort an array of integers, etc.) that I had cut my teeth on when learning WATFIV, whereas I had written several games in BASIC of decent complexity (with what would be described today as 8-bit graphics, although since the IBM PC only supported 16 colors, they actually were 4-bit graphics). There is no doubt in retrospect that I was wrong and my sister was right. With sufficient care you could work around the problems of not having named subroutines and local variables, but they are such a convenience and avoid so many preventable mistakes that dismissing them is indefensible. In my defense, recall Dijkstra’s comment that BASIC programmers are “mentally mutilated beyond hope of regeneration.”2 Perhaps at that point my brain was so torn up inside from my intense exposure to BASIC that I couldn’t think straight and lost an argument to my sister.

In my introductory class at Princeton, we learned the basics of Pascal and wrote the simple programs used to learn a language. Our textbook, Programming in Pascal by Peter Grogono,3 did a thorough job of explaining the syntax of Pascal, without spending much time on what you should do with it, and I don’t recall the instructor getting into too many details either. As Harlan Mills, a noted writer on software topics who spent over twenty years managing teams at IBM, once wrote,

Our present programming courses are patterned along those of a “course in French Dictionary.” In such a course we study the dictionary and learn what the meanings of French words are in English (that corresponds to learning what PL/I or Fortran statements do to data). At the completion of such a course in French dictionary we then invite and exhort the graduates to go forth and write French poetry. Of course, the result is that some people can write French poetry and some not, but the skills critical to writing poetry were not learned in the course they just took in French dictionary.4

The trend in programming circles at that time was structured programming. Donald Knuth is a longtime professor of computer science best known for the multivolume opus The Art of Computer Programming, a comprehensive summary of software algorithms that he began working on in 1962. He wrote, “During the 1970s I was coerced like everybody else into adopting the ideas of structured programming, because I couldn’t bear to be found guilty of writing unstructured programs.”5 It’s likely that for any language then in use, there existed a book whose title contained the word structured followed by the name of the language. You could find Structured Programming Using PL/1 and SP/k (1975), Structured Programming in APL (1976), Programming in FORTRAN: Structured Programming with FORTRAN IV and FORTRAN 77 (1980), Structured COBOL: A Pragmatic Approach (1981), Problem Solving and Structured Programming in Pascal (1981), Structured Basic (1983), and so on.

Gerald Weinberg is another longtime observer of the software landscape who, among other things, worked for IBM on the software for Project Mercury, the US program to put an astronaut into space in the early 1960s. In his foreword to the 1976 book Structured Programming in APL (APL was another programming language, whose name is unrelated to the term API), Weinberg lays out the structured manifesto with his usual flourish:

APL has earned such a reputation for disorderly conduct that “structured APL” rings as off-key as “immaculate pigsty” or “honest politician.” Yet we must not blame the language for the disorderly conduct of its users—or misusers. In the hands of responsible and properly educated programmers, APL becomes a marvelously disciplined tool, a tool unlike any other programming language in common use.

The problem, of course, lies in the phrase “properly educated.” For too long, in too many places, APL users have learned the language “in the streets,” as any examination of their programs would show. Their textbooks are little more than reference manuals, and offer no corrective to the worst effects of the oral tradition.6

Learning “in the streets,” textbooks as “little more than reference manuals”—indeed! Weinberg was making the same point I am making in this book, forty-plus years later: most programmers are not properly educated in how to program, and it shows in their code.

It was unclear whether structured programming was a process—a structured approach to producing a program—or a result—a program that is structured, no matter how it got that way. Knuth’s and Weinberg’s quotes above make it sound like the second. I concur; in the end, the code is what remains, and it is what determines how quickly a new programmer can figure out how a program works. Nonetheless, the literature, while clearly enamored of the term, varied widely on what structured programming really was.

Structured COBOL: A Pragmatic Approach gets to page ninety-five before providing a brief section on structured programming, explaining, “This is the first mention of the term structured programming, although every program presented so far has been ‘structured.’ Structured programming is the discipline of making a program’s logic easy to follow. This is accomplished by limiting a program to three basic logic structures: sequence, selection, and iteration.”7 The book then presents flowcharts for each of those three logic structures. Sequence here just means “one program statement following another”; selection means IF statements and the resulting choice from those; and iteration is loops in all their various forms.

Structured BASIC devotes one six-page chapter to structured programming, about halfway through the book; the chapter starts out by stating, “Structured programming is an additional approach to program development that usually results in more efficient code, less time spent on development, program logic that is easier to follow, and a resultant program that is easier to debug and modify.”8 It’s hard to argue against that, but the approach that the authors present is a mix of “structure charts,” which are a visual representation of the different parts of a program, plus flowcharts that indicate the same three basic concepts in programs (sequence, selection, and iteration), so clearly the authors view structured programming as “a structured approach to producing programs.” They do briefly mention, at the end of the chapter, that comments can be helpful, and that indenting IF and FOR blocks can help with readability—the only nod toward structuring of the actual code.9

Meanwhile, Structured Programming in APL, despite Weinberg’s rousing introduction, waits until the epilogue before devoting two and a half pages to the topic of structured programming (to be fair, the book does use structure diagrams extensively), starting out with this: “Perhaps, in closing, we should mention something about the mysterious phrase ‘structured programming,’ which appears in the title, but nowhere else. At the time the book is being written there is still some controversy about exactly what structured programming is. But there is no disagreement about the fact that it is valuable.”10 What follows is a hand-wavy definition that encompasses the differences between engineering a bridge and engineering software, the fact that software is often modified from its original purpose, structure charts and the sequence-selection-iteration trinity, the importance of design, and the right length for a program; it also includes the sentence “there is still controversy over whether to use names (for variables and labels and programs) that are meaningful or meaningless.”11

This is all fine, but it’s incredibly basic: all programs in high-level languages, however structured they claim to be, consist of sequences of instructions, selection by IF statements (or their equivalent), and iteration in loops. At the bottom level, those are the bricks from which software is built. If this is structured programming, it’s hard to imagine what territory is left for unstructured programming to claim.
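
To see just how basic, here is a fragment of my own in Pascal that contains all three (assume the variables are integers declared elsewhere):

read(x);                     { sequence: one statement simply follows another }
total := total + x;

if x > 100 then              { selection: an IF chooses between two paths }
    big := big + 1
else
    small := small + 1;

while x <> 0 do              { iteration: a loop repeats a block of statements }
begin
    total := total + x;
    read(x)
end;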

Structured Programming Using PL/1 and SP/k probably provides the best summary:

Certain phrases get to be popular at certain times; they are fashionable. The phrase, “structured programming” is one that has become fashionable recently [the book came out in 1975]. It is used to describe both a number of techniques for writing programs as well as a more general methodology. … The goals of structured programming are, first, to get the job done. This deals with how to get the job done and how to get it done correctly. The second goal is concerned with having it done so that other people can see how it is done, both for their education and in case those other people later have to make changes in the original program.12

The authors also offer a diplomatic warning about GOTO statements (which are written as GO TO in PL/I): “Since computer scientists came to recognize the importance of proper structuring in a program, the freedom offered by the GO TO statement has been recognized as not in keeping with the idea of structures in control flow. For this reason we will never use it.”13

On a related note, at the end of its discussion of structured programming, Structured COBOL offers the following:

Conspicuous by its absence in Figure 6.1 [which lays out flowcharts for sequence, selection, and iteration] is the GO TO statement. This is not to say that structured programming is synonymous with ‘GO TO less’ programming, nor is the goal of structured programming merely the removal of all GO TO statements. The discipline aims at making programs understandable, which in turn mandates the elimination of indiscriminate page turning brought on by abundant use of GO TO. … [U]nstructured programs often consist of 10% GO TO statements.14

Now we are getting somewhere, and I think the authors doth protest too much: when you boil down the difference between structured and unstructured programming, what remains is getting rid of GOTO.

What is so bad about GOTO?

The ever-voluble Dijkstra wrote a letter to Communications of the ACM in 1968 titled “Go To Statement Considered Harmful.” The letter’s title sounds suitably Dijkstra-esque, but he later claimed that it was provided by Niklaus Wirth, the inventor of Pascal, who was the editor of the magazine at the time; Dijkstra’s original title was the less provocative “A Case against the Go To Statement.”15 The letter begins,

For a number of years I have been familiar with the observation that the quality of programmers is a decreasing function of the density of go to statements in the programs they produce. More recently I discovered why the use of the go to statement has such disastrous effects, and I became convinced that the go to statement should be abolished from all “higher level” programming languages (i.e., everything except, perhaps, plain machine code).16

Dijkstra’s insight is that a source code listing is static, but what we care about, when figuring out what a program does and if it is correct, is the state of the computer while executing it (what he calls the “process”), which is dynamic. He points out that

our intellectual powers are rather geared to master static relations and that our powers to visualize processes evolving in time are relatively poorly developed. For that reason we should do (as wise programmers aware of our limitations) our utmost to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.17

In other words, while reading the code it should be as easy as possible to keep track in your mind of the state of all the variables and what they mean as the computer is executing a given line of code. Dijkstra then explains that with sequencing, selection, and iteration, it is relatively easy to figure out the state of the process at any point in the code. But when you allow the code to jump arbitrarily to any other location via a GOTO, it is hard to know the state of the process as it passes through the targeted location: that location can now be reached in multiple ways, and you can’t know what state the process was in at each of the places from which it might have made the jump. Dijkstra concludes, “The go to statement as it stands is just too primitive; it is too much an invitation to make a mess of one's program.”18
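
A contrived sketch of my own, in Pascal, shows the problem: once a label can be reached from several unrelated places, what the variables hold when you arrive there depends on which jump delivered you, and the code at the label can no longer be understood on its own.

label 99;
var total, x: integer;
begin
    read(x);
    if x < 0 then goto 99;         { one way to arrive at label 99 }
    total := x * 2;
    if total > 1000 then goto 99;  { a second, unrelated way to arrive there }
    total := total + 1;
99: writeln(total)                 { what does total hold here? it depends on how we got here }
end.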

Mills makes a similar observation: “In block-structured programming languages, such as Algol or PL/I, such structured programs can be GO TO–free and can be read sequentially without mentally jumping from point to point.” (He continues, though, by repeating the official story that “in a deeper sense the GO TO–free property is superficial. Structured programs should be characterized not simply by the absence of GO TO’s, but by the presence of structure.”)19

The argument against GOTO was bolstered by an academic paper laying out the Böhm-Jacopini theorem, which proved that any program could be written without GOTO statements.20 The proof relied on a somewhat-contorted programming style; in particular, you wound up using extra variables to avoid certain GOTO statements. While inside a loop, it is frequently useful to exit the loop early, before you have finished every planned iteration. An example would be code to look for something in an array, here presented in the language C# (pronounced “C sharp”). The first line is equivalent to the FOR J = 1 TO 10 loop that we saw in our BASIC illustration in the last chapter, but rewritten in C# (and looping from 0 to 9 instead of 1 to 10):

for (j = 0; j < 10; j++) {
     // is the j’th element of the array
     // the one we want?
     if (this_is_the_one(j)) {
          // if it is, then we can exit the loop
          goto endloop;
     }
}
endloop:

The GOTO statement jumps to the named label (endloop), avoiding unnecessary iterations through the loop, and at the end the value of j tells you which element of the array you want. Without this GOTO, per the formal Böhm-Jacopini theorem, you need to add an extra variable to short-circuit the unneeded array iterations, and another one to keep track of where it was found:

foundit = false;
foundlocation = 0;
for (j = 0; j < 10; j++) {
     if (!foundit) {
          if (this_is_the_one(j)) {
               foundit = true;
               foundlocation = j;
          }
     }
}

This is cheesy and harder to read than the first version with the GOTO statement. In fact, many languages (including C#, for those of you grinding your teeth at the code above) have a statement called BREAK that executes “exit the loop now” without needing the label (and the shunned GOTO statement), making this much cleaner:

for (j = 0; j < 10; j++) {
    // is the j’th element of the array
    // the one we want?
    if (this_is_the_one(j)) {
         // if it is, then we can exit the loop
         break;
    }
}

Nonetheless, some languages consider the BREAK statement (and related CONTINUE statement, which skips only the remainder of the current iteration of the loop, not all future iterations) to be a GOTO in disguise and don’t allow it. Pascal, back in those days, was one of those purist languages, with Böhm-Jacopini available as justification; Programming in Pascal shows an example of skipping unneeded loop iterations using an extra variable (which it calls a state variable), then the same example rewritten with a GOTO, and decides that the state variable is better: “Although this example illustrates the effect of the goto statement, it does not justify the use of the goto statement.”21
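
I’m not reproducing Grogono’s example here, but the state-variable idiom looks roughly like this (ThisIsTheOne is a hypothetical stand-in for the this_is_the_one test from the C# versions above):

found := false;
i := 1;
while (i <= 10) and not found do    { the state variable, not a BREAK, ends the loop early }
begin
    if ThisIsTheOne(i) then
        found := true               { record the fact and let the loop condition notice it }
    else
        i := i + 1
end;
{ afterward, if found is true, i is the index we were looking for }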

Personally I find code with a BREAK in it (or even a GOTO to a clearly labeled “loop end” label) much easier to read than code that adds an extra variable to avoid using it. The problem is not so much these “nearby” GOTOs but rather the more indiscriminate use in which the program jumps all over the place, such as you see in the BASIC Computer Games books or DONKEY.BAS. (As an extra bonus, in BASIC, if you had a subroutine starting at, say, line 700, but no GOTO at line 690 that skipped past the subroutine, the BASIC interpreter would roll right into line 700 and start executing your subroutine, with whatever state the relevant variables happened to be in; one of the benefits of Pascal and other languages that formally declare procedures is that they avoid this problem, since the procedure code is not part of the main code path.)22 This GOTO-laden style of programming was derided as “spaghetti code” because trying to follow the path taken through the code was like trying to follow a single strand of spaghetti in a bowl; it would disappear into unknown places and then reappear somewhere else, with no clarity on what exactly happened in between or even whether you were following the same strand. Kemeny and Kurtz, the inventors of BASIC, acknowledged that the effect of every line of code having a line number, which therefore made every line of code available as the potential target of a GOTO, was the “one very serious mistake” they made in the design of the language.23

If GOTO statements are so terrible, you might wonder why people used them in languages where they were not necessary. The authors of Fortran with Style explained in 1978: “The unconditional transfer of control, which is the function of the GOTO statement, has been associated with programming since its inception. Its historical ties have left indelible marks on today’s programming languages.”24 While high-level languages are built up from selection and iteration (IFs and loops), assembly language has lower-level building blocks, which you may recall from the last chapter: moving data between registers and memory, performing operations on registers, comparing registers, and jumping to other locations in the program. That “jumping to other locations in the program” is a GOTO (although the term jump is usually used in assembly language), and a higher-level-language construct like an IF is built up using jumps in assembly language; when reading code in a higher-level language, your eye will automatically slide down past the block of code that follows an IF test, but in assembly language you have to explicitly jump past it.
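
Since real assembly language would be tied to a particular processor, here is a sketch of my own of the two forms in Pascal, with a goto standing in for the machine-level jump:

{ the high-level form: when the test fails, your eye just slides past the block }
if x > 0 then
begin
    total := total + x;
    count := count + 1
end;

{ the shape of the same logic in assembly language: test the condition,
  then explicitly jump past the block when it does not hold }
if not (x > 0) then goto 10;
total := total + x;
count := count + 1;
10: writeln(total)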

As Mills explains in his 1969 essay “The Case against GO TO Statements in PL/I” (whose opinion on GOTO can be accurately inferred from the title), for a programmer coming from assembly languages, jumps are a natural thing, and you can’t write any reasonable program without them.25 It makes sense that assembly-language programmers, when moving to a higher-level language, would not differentiate jumps done in the context of selection and iteration, which arguably are still “structured,” from jumps to random places in the program. Mills exhorts his audience, “It might not be obvious … that GO TO’s could be eliminated in everyday PL/I programming without its being excessively awkward or redundant. But some experience and trying soon uncover the fact that it is quite easy to do; in fact, the most difficult thing is to simply decide to do it in the first place.”26

He also states, “It is not possible to program in a sensible way without GO TO’s in FORTRAN or COBOL. But it is possible in ALGOL or PL/I.”27 Our “sum up the numbers” Fortran program from chapter 1 had two GO TO statements for a very simple algorithm. In the more modern language Pascal, this could be written without them as

var sum, x: integer;
begin
    sum := 0;
    repeat
        read(x);
        sum := sum + x
    until x = 0;
    writeln(sum);
end.

What about the books that claimed to teach structured programming in Fortran and COBOL?

Programming in FORTRAN: Structured Programming with FORTRAN IV and FORTRAN 77 describes the well-known troika, with some names changed: “Three basic control structures are sequence (begin-end), decision (if-then-else) and loop (while-do). These, sufficient to present any algorithm, constitute the fundamental means of a systematic programming process called structured programming.”28 But wait! Those are the theoretical constructs; Fortran doesn’t actually have a while-do loop, so the book explains how to write one using GOTO statements.29 The begin-end control structure merely involves putting statements one after the other, so that subject is never broached again. For if-then-else, later versions of Fortran do provide reasonable support, whereby you can have a set of lines of code that runs when the IF condition is true, and another set that runs when it is false, which is known as supporting block IFs. But earlier versions only let you run one statement when the IF condition was true, so that statement was perforce often a GOTO (as our “sum up the numbers” Fortran program did). The book points out, “The old FORTRAN standard did not contain the block IF construct; thus it is not available in FORTRAN IV or in such compilers as WATFOR and WATFIV.”30 Yes, I must confess that the first programming book I ever read was for a language variant so antediluvian that it didn’t even have block IF statements.
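
The shape the book describes is easy to see if you take the Pascal summing program above and spell its loop out with GOTOs, the way a language with no real loop construct forces you to (my sketch, in Pascal syntax rather than the book’s Fortran):

label 10, 20;
var sum, x: integer;
begin
    sum := 0;
    read(x);
10: if x = 0 then goto 20;    { the loop test, written out by hand at the top }
    sum := sum + x;
    read(x);
    goto 10;                  { jump back up to the test }
20: writeln(sum)
end.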

Meanwhile, Structured COBOL: A Pragmatic Approach has it slightly easier. COBOL does have block IF statements, and a loop construct called PERFORM UNTIL, although it requires you to put the body of the loop into a separate procedure, which makes the code harder to read,31 and COBOL suffers from so much other clunkiness that I can see why Mills threw it under the “not structured” bus; you can’t program in a “sensible way” in COBOL no matter what parts of the language you use. Nevertheless, the authors can get away without using GOTO statements, except in one specific case: when a program wants to exit, they use a GOTO to jump to the end of the program. As they write about one program listing, “Figure 11.2 also contains five ‘villainous’ GO TO statements, but their use is completely acceptable (to us, if not to the most rigid advocate of structured programming).” They further explain, “The authors maintain that a structured program can include limited use of the GO TO, provided it is a forward branch to an EXIT paragraph” (this must be the pragmatism of the book’s subtitle manifesting itself).32 I agree; if you are reading through the program and get to the point where it is about to exit, there is no mental model that needs to be maintained. My copy of this book used to belong to my mother, from a programming class she took in 1983. In the margins of the book, next to the first quote, she wrote, “Hear hear,” and next to the second, “God will forgive you!” Knuth wrote in favor of allowing GOTO for “error exits,” and even Dijkstra, in his anti-GOTO screed, conceded that “abortion clauses” or “alarm exits,” meaning this same sort of “jump to the end of a block of code” approach, might be acceptable.33

Anyway, let’s say that structured programming just means “use GOTO as little as possible.” As my hopeless argument with my sister shows, even this lesson, which in hindsight seems obvious, clearly needed to be drilled into habitués of Fortran, COBOL, or BASIC. I suppose I could say that I learned structured programming at Princeton in that I did absorb the lesson about eschewing GOTO. But that’s probably one of the few lessons I was explicitly taught there. Because in high school I had succeeded in teaching myself programming and accomplishing reasonable results with it, I was extremely confident that the way I had learned things was correct, despite having no real basis for this claim beyond my own experience.

The logician Raymond Smullyan proposes in his book What Is the Name of This Book? that people are either conceited or inconsistent:

A human brain is but a finite machine, therefore there are only finitely many propositions which you believe. Let us label these propositions p1, p2, …, pn, where n is the number of propositions you believe. So you believe each of these propositions p1, p2, …, pn. Yet, unless you are conceited, you know that you sometimes make mistakes, hence not everything you believe is true. Therefore, if you are not conceited, you know that at least one of the propositions p1, p2, …, pn is false. Yet you believe each of the propositions p1, p2, …, pn. This is straight inconsistency.34

Smullyan’s point was that a reasonably modest person is behaving inconsistently, which he happily admits to; when it comes to programmers, however, the conceited approach usually wins out.

Once my introductory Pascal class was over, and I had learned to appreciate the value of passing parameters to named procedures, the rest of the undergraduate classes that I took dealt with more specific topics: how to design a compiler, how a virtual memory manager worked, and how three-dimensional graphics were projected onto a two-dimensional display—all interesting, but those classes all focused on the specific algorithms needed for those problems, and since I have never worked on those areas in my professional career, it’s not knowledge that I draw on in my everyday work. Nobody taught us how to design large programs and get them working on a deadline. We were given assignments that required large programs along with a deadline to get them working, and we made it happen as best we could.

Sophomore year was the first time I took a class where I used the programming language called C, which I wound up using for most of my college and professional career. After the professor explained the goals of the first assignment, a student rather hesitantly raised their hand and asked how we were supposed to learn C. No problem, said the professor, use this book (it was the original The C Programming Language, written by the language’s inventors, Brian Kernighan and Dennis Ritchie). I learned C by reading the book, looking at examples to try to discern their underlying motivation, and most important, trying things and fixing them when they didn’t work—the same process I had used to learn IBM PC BASIC four years earlier.

As Mills put it, the book taught me the dictionary. The rest of it, the bulk of what I learned about software engineering—how to split a big problem into smaller ones, how to connect the pieces together, how to figure out why it didn’t work, and how to decide when it was finished—I figured out on my own by trial and error, and everybody else in the class figured it out on their own with their own trials and errors.

And as a final point, I had only one class that involved modifying a program that somebody else had written; the rest of my projects were all greenfield ones, in which I started from scratch. Modifying existing code is what a professional programmer spends the vast majority of their time doing, but my work at school gave me little preparation for sitting down with a large program and figuring out what the heck the original author was thinking—or if they had done something right, using their code as an example.

I have some code saved from my time at Princeton (what, you say you don’t have thirty-year-old printouts stashed in your attic?), which I can look at now through the lens of the intervening time spent earning a living as a programmer. It’s what I would expect—a projection of my BASIC experience onto C (except without GOTOs): short variable names that don’t clarify their meaning, no comments to explain what is going on or delineate different areas of the program, and repeated code that should have been pulled into a shared function (which is the term C uses for an API). I assume the code worked, although I would be hard put to verify that now by reading it. It served its purpose: to procure a grade for a class assignment and then never be looked at again.

How are all these stories about my education related? The common theme is that in all cases, I was self-taught. In high school I was fairly evidently learning on my own. But even my Princeton years are deceptive. A casual observer would note that I was taking a lot of computer science classes and I was learning a lot about how to write software. The second was a by-product of the first, though, and not a direct result. What was missing was anybody explaining what I should do before I did it wrong a couple of times, or anybody looking over the details of how I had written a program as opposed to the result that it achieved. Despite graduating with a degree in computer science, I was sorely lacking in the wisdom that I would eventually acquire, through experience, during my career as a programmer.

And it’s not just me: essentially all programmers working today were self-taught. The people who designed the Internet were self-taught, those who architected Windows were self-taught, and the people who wrote the software that is running on your microwave oven were self-taught too.

What does this mean for software engineering? The most obvious issue is that it is incredibly wasteful to have everybody figure things out from square one (or perhaps squares two and three), over and over and over again. The notion of experimenting and using the results to inform further steps, building up an engineering process on the work of those who have gone before, is almost completely absent from software development. We’re not standing on the shoulders of giants; at best they are offering us a knee up as a boost. The instructor Scott Bain, who teaches at a company called NetObjectives that offered training at Microsoft, once pointed out that there is no well-defined path to becoming a software engineer: you don’t go to college and major in a certain subject, then take a set of well-determined certification tests, then do an apprenticeship, and then become certified. You can do the college major thing, but once that’s done you hang up your shingle and say you are a programmer, and hope a company fishes you out of the ocean of similar people. And worse yet, it may be that your first year in college is already too late to start down the path, if you haven’t spent the last couple years in high school hacking away in your basement.

That makes it hard for people who want to hire programmers (either to work at a software shop like Microsoft or as consultants on a project for a business) to figure out who is qualified. But the subtler effect is that it can scare off people who are considering becoming programmers. How do you embark on the path if it isn’t well defined? Do you have to devote yourself at a young age to poring over programming manuals in your spare time? If you weren’t a member of the programming club in high school, are you permanently behind?

And the largest, most important group that it scares off is women.

In 2002, a fascinating book appeared: Unlocking the Clubhouse by Jane Margolis and Allan Fisher. The book studied students in Carnegie Mellon University’s highly respected computer science program as a basis for understanding why women are underrepresented in the industry. Although the female computer science students all appeared highly qualified, and arrived at college motivated and confident, many of them soon experienced a similar sense of inferiority. Here is a sample of quotes from students:

Then I got here and just felt so incredibly overwhelmed by the other people in the program (mostly guys, yes) that I began to lose interest in coding because really, whenever I sat down to program there would be tons of people around going, “My God, this is so easy. Why have you been working on it for two days, when I finished it in five hours?”

I’m actually kind of discouraged now. Like I said before, there are so many other people who know so much more than me, and they’re not even in computer science. I was talking to this one kid, and … oh my God! He knew more than I do. It was so … humiliating kind of, you know?

What am I doing here? So many other people know so much more than me, and this just comes so easy to some people. … It’s just like there are so many people that are so good at this, without even trying. Why am I here? … You know, someone who doesn’t really know what she is doing?35

I don’t think the men playing the role of “other people” in these quotes had an innate ability to write software; it’s that they had been practicing much longer than the women (which doesn’t excuse them for making fun of someone for taking more time to finish a program). I’ve discussed this topic with female computer science students and heard similar comments—in fact, almost stunningly similar: the same discouraging sense that the people who had been hacking away in high school knew so much more, and were so much more capable and prepared for future success (and as a former high school hacker who majored in computer science, I was inadvertently complicit in creating the equivalent environment at Princeton). The sameness of the quotes might suggest a glimmer of hope that at least these women could find solace and support in each other, and wage a determined battle against the propeller-heads. But it appears that the mental anguish was a lonely, internal battle, with each individual mind beset by nagging questions that caused these intelligent, motivated, capable women to repeatedly doubt their abilities, until one by one they dropped the fight and majored in another subject.

The underlying problem begins at an early age, according to Margolis and Fisher: “Very early in life, computing is claimed as male territory. At each step from early childhood through college, computing is … actively claimed as ‘guy stuff’ by boys and men. … The claiming is largely the work of a culture and society that links interest and success with computers to boys and men.” They write,

Despite the rapid changes in technology and some fifteen years of literature covering the era of the ubiquitous personal computer, a remarkably consistent picture emerges: more boys than girls experience an early passionate attachment to computers, whereas for most girls attachment is muted and is “one interest among many.” … Developing and exploring the computer are truly epiphanies for many of these male students. They start programming early. They develop a sense of familiarity; they tinker on the outside and on the inside, and they develop a sense of mastery over the machine.36

In addition, “Girls at nine and ten are feisty, filled with spirit and confidence, but as puberty hits, they begin to pull within themselves, doubt themselves, swallow their own voices, and doubt their own thoughts.”37 The problem is exacerbated in the years leading up to college: “In secondary schools across the nation, a repeated pattern plays out: a further increase in boys’ confidence, status and expertise in computing and a decline in the interest and confidence of girls. Curriculum, computer games, adolescent culture, friendship patterns, peer relations, and identity questions such as ‘who am I?’ and ‘what am I good at?’ compound this issue.”38

Computer science is not the only field in which women may receive societal messages that steer them away during high school. And certainly there are areas, particularly sports, where success as a professional almost always requires dedicated commitment and interest in high school, if not earlier. Yet computer science packs a one-two punch because, currently, it can be self-taught at a young age: women are losing interest in the field at precisely the same time that men are not only cultivating their interest but also learning the actual skills that propel them to a successful career, which makes it much harder to catch up in college.

In my 1988 graduating class at Princeton, according to the alumni directory, five out of forty-one computer science majors were women.39 I know one of the women had not programmed before arriving on campus, although most if not all of the others had experience similar to mine. It’s a small sample size, but more important, those are the five who stuck it out until the end; I don’t know how many other women started down the path to major in computer science but then changed their minds after the types of dispiriting experiences described in Unlocking the Clubhouse. There may also have been male computer science majors or former computer science majors who were new to programming when they arrived at the school, although every male whom I can recall discussing the topic with had been writing programs in high school.

During the time I was at Princeton, there was a campus computer network, and the school attempted an early experiment at extending it into dorm rooms. The only problem was that a year elapsed between when the administrators asked who was interested and when they ran the network cables, so you only had a network connection if the person who had lived in your room the year before had requested one. Even then, what was available on the network was quite limited; there were no websites (the World Wide Web and associated protocols not having been invented yet), so you could only connect to a few mainframe computers in an updated version of “Adam in his parents’ bedroom with the line terminal.” And even the people who had computers in their rooms had PCs, which were different from the computers on which we could work on our assignments. So for this variety of reasons, all programming for my classes was done in a computer lab in a building named after John von Neumann, the famous mathematician who had worked at the Institute for Advanced Study near the Princeton campus.

“The Neum,” as we called it (rhymes with … nothing much, but the vowel sound is the same as the word “boy”), was a below-ground bunker, whose roof, according to legend, had a half-inch-thick slab of iron embedded in it to prevent enemy powers from spying on the computers inside. I’d crank out my programming assignments during nighttime coding jags, fortified by a gallon of Wawa iced tea and a foot-long bacon cheesesteak from Hoagie Haven (which closed at midnight, so you had to plan ahead to lay in your provisions). Inside were long tables with computers (video terminals, actually)—a precursor to the open workspace that many software companies use today on the pretext of better information sharing, but in this case it was just the easiest way to arrange them.

Working elbow to elbow with my fellow grunts should have at least given us an opportunity to learn from each other, and possibly given the few women in the class a chance to support each other, but I don’t remember this happening; we mostly coded away in grim, solitary silence (one classmate had a girlfriend who would sit quietly next to him while he worked, which must have been somewhat boring for her and slightly nerve-racking for him). Even when I worked on a project with a partner, we usually split the work up and tackled it independently. So the benefit I got from Princeton was not the learning from my peers that was so helpful in other classes. It was that I was forced to write a lot of programs, giving me ample opportunity to figure out how to write them, track down problems, and fix them, all on my own. I had a job working at the computer center, where we would staff various locations to answer questions, yet it was well known that working at von Neumann was an easy shift because nobody ever asked any questions. Many of us had been self-taught in high school and continued to teach ourselves in college.

This is ridiculous, right? The world depends on software, but are software skills really gained in marathon programming sessions as a teenager? In trying to imagine a similar situation in the field of medicine, I picture a student at a medieval medical school writing a letter home, complaining, “Everybody else is so much better at using leeches than I am … and this one kid! Back home he’s already performed three pocketknife amputations!” Indeed, it was common in the United States two hundred years ago for doctors to learn their trade by apprenticing themselves to an established doctor, but eventually the need for formal education was recognized.40 The fact that today’s programmers can on their own acquire such a head start in knowledge—not useless knowledge, but the same knowledge that was being learned (by a process independent from formal instruction) in school—is an indictment of the discipline of software engineering, not of the inexperienced students.

Several of my classmates at Princeton wound up working at Microsoft too; I once had a discussion with one of them, who had also written software in his spare time in high school, about how we really should have skipped Princeton and gone to work for Microsoft directly after high school. We were kidding, but there was a large nugget of truth in there. Assuming Microsoft would have hired us back then, we would have emerged in 1988 with a lot more experience (and money) than we did after college, and most important, we weren’t much less qualified in 1984 than we were in 1988. This is partly related to a onetime historical window, because in 1984 there weren’t a lot of people out there who had several years of experience programming on the IBM PC. But the primary reason is that as preparation for developing large pieces of commercial software, writing video games in BASIC was almost as useful as going to college and majoring in computer science. And certainly by 1988, four years of working at Microsoft would have left us far more qualified than if we’d spent the last four years earning computer science degrees. While it would be unthinkable today for a doctor, say, to skip medical school and go straight to practicing medicine, no such gap exists for programmers. Microsoft would occasionally hire a developer who had majored in something like music and watch them be as successful as the computer science majors, which was impressive for them personally, but a little strange if you think about it.

Yet this is roughly where we stand with software education today. In 2011, George Washington University professor David Alan Grier wrote in IEEE Computer magazine, “It isn’t necessary to have a bachelor of science degree to be considered a software engineer. According to the Bureau of Labor and Statistics, a software engineer is the leader of a programming or system development project, not necessarily a trained engineer. … Only in some cases, notably the most restrictive professions, does it consider ‘skills, education and/or training needed to perform the work at a competent level,’” with software engineer apparently not making the restrictive professions list. Grier then concludes, “Those who can do the work, no matter how they may have been trained, can generally find work.”41 People can gain a big leg up on their college computer science education by learning in their spare time, on their own. And unfortunately, this knowledge is often acquired at a stage of people’s lives where women, for whatever reason, tend to be less into computers than men. And excluding half the planet from your talent pipeline certainly affects your ability to hire all the qualified people you want.

All this raises the question, Why has the software industry continued to operate this way?

The software industry has evolved in just a couple of generations, leaving little time to reflect on how things are done. As Mills, writing in 1976, again notes,

In the past twenty-five years a whole new data processing industry has exploded into a critical role in business and government. Every enterprise or agency in the nation of any size, without exception, now depends on data processing software and hardware in an indispensable way. In a single human generation, several hardware generations have emerged, each with remarkable improvements in function, size, and speed. But there are significant growing pains in the software which connects this marvelous hardware with the data processing operations of business and government.

Had this hardware development been spaced out over 125 years, rather than just 25 years, a different history would have resulted. For example, just imagine the opportunity for orderly industrial development with five human generations of university curriculum development, education, feedback for the expansion of useful methodologies and pruning of less useful topics, etc. As it is, we see a major industry with minimal technical roots.42

More important, however the software sausage is made, it is so incredibly useful that there has not been much pressure to improve how things are done. In an environment where customers are clamoring, “Give me more of that sweet, sweet software!” the industry has no real incentive to step back and rethink its methods.

Fundamentally, people in the software industry see nothing wrong with being self-taught because, hey, it worked for them. Weinberg once wrote,

Another essential personality factor in programming is at least a small dose of humility. Without humility, a programmer is foredoomed to the classic pattern of Greek drama: success leading to overconfidence (hubris) leading to blind self-destruction. Sophocles himself could not have invented a better plot (to reveal the inadequacy of our powers) than that of the programmer learning a few simple techniques, feeling that he is an expert, and then being crushed by the irresistible power of the computer.43

Unfortunately, humility is not something that programmers tend to have thrust on them. Which brings us to the real problem with programmers being self-taught: it makes them arrogant.

And why not? By dint of sheer brainpower, without ever having to go through an apprenticeship, pay their dues, submit to any standardized certification, or even get a relevant college degree, programmers have arrived at a situation where they can be paid large sums of money to pursue an activity that many of them would do in their spare time anyway, in an environment that entails no undue physical exertion or risk. What better validation could there be of their own greatness?

We’ll keep this thought in mind as we dig into what it’s like to work as a professional programmer.

Notes