10 
The Golden Age

If you’re like me, you dream of a day when software engineering is studied in a thoughtful, methodical way, and the guidance given to programmers sits atop a foundation of experimental results rather than the shifting sands of individual experience. Perhaps with a time machine, it would be possible to travel into the future and live in such a world.

Somewhat surprisingly, there is another way. It still requires a time machine, but you would point it in the opposite direction, toward the past. About forty-five years in the past, to be precise.

After alighting in the early 1970s and locating the nearest computer bookstore, you would discover that you were in the middle of a fertile time for software engineering research. Books from that era wrestle with every problem that confronts us today, despite the fact that this period predates almost every piece of software still running. The first work on UNIX began in 1969; C was invented in 1971. Essentially everything that came before—mainframe systems running programs written in languages like COBOL and Fortran—has been replaced, with the Y2K crisis providing the final nail in many coffins. Since the software from that era is functionally obsolete, it is tempting to dismiss research from the same time period as equally outdated.

This would be a mistake. Today we have faster hardware, more expressive programming languages, and better debugging tools. But if you read the old books, it is clear that the fundamental issues have not changed. People need to learn to program, they write a lot of code, it doesn’t integrate with other code, debugging it is hard, new programmers don’t understand it, and so on. The software wasn’t as complicated as the largest programs today, but the languages and tools were also more primitive, so the cognitive demand on programmers was presumably about the same.

What is different is that back then, there was a group of people in academia and industry taking a systematic approach to identifying the problems and figuring out how to solve them. This was the period just after the NATO conferences; the paint was still drying on the term software engineering, and the discipline was being investigated the way other engineering disciplines had been investigated before it.

Consider the 1971 book Debugging Techniques in Large Systems.1 The title would attract interest today: we work on large software systems and have to debug them. The book is not a monograph by one person; it’s a collection of papers pulled from a conference held in summer 1970 at the Courant Institute of Mathematical Sciences at New York University—the first in a planned annual series. The participants were from both the academic and industrial communities (IBM figures heavily in the industry representation), and the conference was supported by a grant from the mathematics program of the Office of Naval Research.

I own a reasonably complete collection of modern books on debugging, which I pulled together when I wrote Find the Bug, my contribution to the corpus. But none of those books is the result of a symposium; they are all in the “here are some things I figured out while working with code” vein that is so prevalent in modern software books. It’s not that the topics have changed much in the intervening decades. Debugging Techniques in Large Systems covers compilers that can catch bugs, how to design software to reduce errors, better debugger tools, how software should be tested to reliably find bugs, and the ever-elusive issue of proving program correctness.

Ironically, a lot of the advice exchanged back then was much harder to take advantage of than it would be today, because everybody was using different computer systems that ran incompatible software. It wasn’t a question of being able to come home from a conference with a new tool that you could use immediately; instead you would have an idea of how you could improve the tools available on your own system, if you chose to undertake the task. Nonetheless, there was great interest in sharing knowledge for the sake of advancing the software engineering discipline.

Unfortunately, the excitement didn’t last. When I started college in 1984, Debugging Techniques in Large Systems was only thirteen years old, but for whatever reason I was never exposed to it or to any other book about software from this era, except for the original The C Programming Language reference.

I had barely heard of the software researchers who were active back in the day. I knew the sound bite version: Brooks was known for saying that “adding manpower to a late software project makes it later,” and Dijkstra was the guy who said “GOTO statement considered harmful.” I was oblivious to how much of what they had written about software was still relevant and would continue to be. Those who ignore history, as they say, are doomed to repeat it. If you actually read Brooks’s The Mythical Man-Month, from which I have quoted extensively in this book, he writes about documentation, communication, roles on teams, estimation difficulties, scalability of teams, code comments, and cost of code size—all things that the industry has struggled with ever since. The book came out in 1975, the year that Microsoft was founded—and yet we knew nothing of it!

Then there is Mills, one of the greats of the era whom I had never heard of until I started doing research for this book. Reading through Software Productivity, a collection of Mills’s essays written between 1968 and 1981, you are treated to a preview of almost everything that has been debated about software since then: the different roles in software, how to design software, how to test it, how to debug it, unit testing, documentation, and so on. Mills was also a bomber pilot in World War II and created the first National Football League scheduling algorithm. To be fair, he was widely read at the time and had a successful career at IBM. After he died in 1996, the IEEE created the Harlan D. Mills Award for “long-standing, sustained, and impactful contributions to software engineering practice and research through the development and application of sound theory.”2 I can’t recall ever hearing news about anybody winning it, however (for the record, Parnas and Meyer are both past honorees).

Many other researchers took a scientific approach to studying programming and programmers. Harold Sackman, in 1970, studied (among other things) the question of how much better good programmers are than bad ones.3 Maurice Halstead, in 1977, examined whether the difficulty of a given programming problem could be quantified mathematically.4 Mills wrote a paper with Victor Basili, one of the pioneers in the field, on techniques for documenting programs so programmers could understand them quickly.5 The book Studying the Novice Programmer collects a series of studies on how people learn to program, covering such topics as how programmers understand the concepts of variables, loops, and IF statements, plus the personally relevant topic “A Summary of Misconceptions of High-School BASIC Programmers” (“The students’ apparent attributing of human reasoning ability to the computer gave rise to a wide variety of misconceptions”).6

The best part about all this research, besides the fact that it involved collaboration between industry and academia, is that the authors actually told you how to write software. No more endless checklists or theories on how to manage around the mess of software development; this is truly practical advice. Do this instead of that to make your code easier to debug; do that rather than this to make it easier for another programmer to read.

You want to settle the never-ending religious debates that continuously roil the waters of software development? Since each side claims benefits for its preferred style, why not have two groups of people work with code written in the competing styles and see how that affects initial comprehension, ease of modification, and general maintainability? Why not indeed! Here’s Ben Shneiderman in Software Psychology, published in 1980, testing students on how different commenting styles affect readability of the same FORTRAN program.7 Do mnemonic variable names matter? Larry Weissman did a series of experiments on that, reported in 1974.8 What about indentation? Tom Love and Shneiderman were just two of the people in the 1970s who investigated whether it helps readability.9 Is GOTO really harmful? Max Sime, Thomas Green, and John Guest looked into that in 1973, as did Henry Lucas and Robert Kaplan in 1976.10

The battle over flowcharts is a rare example of how an engineering discipline is supposed to evolve. The flowchart approach has the programmer lay out every bit of control logic—every IF statement, loop, GOTO, and so on—on a diagram before coding it up. Flowcharts can work for basic decision trees. They are the ancestors of those diagrams on the last page of Wired magazine about “Should I Do X” or “What Sort of Y Should I Buy”—all those diamond shapes for decision points with an arrow leading away for each choice. Even the visual language, the shape of the boxes, is the same as software flowcharts. They were advocated in books such as Marilyn Bohl’s 1971 Flowcharting Techniques, and continued to grace the pages of various how-to books for a while.11

The problem with flowcharts is that they don’t help with the tricky parts of program comprehension. When reading an IF statement, the issue is not realizing that there is an IF statement in the code but instead knowing whether the IF logic is correct, which is as easy to read from the code as from a flowchart. Think back to our “Did the donkey hit the car” statement in DONKEY.BAS from chapter 1:12

1750 IF CX=DX AND Y+25>=CY THEN 2060

You could redraw this as a flowchart:

Figure 10.1 “Did the donkey hit the car?” in flowchart form

But that doesn’t help us understand whether the IF test (CX=DX AND Y+25>=CY) is correct, which is where bugs might lurk. And for what it’s worth, you don’t know if a flowchart has been kept in sync as the code has been twiddled by intervening owners, so in the end you have to read the code anyway.
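To make that concrete, here is a sketch of my own, not anything from the original game, that renders the same test as a C function; the lowercase names are hypothetical stand-ins for the BASIC variables CX, DX, Y, and CY:

#include <stdbool.h>

/* The same expression as line 1750 of DONKEY.BAS, wrapped in a function
   whose name says what the test is for. */
bool donkey_hit_car(int cx, int dx, int y, int cy)
{
    return cx == dx && y + 25 >= cy;
}

Whether the test is drawn as a flowchart box or given a descriptive name, the part where a bug can hide is the expression itself, and the only way to check it is to read it.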

Flowcharts were eventually debunked, based on studies by Richard Mayer, the omnipresent Shneiderman, and others.13 Luckily, this meshed with the empirical feedback from programmers that they were far more trouble than they were worth for anything of larger scope than a Microsoft interview question. Brooks, in a section of an essay titled “The Flow-Chart Curse”—with a title like that, do I need the quote? I’ll provide it anyway as another illustration of how ideas that may work well for small programs break down when applied to large ones—observed that “the flow chart is a most thoroughly oversold piece of program documentation. … They show decision structure rather elegantly when the flow chart is on one page, but the overview breaks down badly when one has multiple pages, sewn together with numbered exits and connectors.”14

But that’s about the only case I can think of where a once-fashionable programming methodology has been retired based on research—and even then I suspect it was primarily programmer lassitude, not the research studies, that led to flowcharts’ extinction (I can personally recall both being advised to use flowcharts and later deciding on my own that they were a waste of time).

What was the reason for this early attention to how to engineer software? It’s hard to know exactly, but I can speculate. There is a chicken-and-egg effect when you have a new university course of study spring up in a short period of time. How was the first set of computer science professors trained when they themselves went to college before a computer science major existed? The answer was that they were mostly mathematicians; Knuth, Mills, and Brooks all had PhDs in mathematics. As the son of a mathematician, I can state with confidence that despite possibly having a reputation for thinking deep thoughts in isolation, mathematicians are extremely collaborative and spend a lot of time meeting to exchange ideas, almost always building on the work of those who have gone before.15

Furthermore, when software first became a product that could be sold to customers, it was hardware companies that were writing the software; there were no “software-only” companies like the early Microsoft. Each company produced its own hardware, incompatible with everyone else’s, and each machine needed an operating system, along with compilers and tools so that others could write software for it. Customers didn’t buy computers to heat their office; they needed software for the specific problem they were trying to solve, and who better (or who at all, really) was there to write this than the company that also made the hardware? IBM, which is historically thought of as a hardware company, had to write a lot of software in order to sell its machines. SABRE, the original computer-based airline reservation system that American Airlines rolled out in the early 1960s, was written by IBM as part of a combined hardware/software deal with the airline. Mills worked in the Federal Systems Division at IBM, tasked with writing software customized for government customers.

When designing hardware, a company is doing “real” engineering: electrical engineering has built-up knowledge about circuit design, heat dissipation and power, and other topics that can’t be solved with a “this worked for me last time” approach. Companies have to rely on research, from both academia and industry. In addition, you can’t easily make changes late in the design of hardware the way you can with software; up-front design is worth the time. Presumably a hardware company would bring the same discipline to a software problem.

Given that, it is understandable that early software engineering, driven by a combination of mathematicians in academia and hardware companies in industry, started down the path taken by other engineering disciplines, and you can see the results of this in the literature produced during that time. An observer in 1975 would have had reasonable confidence that the trend would continue, and that in a few decades, things would be figured out, codified, and then taught to students and reinforced through professional training.

That is not the way it turned out, to put it mildly. What happened?

In The Psychology of Computer Programming, Weinberg theorizes about a change caused by the arrival of terminals (this was back in 1971). When he talks about terminals, he means typing at a console that is still connected to a mainframe computer but allows you to edit and run programs interactively—the same rig I used to connect to McGill’s computer from my parents’ room circa 1981, except possibly with a screen display rather than a printer. This is a significant advance over older systems, where to run a program you had to submit it in person as a stack of punched cards, and then wait awhile for it to be scheduled and run, with your output being delivered to you by an operator who had access to the actual computer. Weinberg is discussing reading code as a way to improve yourself as a programmer and lamenting that this is done less than in the past:

With the advent of terminals, things are getting worse, for the programmer may not even see his own program in a form suitable for reading. In the old days—which in computing is not so long ago—we had less easy access to machines and couldn’t afford to wait for learning from actual machine runs. Turnaround was often so bad that programmers would while away the time by reading each others’ programs. …

But, alas, times change. Just as television has turned the heads of the young from the old-fashioned joys of book reading, so have terminals and generally improved turnaround made the reading of programs the mark of a hopelessly old-fashioned programmer. Late at night, when the grizzled old-timer is curled up in bed with a sexy subroutine or a mystifying macro, the young blade is busily engaged in a dialogue with his terminal.16

Wading past the imagery in the last sentence, and ignoring his offhand use of a male personal pronoun as a stand-in for programmer, Weinberg’s point is that programming with interactive terminals moves you away from the slower approach that characterized early software development, where you spent more time up front making it right, because the delay in running it was so much greater (and you had more time to chat with other programmers while standing around waiting for the operators to deliver your results). That was more like hardware engineering, where fixing a problem becomes so much more difficult once you have built physical hardware.

There was another seed germinating at this time that contributed to things veering away from the predicted path. The 1968 NATO conference in Garmisch, Germany, is remembered for the origin of the term software engineering, and the agreement between academia and industry that something needed to be done. Less remembered is the second NATO conference in Rome the following year, which did not end with the same feeling of togetherness. John Buxton and Brian Randell, editors of the proceedings, wrote the following:

The Garmisch conference was notable for the range of interests and experience represented among its participants. In fact the complete spectrum, from the inhabitants of ivory-towered academe to people who were right on the firing-line, being involved in the direction of really large-scale software projects, was well covered. The vast majority of these participants found commonality in a widespread belief as to the extent and seriousness of the problems facing the area of human endeavor which has, perhaps somewhat prematurely, been called “software engineering.” …

The intent of the organizers of the Rome conference was that it should be devoted to a more detailed study of technical problems, rather than including also the managerial problems which figured so largely at Garmisch. However, once again, a deliberate and successful attempt was made to attract an equally wide range of participants. The resulting conference bore little resemblance to its predecessor. … A lack of communication between different sections of the participants became, in the editors’ opinions at least, a dominant feature. Eventually the seriousness of this communication gap, and the realization that it was but a reflection of the situation in the real world, caused the gap itself to become a major topic of discussion. Just as the realization of the full magnitude of the software crisis was the main outcome of the meeting at Garmisch, it seems to the editors that the realization of the significance and extent of the communication gap is the most important outcome of the Rome conference.17

In other words, once people started to get away from broad recognition of the problem and into details of potential solutions, the gap between academia and industry began to manifest itself. Roger Needham and Joel Aron addressed this difference in a working paper at the second conference:

The software engineer wants to make something which works; where working includes satisfying commitments of function, cost, delivery, and robustness. Elegance and consistency come a bad second. It must be easy to change the system in ways that are not predictable or even reasonable—e.g., in response to management directives. At present theorists cannot keep up with this kind of thing, any more than they can with the sheer size and complexity of large software systems.18

The report also contains a quote from Christopher Strachey, a computer scientist from Oxford University, from a discussion that was added on the last day to address the gap:

I want to talk about the relationship between theory and practice. This has been, to my mind, one of the unspoken underlying themes of this meeting and has not been properly ventilated. I have heard with great interest the descriptions of the very large program management schemes, and the programs that have been written using these; and also I heard a view expressed last night that the people who were doing this felt that they were invited here like a lot of monkeys to be looked at by the theoreticians. I have also heard people from the more theoretical side who felt that they were equally isolated; they were here but not allowed to say anything. …

I think we ought to remember somebody or other’s law, which amounts to the fact that 95 per cent of everything is rubbish. You shouldn’t judge the contributions of computing science to software engineering on the 95 per cent of computing science which is rubbish. You shouldn’t judge software engineering, from the high altitude of pure theory, on the 95 per cent of software engineering which is also rubbish. Let’s try and look at the good things from the other side and see if we can’t in fact make a little bridge.19

He is likely referring to Sturgeon’s law, coined by the science fiction writer Theodore Sturgeon, which posits that “90% of everything is crap.”20

Knuth has stated that he feels that at the beginning of the 1970s, academics were good programmers and industry professionals were not. Yet during that decade, as the scope of the software that industry wrote increased, the situation reversed itself: by the end of the decade the academics had drifted out of sync with what was going on in industry and had restricted their programming, and therefore their area of expertise, to smaller programs that could no longer generate useful advice for industry.21 Basili put it this way: “Researchers solve problems that are solvable, not necessarily ones that are real.”22 The gap, sadly, still persists to this day, despite the occasional bright spot such as design patterns.

But really, there is one obvious cause of the decline in academic research on software engineering. And that cause is me.

Not just me, of course. It’s me and people like me: the ones who came of age just after the personal computer revolution started in the mid-1970s, which took the move to interactive terminals and accelerated it to light speed. Weinberg commented on this in the silver anniversary edition of The Psychology of Computer Programming, observing a team of programmers at a company he was consulting with in the mid-1990s: “More interesting, however, was the coincidence that all of them had learned to program before they studied programming formally in school. That’s a major change brought about by the personal computer. In my day, I had not even seen a computer before I went to work for IBM in 1956.”23

It’s not a coincidence. Beginning in the late 1970s, access to a computer no longer required that you work at a computer hardware company or be associated with a university. Anybody could afford to bring a personal computer home, free from the oversight and advice of more experienced programmers and the methodology of an engineering company, and start programming on their own. And they did just that, in large numbers, learning nothing from the past and reinventing everything, over and over. Both literally and figuratively, we never looked back. Independent software companies thrived while the software divisions of hardware companies shrank, so that the modern software industry was created by people who had never been exposed to engineering rigor.

This is the era in which I grew up as a programmer. I started using computers just as the personal computer was becoming established. But I also majored in computer science at an Ivy League university! As I’ve said before, all of us were self-taught in how to actually program, and if my professors knew of Weinberg and Mills, they weren’t talking about them to the undergraduates. I don’t know why it was like this—whether the software world appeared to be changing so fast that the earlier work looked obsolete, whether there was so much else to teach us that there was no room, or whether they had tried and been ignored by callow personal computer habitués. Possibly it was seen as not relevant, because these topics tended to get lumped under “psychology” (a term that appears in the titles of both Shneiderman’s and Weinberg’s classic books) or “human factors,” which sounds awfully un-engineering-ish.

In 1976, Shneiderman helped found the informal Software Psychology Society, which met monthly to discuss the intersection of computer science and psychology, including software engineering topics. In 1982, the society put on the Human Factors in Computing Systems conference, which led to the creation of the ACM Special Interest Group on Computer-Human Interaction (SIGCHI) conference.24 Yet for whatever reason, SIGCHI focused much more on users’ interaction with computer interfaces than on programmers’ interaction with software tools (as eventually did Shneiderman himself).25 The net effect of all this was that at Princeton, I heard about none of this research, which at the time was only about a decade old and certainly relevant then—and relevant now as well.

I met several people in my early days at Microsoft who had previously worked at hardware companies and left because writing software there felt too slow and bureaucratic: The companies followed the same process for software that they did for hardware. I mean, why not; if you can work unencumbered by established precedent and instead invent everything yourself, wouldn’t you want to? No rules! Free money! It’s the same pitch that Agile is making today: the less methodology you use, the more you will be free to create works of genius. Unfortunately, rather than trying to learn something about process from these refugees, we enabled them in their quest to be freed from their shackles.

I realize now that this was the difference between Dave Cutler, who was in charge of the Windows NT project when I worked for him early in my Microsoft career, and almost any other executive at the company. Cutler was a generation older than me and had learned the ropes at Digital Equipment, a hardware company. Working there he had acquired an understanding of the need for planning and rigor in software development. Before beginning to write the code for the first version of Windows NT, the team produced a large notebook laying out the internal details of the system, focusing heavily on the APIs provided by each section; a copy of this notebook is now preserved in the Smithsonian Institution.26 This was before I arrived on the team; I’m not sure what I would have thought of this activity if I had observed it in person. I likely would have wondered why we weren’t jumping in and writing code. For that matter, I’m not sure what Bill Gates thought of it (although clearly he allowed Cutler to do it his way). Gates was just young enough that he was able to learn to program on his own, on a terminal like the one that Weinberg accused of leading programmers down the primrose path.

Wilson’s “How come I didn’t know we knew stuff about things?” moment, described in his SPLASH 2013 keynote, was inspired by his discovering, after ten years in the industry, the 1993 book Code Complete by Steve McConnell.27 This was one of the first books that attempted to assemble wisdom about how to write software. It deserves special mention because it does refer to academic studies to back up its recommendations, at least in areas where studies had been done, such as “What is the right number of lines of code in a single method?” For what it’s worth, the consensus on method length from the studies McConnell looked at was that around two hundred lines was getting to be too long.28 The topic has since been unmoored from any research and now bobs merrily in a sea of impassioned verbiage, such that there are scarcely any positive numbers for which somebody doesn’t consider that many lines in a method to be too many. Some people claim that the instant you feel you need a comment in your code, you should instead move that code into a separate method with a sentence-length, camel-cased method name, with said method name serving as the complete documentation; these people walk among us, undetected by the institutions meant to protect a civil society.
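For illustration only, here is the difference between the two camps in C; the example and its names are mine, not McConnell’s or anyone else’s:

#include <ctype.h>
#include <stdbool.h>

/* Camp one: a comment explains the intent of the inline check. */
bool valid_identifier(const char *id)
{
    /* An identifier is valid if it is non-empty and entirely alphanumeric. */
    if (*id == '\0')
        return false;
    for (; *id != '\0'; id++)
        if (!isalnum((unsigned char)*id))
            return false;
    return true;
}

/* Camp two: the comment disappears, and the method name becomes the
   documentation. */
bool identifierIsNonEmptyAndEntirelyAlphanumeric(const char *id)
{
    if (*id == '\0')
        return false;
    for (; *id != '\0'; id++)
        if (!isalnum((unsigned char)*id))
            return false;
    return true;
}

Whether the second style is actually an improvement is exactly the kind of question the old studies would have put in front of two groups of programmers.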

Most of the studies cited by McConnell were at least ten years old, since this type of investigation had mostly dried up by then (a second edition of the book, published in 2004, barely unearths any new studies). But at least he referenced them where he could. He even spends five pages discussing Hungarian notation, presenting arguments pro and con without choosing a winner.29 This is not surprising since Hungarian notation, being a product of industry, has never been formally studied—with both sides instead preferring to continue hurling invectives at each other (in the second edition, he cuts the treatment of Hungarian in half and genericizes it as “standardized prefixes,” but he also excises most of the arguments against it, leaving the reader with the impression that it’s a good idea).30
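For readers who have never run into it, here is a rough sketch of the flavor of the convention; these declarations are my own illustration rather than anything from either edition of the book:

#include <stddef.h>
#include <stdint.h>

/* Hungarian notation encodes the type of a variable in a short prefix. */
uint32_t dwFileSize;   /* dw = double word (32-bit unsigned)  */
char     szName[64];   /* sz = zero-terminated string         */
uint8_t *pbBuffer;     /* pb = pointer to bytes               */
size_t   cbBuffer;     /* cb = count of bytes in pbBuffer     */

The argument is over whether prefixes like these help a reader spot mismatches or merely add noise that the compiler already checks, and nobody has ever settled it with data.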

The IEEE Computer Society, a professional association with similar goals to the ACM, created the Software Engineering Body of Knowledge (SWEBOK), which is summarized in the book SWEBOK 3.0: Guide to the Software Engineering Body of Knowledge, known as the SWEBOK Guide (the ACM was initially involved in SWEBOK, but pulled out after disagreement on the direction it was taking).31 This initiative has a cargo cult aspect to it; other engineering disciplines have bodies of knowledge, so maybe if we create one of our own, we will acquire the engineering rigor that they possess. Essentially the IEEE has assembled the current wisdom on software engineering, without passing judgment on the actual value of it.

Given that API design is one of the most critical areas of software engineering (McConnell spends an entire chapter on the subject in Code Complete), it is instructive to see what the SWEBOK Guide has to say about it. Admittedly the book is less expansive than Code Complete, but still it is deflating to find only a quarter of a page devoted to such an important topic. After explaining what an API is, it states that “API design should try to make the API easy to learn and memorize, lead to readable code, be hard to misuse, be easy to extend, be complete, and maintain backward compatibility. As the APIs usually outlast their implementations for a widely used library or framework, it is desired that the API be straightforward and kept stable to facilitate the development and maintenance of the client applications.”32 That’s it. This advice is not wrong, although possibly mildly contradictory, but it’s woefully incomplete. And what does “should try” mean? Nowhere does it state how to accomplish all these goals or give any references to studies of them.
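To see how much the guide leaves unsaid, consider one concrete reading of “hard to misuse,” sketched with a hypothetical copy_file function of my own invention rather than anything the SWEBOK Guide provides:

/* Version 1: both parameters are plain strings, so swapping the source and
   destination compiles without complaint. */
int copy_file_v1(const char *source_path, const char *dest_path);

/* Version 2: distinct wrapper types turn a swapped call into a compile
   error. */
typedef struct { const char *path; } SourcePath;
typedef struct { const char *path; } DestPath;
int copy_file_v2(SourcePath source, DestPath dest);

The guide names the goal; what is missing is guidance at this level of detail, along with any evidence about whether such trade-offs pay off.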

The SWEBOK Guide notes that it doesn’t contain detailed information, but points the reader to other literature: “The Body of Knowledge is found in the reference materials themselves.”33 In the case of API design, the redirect is to the book Documenting Software Architectures, which is a reasonable book about documenting your software design at various levels of granularity, including down to the individual API layer—yet it is about documenting a design that has been created, not about how to create it in the first place.34

Meanwhile, the three-part book Software Engineering Essentials, which is meant to provide more detail on SWEBOK and matches it point for point, has this to say about API design, in toto:

An API (application programming interface) is a language and message format used by an application program to communicate with the operating system or some other control program such as a database management system. An API implies that some program module is available in the computer to perform the operation or that it must be linked into the existing program to perform the tasks.35

There is no great insight there; it is only a definition of the term, attributed to PC Magazine Encyclopedia.

Much of what has been espoused in software engineering in the last twenty years—Agile development, unit testing, the debate about errors versus exceptions, and the benefits of different programming languages—has been presented without any experimental backing. Even object-oriented programming itself has not been subjected to rigorous testing to see if it is better than what came before or just more pleasing to the mind of programmers. As one metareview of the few studies of object-oriented programming put it in 2001, “The weight of the evidence tends to slightly favor OOSD [Object-Oriented Systems Development], although most studies fail to build on a theoretical foundation, many suffer from inadequate experimental designs, and some draw highly questionable conclusions from the evidence.”36

There are a few stalwart researchers who have continued to do experimental investigations into software engineering. Basili (an IEEE Mills award winner in 2003) deserves special mention as one of the earliest and longest-serving practitioners. In addition to a lengthy career as a professor of computer science at the University of Maryland, he spent twenty-five years as director of the Software Engineering Laboratory at NASA’s Goddard Space Flight Center. In honor of his sixty-fifth birthday, the book Foundations of Empirical Software Engineering was published in 2005, collecting twenty essays from throughout his career.37 If your curiosity is piqued by titles like “A Controlled Experiment Quantitatively Comparing Software Development Approaches” and “Comparing the Effectiveness of Software Testing Strategies,” then I encourage you to learn more about empirical studies. But too often his sort of work winds up in journals like Empirical Software Engineering or the Journal of Systems and Software, and never crosses over into industry, while working programmers flock to conferences on Agile development and other trendy topics.

When all of us “young blades” banded together in the early 1980s and mounted our successful assault on mainframe computers, we threw the engineering discipline baby out with the mainframe bathwater. The challenge for software engineering is how to get it back.

Notes