Introduction

In November 1988, a worm attacked computers connected to the still-nascent Internet. The worm exploited a programmer error: assuming that another computer could be trusted to send the right amount of data. It was a simple mistake, and the fix was trivial, but the programming language used was vulnerable to this type of mistake, and there was no standard methodology for detecting that sort of problem.

In April 2014, a security flaw was discovered in software running on computers connected to the now-ubiquitous Internet. The flaw stemmed from a programmer error: assuming that another computer could be trusted to send the right amount of data. It was a simple mistake, and the fix was trivial, but the programming language used was vulnerable to this type of mistake, and there was no standard methodology for detecting that sort of problem.
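
To make the mistake concrete, here is a minimal sketch in C, a language vulnerable to exactly this error. The scenario and every name in it are hypothetical, not drawn from either program: a server echoes back part of a message, using a length field that the sender supplied.

    #include <stddef.h>
    #include <string.h>

    /* BAD: copies however many bytes 'claimed' says, even though the
       actual length, 'received', is sitting right there unused. A lying
       sender can make this read far past the end of the data it sent. */
    void reply_bad(char *out, const char *payload, size_t received,
                   size_t claimed) {
        (void)received;                /* the true length is ignored */
        memcpy(out, payload, claimed);
    }

    /* The trivial fix: never copy more than was actually received.
       (Assumes 'out' holds at least 'received' bytes.) */
    void reply_good(char *out, const char *payload, size_t received,
                    size_t claimed) {
        if (claimed <= received)
            memcpy(out, payload, claimed);
    }

The entire difference between the broken version and the fix is one comparison, which is part of what makes this class of bug so easy to write and so hard to spot.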

Being stuck on a “vulnerable programming language” and “no way to detect mistakes,” a quarter century apart, is not what we expect of a maturing industry. Other new engineering disciplines started out producing unreliable products. In the early days of aviation, people built planes in their garages, with predictable results. Now, a hundred years later, when a world without air travel is unimaginable, we have extremely reliable planes based on well-understood and agreed-on engineering standards.

Not so with writing software. Although labeled an engineering discipline, software has few of the hallmarks of engineering, where a body of knowledge is built up over time based on rigorous experimentation. Questions one would reasonably ask of an engineered product—How strong is it? How long will it last? How might it fail?—cannot be reliably answered for software, for either an individual part of a program or an entire suite of software. Professional licensing, a hallmark of most engineering disciplines, is viewed by the software industry as a potential source of lawsuits rather than an opportunity to establish standards.

The result is not just user-visible bugs; it’s also a lot of wasted effort and reinvention by programmers, leading to frustration and software that is delayed or never ships.

If you’ve heard one thing about the software industry, it might be the unusual way in which programmers are interviewed. Websites, books, and even weeklong training classes are devoted to preparing people for the dreaded coding interview, which is presented as an all-or-nothing chance to impress with your skills and knowledge—especially through “whiteboard coding,” in which a candidate has to dash out short programs on a whiteboard. Some candidates complain that this isn’t an accurate representation of what their daily job would be, and they want companies to focus on other areas of their background. What they may not realize is that there isn’t much else in their background to focus on. Unlike in other engineering disciplines, having a degree in software engineering does not guarantee that you understand a known corpus of programming tools and techniques, because no such corpus exists. You likely wrote a lot of code in college, but there is no way of knowing whether it was any good. So asking people to write code snippets on a whiteboard is the best way we have to evaluate candidates.

Consider this joke, although it’s no laughing matter: What do you call the person who graduated last in their medical school class? The answer is “doctor”—because graduating from medical school and completing your residency implies that you have learned what is needed to be one. I have asked doctors how they were interviewed when they were hired. They say that they were never asked specific medical questions or to perform simple medical procedures; instead, the talk was about how they speak to patients, how they feel about new medicines, and that sort of thing—because it is understood that they know the basics of medicine. Computer science graduates can make no such universal claim.

Back in November 1990, Mary Shaw of Carnegie Mellon University wrote an article for IEEE Software magazine titled “Prospects for an Engineering Discipline of Software.” Shaw explains that “engineering relies on codifying scientific knowledge about a technological problem domain in a form that is directly useful to the practitioner, thereby providing answers for questions that commonly occur in practice. Engineers of ordinary talent can then apply this knowledge to solve problems far faster than they otherwise could. In this way, engineering shares prior solutions rather than relying always on virtuoso problem solving.” She compares software to civil engineering, pointing out, “Although large civil structures have been built since before recorded history, only in the last few centuries has their design and construction been based on theoretical understanding rather than on intuition and accumulated experience.”1 As I leaf through the publications catalog of the American Society of Civil Engineers, full of intriguing titles such as Water Pipeline Condition Assessment and Cold Regions Pavement Engineering, I can appreciate how much theoretical understanding there is in other engineering disciplines.

Looking back at the history of engineering in various forms, Shaw writes, “Engineering practice emerges from commercial practice by exploiting the results of a companion science. The scientific results must be mature and rich enough to model practical problems. They must also be organized in a form that is useful to practitioners.”2 Yet in the years since her article appeared, the software engineering community has made little progress in building up the scientific results needed to support a true engineering discipline; it is still stuck in the “intuition and accumulated experience” phase. At the same time, software has become critically important to modern life; people assume it is much more reliable than the underlying engineering methodologies can guarantee.

Shaw ends her article with the following: “Good science depends on strong interactions between researchers and practitioners. However, cultural differences, lack of access to large, complex systems, and the sheer difficulty of understanding those systems have interfered with the communication that supports these interactions. Similarly, the adoption of results from the research community has been impeded by poor understanding of how to turn a research result into a useful element of a production environment. … Simply put, an engineering basis for software will evolve faster if constructive interaction between research and production communities can be nurtured.”3

At the 2013 Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH) conference, sponsored by the Association for Computing Machinery (ACM), a professional organization, a programmer named Greg Wilson gave a keynote talk titled “Two Solitudes” about this divergence between academia and industry in the world of software. After working as a programmer for a while, Wilson discovered the landmark book Code Complete, one of the first to attempt to lay out a practice of software engineering, and one of the rare software books that references research studies on software practices. Wilson realized he had been previously unaware of all this; as he said in the talk, “How come I didn’t know we knew stuff about things?”4 Then he realized that none of his coworkers did either, and furthermore, they were happy in their ignorance and had no desire to learn more. He also commented, “Less than 20% of the people who attend the International Conference on Software Engineering come from industry, and most of those work in labs like Microsoft Research. Conversely, only a handful of grad students and one or two adventurous faculty attend big industrial conferences like the annual Agile get-together.”5

The hand-wringing over software engineering has been going on since the term was coined fifty years ago. This book doesn’t propose a solution, although it does offer some suggestions at the end; instead, it attempts to provide a guided tour of the path that the software industry has taken from its early days to the present.

With a couple of exceptions, the chapters are arranged in chronological order, roughly paralleling my own experience as a programmer, starting around 1980. The book does not attempt to be a complete history of the software industry; rather, it digs into specific moments that are especially important and representative. Those moments involve a succession of ideas, each touted as the single solution to all the problems facing programmers, before inevitably falling back to earth and being superseded by the next big thing. At the same time, the gap between academia and industry has continued to widen, so that each new idea is less moored in research than the last, and software drifts further from, not closer to, the engineering basis that Shaw was hoping for.

Fundamentally, the book is about a question that I have often asked myself: Is software development really hard, or are software developers not that good at it?

Spoiler alert for technophobes: there is some code in this book. Do not be dismayed. It is impossible to understand the software industry without understanding what programmers are thinking about, and it’s impossible to understand what programmers are thinking about without digging into the actual code they write. The difference between good and bad software can be a single line of code—a seemingly inconsequential choice made by a programmer. To understand some of the problems with software, you need to understand enough about code to appreciate that difference, and why programmers write the bad line of code instead of the good line.
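
As a taste of what that looks like, consider this hypothetical C fragment, invented for illustration and not drawn from any real program, in which the good and bad versions of a loop differ by a single character.

    #include <stdio.h>

    int main(void) {
        int scores[5] = {90, 85, 70, 95, 80};
        int total = 0;

        /* Good: '<' stops after the array's five elements. */
        for (int i = 0; i < 5; i++)
            total += scores[i];

        /* Bad: the same loop written with '<=' would read one element
           past the end of the array, adding whatever happens to sit in
           memory there to the total, or worse:

               for (int i = 0; i <= 5; i++)
                   total += scores[i];
        */

        printf("total = %d\n", total);
        return 0;
    }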

So please read the code! Thank you.

Notes