Millions of years ago, powerful geologic forces in northern Utah converged to uplift the Wasatch Range and create a 160-mile north-south mountain chain that provides a dramatic backdrop to what is now Salt Lake City.1
In February 2001, another convergence occurred in the area, when seventeen men met at the Snowbird Resort to discuss a common enemy.2
Who were these men? They were an assortment of Americans, Canadians, and Britons, plus one Dutchman. The group included Dave Thomas, coauthor of the influential 1999 book The Pragmatic Programmer; Alistair Cockburn, author of Writing Effective Use Cases; Martin Fowler, author of UML Distilled; Kent Beck, inventor of extreme programming (and unit testing); and Ward Cunningham, who invented the wiki. The techniques these men championed could all be characterized as lightweight (except for the wiki, which is just cool).
Their enemy was software development methodologies with names like Rational Unified Process and Capability Maturity Model, which were gaining traction, for lack of any opposition, as potential best practices for software development (or at least, as the best practices for managing software development). They could be termed heavyweight—lots of writing specs up front and defining specific milestones to be checked off during the process. Despite their lofty names, they weren’t making software development any more predictable.
Microsoft, certainly, had been beset by extremely public delays in the previous decade. The first version of Windows NT, on which I labored after joining the company in 1990, was initially estimated to take two and a half years but wound up taking nearly five.3 Windows 95, which shipped to customers in August 1995, was originally planned to be Windows 93.
All seventeen men at the meeting in Utah had what they considered to be a better idea, although they didn’t all have the same better idea. There had been cross-pollination, and many of them had worked or written together at various times, but they weren’t all pushing the same idea so much as pushing against the same set of heavyweight methodologies. Some of the seventeen were talking about better ways to specify software, some about better ways to produce time estimates for it, some about better ways to design it, some about better ways to actually write it, and others about better ways to coordinate work. But they recognized that as seventeen separate outposts, they were not making much progress in the battle against heavyweight processes, and they decided that combining forces would give them more leverage.
During that 2001 meeting in Utah, the group adopted the word agile to unify their efforts. The term originated with Fowler and has a nice ring to it. Agile sounds better than whatever slow, stodgy term it is the opposite of, and certainly a lot better than lightweight; in the movie The Karate Kid, Ralph Macchio starts out lightweight, but ends up agile and overcomes his bullies.
The main output of the Snowbird gathering was the “Manifesto for Agile Software Development”:
We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:
- Individuals and interactions over processes and tools
- Working software over comprehensive documentation
- Customer collaboration over contract negotiation
- Responding to change over following a plan
That is, while there is value in the items on the right, we value the items on the left more.4
Agile is more of a branding exercise than any single approach, so a software development team announcing “we are Agile” doesn’t mean much; it primarily signifies being au courant with progressive software development. Agile has oozed out into the world beyond software. My brother and sister, who work in old, well-respected, and time-proven fields (transportation engineering and scientific publishing, respectively), have over the past decade been hearing about Agile methodology and how it could help them.
The most well-known technique under the Agile umbrella is known as Scrum. The term comes from the scrum in rugby, in which the two teams’ forwards link arms and drive into each other in an attempt to win the ball when play restarts (visualize “link arms,” not “drive into each other”). Scrum had been kicking around for the better part of a decade before the Agile Manifesto was written, having originally been presented in a paper by Ken Schwaber and Jeff Sutherland at OOPSLA 1995.5
At its heart, Scrum is the Agile Manifesto mapped onto software project management. Programmers working on a Scrum team meet briefly every day to provide status and ask for help if needed, which precludes the need for any formal system to track dependencies between their work (“individuals and interactions over processes and tools”); aim to deliver new features in small increments rather than pieces of code that will only work once they are all completed and stitched together (“working software over comprehensive documentation”); rely on customer feedback on the delivered increments to figure out what to do next as opposed to planning a larger deliverable up front (“customer collaboration over contract negotiation”); and view changing customer requirements as a positive sign that people are using their stuff, rather than an excuse to complain (“responding to change over following a plan”).
Scrum is about how to manage software projects, not about how to write the code. This is not a secret; the “Scrum Guide” website states, “Scrum is a process framework that has been used to manage complex product development since the early 1990s. Scrum is not a process or a technique for building products; rather, it is a framework within which you can employ various processes and techniques.”6 In fact, one key assertion behind Scrum is that there exists no solid process or technique to develop software, but that’s OK; as Fowler writes, “A process can be controlled even if it can’t be defined.”7 This is certainly in contrast to what Mills wrote a generation earlier: “My approach to software has been that of a study in management, dealing with a very difficult and creative process. The first step in such an approach is to discover what is teachable, in order to be able to manage it. If it cannot be taught, it cannot be managed as an organized, coordinated activity.”8
It is often stated that Scrum replaced a software development process known as waterfall. Brooks, in his famous essay “The Mythical Man-Month,” gives the following advice on splitting up time within a project: “⅓ planning, ⅙ coding, ¼ component test and early system test, ¼ system test [once] all components [are] in hand.” His goal was to prod managers to allow more time for testing (and to a lesser extent, planning): “In examining conventionally-scheduled projects, I have found that few allowed one-half of the projected schedule for testing, but that most did indeed spend half of the actual schedule for that purpose. Many of these were on schedule until and except in system testing.”9 In other words, testing will exact its pound of flesh whether you budget adequate schedule time or not; if you don’t, then the project will slip.
What was implicit in that guidance was the one-way flow of the development process: first you plan, then you code, then you test each component, and then you test the whole thing together. This is where the word waterfall comes from, since the process is like water going over a fall. You don’t reopen the planning process after coding has started, nor do you begin coding before the planning is complete. And you don’t start testing until you reach “code complete” (fixing bugs found in testing can involve writing code, but the idea is to confine yourself to fixing bugs as opposed to making enhancements; in Knuth’s framing of the distinction, you should always feel guilty, never virtuous). Brooks was arguing for changing the division of time between the phases, but not for changing the unidirectional transit through them.
In 1995, Brooks came out with a twenty-year anniversary edition of his book The Mythical Man-Month, in which the original essay of the same name had appeared. He included a new essay titled “The Mythical Man-Month after 20 Years,” which states that “the waterfall model is wrong.”10 Brooks was responding directly to his own original essay “Plan to Throw One Away,” which claimed that the first version of any system is going to be terrible, “too slow, too big, awkward to use, or all three,” and “the only question is whether to plan in advance to build a throwaway, or to promise to deliver the throwaway to customers” (as he puts it, “Seen this way, the answer is much clearer”).11 Brooks had originally advocated building a “pilot system,” that first terrible system, but never delivering it to your customer and instead starting over with the knowledge you have gained.
I never particularly liked this advice; it is un-engineer-y to not be able to build software that is usable the first time. Brooks did state that chemical engineering plants are built this way, with a smaller plant used to test a process, but presumably this is only done once for a given chemical process, not for every plant that uses the same process. Long bridge designs are based on information gathered from building shorter bridges, yet not every long bridge needs a specific shorter version constructed to prove that the real one won’t fall down.
So I was glad to see that Brooks, with twenty years of hindsight, had changed his mind about “Plan to Throw One Away” (this was retroactive gladness years later; like most programmers, I wasn’t reading his books in the 1990s, a period when Microsoft was heedlessly using a waterfall-ish approach). His main complaint was that his earlier advice accepted the waterfall model as fact: “Chapter 11 [the ‘Plan to Throw One Away’ essay] is not the only one tainted by the sequential waterfall model; it runs through the book, beginning with the scheduling rule in Chapter 2 [the ‘Mythical Man-Month’ essay].”12
“The basic fallacy of the waterfall model is that it assumes a project goes through the process once,” Brooks continues,
that the architecture is excellent and easy to use, the implementation design is sound, and the realization is fixable as testing proceeds. Another way of saying it is that the waterfall model assumes the mistakes will all be in the realization, and thus that their repair can be smoothly interspersed with component and system testing.
“Plan to throw one away” does indeed attack this fallacy head on. It is not the diagnosis that is wrong; it is the remedy. … The waterfall model puts system test[ing], and therefore by implication user testing, at the end of the construction process. Thus one can find impossible awkwardness for users, or unacceptable performance, or dangerous susceptibility to user error or malice, only after investing in full construction.13
Rather than mimicking the smooth flow of water over a waterfall, the process was beginning to look more like riding a barrel over one: moments of sheer terror followed by getting crushed in the churn at the end.
Brooks mentions that the waterfall model was enshrined in a US Department of Defense specification for all military software. Back in the realm of users who pay for their own software, he recommends, “implementers may well undertake to build a vertical slice of a product, in which a very limited function set is constructed in full, so as to let early sunlight into places where performance snakes may lurk.”14 It’s not just performance snakes; it’s any issues the users might encounter. If you know your software will need a database layer, rather than start by implementing all the functionality you think you will eventually require, you write the minimal database support needed for a single user-visible feature, along with the minimal support needed in any other layers, and present that to the customer; that is the ultimate way you will determine if your database is designed properly. Then you go back and work on another feature: lather, rinse, repeat.
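The vertical-slice idea can be sketched in a few lines of Python (the feature and class names here are my invention for illustration, not an example from Brooks): one user-visible feature, customer lookup, built end to end with only the sliver of storage support that feature actually requires.

```python
# A vertical slice: one user-visible feature, implemented end to end,
# with only the minimal storage support that this feature requires.

class CustomerStore:
    """Minimal 'database layer': just enough for the one feature below.
    More methods get added only when later slices need them."""
    def __init__(self):
        self._rows = {}  # name -> record; a real store would come later

    def add(self, name, email):
        self._rows[name] = {"name": name, "email": email}

    def find_by_name(self, name):
        return self._rows.get(name)

def lookup_customer(store, name):
    """The user-visible feature: proves the slice works end to end."""
    record = store.find_by_name(name)
    if record is None:
        return f"No customer named {name!r}"
    return f"{record['name']} <{record['email']}>"

store = CustomerStore()
store.add("Ada", "ada@example.com")
print(lookup_customer(store, "Ada"))    # Ada <ada@example.com>
print(lookup_customer(store, "Grace"))  # No customer named 'Grace'
```

Delivering this slice to a customer is what tells you whether the store’s design is right; only then do you go back and add the next feature, and the next method the store needs to support it.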
It’s not just the users who prefer vertical slices; it’s also the programmers. Brooks remarks that at one point during his tenure at the University of North Carolina, “I switched to teaching incremental development. I was stunned by the electrifying effect on team morale of that first picture on the screen, that first running system.”15
Scrum focuses aggressively on delivering new functionality to the user as often as possible. The timeline for delivery is known as a sprint, and typically lasts two to four weeks. Around the sprint, Scrum defines certain artifacts, including the product backlog (a list of all work that could be considered for future sprints) and the burndown chart (which tracks the total estimated work remaining in the sprint, hopefully hitting zero at the end—with the caveat that only items that are entirely complete can have their hours crossed off). At the beginning of each sprint, the team selects what it feels are the right product backlog items (right in terms of both “what the customer wants next” and “what we think will fit in one sprint”) and then spends the rest of the sprint working to deliver those items. The term sprint is frequently misinterpreted to imply frenetic activity that leaves you burned out, but the goal is that teams can work sprint after sprint without breaks. A sprint should feel more like one mile of a marathon than a mad dash to the finish line.
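The burndown chart reduces to simple arithmetic, as a minimal Python sketch shows (the item names and field names are invented for illustration). Note the all-or-nothing rule: a half-finished item still counts at its full estimate.

```python
# Sketch of a sprint burndown: sum the estimates of items not yet done.
# Per the "entirely complete" rule, a half-finished item still counts
# at its full estimate; partial credit is not allowed.

def remaining_work(backlog_items):
    return sum(item["estimate_hours"] for item in backlog_items
               if not item["done"])

sprint_backlog = [
    {"name": "login form",     "estimate_hours": 16, "done": True},
    {"name": "password reset", "estimate_hours": 12, "done": False},
    {"name": "audit logging",  "estimate_hours": 8,  "done": False},
]

print(remaining_work(sprint_backlog))  # 20
```

Plotting that number at the end of each day of the sprint produces the burndown chart; the team hopes the line reaches zero by the last day.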
Since Brooks was writing about replacing the waterfall model with the incremental delivery of vertical slices in the same year that Scrum was being presented at OOPSLA, it might appear that Schwaber’s delineation of Scrum represented the recording of an already-emerged consensus about how to move away from the waterfall model of development. Actually, Scrum was a much more aggressive departure.
Schwaber’s paper was presented as part of a larger OOPSLA workshop titled “Business Object Design and Implementation,” where a “business object” is a reusable software component that can be knitted together with other business objects to create applications—the familiar object-oriented dream. The Scrum paper doesn’t have anything particular to do with this. The summary paper from the entire business object workshop merely states, “New systems will require that loosely coupled, reusable, plug compatible components be constructed using a tightly coupled development method that combines business process reengineering, analysis, design, implementation, and reusable component market delivery systems similar to today’s custom IC chip industry.” Yet it also includes the following summary of the Scrum paper: “The stated, accepted philosophy for systems development is that systems development process is a well understood approach that can be planned, estimated, and successfully completed. This is an incorrect basis.”16 Essentially, Schwaber was saying that the premise of the rest of the workshop was faulty; there was no “tightly coupled development method,” nor will there ever be one.
The summary continues,
SCRUM states that the systems development process is an unpredictable, complicated process that can only be roughly described as an overall progression. SCRUM defines the systems development process as a loose set of activities that combines known, workable tools and techniques with the best that a development team can devise to build systems. Since these activities are loose, controls to manage the process and inherent risk are used.17
The mid-1990s, when Brooks wrote his updated essay and Scrum was getting started, was also the time that the book Microsoft Secrets came out. The book lays out various principles, as reported by Microsoft employees, for how the company handles development, including “work in parallel teams, but ‘synch up’ and debug daily,” “always have a product you can theoretically ship,” and “continuously test the product as you build it.”18 This sounds marvelously Agile, but having worked on a large Microsoft product in the early 1990s, I know that the techniques used were a far cry from what Scrum was advocating. Brooks, in his 1995 update, is impressed at learning that Microsoft builds and tests its software every night.19 The reality is that while we did ensure that the software built every night—meaning that it produced a compiled program without hitting any errors—and did do minimal testing each day, it was a product we could “theoretically ship” only in the most theoretical meaning of the word theoretical. Our software was developed in milestones lasting six to nine months each, and although we did not follow a strict waterfall process during each milestone, we definitely back-loaded the testing, and only toward the end of any given milestone was the software reliable enough to release externally. Effectively our “sprint” duration, and the timeframe for the “slices” we delivered, was six to nine months—much longer than the two to four weeks that Scrum advocates. Even equating the milestones with “long sprints” is wrong, because Scrum explicitly contraindicates the “mini waterfall” approach to a sprint, where you might split your four weeks into a week of planning, two weeks of coding, and one week of testing. Every feature delivered during a sprint is supposed to be ready to ship to a customer on the day it is completed.
So while many companies had moved away from the pure waterfall model to something a bit more iterative (or never used pure waterfall to begin with), Scrum made for a dramatic acceleration of that movement.
Even more dramatic is the second-best-known Agile methodology, Extreme Programming (known as XP). Invented by Beck (who had earlier originated unit testing in its current form), XP is based on a set of rules around the planning, managing, designing, coding, and testing of software.20 Planning and managing are Scrum-like, with daily meetings and a focus on frequent small releases, although XP is more prescriptive in some cases. The guidance on the design/coding/testing phases gets into actual specifics of how the software should be engineered, which Scrum ignores. The key to the approach is writing unit tests, running them often, and adding new ones whenever new code is written or a bug is found that snuck past the current set.
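A minimal sketch of that habit, using Python’s standard unittest module (the word_count function and its bug are my invention): when a bug sneaks past the existing tests, you first write a test that catches it, then fix the code, and that test runs with the suite forever after as a guard against regression.

```python
import unittest

def word_count(text):
    # An earlier version used text.split(" "), which miscounted runs of
    # whitespace; the bug report below forced the fix to plain split().
    return len(text.split())

class TestWordCount(unittest.TestCase):
    def test_simple(self):
        self.assertEqual(word_count("hello world"), 2)

    def test_bug_report_double_space(self):
        # Added when a user reported that "a  b" counted as three words;
        # this test now keeps that bug from ever sneaking back in.
        self.assertEqual(word_count("a  b"), 2)

# Run with: python -m unittest this_file.py
```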
XP also mandates the somewhat-controversial practice of pair programming, in which two programmers work together at all times, sharing one computer; typically one is coding while the other is watching. The idea here is, quite literally, that two heads are better than one; the second programmer provides a continuous code review of what the first programmer is doing (having a second programmer watching also discourages unnecessary web-based diversions while you are supposed to be working, which no doubt has an effect on productivity).
XP does attempt to avoid some of the “religious” arguments about coding. It mandates that a coding convention be written down and adhered to—without specifying a particular coding convention, but at least ensuring that arguments over proper coding style will happen once, be resolved in some way, and thereafter not brought up again (tabs or spaces, just pick one and go with it). On the question of making your code flexible to anticipate future changes, XP is clear: don’t do it. Write the code for the requirements of the feature you are working on now, and if the requirements change, because of a new feature or user feedback, modify the code then. Since the code will have good unit tests, you can make these future modifications without worrying about accidentally breaking something because your understanding of the code is not fresh in your mind. And until you have new requirements, you won’t know what changes are needed, so it is foolish to attempt to anticipate them now. Any opinions to the contrary can be dismissed with the mantra “You Ain’t Gonna Need It” (YAGNI).
Scrum and XP are designed for small teams; the daily meeting doesn’t scale well beyond ten to fifteen people because the chance of any one programmer caring about another programmer’s status decreases as the number of attendees rises. Schwaber and Sutherland’s original 1995 OOPSLA paper starts out with an axiom: “Small teams of competent individuals, working within a constrained space that they own and control, will significantly outperform larger development groups.”21 It is hard to argue with this if performance is based on a metric like “code delivered per person”; it is generally understood, in areas far beyond software engineering, that the communication and coordination overhead will increase as you work on larger projects. Nonetheless, it is misleading to imply that small teams will produce more software than larger ones of any size.
Beck writes, “Size clearly matters. You probably couldn’t run an XP project with a hundred programmers. Not fifty. Not twenty, probably. Ten is definitely doable,” and later says, “If you have programmers on two floors, forget it. If you have programmers widely separated on one floor, forget it.”22
Another problem is that when you are working on the first version of a piece of software, it can be hard to produce anything that users can use in two to four weeks, or even in small multiples of two to four weeks. Before Windows NT could start shipping public releases at the end of six- to nine-month milestones, it took several years to create anything that was usable at all, because so much of the internals of an operating system need to be written before it can handle a single request from a user.
Microsoft used to be feature driven in its products, meaning that teams would establish a planned set of features and then work until they were available, accepting whatever schedule slip was needed; at a certain point the company switched to being date driven, where teams would set a date and only include the features that could be completed by that date, cutting features in the middle of a project if they looked to be in danger. This made things much more predictable for customers. But this is a luxury that is available when you have an existing product; for the first version of Windows NT, the critical feature “the operating system works” could not be cut. Date-driven scheduling is heralded as a breakthrough in project management, but it’s no coincidence that the switch away from being feature driven happened around the time that all of Microsoft’s major products (Windows, Office, its compilers, the SQL Server database program, and the Exchange e-mail server) had established, working versions, which could then have individual features added on (or not) in subsequent versions.
In fact, the original OOPSLA Scrum paper states, “Scrum is concerned with the management, enhancement and maintenance of an existing product, while taking advantage of new management techniques and the axioms listed above. Scrum is not concerned with new or reengineered systems development efforts.” By the time Schwaber’s first Scrum book came out in 2002, this distinction had been lost, and Scrum was presented as applicable to both new and ongoing projects.23
As an aside, the OOPSLA paper states another axiom: “Product development in an object-oriented environment requires a highly flexible, adaptive development process,” and later says, “Object Oriented technology provides the basis for the Scrum methodology. Objects, or product features, offer a discrete and manageable environment. Procedural code, with its many and intertwined interfaces, is inappropriate for the Scrum methodology.”24 I’m not sure what being object oriented has to do with it; procedural programming also required a highly flexible, adaptive development process. If you believe the loudest object-oriented supporters, procedural programming would require even more flexible processes since it is missing the special sauce that object-oriented programming provides. The vaguely implied equating of “objects” and “product features” makes me think this was either a sop to the OOPSLA crowd or a reflection of the heady early days of the object-oriented frenzy. For what it’s worth, I have read several books on Scrum (including Schwaber’s, which makes no mention of this), become a Certified Scrum Master, and taught Scrum to teams inside Microsoft for several years, and I never heard that Scrum was unsuited to procedural programming or observed problems with Scrum that were unique to teams using procedural languages.
Mills once described courses available to programmers as “new names for common sense,” and while common sense is better than a lack of it, Scrum is still tackling the easiest problem in software: small teams working for a single customer on incremental improvements to an already-functioning piece of software.25 This is not to say it is not useful; teams in those situations were taking archaic approaches, such as using a waterfall model to deliver the complete solution before getting any customer feedback, and Scrum can certainly get them on a better path.
Where waterfall attempts to plan out the details of a project carefully and predict a completion date, Scrum states that the team will work diligently on pieces of a project (the “best that a development team can devise” from Schwaber’s original paper—meaning, “trust us and stop nagging”), and in the right order, always focusing on delivering working code to the user at the end of each sprint. The Agile approach to preventing long schedules that slip is to avoid long schedules. As Schwaber and coauthor Mike Beedle explain, “Several studies have found that about two-thirds of all projects substantially overrun their estimates,” and they address the “risk of poor estimation and planning” this way: “Scrum manages this risk … by always providing small estimates. … Within the Sprint cycle, Scrum tolerates the fact that not all goals of the Sprint may be completed.”26 Beck talks about “schedule slips—the day for delivery comes, and you have to tell the customer that the software won’t be ready for another six months,” and explains, “XP calls for short release cycles, a few months at most, so the scope of any slip is limited.”27
In other words, these methodologies don’t change the fact that software engineers are bad at estimating; they just keep the estimates short, so that even a significant slip, in percentage terms, is not that bad in calendar terms. To be fair, they do emphasize frequent delivery of working code to customers, which allows customer-driven course correction as needed and encourages team members to complete higher-priority work first, which is a step in the right direction (absent this nudge, they would tend to tackle the most interesting technical problem first). And Agile proponents point out, accurately, that if a team stays together and works on similar kinds of work, it will become better at estimation—but that is not new information or unique to Agile.
Socrates is quoted in Plato’s Apology as saying, “I am wiser than this man; it is likely that neither of us knows anything worthwhile; but he thinks he knows something when he does not; whereas when I do not know, neither do I think I know; so I am likely to be wiser than he is to this small extent, that I do not think I know what I do not know.” In this sense Scrum, which says that software projects are inherently uncontrollable, is wiser than the waterfall methodology, which proposes to control them without knowing how. Although Scrum proponents may want to heed a follow-on insight from Socrates: “The good craftsmen seemed to have the same fault as the poets: each of them, because of his success at his craft, thought himself very wise in other most important pursuits, and this error of theirs overshadows the wisdom they had.”28
Agile is not really the opposite of waterfall, partly because true waterfall wasn’t used much at the time that Agile came along. If you are looking for an approach that is as far from Agile as possible, it is the Personal Software Process (PSP) and Team Software Process (TSP), developed at the Software Engineering Institute (SEI), a software think tank at Carnegie Mellon University, under the guidance of Watts Humphrey, a longtime manager of software teams at IBM.
The PSP approach is laid out in the preface to Humphrey’s 1995 book A Discipline for Software Engineering:
Society is now far too dependent on software products for us to continue with the craft-like practices of the past. It needs engineers who consistently use effective disciplines. For this to happen, they must be taught these disciplines, and have an opportunity to practice and perfect them during their formal educations.
Today, when students start to program, they generally begin by learning a programming language. They practice on toy problems and develop the personal skills and techniques to deal with issues at this toy problem level. … These programming-in-the-small skills, however, are inherently limited.29
The PSP solution is to take practices used for large-scale software development—and Humphrey, at IBM, was managing some of the largest-scale software development of his day—and scale them down to work for single-person programs. Having programmers use these techniques on small programs would prepare them for proper large-scale software development (which is addressed in phase two of Humphrey’s plan, the TSP). In this approach, the PSP is the exact opposite of Agile, which takes techniques optimized for small teams and then implies that they will work for larger teams.
Recognizing that this sounds a lot like exhorting people to eat their vegetables, Humphrey throws down a little challenge: “The PSP is a self-improvement process. Mastering it requires research, study, and a lot of work. But the PSP is not for everyone. Recall that the PSP is designed to help you be a better software engineer. Some people are perfectly happy just getting by on their jobs. The PSP is for people who strive for personal achievement and relish meeting a demanding challenge.” He also includes a chapter on how to stay the PSP course even if you are the only person on the team using it, when your manager and coworkers are giving you funny looks.30
The PSP relies heavily on counting three things—lines of code, defects, and time—and performing various mathematical calculations among them, so as to enable predictions about the future. It makes use of formal code inspections—a group activity first proposed by Michael Fagan from IBM in 1976.31 Formal inspections differ from the individual code reviews we encountered in chapter 3 in a variety of important ways: the inspectors are given time ahead of the meeting to read the code; guidelines on what to look for in reviews are created and kept updated; the meeting has a formal leader to keep things moving; and the results of the inspection (the number of defects found per line of code) are tracked and analyzed.32
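The flavor of that bookkeeping can be sketched in a few lines (the numbers and field names are invented, and the real PSP, with its PROBE estimation method, is considerably more elaborate): historical counts of size, defects, and time yield ratios that are used to project the next task.

```python
# Toy PSP-style bookkeeping: track size, defects, and time per past task,
# then use the historical ratios to project a new one.

history = [
    {"loc": 400, "defects": 9,  "hours": 12.0},
    {"loc": 250, "defects": 4,  "hours": 7.5},
    {"loc": 600, "defects": 14, "hours": 20.0},
]

total_loc     = sum(t["loc"] for t in history)      # 1250
total_defects = sum(t["defects"] for t in history)  # 27
total_hours   = sum(t["hours"] for t in history)    # 39.5

defects_per_kloc = 1000 * total_defects / total_loc
loc_per_hour     = total_loc / total_hours

# Project a new task estimated at 500 lines of code.
estimated_loc = 500
print(f"{defects_per_kloc:.1f} defects/KLOC")       # 21.6 defects/KLOC
print(f"{estimated_loc / loc_per_hour:.1f} hours")  # 15.8 hours
```

The arithmetic is trivial; the burden the PSP imposes is in the diligent, continuous collection of the underlying counts.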
My first experience with any kind of formal code review at Microsoft was back in 1993, when I was working on low-level networking code in Windows NT. A group of us sat down with printouts of my code and started walking through them. The date was January 20, the day of Bill Clinton’s presidential inauguration, and the day a windstorm swept through the Seattle area and knocked out power to Microsoft. Undaunted, we gathered in a conference room near the window—and completely missed the fact that a tree had fallen on a catering truck outside our building, with the truck’s cargo of pizzas subsequently being distributed free to everybody in the building, except for those of us hunkered down doing a code review in the fading light.
Beyond that trauma, I can see in retrospect that this was nothing like an inspection is supposed to be. Nobody had read the code ahead of time, and the results were not tracked; it was more like a set of parallel individual code reviews based on whatever surface-level faults could be spotted. In a report reprinted in a book by Tom Gilb and Dorothy Graham, these individual, ad hoc code reviews were described as “the least effective, but most used, of all defect removal techniques.”33 Meanwhile, SEI (and Gilb and Graham, for that matter) has research demonstrating real improvements from code inspections.
On the other hand, like Agile, the PSP doesn’t say much about how to write code—how to put line B after line A. The closest it gets to talking about actual software design is to mention that both top-down and bottom-up designs can be useful in different situations, as can starting in the middle, and that focusing on vertical slices can be a good idea, but so can building the entire system up in layers.34 Since it’s the PSP, you spend time thinking about your strategy up front and also ruminating about whether it went well afterward, which is better than blindly reusing last time’s strategy, but it doesn’t provide much guidance on how to proceed in a new problem area. Scrum at least gives you concrete guidance (concentrate on vertical slices), which may be bad advice in a given situation but at least prevents dithering; in the end, given the one-off nature of most projects, it is hard to know whether a different design strategy would have worked better.
I was never exposed to the PSP until I worked in Engineering Excellence at Microsoft and we taught some of its concepts in our courses, but I can see why programmers would instinctively recoil from it. Thinking about all that tracking, just for my own personal improvement, makes my head ache. Thinking about the PSP book makes my arm muscles ache; at over 750 pages, it’s the longest software engineering book I know (the TSP book clocks in at a relatively svelte 450 pages).35 It spends an entire 30-page chapter talking about how to count lines of code (admittedly a subject of debate in programming circles). Per PSP, you are supposed to count every syntax error that the compiler catches as a defect, which is duly logged for future analysis. Fixing compilation errors is a mechanical, annoying task, but one that doesn’t take that long; adding in the mechanical, annoying task of logging the errors makes me shudder (and you are further supposed to classify the errors into one of about twenty different categories).36
There is even debate in PSP circles about whether it makes sense to do a code review before you compile the code the first time, which instinctively seems like a waste of time. Why spend time looking for errors that the compiler can catch in a few seconds? Ah, says the PSP data, but around 10 percent of errors that you would expect the compiler to catch are not caught because they inadvertently form valid syntax, and those are particularly sneaky bugs to figure out later.37 And having all the bugs available to be found in a pre-compiler review makes the code a more target-rich area, which makes the code review more rewarding and hence more likely to be taken seriously.
It all does make a certain sense; classifying compiler errors by type could be a lot of work, but if I realize that I make specific kinds of mistakes more than others, I can focus on avoiding those and become more efficient. And yet … I don’t consider myself somebody who is “just happy getting by on their jobs,” in Humphrey’s accusatory phrase, but I can well understand why, in Engineering Excellence, our Scrum courses had much better uptake than our PSP-inspired estimation and inspection courses, both of which I felt were more relevant inside Microsoft than Scrum training. When you’ve achieved a level of success being self-taught, it is much easier to accept a methodology like Scrum, which says that even the limited amount of tracking you are being asked to do is unnecessarily hobbling you, than one like PSP, which says that you need to do more of it.
For PSP to take hold as the natural order of things it would be helpful for it to be instilled early on, but for many programmers, “early on” is during high school; few people trying to hack out a quick mobile app or website are going to worry about something like PSP, if they have even heard of it. Also, the idea built into the PSP of taking processes appropriate to large-scale projects and scaling them down to small programs, where they are not needed except as training for future work on large programs, makes PSP a tough sell. It is not even taught to undergraduates at Carnegie Mellon, the university with which SEI is associated.
Fundamentally, Scrum in particular and Agile in general are optimistic: assume things will go well, trust your team, and fix the process if needed. PSP and other command-and-control techniques are pessimistic: assume things will go terribly wrong unless you invest a significant percentage of your time in preventing problems. As a manager, I much preferred taking an optimistic approach, and I think the people who worked for me appreciated it too. But we still did a lot of planning and tracking that went well beyond what any Agile methodology would recommend.
Agile is correct in recognizing that trying to figure out up front how long a software project will take, given our current techniques for both estimation and software engineering, is a fool’s errand. A favored tactic of managers, before Scrum and XP gave programmers the cover needed to tell them to butt out, was to ask a programmer for an estimate in the early days of a project, when they didn’t yet know enough about its details to be able to make a decent estimate, and then hold the programmer to that seat-of-the-pants estimate. In a book on software estimation (appropriately subtitled Demystifying the Black Art), Steve McConnell writes about the Cone of Uncertainty: the fact that the error range for software estimates starts out large, yet gradually shrinks as you begin to do more investigation into how you will implement the work, and shrinks even further after coding begins.38 Asking a programmer to provide an estimate at the widest part of the cone, and then never revisiting it, is the worst-possible approach. (He further points out that having individual programmers supply estimates, rather than trying a “wisdom of the crowds” approach of asking multiple people for an estimate even if they won’t be the ones doing the work, further contributes to the inaccuracy of estimations, no matter at what point on the cone they are done.)39
In The Soul of a New Machine, his Pulitzer-Prize-winning book about the engineering of a new minicomputer at a company called Data General in the late 1970s, Tracy Kidder describes how early estimates became etched in stone:
There was, it appeared, a mysterious rite of initiation through which, in one way or another, almost every member of the team passed. The term that the old hands used for this rite—West invented the term, not the practice—was “signing up.” By signing up for the project you agreed to do whatever was necessary for success. You agreed to forsake, if necessary, family, hobbies, and friends—if you had any of these left (and you might not if you had signed up too many times before). From a manager’s point of view, the practical virtues of the ritual were manifold. Labor was no longer coerced. Labor volunteered.40
In a 2000 study of a high-tech company, Ofer Sharone referred to this situation, in which employees self-impose the type of workplace pressure that one would expect to come from their managers, as one part of what he calls “competitive self-management,” which can “engender intense anxiety among [the company’s] engineers regarding their professional competence.”41 The other part is grading employee performance on a rigid curve; I don’t know if this was in effect at Data General, but it certainly is at a lot of software companies.
The Soul of a New Machine was about hardware development, but it still provides the most accurate depiction I have ever read of what it is like to work on a large “version 1” software project. Creating the first version of a new computer (it was Data General’s first 32-bit minicomputer) is an all-or-nothing endeavor; you can’t ship half a computer. The result was high pressure and long hours, including this portrait of “signing up” in action, between a programmer named Dave Epstein and his boss, Ed Rasala:
Some weeks ago, Ed Rasala asked Epstein, “How long will it take you?”
Epstein replied, “About two months.”
“Two months?” Rasala said. “Oh, come on.”
So Epstein told him, “Okay, six weeks.”
Epstein felt as if he were writing his own death warrant. Six weeks didn’t look like enough time, so he’s been staying here half the night working on the thing, and it’s going faster than he thought it would. This has made him so happy that just a moment ago he went down the hall and told Rasala, “Hey, Ed, I think I’m gonna do it in four weeks.”
“Oh, good,” Rasala said.
Now, back in his cubicle, Epstein has just realized, “I just signed up to do it in four weeks.”
Better hurry, Dave.42
There is a joke about software: “The first 90 percent of the work takes 90 percent of the time. The last 10 percent takes the other 90 percent.” Since estimates almost always grow rather than shrink, as new unaccounted-for work is discovered during the implementation, asking programmers for an estimate early on, when the Cone of Uncertainty is at its widest, winds up committing the programmer to a schedule that is too aggressive. Yet managers can justify it because they are doing the noble thing and building up a schedule from the programmers’ own estimates as opposed to mandating one from above.
Schwaber and Beck had presumably felt the pain of this. Schwaber and Beedle are clear that estimates are only for planning out a sprint and are not considered binding.43 Beck mandates a forty-hour week, with few exceptions: “The XP rule is simple—you can’t work a second week of overtime. For one week, fine, crank and put in some extra hours. If you come in Monday and say, ‘To meet our goal, we’ll have to work late again,’ then you already have a problem that can’t be solved by working more hours.”44 (Epstein, in the excerpt from The Soul of a New Machine above, did finish his project in the four weeks he signed up for, although clearly working more than forty hours a week.)
I confess that when I read The Soul of a New Machine, rather than being turned off by the descriptions of signed-up engineers working crazy hours, I wanted to be involved in such a project. It wasn’t just the notion of going out in a blaze of glory. A project like that, given the urgency to deliver, promised a freedom to do what was needed, bypassing whatever rules or conventions you felt were in the way. It was the same freedom that Agile promises to programmers—it’s just that we spent a lot more time savoring that freedom than Schwaber, Beedle, and Beck recommend. I was eventually on such a project, working on the first two versions of Windows NT from 1990 to 1994. But once that was completed, I was done with signing up and looked for mellower projects within Microsoft. Despite the impression from a story like The Soul of a New Machine that working crazy hours is a heroic undertaking, when it happens with software, it produces code that is rushed and poor quality, with a long tail of bugs for customers to uncover. In particular, the temptation is great to gloss over the handling of error cases: code that rarely runs, but is the most critical part when it is needed.
The Agile approach of only providing short estimates and not holding programmers’ feet to the fire is clearly better than offering long estimates, stamping them on programmers’ foreheads, and then missing the overall schedule anyway. Yet if you step back, this glosses over a more fundamental problem. Scrum is not a progressive way of managing software projects; it’s a logical reaction to the current state of software development, which attempts to contain the damage by not overpromising to customers. Some version of waterfall is the way engineering projects should work; it’s what any “real” engineering project is aiming to achieve, because ideally you would know enough to anticipate issues and plan accordingly, and be able to schedule out the work accurately based on previous experience on similar projects. Valuing “responding to change over following a plan” is another way of saying “don’t expect me to be able to predict what I will get done,” which is the current reality, but I hope things don’t stay that way. Because while some change is due to customers seeing the software and realizing they don’t like it, a lot is due to realizing that your internal implementation details, which the customer can’t see, need to be reworked—and being unable to recognize ahead of time that you are going down the wrong path makes software development unpredictable.
You may have read about Scrum, the product backlog, and the burndown chart, and thought, “Hey, I know nothing about software, but those things make sense.” Which emphasizes the fact that Scrum has nothing to say about how to actually engineer software; it is focused on getting customer feedback through rapid iterations. Mary Shaw wrote about quote-unquote software engineering back in 1990 that “unfortunately, the term is now most often used to refer to life-cycle models, routine methodologies, cost-estimation techniques, documentation frameworks, configuration-management tools, quality-assurance techniques, and other techniques for standardizing production activities. These technologies are characteristic of the commercial stage of evolution—‘software management’ would be a much more appropriate term.”45 Scrum fits right into the software management category, not the software engineering category.
I do give Agile credit for acknowledging one important fact about programming, which previous methodologies tended to ignore: code is read a lot.
The significance of code reading has not been completely missed in the literature. The IBMers’ Structured Programming book has a long chapter on code reading, complete with case studies, which begins, “The ability to read programs methodically and accurately is a crucial skill in programming. Program reading is the basis for modifying and validating programs written by others, for selecting and adapting program designs from the literature, and for verifying the correctness of one’s own programs.”46 Mills, in Software Productivity, has a chapter titled “Reading Code as a Management Activity” (from 1972, thus predating his coauthorship of the Structured Programming book). He anticipates “a new possibility in PL/I: that programmers can and should read programs written by others, not in traumatic emergencies, but as a matter of normal procedure in the programming process.”47 Weinberg, in The Psychology of Computer Programming, also has a chapter on reading programs.48 I’ll mention in passing that my previous book, Find the Bug, talks about how to read code too.49 In Microsoft Secrets, an engineer on Excel praises Hungarian for its salutary effect on code reading: “Hungarian gives us the ability to just go in and read code. … Being fluent in Hungarian is almost like being a Greek scholar or something. You pick up something and you can read it.”50 The actual content in that quote is somewhat divergent from reality, but it does show that reading code was an activity that programmers did and cared about, and tried (in vain, in this case) to make easier.
In classic waterfall programming, the goal was to write the code for the entire system you were going to produce and then hand it off to testing. If bugs were found, the programmer might experience the tribulation of revisiting the code, but it would be considered normal and even a positive sign if it were never looked at again.
In Agile, with its focus on delivering small increments of functionality under the YAGNI banner, it is understood that the code will be modified eventually, when “You Do Need It”; new vertical slices are written that touch existing code, and better ways to arrange the resulting whole are discovered in a process known as refactoring. Code is not meant to be written once and then never touched again; far from it. If a module sat untouched, that would be interpreted not as “we got it right the first time” but as “we are probably not responding to our customers, and our code is getting moldy.” As Beck put it, “A day without refactoring is like a day without sunshine.”51 The recognition that code is going to be read and modified often is an important mental switch from older methodologies.
The most extreme refactoring-based Agile approach is known as test-driven development, which mandates not only that unit tests be written for all code but also that unit tests be written first, before the code they are going to test. Furthermore, you strictly alternate between writing one test and writing the code to make that test pass—with no peeking ahead! YAGNI is the mantra, so if you were writing the code to score a game of bowling (which is the standard test-driven development example), and your first unit test involved a game that was all strikes, your actual product code at that point should look something like this:
int ScoreGame(Board b) {
    return 300;
}
Do you see what I did there? Naturally, once you wrote a second unit test, one that tested something other than a game of all strikes, you would need to write code that actually scores the game based on the Board parameter, not just hard-code a score of 300 that satisfied the first test.
My biggest concern about Agile is that it currently dominates the programming methodology discussion while only covering a narrow subset of the problems that software engineers can hit. That one-line ScoreGame method was the only code sample the discussion of Agile itself required in this entire chapter. Despite this, Agile is pitched as the savior of programming projects; as Schwaber and Beedle wrote in their book, “The case studies we provide in this book will show that Scrum doesn’t provide marginal productivity gains like process improvements that yield 5–25% efficiencies. When we say Scrum provides higher productivity, we often mean several orders of magnitude higher, i.e., several 100 percents higher.”52 The actual case studies are underwhelming, to say the least (especially since they are handpicked, not controlled experiments), but Scrum still sells to an eager audience of programmers.
Earlier, I discussed the shift in the origin of software ideas, starting with universities in the early days, then moving to corporate research labs, and then turning to corporate product groups. Agile is the next evolution of this trend. Although its inventors began as programmers whose business was writing programs, they quickly morphed into consultants whose product was Agile knowledge itself. As with much other advice to programmers, Agile was not based on any research studies or empirical observation beyond what people noticed in their own work.
Academia, in particular, has almost nothing to do with Agile. It’s easy to see why: with its new terminology and overhyped promises, Agile can come across as a fad, which universities would want to avoid. This in turn makes universities seem slow and stodgy to Agile practitioners, and possibly to new college graduates who fall under the sway of Agile. What is missing, to mend this gap, is more research on when exactly Agile practices are helpful, and when they are not. A methodology such as test-driven development is no doubt useful in some situations but not in all of them, yet it is perforce proposed as a solution to every one. The lack of evidence provides ample ammunition for both sides of any argument.
In 2007, Scott Rosenberg published the book Dreaming in Code, in which he embedded himself with a well-credentialed group of programmers trying to write version 1 of an application: a personal information manager named Chandler. He unfortunately happened on a somewhat-dysfunctional team. Many of the members were tainted by previous success in that they were unable to distinguish factors that had legitimately contributed to that success from factors that didn’t matter or were actively wrong. They all believed, in different ways, in the fallacy “if we just do this one thing, then the normal complications of software development won’t apply.”
Nonetheless, Rosenberg had enough personal experience with software to realize that their behaviors were not completely atypical. At a certain point he throws up his hands: “As I followed Chandler’s fitful progress and watched the project’s machinery sputter and cough, I kept circling back to the reactions I’d had to my own experiences with software time: It can’t always be like this. Somebody must have figured this stuff out.” He then spends a chapter wandering around some of the same back alleys I have covered here, including design patterns, XP, and PSP. Rosenberg concludes, “I can’t say that my quest to find better ways of making software was very successful,” but qualifies this by saying, “I don’t think the methodology peddlers are snake oil salespeople.”53 It’s just that the solutions being proposed don’t help with a large, complicated project like Chandler. Rosenberg eventually got tired of waiting, and his book appeared before Chandler did.
One of the most recent flavors of Agile is the Software Engineering Methods and Theory (SEMAT) initiative, created “to identify a common ground for software engineering … manifested as a kernel of essential elements that are universal to all software development efforts.” SEMAT is introduced in a book subtitled Applying the SEMAT Kernel, but with a more ambitious title, The Essence of Software Engineering. Like any good jeremiad, it features a call to action, which states:
Software engineering is gravely hampered today by immature practices. Specific problems include:
- The prevalence of fads more typical of a fashion industry than of an engineering discipline
- The lack of a sound, widely accepted theoretical basis
- The huge number of methods and method variants, with differences little understood and artificially magnified
- The lack of credible experimental evaluation and validation
- The split between industry practice and academic research54
As I’ve written in similar situations, it’s hard to argue with all that. How does SEMAT propose to address this? Not, initially anyway, by actually doing any experimental evaluation and validation. As programmer and former professor Greg Wilson comments in his “Two Solitudes” keynote talk from the SPLASH 2013 conference, the SEMAT book doesn’t cite a single empirical study.55 Instead, between the three forewords and twelve pages of testimonials at the end, it attempts to abstract out the common parts of software process management methodologies into a metamethodology, which could then be used to diagnose flaws in your actual methodology. Given that Agile is already somewhat removed from the actual problems of software engineering, taking a step back does not get you any closer to the essence.
Yet it does bring up a point about Agile. To some people, the problems with software relate to process management: making sure the requirements are correct, stakeholders are involved, and the right team is in place. The actual writing of the software is left as an exercise for the reader. For this audience, something like SEMAT is moving closer to the essence of software engineering.
I am not minimizing the importance of all that. The Agile techniques originated with consultants doing contract work; nothing concentrates you on the customer like knowing you won’t get paid unless they like what you deliver. This customer focus had often been ignored by programmers, who viewed customer change requests as evidence of their fickle “luserness,” not a necessary step in making them happy. In Engineering Excellence at Microsoft, we studied a field known as human performance improvement, to be used in analyzing Microsoft teams. One of the key tenets of human performance improvement contends, “Put a good performer in a bad system, and the system will win every time.”56 In other words, the inputs people get from the environment have a greater impact on their performance than what they themselves bring to the table. Having good requirements and involved stakeholders is a critical part of the environment in which programmers operate.
But for a lot of software engineers, this has already been decided; the spec is written, the team is chosen, and now code needs to be written. Agile tends to peter out just as the engineering gets complicated. If you have a team that can meet every day in a room, its project is small enough that it can test most of its work via unit tests; and if the team stays together for the duration of a project, it won’t hit mysterious problems calling an unclear API, because the person who wrote the API is probably in the room with the team. Schwaber and Beedle, discussing the issue of other people needing to learn the code that the team has written, came up with a simple yet impractical solution: “[We] instituted the following policy: whoever writes code owns it forever.”57 I suppose from the team’s perspective the code can be owned forever, if their perception of time ceases once they stop working on it, in a sort of reverse big bang.58 Unfortunately, customers don’t have this luxury.
The complications of software happen at larger and longer scale than those sorts of projects. While Agile may make easy problems a bit easier, it doesn’t help with the hard problems. It’s appealing to programmers, but to make software engineering more of an engineering discipline, something else is needed.