7

Automation

“But what’s happening?”

Flight 447 and the Jennifer Unit: When Human Messiness Protects Us from Computerized Disaster

When a sleepy Marc Dubois walked into the cockpit of his own airplane, he was confronted with a scene of confusion. The plane was shaking so violently that it was hard to read the instruments. An alarm was alternating between a chirruping trill and an automated voice: STALL STALL STALL. His junior copilots were at the controls. In a calm tone, Captain Dubois asked, “What’s happening?”1

Copilot David Robert’s answer was less calm. “We completely lost control of the airplane, and we don’t understand anything! We tried everything!”

Two of those statements were wrong. The crew were in control of the airplane. One simple course of action could have ended the crisis they were facing, and they had not tried it. But David Robert was certainly right on one count: he didn’t understand what was happening.

Air France Flight 447 had begun straightforwardly enough—an on-time takeoff from Rio de Janeiro at 7:29 p.m. on May 31, 2009, bound for Paris. Hindsight suggests the three pilots had their vulnerabilities. Pierre-Cédric Bonin, thirty-two, was young and inexperienced. David Robert, thirty-seven, had more experience, but he had recently become an Air France manager and no longer flew full-time. Captain Marc Dubois, fifty-eight, had experience aplenty, but he’d been touring Rio with an off-duty flight attendant. It was later reported that he had had only an hour’s sleep.

Fortunately, given these potential fragilities, the crew were in charge of one of the most advanced planes in the world, an Airbus A330, legendarily smooth and easy to fly. Like any other modern aircraft, the A330 has an autopilot to keep the plane flying on a programmed route, but it also has a much more sophisticated automation system called fly-by-wire. A traditional airplane gives the pilot direct control of the flaps, rudder, and elevators. This means that the pilot has plenty of latitude to make mistakes. Fly-by-wire is smoother and safer. It inserts itself between the pilot, with all his or her faults, and the plane’s mechanics, its flaps and fins and ailerons. A tactful translator between human and machine, it observes the pilot tugging on the controls, figures out how the pilot wants the plane to move, and executes that maneuver perfectly. It will turn a clumsy movement into a graceful one.

This makes it very hard to crash an A330, and the plane had a superb safety record: there had been no crashes in commercial service in the first fifteen years after it was introduced in 1994. But, paradoxically, there is a risk to building a plane that protects pilots so assiduously from even the tiniest error. It means that when something challenging does occur, the pilots will have very little experience to draw on as they try to meet that challenge.

The challenge facing Flight 447 did not seem especially daunting: thunderstorms over the Atlantic Ocean just north of the equator. These were not a major problem, although perhaps Captain Dubois was too relaxed when, at 11:02 p.m. Rio time, he left the cockpit for a nap, with the inexperienced Bonin in charge of the controls.

Faced with the storm, Bonin seemed nervous. The slightest hint of trouble produced an outburst of swearing: “Putain la vache. Putain!”—the French equivalent of “Fucking hell. Fuck!” He wished he could fly over the storm. More than once he expressed a desire to fly at “three-six”—36,000 feet—and lamented the fact that Air France procedures recommended flying a little lower. This desire for altitude was to become important: while it is possible to avoid trouble by flying over a storm, there’s a limit to how high the plane can go. The atmosphere becomes so thin that it can barely support the aircraft. Margins for error become tight. The aircraft will be at risk of stalling.

Unlike stalling a car, stalling an aircraft has nothing to do with the vehicle’s engine. An aircraft stall occurs when the plane tries to climb too steeply. At this steep angle, the wings no longer function as wings and the aircraft no longer behaves like an aircraft. It loses airspeed and falls gracelessly in a nose-up position.

In the thin air of high altitudes a stall is more likely. Fortunately, a high altitude also provides plenty of time and space to correct the stall. This is a well-known maneuver, fundamental to learning how to fly a plane: the pilot pushes the nose of the plane down and into a dive. The diving plane regains airspeed and the wings once more work as wings. The pilot then gently pulls out of the dive and into level flight once more.

In any case, an Airbus A330 will not stall. The fly-by-wire system will not allow the pilot to climb so steeply in the first place. Or so Pierre-Cédric Bonin seemed to believe.

As the plane approached the storm, ice crystals began to form on the wings. Bonin and Robert switched on the anti-icing system to prevent too much ice from building up and slowing the plane down. Robert nudged Bonin a couple of times to pull left, avoiding the worst of the weather; Bonin seemed slightly distracted, perhaps put on edge by the fact that Robert could have plotted a route around the storms much earlier. A faint odor of electrical burning filled the cockpit, and the temperature rose—Robert assured Bonin that all this was the result of the electrical storm, not an equipment failure.

And then an alarm sounded. The autopilot had disconnected. An airspeed sensor on the plane had iced over and stopped functioning—not a major problem, but one that required the pilots to take control. But something else happened at the same time and for the same reason: the fly-by-wire system downgraded itself to a mode that gave the pilot less help and more latitude to control the plane. Lacking an airspeed sensor, the plane was unable to babysit Pierre-Cédric Bonin.

The first consequence was almost immediate: the plane was rocking right and left, and Bonin was overcorrecting this with sharp jerks on the stick. With the fly-by-wire in its normal mode, those clumsy tugs would have been translated into smooth instructions. In the alternate mode, the plane did not intervene and the rocking continued. Gone was fly-by-wire as smooth-tongued interpreter; replacing it was fly-by-wire as a literal-minded translator that would relay any instruction, no matter how foolish.
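For readers who think in code, here is a toy sketch of the contrast between the two modes. It is emphatically not Airbus’s actual control law—the real “normal” and “alternate” laws are far more elaborate—just a hypothetical illustration, with invented numbers, of a smooth interpreter versus a literal-minded translator handling the same abrupt pull on the stick.

```python
# Toy illustration only: NOT Airbus's control laws. A hypothetical sketch of
# a "smooth interpreter" mode versus a "literal translator" mode.

MAX_SAFE_PITCH_DEG = 12.0   # hypothetical nose-up limit in "normal" mode
SMOOTHING = 0.2             # how heavily clumsy inputs are damped

def commanded_pitch(current_pitch, stick_input, mode="normal"):
    """Translate a raw stick pull (-1.0 to 1.0) into a pitch command, in degrees."""
    raw_target = current_pitch + stick_input * 10.0
    if mode == "normal":
        # Smooth the input and refuse to leave the safe envelope.
        smoothed = current_pitch + SMOOTHING * (raw_target - current_pitch)
        return min(smoothed, MAX_SAFE_PITCH_DEG)
    # Alternate mode: relay the instruction literally, however foolish.
    return raw_target

# The same sharp pull back on the stick, in each mode:
print(commanded_pitch(current_pitch=5.0, stick_input=1.0, mode="normal"))     # 7.0 -- gentled, capped at 12 if needed
print(commanded_pitch(current_pitch=5.0, stick_input=1.0, mode="alternate"))  # 15.0 -- relayed verbatim
```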

And then Bonin made a simple mistake: he pulled back on his control stick and the plane started to climb steeply.

It’s not clear why Bonin was seeking altitude. Perhaps it felt safer up there. He had been muttering that it was “too bad” he couldn’t fly a little higher, above the storm clouds. But in the thin air, a steep climb would cause the plane to stall. The A330 knew this perfectly well, and as the nose of the plane rose and it started to lose speed, an automated voice barked out in English: STALL STALL STALL. That word was to be repeated seventy-five times in the following four and a half minutes. At no point did any of the crew members mention it.

Despite the warning, Bonin kept pulling back on the stick, and in the black skies above the Atlantic the plane climbed at an astonishing rate of 7,000 feet a minute. But the plane’s airspeed was evaporating; it would soon begin to slide down through the storm and toward the water, 37,500 feet below. Bonin’s error was basic—every pilot is taught about stalling and about how to fix a stall: put the nose of the plane down and regain speed. Had either man realized what was happening, they could have fixed the problem, at least in its early stages. But they did not. Why?

Perhaps language was a barrier. The men were French and spoke French in the cockpit; the STALL STALL STALL was in English. Yet all pilots speak English so that they can communicate with air traffic control wherever they are in the world.

David Robert—a little rusty after his move into management—may not have realized what Bonin was doing. In some airplanes Bonin’s yank back on his control stick would have been mirrored by the stick in Robert’s hands, so the more experienced pilot would have received some visceral information about his colleague’s error. But in the A330 the parallel controls are not directly linked. Nor did Robert have a direct indication of how far Bonin had pulled the plane’s nose into the air. And since the airspeed indicator had iced over, Robert may have lost confidence in what other indicators were telling him.

As for Bonin, he seems to have responded to the crisis by behaving as he should in the only challenging situation he was likely to have faced as a young pilot: aborting a landing. (At one stage he mentioned to Robert that he was in “TOGA”—Take Off, Go Around.) When aborting a landing, a pilot needs to fire up the engines and climb quickly away from the ground. The air is thicker at ground level, so the plane can climb much more steeply without stalling; and in any case, any aborted landing Bonin would have conducted would have been with the help of the fly-by-wire system that simply would not have allowed him to stall. Perhaps Bonin’s instincts were telling him to climb out of danger, above the lightning and the turbulence. If so, his instincts were wrong. Far from climbing out of danger, his climb was the danger.

But the real source of the problem was the system that had done so much to keep A330s safe across fifteen years and millions of miles of flying: the fly-by-wire. Or more precisely, the problem was not the fly-by-wire system, but the fact that the pilots had grown to rely on that system. Bonin was suffering from a problem called mode confusion. Perhaps he did not realize that the plane had switched to the alternate mode that would provide him with far less assistance. Perhaps he knew the plane had switched modes but did not fully understand the implication: that his plane would now let him stall. That is the most plausible reason Bonin and Robert ignored the STALL STALL STALL alarm—they assumed that this was the plane’s way of telling them that it was intervening to prevent a stall. In short, Bonin stalled the aircraft because in his gut he felt it was impossible to stall the aircraft. And he failed to pull out of the stall for exactly the same reason.

Aggravating this mode confusion was Bonin’s lack of experience in flying a plane without computer assistance. While he had spent many hours in the cockpit of the A330, most were spent monitoring and adjusting the plane’s computers rather than directly flying it. And of the tiny number of hours spent manually flying the plane, few if any would have been spent in the degraded fly-by-wire mode, and almost all would have been spent taking off or landing. No wonder Bonin instinctively moved the plane as if for an aborted landing. And no wonder he felt so helpless at the plane’s controls.

•   •   •

The Air France pilots “were hideously incompetent,” says William Langewiesche, a writer and professional pilot.2 And Langewiesche thought he knew why. He argued persuasively in the pages of Vanity Fair that the pilots simply weren’t used to flying their own plane at altitude without the help of the computer. Even the experienced Captain Dubois was rusty: of the 346 hours he had been at the controls of a plane during the past six months, only four were in manual control rather than overseeing the autopilot, and even then he’d had the help of the full fly-by-wire system. All three pilots were denied the ability to practice their skills, because the plane was usually the one doing the flying.

This problem has a name: the paradox of automation. It applies to a wide variety of contexts, from the operators of a nuclear power station to the crew of a cruise ship to the simple fact that we can’t remember phone numbers anymore because we have them all stored in our cell phones and that we struggle with mental arithmetic because we’re surrounded by electronic calculators. The better the automatic systems, the more out-of-practice human operators will be, and the more unusual will be the situations they face.3 The psychologist James Reason, author of Human Error, wrote: “Manual control is a highly skilled activity, and skills need to be practiced continuously in order to maintain them. Yet an automatic control system that fails only rarely denies operators the opportunity for practicing these basic control skills . . . when manual takeover is necessary something has usually gone wrong; this means that operators need to be more rather than less skilled in order to cope with these atypical conditions.”*4

The paradox of automation, then, has three strands to it. First, automatic systems accommodate incompetence by being easy to operate and by automatically correcting mistakes. Because of this an inexpert operator can function for a long time before his lack of skill becomes apparent—his incompetence is a hidden weakness that can persist almost indefinitely without being detected. Second, even if operators are expert, automatic systems erode their skills by removing the need for them to practice. Third, automatic systems tend to fail either in unusual situations or in ways that produce unusual situations, requiring a particularly skillful human response. For each of these three strands, a more capable and reliable automatic system makes the situation worse.

There are plenty of situations in which automation creates no such paradox. A customer service webpage may be able to handle routine complaints and requests, so that customer service staff are spared repetitive work and may do a better job for customers with more complex questions.

Not so with an airplane. Autopilots and the more subtle assistance of fly-by-wire do not free up the crew to concentrate on the interesting stuff. Instead, they free up the crew to fall asleep at the controls, figuratively or even literally. One notorious incident occurred late in 2009, when two pilots let their autopilot overshoot Minneapolis airport by over a hundred miles. They’d been looking at laptops and had become distracted.5

When something goes wrong, it is hard to snap to attention and deal with a situation that is very likely to be bewildering.

•   •   •

His nap abruptly interrupted, Captain Dubois arrived in the cockpit a mere 1 minute and 38 seconds after the airspeed indicator had failed. The plane was still above 35,000 feet, although it was falling at over 150 feet a second. The de-icers had done their job and the airspeed sensor was operating again, but the copilots no longer trusted any of their instruments. The plane—which was now in perfect working order—was telling them that they were barely moving forward at all, and were slicing through the air down toward the water, tens of thousands of feet below. But rather than realize the faulty instrument was fixed, they appear to have assumed that yet more of their instruments had broken. Dubois was silent for twenty-three seconds—a long time, if you care to count them off. Long enough for the plane to fall 4,000 feet.

It was still not too late to save the plane—if Dubois had been able to recognize what was happening to it. The plane’s nose was now so high that the stall warning had stopped—it, like the pilots, simply rejected the information it was getting as anomalous. A couple of times, Bonin did push the nose of the plane down a little, and the stall warning started up again—STALL STALL STALL—which no doubt confused him further. At one stage Bonin tried to engage the speed brakes, worried that they were going too fast—the opposite of the truth. The plane was clawing its way forward through the air at less than 60 knots, about 70 miles per hour—far too slow. It was falling twice as fast. Utterly confused, the pilots argued briefly about whether the plane was climbing or descending.

Bewilderment reigned. Bonin and Robert were shouting at each other, each trying to control the plane. All three men were talking at cross-purposes. The plane’s nose was still pitched up, as if to climb, even as it lost altitude rapidly.

Robert: “Your speed! You’re climbing! Descend! Descend, descend, descend!”

Bonin: “I am descending!”

Dubois: “No, you’re climbing.”

Bonin: “I’m climbing? Okay, so we’re going down.”

Nobody said: “We’re stalling. Put the nose down and dive out of the stall.”

At 11:13 p.m. and 40 seconds, less than twelve minutes after Dubois first left the cockpit for a nap, and two minutes after the autopilot switched itself off, Robert yelled at Bonin, “Climb . . . climb . . . climb . . . climb . . . ,” the precise opposite of what needed to be done. Bonin replied that he’d had his stick back the entire time—the information that might have helped Dubois diagnose the stall, had he known.

Finally the penny seemed to drop for Dubois, who was standing behind the two copilots. “No, no, no . . . Don’t climb . . . no, no.”

Robert took the hint. He announced that he was taking control and pushed the nose of the plane down. The plane began to accelerate at last. But Robert was about one minute too late—that’s 11,000 feet of altitude. There was not enough room between the plummeting plane and the black water of the Atlantic to regain speed and then pull out of the dive.

In any case, Bonin silently retook control of the plane and tried to climb again. It was an act of pure panic. Robert and Dubois had, perhaps, realized that the plane had stalled—but they never said so. They may not have realized that Bonin was the one in control of the plane. And Bonin never grasped what he had done. His last words were: “But what’s happening?”

Four seconds later the aircraft hit the Atlantic Ocean at about 125 miles an hour. Everyone on board, 228 passengers and crew, died instantly.

•   •   •

Earl Wiener, a cult figure in aviation safety who died in 2013, coined what’s known as “Wiener’s Laws” of aviation and human error. One of them was “Digital devices tune out small errors while creating opportunities for large errors.”6

We might rephrase it as: “Automation will routinely tidy up ordinary messes, but occasionally create an extraordinary mess.” It’s an insight that applies far beyond aviation.

A few years ago, the police department in San Leandro, California, near Oakland, took at least 112 photographs of two cars owned by a local resident, Michael Katz-Lacabe. This fact did not emerge in some scandalous court case demonstrating Katz-Lacabe to be a terrorist or a gangland kingpin—it emerged because he filed a public records request to see the photos. And Katz-Lacabe’s cars weren’t photographed because he was suspected of any offense. They were photographed because everybody’s cars were photographed, the digital files scanned, license plates logged, and everything filed away with a date and a location. Mr. Katz-Lacabe’s daughters were photographed, too, aged five and eight at the time. Why? Because they were near the car when the virtual shutter clicked.7

The photographs of Mr. Katz-Lacabe’s cars—and children—were sent, with millions of others, to the Northern California Regional Intelligence Center, which is run by the U.S. federal government. A hundred million license plate photographs a second can be searched, thanks to software developed by Silicon Valley’s Palantir Technologies. The crime-fighting potential of such a vast and easily analyzable dataset is obvious. So, too, is its potential to be used in less palatable ways: as Katz-Lacabe mused to Andy Greenberg of Forbes magazine, the government could wind back the clock through its database of photos to see if someone has been “parked at the house of someone other than their wife, a medical marijuana clinic, a Planned Parenthood center, a protest.”

There’s clearly a debate to be had about the advantages and the authoritarian risks of such powerful technology, and it has been well discussed in recent years. But there’s another danger, a problem that gets far less attention but that Wiener’s observation brings into focus: What do we do in the rare cases when the technology fails?

Consider the tale of Victor Hankins, an ordinary British citizen who received an unwelcome seasonal gift for Christmas: a parking fine. Rather than return to an illegally parked car to find a ticket tucked under the windshield wipers, the first Hankins knew of his punishment was when a letter from the local council dropped onto his doormat. At 14 seconds after 8:08 p.m. on December 20, 2013, his car had been blocking a bus stop in Bradford, Yorkshire, and had been photographed by a camera mounted in a traffic enforcement van driving past. A computer had identified the license plate, looked it up in a database, and found Mr. Hankins’s address. An “evidence pack” was automatically generated, including video of the scene, a time stamp, and data to confirm the location. The letter from Bradford city council demanding that Mr. Hankins pay a fine or face court action was composed, printed, and mailed by an equally automatic process. There was just one problem: Mr. Hankins hadn’t been illegally parked at all. He had been stuck in traffic.8

In principle, such technology should not fall victim to the paradox of automation. It should free up humans to do more interesting and varied work—checking the anomalous cases, such as the complaint Mr. Hankins immediately registered, which are likely to be more intriguing than simply writing down yet another license plate and issuing yet another ticket. But the tendency to assume that the technology knows what it’s doing applies just as much to a bureaucracy as it does to pilots. Bradford’s city council initially dismissed Mr. Hankins’s complaint, apologetically admitting their error only when he threatened them with the inconvenience of a court case. We are reminded of an old joke: “To err is human, but to really foul things up takes a computer.”

On the very same day that Victor Hankins’s car was snapped, Google unveiled a neural network that could identify house numbers in photographs taken by the Google Street View cars. Give the network an hour, announced Google’s research team, and it could read every house number in France with 96 percent accuracy.9 That sounds impressive—but even a low error rate produces a vast number of mistakes. There are 25 million homes in France, so another way to describe Google’s new neural network is that it can misidentify a million street numbers an hour.
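The arithmetic behind that last sentence is easy to check. Here is a back-of-the-envelope calculation in Python using only the figures quoted above—25 million homes, 96 percent accuracy, one hour to read them all.

```python
# Back-of-the-envelope check of the claim above.
homes_in_france = 25_000_000
accuracy = 0.96

misread_per_hour = homes_in_france * (1 - accuracy)
print(f"{misread_per_hour:,.0f} house numbers misidentified per hour")  # 1,000,000
```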

Such a high error rate is actually a source of comfort, because it means the method won’t be relied on. Companies such as UPS or FedEx would never accept as many as one in twenty-five of their parcels going to the wrong address; it would be a reputational disaster. Nor—one would hope—would the French armed police be happy to barge through a door being only 96 percent sure that it wasn’t the home of a startled, law-abiding innocent. And if they did so routinely, complaints would be taken seriously: if the police ombudsman service knows that the police screw up one case out of twenty-five, they should be disposed to give you a hearing when you complain that you were one of the mistakes.

But what if Google improves its accuracy by a factor of a million? Now there will be only a single mistake out of France’s 25 million homes. But a mistake is going to happen to someone, and nobody will believe them. The rarer the exception gets, as with fly-by-wire, the less gracefully we are likely to deal with it. We will assume that the computer is always right, and when someone says the computer made a mistake, we will assume they are wrong or lying. What happens when private security guards throw you out of your local shopping mall because a computer has mistaken your face for that of a known shoplifter? (The technology already exists, and it is being modified to allow retailers to single out their most pliable customers for special offers the moment they walk into the store.10) When you’re on the “criminal” list, how easy would it be to get off it?

Automated systems can be wonderful—but if we put too much trust in them, people suffer. Consider the experience of Rahinah Ibrahim, a thirty-nine-year-old lecturer and architect who was studying for a doctorate at Stanford University. On January 2, 2005, she tried to fly from San Francisco to Hawaii to present her research at a conference. That was an arduous enough journey given that Rahinah was wheelchair-bound while recovering from surgery, but it was about to become more arduous still: after trying to check in, she was arrested in front of her teenage daughter, handcuffed, and driven to a holding cell. After a few hours, she was told she was cleared to fly to Hawaii the next day.11

Two months later, having gone on to her home country, Malaysia, she learned at the airport that her U.S. student visa had been revoked without notice. Although the mother of a U.S. citizen, she would never be able to return to the United States.

While it had taken only the mention of court for Bradford Council to relent over Victor Hankins’s parking ticket, it took Rahinah nine years, and $4 million worth of volunteered legal assistance—battling constant efforts by the U.S. government to derail the proceedings—before U.S. District Judge William Alsup delivered a damning conclusion. Rahinah had been put on a no-fly list by mistake—possibly the result of confusion between Jemaah Islamiyah, a terrorist group, which killed 202 people with a car bomb in Bali in 2002, and Jamaah Islah Malaysia, a professional association of Malaysians who have studied overseas. Rahinah was a member of the second group, not the first.

Once that error entered the database it acquired the iron authority of the computer. As the judge wrote, “Once derogatory information is posted to the TSDB [Terrorism Screening Database], it can propagate extensively through the government’s interlocking complex of databases, like a bad credit report that will never go away.” The initial mistake spread like a virus that no official had any interest in curing.

The world is a messy place. Is that digit a 1 or a 7, a lowercase l or an uppercase I? A car is motionless: Is it parked or stuck in traffic? Is that person a shoplifter or a shoplifter’s twin brother? Is a group with an unfamiliar-sounding name a gang of killers or an association of internationally minded scholars? In a messy world, mistakes are inevitable.

Yet automatic systems want to be tidy. Once an algorithm or a database has placed you in a particular category, the black-and-white definitions of the data discourage argument and uncertainty. You are a shoplifter. You were parked at a bus stop. You are on the no-fly list. The computer says so—with the authority that leads a government to defend itself doggedly for years, rather than acknowledge that sometimes mistakes are made.

We are now on more lists than ever before: lists of criminals; lists of free-spending shoppers; lists of people who often drive around San Leandro, California; even lists of rape survivors.12 Computers have turned such lists from inert filing cabinets full of paper to instantly searchable, instantly actionable data. Increasingly, the computers are filling these databases with no need for humans to get involved, or even to understand what is happening. And the computers are often unaccountable: an algorithm that rates teachers and schools, Uber drivers, or businesses on Google’s search will typically be commercially confidential. Whatever errors or preconceptions have been programmed into the algorithm from the start, it is safe from scrutiny: those errors and preconceptions will be hard to challenge.13

Yet for all the power and the genuine usefulness of such data, perhaps we have not yet acknowledged how imperfectly a tidy database maps onto a messy world. We fail to see that a computer that is a hundred times more accurate than a human and a million times faster will make ten thousand times as many mistakes. And—recalling Johann Gottlieb Beckmann’s scientific forestry project—we fail to acknowledge the power of the database, not just to categorize the world, but to shape it.
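The arithmetic is worth spelling out. The sketch below uses purely hypothetical figures—a human who makes one mistake per hundred cases and handles a thousand of them—to show how a hundredfold better error rate and a millionfold greater volume combine into ten thousand times as many mistakes.

```python
# The arithmetic behind the sentence above, with hypothetical numbers.
human_error_rate = 0.01                            # 1 mistake per 100 cases (assumed)
human_cases = 1_000                                # workload handled by the human (assumed)

computer_error_rate = human_error_rate / 100       # a hundred times more accurate
computer_cases = human_cases * 1_000_000           # a million times faster

human_mistakes = human_error_rate * human_cases            # 10 mistakes
computer_mistakes = computer_error_rate * computer_cases   # 100,000 mistakes
print(computer_mistakes / human_mistakes)                  # 10000.0
```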

This is not to say that we should call for death to the databases and algorithms. Even if we share the misgivings of Michael Katz-Lacabe, most of us will admit that there is at least some legitimate role for computerized attempts to investigate criminal suspects, keep traffic flowing, and prevent terrorists from getting on airplanes. But the database and the algorithm, like the autopilot, should be there to support human decision-making. If we rely on computers completely, disaster awaits.

•   •   •

Our learned helplessness in the hands of technology is sometimes more amusing than horrifying. In March 2012, three Japanese students visiting Australia decided to drive to North Stradbroke, guided by their GPS system. For some reason the GPS was not aware that their route was blocked by nine miles of the Pacific Ocean. These things happen, of course, but the reaction of the three tourists was extraordinary: in thrall to their technology, they drove their car down onto the beach and across the mudflats toward the ocean. As the water lapped around their Hyundai, they realized, to their embarrassment, that they were stuck. With astonished ferry passengers looking on, the students abandoned their car and waded to shore. The car was a write-off. “We want to come back to Australia again,” said one of the men. “Everyone is very nice, even today.”14

It’s fun to laugh at incompetent tourists. But it is also worth asking how on earth three sentient beings can drive into the Pacific Ocean on the instructions of GPS gone haywire.* Automated systems tend to lull us into passivity. In other contexts, we have a well-known tendency to accept whatever defaults are being suggested to us—for example, in countries where people are signed up by default to an organ donor register unless they tick a box to opt out, almost everyone accepts this default. In countries where you have to tick a box to opt in, coverage is far less. Much the same is true for corporate pensions. If the default is different, we act differently—despite the fact that these are vital, life-changing decisions.15

This tendency to passively accept the default option turns out to apply to automated decisions, too; psychologists call it automation bias. The problem—in Bradford, on the U.S. no-fly list, everywhere—is that once a computer has made a recommendation, it is all too easy to accept that recommendation unthinkingly.

Driving your car into the sea is an extreme example of automation bias, but most GPS users will recognize the tendency in a milder form. The first time you use the GPS, you’re wary. You check the map, perhaps print out some directions, familiarize yourself with the terrain, and manually estimate how long your journey will take. But after three or four successful outings, you’re hooked: Why bother with all that paraphernalia when the GPS will find you a route more quickly and reliably?

The GPS won’t let you down often, but when it does, it will let you down badly. The first time it happened to me, I was headed for a hotel in the center of York, a beautiful medieval city surrounded by city walls that constrain the flow of traffic. I arrived late at night to find my route blocked for overnight resurfacing works; the GPS hadn’t gotten the memo and ordered me to plow on through the blockage. Fortunately I wasn’t tempted to follow the GPS under the oncoming steam roller, but that was where my competence ended: I’d trusted the computer and had no backup plans. I didn’t know where I was, or where my hotel was. I had no map—this was before smartphones—so I was reduced to driving around aimlessly in the hope that the machine might eventually form an alternative plan.

After a few more trouble-free trips, I soon started trusting the GPS again—and it performed faultlessly for years, until recently I set off in the general direction of a country wedding armed only with a zip code that the computer turned out not to recognize. Not knowing why the GPS failed me, I have no way of predicting when it might do so again.

Gary Klein, a psychologist who specializes in the study of expert and intuitive decision-making, summarizes the problem: “When the algorithms are making the decisions, people often stop working to get better. The algorithms can make it hard to diagnose reasons for failures. As people become more dependent on algorithms, their judgment may erode, making them depend even more on the algorithms. That process sets up a vicious cycle. People get passive and less vigilant when algorithms make the decisions.”16

Decision experts such as Klein complain that many software engineers make the problem worse by deliberately designing systems to supplant human expertise by default; if we wish instead to use them to support human expertise, we need to wrestle with the system. GPS devices, for example, could provide all sorts of decision support, allowing a human driver to explore options, view maps, and alter a route. But all these functions tend to be buried deeper in the app. They take effort, whereas it is very easy to hit “Start Navigation” and trust the computer to do the rest.

Systems that supplant, not support, human decision-making are everywhere. We worry that the robots are taking our jobs, but just as common a problem is that the robots are taking our judgment. In the large warehouses so common behind the scenes of today’s economy, human “pickers” hurry around grabbing products off shelves and moving them to where they can be packed and dispatched. In their ears are earpieces: the voice of “Jennifer,” a piece of software, tells them where to go and what to do, controlling the smallest details of their movements. Jennifer breaks down instructions into tiny chunks, to minimize error and maximize productivity—for example, rather than pick eighteen copies of a book off a shelf, the human worker would be politely instructed to pick five. Then another five. Then yet another five. Then three more. Working in such conditions reduces people to machines made of flesh. Rather than ask us to think or adapt, the Jennifer unit takes over the thought process and treats workers as an inexpensive source of some visual processing and a pair of opposable thumbs.17
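For the curious, here is a hypothetical sketch of the kind of instruction-chunking just described. The real “Jennifer” voice-picking software is proprietary; this is only an illustration of the idea of turning one order for eighteen items into bite-sized sub-picks of at most five.

```python
# Hypothetical sketch of instruction-chunking, in the spirit of the
# voice-picking system described above. Not the real "Jennifer" software.

def chunk_pick(total_items, chunk_size=5):
    """Split a single pick order into small, easily counted spoken instructions."""
    instructions = []
    remaining = total_items
    while remaining > 0:
        step = min(chunk_size, remaining)
        instructions.append(f"Pick {step}")
        remaining -= step
    return instructions

print(chunk_pick(18))  # ['Pick 5', 'Pick 5', 'Pick 5', 'Pick 3']
```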

You could even argue that the financial crisis of 2007–2008, which plunged the world into recession, was analogous to absentmindedly driving a car into the Pacific. One of the weaknesses that contributed to the crisis was the failure of financial products called collateralized debt obligations (CDOs)—hugely complex structures, whose value depended in an opaque way on the health of the U.S. mortgage market. A grizzled market participant might have looked at rapidly inflating house prices and mused that a house price crash was possible, even though the United States had not experienced a nationally synchronized crash before. And if that grizzled market participant had been able to talk to the computers, the computers would have been able to demonstrate the catastrophic impact of such a crash on the value of CDOs. Unfortunately, there was no meeting of minds: the computers didn’t have the tacit knowledge of the experienced humans, so they didn’t process the idea that a crash was plausible, while the experienced humans didn’t understand what their intuition would imply for the value of CDOs.

It is possible to resist the siren call of the algorithms. Rebecca Pliske, a psychologist, found that veteran meteorologists would make weather forecasts first by looking at the data and forming an expert judgment; only then would they look at the computerized forecast to see if the computer had spotted anything that they had missed. (Typically, the answer was no.) By making their manual forecast first, these veterans kept their skills sharp, unlike the pilots on the Airbus A330. However, the younger generation of meteorologists are happier to trust the computers. Once the veterans retire, the human expertise to intuit when the computer has screwed up will be lost forever.18

•   •   •

We’ve seen the problems with GPS systems and with autopilot. Put the two ideas together, and you get the self-driving car.

Chris Urmson, who runs Google’s self-driving car program, hopes that the cars will soon be so widely available that his sons will never need to have a driving license. (His oldest son will be sixteen in 2020—Urmson is in a hurry.) There’s a revealing implication in that target: that unlike a plane’s autopilot, a self-driving car will never need to cede control to a human being. True to form, Google’s autonomous vehicles have no steering wheel, though one hopes there will be some way to jump out if they start heading for the ocean.19

Not everyone thinks it is plausible for cars to be completely autonomous—or, at least, not soon enough for Urmson junior. Raj Rajkumar, an autonomous-driving expert at Carnegie Mellon University, thinks completely autonomous vehicles are ten to twenty years away. Until then, we can look forward to a more gradual process of letting the car drive itself in easier conditions, while humans take over at more challenging moments.

“The number of scenarios that are automatable will increase over time, and one fine day, the vehicle is able to control itself completely, but that last step will be a minor, incremental step and one will barely notice this actually happened,” Rajkumar told the 99% Invisible podcast. Even then, he says, “there will always be some edge cases where things do go beyond anybody’s control.”

If this sounds ominous, perhaps it should. At first glance, it seems reasonable that the car will hand over to the human driver when things are difficult. But that raises two immediate problems. If we expect the car to know when to cede control, then we’re expecting the car to know the limits of its own competence—to understand when it is capable and when it is not. That is a hard thing to ask even of a human, let alone a computer.

Alternatively, if we expect the human to leap in and take over, how will the human be able to react appropriately? Given what we know about the difficulty highly trained pilots can have figuring out an unusual situation when the autopilot switches off, surely we should be skeptical about humans’ capacity to notice when the computer is about to make a mistake. “Human beings are not used to driving automated vehicles, so we really don’t know how drivers are going to react when the driving is taken over by the car,” says Anuj K. Pradhan of the University of Michigan.20 It seems likely that we’ll react by playing a computer game or chatting on a video phone, rather than watching like a hawk how the computer is driving—maybe not on our first trip in an autonomous car, but certainly on our hundredth.

And when the computer gives control back to the driver, it may well do so in the most extreme and challenging situations. The three Air France pilots had two or three minutes to work out what to do when their autopilot asked them to take over an A330; what chance would you or I have when the computer in our car says “Automatic Mode Disengaged” and we look up from our smartphone screen to see a bus careening toward us?

Anuj Pradhan has floated the idea that humans should have to acquire several years of manual experience before they are allowed to supervise an autonomous car. But it is hard to see how this solves the problem. No matter how many years of experience a driver has, her skills will slowly erode if she lets the computer take over. Pradhan’s proposal gives us the worst of both worlds: we let teenage drivers loose in manual cars, when they are most likely to have accidents. And even when they’ve learned some road craft, it won’t take long being a passenger in a usually reliable autonomous car before their skills begin to fade.

Recall that Earl Wiener said, “Digital devices tune out small errors while creating opportunities for large errors.”21 In the case of autopilots and autonomous vehicles, we might add that it’s because digital devices tidily tune out small errors that they create the opportunities for large ones. Deprived of any awkward feedback, any modest challenges that might allow us to maintain our skills, when the crisis arrives we find ourselves lamentably unprepared.

•   •   •

Every application of Wiener’s insight about large and small errors involves a trade-off. The GPS routinely saves me the minor hassle of planning before a trip, but at the cost of occasionally sending me scuttling apologetically into a rural church just ahead of the bridal procession. Is the odd frustration worth the cumulative time saved? Given that I have again slipped back into trusting the GPS, I must have concluded that it is.

When it comes to tidy databases, the trade-off is fraught. Automation makes it easier to punish parking infringements and keep potential terrorists off airplanes. (While there are valid debates to be had about ends and means, respectively, let us assume these are both good things.) But it creates unusual situations where individuals have to battle to get an unlikely-sounding story accepted: “I wasn’t parked illegally, I was stuck in traffic”; or “That’s not a terrorist group, it’s an alumni association.” Does more efficient service in the majority of cases justify trapping a small number of individuals in Kafkaesque battles against bureaucracy? That is a question with no easy answer. But it does tell us we should strive to listen to people who say they have fallen victim to a rare and unusual error and set up mechanisms to sort out these errors quickly.

With fly-by-wire, it’s much easier to assess whether the trade-off is worthwhile. Until the late 1970s, one could reliably expect at least twenty-five fatal commercial plane crashes a year. In 2009, Air France 447 was one of just eight crashes, a safety record. The cost-benefit analysis seems clear: freakish accidents like Flight 447 are a price worth paying, as the steady silicon hand of the computer has prevented many others.

Still, one cannot help but wonder if there is a way to combine the adaptability, judgment, and tacit knowledge of humans with the reliability of computers, reducing the accident rate even further. One priority could be to make semi-automated systems give feedback in a way that humans feel more viscerally. The crew of Air France 447 were told seventy-five times that they were stalling—STALL STALL STALL—but they didn’t feel it instinctively. If the cockpit had displayed a large image of the plane with its nose in the air, that might have conveyed the danger. Similarly, the control sticks weren’t physically mimicking each other, so the more senior copilot didn’t realize that young Bonin was overriding what he was doing. Again, a verbal warning announced that the pilots were giving the plane contradictory instructions, but the verbal warning was easily ignored—a more physical connection might have produced a more attentive response.

Some senior pilots urge their juniors to turn off the autopilots from time to time to maintain their skills. That sounds like good advice. But if the junior pilots turn off the autopilot only when it is absolutely safe to do so, they aren’t practicing their skills in a challenging situation. And if they turn off the autopilot in a challenging situation, they may provoke the very accident they are practicing to avoid.

An alternative solution is to reverse the role of computer and human. Rather than let the computer fly the plane with the human poised to take over when the computer cannot cope, perhaps it would be better to have the human fly the plane with the computer monitoring the situation, ready to intervene. Computers, after all, are tireless, patient, and do not need practice. Why, then, do we ask the people to monitor the machines and not the other way around? That is the way the best meteorologists acted when studied by psychologist Rebecca Pliske: the human made the forecast and then asked the machine for a second opinion. Such a solution will not work everywhere, but it is worth exploring.

If we are stuck with the problem of asking humans to monitor computers, it is vital to keep those humans interested, and there may be safe ways to add a little dose of mess. Airplanes are only one type of largely automated system that humans are tasked with keeping an eye on—others include high-speed trains, U.S. military drones, and warehouses full of robot forklifts. Supervising a drone sounds like an exciting job, but it can be dreadfully boring for much of the time. The drone might be circling over Afghanistan while the operator is snacking on M&M’s at Creech Air Force Base in Indian Springs, Nevada, keeping half an eye on the screen and idly daydreaming—then they suddenly need to snap to attention and decide whether or not to kill a potential target.

Enter Mary “Missy” Cummings, one of the U.S. Navy’s first female fighter pilots and now an expert in the field of humans supervising semi-autonomous machines. Missy Cummings and her team ran an experiment during which drone pilots were given a long, often boring simulated mission, punctuated with occasional life-or-death decisions. As they gazed at grainy images coming in from four different drones, tapped navigation instructions into the computer, and waited for something to happen, the pilots often became distracted. They’d sit with a book or a laptop in front of them, glancing back and forth between the mission and something more interesting. (These distractions were neither forbidden nor encouraged by researchers, who wanted to see what people would do.)

Unsurprisingly, the scientists showed that reaction times and other measures of performance dramatically deteriorated as the hours ticked by. But they also observed that many of the more successful pilots adopted an interesting tactic. Rather than attend to their task through sheer willpower, or divide their attention, trying to do both their job and their e-mail at the same time, they distracted themselves in brief bursts. A few minutes with their back to the drone monitors, doing something completely different, would refresh them as they returned to the task.

Such behavior suggests that when humans are asked to babysit computers, the computers themselves should be programmed to serve up occasional brief diversions. Even better might be an automated system that demanded more input, more often, from the human—even when that input wasn’t strictly needed.22 If you occasionally need human skill at short notice to navigate a hugely messy situation, it may make sense to artificially create smaller messes, just to keep people on their toes.
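What might that look like in practice? The following is a speculative sketch, not a description of any real drone or warehouse system: a supervisory loop that pauses for a human decision when the automation genuinely needs one, and that occasionally asks for a small confirmation even when it does not, simply to keep the operator engaged. The callbacks and field names are invented for illustration.

```python
# Speculative sketch only: a supervisory loop with occasional attention checks.
# The status fields and callbacks are hypothetical, not a real system's API.
import random
import time

def supervise(get_status, prompt_operator, poll_seconds=60, check_rate=0.1):
    """Poll an automated system and inject occasional attention checks."""
    while True:
        status = get_status()                      # e.g. {"needs_human": False, "event": ""}
        if status.get("needs_human"):
            # A genuine decision point: hand it to the human.
            prompt_operator(f"Decision required: {status.get('event', 'unknown event')}")
        elif random.random() < check_rate:
            # No decision needed, but ask anyway to keep the operator in the loop.
            prompt_operator("Routine check: does the current picture look normal?")
        time.sleep(poll_seconds)
```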

•   •   •

In the mid-1980s, a Dutch traffic engineer named Hans Monderman was sent to the village of Oudehaske. Two children had been killed by cars, and Monderman’s radar gun showed right away that drivers were going too fast through the village. Monderman pondered the traditional solutions—traffic lights, speed bumps, additional signs pestering drivers to slow down. They were expensive and, Monderman knew, often ineffective. Control measures such as traffic lights and speed bumps frustrated drivers, who would often speed dangerously between one measure and another.

And so Monderman tried something revolutionary. He suggested that the road through Oudehaske be made to look more like what it already was: a road through a village. First, the existing traffic signs were removed. (Signs always irritated Monderman: driving through his home country of the Netherlands with the writer Tom Vanderbilt, Monderman railed against their patronizing redundancy. “Do you really think that no one would perceive there is a bridge over there?” he would ask, waving at a sign that stood next to a bridge, notifying people of the bridge.23) The signs might ostensibly be asking drivers to slow down. However, argued Monderman, because signs are the universal language of roads everywhere, on a deeper level the effect of their presence is simply to reassure drivers that they are on a road—a road like any other, where cars rule. Monderman wanted to remind them that they were also in a village, where kids might play.

So, next, he replaced the tarmac with red-brick paving, and the raised curb with a flush sidewalk and gently curved guttering. Cars could stray off the road and onto the verge if they wished. They tended not to.

Where once drivers had, figuratively speaking, sped through the village on autopilot—not really attending to what they were doing—now they were faced with a messy situation and had to engage their brains. It was hard to know quite what to do, or where to drive—or which space belonged to the cars and which to the village children. As Tom Vanderbilt describes Monderman’s strategy, “Rather than clarity and segregation, he had created confusion and ambiguity.”24

A confusing situation always grabs the attention, as Brian Eno has argued. Perplexed, drivers took the cautious way forward: they drove so slowly through Oudehaske that Monderman could no longer capture their speed on his radar gun. Earl Wiener would have recognized the logic: by forcing drivers to confront the possibility of small errors, the chance of their making larger ones was greatly reduced.

Monderman was the most famous of a small group of traffic planners around the world who have been pushing against the trend toward an ever tidier strategy for making traffic flow smoothly and safely. The usual approach is to give drivers the clearest possible guidance as to what they should do and where they should go: traffic lights, bus lanes, cycle lanes, left- and right-filtering traffic signals, railings to confine pedestrians, and of course signs attached to every available surface, forbidding or permitting different maneuvers. Laweiplein in the Dutch town of Drachten was such a typical junction, and accidents were common. Frustrated by waiting in jams, drivers would sometimes try to beat the traffic lights by blasting across the junction at speed—or they would be impatiently watching the lights, rather than watching for other road users. (In urban environments, about half of all accidents happen at traffic lights.25) With a shopping center on one side of the junction and a theater on the other, pedestrians often got in the way, too.

Monderman wove his messy magic and created the “Squareabout.” He threw away all the explicit efforts at control. In their place, he built a square with fountains, a small grassy roundabout* in one corner, pinch points where cyclists and pedestrians might try to cross the flow of traffic, and very little signposting of any kind. It looks much like a pedestrianization plan—except that the square has as many cars crossing it as ever, approaching from all four directions. Pedestrians and cyclists must cross the traffic as before, but now they have no traffic lights to protect them. It sounds dangerous—and surveys show that locals think it is dangerous. It is certainly unnerving to watch the Squareabout in operation—drivers, cyclists, and pedestrians weave in and out of one another in an apparently chaotic fashion.

Yet the Squareabout works. Traffic glides through slowly but rarely stops moving for long. The number of cars passing through the junction has risen, yet congestion has fallen. And the Squareabout is safer than the traffic-light crossroads that preceded it, with half as many accidents as before. It is precisely because the Squareabout feels so hazardous that it is safer. Drivers never quite know what is going on or where the next cyclist is coming from, and as a result they drive slowly and with the constant expectation of trouble. And while the Squareabout feels risky, it does not feel threatening; at the gentle speeds that have become the custom, drivers, cyclists, and pedestrians have time to make eye contact and to read one another as human beings rather than threats or obstacles. When showing visiting journalists the Squareabout, Monderman’s party trick was to close his eyes and walk backward into the traffic. The cars would just flow around him without so much as a honk on the horn.

In Monderman’s artfully ambiguous Squareabout, drivers are never given the opportunity to glaze over and switch to the automatic driving mode that can be so familiar. The chaos of the square forces them to pay attention, work things out for themselves, and look out for one another. The square is a mess of confusion. That is why it works.