8 Rise of the Robots

Modern driverless cars began to emerge from the labs of robotics researchers in the final decades of the twentieth century. Throughout the 1980s and 1990s, German autonomous-vehicle pioneer Ernst Dickmanns built several prototypes that used sensors and intelligent software to steer themselves. In Italy, Professor Alberto Broggi created a car that used machine vision software to follow painted lane markers. As primitive as these early driverless cars were by today’s standards, they had a major advantage over the radio-guided Buicks of the past: their intelligence was carried on-board the car rather than buried in the road.

Two catalysts helped make modern robotic cars a reality. The first was that microprocessors shrank and grew more powerful. The second was a 2001 U.S. congressional mandate dictating that by the year 2015, one-third of the vehicles used in military war zones should be fully autonomous. Like cell phones, GPS, and the internet, driverless cars are yet another consumer technology whose origins lie in military technology.

As part of the mandate, Congress tasked DARPA with driving the development of the necessary technologies and authorized the agency to give out cash prizes to anyone who could demonstrate they could build an autonomous vehicle. Equipped with an alluring pot of prize money, DARPA officials laid out a plan. The agency would sponsor a series of road races where researchers could compete for cash prizes by pitting their robotic vehicle against those of their colleagues from other universities and companies (readers will recall that DARPA later used a similar technique to motivate the development of disaster recovery robots like CHIMP).

Between 2001 and 2007, DARPA sponsored three road races, the DARPA Challenges of 2004, 2005, and 2007. The first DARPA Challenge, held in 2004, offered a cash prize of $1 million to the team whose autonomous vehicle could win a 150-mile-long race through an uninhabited section of the Mojave Desert in the southwestern United States. The desert was a natural choice for the first challenge. Given the primitive state of autonomous-vehicle technology in those early days, it was crucial that the roboticists test their work far from busy places such as shopping malls and streets crowded with pedestrians, baby carriages, and other potentially disastrous obstacles.

It turns out the desert was the right choice of venue. The robotics software and hardware with which the fifteen competing teams equipped their cars were too crude to handle the task at hand. Hardware sensors and GPS devices were slow and unreliable. The competing vehicles’ machine-vision software performed even more poorly, stranding vehicles on embankments and rocks. Other vehicles were felled by mechanical problems. After a few hours of race time, none of the fifteen competing vehicles had made it farther than eight miles into the course. The $1 million cash prize went unclaimed.

It would have been easy for everyone involved to give up after such a crushingly disappointing outcome. In an interview with CNN following the 2004 race, Tom Strat, deputy program manager of the DARPA Grand Challenge, remained firmly optimistic, saying “Even though nobody got more than about 5 percent of the way through the course, this has made these engineers even more determined.”1

Undeterred, DARPA ponied up funding to repeat the race the next year. The 2004 race had involved a fascinating but motley crew of autonomous vehicles, ranging from a small two-ton truck to a sprightly dune buggy with oversized wheels. The second time around, for the 2005 race, DARPA tightened up its selection process for competitors, holding site visits and a National Qualifying Event to winnow down the field. The purse for the winning team was increased to $2 million.

Based on performance in the qualifying event, DARPA selected twenty-three teams to compete in the 2005 DARPA Challenge. The 2005 race also took place in an uninhabited desert. The rules were roughly the same as for the previous year: competing vehicles had to drive themselves through a 132-mile-long off-road course without relying on support from roadside infrastructure or human assistance.

By the end of race day, it was clear that DARPA 2005 represented a critical tipping point in the development of modern robotics. If only the engineers who built GM’s and RCA’s electronic highway half a century earlier could have witnessed the miracle that took place on the unpaved desert roads. For the first time ever, five autonomous vehicles safely steered themselves through a treacherous sandy race course using only their own artificial perception to find their way.

The winner of the 2005 challenge was the Stanford Racing Team, whose vehicle completed the course in just under seven hours. Close on its heels were two cars from Carnegie Mellon University that took second and third places. Fourth place went to a car from the Gray Insurance Company, and fifth to an entry built by the Oshkosh Truck Corporation.

Equally exciting was the software that Stanford’s victorious vehicle used to win the race. While the other competing teams planned their vehicles’ courses in advance with topographic maps and aerial imagery, Stanford’s champion car, a souped-up VW Touareg named Stanley, used another approach. During the months leading up to the 2005 challenge, Stanley’s mid-level controls learned to drive.


Figure 8.1 Beer Bottle Pass, an infamous stretch of road approximately seven miles from the finish line of the 2005 DARPA Grand Challenge, featured over twenty twists and turns.

Source: U.S. Federal Government (DARPA); Wikipedia

Machine learning and driving

Sebastian Thrun, the Stanford professor who led the team of students that developed Stanley, did a few key things differently. First, he realized that software, not hardware, would determine who won the race. Second, to create the mid-level control software, the “perceiving” and “reacting” portions of the car’s guiding software, Thrun and his team opted not to use rule-based software, the prevailing AI paradigm at the time. In the early days of their project, Thrun and his team decided that attempting to write a logical set of rules to deal with the vast array of topographical details and random objects the car would encounter during the race was simply not going to work.

Instead, Thrun and his team used machine learning. Thrun explained:

Many of the people who participated in the race had a strong hardware focus, so a lot of teams ended up building their own robots. Our calculus was that this was not about the strength of the robot or the design of the chassis. Humans could drive those trails perfectly; it was not complicated off-road terrain. It was really just desert trails. So we decided it was purely a matter of artificial intelligence. All we had to do was put a computer inside the car, give it the appropriate eyes and ears, and make it smart.

In trying to make it smart, we found that driving is really governed not by two or three rules but by tens of thousands of rules. There are so many different contingencies. We had a day when birds were sitting on the road and flew up as our vehicle approached. And we learned that to a robot eye, a bird looks exactly the same as a rock. So we had to make the machine smart enough to distinguish birds from rocks.

In the end, we started relying on what we call machine learning, or big data. That is, instead of trying to program all these rules by hand, we taught our robot the same way we would teach a human driver. We would go into the desert, and I would drive, and the robot would watch me and try to emulate the behaviors involved. Or we would let the robot drive, and it would make a mistake, and we would go back to the data and explain to the robot why this was a mistake and give the robot a chance to adjust.2

To understand why Thrun’s decision to use machine learning was a radical approach at the time, let’s revisit the two prevailing paradigms of artificial-intelligence software we covered in previous chapters: rule-based AI and data-driven AI (increasingly known as machine learning). As we discussed earlier, rule-based AI demands that its programmer first devise a theoretical model of the world, then write a set of rules called if-then statements to logically interact with that model. In contrast, machine learning involves applying an algorithm to large amounts of data and using statistical techniques to process that data, eventually exposing the software to enough data that it “learns” to recognize patterns without human oversight.
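To make the contrast concrete, here is a minimal, hypothetical sketch in Python. The feature names, thresholds, and data are invented for illustration, and the learned model uses scikit-learn’s DecisionTreeClassifier as a stand-in for any statistical learner:

```python
# Rule-based AI versus machine learning, in miniature.
# All features, thresholds, and data below are invented for illustration.
from sklearn.tree import DecisionTreeClassifier

# Rule-based approach: the programmer encodes the world model as if-then statements.
def rule_based_is_obstacle(height_m: float, width_m: float) -> bool:
    if height_m > 0.3 and width_m > 0.2:      # hand-chosen thresholds
        return True
    return False

# Machine-learning approach: a model is fitted to labeled examples instead.
examples = [[0.05, 0.10], [0.50, 0.40], [0.02, 0.30], [0.80, 1.20]]  # [height, width]
labels   = [0, 1, 0, 1]                                              # 0 = clear, 1 = obstacle
model = DecisionTreeClassifier().fit(examples, labels)

print(rule_based_is_obstacle(0.4, 0.3))   # the hand-written rule fires: True
print(model.predict([[0.4, 0.3]]))        # the learned model predicts: [1]
```

The difference in spirit is the point: in the first function a human chooses the thresholds in advance, while in the second the decision boundary is extracted from the data.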

In chapter 5, we conducted an exercise in writing mid-level control software to guide a car through a busy intersection. We learned that rule-based code falters when asked to handle the wide variety of obstacles a car might encounter in the real world. Software that uses logical if-then statements to oversee a car’s perception and response will quickly be derailed by corner cases and exceptions to the rules.

Now imagine that we are undertaking a similar exercise, but this time our goal is to write mid-level control software that can guide a car through a desert. Our software must be capable of identifying which portions of the ground in front of the car are safe to drive upon (“drivable”) and which portions are not (“undrivable”). One solution could be to use aerial images and GPS data of the desert landscape to manually chart a path for our car. After some thought, however, it would become apparent that such a solution could not anticipate all the obstacles our car will face at ground level as it navigates around uncharted potholes, debris, big rocks, and ditches.

For the purpose of this exercise, let’s assume that we all arrive at the same conclusion: the best way to keep the car on ground that’s safe to drive upon is to build mid-level control software that, like a human driver, identifies “drivable” ground in real time, without relying on rules or logic. I witnessed one memorable demonstration of the futility of applying rules to driving in the 2005 challenge. One of my colleagues led a team that spent months writing a set of logical rules and applying them to the stream of data flowing in from their vehicle’s visual sensors. When sensor data indicated a substantial rise in the ground in front of the car, the control software would respond by turning the wheel and steering the car around the obstacle.

After months of hard work, this team’s code base was voluminous and detailed. Unfortunately, their quest for the prize money came (literally) to a screeching halt when, during the race, their vehicle slammed on its brakes just before it entered a tunnel. The team later figured out that their mid-level control software lacked a specific rule dealing with tunnels, since none of the programmers had anticipated that one would appear on the course. Without clear guidance, the car took its best guess. Based on the height of the tunnel ceiling and the fact that it loomed large over the road ahead, the car’s software classified the tunnel as a gigantic, steep wall. Unexpectedly presented with what it thought was a wall, the software did the right thing: it slammed on the brakes and refused to move until its human programmer came and coaxed the car back onto safer ground.

Rule-based software can be a valuable part of a driverless car’s toolkit for high-level control applications like route planning, and to manage low-level activities, such as checking the status of the gas tank. However, rule-based artificial intelligence has a tendency to break down in unstructured environments, leading some roboticists to refer to top-down AI software as “brittle.” In the 2004 and 2005 challenges, the software used by most of the contestants proved to be too brittle to do its job, one reason why so many of the competing vehicles failed to complete the course.

The third and final DARPA challenge, the 2007 Urban Challenge, took place on an unused U.S. Air Force base seventy-five miles northeast of Los Angeles. To keep competing teams on their toes, it was decided that the sixty-mile race course would be an unstructured and dynamic environment, similar to what an autonomous vehicle would encounter in a chaotic war zone (or on a busy freeway). To win the first-place prize of $2 million, teams would have to build mid-level control software that could guide their vehicle safely around other moving cars on an unfamiliar course without being explicitly programmed to do so. At the time, such an assignment was as formidable a challenge for an autonomous vehicle as climbing Mt. Everest during a blinding snowstorm without a map would be for a human.

The rules of the 2007 Urban Challenge were straightforward: without a human driver on board, each vehicle had to complete a list of simple driving tasks, or “missions,” in an urban environment. Missions included turning left at an intersection, going through a traffic circle, parking, and maintaining proper position on a two-lane road without colliding with oncoming traffic. To ensure that the vehicles were truly autonomous (rather than preprogrammed for this particular environment), each team was given a crude digital map of the local geography just an hour before the race began.

As race day for the 2007 challenge dawned, hundreds of egos were on the line. Eleven robotic cars from elite universities and companies lined up at the starting gates. The starter, a man in a baseball cap, dropped a green flag and the race began. One by one, the driverless cars cautiously rolled out of the gates, their trunks and back seats stuffed with computers and their steering wheels spinning back and forth as if guided by an invisible set of hands. Inside a nearby tent large enough to stage a circus, thousands of fans and spectators watched the action unfold on gigantic movie screens.

The race proceeded with mixed results. Judges in reflective orange safety vests scurried around holding stopwatches as competing teams tackled their assigned missions under the relentless sun of the Southern California desert. Like a squadron of visually impaired octogenarians, vehicle after vehicle chugged cautiously along while DARPA program managers monitored the race from behind giant concrete barricades.

Even in this face-off of the world’s elite roboticists, the best driverless-car technology of the day was still unpredictable. To ensure the safety of traffic judges and competing teams, each autonomous vehicle was trailed by a human-driven “babysitter” vehicle, a specially reinforced Ford Taurus with a professional driver at the wheel. Each robotic vehicle was also equipped with a mandatory remote emergency E-stop, to be used if its on-board artificial intelligence failed and the vehicle posed a danger to nearby humans or other vehicles.

The sandy desert terrain, once the training ground for fighter pilots, began to resemble a movie set for a comedy about slow-motion fender benders. Talos, the entry from MIT, drove slowly into the side of Skynet, Cornell’s autonomous vehicle (an accident we described in chapter 5).3 Another competing vehicle quickly got itself eliminated from the race after it rebelliously steered itself off course and dove headfirst into the wall of a nearby building; the collision was noted in race logs as a “vehicle vs. building incident,” which is exactly what it sounds like. Two other vehicles, like adolescents paralyzed by stage fright at their first high-school dance, froze in place while pondering which way to turn, one at an intersection and the other at a traffic circle.4

Despite the vehicular high jinks, the race ended well, as six of the eleven competing teams completed their missions and finished the course. The winner was a vehicle named Boss, built by Carnegie Mellon University and its industry partner for the race, automotive giant GM. Boss completed the course in four hours and ten minutes, maintaining an average speed of fourteen miles per hour. Stanford University’s entry, a robot named Junior, came in a close second, while Odin, the vehicle from Virginia Tech, was third.

The real winner of the day, however, was the robotics community. The results of the 2007 challenge proved that autonomous vehicles could someday be a viable technology, capable of successfully navigating bustling urban environments, negotiating four-way intersections, and detecting the presence of other cars on the road. Finally, perhaps the Da Vinci problem that had plagued the development of driverless cars for decades was coming to an end.

Playing checkers

There is no exact birth date for the modern autonomous vehicle. In reality, the modern driverless car emerged in stages. In DARPA’s earlier sponsored races of 2004 and 2005, the cars’ performance improved from one competition to the next. By the time the third challenge rolled around in 2007, competing teams benefited not only from hard-won experience, but also from rapid advances in hardware technologies and breakthroughs in artificial-intelligence software, in particular in machine learning.

A useful explanation of machine learning comes from a lively website called Stack Overflow, where a global community of several million programmers answer one another’s technical questions and then vote on the quality of the responses. On the site, the top-voted response to the question “What is machine learning?” states:

Essentially, it is a method of teaching computers to make and improve predictions or behaviors based on some data. What is this “data”? Well, that depends entirely on the problem. It could be readings from a robot’s sensors as it learns to walk, or the correct output of a program for certain input. The ability to react to inputs that have never been seen before is one of the core tenets of many machine learning algorithms. Imagine trying to teach a computer driver to navigate highways in traffic. Using your “database” metaphor, you would have to teach the computer exactly what to do in millions of possible situations. An effective machine learning algorithm would (we hope!) be able to learn similarities between different states and react to them similarly.

The similarities between states can be anything—even things we might think of as “mundane” can really trip up a computer! For example, let’s say that the computer driver learned that when a car in front of it slowed down, it had to slow down too. For a human, replacing the car with a motorcycle doesn’t change anything—we recognize that the motorcycle is also a vehicle. For a machine learning algorithm, this can actually be surprisingly difficult! A database would have to store information separately about the case where a car is in front and where a motorcycle is in front. A machine learning algorithm, on the other hand, would “learn” from the car example and be able to generalize to the motorcycle example automatically. Another way to think about machine learning is that it is “pattern recognition”—the act of teaching a program to react to or recognize patterns.5

While machine-learning techniques sound organic—the software learns to recognize patterns or to solve certain problems—what’s actually happening is that an algorithm parses vast amounts of data to look for statistical patterns. Using the statistical patterns found, the algorithm then builds a mathematical model that ranks the probability of various possible outcomes to make predictions or reach a decision. The algorithm then validates whether its predictions are accurate (or its decisions appropriate ones) by testing them on new, unseen data. If they’re wrong, it goes back to update the model. In this way, a machine-learning program is fed data to “learn” from “experience” under the supervision of its human programmer, whose job consists of selecting the algorithm and providing the data and the initial right or wrong feedback.
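As a rough illustration of that train-validate-update loop, here is a toy sketch in Python using NumPy. The data is synthetic and the “model” is a single linear unit, so it stands in for the general idea rather than any particular production system:

```python
# A toy illustration of the train / validate / update loop described above.
# Pure NumPy; the data is synthetic and the model is a single linear unit.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                    # two made-up sensor features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)  # a "true" pattern hidden in the data

X_train, y_train = X[:150], y[:150]              # data to learn from
X_test,  y_test  = X[150:], y[150:]              # unseen data for validation

w = np.zeros(2)                                  # the model: two weights
for _ in range(500):                             # repeatedly adjust the model...
    pred = 1 / (1 + np.exp(-X_train @ w))        # ...by comparing its predictions
    w -= 0.1 * X_train.T @ (pred - y_train) / len(y_train)  # ...to the known answers

test_pred = 1 / (1 + np.exp(-X_test @ w)) > 0.5  # validate on data it has never seen
print("accuracy on unseen data:", (test_pred == y_test).mean())
```

The final line is the validation step described above: the only way to judge the model is to see how well its predictions hold up on data it was not trained on.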

Board games have long been a favorite of AI researchers to demonstrate new paradigms, and machine learning was no exception. When machine-learning techniques were first developed during the 1950s, limited computing power greatly restricted what board games they could be applied to. Since the computers of that era could not handle the number of calculations that a game of chess would require, researchers used the game of checkers instead.

In 1949, Arthur Samuel, an early artificial intelligence researcher and at that time a brand-new IBM employee, wanted to prove that a computer could perform com­plicated intellectual tasks. The year Samuel joined IBM, the company was still primarily known for making calculating machines. Samuel had an idea for how to increase the company’s visibility. He figured that if he could come up with some sort of application that a computer could do but an adding machine couldn’t, he could showcase the analytical power of IBM’s first commercial computer, the IBM 701.

Samuel decided that the best application for demonstrating the cognitive ability of a computer was a good game of checkers. Even better, Samuel’s goal was to teach a computer how to play checkers at a world-class level. If he had opted to solve the problem using the prevailing AI paradigm of his day, Samuel would have written copious amounts of if-then statements in an attempt to anticipate and guide the computer through any possible board configuration it might encounter.

Such a rule-based approach would have been a laborious undertaking. Every possible board configuration would need to be addressed in advance by a rule that dealt with that particular situation. Using a rule-based approach would go something like this: one rule might say “Give priority to moves that eliminate an opponent’s piece”; another rule might say “Give priority to moving pieces that are closer to reaching the opponent’s back line.”
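A hypothetical sketch of what such hand-written rules might look like in Python follows; the move fields and point values are invented for illustration and are not Samuel’s actual rules:

```python
# A hypothetical sketch of the rule-based approach described above:
# each hand-written rule assigns priority points to a candidate move.
def score_move_rule_based(move):
    score = 0
    if move["captures_piece"]:        # "eliminate an opponent's piece"
        score += 10
    score += move["rows_advanced"]    # "closer to the opponent's back line"
    if move["exposes_own_piece"]:     # an exception someone remembered to add
        score -= 5
    return score

candidate_moves = [
    {"captures_piece": True,  "rows_advanced": 1, "exposes_own_piece": True},
    {"captures_piece": False, "rows_advanced": 2, "exposes_own_piece": False},
]
best = max(candidate_moves, key=score_move_rule_based)
print(best)
```

Every situation the programmer fails to anticipate needs yet another rule or exception, which is exactly how the list balloons.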

The problem with this sort of AI, as Samuel quickly discovered, is that the number of rules needed to play a single game of checkers ballooned into an unmanageably long list of instructions. A more troubling problem, however, was that even if somebody created an exhaustive list of rules to address every possible board position, the computer following those rules would still be a mediocre checkers player. Similar to a novice human player who is guided by rigid protocol rather than an intuitive sense of strategy, the computer would lack an appreciation for nuances and seemingly illogical moves, qualities that define a great checkers player.

Many AI experts would have gotten pretty good results by writing more elaborate sets of rules, perhaps tossing in some exceptions and pseudorandom moves to create the illusion that a strategy was being employed. Samuel, however, chose a different path. He decided to use machine learning, so the computer could learn to play checkers, not from a set of formal rules, but from its own experience.

A human player becomes an expert not by calculating all possible outcomes, but by observing typical board situations, memorizing them, and remembering their outcomes. Human players remember the moves that led to defeat, and they draw an equally valuable lesson from the moves that led to victory. Samuel decided to program his computer to emulate a skilled human, to play by learning to recognize the patterns of particular board positions, particularly those that led to a winning game.

Samuel programmed the computer to begin learning by playing against a software copy of itself, making random legal moves. Sometimes the original copy of the checkers program would win, sometimes it would lose. After every game the computer would take a moment to record, in a large database, all the moves that led to a win and all the moves that led to a loss. This was how the computer gained experience.

The next time it played, armed with its ever-growing database of experiences, the software would look up the board configuration in its database before making a move. This way, it could see whether it had already encountered that particular configuration and, if so, which moves had led to a win. If it happened to be a configuration of pieces that the software had never before encountered, the software would make another random but legal move, and then store the result.

Initially, Samuel’s machine-learning software played randomly, like a child stumbling through his first game. But after a few thousand games, its database of good and bad moves grew. After a few thousand games more, the software began to play with what some observers might call “strategy.” Since most moves can lead to both a loss and a win, depending on subsequent moves, the database didn’t just record a win/lose outcome. Instead, it recorded the probability that each move would eventually lead to a win. In other words, the database was essentially a big statistical model.
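The scheme can be caricatured in a few lines of Python. This is a toy sketch of the idea described above, not Samuel’s actual program: a table maps each (board, move) pair seen so far to its win statistics, unseen positions fall back to a random choice, and every finished game updates the table.

```python
# A toy sketch of learning checkers from self-play experience (illustration only).
import random
from collections import defaultdict

experience = defaultdict(lambda: {"wins": 0, "games": 0})   # (board, move) -> outcomes

def choose_move(board, legal_moves):
    def win_rate(move):
        stats = experience[(board, move)]
        if stats["games"] == 0:              # never seen before: effectively random
            return random.random()
        return stats["wins"] / stats["games"]
    return max(legal_moves, key=win_rate)    # prefer moves that have led to wins

def record_game(history, won):
    # After each game, store every (board, move) pair that occurred in it.
    for board, move in history:
        experience[(board, move)]["games"] += 1
        if won:
            experience[(board, move)]["wins"] += 1

# Usage: after a self-play game the program records what happened...
record_game([("start", "11-15"), ("mid", "22-18")], won=True)
# ...and next time it prefers the moves with the best recorded win rate.
print(choose_move("start", ["11-15", "9-13"]))
```

The win rate stored for each move is exactly the kind of probability the paragraph above describes: the table is, in effect, a big statistical model.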


Figure 8.2 AI techniques used in driverless cars. Most robotic systems use a combination of techniques. Object recognition for real-time obstacle detection and traffic negotiation is the most challenging for AI (far left).

As the software learned, it spent countless hours in “self-play,” amassing more gaming experience than any human could in a lifetime. As the database grew, Samuel had to develop more efficient data-lookup techniques, leading to the hash tables that are still used in large databases today. Another of Samuel’s innovations was to use the database to factor in how the opponent would most likely respond to each move, an approach known today as minimax.
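Minimax itself is compact enough to sketch generically. The version below is a minimal illustration, not Samuel’s implementation; the game-specific pieces (legal_moves, apply_move, and evaluate, which might return a learned win probability) are assumed to be supplied by the caller:

```python
# A minimal, generic minimax sketch: score a position assuming the opponent
# always picks the reply that is worst for us. Game-specific details
# (legal_moves, apply_move, evaluate) are assumed to be supplied elsewhere.
def minimax(position, depth, maximizing, legal_moves, apply_move, evaluate):
    moves = legal_moves(position)
    if depth == 0 or not moves:
        return evaluate(position)            # e.g., an estimated win probability
    if maximizing:                           # our turn: take the best outcome
        return max(minimax(apply_move(position, m), depth - 1, False,
                           legal_moves, apply_move, evaluate) for m in moves)
    else:                                    # opponent's turn: assume the worst for us
        return min(minimax(apply_move(position, m), depth - 1, True,
                           legal_moves, apply_move, evaluate) for m in moves)
```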

Ultimately, Samuel succeeded. His checkers-playing program would later have an impressive impact on the world outside his lab. A few years later, on February 24, 1956, the program was demonstrated to the public on live television. In 1962, the computer beat checkers master Robert Nealey, and IBM’s stock rose 15 percent overnight.

It was impressive that the software could win against a master checkers player. But even more impressive was that the program became skilled enough to play checkers better than its creator, Samuel himself. A rule-based artificial-intelligence program’s expertise is bounded by that of its human creators. Samuel taught his machine-learning program something limitless: how to learn.

Many people reject the idea that a machine, a computer, can learn. We frequently hear misguided statements such as “a computer can’t be more intelligent than the human who programmed it.” That idea is rooted in the old way of thinking of a computer as an automated machine that simply carries out a prescribed set of instructions. The power of machine learning enabled Samuel’s computer to acquire skill the same way a human does, by learning from its own successes and failures. Just as a child can eventually know more than its parents, a student can surpass her teacher, and an athlete can beat his coach, a computer can ultimately outperform its programmer.

Some expert chess players who have played against a software opponent have reported that they feel as if the computer program plays with intention, strategy, even passion. In 1996, Garry Kasparov lost his first game against IBM’s Deep Blue, the great-grandchild of Samuel’s checkers-playing algorithm. In an interview with TIME magazine, Kasparov said, “I could feel—I could smell—a new kind of intelligence across the table.” He later concluded, “Although I think I did see some signs of intelligence, it’s a weird kind, an inefficient, inflexible kind that makes me think I have a few years left.”6 Kasparov was wrong. The next year, Deep Blue won the rematch.

Deep Blue’s “weird, inefficient new kind of intelligence” was at its heart just an application of statistics. One of the fascinating and frustrating characteristics of machine learning is its opacity. Some engineers feel uncomfortable with machine learning because they never completely understand exactly how the AI reaches its conclusions.

One of the defining characteristics of machine learning, for better or for worse, is that the internal mathematical model that the machine-learning algorithm develops to make its predictions is usually incomprehensible to a human. As a result, a human supervisor can’t take a look at the software code to see whether it’s sound. The only way a human supervisor can validate a machine-learning model’s prediction is by feeding it new test cases.

Infinite state space

Although robotics researchers have used machine-learning techniques for decades, their robots worked in highly structured environments. Machine learning worked well for a checkers game, since a checkerboard is a simple state space that offers its players a finite number of possible moves. Samuel’s checkers program recognized board positions by looking them up in its database. Each board configuration was distinct and well defined, and therefore easy to store.

Playing chess is more difficult than checkers because each board configuration has many more possible future moves, or what AI scientists call a higher branching factor. More complex still is a city street or a busy freeway that presents a state space that contains an infinite number of possible “moves” and “board positions.” In artificial-intelligence research, an environment that serves its robot an endless supply of novel situations is called an infinite state space.

Driverless cars must deal with an infinite state space. A robotic vehicle constantly encounters new situations, which makes it impossible to create a lookup table to store these experiences. Not only is each new experience difficult to boil down into a finite, storable unit (such as a board position), but storing an infinite number of experiences would result in a lookup database so large it would quickly outgrow even a powerful modern computer.

For years, the barrier of infinite state space prevented roboticists from using machine learning for robots operating in unstructured and dynamic environments. It became possible to apply machine learning to infinite state spaces only recently, when new algorithms were developed, computing power improved, and sufficient amounts of training data became available. One of the reasons Stanford’s victory in the 2005 DARPA Challenge was significant was that its autonomous vehicle, Stanley, was the first successful application of machine learning to driving.

Stanley’s team cracked the problem of infinite state space by simplifying the real world outside the car into just two categories: drivable and not drivable. They trained their machine-learning software to sort the raw, real-time visual data from the lidar and cameras mounted on their car into one of these two finite categories. To teach the machine-learning software how to recognize drivable ground, every weekend the Stanford team returned to the desert to collect more visual data of the desert landscape. They adjusted the program when it made a mistake and continued the training.

To build a visual model of the processed data for their mid-level controller’s occupancy grid, the team color-coded the data streams. The portions of the ground in front of the car’s front bumper that the machine-learning software deemed drivable were assigned one color, and the portions deemed not drivable were assigned another color. Videos of Stanley’s dynamic occupancy grid show hypnotic brightly colored swirls drifting around the screen as the car moves forward, the machine-learning software simplifying the infinite state space of the desert into just these two categories.
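A heavily simplified sketch of that two-category idea, written in Python with scikit-learn, might look like the following. It is not Stanford’s code: the lidar-derived features, labels, and grid are all invented for illustration, and the printed characters stand in for the color-coded occupancy grid.

```python
# A highly simplified, hypothetical sketch of the two-category idea described above.
# A tiny classifier labels each occupancy-grid cell in front of the car as
# drivable (0) or not drivable (1) from invented lidar-derived features.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Training data gathered offline: [surface roughness, height step] per cell,
# labeled during human-driven runs (all values are made up for illustration).
features = np.array([[0.01, 0.02], [0.02, 0.05], [0.30, 0.40], [0.50, 0.25]])
labels   = np.array([0, 0, 1, 1])          # 0 = drivable, 1 = not drivable
classifier = DecisionTreeClassifier().fit(features, labels)

# At run time, classify every cell of a small grid ahead of the bumper and
# "color-code" it: '.' for drivable ground, '#' for obstacles.
grid_features = np.random.default_rng(1).random((5, 8, 2)) * 0.5
grid_labels = classifier.predict(grid_features.reshape(-1, 2)).reshape(5, 8)
for row in grid_labels:
    print("".join("#" if cell else "." for cell in row))
```

However the real system represented it, the essential move is the same: an infinitely varied desert is collapsed into a grid of cells, each carrying one of only two labels.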

Stanley’s victory in the 2005 DARPA Challenge proved that computer-vision applications can use machine learning to navigate complex and unpredictable real-world environments. One of the key enabling factors in the development of machine-learning software has been a new abundance of training data, previously a scarce resource. In the case of driverless cars, training data originates from several on-board hardware devices whose performance has improved dramatically over the past several years.

The modern toolbox

Driverless cars are a prime example of a force called recombinant innovation, the process of combining several existing technologies in new ways. Despite the popular stereotype of the lone, genius inventor, in reality many emerging technologies—particularly complex ones—are actually fresh combinations of old technologies put together again in a novel way.7 Recombinant innovation is an indirect by-product of Moore’s Law, the now-famous principle that over time, the performance of semiconductors improves at an exponential rate while their cost shrinks at a corresponding rate.

Moore’s Law has held true for semiconductor technologies for a few decades now. Its effect has carried over to other types of hardware that use computer chips such as digital cameras, televisions, and electronic toys. The effect of Moore’s Law has led some experts to describe any technology whose performance improves at an exponential rate as an exponential technology.

The gradual improvement of autonomous vehicle prototypes since the 1970s demonstrates the power of recombinant innovation, and also the beneficial effects of Moore’s Law.8 In the 1980s, Carnegie Mellon’s Navlab built an autonomous-vehicle prototype named Codger that was the size of a UPS delivery truck. Codger had to be hefty since it carried an expensive assortment of high-end technology, including a bulky color TV camera, a GPS receiver, a laser range finder, and several general-purpose Sun-3 computers. Codger’s top speed was about 20 mph on empty roads. The vehicle was unsafe for city streets since its software took about ten seconds to work through each navigation “task” on a clear stretch of road, and twenty seconds or more in “cluttered” environments.9

Fast forward to another state-of-the-art autonomous vehicle in the year 2007 and the situation had improved somewhat. To equip their SUV for the 2007 DARPA Challenge, the Cornell team spent $195,850 to buy lidar and radar sensors, a GPS, and a camera, and $46,550 to purchase several desktop computers, laptops, and peripherals.10 Although it cost less to outfit an autonomous vehicle in 2007 than it did in 1980, computers and sensors were still too slow to support autonomous driving. In a postmortem analysis of the 2007 DARPA Challenge, the leader of the CMU team, Chris Urmson (who later helped lead Google’s self-driving car initiative), ruefully noted that “available off-the-shelf sensors are insufficient for urban driving.”11

Fast forward again to the present day and the situation looks much more promising. Today, the cost of providing the data needed to feed a car’s mid-level control software is significantly less than in 2007. At the time this book was written, the cost of rigging up the hardware needed for an autonomous vehicle was roughly $5,000 per car, and in five to seven years will be even less.12

Modern hardware devices are not only cheaper but also smaller, so they can be discreetly tucked inside the car’s body and interior. A radar sensor is the size of a hockey puck. A GPS receiver fits easily into a car’s dashboard. One slim laptop computer has more processing power than a 1960s mainframe the size of a minivan. A few lidar devices can be inserted next to a car’s headlights.

Today’s enabling hardware technologies also work better. In 2007, driverless cars were an intriguing “someday soon” technology. Just seven years after describing the inadequacy of off-the-shelf sensors, in 2014, Urmson noted that as Google’s cars reached their 700,000-mile mark, “thousands of situations on city streets that would have stumped us two years ago can now be navigated autonomously.”13 As the momentum has continued, in the few short years since, Google’s advanced prototypes have successfully driven more than 1,500,000 combined miles on city streets.

When Google’s driverless Prius captured the public’s attention in 2011, it appeared that the team of people on Google’s Chauffeur Project designed and built a functioning driverless car in just a few short years. While it’s tempting to see Google’s success as just another example of the company’s seemingly magical ability to stay a few steps ahead of the rest of the industry, Google enjoyed several other advantages. One obvious advantage was funding. Google has a generous, multiyear R&D budget to throw at thorny engineering problems.

As a point of comparison, in 2007, Google’s annual expenditures for research and development were a hefty $2.1 billion, or an estimated 12 percent of the company’s annual revenue.14 While it’s not clear how much of that was allotted to driverless-car research, in contrast, that same year, DARPA gave each competing team roughly $1 million apiece to outfit their vehicle with the pricey technological gear it required (and to buy free pizza for the students contributing their time to the team).

Another advantage Google enjoyed was lots of top-notch personnel. DARPA’s investment in the series of challenges created a pool of talent and brainpower that was mined by recruiters for Google’s Chauffeur Project, which was launched in 2007. In fact, Sebastian Thrun was hired by Google shortly after the final DARPA Challenge.

Later in an interview, Thrun described how Google built its team of experts by “cherry-pick[ing] the top talent from the Grand Challenges. … Then they branched out to prodigies of other sorts.”15 Once the rich vein of talent drawn to the DARPA Challenge was mined out, Google lured away more of the world’s best and brightest (and highly paid) experts from a number of different fields, such as machine learning, robotics, interface design, and laser technology.

Some of the Grand Challenge alumni that Google hired designed the company’s first-generation autonomous Prius. DARPA Challenge veteran Anthony Levandowski, famous for creating the world’s first “driverless motorcycle” while a student at Berkeley, cofounded a company called 510 Systems after graduation. Shortly afterward, in 2008, 510 Systems was hired by the Discovery Channel to transform a Prius into an autonomous pizza-delivery robot.

The “Pribot” delivery succeeded and Google took notice. Bryon Majusiak, an employee of 510 recalls: “From then on, we started doing a lot of work with Google. … We did almost all of their hardware integration. They were just doing software. We’d get the cars and develop the [low-level] controllers, and they’d take it from there.” Several autonomous Priuses later, in 2011, Google bought 510 Systems, lock, stock, and barrel.16

Money and the right staff are certainly critical success factors in large and ambitious engineering projects. But there’s a third reason that Google’s cars were able to outperform earlier attempts: time to prepare. Machine-learning software, like a teenager, needs time to learn to drive. The cars that competed in the DARPA Challenges were the fruit of a mere 12–18 months of hard intellectual labor by students, professors, and professional engineering teams.

Because of the way the rules of the DARPA Challenges were structured, none of the participating teams had the luxury of years of development time, nor the opportunity to privately test their software on the actual race course. To ensure that no participant gained an unfair advantage, DARPA intentionally prevented competing teams from rehearsing their robotic cars on the streets and roads of the shuttered air force bases (or deserts) where the competitions took place. Instead, teams refined their machine vision software without knowing the exact details of the obstacles and situations their car would encounter in the actual race.

On race day, participants put their reputations on the line and publicly demonstrated their cars. In contrast, determined to uphold its public image as a software company known for providing rapid and eerily accurate insights into data, Google conducted its initial driverless engineering experiments in private. Its early technological failures—whatever they might have been—will never see the light of day or be painfully documented in the media. By the time Google’s fleet of Priuses was finally unveiled publicly in 2011, the cars had been honed to the point that they appeared to perform flawlessly.

Several advantages, including a big research budget, talented developers, and the time to prepare in private, enabled Google to create a fleet of driverless cars that seemed to work on the first try. Important as these factors have been, we wonder whether there’s another, more mundane reason for Google’s success: timing, the fact that between 2007 and 2011, the invisible but powerful forces of Moore’s Law and recombinant innovation were in full swing. Today, driverless cars are finally hitting their stride, guided by intelligent software that’s fed by data from high-speed digital cameras, high-definition digital maps, radar, lidar, and GPS devices.


Figure 8.3 Arthur Samuel playing checkers on the IBM 7090 (February 24, 1956).

Source: Courtesy of IBM Archives

Notes