IT WAS THE first thing people saw as they drew close: a shining, stainless steel globe called the Unisphere, rising a full twelve stories into the air. Around it stood dozens of fountains, jetting streams of crystal-clear water into the skies of Flushing Meadows Corona Park, in New York’s Queens borough. At various times during the day, a performer wearing a rocket outfit developed by the US military jetted past the giant globe—showing off man’s ability to rise above any and all challenges.
The year was 1964 and the site, the New York World’s Fair. During the course of the World’s Fair, an estimated 52 million people descended upon Flushing Meadows’ 650 acres of pavilions and public spaces. It was a celebration of a bright present for the United States and a tantalizing glimpse of an even brighter future: one covered with multilane motorways, glittering skyscrapers, moving pavements and underwater communities. Even the possibility of holiday resorts in space didn’t seem out of reach for a country like the United States, which just five years later would successfully send man to the Moon. New York City’s “Master Builder” Robert Moses referred to the 1964 World’s Fair as “the Olympics of Progress.”
Wherever you looked there was some reminder of America’s post-war global dominance. The Ford Motor Company chose the World’s Fair to unveil its latest automobile, the Ford Mustang, which rapidly became one of history’s best-selling cars. New York’s Sinclair Oil Corporation exhibited “Dinoland,” an animatronic recreation of the Mesozoic age, in which Sinclair Oil’s brontosaurus corporate mascot towered over every other prehistoric beast. At the NASA pavilion, fairgoers had the chance to glimpse a fifty-one-foot replica of the Saturn V rocket ship boat-tail, soon to help the Apollo space missions reach the stars. At the Port Authority Building, people lined up to see architects’ models of the spectacular “Twin Towers” of the World Trade Center, which was set to break ground two years later in 1966.
Today, many of these advances evoke a nostalgic sense of technological progress. In all their “bigger, taller, heavier” grandeur, they speak to the final days of an age that was, unbeknownst to attendees of the fair, coming to a close. The Age of Industry was on its way out, to be superseded by the personal computer–driven Age of Information. For those children born in 1964 and after, digits would replace rivets in their engineering dreams. Apple’s Steve Jobs was only nine years old at the time of the New York World’s Fair. Google’s cofounders, Larry Page and Sergey Brin, would not be born for close to another decade; Facebook’s Mark Zuckerberg for another ten years after that.
As it turned out, the most forward-looking section of Flushing Meadows Corona Park was the exhibit belonging to International Business Machines Corporation, better known as IBM. IBM’s mission for the 1964 World’s Fair was to cement computers (and more specifically Artificial Intelligence) in the public consciousness, alongside better-known wonders like space rockets and nuclear reactors. To this end, the company selected the fair as the venue to introduce its new System/360 series of computer mainframes: machines supposedly powerful enough to build the first prototype for a sentient computer.
IBM’s centerpiece at the World’s Fair was a giant, egg-shaped pavilion, designed by the celebrated husband and wife team of Charles and Ray Eames. The size of a blimp, the egg was erected on a forest of forty-five stylized, thirty-two-foot-tall sheet metal trees; a total of 14,000 gray and green Plexiglas leaves fanning out to create a sizable, one-acre canopy. Reachable only via a specially installed hydraulic lift, the egg welcomed in excited fair attendees so that they could sit in a high-tech screening room and watch a video on the future of Artificial Intelligence. “See it, THINK, and marvel at the mind of man and his machine,” wrote one giddy reviewer, borrowing the “Think” tagline that had been IBM’s since the 1920s.
IBM showed off several impressive technologies at the event. One was a groundbreaking handwriting recognition computer, which the official fair brochure referred to as an “Optical Scanning and Information Retrieval” system. This demo allowed visitors to write an historical date of their choosing (post-1851) in their own handwriting on a small card. That card was then fed into an “optical character reader,” where it was converted into digital form, and then relayed once more to a state-of-the-art IBM 1460 computer system. Major news events were stored on disk in a vast database and the results were then printed onto a commemorative punch-card for the amazement of the user. A surviving punch-card reads as follows:
THE FOLLOWING NEWS EVENT WAS REPORTED IN THE NEW YORK TIMES ON THE DATE THAT YOU REQUESTED:
APRIL 14, 1963: 30,000 PILGRIMS VISIT JERUSALEM FOR EASTER; POPE JOHN XXIII PRAYS FOR TRUTH & LOVE IN MAN.
Should a person try to predict the future (as, of course, some wag did on the very first day), the punch-card noted: “Since this date is still in the future, we will not have access to the events of this day for [insert number] days.”
Another demo featured a mechanized puppet show, apparently “fashioned after eighteenth-century prototypes,” depicting Sherlock Holmes solving a case using computer logic.
Perhaps most impressive of all, however, was a computer that bridged the seemingly unassailable gap between the United States and Soviet Union by translating effortlessly (or what appeared to be effortlessly) between English and Russian. This miraculous technology was achieved thanks to a dedicated data connection between the World’s Fair’s IBM exhibit and a powerful IBM mainframe computer 114 miles away in Kingston, New York, carrying out the heavy lifting.
Machine translation was a simple, but brilliant, summation of how computers’ clear-thinking vision would usher us toward utopia. The politicians may not have been able to end the Cold War, but they were only human, and with that came all the failings one might expect. Senators, generals and even presidents were severely lacking in what academics were just starting to call “machine intelligence.” Couldn’t smart machines do better? At the 1964 World’s Fair, an excitable public was being brought up to date on the most optimistic vision of researchers. Artificial Intelligence brought with it the suggestion that, if only the innermost mysteries of the human brain could be teased out and replicated inside a machine, global harmony was somehow assured.
Nothing summed this up better than the official strapline of the fair: “Peace Through Understanding.”
Two things stand out about the vision of Artificial Intelligence as expressed at the 1964 New York World’s Fair. The first is how bullish everyone was about the future that awaited them. Despite the looming threat of the Cold War, the 1960s was an astonishingly optimistic decade in many regards. This was, after all, the ten-year stretch that began with President John F. Kennedy announcing that, within a decade, man would land on the moon—and ended with exactly that happening. If that was possible, there seemed no reason why unraveling and re-creating the mind should be any tougher to achieve. “Duplicating the problem-solving and information-handling capabilities of the [human] brain is not far off,” claimed political scientist and one of AI’s founding fathers, Herbert Simon, in 1960. Perhaps borrowing a bit of Kennedy-style gauntlet-throwing, he casually added his own timeline: “It would be surprising if it were not accomplished within the next decade.”
Simon’s prediction was hopelessly off, but as it turns out, the second thing that registers about the World’s Fair is that IBM wasn’t wrong. All three of the technologies that dropped jaws in 1964 are commonplace today, despite our continued insistence that AI is not yet here. The Optical Scanning and Information Retrieval system has become the Internet: granting us access to more information at a moment’s notice than we could possibly hope to absorb in a lifetime. While we still cannot see the future, we are making enormous advances in this capacity, thanks to the huge data sets generated by users that offer constant forecasts about the news stories, books or songs that are likely to be of interest to us. This predictive connectivity isn’t limited to what would traditionally be thought of as a computer, either, but is embedded in the devices, vehicles and buildings around us thanks to a plethora of smart sensors and devices.
The Sherlock Holmes puppet show was intended to demonstrate how a variety of tasks could be achieved through computer logic. Our approach to computer logic has changed in some ways, but Holmes may well have been impressed by the modern facial recognition algorithms that are more accurate than humans when it comes to looking at two photos and saying whether they depict the same person. Holmes’s creator, Arthur Conan Doyle, a trained doctor who graduated from Edinburgh (today the location of one of the UK’s top AI schools), would likely have been just as dazzled by Modernizing Medicine, an AI designed to diagnose diseases more effectively than many human physicians.
Finally, the miraculous World’s Fair Machine Translator is most familiar to us today as Google Translate: a free service that offers impressively accurate probabilistic machine translation between some fifty-eight different languages—or 3,306 separate translation services in total. If the World’s Fair imagined instantaneous translation between Russian and English, Google Translate goes further still by also allowing translation between languages like Icelandic and Vietnamese, or Farsi and Yiddish, which have had historically limited previous translations. Thanks to cloud computing, we don’t even require stationary mainframes to carry it out, but rather portable computers, called smartphones, no bigger than a deck of cards.
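The jump from fifty-eight languages to 3,306 translation services follows from counting every ordered pair of distinct languages, since English-to-Russian and Russian-to-English are separate services. A minimal sketch of that arithmetic, in which the function name is purely illustrative:

```python
def translation_directions(n_languages: int) -> int:
    """Count ordered (source, target) pairs of distinct languages."""
    # Each language can be translated into every language except itself.
    return n_languages * (n_languages - 1)


print(translation_directions(58))  # 3306
```

Because the count grows roughly with the square of the number of languages, adding a fifty-ninth language would add 116 new directions rather than one.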
In some ways, the fact that all these technologies now exist—not just in research labs, but readily available to virtually anyone who wants to use them—makes it hard to argue that we do not yet live in a world with Artificial Intelligence. Like many of the shifting goalposts we set for ourselves in life, it underlines the way that AI represents computer science’s Neverland: the fantastical “what if” that is always lurking around the next corner.
With that said, anyone who thinks that AI has developed in a straight line from its birth sixty years ago to where it is today is very much mistaken. Before we get to the rise of the massive “deep learning neural networks” that are driving many of our most notable advances in the present, it’s important to understand a bit more about the history of Artificial Intelligence.
And how, for a long time, it all seemed to go so right before going wrong.
The dream of bringing life to inanimate objects has been with us for thousands of years. However, when it comes to the popularization of Artificial Intelligence for regular people, it makes sense to begin with the world’s first programmable computer: a thirty-ton colossus named ENIAC. Powered on at the University of Pennsylvania just six months after the Second World War ended in 1945, ENIAC stood for Electronic Numerical Integrator and Computer. It had cost $500,000 of US military funding to create and was around 1,000 times faster than the electro-mechanical machines it may have competed against. The machine, and the idea that it represented, fascinated the press. They took to calling it “the giant brain.”
The notion of building such a “giant brain” captured the popular imagination. Until the end of the Second World War, a “computer” was the term used for a person who carried out calculations in a field such as bookkeeping. All of a sudden, computers were no longer people, but machines equipped with vacuum tubes and transistors—yet capable of performing calculations at a speed even greater than the most gifted of people. The Second World War and its immediate aftermath triggered a surge of interest in the field of cognitive psychology. During wartime alone, membership of the American Psychological Association expanded from 2,600 to 4,000. By 1960—fifteen years later—it would hit 12,000 members. Researchers in cognitive psychology imagined the human brain itself as a machine, from which complex behavior arose as the aggregate result of multiple simple responses. Instead of wasting their time on unprovable “mental entities,” cognitive psychologists focused their attention only on what was strictly observable about human behavior. This was the birth of fields like “behaviorism,” which the influential psychologist B. F. Skinner (known for his experiments with rats) described as the “technology of behavior.”
Engineers may previously have balked at the more metaphysical aspects of psychology, but they were intrigued at the concept that the brain might be a computer. They were equally fascinated by the new focus on understanding memory, learning and reasoning, which many psychologists felt were the basis for human intelligence. Excitingly, they also saw the potential advantages machines had over people. ENIAC, for instance, could perform an astonishing 20,000 multiplications per minute. Compared with the unreliable memory of humans, a machine capable of accessing thousands of items in the span of microseconds had a clear advantage.
There are entire books written about the birth of modern computing, but three men stand out as laying the philosophical and technical groundwork for the field that became known as Artificial Intelligence: John von Neumann, Alan Turing and Claude Shannon.
A native of Hungary, von Neumann was born in 1903 into a Jewish banking family in Budapest. In 1930, he arrived at Princeton University as a math teacher and, by 1933, had established himself as one of six professors in the new Institute for Advanced Study in Princeton: a position he stayed in until the day he died. By any measure, von Neumann was an astonishing intellect. According to legend, he was able to divide eight-digit numbers in his head at the age of six. During the Second World War, von Neumann worked on the Manhattan Project at Los Alamos, where one of his jobs was the terrible task of working out the precise height at which the atomic bomb must explode to cause maximum devastation. Von Neumann’s major contribution to computing was helping to establish the idea of storing a computer’s program in its memory. Von Neumann was, in fact, the first person to use the human terminology “memory” when referring to a computer. Unlike some of his contemporaries, he did not believe a computer would be able to think in the way that a human can, but he did help establish the parallels that exist with human physiology. The parts of a computer, he wrote in one paper, “correspond to the associative neurons in the human nervous system. It remains to discuss the equivalents of the sensory or afferent and the motor or efferent neurons.” Others would happily take up the challenge.
Alan Turing, meanwhile, was a British mathematician and cryptanalyst. During the Second World War, he led a team for the Government Code and Cypher School at Britain’s secret code-breaking center, Bletchley Park. There he came up with various techniques for cracking German codes, most famously an electromechanical device capable of working out the settings for the Enigma machine. In doing so, he played a key role in decoding intercepted messages, which helped the Allies defeat the Nazis. Turing was fascinated by the idea of thinking machines and went on to devise the important Turing Test, which we will discuss in detail in a later chapter. As a child, he read and loved a book called Natural Wonders Every Child Should Know, by Edwin Tenney Brewster, which the author described as “an attempt to lead children of eight or ten, first to ask and then to answer the question: ‘What have I in common with other living things, and how do I differ from them?’” In one notable section of the book, Brewster writes:
Of course, the body is a machine. It is a vastly complex machine, many, many times more complicated than any other machine ever made with hands; but after all a machine. It has been likened to a steam engine. But that was before we knew as much about the way it works as we know now. It really is a gas engine: like the engine of an automobile, a motor boat, or a flying machine.
One of Turing’s most significant concepts related to something called the Universal Turing Machine. Instead of computers being single-purpose machines used for just one function, he explained how they could be made to perform a variety of tasks by reading step-by-step instructions from a tape. By doing so, Turing wrote that the computer “could in fact be made to work as a model of any other machine.” This meant that it was not necessary to have an infinite number of different machines carrying out different tasks. As Turing noted, “The engineering problem of producing various machines for various jobs is replaced by the office work of ‘programming’ the universal machine to do these jobs.”
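The idea of one machine taking on different jobs by reading different instructions can be sketched in a few lines. What follows is a toy table-driven tape machine, not Turing’s own construction; the function `run`, the rule tables `FLIP` and `INCREMENT`, and the tape format are all assumptions made for illustration.

```python
def run(rules, tape, state="start", blank="_", max_steps=1000):
    """Run a rule table on a tape and return the final tape contents."""
    cells = dict(enumerate(tape))  # tape position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        # Each rule says: what to write, which way to move, which state is next.
        write, move, state = rules[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Job 1: flip every binary digit on the tape.
FLIP = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

# Job 2: add one to a number written as a row of 1s.
INCREMENT = {
    ("start", "1"): ("1", "R", "start"),
    ("start", "_"): ("1", "R", "halt"),
}

print(run(FLIP, "1011"))      # 0100
print(run(INCREMENT, "111"))  # 1111
```

The `run` function never changes; only the table handed to it does, which is the sense in which building a new machine is replaced by the “office work” of programming.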
One such job, he hypothesized, was mimicking human intelligence. In one notable paper, entitled “Intelligent Machinery,” Turing considered what it would take to reproduce intelligence inside a machine: a particular challenge given the limitations of computers at the time. “The memory capacity of the human brain is probably of the order of ten thousand million binary digits,” he considered. “But most of this is probably used in remembering visual impressions, and other comparatively wasteful ways. One might reasonably hope to be able to make some real progress [toward Artificial Intelligence] with a few million digits [of computer memory].”
The third of AI’s forefathers was a man named Claude Shannon, known today as the father of “information theory.” Shannon was born in 1916, making him the youngest of the three, and his big contribution to computing related to the way in which transistors work. Transistors are the billions of tiny switches that make up a computer. An algorithm is the sequence of instructions that tells a computer what to do by switching these transistors on and off. By having certain transistors switch on and off in response to other transistors, Shannon argued that computers were performing basic reasoning. If, he said, transistor 1 switches on when transistors 2 and 3 are also on, this is a logical operation. Should transistor 1 turn on when either transistor 2 or 3 is on, this is a second logical operation. And if transistor 1 turns on when transistor 2 is switched off, this is a third logical operation. Like a simple vocabulary of spoken language, all computer algorithms break down into combinations of three basic operations: AND, OR, and NOT. Combining these simple operations into complex series of instructions, Shannon suggested that complex chains of logical reasoning could be carried out.
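The three operations described above can be written out directly, along with one example of how they combine into something more useful. This is an illustrative sketch only: the function names are assumptions, and the half adder (a circuit that adds two binary digits) is chosen here as a familiar example rather than taken from Shannon.

```python
from itertools import product


def AND(a, b):
    return a and b  # on only when both inputs are on


def OR(a, b):
    return a or b   # on when either input is on


def NOT(a):
    return not a    # on when the input is off


def XOR(a, b):
    """On when exactly one input is on, built purely from AND, OR and NOT."""
    return OR(AND(a, NOT(b)), AND(NOT(a), b))


def half_adder(a, b):
    """Add two one-bit numbers, returning (sum bit, carry bit)."""
    return XOR(a, b), AND(a, b)


for a, b in product([False, True], repeat=2):
    total, carry = half_adder(a, b)
    print(int(a), "+", int(b), "-> sum", int(total), "carry", int(carry))
```

Chaining half adders together gives arithmetic on larger numbers, which hints at how long chains of simple switches can add up to complex behavior.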
Of this group, only Shannon went on to play an active role in the official formation of Artificial Intelligence as its own discipline. Both Turing and von Neumann died tragically young, aged just forty-one and fifty-three respectively, although their ideas and influence continue to be felt today. Alan Turing was a homosexual at a time in English history in which it was a crime to be so. Despite his code-breaking work being vital to the British war effort against Nazi Germany, he was prosecuted and convicted of gross indecency in 1952. Forced to choose between prison and a painful chemical castration process, Turing opted for the latter. Two years later, he committed suicide by taking a bite of an apple laced with cyanide. He was given a posthumous royal pardon in 2013, and the suggestion was made that a “Turing’s Law” should be passed to pardon other gay men historically convicted of indecency charges.
Von Neumann’s death was caused by cancer, quite possibly the result of attending nuclear tests as part of the atom bomb project. In his obituary in the Economic Journal, one of von Neumann’s close colleagues described his mind as “so unique that some people have asked themselves—they too eminent scientists—whether he did not represent a new stage in human mental development.”
With two of its founders gone, the growing interest in building thinking machines was picked up by other, younger researchers. AI’s second wave of researchers became the first to officially name the field: formalizing it as its own specialized discipline. In the summer of 1956 (when Elvis Presley was scandalizing audiences with his hip gyrations, Marilyn Monroe married playwright Arthur Miller, and President Dwight Eisenhower authorized “In God we trust” as the US national motto), AI’s first official conference took place. A rolling six-week workshop, bringing together the smartest academics from a broad range of disciplines, the event unfolded on the sprawling 269-acre estate of Dartmouth College in Hanover, New Hampshire. Along with Claude Shannon, two of the organizers were young men named John McCarthy and Marvin Minsky, both of whom became significant players in the growing field of Artificial Intelligence.
“The study [of AI] is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can be so precisely described that a machine can be made to simulate it,” they wrote. “An attempt will be made to find how to make machines use language, form abstractions and concepts, solve the kinds of problems now reserved for humans, and improve themselves.”
Their ambition and self-belief were absolute, but their timeframe was perhaps somewhat compressed. “We think a significant advance can be made in one or more of these problems if a carefully selected group of scientists work on it for a summer,” they argued in their proposal for the Dartmouth conference.
Needless to say, things took a bit longer than that.
As more researchers took an interest in AI, it began to subdivide into different fields, reflecting the massive scope of what was being attempted. In some senses, this was inevitable. At the Dartmouth conference, it had proven difficult to even get everyone to agree on a name for their new field. John McCarthy pushed for the flashy-sounding Artificial Intelligence. Others were less convinced. Another researcher named Arthur Samuel thought the name sounded “phony,” while still others—Allen Newell and Herbert Simon—immediately reverted to calling their work “complex information programming.”
The division of Artificial Intelligence into different specialties didn’t take long. For evidence, look no further than the UK’s “Mechanization of Thought Processes” conference, organized at the National Physical Laboratory in Teddington, Middlesex, in 1958. Just two years after the Dartmouth conference, AI was already split into fields including “artificial thinking, character and pattern recognition, learning, mechanical language translation, biology, automatic programming, industrial planning and clerical mechanization.”
The period that followed is often considered to be the glory days of classic AI. The field was fresh, apparent progress was being made, and thinking machines seemed to lurk just over the horizon. It didn’t hurt that funding was plentiful, either—largely thanks to government organizations such as the US Defense Department’s Advanced Research Projects Agency (ARPA). In June 1963, ARPA issued MIT a $2.2 million grant for researching “machine-aided cognition.” According to people who benefited from the grant, ARPA paid it in one installment and didn’t show much concern for how it was spent. This was far from an isolated incident.
John McCarthy referred to this as the “Look, Ma, no hands!” era of AI, summoning up images of youthful exuberance and derring-do. For more than a decade, researchers attacked an astonishingly broad series of challenges, often designed to disprove a hypothesis like: “No machine will ever be capable of doing X.” What the X stood for depended on who was doing the asking. One researcher wrote a checkers program capable of beating most amateurs, including himself. Another breakthrough included a perceptive AI able to rearrange colored, differently shaped blocks on a table using a robotic hand: an astonishing feat in computer vision. A program called SAINT proved able to solve calculus integration problems of the level found on a first-year college course. Another, called ANALOGY, did the same for the geometric questions found in IQ tests, while STUDENT cracked complex algebra story conundrums such as: “If the number of customers Tom gets is twice the square of 20 percent of the number of advertisements he runs, and the number of advertisements he runs is 45, what is the number of customers Tom gets?”*
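For readers who want to check the Tom conundrum, the arithmetic can be worked through once the English has been turned into an equation by hand. The sketch below covers only that final calculation and says nothing about how STUDENT itself parsed the sentence; the variable names are illustrative.

```python
from fractions import Fraction

advertisements = 45

# "20 percent of the number of advertisements he runs"
twenty_percent = Fraction(20, 100) * advertisements   # 9

# "twice the square of" that quantity
customers = 2 * twenty_percent ** 2                   # 2 * 81

print(customers)  # 162
```

The hard part for STUDENT was not this sum but getting from the sentence to the equation in the first place.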
A particularly impressive display of computational reasoning was a program called SIR (standing for Semantic Information Retrieval). SIR appeared to understand English sentences and was even able to learn relationships between objects in a way that resembled real intelligence. In reality, this “knowledge” relied on a series of pre-programmed templates, such as A is a part of B, with nouns substituting for the variables. However, it was enough to suggest to the likes of Marvin Minsky that similar approaches could begin to tackle a variety of problems. Here is a typical example of SIR in action, with the lowercase sentences referring to the text entered by the user and the uppercase sentences referring to answers by the machine:
Every boy is a person.
I UNDERSTAND.
A finger is part of a hand.
I UNDERSTAND.
Each person has two hands.
THE ABOVE SENTENCE IS AMBIGUOUS. PLEASE REPHRASE IT.
There are two hands on each person.
I UNDERSTAND.
How many fingers does John have?
THE ABOVE SENTENCE IS AMBIGUOUS, BUT I ASSUME (HAS) MEANS (HAS AS PARTS). I DON’T KNOW WHETHER FINGER IS PART OF JOHN.
John is a boy.
I UNDERSTAND.
How many fingers does John have?
THE ABOVE SENTENCE IS AMBIGUOUS, BUT I ASSUME (HAS) MEANS (HAS AS PARTS). HOW MANY FINGERS PER HAND?
Every hand has five fingers.
I UNDERSTAND.
How many fingers does John have?
THE ANSWER IS TEN.
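The chain of reasoning in this exchange can be imitated with two small lookup tables, one for “is a” facts and one for “has as parts” facts. The sketch below is a loose illustration of the template idea and not a reconstruction of the SIR program: the class `TinySIR`, its method names and its data layout are all assumptions, it skips the English parsing entirely, and it assumes the part-whole facts contain no loops.

```python
class TinySIR:
    """Toy fact store: 'X is a Y' links and 'each X has N Ys' links."""

    def __init__(self):
        self.is_a = {}    # thing -> the category it belongs to
        self.parts = {}   # whole -> {part: how many}

    def tell_is_a(self, thing, kind):
        self.is_a[thing] = kind

    def tell_has(self, whole, part, count):
        self.parts.setdefault(whole, {})[part] = count

    def _kinds(self, thing):
        """Yield the thing itself, then each category up its is-a chain."""
        seen = set()
        while thing is not None and thing not in seen:
            seen.add(thing)
            yield thing
            thing = self.is_a.get(thing)

    def count(self, part, whole):
        """Return how many `part` a `whole` has, or None if unknown."""
        for kind in self._kinds(whole):
            for sub, n in self.parts.get(kind, {}).items():
                if sub == part:
                    return n
                inner = self.count(part, sub)  # e.g. fingers per hand
                if inner is not None:
                    return n * inner
        return None


kb = TinySIR()
kb.tell_is_a("boy", "person")            # Every boy is a person.
kb.tell_has("person", "hand", 2)         # There are two hands on each person.
print(kb.count("finger", "john"))        # None: nothing links John to fingers yet
kb.tell_is_a("john", "boy")              # John is a boy.
kb.tell_has("hand", "finger", 5)         # Every hand has five fingers.
print(kb.count("finger", "john"))        # 10
```

As with the original, the apparent understanding comes entirely from following stored links; a fact phrased in a way the tables do not anticipate simply cannot be used.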
What bonded together all of these seemingly disparate projects, all off in their own corners of AI, was the way that they conceived of intelligence and intelligent behavior. They took a top-down view of intelligence, which has come to be known as Symbolic AI or, today, Good Old-Fashioned AI. To a Good Old-Fashioned AI researcher, all intelligence is based on humans’ ability to understand the world by forming internal symbolic representations. We then create rules for dealing with these concepts, and these rules can be formalized in a way that captures everyday knowledge. If the brain is indeed a computer, this means that every situation we navigate relies on us running an internal computer program telling us, step by step, how to carry out an operation based entirely on logic. And if that is the case, surely those same rules about the organization of the world could also be passed on to a computer.
It all sounded almost too easy and, for a while, it was exactly that.
Although few saw it coming, there were several problems with Artificial Intelligence as it was developing. As is often the case with an exciting field that resonates with the general public, part of the blame must lie with the press. Overenthusiasm meant that impressive, if incremental, advances were often written up as though truly smart machines were already here. For example, one heavily hyped project was a 1960s robot called SHAKEY, described as the world’s first general-purpose robot capable of reasoning about its own actions. In doing so, it set benchmarks in fields like pattern recognition, information representation, problem solving and natural language processing.
That alone should have been enough to make SHAKEY exciting, but journalists couldn’t resist a bit of embellishment. As such, when SHAKEY appeared in Life magazine in 1970, he was hailed not as a promising combination of several important research topics, but as the world’s “first electronic person.” Tying SHAKEY into the space mania still carrying over from the previous year’s moon landing, Life’s reporter went so far as to claim SHAKEY could “travel about the Moon for months at a time without a single beep of direction from the earth.”
This was completely untrue, although not all researchers could resist playing up to it. At an AI conference in Boston during the 1970s, one researcher told a member of the press that it would take just five more years until intelligent robots like SHAKEY were picking up the stray socks in people’s homes. Pulled aside by a furious younger colleague, the researcher was told, “Don’t make those predictions! People have done this before and gotten into trouble. You’re underestimating how long this will take.” Without pausing, the older researcher responded, “I don’t care. Notice all the dates I’ve chosen were after my retirement date.”
AI practitioners weren’t always this cynical, but many were prone to the same fits of cyberbole. In 1965, Herbert Simon stated that in just twenty years’ time, machines would be capable “of doing any work a man can do.” Not long after, Marvin Minsky added that “within a generation . . . the problem of creating Artificial Intelligence will substantially be solved.”
Philosophical problems were also beginning to be raised concerning Symbolic AI. Perhaps the best-known criticism is the thought experiment known as “the Chinese Room.” Put forward by the American philosopher John Searle, it questions whether a machine processing symbols can ever truly be considered intelligent.
Imagine, Searle says, that he is locked in a room and given a collection of Chinese writings. He is unable to speak or write Chinese, and can’t even distinguish Chinese writing from Japanese writing or meaningless squiggles. In the room, Searle discovers a set of rules showing him a set of symbols that correspond with other symbols. He is then given “questions” to “answer,” which he does by matching the question symbols with the answer ones. After a while, Searle becomes good at this task—although he still has no concept of what the symbols are that he is manipulating. Searle asks whether it can be said that the person in the room “understands” Chinese. His answer is no, because there is a total lack of intentionality on his part. He writes: “Such intentionality as computers appear to have is solely in the minds of those who program them and those who use them, those who send in the input and those who interpret the output.”
If Searle was accusing AI researchers of acting like parents willing to seize on anything to proclaim their children’s brilliance, then AI researchers were, themselves, facing a similar uncomfortable truth: that their kids weren’t actually all that smart. Worryingly, tools which had shown promise in lab settings proved altogether less adept at coping in real-world situations. Symbolic AI was about building top-down, rule-based systems, able to work perfectly in laboratory settings where every element could be controlled. These “micro-worlds” contained very few objects and, as a result, limited actions that could be taken. Transferred to the chaos of everyday life, programs that had worked perfectly in training froze up like the England team in a World Cup opener.
Researchers acknowledged these weaknesses, describing such micro-worlds as “a fairyland in which things are so simplified that almost every statement about them would be literally false if asserted about the real world.” In all, AI struggled to deal with ambiguity; it was lacking the flexible abstract reasoning, data and processing power it needed to make sense of what it was shown. Anything that hadn’t been explicitly accounted for beforehand was cause for abject panic. The American writer Joseph Campbell quipped that this form of AI was not dissimilar to Old Testament gods, with “lots of rules and no mercy.”
Capping all of this uncertainty off was a bigger question about whether AI researchers were going about their work in the right way. A bit like starting work on a puzzle by piecing together the most complex pieces first, AI researchers had imagined that if they could solve the more advanced problems, the easy ones would take care of themselves. After all, if you can get a machine to play chess like a math prodigy, how tough could it be to simulate the learning of an infant? Pretty tough, it transpired. As a game, chess consists of clearly defined states, board positions and legal or illegal moves. It is a static world in which players have access to complete information, just so long as they can see the board and know the moves available to them. Chess may be a part of reality, but reality itself is nothing like chess. Suddenly, researchers like Hans Moravec began to voice startling suggestions like the notion that it is “comparatively easy to make computers exhibit adult-level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility.”
This concentration on the more complex aspects of life to the exclusion of more commonplace tasks may have had something to do with the sorts of people working in AI. In many cases brilliant scientists to whom the word “prodigy” can readily be applied, these researchers could handle the minutiae of chess or Boolean logic, but were absentminded and lacking in real-life common sense. In one commonly told anecdote, a highly intelligent MIT researcher named Seymour Papert once left his wife behind at a New York airport. He only realized that she was not accompanying him when he was halfway across the Atlantic. John McCarthy, meanwhile, could be tenacious when a problem challenged him, but caused no shortage of headaches by continually forgetting to fill out progress reports for the various agencies that funded him. McCarthy’s Introduction to Artificial Intelligence course at Stanford was reportedly so unfocused that students took to calling it “Uncle John’s Mystery Hour” behind his back. In the way that dogs are said to resemble their owners, is it any surprise that the focus of these researchers’ AI programs tended to be on lofty goals rather than mundane (but potentially more useful) feats?
As the psychologist Steven Pinker summed it up: “The main lesson of [the first] thirty-five years of AI research is that the hard problems are easy and the easy problems are hard.”
Facing these kinds of challenges, Good Old-Fashioned AI started to run into problems. From the 1970s, the field cooled off as the optimism of previous decades dissipated. Budgets were brutally slashed, plunging Artificial Intelligence into the first of several so-called “AI Winters.” In the United States, even the lovable SHAKEY the robot project shuddered to a halt when it became clear that it was not the robotic James Bond spy its funders at the Defense Department had hoped for. Forget spying, SHAKEY couldn’t even replace regular troops on the battlefield! One researcher who worked on the project remembers some military types coming in for a last-ditch look at SHAKEY rolling around the laboratory at its research institute, SRI International. Turning to one of its creators, a skeptical general asked, “Would it be possible to mount a thirty-six-inch bayonet on it?”
AI responded by shifting its ambitions, scaling back on some of its grander missions in favor of narrow, well-defined problems for which clear measures of success could be made. One such area was the growing field of video games. AI had been associated with game-playing since its earliest days, when Alan Turing and Claude Shannon attempted to build an automated chess player. In that instance, chess had been a micro-world designed to prove intelligent behavior that could later be rolled out in the real world. Now video games presented an end goal in and of themselves.
Not only were researchers’ skills in demand, but there was real money on offer, too. One such beneficiary was Alexey Pajitnov, a twenty-eight-year-old AI researcher then working for the Soviet Academy of Sciences’ Computer Center in Moscow. In June 1984, Pajitnov created a simple program to test out the lab’s new computer system. Brought to market by a shrewd entrepreneur under the name Tetris, Pajitnov’s falling blocks game proceeded to sell more than 170 million copies worldwide.
As the 1980s wore on, video games became increasingly intricate and AI experts were snapped up to help. Their ability to model complex behavior using simple rules meant that computer-controlled characters could possess their own motivations. In the hit game Theme Park, for instance, simple AI agents flocked around the parks built by users, taking routes no programmer had explicitly mapped out.
In one sense, video games were the perfect place for Good Old-Fashioned AI. Questions about whether behavior was truly intelligent, or just acting like it, meant nothing if the AI was being used to model the zombie enemy in a first-person shooter. (In fact, it would be considerably crueler if the agents were intelligent.) Even today, video-game developers employ more AI practitioners than any other industry.
A second new application for AI was working alongside humans as a problem-solving tool. Although reasoning is a key part of intelligence, researchers knew that this was not the only part. To build Artificial Intelligence capable of being used in the real world to solve genuine problems, experts decided they needed machines that could combine reasoning with knowledge. For example, a computer that was going to be useful in neuroscience would have to be intimately acquainted with the same concepts, facts, representations, methods, models, metaphors and other facets of the subject that a qualified neuroscientist would be.
This meant that programmers suddenly had to become “knowledge engineers,” capable of taking human experts in a variety of fields and distilling their knowledge into rules a computer could follow. The resulting programs were called “expert systems.” These were systems built on an extensive collection of probabilistic “IF . . . THEN” rules. One early attempt at an expert system was called DENDRAL, a program designed to help organic chemists identify unknown organic molecules. “For a while, we were regarded at arm’s length by the rest of the AI world,” creator Edward Feigenbaum told Pamela McCorduck, one of the earliest writers to chronicle the history of Artificial Intelligence. “I think they thought DENDRAL was something a little dirty to touch because it had to do with chemistry, though people were pretty generous about ‘oohs’ and ‘ahs’ because it was performing like a PhD in chemistry.”
Another similar project was MYCIN, designed to help recommend the correct dosage of antibiotics for severe infections such as meningitis. Like a real doctor, MYCIN drew conclusions by combining pieces of probabilistic evidence from the previous experience of its programmers. These years of experience were squeezed and shaped until they resembled “rules” like the following:
IF . . . the infection which requires therapy is meningitis, and the type of infection is fungal, and organisms were not seen on the stain of the culture, and the patient is not a compromised host, and the patient has been to an area that is endemic for coccidiomycoses, and the race of the patient is Black, Asian, or Indian, and the cryptococcal antigen in the csf test was not positive, THEN . . . there is a 50 percent chance that cryptococcus is not one of the organisms which is causing the infection.
On their own, such probabilistic rules didn’t amount to much. When combined in their hundreds, however, they could regularly find the right answer. DENDRAL and MYCIN remained lab experiments that were never used in the real world. Another expert system called XCON proved more successful. Created in 1978, XCON lacked the world-improving ambitions of DENDRAL and MYCIN. Instead of helping scientists form hypotheses, or doctors treat infectious diseases, XCON aided engineers in configuring VAX computers by choosing the right system components for a customer’s requirements. In short, it was the world’s greatest know-it-all shop assistant.
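How individually weak IF . . . THEN rules can add up to a confident conclusion is easier to see in a small sketch. The one below assumes a simplified version of the certainty-factor arithmetic associated with MYCIN, covering only rules that support a conclusion. The rules, facts and numbers are invented for illustration and carry no medical meaning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    conditions: frozenset  # every condition must be among the known facts
    conclusion: str
    certainty: float       # strength of support, between 0 and 1


def combine(cf_a: float, cf_b: float) -> float:
    """Merge two supporting certainty factors (simplified: positive values only)."""
    return cf_a + cf_b * (1.0 - cf_a)


def infer(facts: frozenset, rules: list) -> dict:
    """Fire every rule whose conditions all hold and accumulate belief per conclusion."""
    belief = {}
    for rule in rules:
        if rule.conditions <= facts:
            previous = belief.get(rule.conclusion, 0.0)
            belief[rule.conclusion] = combine(previous, rule.certainty)
    return belief


# Hypothetical rule base and patient facts, not taken from MYCIN itself.
RULES = [
    Rule(frozenset({"meningitis", "fungal"}), "organism_a", 0.3),
    Rule(frozenset({"antigen_test_positive"}), "organism_a", 0.5),
    Rule(frozenset({"bacterial"}), "organism_b", 0.4),
]
FACTS = frozenset({"meningitis", "fungal", "antigen_test_positive"})

beliefs = infer(FACTS, RULES)

if __name__ == "__main__":
    for conclusion, cf in beliefs.items():
        print(f"{conclusion}: {cf:.2f}")
```

Two rules worth 0.3 and 0.5 on their own combine to 0.65 for the same conclusion, while the third rule never fires because its condition is not among the facts. The same arithmetic hints at why large rule sets became fragile: every added rule shifts the totals that earlier rules were tuned to produce.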
For the first time, big business began to show a real interest in AI as something more than a demo of the future. As long as expert systems could make them money, it shockingly turned out that companies didn’t care too much about whether expert systems were real AI or simply “clever programming.” XCON’s first day of work took place in 1980 at the Salem, New Hampshire, factory of DEC, the Digital Equipment Corporation. By 1986, XCON had processed a whopping 80,000 orders, was saving DEC an estimated $25 million a year, and achieved accuracy rates of 95–98 percent. If it had only married the boss’s daughter, it could’ve had a future as CEO.
Rival companies soon crawled out of the woodwork, offering custom solutions for companies wanting their own expert systems. Dipmeter Advisor could assist with the analysis of geological formations in oil-well drilling. The scintillating Grain Marketing Advisor made clear its ambitions to help farmers properly market and store their grain crops. “How can you take immediate advantage of expert systems technology to enhance your existing data processing applications, on your existing hardware, using your current . . . staff?” asked an ad printed in Computerworld magazine in October 1986. “Only Teknowledge has the answer. And it’s yours. Free. At a half-day seminar in your area.”
In all, during 1985, a massive $1 billion was spent by approximately 150 companies wanting to get in on the Artificial Intelligence business. That year, a meeting of the American Association for Artificial Intelligence and the International Joint Conference on Artificial Intelligence had close to 6,000 attendees. Over half of them were venture capitalists, recruiters and media folk. In 1987, Fortune magazine—hardly the place for cutting-edge computer research—praised the arrival of “Live Experts on a Floppy Disk.” For the first time in AI’s history, researchers were getting as rich as the new PC upstart entrepreneurs like Steve Jobs and Bill Gates.
Interestingly, seasoned researchers like Marvin Minsky shied away from this. It would be easy to assume that the old guard of AI would have been eager to cash in after more than a quarter century of hard work. In fact, they were waiting for the other shoe to drop. It didn’t take long. As with the speculative dot-com bubble of the late 1990s, exponents tended to overstate the abilities of expert systems to a dangerous degree. One textbook invoked the “phone-call rule” suggesting that “any problem that can be and frequently is solved by your in-house expert in a ten to thirty-minute phone call can be automated as an expert system.” The underlying concept of expert systems was solid, but they had problems. They were expensive, required constant updating and—counterintuitively—could become less accurate the more rules were incorporated. “As rule sets become larger, undesirable interactions between rules become more common, and practitioners found that the certainty factors of many other rules had to be ‘tweaked’ when more rules were added,” Stuart Russell and Peter Norvig write in the textbook Artificial Intelligence: A Modern Approach.
In the fiscal year ending 1987, two of the leading expert system companies—Teknowledge and Intellicorp—lost millions of dollars. Other AI companies fared even worse—filing for bankruptcy, leaving employees and executives out in the cold. After a warm spell, AI’s second winter was back.
AI’s second cold snap was worse than its first. Money dried up again. Government grants vanished once more. The budget for AI research from the US Defense Advanced Research Projects Agency, DARPA (the new name for ARPA from 1972), declined by a full one-third between 1987 and 1989. Advertising rates fell in specialist Artificial Intelligence magazines. When Daedalus, the official journal of the American Academy of Arts and Sciences, dared publish an entire issue on AI in 1988, the philosopher Hilary Putnam was outraged. “What’s all the fuss about now?” Putnam wrote. “Why a whole issue of Daedalus? Why don’t we wait until AI achieves something and then have an issue?” The backlash was felt throughout the tech world. Membership in the Association for the Advancement of Artificial Intelligence tailed off. By its nadir in 1996, it had plummeted to just 4,000 members worldwide. Short of a miracle, the dream of Artificial Intelligence appeared to be over.
That year, two students at Stanford—one the child of an AI researcher, the other of a mathematician—came up with a clever way to build a smart web catalogue by ranking pages based on the number of incoming links. In 1997, twenty-four-year-old Larry Page and Sergey Brin turned their nifty algorithm into a company, launched from a garage in Menlo Park. To make it the “Worldwide Headquarters” they thought it should be, they kitted it out with a few tables, three chairs, a turquoise shag rug, a folding Ping-Pong table and a few other items. The garage door had to be left open for ventilation.
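The idea of ranking pages by their incoming links can be sketched in a few lines. The version below simply counts links, as described above; it is an illustration rather than the students' actual algorithm, which also weighed how important each linking page was. The page names and the link graph are made up.

```python
from collections import Counter


def rank_by_incoming_links(links: dict) -> list:
    """Order pages by how many distinct other pages link to them.

    `links` maps each page to the pages it links out to. Ties are broken
    alphabetically so the ordering is deterministic.
    """
    incoming = Counter()
    for source, targets in links.items():
        incoming[source] += 0          # make sure pages with no in-links appear
        for target in set(targets):    # count each linking page only once
            if target != source:       # ignore self-links
                incoming[target] += 1
    return sorted(incoming, key=lambda page: (-incoming[page], page))


# Hypothetical four-page web.
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}

ranking = rank_by_incoming_links(LINKS)

if __name__ == "__main__":
    print(ranking)
```

Here `c.example` comes first because three other pages point to it, while `d.example`, which nobody links to, comes last. No hand-written rules about page content are involved: the ranking emerges from the data.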
It must have seemed innocuous at the time, but over the next two decades, Larry Page and Sergey Brin’s company would make some of the biggest advances in AI history. These spanned fields including machine translation, pattern recognition, computer vision, autonomous robots and far more, which AI researchers had struggled with for half a century.
Virtually none of it was achieved using Good Old-Fashioned AI.
The company’s name, of course, was Google.