Man Versus Machine—
Part 1

The IBM Challenge—Game 1

February 14–15, 2011

Contestants
Ken Jennings
Watson
Brad Rutter

The ultimate contest between man and machine has its roots in a steak house in Fishkill, New York, in 2004.

A group of IBM researchers who were dining together noticed that, for no apparent reason, most of the people in the restaurant had gathered around a television set in the bar. Curious, they joined the crowd.

Jeopardy! was playing on the TV set. Ken Jennings was in the middle of his record-setting winning streak. In Fishkill—and lots of places around the nation—people stopped what they were doing to see if Jennings would win again. And again.

Unlike other viewers, though, the IBM employees were less concerned with Jennings than with the idea he gave them for a new grand challenge. From time to time, IBM created grand challenges to pit man against machine. Ideally, these projects not only connected to the company’s research and ambitions but also sparked interest around the world.

Deep Blue had been one such grand challenge. In 1997, after six years of research, IBM designed a computer that beat world chess champion Garry Kasparov. Now the IBM researchers began to wonder whether it might be possible to design a computer that could beat the best at Jeopardy!

Such a computer would bring fame and glory to IBM. More important, it could generate billions in sales.

The Wall Street Journal reported that, by 2010, IBM’s business-analytics revenue was $9.4 billion, but the company wanted it to grow to $16 billion within five years. It already faced stiff competition from market leader Oracle Corp. and from SAP, a European multinational corporation.

Turning Jeopardy! into a grand challenge might very well put IBM in the spotlight, but, in the year or so following Jennings’s streak, opinion at the corporation was divided. Some thought it silly. Some called it gimmicky. Some doubted it was even possible.

Eventually, the company decided to give it a try. Initial funding came from the research budget, which didn’t require approval at the highest levels and freed the project from the usual commercial pressures.

The computer destined to be a Jeopardy! star was named Watson, for Thomas J. Watson Sr. (1874–1956), who was hired in 1914 to run the merged company, formed from three firms, that became International Business Machines, one of the greatest firms in US history.

By 2007, the grand challenge was far from grand. Three people were gathering data from old Jeopardy! shows and starting to train Watson. Said one employee who left the company that year, “It could barely beat a five-year-old.”

Eric Brown, a frontline researcher at IBM, heard about the project in 2006. His group and a sister group were looking at how computers answer questions. To him the grand challenge seemed like a great fit.

Until then, most people trying to get computers to answer questions had focused on getting humans to phrase their questions in ways a computer could understand. With this project, IBM was going to try to teach a computer to understand what humans were saying.

That was going to be a lot harder than building Deep Blue. Chess, for all its complexity, was like a game of playground tag compared to Jeopardy! Chess has specific rules for each piece. It is mathematical by nature and takes place entirely on a board with sixty-four squares.

Jeopardy! involves the limitless ways language can be used and words are constructed. It incorporates humor, wordplay, and multiple meanings. It communicates different thoughts by the way words are organized and stressed in sentences. On top of all that, the subjects of clues in any given game of Jeopardy! are practically limitless.

IBM divided the challenge into five parts:

- Watson would need to understand language well enough to interpret not only the meaning of the words but also the wordplay and wit inherent in Jeopardy! clues.

- Watson would need a database with enough knowledge to cover the range of material in any given Jeopardy! game.

- Watson would have to comb through that database and generate precise answers.

- Watson would need to evaluate all the answers it came up with and figure out how confident it could be in each of them.

- Watson would have to work fast enough to beat the other players.
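Taken together, those five parts describe a question-answering pipeline: interpret the clue, pull candidates from an offline knowledge base, score them, and buzz only when confidence is high and time allows. As a purely illustrative sketch of how such stages might compose (this is not IBM’s code; every name below is hypothetical):

```python
# A purely illustrative sketch of the five parts as one pipeline.
# This is not IBM's Watson code; every name below is hypothetical.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Candidate:
    answer: str
    confidence: float  # estimated probability the answer is right

def parse_clue(clue: str) -> dict:
    # Part 1: interpret the clue's language, wordplay included.
    return {"text": clue}  # placeholder analysis

def generate_candidates(parsed: dict) -> list:
    # Parts 2 and 3: comb a broad offline knowledge base for possible answers.
    return ["candidate A", "candidate B"]  # placeholder retrieval

def estimate_confidence(parsed: dict, answers: list) -> list:
    # Part 4: score how confident the system is in each candidate.
    return [Candidate(a, 1.0 / (i + 2)) for i, a in enumerate(answers)]

def answer_clue(clue: str, buzz_threshold: float = 0.8) -> Optional[Candidate]:
    # Part 5: the whole chain must run fast enough to beat human buzzers,
    # and the system buzzes only when its best candidate clears a threshold.
    parsed = parse_clue(clue)
    candidates = estimate_confidence(parsed, generate_candidates(parsed))
    best = max(candidates, key=lambda c: c.confidence)
    return best if best.confidence >= buzz_threshold else None

# The placeholder scores never clear the threshold, so the sketch abstains.
print(answer_clue("This clue is just an example"))  # prints None
```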

Plus, there was one more thing: IBM would have to convince Jeopardy! to accept the challenge.

Michael Loughran from IBM public relations called Jeopardy! executive producer Harry Friedman. Loughran said he would be in Los Angeles and would like to come by to discuss the game.

“I thought they wanted to make a Jeopardy! game like some of the other electronics companies, one you could play at home or on a mobile device,” Friedman recalled. “I said, ‘I’ll be glad to talk to you about it, but you should know the game is exclusively licensed to several makers.’

“He said it’s not that exactly, which piqued my curiosity. He came by but, even after we talked, I still didn’t get it until he asked me if I remembered Deep Blue, the computer that played chess. He said they were talking about a computer that can play Jeopardy!

“I said, ‘I really don’t think we’d be interested.’ It sounded too much like a stunt. He said, ‘Okay, I understand. But think about it.’”

Loughran and IBM’s lead on the project, David Ferrucci, returned about once a month. Each time, Ferrucci gave Friedman more information about Watson. He explained how this computer would learn as it operated and talked about IBM’s plans to build businesses around artificial intelligence. Ferrucci described how a computer like this could help doctors in remote locations access information to diagnose and treat disease.

“That’s not something your typical quiz show has the opportunity to do,” Friedman said. “And that very much appealed to us.”

There would be three prizes for the exhibition—$1 million for the winner, $300,000 for second place, and $200,000 for third place, with the human contestants, Ken Jennings and Brad Rutter, donating half their prize money to charity. IBM said it would donate its entire winnings to charity.

Supervising Producer Lisa Broffman remembers the meetings with IBM as two opposite cultures trying to communicate with each other. “We didn’t speak the same language. We both strive for excellence, but we function in completely different ways. It was our mutual respect for each other and shared belief in the importance of this project that led to a fascinating working relationship.

“And then we finally learned to understand each other and respect each other and function together in a productive way.”

The Jeopardy! staff began thinking about how to adjust the game so that neither man nor machine would be advantaged or penalized.

Watson would receive each clue electronically at the same moment it was displayed to the human contestants and read aloud by Alex Trebek. Neither Watson nor the humans could buzz in until after the clue was read. After that, the humans could press their buzzers and Watson could activate a mechanical device that pressed its own.

Of course, there would be no visual or musical clues. Also, like the human players, Watson would have no access to the internet.

“It might have been fun to write complicated clues to make it tough on Watson,” said editorial producer Billy Wisse, but he and his writing staff never got the chance. By agreement, the games were chosen at random from about fifty that had been written for upcoming Jeopardy! shows. That way, no one could claim the clues had been written to favor either the humans or the computer.

IBM researchers continued to improve Watson’s play. The computer drew from a database containing two hundred million pages of information, including whole encyclopedias, plays, textbooks, dictionaries, and other books. It was the equivalent of one million books, about one-fourth the number of reference books contained in the main building of the New York Public Library on Fifth Avenue.

At first, IBM tested Watson against IBM employees who thought they were good at Jeopardy! That gave the researchers an idea of Watson’s accuracy but not of how it would function in real time. Next, IBM figured out a way to speed up Watson’s responses. At the same time, the company created algorithms that helped Watson generate answers, then weighted them so that the algorithms with the best track records counted most.
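As a rough illustration of that kind of weighting (this is not IBM’s implementation; the scorer names, weights, and numbers below are all invented), each algorithm scores a candidate answer, and a per-algorithm weight determines how much its vote counts in the blended confidence:

```python
# A toy sketch of weighting multiple answer-scoring algorithms.
# Not IBM's implementation: scorer names, weights, and scores are invented.

# Hypothetical weights, with the historically most reliable scorers counting
# the most; they sum to 1 so the blended confidence stays between 0 and 1.
WEIGHTS = {"text_match": 0.5, "type_check": 0.3, "date_check": 0.2}

def blended_confidence(scores: dict) -> float:
    """Combine one candidate's per-scorer results into a single confidence."""
    return sum(WEIGHTS[name] * s for name, s in scores.items())

# Invented per-scorer results for two candidate answers to the same clue.
candidates = {
    "Charles Babbage": {"text_match": 0.92, "type_check": 0.75, "date_check": 0.40},
    "Alan Turing":     {"text_match": 0.55, "type_check": 0.80, "date_check": 0.20},
}

best = max(candidates, key=lambda name: blended_confidence(candidates[name]))
print(best)  # Charles Babbage, whose blended confidence is about 0.77
```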

Brad Rutter and Ken Jennings at IBM.

In April 2009, IBM publicly announced its grand challenge at its annual stockholder meeting.

Later that year, Watson began sparring matches with former Jeopardy! champions, those who had won at least one televised game. “We learned an awful lot about how the system behaved in real time and some of the other elements of game strategy that were specific to Jeopardy!,” Brown said.

The next year, IBM had Watson compete against Jeopardy! champions who had participated in Tournament of Champions play. “We were playing against better competitors and, as the system improved its performance, we were even more assured and confident that we were moving in the right direction,” Brown said.

Month after month, Watson kept getting better. IBM regularly reported its progress to the Jeopardy! staff, and Friedman and Schmidt visited IBM’s offices in New York to observe the developments there.

“We wanted to be assured that Watson would be competitive,” Schmidt said.

Friedman said IBM and Jeopardy! agreed on a threshold for the computer. “When Watson could answer 80 percent correctly, I think we mutually deemed that it was ready.” On December 14, 2010, IBM and Jeopardy! jointly announced that Watson, Ken Jennings, and Brad Rutter would play two games over three broadcasts starting on February 14, 2011.