PREFACE

The seed for this book was planted in an AI research lab on the top floor of a computer science department one night in 2010. Having attended some recent talks about self-driving cars, and curious about how they worked, I did a few web searches. The best explanations I could find were the original academic papers written by some of the researchers at Carnegie Mellon University and Stanford. I looked at them for a few minutes, gained a superficial understanding of how self-driving cars worked, and eventually moved on.

But over time, I found myself repeating this process again and again. Whenever I saw another breakthrough in artificial intelligence or machine learning hit the press, I came back to the same question: How does it work? The curious thing to me was that I’d spent countless hours studying and practicing machine learning in academia and industry, and yet I still couldn’t consistently answer that question. Perhaps I didn’t know AI and machine learning as well as I should, I thought, or perhaps college courses didn’t teach us the right material. Most college courses on these topics teach only the building blocks behind these breakthroughs, not how these building blocks should be put together to do interesting things.

But there was another, more fundamental reason I couldn’t figure out how these breakthroughs worked: most of them really did involve groundbreaking research; we simply didn’t know how to build them until a group of researchers figured it out and wrote about the process or built a prototype. That’s why researchers have been writing about these breakthroughs in peer-reviewed journals: precisely because they’re novel, impactful, and non-obvious (and peer-reviewed). But it still didn’t help that the details behind these breakthroughs, once published, were spread out, haphazardly, across many different sources.

Eventually I realized that I should share what I was learning during my own research with other people, so they wouldn’t need to jump through the same hoops to understand the same things. In other words: I wrote this book because it was a book I wanted to read.

I’ve written How Smart Machines Think with the hope that it will be helpful for tech enthusiasts young and old who are curious about science and technology in general, or for industry leaders who hope to learn more about whether machine learning and artificial intelligence might be useful for their companies. This book is meant to be accessible to a broad audience, from a curious high school student to a retired mechanical engineer. Although it will help if you know a little computer science, the only real prerequisites for this book are curiosity and a bit of an attention span. And I have intentionally kept the math in this book to a minimum to communicate the core ideas without alienating casual readers.

Experts in the robotics, AI, and machine learning communities will often know the implementation details of some of the algorithms I will describe; but the remaining narrative and the design of entire systems will still probably be new to many of them (except when that is their area of research). My hope is that there is something new in this book for everyone.