Chapter 9
Designing the Network

Part I of this book was all about the perceptron. Part II is about the perceptron’s big brother, and the most important idea in this book: the neural network. Neural networks are way more powerful than perceptrons. In Where Perceptrons Fail, we learned that perceptrons can only handle simple data that are linearly separable. By contrast, neural networks can deal with gnarly data, like photos of real-world objects.

Even on a simple dataset like MNIST, our perceptron was just scraping by, making almost one mistake every ten characters. With neural networks, we can aim to cut that error rate by an order of magnitude: in this part of the book, we’d like to build an MNIST classifier that reaches 99% accuracy, or one error every 100 characters.
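To make that “order of magnitude” concrete, here’s a quick back-of-the-envelope sketch in Python. The 10,000-image size of the standard MNIST test set is a fact of the dataset; the 90% figure is the rough perceptron accuracy from Part I:

```python
# Back-of-the-envelope: how many test digits each accuracy level gets wrong.
MNIST_TEST_SIZE = 10_000       # the standard MNIST test set has 10,000 images

perceptron_accuracy = 0.90     # roughly one mistake every ten digits (Part I)
target_accuracy = 0.99         # our goal: one mistake every hundred digits

perceptron_errors = (1 - perceptron_accuracy) * MNIST_TEST_SIZE
target_errors = (1 - target_accuracy) * MNIST_TEST_SIZE

print(f"Perceptron: ~{perceptron_errors:.0f} misclassified digits")
print(f"Our target: ~{target_errors:.0f} misclassified digits")
# 1,000 errors down to 100: the error count drops tenfold.
```

The accuracy only climbs from 90% to 99%, but the number of mistakes falls by a factor of ten. That tenfold drop in errors is what makes the goal lofty.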

Here’s a plan to reach for that lofty goal: over the next few chapters, we’ll build and tweak a neural network. Once again, you’ll learn by doing, hand-crafting code line by line. It will take us three chapters to complete our neural network v1.0:

- In this chapter, we’ll design the neural network.
- In Chapter 10, Building the Network, we’ll write the code that runs it.
- In Chapter 11, Training the Network, we’ll write the code that trains it.

After these three chapters, we’ll have a working neural network. Even then, however, you might struggle to understand how it works. The following chapter, Chapter 12, How Classifiers Work, will use our network as a concrete example to answer questions like: “What makes neural networks so powerful?”

After building a neural network and understanding how it works, we’ll have some more work to do. The first version of our network won’t be very accurate—in fact, it will only be about as accurate as the perceptron. In the remaining three chapters of Part II, we’ll unleash the network’s power and aim straight for that 99% accuracy.

I won’t spoil your fun by telling you whether we’ll eventually hit our goal. To find out, you’ll have to keep reading. (Or peek at the last pages of Part II… But you wouldn’t do that, would you?)

Now that we have a plan of action, let’s dive straight into our first task: designing a neural network that classifies MNIST digits.