Here’s the Plan

Now that you know about the softmax, let’s replace the second sigmoid with a softmax and complete the design of our neural network:

images/designing/network_plan.png

The input layer has 785 columns (one per pixel, plus the bias); its weighted sum passes through a sigmoid to produce the hidden layer. The hidden layer has 201 columns including the bias; its weighted sum passes through a softmax to produce the output layer. The output layer has 10 columns, matching the 10 classes in MNIST. Finally, the two matrices of weights have the dimensions that make the matrix multiplications line up: 785 by 200 for the first, and 201 by 10 for the second. That’s all we need to know to get started.
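To check that these dimensions fit together, here is a minimal sketch of one forward pass through the planned network, assuming NumPy. The names (`prepend_bias`, `w1`, `w2`, `h`, `y_hat`) and the random stand-in data are illustrative assumptions, not the final implementation, and the softmax subtracts each row's maximum only for numerical stability.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def softmax(logits):
    # Subtracting the row maximum avoids overflow and leaves the result unchanged.
    exponentials = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exponentials / exponentials.sum(axis=1, keepdims=True)

def prepend_bias(X):
    # Insert a column of 1s at position 0 (hypothetical helper).
    return np.insert(X, 0, 1, axis=1)

n_examples = 5        # arbitrary batch size for this sketch
n_pixels = 784        # 28 x 28 MNIST images
n_hidden_nodes = 200  # 201 columns once the bias is added
n_classes = 10

# Random stand-ins for real images and trained weights (assumption).
rng = np.random.default_rng(0)
X = rng.random((n_examples, n_pixels))
w1 = rng.standard_normal((n_pixels + 1, n_hidden_nodes)) * 0.01
w2 = rng.standard_normal((n_hidden_nodes + 1, n_classes)) * 0.01

# (5, 785) times (785, 200) gives (5, 200), then the sigmoid.
h = sigmoid(np.matmul(prepend_bias(X), w1))
# (5, 201) times (201, 10) gives (5, 10), then the softmax.
y_hat = softmax(np.matmul(prepend_bias(h), w2))

print(prepend_bias(X).shape, w1.shape, h.shape)
print(prepend_bias(h).shape, w2.shape, y_hat.shape)
```

Each row of `y_hat` holds 10 values that sum to 1, one per MNIST class, which is what the softmax is there to provide.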

You must be eager to get down to the code. But first, let’s recap what we learned in this chapter.