Now that you have a few tricks under your belt, you can start experimenting with deep neural networks on your own. Speaking of which, I have a challenge for you.
To set up this challenge, I wrote a neural network and trained it on the MNIST dataset. You've already seen plenty of MNIST classifiers, but this time it's a deep neural network written in Keras. Go and take a look at it. It's in network_mnist.py, in this chapter's source code.
This network doesn’t use any of the techniques from this chapter, but it still does okay. I trained it for 10 epochs and got this result:
loss: 0.2026 - acc: 0.9388 - val_loss: 0.2487 - val_acc: 0.9228
Now it’s your turn. Your mission, should you choose to accept it, is to improve this network’s accuracy while respecting two constraints:
Don’t change the network’s layout. Keep the same number of layers and nodes per layer.
Don’t change the number of epochs or the batch size. Keep them at 10 and 32, respectively.
In all other respects, you can go wild. You can use the tricks that we described in this chapter, or even explore new ones—activation functions, optimization algorithms, or whatever else you learn by browsing Keras’s documentation.[26]
Here are a few ideas to get you up and running:
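For instance, here is a minimal sketch of the kind of changes that stay within the rules. This is not the code from network_mnist.py: the layer sizes (1,200 and 500 hidden nodes) and the import paths are placeholders, so keep the layout of the original network and borrow only the ideas, such as rescaled inputs, ReLU activations in the hidden layers, and a different optimizer like Adam.

```python
# A sketch of tweaks that respect the challenge's constraints.
# The layer sizes below are placeholders, not the actual layout of
# network_mnist.py -- keep whatever the original network uses.
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
from keras.datasets import mnist
from keras.utils import to_categorical

(X_train_raw, Y_train_raw), (X_test_raw, Y_test_raw) = mnist.load_data()

# Idea 1: rescale the inputs to the 0-1 range instead of feeding raw pixels.
X_train = X_train_raw.reshape(X_train_raw.shape[0], -1) / 255.0
X_test = X_test_raw.reshape(X_test_raw.shape[0], -1) / 255.0
Y_train = to_categorical(Y_train_raw)
Y_test = to_categorical(Y_test_raw)

model = Sequential()
# Idea 2: swap sigmoids for ReLUs in the hidden layers
# (same number of layers and nodes as before).
model.add(Dense(1200, activation='relu', input_shape=(784,)))
model.add(Dense(500, activation='relu'))
model.add(Dense(10, activation='softmax'))

# Idea 3: try a different optimization algorithm, such as Adam.
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(),
              metrics=['accuracy'])

# Epochs and batch size stay fixed, per the rules of the challenge.
model.fit(X_train, Y_train,
          validation_data=(X_test, Y_test),
          epochs=10, batch_size=32)
```

Try the ideas one at a time, and check the validation accuracy after each change, so you can tell which ones actually help.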
If your network does better than 97% on the validation set, then victory is yours.
Although I’m pretty sure that you can hit that 97%, I’d be surprised if you did much better than that. The techniques that we introduced in this chapter are essential, but they can only take us so far. To get radically better results, we need something more than incremental tricks—we need a fundamentally different neural network architecture. That is the subject of the next chapter.