Before this chapter, we built and trained only fully connected neural networks, where each node is connected to all the nodes in the layers before and after it. In this chapter, we moved beyond that generic architecture to see an example of a more specialized one: a convolutional neural network.
Unlike fully connected networks, CNNs preserve the multidimensional shape of data such as images. In a CNN, the weights are organized into structures called filters that detect geometric features in the data. Each layer applies an operation called a convolution between its input and its filters. CNNs are better than fully connected networks at dealing with images, because they respect the spatial information in the images instead of squashing it into a mostly meaningless byte jam.
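To make the idea concrete, here is a minimal sketch of the convolution operation itself, written in plain NumPy. The sliding-window loop and the edge-detecting filter are illustrative assumptions of ours; in a real CNN the filter values are learned during training, and deep learning libraries implement the operation far more efficiently.

```python
import numpy as np

# A minimal sketch of the convolution used in CNNs (strictly speaking,
# a cross-correlation): a small filter slides over a 2D input, and each
# position produces one value in the output feature map.
def convolve2d(image, kernel):
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # Multiply the patch by the filter element-wise, then sum
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
# A hypothetical hand-made filter that responds to vertical edges;
# in a trained CNN, these weights would be learned, not hand-picked.
edge_filter = np.array([[1.0, 0.0, -1.0],
                        [1.0, 0.0, -1.0],
                        [1.0, 0.0, -1.0]])
print(convolve2d(image, edge_filter))  # a 3x3 feature map
```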
Finally, we built our own convolutional neural network. We applied a few common practices, such as stacking a few convolutional layers followed by fully connected layers, as sketched below. When we trained the CNN, we were rewarded with pretty good accuracy on the tough CIFAR-10 dataset.
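As a rough illustration of that convolutional-then-fully-connected pattern, here is a minimal Keras sketch on CIFAR-10. The layer sizes, epoch count, and other hyperparameters are illustrative assumptions, not the exact network we built in this chapter.

```python
# A minimal sketch of a CNN for CIFAR-10 in Keras: a few convolutional
# layers to extract spatial features, followed by fully connected layers
# to classify them. Hyperparameters here are illustrative assumptions.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from keras.datasets import cifar10
from keras.utils import to_categorical

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0   # scale pixels to [0, 1]
y_train, y_test = to_categorical(y_train), to_categorical(y_test)

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),                        # hand over to fully connected layers
    Dense(128, activation='relu'),
    Dense(10, activation='softmax'),  # one output per CIFAR-10 class
])

model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
model.fit(X_train, y_train, validation_data=(X_test, y_test),
          epochs=10, batch_size=32)
```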
Besides CNNs, deep learning spans many other architectures, each suited to different problems and different types of data. Speaking of which, we still haven't settled on what deep learning is, how it works, and what its boundaries are. Those questions will be the subject of the closing chapter of this book.