Getting the data

To get to building neural networks quickly, we will use the MNIST dataset of handwritten digits, which is available through the MLDatasets.jl package. The package provides convenient access to a number of publicly available datasets. If you don't have MLDatasets installed, add it with the Pkg.add command:

using Pkg  # Pkg must be loaded explicitly on Julia 1.0 and later
Pkg.add("MLDatasets")

Once the MLDatasets package is loaded, the MNIST dataset can be downloaded and accessed with the traindata and testdata functions from the MNIST module:

using Images, ImageView, MLDatasets
train_x, train_y = MNIST.traindata()
test_x, test_y = MNIST.testdata()

The first time you call any of the MLDatasets dataset modules, it presents the dataset's terms of service (TOS) and offers to download the data to your local machine. Be aware that download sizes vary by dataset and can exceed 100 MB.

As you can see, each function returns a tuple, which we unpack into two variables, such as train_x and train_y. train_x holds the images and train_y holds the digit shown in each image. The neural network will train on the data in train_x to predict the corresponding values in train_y.
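
To make the shape of these return values concrete, the following sketch mimics the unpacking with a stand-in function. Note that fake_traindata is a hypothetical helper and not part of MLDatasets; it only returns random arrays laid out the way this section indexes the data (28x28 images stacked along the third dimension, with one integer label per image), so it runs without downloading anything:

```julia
# Hypothetical stand-in for MNIST.traindata(); NOT part of MLDatasets.
# It returns a tuple (images, labels): a 28x28xN array of pixel values
# and a vector of N digit labels.
function fake_traindata(n::Int = 100)
    images = rand(Float64, 28, 28, n)   # pixel intensities in [0, 1)
    labels = rand(0:9, n)               # one digit label per image
    return images, labels
end

# The returned tuple is unpacked into two variables.
train_x, train_y = fake_traindata()

println(size(train_x))            # (28, 28, 100)
println(size(train_y))            # (100,)
println(size(train_x[:, :, 1]))   # a single image: (28, 28)
```

With the real dataset, the same size calls are a quick way to confirm that the number of images matches the number of labels.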

Next, we will create a preview of the first 10 images from the training dataset. Looking at the raw data sometimes helps to identify issues with it. This is shown in the following code:

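# Join the first 10 images side by side into one wide matrix.
# Using reduce avoids reassigning a global variable inside a loop,
# which fails at the top level of a script on Julia 1.0 and later.
preview_img = reduce(hcat, [train_x[:, :, i] for i in 1:10])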

imshow(Gray.(preview_img))

We used hcat to join the images together and imshow to preview the result.
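
The concatenation behind the preview can be tried on its own, without MNIST or the image packages. The following is a minimal sketch on synthetic data; the images array is an assumption standing in for train_x, and the names preview and preview_alt are purely illustrative:

```julia
# Synthetic stand-in for the image stack (assumption: 28x28 images
# stacked along the third dimension, as train_x is indexed in this section).
images = rand(Float64, 28, 28, 10)

# Start from an empty 28x0 matrix and append one image at a time.
preview = zeros(size(images, 1), 0)
for i in 1:10
    # `global` is needed when this loop runs at the top level of a script.
    global preview = hcat(preview, images[:, :, i])
end

println(size(preview))   # (28, 280)

# The same result in one step, without a loop.
preview_alt = reduce(hcat, [images[:, :, i] for i in 1:10])
println(preview == preview_alt)   # true
```

Each image contributes 28 columns, so ten images produce a matrix that is 28 rows high and 280 columns wide.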

To see the corresponding value for each image, we print the first 10 values from train_y. This is shown as follows:

train_y[1:10]
Main> 5, 0, 4, 1, 9, 2, 1, 3, 1, 4
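
Besides looking at the images, counting how often each digit occurs is another simple way to spot problems in the data. The following sketch does this for the ten labels printed above; the variable names are illustrative, and with the real dataset you would pass train_y instead of the hard-coded vector:

```julia
# The first ten labels, copied from the output in this section.
labels = [5, 0, 4, 1, 9, 2, 1, 3, 1, 4]

# Count how many times each digit occurs.
label_counts = Dict{Int,Int}()
for label in labels
    label_counts[label] = get(label_counts, label, 0) + 1
end

# Print the digits in ascending order together with their counts.
for digit in sort(collect(keys(label_counts)))
    println(digit, " => ", label_counts[digit])
end
```

A sample of ten labels is too small to say anything about class balance, but the same loop applied to the full label vector would show whether any digit is heavily over- or under-represented.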

Now that we have seen the data, it is time to move on to creating our first neural network.