Preparing the data

What are the requirements for our first neural network? It should accept our images as input, do some magic, and return the probability that the image belongs to each of 10 classes, one for every digit from 0 to 9.

Unfortunately, MXNet does not accept images and labels in the format returned by the traindata and testdata functions, so both must be converted into arrays managed by MXNet.

From the MNIST dataset, we know that train_x and train_y hold 60,000 images and their labels. We first split them into a training set of 50,000 samples and a validation set of 10,000 samples. The validation set is used to monitor the performance of the neural network after every round of training:

train_length = 50000
validation_length = 10000

In the preceding code example, we defined two variables, train_length and validation_length, that hold the sizes of the two datasets.
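
As a quick sanity check, the split can be expressed as two index ranges along the sample dimension. The following is a minimal sketch in plain Julia; total_samples, train_range, and validation_range are illustrative names that do not appear elsewhere in this chapter:

```julia
# Hypothetical stand-in for the number of samples in train_x (60,000 for MNIST).
total_samples = 60_000

train_length = 50_000
validation_length = 10_000

# The two parts should cover the dataset exactly once.
@assert train_length + validation_length == total_samples

# Index ranges along the sample dimension.
train_range = 1:train_length
validation_range = (train_length + 1):(train_length + validation_length)

println(first(validation_range), " to ", last(validation_range))  # prints 50001 to 60000
```

The validation samples are simply the last 10,000 entries of the original training data.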

The next step is to convert the training data into a format supported by MXNet. We do this by first preallocating arrays of MXNet's own array type (NDArray), as shown in the following code:

using MXNet

# Preallocate MXNet arrays: image height x image width x number of samples
train_data_array = mx.zeros((size(train_x)[1:2]..., train_length));
train_label_array = mx.zeros(train_length);

validation_data_array = mx.zeros((size(train_x)[1:2]..., validation_length));
validation_label_array = mx.zeros(validation_length);
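
The shape passed to mx.zeros is built by taking the first two dimensions of train_x and appending the number of samples. A minimal sketch of that tuple construction in plain Julia, using a small stand-in array (train_x_stub and dims are hypothetical names, and the 28 x 28 image size is the standard MNIST size rather than something read from the data here):

```julia
# Stand-in for train_x: 5 images of 28 x 28 pixels.
train_x_stub = zeros(Float32, 28, 28, 5)

# Take height and width from the data, then append the desired sample count.
dims = (size(train_x_stub)[1:2]..., 3)

println(dims)  # prints (28, 28, 3)
```

Reading the image size from the data rather than hardcoding it keeps the preallocation correct if the input resolution changes.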

Next, we copy the data from train_x and train_y into the newly created arrays. As we saw in the preview, the dataset is not ordered by label, so we don't need to shuffle it before building the new arrays. This is shown in the following code:

for idx = 1:train_length
    # idx:idx selects the slot for the idx-th sample in the MXNet array
    train_data_array[idx:idx] = reshape(train_x[:, :, idx],
                                        (size(train_x)[1:2]..., 1))
    train_label_array[idx:idx] = train_y[idx]
end

for idx = 1:validation_length
    # Validation samples start right after the training samples
    validation_data_array[idx:idx] = reshape(train_x[:, :, train_length + idx],
                                             (size(train_x)[1:2]..., 1))
    validation_label_array[idx:idx] = train_y[train_length + idx]
end
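
To make the indexing in the two loops easier to verify, here is a minimal sketch of the same split on plain Julia arrays with synthetic data. It does not use MXNet, and every name in it (images, labels, n_train, n_val, and so on) is made up for illustration:

```julia
# Synthetic stand-ins: 6 "images" of 4 x 4 pixels and their labels.
images = rand(Float32, 4, 4, 6)
labels = Float32[3, 1, 4, 1, 5, 9]

n_train, n_val = 4, 2

# Copy the first n_train samples into the training arrays.
train_data = zeros(Float32, 4, 4, n_train)
train_labels = zeros(Float32, n_train)
for idx = 1:n_train
    train_data[:, :, idx] = images[:, :, idx]
    train_labels[idx] = labels[idx]
end

# Copy the following n_val samples into the validation arrays.
val_data = zeros(Float32, 4, 4, n_val)
val_labels = zeros(Float32, n_val)
for idx = 1:n_val
    val_data[:, :, idx] = images[:, :, n_train + idx]
    val_labels[idx] = labels[n_train + idx]
end

# The loops are equivalent to slicing with contiguous ranges.
@assert train_data == images[:, :, 1:n_train]
@assert val_data == images[:, :, (n_train + 1):(n_train + n_val)]
```

The offset n_train + idx plays the same role as train_length + idx in the validation loop: it skips the samples already assigned to the training set.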

The last step is to create an MXNet array data provider, an object that connects the data with its labels. It also provides an option to set the batch size.

The batch size defines the number of samples that are propagated through the network in a single pass. Usually, the batch size is set to as many samples as fit into your machine's memory during processing. Because the images in the MNIST dataset are small, we set it to 1,000. In the bigger networks that we will work with later in this book, the batch size might be reduced to 100, 24, 10, or even 1. This is shown in the following code:

train_data_provider = mx.ArrayDataProvider(:data => train_data_array, :label => train_label_array, batch_size = 1000);
validation_data_provider = mx.ArrayDataProvider(:data => validation_data_array, :label => validation_label_array, batch_size = 1000);
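
The batch size also determines how many batches make up one pass over each dataset. The following is a small arithmetic sketch in plain Julia. It only counts batches and says nothing about how the data provider treats a final, partially filled batch; the variable names are illustrative:

```julia
train_length = 50_000
validation_length = 10_000
batch_size = 1_000

# cld is ceiling division: the number of batches needed to cover all samples.
train_batches = cld(train_length, batch_size)
validation_batches = cld(validation_length, batch_size)
println(train_batches, " ", validation_batches)  # prints 50 10

# With a batch size that does not divide the dataset evenly,
# the last batch is only partially filled.
small_batch_size = 24
small_batches = cld(train_length, small_batch_size)
leftover = rem(train_length, small_batch_size)
println(small_batches, " ", leftover)  # prints 2084 8
```

Smaller batches mean more steps per pass over the data, which is part of the trade-off when memory forces the batch size down.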