The goal of the optimizer is to minimize the loss, and it does this by adjusting the weights in all the layers of our network. The optimizer used here is Adam, with a learning rate of 0.001.
The following screenshot shows the lines of code that define the optimizer and the training operation:
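Since the screenshot itself is not reproduced here, the following is a minimal sketch of what such a definition typically looks like, assuming the TensorFlow 1.x graph API; the placeholder network and the names `inputs`, `targets`, `loss`, and `training_op` are illustrative stand-ins, not the book's actual variables:

```python
import tensorflow as tf

# Stand-in for the network built earlier in the chapter: a single
# linear layer and a mean-squared-error loss, purely for illustration.
inputs = tf.placeholder(tf.float32, [None, 10])
targets = tf.placeholder(tf.float32, [None, 2])
weights = tf.Variable(tf.random_normal([10, 2]))
predictions = tf.matmul(inputs, weights)
loss = tf.reduce_mean(tf.square(predictions - targets))

# Adam optimizer with the learning rate quoted above (0.001).
optimizer = tf.train.AdamOptimizer(learning_rate=0.001)

# The training operation: each time it is run in a session, it computes
# gradients of the loss and applies one update to all trainable variables.
training_op = optimizer.minimize(loss)
```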
The following screenshot shows some of the NumPy arrays that we created for evaluation purposes:
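As a rough sketch of what such evaluation arrays might look like, the snippet below builds hypothetical held-out features and targets with random data; the real arrays, their names, and their shapes would come from the dataset prepared earlier in the chapter:

```python
import numpy as np

# Hypothetical stand-ins for the held-out evaluation split.
x_valid = np.random.rand(100, 10).astype(np.float32)            # 100 samples, 10 features
y_valid = np.random.rand(100, 2).astype(np.float32)             # matching targets

# At evaluation time, arrays like these are fed into the graph, e.g.:
# sess.run(loss, feed_dict={inputs: x_valid, targets: y_valid})
```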