Defining optimizer and training operations

The goal of the optimizer is to minimize the loss, which it does by adjusting the weights in all the layers of our network. The optimizer used here is gradient descent with a learning rate of 0.01. The following screenshot shows the lines of code used to define the optimizer and the training operation.
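Since the screenshot itself isn't reproduced here, the update rule the optimizer applies can be sketched in plain Python. This is an illustration of the gradient-descent step, not the code from the screenshot; the function and variable names are assumptions.

```python
LEARNING_RATE = 0.01  # same learning rate as in the text

def gradient_descent_step(weights, gradients):
    """Apply one gradient-descent update: w <- w - lr * dL/dw."""
    return [w - LEARNING_RATE * g for w, g in zip(weights, gradients)]

# Example: the loss L(w) = (w - 3)^2 has gradient dL/dw = 2 * (w - 3).
w = [0.0]
grad = [2 * (w[0] - 3.0)]            # gradient at w = 0 is -6
w = gradient_descent_step(w, grad)
print(w[0])                          # 0.06: a small step toward the minimum at w = 3
```

Each call moves every weight a small distance in the direction that decreases the loss, with the learning rate controlling the step size.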

Each time we run the training operation, training_op, the optimizer nudges the weight values slightly in the direction that reduces the loss, bringing the predictions closer to the actual values.
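The effect of running the training operation repeatedly can be sketched with a toy one-weight model (the model, data, and names below are illustrative assumptions, not the network from the text): each iteration plays the role of one run of training_op, and the loss shrinks toward zero.

```python
LEARNING_RATE = 0.01

# Toy model: predict y from x with a single weight w, loss = (w*x - y)^2.
x, y = 2.0, 4.0    # target relationship is y = 2 * x
w = 0.0

def loss(w):
    return (w * x - y) ** 2

initial_loss = loss(w)
for _ in range(500):
    grad = 2 * (w * x - y) * x   # dL/dw for the squared-error loss
    w -= LEARNING_RATE * grad    # one "training step", as training_op would do

final_loss = loss(w)
print(initial_loss, final_loss)  # the loss shrinks with each step; w approaches 2
```

Each iteration makes only a small correction, but repeated over many steps the weight converges to the value that makes the prediction match the target.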