Backpropagation

Backpropagation takes place once the feedforward pass is complete. The name is short for "backward propagation of errors": in a neural network, this step computes the gradient of the error (loss) function with respect to the weights. One might wonder why the word "backward" is attached to it. It is because the gradient computation proceeds backwards through the network: the gradient of the final layer's weights is calculated first, and the gradient of the first layer's weights is calculated last.
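To make the backward ordering concrete, here is a minimal sketch in NumPy of one feedforward pass and one backward pass through a two-layer network. The layer sizes, the sigmoid hidden layer, the squared-error loss, and all variable names are illustrative assumptions, not taken from the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 1))        # input vector (shape chosen for illustration)
y = np.array([[1.0]])              # target output

W1 = rng.normal(size=(4, 3))       # first-layer weights
W2 = rng.normal(size=(1, 4))       # final-layer weights

# Feedforward pass: intermediate activations are stored for reuse below.
h = sigmoid(W1 @ x)                # hidden activations
y_hat = W2 @ h                     # network output (linear output layer)
loss = 0.5 * (y_hat - y) ** 2      # squared-error loss

# Backward pass: the gradient of the FINAL layer's weights is computed first ...
dL_dy_hat = y_hat - y              # derivative of the loss w.r.t. the output
dL_dW2 = dL_dy_hat @ h.T           # gradient for the last layer's weights

# ... and only then propagated back to the FIRST layer via the chain rule.
dL_dh = W2.T @ dL_dy_hat           # loss gradient w.r.t. hidden activations
dL_dW1 = (dL_dh * h * (1 - h)) @ x.T   # sigmoid'(z) = h * (1 - h)

print(dL_dW2.shape, dL_dW1.shape)  # (1, 4) and (4, 3), matching W2 and W1
```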

Backpropagation needs three elements:

- a dataset of input-output pairs on which the network is trained,
- a feedforward neural network whose parameters are collectively denoted θ, and
- an error (loss) function E(X, θ) measuring the mismatch between the network's output and the desired output.

Training a neural network with gradient descent requires calculating the gradient of the error (loss) function E(X, θ) with respect to the weights and biases. Then, according to the learning rate α, each iteration of gradient descent updates the weights and biases (collectively denoted θ) as follows:

$$\theta^{t+1} = \theta^{t} - \alpha \, \frac{\partial E(X, \theta^{t})}{\partial \theta}$$

Here θ^t denotes the parameters of the neural network at iteration t of gradient descent.
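As a quick illustration of this update rule, the sketch below applies one gradient-descent step to a list of parameter arrays. The function name gradient_step and the example shapes are hypothetical; in practice the gradients would come from a backward pass like the one sketched earlier.

```python
import numpy as np

def gradient_step(params, grads, alpha):
    """Apply theta <- theta - alpha * dE/dtheta to every parameter array."""
    return [theta - alpha * g for theta, g in zip(params, grads)]

alpha = 0.1                         # learning rate α from the update rule

# Illustrative parameters and gradients (the shapes are arbitrary here).
rng = np.random.default_rng(1)
params = [rng.normal(size=(4, 3)), rng.normal(size=(1, 4))]
grads = [rng.normal(size=(4, 3)), rng.normal(size=(1, 4))]

params = gradient_step(params, grads, alpha)   # one iteration: t -> t + 1
```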