Hyperparameters are among the building blocks of a deep learning network. They determine the architecture of the network (for example, the number of layers) and also control how the network is trained.
The following are the main hyperparameters of a deep learning network (a short code sketch after the list shows where each one appears in a typical training setup):
- Learning rate: This determines the pace at which the network is trained. A small learning rate tends to give slow but smooth convergence, whereas a large learning rate speeds up training but may overshoot and fail to converge smoothly.
- Epochs: The number of epochs is the number of times the entire training dataset is passed through the network during training.
- Number of hidden layers: This determines the depth of the model and therefore its capacity; the goal is to find a depth that gives the model enough capacity without making it unnecessarily large.
- Number of nodes (neurons): Choosing the number of nodes per layer is a trade-off. Too few nodes and the network may not extract all of the information needed to produce the required output (underfitting); too many and it may overfit. Hence, it is advisable to tune this value together with regularization.
- Dropout: Dropout is a regularization technique used to improve generalization by reducing overfitting. This was discussed in detail in Chapter 4, Training Neural Networks. Typical dropout rates lie between 0.2 and 0.5.
- Momentum: This accumulates a fraction of past gradients so that each update keeps moving in a consistent direction toward convergence. A value between 0.6 and 0.9 typically dampens oscillation.
- Batch size: This is the number of samples fed into the network before each parameter update. Typical values are 32, 64, 128, or 256.
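
As a rough illustration, the following is a minimal sketch in Keras showing where each of these hyperparameters appears in a typical training setup. The input dimension, output layer, and dataset are placeholders, and the specific values are only examples, not recommendations from the text.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import SGD

# Hyperparameters discussed above (example values)
learning_rate = 0.01     # pace of training
momentum = 0.9           # dampens oscillation (0.6-0.9)
dropout_rate = 0.3       # regularization (0.2-0.5)
hidden_layers = 2        # number of hidden layers
nodes_per_layer = 64     # number of nodes (neurons) per layer
batch_size = 32          # samples per parameter update
epochs = 20              # full passes over the training data

# Assumed task: binary classification with 100 input features
model = Sequential()
model.add(Dense(nodes_per_layer, activation='relu', input_shape=(100,)))
model.add(Dropout(dropout_rate))
for _ in range(hidden_layers - 1):
    model.add(Dense(nodes_per_layer, activation='relu'))
    model.add(Dropout(dropout_rate))
model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer=SGD(learning_rate=learning_rate, momentum=momentum),
              loss='binary_crossentropy', metrics=['accuracy'])

# x_train and y_train are assumed to be available:
# model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs)
```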
To find the optimal values of these hyperparameters, it is prudent to deploy a grid search or a random search.
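
A minimal random-search sketch is shown below; `build_and_train_model` is a hypothetical helper that builds and trains a model with the given settings (for example, as in the previous sketch) and returns a validation score.

```python
import random

# Candidate values for each hyperparameter (example search space)
search_space = {
    'learning_rate': [0.1, 0.01, 0.001],
    'dropout_rate': [0.2, 0.3, 0.5],
    'batch_size': [32, 64, 128, 256],
    'nodes_per_layer': [32, 64, 128],
}

best_score, best_params = -1.0, None
for _ in range(10):  # number of random trials
    # Sample one value per hyperparameter
    params = {name: random.choice(values) for name, values in search_space.items()}
    score = build_and_train_model(**params)  # assumed to return validation accuracy
    if score > best_score:
        best_score, best_params = score, params

print('Best hyperparameters:', best_params, 'validation accuracy:', best_score)
```

A grid search would instead evaluate every combination in `search_space`, which is exhaustive but quickly becomes expensive as the number of hyperparameters grows; random search samples only a fixed number of combinations.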