One of the most important and widely used methods for hyperparameter tuning is the exhaustive grid search. It is a brute-force approach because it tries every combination of hyperparameters from a grid of parameter values. Each combination is then evaluated using k-fold cross-validation and any other specified metrics, and the combination that gives the best score is the one returned by the GridSearchCV object that we use in scikit-learn.
Let's take an example of a hyperparameter grid. Here, we try three values, 10, 30, and 50, for the n_estimators hyperparameter; two options, auto and square root, for max_features; and four values, 5, 10, 20, and 30, for max_depth. So, in this case, we have 3 x 2 x 4 = 24 hyperparameter combinations, and each of them will be evaluated. If we use tenfold cross-validation for every one of these 24 combinations, the computer will be training and evaluating 24 x 10 = 240 models, as sketched below. The biggest shortcoming of grid search is the curse of dimensionality, which will be covered in the coming chapters: in this context it means that the number of model evaluations you have to run grows exponentially with the number of hyperparameters you tune.
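A minimal sketch of this grid in code is shown below. The original text does not name a specific estimator or dataset, so a RandomForestClassifier on the breast cancer dataset is assumed here for illustration; also note that recent scikit-learn versions removed the 'auto' option for max_features, so 'sqrt' and 'log2' are used instead to keep the example runnable:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

# 3 x 2 x 4 = 24 hyperparameter combinations
param_grid = {
    'n_estimators': [10, 30, 50],
    'max_features': ['sqrt', 'log2'],  # 'auto' has been removed in recent scikit-learn releases
    'max_depth': [5, 10, 20, 30],
}

# 24 combinations x 10 folds = 240 model fits
grid_search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=42),
    param_grid=param_grid,
    cv=10,
    scoring='accuracy',
)
grid_search.fit(X, y)

print(grid_search.best_params_)   # best combination found
print(grid_search.best_score_)    # its mean cross-validated score
```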
If you do not want certain combinations of hyperparameters to be tested, you can pass several grids to the GridSearchCV object. Because every grid in scikit-learn is a dictionary, the different grids are passed as a list of dictionaries, and only the combinations within each dictionary are evaluated, as shown below.
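A short sketch of this list-of-dictionaries form follows; the hyperparameter values are illustrative, not taken from the original text:

```python
# Two separate grids: combinations are only formed within each dictionary,
# so bootstrap=False is never paired with max_depth in this example.
param_grid = [
    {'n_estimators': [10, 30, 50], 'max_features': ['sqrt', 'log2'],
     'max_depth': [5, 10, 20, 30]},
    {'bootstrap': [False], 'n_estimators': [10, 30], 'max_features': ['sqrt']},
]

grid_search = GridSearchCV(RandomForestClassifier(random_state=42),
                           param_grid=param_grid, cv=10)
```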
So, we perform a train-test split and use the training portion of the dataset to learn the hyperparameters of our model; the part that we held out for testing should be used only for the final model evaluation, and later we can fit the model with the chosen hyperparameters on the whole dataset.
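A minimal sketch of that workflow, under the same assumed estimator and dataset as above:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Hold out a test set that the grid search never sees
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

param_grid = {'n_estimators': [10, 30, 50], 'max_depth': [5, 10, 20, 30]}
grid_search = GridSearchCV(RandomForestClassifier(random_state=42),
                           param_grid=param_grid, cv=10)

# Learn the hyperparameters on the training portion only
grid_search.fit(X_train, y_train)

# Final evaluation on the held-out test set
print(grid_search.score(X_test, y_test))

# Refit a model with the chosen hyperparameters on the whole dataset
final_model = RandomForestClassifier(**grid_search.best_params_, random_state=42)
final_model.fit(X, y)
```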