Training different regression models

The following screenshot shows a dataframe where we are going to save performance. We are going to run four models, namely logistic regression, bagging, random forest, and boosting:

We are going to use the following evaluation metrics in this case:

The most important of these is the recall metric. The reason behind this is that we want to maximize the proportion of actual defaulters that the model identifies, and so the model with the best recall is selected.