OpenCV machine learning algorithms

OpenCV implements eight of these machine learning algorithms, all of which inherit from the StatModel class:

Version 3 supports deep learning at a basic level, but version 4 is more stable and better supported. We will delve into deep learning in detail in later chapters.

To get more information about each algorithm, read the OpenCV documentation page for machine learning at http://docs.opencv.org/trunk/dc/dd6/ml_intro.html.

The following diagram shows the machine learning class hierarchy:

The StatModel class is the base class for all machine learning algorithms. It provides the prediction function along with the read and write functions that we need to save and load our machine learning parameters and training data.

In machine learning, training is the most time-consuming and resource-intensive step. It can take from seconds to weeks or months for large datasets and complex model structures. For example, in deep learning, training a big neural network on a dataset of more than 100,000 images can take a long time. With deep learning algorithms, it is common to use parallel hardware such as GPUs with CUDA technology, or newer dedicated chips such as Intel Movidius, to reduce the computing time.

This means that we cannot train our algorithm each time we run our application, so it is recommended to save the trained model with all of its learned parameters. In future executions, we only have to load the saved model, without training again, unless we need to update the model with more sample data.

StatModel is the base class of all machine learning classes, such as SVM or ANN, except deep learning methods. StatModel is basically an abstract base class that defines the two most important functions: train and predict. The train method is responsible for learning the model parameters from a training dataset. It can be called in the following three ways:

bool train(const Ptr<TrainData>& trainData, int flags=0);
bool train(InputArray samples, int layout, InputArray responses);
template<typename _Tp> static Ptr<_Tp> train(const Ptr<TrainData>& data, int flags=0);

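The train function has the following parameters: trainData (or data) is the training data, loaded or created through the TrainData class; samples is an array of training samples; layout specifies whether each sample is stored as a row (ROW_SAMPLE) or as a column (COL_SAMPLE); responses is the vector of responses associated with the samples; and flags holds optional flags defined by each method.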

The last train method is a static template that creates and trains a model of the _Tp class type. The only classes accepted are those that implement a static create method with no parameters or with all default parameter values.

The predict method is much simpler and has only one possible call:

float StatModel::predict(InputArray samples, OutputArray results=noArray(), int flags=0) const

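The predict function has the following parameters: samples is the input samples to predict, stored as a floating-point matrix; results is an optional output matrix that receives the result for each input sample; and flags holds optional, model-dependent flags, such as StatModel::RAW_OUTPUT, which makes the method return the raw result rather than the class label.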

The StatModel class provides an interface for other very useful methods:

Next, we will look at how a basic computer vision application that uses machine learning is constructed.