Chapter 10

In the context of machine learning, there are three main approaches: supervised, unsupervised, and semi-supervised learning.
Supervised learning problems can be further grouped into regression and classification problems. A problem is a classification problem when the output variable is a category, and a regression problem when the output variable is a real value. For example, if we predict the possibility of rain in some regions and assign one of two labels (rain/no rain), this is a classification problem. On the other hand, if the output of our model is the probability associated with the rain, this is a regression problem.
OpenCV provides the cv2.kmeans() function, implementing a k-means clustering algorithm, which finds centers of clusters and groups input samples around the clusters. k-means is one of the most important clustering algorithms available for unsupervised learning.
The cv2.ml.KNearest_create() method creates an empty k-NN classifier, which should be trained using the train() method, providing both the data and the labels.
The findNearest() method of the k-NN classifier (for example, knn.findNearest(sample, k)) is used to find the k nearest neighbors.
To create an empty SVM model, the cv2.ml.SVM_create() function is used.
In general, the RBF kernel is a reasonable first choice. The RBF kernel non-linearly maps samples into a higher-dimensional space, so, unlike the linear kernel, it can handle cases where the relation between class labels and attributes is non-linear. See A Practical Guide to Support Vector Classification (2003) for further details.