- Machine learning has three main approaches: supervised, unsupervised, and semi-supervised learning.
- Supervised learning problems can be further grouped into regression and classification problems. In a classification problem the output variable is a category, and in a regression problem the output variable is a real value. For example, if we predict whether it will rain in some region and assign one of two labels (rain/no rain), this is a classification problem. On the other hand, if the output of our model is the probability of rain, this is a regression problem.
- OpenCV provides the cv2.kmeans() function, implementing a k-means clustering algorithm, which finds centers of clusters and groups input samples around the clusters. k-means is one of the most important clustering algorithms available for unsupervised learning.
- The cv2.ml.KNearest_create() method creates an empty k-NN classifier, which should be trained using the train() method, providing both the data and the labels.
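- The findNearest() method of the k-NN classifier is used to find the neighbors. It is called on the classifier object (for example, knn.findNearest(), where knn is the object returned by cv2.ml.KNearest_create()), not on the cv2 module itself.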
- To create an empty SVM model, the cv2.ml.SVM_create() function is used.
- In general, the RBF kernel is a reasonable first choice. The RBF kernel non-linearly maps samples into a higher-dimensional space, so, unlike the linear kernel, it can handle the case where the relation between class labels and attributes is non-linear. See A Practical Guide to Support Vector Classification (2003) for further details.