15 Machine Learning: Classification, Regression and Clustering

Objectives

In this chapter you’ll:

  • Use scikit-learn with popular datasets to perform machine learning studies.

  • Use Seaborn and Matplotlib to visualize and explore data.

  • Perform supervised machine learning with k-nearest neighbors classification and linear regression.

  • Perform multi-classification with the Digits dataset.

  • Divide a dataset into training, test and validation sets.

  • Tune model hyperparameters with k-fold cross-validation.

  • Measure model performance.

  • Display a confusion matrix showing classification prediction hits and misses.

  • Perform multiple linear regression with the California Housing dataset.

  • Perform dimensionality reduction with PCA and t-SNE on the Iris and Digits datasets to prepare them for two-dimensional visualizations.

  • Perform unsupervised machine learning with k-means clustering and the Iris dataset.