Notational Conventions for Probabilities
1 Machine Learning for Predictive Data Analytics
1.1 What Is Predictive Data Analytics?
1.3 How Does Machine Learning Work?
1.4 What Can Go Wrong with Machine Learning?
1.5 The Predictive Data Analytics Project Lifecycle: CRISP-DM
1.6 Predictive Data Analytics Tools
2 Data to Insights to Decisions
2.1 Converting Business Problems into Analytics Solutions
2.1.1 Case Study: Motor Insurance Fraud
2.2.1 Case Study: Motor Insurance Fraud
2.3 Designing the Analytics Base Table
2.3.1 Case Study: Motor Insurance Fraud
2.4 Designing and Implementing Features
2.4.2 Different Types of Features
2.4.6 Case Study: Motor Insurance Fraud
3.1.1 Case Study: Motor Insurance Fraud
3.2.2 Case Study: Motor Insurance Fraud
3.3 Identifying Data Quality Issues
3.3.4 Case Study: Motor Insurance Fraud
3.4 Handling Data Quality Issues
3.4.3 Case Study: Motor Insurance Fraud
3.5.1 Visualizing Relationships Between Features
3.5.2 Measuring Covariance and Correlation
4.3 Standard Approach: The ID3 Algorithm
4.3.1 A Worked Example: Predicting Vegetation Distributions
4.4.1 Alternative Feature Selection and Impurity Metrics
4.4.2 Handling Continuous Descriptive Features
4.4.3 Predicting Continuous Targets
5.2.2 Measuring Similarity Using Distance Metrics
5.3 Standard Approach: The Nearest Neighbor Algorithm
5.4.4 Predicting Continuous Targets
5.4.5 Other Measures of Similarity
6.2.3 Conditional Independence and Factorization
6.3 Standard Approach: The Naive Bayes Model
6.4.2 Continuous Features: Probability Density Functions
6.4.3 Continuous Features: Binning
7.2.1 Simple Linear Regression
7.3 Standard Approach: Multivariable Linear Regression with Gradient Descent
7.3.1 Multivariable Linear Regression
7.3.3 Choosing Learning Rates and Initial Weights
7.4.1 Interpreting Multivariable Linear Regression Models
7.4.2 Setting the Learning Rate Using Weight Decay
7.4.3 Handling Categorical Descriptive Features
7.4.4 Handling Categorical Target Features: Logistic Regression
7.4.5 Modeling Non-linear Relationships
7.4.6 Multinomial Logistic Regression
8.3 Standard Approach: Misclassification Rate on a Hold-out Test Set
8.4.1 Designing Evaluation Experiments
8.4.2 Performance Measures: Categorical Targets
8.4.3 Performance Measures: Prediction Scores
8.4.4 Performance Measures: Multinomial Targets
8.4.5 Performance Measures: Continuous Targets
8.4.6 Evaluating Models after Deployment
10 Case Study: Galaxy Classification
11 The Art of Machine Learning for Predictive Data Analytics
11.1 Different Perspectives on Prediction Models
11.2 Choosing a Machine Learning Approach
11.2.1 Matching Machine Learning Approaches to Projects
11.2.2 Matching Machine Learning Approaches to Data
A Descriptive Statistics and Data Visualization for Machine Learning
A.1 Descriptive Statistics for Continuous Features
A.2 Descriptive Statistics for Categorical Features
B Introduction to Probability for Machine Learning
B.2 Probability Distributions and Summing Out
B.3 Some Useful Probability Rules
C Differentiation Techniques for Machine Learning