Part II
Data Mining Practicalities

  3 All about data
    3.1 Some Basics
    3.2 Data Partition: Random Samples for Training, Testing and Validation
    3.3 Types of Business Information Systems
    3.4 Data Warehouses
    3.5 Three Components of a Data Warehouse: DBMS, DB and DBCS
    3.6 Data Marts
    3.7 A Typical Example from the Online Marketing Area
    3.8 Unique Data Marts
    3.9 Data Mart: Do’s and Don’ts
  4 Data Preparation
    4.1 Necessity of Data Preparation
    4.2 From Small and Long to Short and Wide
    4.3 Transformation of Variables
    4.4 Missing Data and Imputation Strategies
    4.5 Outliers
    4.6 Dealing with the Vagaries of Data
    4.7 Adjusting the Data Distributions
    4.8 Binning
    4.9 Timing Considerations
    4.10 Operational Issues
  5 Analytics
    5.1 Introduction
    5.2 Basis of Statistical Tests
    5.3 Sampling
    5.4 Basic Statistics for Pre-analytics
    5.5 Feature Selection/Reduction of Variables
    5.6 Time Series Analysis
  6 Methods
    6.1 Methods Overview
    6.2 Supervised Learning
    6.3 Multiple Linear Regression for use when Target is Continuous
    6.4 Regression when the Target is not Continuous
    6.5 Decision Trees
    6.6 Neural Networks
    6.7 Which Method Produces the Best Model? A Comparison of Regression, Decision Trees and Neural Networks
    6.8 Unsupervised Learning
    6.9 Cluster Analysis
    6.10 Kohonen Networks and Self-Organising Maps
    6.11 Group Purchase Methods: Association and Sequence Analysis
  7 Validation and Application
    7.1 Introduction to Methods for Validation
    7.2 Lift and Gain Charts
    7.3 Model Stability
    7.4 Sensitivity Analysis
    7.5 Threshold Analytics and Confusion Matrix
    7.6 ROC Curves
    7.7 Cross-Validation and Robustness
    7.8 Model Complexity