Practical Machine Learning by Witten, Ian H.

Practical Machine Learning

Table of Contents Practical Machine Learning Credits Foreword About the Author Acknowledgments About the Reviewers www.PacktPub.com

Support files, eBooks, discount offers, and more

Why subscribe? Free access for Packt account holders

Preface

What this book covers What you need for this book Who this book is for Conventions Reader feedback Customer support

Downloading the example code Downloading the color images of this book Errata Piracy Questions

1. Introduction to Machine learning

Machine learning

Definition Core Concepts and Terminology What is learning?

Data Labeled and unlabeled data Tasks Algorithms Models

Logical models Geometric models Probabilistic models

Data and inconsistencies in Machine learning

Under-fitting Over-fitting Data instability Unpredictable data formats

Practical Machine learning examples Types of learning problems

Classification Clustering Forecasting, prediction or regression Simulation Optimization Supervised learning Unsupervised learning Semi-supervised learning Reinforcement learning Deep learning

Performance measures

Is the solution good?

Mean squared error (MSE) Mean absolute error (MAE) Normalized MSE and MAE (NMSE and NMAE) Solving the errors: bias and variance

Some complementing fields of Machine learning

Data mining Artificial intelligence (AI) Statistical learning Data science

Machine learning process lifecycle and solution architecture Machine learning algorithms

Decision tree based algorithms Bayesian method based algorithms Kernel method based algorithms Clustering methods Artificial neural networks (ANN) Dimensionality reduction Ensemble methods Instance based learning algorithms Regression analysis based algorithms Association rule based learning algorithms

Machine learning tools and frameworks Summary

2. Machine learning and Large-scale datasets

Big data and the context of large-scale Machine learning

Functional versus Structural – A methodological mismatch

Commoditizing information Theoretical limitations of RDBMS Scaling-up versus Scaling-out storage Distributed and parallel computing strategies

Machine learning: Scalability and Performance

Too many data points or instances Too many attributes or features Shrinking response time windows – need for real-time responses Highly complex algorithm Feed forward, iterative prediction cycles

Model selection process Potential issues in large-scale Machine learning

Algorithms and Concurrency

Developing concurrent algorithms

Technology and implementation options for scaling-up Machine learning

MapReduce programming paradigm High Performance Computing (HPC) with Message Passing Interface (MPI) Language Integrated Queries (LINQ) framework Manipulating datasets with LINQ Graphics Processing Unit (GPU) Field Programmable Gate Array (FPGA) Multicore or multiprocessor systems

Summary

3. An Introduction to Hadoop's Architecture and Ecosystem

Introduction to Apache Hadoop

Evolution of Hadoop (the platform of choice) Hadoop and its core elements

Machine learning solution architecture for big data (employing Hadoop)

The Data Source layer The Ingestion layer The Hadoop Storage layer The Hadoop (Physical) Infrastructure layer – supporting appliance Hadoop platform / Processing layer The Analytics layer The Consumption layer

Explaining and exploring data with Visualizations Security and Monitoring layer Hadoop core components framework

Hadoop Distributed File System (HDFS)

Secondary Namenode and Checkpoint process Splitting large data files Block loading to the cluster and replication

Writing to and reading from HDFS Handling failures HDFS command line RESTful HDFS

MapReduce

MapReduce architecture What makes MapReduce cater to the needs of large datasets? MapReduce execution flow and components Developing MapReduce components

InputFormat OutputFormat Mapper implementation

Hadoop 2.x

Hadoop ecosystem components Hadoop installation and setup

Installing JDK 1.7 Creating a system user for Hadoop (dedicated) Disable IPv6 Steps for installing Hadoop 2.6.0 Starting Hadoop

Hadoop distributions and vendors

Summary

4. Machine Learning Tools, Libraries, and Frameworks

Machine learning tools – A landscape Apache Mahout

How does Mahout work? Installing and setting up Apache Mahout

Setting up Maven Setting up Apache Mahout using Eclipse IDE Setting up Apache Mahout without Eclipse

Mahout Packages Implementing vectors in Mahout

Installing and setting up R Integrating R with Apache Hadoop

Approach 1 – Using R and Streaming APIs in Hadoop Approach 2 – Using the Rhipe package of R Approach 3 – Using RHadoop Summary of R/Hadoop integration approaches Implementing in R (using examples)

R Expressions

Assignments Functions

R Vectors

Assigning, accessing, and manipulating vectors

R Matrices R Factors R Data Frames R Statistical frameworks

Julia

Installing and setting up Julia

Downloading and using the command line version of Julia Using Juno IDE for running Julia Using Julia via the browser

Running the Julia code from the command line Implementing in Julia (with examples) Using variables and assignments

Numeric primitives Data structures Working with Strings and String manipulations Packages Interoperability

Integrating with C Integrating with Python Integrating with MATLAB

Graphics and plotting

Benefits of adopting Julia Integrating Julia and Hadoop

Python

Toolkit options in Python Implementation of Python (using examples)

Installing Python and setting up scikit-learn

Loading data

Apache Spark

Scala Programming with Resilient Distributed Datasets (RDD)

Spring XD Summary

5. Decision Tree based learning

Decision trees

Terminology Purpose and uses Constructing a Decision tree

Handling missing values Considerations for constructing Decision trees

Choosing the appropriate attribute(s)

Information gain and Entropy Gini index Gain ratio

Termination Criteria / Pruning Decision trees

Decision trees in a graphical representation Inducing Decision trees – Decision tree algorithms

CART C4.5

Greedy Decision trees Benefits of Decision trees

Specialized trees

Oblique trees Random forests Evolutionary trees Hellinger trees

Implementing Decision trees

Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia

Summary

6. Instance and Kernel Methods Based Learning

Instance-based learning (IBL)

Nearest Neighbors

Value of k in KNN Distance measures in KNN

Euclidean distance Hamming distance Minkowski distance

Case-based reasoning (CBR) Locally weighted regression (LWR)

Implementing KNN

Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia

Kernel methods-based learning

Kernel functions Support Vector Machines (SVM)

Inseparable Data

Implementing SVM

Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia

Summary

7. Association Rules based learning

Association rules based learning

Association rule – a definition Apriori algorithm

Rule generation strategy

Rules for defining appropriate minsup Apriori – the downside

FP-growth algorithm Apriori versus FP-growth

Implementing Apriori and FP-growth

Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia

Summary

8. Clustering based learning

Clustering-based learning Types of clustering

Hierarchical clustering Partitional clustering

The k-means clustering algorithm

Convergence or stopping criteria for the k-means clustering

K-means clustering on disk

Advantages of the k-means approach Disadvantages of the k-means algorithm Distance measures Complexity measures

Implementing k-means clustering

Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia

Summary

9. Bayesian learning

Bayesian learning

Statistician's thinking

Important terms and definitions Probability

Types of events

Mutually exclusive or disjoint events Independent events Dependent events

Types of probability Distribution Bernoulli distribution Binomial distribution

Poisson probability distribution Exponential distribution Normal distribution Relationship between the distributions

Bayes' theorem Naïve Bayes classifier

Multinomial Naïve Bayes classifier The Bernoulli Naïve Bayes classifier

Implementing Naïve Bayes algorithm

Using Mahout Using R Using Spark Using scikit-learn Using Julia

Summary

10. Regression based learning

Regression analysis

Revisiting statistics

Properties of expectation, variance, and covariance

Properties of variance Properties of covariance Example

ANOVA and F Statistics

Confounding Effect modification

Regression methods

Simple regression or simple linear regression Multiple regression Polynomial (non-linear) regression Generalized Linear Models (GLM) Logistic regression (logit link)

Odds ratio in logistic regression

Model

Poisson regression

Implementing linear and logistic regression

Using Mahout Using R Using Spark Using scikit-learn Using Julia

Summary

11. Deep learning

Background

The human brain Neural networks

Neuron Synapses Artificial neurons or perceptrons

Linear neurons Rectified linear neurons / linear threshold neurons Binary threshold neurons Sigmoid neurons Stochastic binary neurons

Neural Network size

An example

Neural network types

Multilayer fully connected feedforward networks or Multilayer Perceptrons (MLP) Jordan networks Elman networks Radial Basis Function (RBF) networks Hopfield networks Dynamic Learning Vector Quantization (DLVQ) networks Gradient descent method

Backpropagation algorithm Softmax regression technique

Deep learning taxonomy

Convolutional neural networks (CNN/ConvNets)

Convolutional layer (CONV) Pooling layer (POOL) Fully connected layer (FC)

Recurrent Neural Networks (RNNs) Restricted Boltzmann Machines (RBMs) Deep Boltzmann Machines (DBMs) Autoencoders

Implementing ANNs and Deep learning methods

Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia

Summary

12. Reinforcement learning

Reinforcement Learning (RL)

The context of Reinforcement Learning

Examples of Reinforcement Learning Evaluative Feedback

n-Armed Bandit problem Action-value methods Reinforcement comparison methods

The Reinforcement Learning problem – the world grid example Markov Decision Process (MDP) Basic RL model – agent-environment interface Delayed rewards The policy

Reinforcement Learning – key features

Reinforcement learning solution methods

Dynamic Programming (DP)

Generalized Policy Iteration (GPI)

Monte Carlo methods Temporal difference (TD) learning

Sarsa - on-Policy TD

Q-Learning – off-Policy TD Actor-critic methods (on-policy) R Learning (Off-policy) Implementing Reinforcement Learning algorithms

Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia

Summary

13. Ensemble learning

Ensemble learning methods

The wisdom of the crowd Key use cases

Recommendation systems Anomaly detection Transfer learning Stream mining or classification

Ensemble methods

Supervised ensemble methods

Boosting

AdaBoost

Bagging Wagging

Random forests Gradient boosting machines (GBM)

Unsupervised ensemble methods

Implementing ensemble methods

Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia

Summary

14. New generation data architectures for Machine learning

Evolution of data architectures Emerging perspectives & drivers for new age data architectures Modern data architectures for Machine learning

Semantic data architecture

The business data lake Semantic Web technologies

Ontology and data integration

Vendors

Multi-model database architecture / polyglot persistence

Vendors

Lambda Architecture (LA)

Vendors

Summary

Index
