Index
Practical Machine Learning
Table of Contents Practical Machine Learning Credits Foreword About the Author Acknowledgments About the Reviewers www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe? Free access for Packt account holders
Preface
What this book covers What you need for this book Who this book is for Conventions Reader feedback Customer support
Downloading the example code Downloading the color images of this book Errata Piracy Questions
1. Introduction to Machine learning
Machine learning
Definition Core Concepts and Terminology What is learning?
Data Labeled and unlabeled data Tasks Algorithms Models
Logical models Geometric models Probabilistic models
Data and inconsistencies in Machine learning
Under-fitting Over-fitting Data instability Unpredictable data formats
Practical Machine learning examples Types of learning problems
Classification Clustering Forecasting, prediction or regression Simulation Optimization Supervised learning Unsupervised learning Semi-supervised learning Reinforcement learning Deep learning
Performance measures
Is the solution good?
Mean squared error (MSE) Mean absolute error (MAE) Normalized MSE and MAE (NMSE and NMAE) Solving the errors: bias and variance
Some complementing fields of Machine learning
Data mining Artificial intelligence (AI) Statistical learning Data science
Machine learning process lifecycle and solution architecture Machine learning algorithms
Decision tree based algorithms Bayesian method based algorithms Kernel method based algorithms Clustering methods Artificial neural networks (ANN) Dimensionality reduction Ensemble methods Instance based learning algorithms Regression analysis based algorithms Association rule based learning algorithms
Machine learning tools and frameworks Summary
2. Machine learning and Large-scale datasets
Big data and the context of large-scale Machine learning
Functional versus Structural – A methodological mismatch
Commoditizing information Theoretical limitations of RDBMS Scaling-up versus Scaling-out storage Distributed and parallel computing strategies
Machine learning: Scalability and Performance
Too many data points or instances Too many attributes or features Shrinking response time windows – need for real-time responses Highly complex algorithm Feed forward, iterative prediction cycles
Model selection process Potential issues in large-scale Machine learning
Algorithms and Concurrency
Developing concurrent algorithms
Technology and implementation options for scaling-up Machine learning
MapReduce programming paradigm High Performance Computing (HPC) with Message Passing Interface (MPI) Language Integrated Queries (LINQ) framework Manipulating datasets with LINQ Graphics Processing Unit (GPU) Field Programmable Gate Array (FPGA) Multicore or multiprocessor systems
Summary
3. An Introduction to Hadoop's Architecture and Ecosystem
Introduction to Apache Hadoop
Evolution of Hadoop (the platform of choice) Hadoop and its core elements
Machine learning solution architecture for big data (employing Hadoop)
The Data Source layer The Ingestion layer The Hadoop Storage layer The Hadoop (Physical) Infrastructure layer – supporting appliance Hadoop platform / Processing layer The Analytics layer The Consumption layer
Explaining and exploring data with Visualizations Security and Monitoring layer Hadoop core components framework
Hadoop Distributed File System (HDFS)
Secondary Namenode and Checkpoint process Splitting large data files Block loading to the cluster and replication
Writing to and reading from HDFS Handling failures HDFS command line RESTful HDFS
MapReduce
MapReduce architecture What makes MapReduce cater to the needs of large datasets? MapReduce execution flow and components Developing MapReduce components
InputFormat OutputFormat Mapper implementation
Hadoop 2.x
Hadoop ecosystem components Hadoop installation and setup
Installing JDK 1.7 Creating a system user for Hadoop (dedicated) Disable IPv6 Steps for installing Hadoop 2.6.0 Starting Hadoop
Hadoop distributions and vendors
Summary
4. Machine Learning Tools, Libraries, and Frameworks
Machine learning tools – A landscape Apache Mahout
How does Mahout work? Installing and setting up Apache Mahout
Setting up Maven Setting up Apache Mahout using Eclipse IDE Setting up Apache Mahout without Eclipse
Mahout Packages Implementing vectors in Mahout
R
Installing and setting up R Integrating R with Apache Hadoop
Approach 1 – Using R and Streaming APIs in Hadoop Approach 2 – Using the Rhipe package of R Approach 3 – Using RHadoop Summary of R/Hadoop integration approaches Implementing in R (using examples)
R Expressions
Assignments Functions
R Vectors
Assigning, accessing, and manipulating vectors
R Matrices R Factors R Data Frames R Statistical frameworks
Julia
Installing and setting up Julia
Downloading and using the command line version of Julia Using Juno IDE for running Julia Using Julia via the browser
Running the Julia code from the command line Implementing in Julia (with examples) Using variables and assignments
Numeric primitives Data structures Working with Strings and String manipulations Packages Interoperability
Integrating with C Integrating with Python Integrating with MATLAB
Graphics and plotting
Benefits of adopting Julia Integrating Julia and Hadoop
Python
Toolkit options in Python Implementation of Python (using examples)
Installing Python and setting up scikit-learn
Loading data
Apache Spark
Scala Programming with Resilient Distributed Datasets (RDD)
Spring XD Summary
5. Decision Tree based learning
Decision trees
Terminology Purpose and uses Constructing a Decision tree
Handling missing values Considerations for constructing Decision trees
Choosing the appropriate attribute(s)
Information gain and Entropy Gini index Gain ratio
Termination Criteria / Pruning Decision trees
Decision trees in a graphical representation Inducing Decision trees – Decision tree algorithms
CART C4.5
Greedy Decision trees Benefits of Decision trees
Specialized trees
Oblique trees Random forests Evolutionary trees Hellinger trees
Implementing Decision trees
Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia
Summary
6. Instance and Kernel Methods Based Learning
Instance-based learning (IBL)
Nearest Neighbors
Value of k in KNN Distance measures in KNN
Euclidean distance Hamming distance Minkowski distance
Case-based reasoning (CBR) Locally weighted regression (LWR)
Implementing KNN
Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia
Kernel methods-based learning
Kernel functions Support Vector Machines (SVM)
Inseparable Data
Implementing SVM
Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia
Summary
7. Association Rules based learning
Association rules based learning
Association rule – a definition Apriori algorithm
Rule generation strategy
Rules for defining appropriate minsup Apriori – the downside
FP-growth algorithm Apriori versus FP-growth
Implementing Apriori and FP-growth
Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia
Summary
8. Clustering based learning
Clustering-based learning Types of clustering
Hierarchical clustering Partitional clustering
The k-means clustering algorithm
Convergence or stopping criteria for the k-means clustering
K-means clustering on disk
Advantages of the k-means approach Disadvantages of the k-means algorithm Distance measures Complexity measures
Implementing k-means clustering
Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia
Summary
9. Bayesian learning
Bayesian learning
Statistician's thinking
Important terms and definitions Probability
Types of events
Mutually exclusive or disjoint events Independent events Dependent events
Types of probability Distribution Bernoulli distribution Binomial distribution
Poisson probability distribution Exponential distribution Normal distribution Relationship between the distributions
Bayes' theorem Naïve Bayes classifier
Multinomial Naïve Bayes classifier The Bernoulli Naïve Bayes classifier
Implementing Naïve Bayes algorithm
Using Mahout Using R Using Spark Using scikit-learn Using Julia
Summary
10. Regression based learning
Regression analysis
Revisiting statistics
Properties of expectation, variance, and covariance
Properties of variance Properties of covariance Example
ANOVA and F Statistics
Confounding Effect modification
Regression methods
Simple regression or simple linear regression Multiple regression Polynomial (non-linear) regression Generalized Linear Models (GLM) Logistic regression (logit link)
Odds ratio in logistic regression
Model
Poisson regression
Implementing linear and logistic regression
Using Mahout Using R Using Spark Using scikit-learn Using Julia
Summary
11. Deep learning
Background
The human brain Neural networks
Neuron Synapses Artificial neurons or perceptrons
Linear neurons Rectified linear neurons / linear threshold neurons Binary threshold neurons Sigmoid neurons Stochastic binary neurons
Neural Network size
An example
Neural network types
Multilayer fully connected feedforward networks or Multilayer Perceptrons (MLP) Jordan networks Elman networks Radial Basis Function (RBF) networks Hopfield networks Dynamic Learning Vector Quantization (DLVQ) networks Gradient descent method
Backpropagation algorithm Softmax regression technique
Deep learning taxonomy
Convolutional neural networks (CNN/ConvNets)
Convolutional layer (CONV) Pooling layer (POOL) Fully connected layer (FC)
Recurrent Neural Networks (RNNs) Restricted Boltzmann Machines (RBMs) Deep Boltzmann Machines (DBMs) Autoencoders
Implementing ANNs and Deep learning methods
Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia
Summary
12. Reinforcement learning
Reinforcement Learning (RL)
The context of Reinforcement Learning
Examples of Reinforcement Learning Evaluative Feedback
n-Armed Bandit problem Action-value methods Reinforcement comparison methods
The Reinforcement Learning problem – the world grid example Markov Decision Process (MDP) Basic RL model – agent-environment interface Delayed rewards The policy
Reinforcement Learning – key features
Reinforcement learning solution methods
Dynamic Programming (DP)
Generalized Policy Iteration (GPI)
Monte Carlo methods Temporal difference (TD) learning
Sarsa - on-Policy TD
Q-Learning – off-Policy TD Actor-critic methods (on-policy) R Learning (Off-policy) Implementing Reinforcement Learning algorithms
Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia
Summary
13. Ensemble learning
Ensemble learning methods
The wisdom of the crowd Key use cases
Recommendation systems Anomaly detection Transfer learning Stream mining or classification
Ensemble methods
Supervised ensemble methods
Boosting
AdaBoost
Bagging Wagging
Random forests Gradient boosting machines (GBM)
Unsupervised ensemble methods
Implementing ensemble methods
Using Mahout Using R Using Spark Using Python (scikit-learn) Using Julia
Summary
14. New generation data architectures for Machine learning
Evolution of data architectures Emerging perspectives & drivers for new age data architectures Modern data architectures for Machine learning
Semantic data architecture
The business data lake Semantic Web technologies
Ontology and data integration
Vendors
Multi-model database architecture / polyglot persistence
Vendors
Lambda Architecture (LA)
Vendors
Summary
Index