Mastering Java Machine Learning
Table of Contents
Mastering Java Machine Learning
Credits
Foreword
About the Authors
About the Reviewers
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Customer Feedback
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Errata
Piracy
Questions
1. Machine Learning Review
Machine learning – history and definition
What is not machine learning?
Machine learning – concepts and terminology
Machine learning – types and subtypes
Datasets used in machine learning
Machine learning applications
Practical issues in machine learning
Machine learning – roles and process
Roles
Process
Machine learning – tools and datasets
Datasets
Summary
2. Practical Approach to Real-World Supervised Learning
Formal description and notation
Data quality analysis
Descriptive data analysis
Basic label analysis
Basic feature analysis
Visualization analysis
Univariate feature analysis
Categorical features
Continuous features
Multivariate feature analysis
Data transformation and preprocessing
Feature construction
Handling missing values
Outliers
Discretization
Data sampling
Is sampling needed?
Undersampling and oversampling
Stratified sampling
Training, validation, and test set
Feature relevance analysis and dimensionality reduction
Feature search techniques
Feature evaluation techniques
Filter approach
Univariate feature selection
Information theoretic approach
Statistical approach
Multivariate feature selection
Minimal redundancy maximal relevance (mRMR)
Correlation-based feature selection (CFS)
Wrapper approach
Embedded approach
Model building
Linear models
Linear Regression
Algorithm input and output
How does it work?
Advantages and limitations
Naïve Bayes
Algorithm input and output
How does it work?
Advantages and limitations
Logistic Regression
Algorithm input and output
How does it work?
Advantages and limitations
Non-linear models
Decision Trees
Algorithm inputs and outputs
How does it work?
Advantages and limitations
K-Nearest Neighbors (KNN)
Algorithm inputs and outputs
How does it work?
Advantages and limitations
Support vector machines (SVM)
Algorithm inputs and outputs
How does it work?
Advantages and limitations
Ensemble learning and meta learners
Bootstrap aggregating or bagging
Algorithm inputs and outputs
How does it work?
Random Forest
Advantages and limitations
Boosting
Algorithm inputs and outputs
How does it work?
Advantages and limitations
Model assessment, evaluation, and comparisons
Model assessment
Model evaluation metrics
Confusion matrix and related metrics
ROC and PRC curves
Gain charts and lift curves
Model comparisons
Comparing two algorithms
McNemar's Test
Paired-t test
Wilcoxon signed-rank test
Comparing multiple algorithms
ANOVA test
Friedman's test
Case Study – Horse Colic Classification
Business problem
Machine learning mapping
Data analysis
Label analysis
Features analysis
Supervised learning experiments
Weka experiments
Sample end-to-end process in Java
Weka experimenter and model selection
RapidMiner experiments
Visualization analysis
Feature selection
Model process flow
Model evaluation metrics
Evaluation on Confusion Metrics
ROC Curves, Lift Curves, and Gain Charts
Results, observations, and analysis
Summary
References
3. Unsupervised Machine Learning Techniques
Issues in common with supervised learning
Issues specific to unsupervised learning
Feature analysis and dimensionality reduction
Notation
Linear methods
Principal component analysis (PCA)
Inputs and outputs
How does it work?
Advantages and limitations
Random projections (RP)
Inputs and outputs
How does it work?
Advantages and limitations
Multidimensional Scaling (MDS)
Inputs and outputs
How does it work?
Advantages and limitations
Nonlinear methods
Kernel Principal Component Analysis (KPCA)
Inputs and outputs
How does it work?
Advantages and limitations
Manifold learning
Inputs and outputs
How does it work?
Advantages and limitations
Clustering
Clustering algorithms
k-Means
Inputs and outputs
How does it work?
Advantages and limitations
DBSCAN
Inputs and outputs
How does it work?
Advantages and limitations
Mean shift
Inputs and outputs
How does it work?
Advantages and limitations
Expectation maximization (EM) or Gaussian mixture modeling (GMM)
Input and output
How does it work?
Advantages and limitations
Hierarchical clustering
Input and output
How does it work?
Advantages and limitations
Self-organizing maps (SOM)
Inputs and outputs
How does it work?
Advantages and limitations
Spectral clustering
Inputs and outputs
How does it work?
Advantages and limitations
Affinity propagation
Inputs and outputs
How does it work?
Advantages and limitations
Clustering validation and evaluation
Internal evaluation measures
Notation
R-Squared
Dunn's Indices
Davies-Bouldin index
Silhouette's index
External evaluation measures
Rand index
F-Measure
Normalized mutual information index
Outlier or anomaly detection
Outlier algorithms
Statistical-based
Inputs and outputs
How does it work?
Advantages and limitations
Distance-based methods
Inputs and outputs
How does it work?
Advantages and limitations
Density-based methods
Inputs and outputs
How does it work?
Advantages and limitations
Clustering-based methods
Inputs and outputs
How does it work?
Advantages and limitations
High-dimensional-based methods
Inputs and outputs
How does it work?
Advantages and limitations
One-class SVM
Inputs and outputs
How does it work?
Advantages and limitations
Outlier evaluation techniques
Supervised evaluation
Unsupervised evaluation
Real-world case study
Tools and software
Business problem
Machine learning mapping
Data collection
Data quality analysis
Data sampling and transformation
Feature analysis and dimensionality reduction
PCA
Random projections
ISOMAP
Observations on feature analysis and dimensionality reduction
Clustering models, results, and evaluation
Observations and clustering analysis
Outlier models, results, and evaluation
Observations and analysis
Summary
References
4. Semi-Supervised and Active Learning
Semi-supervised learning
Representation, notation, and assumptions
Semi-supervised learning techniques
Self-training SSL
Inputs and outputs
How does it work?
Advantages and limitations
Co-training SSL or multi-view SSL
Inputs and outputs
How does it work?
Advantages and limitations
Cluster and label SSL
Inputs and outputs
How does it work?
Advantages and limitations
Transductive graph label propagation
Inputs and outputs
How does it work?
Advantages and limitations
Transductive SVM (TSVM)
Inputs and outputs
How does it work?
Advantages and limitations
Case study in semi-supervised learning
Tools and software
Business problem
Machine learning mapping
Data collection
Data quality analysis
Data sampling and transformation
Datasets and analysis
Feature analysis results
Experiments and results
Analysis of semi-supervised learning
Active learning
Representation and notation
Active learning scenarios
Active learning approaches
Uncertainty sampling
How does it work?
Least confident sampling
Smallest margin sampling
Label entropy sampling
Advantages and limitations
Version space sampling
Query by disagreement (QBD)
How does it work?
Query by Committee (QBC)
How does it work?
Advantages and limitations
Data distribution sampling
How does it work?
Expected model change
Expected error reduction
Variance reduction
Density weighted methods
Advantages and limitations
Case study in active learning
Tools and software
Business problem
Machine learning mapping
Data collection
Data sampling and transformation
Feature analysis and dimensionality reduction
Models, results, and evaluation
Pool-based scenarios
Stream-based scenarios
Analysis of active learning results
Summary
References
5. Real-Time Stream Machine Learning
Assumptions and mathematical notations
Basic stream processing and computational techniques
Stream computations
Sliding windows
Sampling
Concept drift and drift detection
Data management
Partial memory
Full memory
Detection methods
Monitoring model evolution
Widmer and Kubat
Drift Detection Method or DDM
Early Drift Detection Method or EDDM
Monitoring distribution changes
Welch's t test
Kolmogorov-Smirnov's test
CUSUM and Page-Hinckley test
Adaptation methods
Explicit adaptation
Implicit adaptation
Incremental supervised learning
Modeling techniques
Linear algorithms
Online linear models with loss functions
Inputs and outputs
How does it work?
Advantages and limitations
Online Naïve Bayes
Inputs and outputs
How does it work?
Advantages and limitations
Non-linear algorithms
Hoeffding trees or very fast decision trees (VFDT)
Inputs and outputs
How does it work?
Advantages and limitations
Ensemble algorithms
Weighted majority algorithm
Inputs and outputs
How does it work?
Advantages and limitations
Online Bagging algorithm
Inputs and outputs
How does it work?
Advantages and limitations
Online Boosting algorithm
Inputs and outputs
How does it work?
Advantages and limitations
Validation, evaluation, and comparisons in online setting
Model validation techniques
Prequential evaluation
Holdout evaluation
Controlled permutations
Evaluation criteria
Comparing algorithms and metrics
Incremental unsupervised learning using clustering
Modeling techniques
Partition based
Online k-Means
Inputs and outputs
How does it work?
Advantages and limitations
Hierarchical based and micro clustering
Inputs and outputs
How does it work?
Advantages and limitations
Inputs and outputs
How does it work?
Advantages and limitations
Density based
Inputs and outputs
How does it work?
Advantages and limitations
Grid based
Inputs and outputs
How does it work?
Advantages and limitations
Validation and evaluation techniques
Key issues in stream cluster evaluation
Evaluation measures
Cluster Mapping Measures (CMM)
V-Measure
Other external measures
Unsupervised learning using outlier detection
Partition-based clustering for outlier detection
Inputs and outputs
How does it work?
Advantages and limitations
Distance-based clustering for outlier detection
Inputs and outputs
How does it work?
Exact Storm
Abstract-C
Direct Update of Events (DUE)
Micro Clustering based Algorithm (MCOD)
Approx Storm
Advantages and limitations
Validation and evaluation techniques
Case study in stream learning
Tools and software
Business problem
Machine learning mapping
Data collection
Data sampling and transformation
Feature analysis and dimensionality reduction
Models, results, and evaluation
Supervised learning experiments
Concept drift experiments
Clustering experiments
Outlier detection experiments
Analysis of stream learning results
Summary
References
6. Probabilistic Graph Modeling
Probability revisited
Concepts in probability
Conditional probability
Chain rule and Bayes' theorem
Random variables, joint, and marginal distributions
Marginal independence and conditional independence
Factors
Factor types
Distribution queries
Probabilistic queries
MAP queries and marginal MAP queries
Graph concepts
Graph structure and properties
Subgraphs and cliques
Path, trail, and cycles
Bayesian networks
Representation
Definition
Reasoning patterns
Causal or predictive reasoning
Evidential or diagnostic reasoning
Intercausal reasoning
Combined reasoning
Independencies, flow of influence, D-Separation, I-Map
Flow of influence
D-Separation
I-Map
Inference
Elimination-based inference
Variable elimination algorithm
Input and output
How does it work?
Advantages and limitations
Clique tree or junction tree algorithm
Input and output
How does it work?
Advantages and limitations
Propagation-based techniques
Belief propagation
Factor graph
Messaging in factor graph
Input and output
How does it work?
Advantages and limitations
Sampling-based techniques
Forward sampling with rejection
Input and output
How does it work?
Advantages and limitations
Learning
Learning parameters
Maximum likelihood estimation for Bayesian networks
Bayesian parameter estimation for Bayesian networks
Prior and posterior using the Dirichlet distribution
Learning structures
Measures to evaluate structures
Methods for learning structures
Constraint-based techniques
Inputs and outputs
How does it work?
Advantages and limitations
Search and score-based techniques
Inputs and outputs
How does it work?
Advantages and limitations
Markov networks and conditional random fields
Representation
Parameterization
Gibbs parameterization
Factor graphs
Log-linear models
Independencies
Global
Pairwise Markov
Markov blanket
Inference
Learning
Conditional random fields
Specialized networks
Tree augmented network
Input and output
How does it work?
Advantages and limitations
Markov chains
Hidden Markov models
Most probable path in HMM
Posterior decoding in HMM
Tools and usage
OpenMarkov
Weka Bayesian Network GUI
Case study
Business problem
Machine learning mapping
Data sampling and transformation
Feature analysis
Models, results, and evaluation
Analysis of results
Summary
References
7. Deep Learning
Multi-layer feed-forward neural network
Inputs, neurons, activation function, and mathematical notation
Multi-layered neural network
Structure and mathematical notations
Activation functions in NN
Sigmoid function
Hyperbolic tangent ("tanh") function
Training neural network
Empirical risk minimization
Parameter initialization
Loss function
Gradients
Gradient at the output layer
Gradient at the hidden layer
Parameter gradient
Feed forward and backpropagation
How does it work?
Regularization
L2 regularization
L1 regularization
Limitations of neural networks
Vanishing gradients, local optimum, and slow training
Deep learning
Building blocks for deep learning
Rectified linear activation function
Restricted Boltzmann Machines
Definition and mathematical notation
Conditional distribution
Free energy in RBM
Training the RBM
Sampling in RBM
Contrastive divergence
Inputs and outputs
How does it work?
Persistent contrastive divergence
Autoencoders
Definition and mathematical notations
Loss function
Limitations of Autoencoders
Denoising Autoencoder
Unsupervised pre-training and supervised fine-tuning
Deep feed-forward NN
Inputs and outputs
How does it work?
Deep Autoencoders
Deep Belief Networks
Inputs and outputs
How does it work?
Deep learning with dropouts
Definition and mathematical notation
Inputs and outputs
How does it work?
Learning
Training and testing with dropouts
Sparse coding
Convolutional Neural Network
Local connectivity
Parameter sharing
Discrete convolution
Pooling or subsampling
Normalization using ReLU
CNN Layers
Recurrent Neural Networks
Structure of Recurrent Neural Networks
Learning and associated problems in RNNs
Long Short Term Memory
Gated Recurrent Units
Case study
Tools and software
Business problem
Machine learning mapping
Data sampling and transformation
Feature analysis
Models, results, and evaluation
Basic data handling
Multi-layer perceptron
Parameters used for MLP
Code for MLP
Convolutional Network
Parameters used for ConvNet
Code for CNN
Variational Autoencoder
Parameters used for the Variational Autoencoder
Code for Variational Autoencoder
DBN
Parameter search using Arbiter
Results and analysis
Summary
References
8. Text Mining and Natural Language Processing
NLP, subfields, and tasks
Text categorization
Part-of-speech tagging (POS tagging)
Text clustering
Information extraction and named entity recognition
Sentiment analysis and opinion mining
Coreference resolution
Word sense disambiguation
Machine translation
Semantic reasoning and inferencing
Text summarization
Automating question and answers
Issues with mining unstructured data
Text processing components and transformations
Document collection and standardization
Inputs and outputs
How does it work?
Tokenization
Inputs and outputs
How does it work?
Stop words removal
Inputs and outputs
How does it work?
Stemming or lemmatization
Inputs and outputs
How does it work?
Local/global dictionary or vocabulary?
Feature extraction/generation
Lexical features
Character-based features
Word-based features
Part-of-speech tagging features
Taxonomy features
Syntactic features
Semantic features
Feature representation and similarity
Vector space model
Binary
Term frequency (TF)
Inverse document frequency (IDF)
Term frequency-inverse document frequency (TF-IDF)
Similarity measures
Euclidean distance
Cosine distance
Pairwise-adaptive similarity
Extended Jaccard coefficient
Dice coefficient
Feature selection and dimensionality reduction
Feature selection
Information theoretic techniques
Statistical-based techniques
Frequency-based techniques
Dimensionality reduction
Topics in text mining
Text categorization/classification
Topic modeling
Probabilistic latent semantic analysis (PLSA)
Input and output
How does it work?
Advantages and limitations
Text clustering
Feature transformation, selection, and reduction
Clustering techniques
Generative probabilistic models
Input and output
How does it work?
Advantages and limitations
Distance-based text clustering
Non-negative matrix factorization (NMF)
Input and output
How does it work?
Advantages and limitations
Evaluation of text clustering
Named entity recognition
Hidden Markov models for NER
Input and output
How does it work?
Advantages and limitations
Maximum entropy Markov models for NER
Input and output
How does it work?
Advantages and limitations
Deep learning and NLP
Tools and usage
Mallet
KNIME
Topic modeling with Mallet
Business problem
Machine Learning mapping
Data collection
Data sampling and transformation
Feature analysis and dimensionality reduction
Models, results, and evaluation
Analysis of text processing results
Summary
References
9. Big Data Machine Learning – The Final Frontier
What are the characteristics of Big Data?
Big Data Machine Learning
General Big Data framework
Big Data cluster deployment frameworks
Hortonworks Data Platform
Cloudera CDH
Amazon Elastic MapReduce
Microsoft Azure HDInsight
Data acquisition
Publish-subscribe frameworks
Source-sink frameworks
SQL frameworks
Message queueing frameworks
Custom frameworks
Data storage
HDFS
NoSQL
Key-value databases
Document databases
Columnar databases
Graph databases
Data processing and preparation
Hive and HQL
Spark SQL
Amazon Redshift
Real-time stream processing
Machine Learning
Visualization and analysis
Batch Big Data Machine Learning
H2O as Big Data Machine Learning platform
H2O architecture
Machine learning in H2O
Tools and usage
Case study
Business problem
Machine Learning mapping
Data collection
Data sampling and transformation
Experiments, results, and analysis
Feature relevance and analysis
Evaluation on test data
Analysis of results
Spark MLlib as Big Data Machine Learning platform
Spark architecture
Machine Learning in MLlib
Tools and usage
Experiments, results, and analysis
k-Means
k-Means with PCA
Bisecting k-Means (with PCA)
Gaussian Mixture Model
Random Forest
Analysis of results
Real-time Big Data Machine Learning
SAMOA as a real-time Big Data Machine Learning framework
SAMOA architecture
Machine Learning algorithms
Tools and usage
Experiments, results, and analysis
Analysis of results
The future of Machine Learning
Summary
References
A. Linear Algebra
Vector
Scalar product of vectors
Matrix
Transpose of a matrix
Matrix addition
Scalar multiplication
Matrix multiplication
Properties of matrix product
Linear transformation
Matrix inverse
Eigendecomposition
Positive definite matrix
Singular value decomposition (SVD)
B. Probability
Axioms of probability
Bayes' theorem
Density estimation
Mean
Variance
Standard deviation
Gaussian standard deviation
Covariance
Correlation coefficient
Binomial distribution
Poisson distribution
Gaussian distribution
Central limit theorem
Error propagation
Index