Index
Title Page
Copyright and Credits
Machine Learning in Java Second Edition
Contributors
About the authors
About the reviewer
Packt is searching for authors like you
About Packt
Why subscribe?
Packt.com
Preface
Who this book is for
What this book covers
To get the most out of this book
Download the example code files
Download the color images
Conventions used
Get in touch
Reviews
Applied Machine Learning Quick Start
Machine learning and data science
Solving problems with machine learning
Applied machine learning workflow
Data and problem definition
Measurement scales
Data collection
Finding or observing data
Generating data
Sampling traps
Data preprocessing
Data cleaning
Filling missing values
Remove outliers
Data transformation
Data reduction
Unsupervised learning
Finding similar items
Euclidean distances
Non-Euclidean distances
The curse of dimensionality
Clustering
Supervised learning
Classification
Decision tree learning
Probabilistic classifiers
Kernel methods
Artificial neural networks
Ensemble learning
Evaluating classification
Precision and recall
ROC curves
Regression
Linear regression
Logistic regression
Evaluating regression
Mean squared error
Mean absolute error
Correlation coefficient
Generalization and evaluation
Underfitting and overfitting
Train and test sets
Cross-validation
Leave-one-out validation
Stratification
Summary
Java Libraries and Platforms for Machine Learning
The need for Java
Machine learning libraries
Weka
Java machine learning
Apache Mahout
Apache Spark
Deeplearning4j
MALLET
The Encog Machine Learning Framework
ELKI
MOA
Comparing libraries
Building a machine learning application
Traditional machine learning architecture
Dealing with big data
Big data application architecture
Summary
Basic Algorithms - Classification, Regression, and Clustering
Before you start
Classification
Data
Loading data
Feature selection
Learning algorithms
Classifying new data
Evaluation and prediction error metrics
The confusion matrix
Choosing a classification algorithm
Classification using Encog
Classification using massive online analysis
Evaluation
Baseline classifiers
Decision tree
Lazy learning
Active learning
Regression
Loading the data
Analyzing attributes
Building and evaluating the regression model
Linear regression
Linear regression using Encog
Regression using MOA
Regression trees
Tips to avoid common regression problems
Clustering
Clustering algorithms
Evaluation
Clustering using Encog
Clustering using ELKI
Summary
Customer Relationship Prediction with Ensembles
The customer relationship database
Challenge
Dataset
Evaluation
Basic Naive Bayes classifier baseline
Getting the data
Loading the data
Basic modeling
Evaluating models
Implementing the Naive Bayes baseline
Advanced modeling with ensembles
Before we start
Data preprocessing
Attribute selection
Model selection
Performance evaluation
Ensemble methods – MOA
Summary
Affinity Analysis
Market basket analysis
Affinity analysis
Association rule learning
Basic concepts
Database of transactions
Itemset and rule
Support
Lift
Confidence
Apriori algorithm
FP-Growth algorithm
The supermarket dataset
Discover patterns
Apriori
FP-Growth
Other applications in various areas
Medical diagnosis
Protein sequences
Census data
Customer relationship management
IT operations analytics
Summary
Recommendation Engines with Apache Mahout
Basic concepts
Key concepts
User-based and item-based analysis
Calculating similarity
Collaborative filtering
Content-based filtering
Hybrid approach
Exploitation versus exploration
Getting Apache Mahout
Configuring Mahout in Eclipse with the Maven plugin
Building a recommendation engine
Book ratings dataset
Loading the data
Loading data from a file
Loading data from a database
In-memory databases
Collaborative filtering
User-based filtering
Item-based filtering
Adding custom rules to recommendations
Evaluation
Online learning engine
Content-based filtering
Summary
Fraud and Anomaly Detection
Suspicious and anomalous behavior detection
Unknown unknowns
Suspicious pattern detection
Anomalous pattern detection
Analysis types
Pattern analysis
Transaction analysis
Plan recognition
Outlier detection using ELKI
An example using ELKI
Fraud detection in insurance claims
Dataset
Modeling suspicious patterns
The vanilla approach
Dataset rebalancing
Anomaly detection in website traffic
Dataset
Anomaly detection in time series data
Using Encog for time series
Histogram-based anomaly detection
Loading the data
Creating histograms
Density-based k-nearest neighbors
Summary
Image Recognition with Deeplearning4j
Introducing image recognition
Neural networks
Perceptron
Feedforward neural networks
Autoencoder
Restricted Boltzmann machine
Deep convolutional networks
Image classification
Deeplearning4j
Getting DL4J
MNIST dataset
Loading the data
Building models
Building a single-layer regression model
Building a deep belief network
Building a multilayer convolutional network
Summary
Activity Recognition with Mobile Phone Sensors
Introducing activity recognition
Mobile phone sensors
Activity recognition pipeline
The plan
Collecting data from a mobile phone
Installing Android Studio
Loading the data collector
Feature extraction
Collecting training data
Building a classifier
Reducing spurious transitions
Plugging the classifier into a mobile app
Summary
Text Mining with Mallet - Topic Modeling and Spam Detection
Introducing text mining
Topic modeling
Text classification
Installing Mallet
Working with text data
Importing data
Importing from directory
Importing from file
Pre-processing text data
Topic modeling for BBC News
BBC dataset
Modeling
Evaluating a model
Reusing a model
Saving a model
Restoring a model
Detecting email spam
Email spam dataset
Feature generation
Training and testing
Model performance
Summary
What Is Next?
Machine learning in real life
Noisy data
Class unbalance
Feature selection
Model chaining
The importance of evaluation
Getting models into production
Model maintenance
Standards and markup languages
CRISP-DM
SEMMA methodology
Predictive model markup language
Machine learning in the cloud
Machine learning as a service
Web resources and competitions
Datasets
Online courses
Competitions
Websites and blogs
Venues and conferences
Summary
Other Books You May Enjoy
Leave a review - let other readers know what you think