Imperial Library

Index
Preface
What this book covers What you need for this book Who this book is for Conventions Reader feedback Customer support
Downloading the example code Downloading the color images of this book Errata Piracy Questions
Data Science Using Java
Data science
Machine learning
Supervised learning Unsupervised learning
Clustering Dimensionality reduction
Natural Language Processing
Data science process models
CRISP-DM A running example
Data science in Java
Data science libraries
Data processing libraries Math and stats libraries Machine learning and data mining libraries Text processing
Summary
Data Processing Toolbox
Standard Java library
Collections Input/Output
Reading input data Writing output data
Streaming API
Extensions to the standard library
Apache Commons
Commons Lang Commons IO Commons Collections Other commons modules
Google Guava AOL Cyclops React
Accessing data
Text data and CSV Web and HTML JSON Databases DataFrames
Search engine - preparing data
Summary
Exploratory Data Analysis
Exploratory data analysis in Java
Search engine datasets Apache Commons Math Joinery
Interactive Exploratory Data Analysis in Java
JVM languages
Interactive Java
Joinery shell
Summary
Supervised Learning - Classification and Regression
Classification
Binary classification models
Smile JSAT LIBSVM and LIBLINEAR Encog
Evaluation
Accuracy Precision, recall, and F1 ROC and AU ROC (AUC) Result validation K-fold cross-validation Training, validation, and testing
Case study - page prediction
Regression
Machine learning libraries for regression
Smile JSAT Other libraries
Evaluation
MSE MAE
Case study - hardware performance
Summary
Unsupervised Learning - Clustering and Dimensionality Reduction
Dimensionality reduction
Unsupervised dimensionality reduction Principal Component Analysis Truncated SVD Truncated SVD for categorical and sparse data
Random projection
Cluster analysis
Hierarchical methods K-means
Choosing K in K-Means DBSCAN
Clustering for supervised learning
Clusters as features Clustering as dimensionality reduction Supervised learning via clustering
Evaluation
Manual evaluation Supervised evaluation Unsupervised Evaluation
Summary
Working with Text - Natural Language Processing and Information Retrieval
Natural Language Processing and information retrieval
Vector Space Model - Bag of Words and TF-IDF
Vector space model implementation
Indexing and Apache Lucene Natural Language Processing tools
Stanford CoreNLP
Customizing Apache Lucene
Machine learning for texts
Unsupervised learning for texts
Latent Semantic Analysis Text clustering Word embeddings
Supervised learning for texts
Text classification Learning to rank for information retrieval
Reranking with Lucene
Summary
Extreme Gradient Boosting
Gradient Boosting Machines and XGBoost
Installing XGBoost
XGBoost in practice
XGBoost for classification
Parameter tuning Text features Feature importance
XGBoost for regression XGBoost for learning to rank
Summary
Deep Learning with DeepLearning4J
Neural Networks and DeepLearning4J
ND4J - N-dimensional arrays for Java Neural networks in DeepLearning4J Convolutional Neural Networks
Deep learning for cats versus dogs
Reading the data Creating the model Monitoring the performance Data augmentation Running DeepLearning4J on GPU
Summary
Scaling Data Science
Apache Hadoop
Hadoop MapReduce Common Crawl
Apache Spark
Link prediction
Reading the DBLP graph Extracting features from the graph Node features Negative sampling Edge features Link Prediction with MLlib and XGBoost Link suggestion
Summary
Deploying Data Science Models
Microservices
Spring Boot Search engine service
Online evaluation
A/B testing Multi-armed bandits
Summary
