Mastering Predictive Analytics with R · 2nd Edition by Miller, James D.

Index

Mastering Predictive Analytics with R Second Edition
Credits
About the Authors
About the Reviewer
www.PacktPub.com
  eBooks, discount offers, and more
  Why subscribe?
Customer Feedback
Preface
  What this book covers
  What you need for this book
  Who this book is for
  Conventions
  Reader feedback
  Customer support
  Downloading the example code
  Downloading the color images of this book
  Errata
  Piracy
  Questions

1. Gearing Up for Predictive Modeling
  Models
  Learning from data
  The core components of a model
  Our first model – k-nearest neighbors
  Types of model
  Supervised, unsupervised, semi-supervised, and reinforcement learning models
  Parametric and nonparametric models
  Regression and classification models
  Real-time and batch machine learning models
  The process of predictive modeling
  Defining the model's objective
  Collecting the data
  Picking a model
  Pre-processing the data
  Exploratory data analysis
  Feature transformations
  Encoding categorical features
  Missing data
  Outliers
  Removing problematic features
  Feature engineering and dimensionality reduction
  Training and assessing the model
  Repeating with different models and final model selection
  Deploying the model
  Summary

2. Tidying Data and Measuring Performance
  Getting started
  Tidying data
  Categorizing data quality
  The first step
  The next step
  The final step
  Performance metrics
  Assessing regression models
  Assessing classification models
  Assessing binary classification models
  Cross-validation
  Learning curves
  Plot and ping
  Summary

3. Linear Regression
  Introduction to linear regression
  Assumptions of linear regression
  Simple linear regression
  Estimating the regression coefficients
  Multiple linear regression
  Predicting CPU performance
  Predicting the price of used cars
  Assessing linear regression models
  Residual analysis
  Significance tests for linear regression
  Performance metrics for linear regression
  Comparing different regression models
  Test set performance
  Problems with linear regression
  Multicollinearity
  Outliers
  Feature selection
  Regularization
  Ridge regression
  Least absolute shrinkage and selection operator (lasso)
  Implementing regularization in R
  Polynomial regression
  Summary

4. Generalized Linear Models
  Classifying with linear regression
  Introduction to logistic regression
  Generalized linear models
  Interpreting coefficients in logistic regression
  Assumptions of logistic regression
  Maximum likelihood estimation
  Predicting heart disease
  Assessing logistic regression models
  Model deviance
  Test set performance
  Regularization with the lasso
  Classification metrics
  Extensions of the binary logistic classifier
  Multinomial logistic regression
  Predicting glass type
  Ordinal logistic regression
  Predicting wine quality
  Poisson regression
  Negative Binomial regression
  Summary

5. Neural Networks
  The biological neuron
  The artificial neuron
  Stochastic gradient descent
  Gradient descent and local minima
  The perceptron algorithm
  Linear separation
  The logistic neuron
  Multilayer perceptron networks
  Training multilayer perceptron networks
  The back propagation algorithm
  Predicting the energy efficiency of buildings
  Evaluating multilayer perceptrons for regression
  Predicting glass type revisited
  Predicting handwritten digits
  Receiver operating characteristic curves
  Radial basis function networks
  Summary

6. Support Vector Machines
  Maximal margin classification
  Support vector classification
  Inner products
  Kernels and support vector machines
  Predicting chemical biodegration
  Predicting credit scores
  Multiclass classification with support vector machines
  Summary

7. Tree-Based Methods
  The intuition for tree models
  Algorithms for training decision trees
  Classification and regression trees
  CART regression trees
  Tree pruning
  Missing data
  Regression model trees
  CART classification trees
  C5.0
  Predicting class membership on synthetic 2D data
  Predicting the authenticity of banknotes
  Predicting complex skill learning
  Tuning model parameters in CART trees
  Variable importance in tree models
  Regression model trees in action
  Improvements to the M5 model
  Summary

8. Dimensionality Reduction
  Defining DR
  Correlated data analyses
  Scatterplots
  Causation
  The degree of correlation
  Reporting on correlation
  Principal component analysis
  Using R to understand PCA
  Independent component analysis
  Defining independence
  ICA pre-processing
  Factor analysis
  Explore and confirm
  Using R for factor analysis
  The output
  NNMF
  Summary

9. Ensemble Methods
  Bagging
  Margins and out-of-bag observations
  Predicting complex skill learning with bagging
  Predicting heart disease with bagging
  Limitations of bagging
  Boosting
  AdaBoost
  AdaBoost for binary classification
  Predicting atmospheric gamma ray radiation
  Predicting complex skill learning with boosting
  Limitations of boosting
  Random forests
  The importance of variables in random forests
  XGBoost
  Summary

10. Probabilistic Graphical Models
  A little graph theory
  Bayes' theorem
  Conditional independence
  Bayesian networks
  The Naïve Bayes classifier
  Predicting the sentiment of movie reviews
  Predicting promoter gene sequences
  Predicting letter patterns in English words
  Summary

11. Topic Modeling
  An overview of topic modeling
  Latent Dirichlet Allocation
  The Dirichlet distribution
  The generative process
  Fitting an LDA model
  Modeling the topics of online news stories
  Model stability
  Finding the number of topics
  Topic distributions
  Word distributions
  LDA extensions
  Modeling tweet topics
  Word clouding
  Summary

12. Recommendation Systems
  Rating matrix
  Measuring user similarity
  Collaborative filtering
  User-based collaborative filtering
  Item-based collaborative filtering
  Singular value decomposition
  Predicting recommendations for movies and jokes
  Loading and pre-processing the data
  Exploring the data
  Evaluating binary top-N recommendations
  Evaluating non-binary top-N recommendations
  Evaluating individual predictions
  Other approaches to recommendation systems
  Summary

13. Scaling Up
  Starting the project
  Data definition
  Experience
  Data of scale – big data
  Using Excel to gauge your data
  Characteristics of big data
  Volume
  Varieties
  Sources and spans
  Structure
  Statistical noise
  Training models at scale
  Pain by phase
  Specific challenges
  Heterogeneity
  Scale
  Location
  Timeliness
  Privacy
  Collaborations
  Reproducibility
  A path forward
  Opportunities
  Bigger data, bigger hardware
  Breaking up
  Sampling
  Aggregation
  Dimensional reduction
  Alternatives
  Chunking
  Alternative language integrations
  Summary

14. Deep Learning
  Machine learning or deep learning
  What is deep learning?
  An alternative to manual instruction
  Growing importance
  Deeper data?
  Deep learning for IoT
  Use cases
  Word embedding
  Word prediction
  Word vectors
  Numerical representations of contextual similarities
  Netflix learns
  Implementations
  Deep learning architectures
  Artificial neural networks
  Recurrent neural networks
  Summary

Index
