R Data Science Essentials
Table of Contents
R Data Science Essentials
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Getting Started with R
Reading data from different sources
Reading data from a database
Data types in R
Variable data types
Data preprocessing techniques
Performing data operations
Arithmetic operations on the data
String operations on the data
Aggregation operations on the data
Mean
Median
Sum
Maximum and minimum
Standard deviation
Control structures in R
Control structures – if and else
Control structures – for
Control structures – while
Control structures – repeat and break
Control structures – next and return
Bringing data to a usable format
Summary
2. Exploratory Data Analysis
The Titanic dataset
Descriptive statistics
Box plot
Exercise
Inferential statistics
Univariate analysis
Bivariate analysis
Multivariate analysis
Cross-tabulation analysis
Graphical analysis
Summary
3. Pattern Discovery
Transactional datasets
Using the built-in dataset
Building the dataset
Apriori analysis
Support, confidence, and lift
Support
Confidence
Lift
Generating filtering rules
Plotting
Dataset
Rules
Sequential dataset
Apriori sequence analysis
Understanding the results
Reference
Business cases
Summary
4. Segmentation Using Clustering
Datasets
Reading and formatting the dataset in R
Centroid-based clustering and an ideal number of clusters
Implementation using K-means
Visualizing the clusters
Connectivity-based clustering
Visualizing the connectivity
Business use cases
Summary
5. Developing Regression Models
Datasets
Sampling the dataset
Logistic regression
Evaluating logistic regression
Linear regression
Evaluating linear regression
Methods to improve the accuracy
Ensemble models
Replacing NA with mean or median
Removing the highly correlated values
Removing outliers
Summary
6. Time Series Forecasting
Datasets
Extracting patterns
Forecasting using ARIMA
Forecasting using Holt-Winters
Methods to improve accuracy
Summary
7. Recommendation Engine
Dataset and transformation
Recommendations using user-based CF
Recommendations using item-based CF
Challenges and enhancements
Summary
8. Communicating Data Analysis
Dataset
Plotting using the googleVis package
Creating an interactive dashboard using Shiny
Summary
Index