R Statistical Application Development by Example Beginner's Guide
Table of Contents
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Time for action – heading
What just happened?
Pop quiz – heading
Have a go hero – heading
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. Data Characteristics
Questionnaire and its components
Understanding the data characteristics in an R environment
Experiments with uncertainty in computer science
R installation
Using R packages
RSADBE – the book's R package
Discrete distribution
Discrete uniform distribution
Binomial distribution
Hypergeometric distribution
Negative binomial distribution
Poisson distribution
Continuous distribution
Uniform distribution
Exponential distribution
Normal distribution
Summary
2. Import/Export Data
data.frame and other formats
Constants, vectors, and matrices
Time for action – understanding constants, vectors, and basic arithmetic
What just happened?
Time for action – matrix computations
What just happened?
The list object
Time for action – creating a list object
What just happened?
The data.frame object
Time for action – creating a data.frame object
What just happened?
Have a go hero
The table object
Time for action – creating the Titanic dataset as a table object
What just happened?
Have a go hero
read.csv, read.xls, and the foreign package
Time for action – importing data from external files
What just happened?
What just happened?
Importing data from MySQL
Exporting data/graphs
Exporting R objects
Exporting graphs
Time for action – exporting a graph
What just happened?
Managing an R session
Time for action – session management
What just happened?
Have a go hero
Summary
3. Data Visualization
Visualization techniques for categorical data
Bar charts
Going through the built-in examples of R
Time for action – bar charts in R
What just happened?
Have a go hero
Dot charts
Time for action – dot charts in R
What just happened?
Spine and mosaic plots
Time for action – the spine plot for the shift and operator data
What just happened?
Time for action – the mosaic plot for the Titanic dataset
What just happened?
Pie charts and the fourfold plot
Visualization techniques for continuous variable data
Boxplot
Time for action – using the boxplot
What just happened?
Histograms
Time for action – understanding the effectiveness of histograms
What just happened?
Scatter plots
Time for action – plot and pairs R functions
What just happened?
Pareto charts
A brief peek at ggplot2
Time for action – qplot
What just happened?
Time for action – ggplot
What just happened?
Have a go hero
Summary
4. Exploratory Analysis
Essential summary statistics
Percentiles, quantiles, and median
Hinges
The interquartile range
Time for action – the essential summary statistics for "The Wall" dataset
What just happened?
The stem-and-leaf plot
Time for action – the stem function in play
What just happened?
Letter values
Data re-expression
Have a go hero
Bagplot – a bivariate boxplot
Time for action – the bagplot display for a multivariate dataset
What just happened?
The resistant line
Time for action – the resistant line as a first regression model
What just happened?
Smoothing data
Time for action – smoothening the cow temperature data
What just happened?
Median polish
Time for action – the median polish algorithm
What just happened?
Have a go hero
Summary
5. Statistical Inference
Maximum likelihood estimator
Visualizing the likelihood function
Time for action – visualizing the likelihood function
What just happened?
Finding the maximum likelihood estimator
Using the fitdistr function
Time for action – finding the MLE using mle and fitdistr functions
What just happened?
Confidence intervals
Time for action – confidence intervals
What just happened?
Hypotheses testing
Binomial test
Time for action – testing the probability of success
What just happened?
Tests of proportions and the chi-square test
Time for action – testing proportions
What just happened?
Tests based on normal distribution – one-sample
Time for action – testing one-sample hypotheses
What just happened?
Have a go hero
Tests based on normal distribution – two-sample
Time for action – testing two-sample hypotheses
What just happened?
Have a go hero
Summary
6. Linear Regression Analysis
The simple linear regression model
What happens to the arbitrary choice of parameters?
Time for action – the arbitrary choice of parameters
What just happened?
Building a simple linear regression model
Time for action – building a simple linear regression model
What just happened?
Have a go hero
ANOVA and the confidence intervals
Time for action – ANOVA and the confidence intervals
What just happened?
Model validation
Time for action – residual plots for model validation
What just happened?
Have a go hero
Multiple linear regression model
Averaging k simple linear regression models or a multiple linear regression model
Time for action – averaging k simple linear regression models
What just happened?
Building a multiple linear regression model
Time for action – building a multiple linear regression model
What just happened?
The ANOVA and confidence intervals for the multiple linear regression model
Time for action – the ANOVA and confidence intervals for the multiple linear regression model
What just happened?
Have a go hero
Useful residual plots
Time for action – residual plots for the multiple linear regression model
What just happened?
Regression diagnostics
Leverage points
Influential points
DFFITS and DFBETAS
The multicollinearity problem
Time for action – addressing the multicollinearity problem for the Gasoline data
What just happened?
Model selection
Stepwise procedures
The backward elimination
The forward selection
Criterion-based procedures
Time for action – model selection using the backward, forward, and AIC criteria
What just happened?
Have a go hero
Summary
7. The Logistic Regression Model
The binary regression problem
Time for action – limitations of linear regression models
What just happened?
Probit regression model
Time for action – understanding the constants
What just happened?
Logistic regression model
Time for action – fitting the logistic regression model
What just happened?
Hosmer-Lemeshow goodness-of-fit test statistic
Time for action – The Hosmer-Lemeshow goodness-of-fit statistic
What just happened?
Model validation and diagnostics
Residual plots for the GLM
Time for action – residual plots for the logistic regression model
What just happened?
Have a go hero
Influence and leverage for the GLM
Time for action – diagnostics for the logistic regression
What just happened?
Have a go hero
Receiving operator curves
Time for action – ROC construction
What just happened?
Logistic regression for the German credit screening dataset
Time for action – logistic regression for the German credit dataset
What just happened?
Have a go hero
Summary
8. Regression Models with Regularization
The overfitting problem
Time for action – understanding overfitting
What just happened?
Have a go hero
Regression spline
Basis functions
Piecewise linear regression model
Time for action – fitting piecewise linear regression models
What just happened?
Natural cubic splines and the general B-splines
Time for action – fitting the spline regression models
What just happened?
Ridge regression for linear models
Time for action – ridge regression for the linear regression model
What just happened?
Ridge regression for logistic regression models
Time for action – ridge regression for the logistic regression model
What just happened?
Another look at model assessment
Time for action – selecting lambda iteratively and other topics
What just happened?
Pop quiz
Summary
9. Classification and Regression Trees
Recursive partitions
Time for action – partitioning the display plot
What just happened?
Splitting the data
The first tree
Time for action – building our first tree
What just happened?
The construction of a regression tree
Time for action – the construction of a regression tree
What just happened?
The construction of a classification tree
Time for action – the construction of a classification tree
What just happened?
Classification tree for the German credit data
Time for action – the construction of a classification tree
What just happened?
Have a go hero
Pruning and other finer aspects of a tree
Time for action – pruning a classification tree
What just happened?
Pop quiz
Summary
10. CART and Beyond
Improving CART
Time for action – cross-validation predictions
What just happened?
Bagging
The bootstrap
Time for action – understanding the bootstrap technique
What just happened?
The bagging algorithm
Time for action – the bagging algorithm
What just happened?
Random forests
Time for action – random forests for the German credit data
What just happened?
The consolidation
Time for action – random forests for the low birth weight data
What just happened?
Summary
A. References
Index