R in a Nutshell
Preface
Why I Wrote This Book
When Should You Use R?
What’s New in the Second Edition?
R License Terms
Examples
How This Book Is Organized
Conventions Used in This Book
Using Code Examples
Safari® Books Online
How to Contact Us
Acknowledgments
I. R Basics
1. Getting and Installing R
R Versions
Getting and Installing Interactive R Binaries
Windows
Mac OS X
Linux and Unix Systems
Installation using package management systems
Installing R from downloaded files
2. The R User Interface
The R Graphical User Interface
Windows
Mac OS X
Linux and Unix
The R Console
Command-Line Editing
Batch Mode
Using R Inside Microsoft Excel
RStudio
Other Ways to Run R
3. A Short R Tutorial
Basic Operations in R
Functions
Variables
Introduction to Data Structures
Objects and Classes
Models and Formulas
Charts and Graphics
Getting Help
4. R Packages
An Overview of Packages
Listing Packages in Local Libraries
Loading Packages
Loading Packages on Windows and Linux
Loading Packages on Mac OS X
Exploring Package Repositories
Exploring R Package Repositories on the Web
Finding and Installing Packages Inside R
Windows and Linux GUIs
Mac OS X GUI
R console
Installing from the command line
Installing Packages From Other Repositories
Custom Packages
Creating a Package Directory
Building the Package
II. The R Language
5. An Overview of the R Language
Expressions
Objects
Symbols
Functions
Objects Are Copied in Assignment Statements
Everything in R Is an Object
Special Values
NA
Inf and -Inf
NaN
NULL
Coercion
The R Interpreter
Seeing How R Works
6. R Syntax
Constants
Numeric Vectors
Character Vectors
Symbols
Operators
Order of Operations
Assignments
Expressions
Separating Expressions
Parentheses
Curly Braces
Control Structures
Conditional Statements
Loops
Accessing Data Structures
Data Structure Operators
Indexing by Integer Vector
Indexing by Logical Vector
Indexing by Name
R Code Style Standards
7. R Objects
Primitive Object Types
Vectors
Lists
Other Objects
Matrices
Arrays
Factors
Data Frames
Formulas
Time Series
Shingles
Dates and Times
Connections
Attributes
Class
8. Symbols and Environments
Symbols
Working with Environments
The Global Environment
Environments and Functions
Working with the Call Stack
Evaluating Functions in Different Environments
Adding Objects to an Environment
Exceptions
Signaling Errors
Catching Errors
9. Functions
The Function Keyword
Arguments
Return Values
Functions as Arguments
Anonymous Functions
Properties of Functions
Argument Order and Named Arguments
Side Effects
Changes to Other Environments
Input/Output
Graphics
10. Object-Oriented Programming
Overview of Object-Oriented Programming in R
Key Ideas
Implementation Example
Object-Oriented Programming in R: S4 Classes
Defining Classes
New Objects
Accessing Slots
Working with Objects
Creating Coercion Methods
Methods
Managing Methods
Basic Classes
More Help
Old-School OOP in R: S3
S3 Classes
S3 Methods
Using S3 Classes in S4 Classes
Finding Hidden S3 Methods
III. Working with Data
11. Saving, Loading, and Editing Data
Entering Data Within R
Entering Data Using R Commands
Using the Edit GUI
Windows Data Editor
Mac OS X Data Editor
X Windows (Linux) Data Editor
Saving and Loading R Objects
Saving Objects with save
Importing Data from External Files
Text Files
Delimited files
Fixed-width files
Other functions to parse data
Other Software
Exporting Data
Importing Data From Databases
Export Then Import
Database Connection Packages
RODBC
Getting RODBC working
Installing the RODBC package
Installing ODBC drivers
Example: SQLite ODBC on Mac OS X
Example: SQLite ODBC on Windows
Using RODBC
Opening a channel
Getting information about the database
Getting data
Closing a channel
DBI
Opening a connection
Getting DB information
Querying the database
Cleaning up
TSDBI
Getting Data from Hadoop
12. Preparing Data
Combining Data Sets
Pasting Together Data Structures
Paste
rbind and cbind
An extended example
Merging Data by Common Fields
Transformations
Reassigning Variables
The Transform Function
Applying a Function to Each Element of an Object
Applying a function to an array
Applying a function to a list or vector
The plyr library
Binning Data
Shingles
Cut
Combining Objects with a Grouping Variable
Subsets
Bracket Notation
subset Function
Random Sampling
Summarizing Functions
tapply, aggregate
Aggregating Tables with rowsum
Counting Values
Reshaping Data
Transposing matrices and data frames
Reshaping data frames and matrices
Using the Reshape Library
Melting and Casting
Examples of reshape
melt
Cast
Data Cleaning
Finding and Removing Duplicates
Sorting
IV. Data Visualization
13. Graphics
An Overview of R Graphics
Scatter Plots
Plotting Time Series
Bar Charts
Pie Charts
Plotting Categorical Data
Three-Dimensional Data
Plotting Distributions
Box Plots
Graphics Devices
Customizing Charts
Common Arguments to Chart Functions
Graphical Parameters
Annotation
Margins
Multiple plots
Text properties
Text size
Typeface
Alignment and spacing
Rotation
Line properties
Colors
Axes
Points
Graphical parameters by name
Basic Graphics Functions
points
lines
curve
text
abline
polygon
segments
legend
title
axis
box
mtext
trans3d
14. Lattice Graphics
History
An Overview of the Lattice Package
How Lattice Works
A Simple Example
Using Lattice Functions
Custom Panel Functions
High-Level Lattice Plotting Functions
Univariate Trellis Plots
Bar charts
Dot plots
Histograms
Density plots
Strip plots
Univariate quantile-quantile plots
Bivariate Trellis Plots
Scatter plots
Box plots in lattice
Scatter plot matrices
Bivariate quantile-quantile plots
Trivariate Plots
Level plots
Contour plots
Cloud plots
Wire-frame plots
Other Plots
Customizing Lattice Graphics
Common Arguments to Lattice Functions
trellis.skeleton
Controlling How Axes Are Drawn
Parameters
plot.trellis
strip.default
simpleKey
Low-Level Functions
Low-Level Graphics Functions
Panel Functions
15. ggplot2
A Short Introduction
The Grammar of Graphics
A More Complex Example: Medicare Data
Quick Plot
Creating Graphics with ggplot2
Learning More
V. Statistics with R
16. Analyzing Data
Summary Statistics
Correlation and Covariance
Principal Components Analysis
Factor Analysis
Bootstrap Resampling
17. Probability Distributions
Normal Distribution
Common Distribution-Type Arguments
Distribution Function Families
18. Statistical Tests
Continuous Data
Normal Distribution-Based Tests
Comparing means
Comparing paired data
Comparing variances of two populations
Comparing means across more than two groups
Pairwise t-tests between multiple groups
Testing for normality
Testing if a data vector came from an arbitrary distribution
Testing if two data vectors came from the same distribution
Correlation tests
Non-Parametric Tests
Comparing two means
Comparing more than two means
Comparing variances
Difference in scale parameters
Discrete Data
Proportion Tests
Binomial Tests
Tabular Data Tests
Non-Parametric Tabular Data Tests
19. Power Tests
Experimental Design Example
t-Test Design
Proportion Test Design
ANOVA Test Design
20. Regression Models
Example: A Simple Linear Model
Fitting a Model
Helper Functions for Specifying the Model
Getting Information About a Model
Viewing the model
Predicting values using a model
Analyzing the fit
Refining the Model
Details About the lm Function
Assumptions of Least Squares Regression
Robust and Resistant Regression
Resistant regression
Robust regression
Comparing lm, lqs, and rlm
Subset Selection and Shrinkage Methods
Stepwise Variable Selection
Ridge Regression
Lasso and Least Angle Regression
elasticnet
Principal Components Regression and Partial Least Squares Regression
Nonlinear Models
Generalized Linear Models
glmnet
Nonlinear Least Squares
Survival Models
Smoothing
Splines
Fitting Polynomial Surfaces
Kernel Smoothing
Machine Learning Algorithms for Regression
Regression Tree Models
Recursive partitioning trees
Patient rule induction method
Bagging for regression
Boosting for regression
Random forests for regression
MARS
Neural Networks
Projection Pursuit Regression
Generalized Additive Models
Support Vector Machines
21. Classification Models
Linear Classification Models
Logistic Regression
Linear Discriminant Analysis
Log-Linear Models
Machine Learning Algorithms for Classification
k Nearest Neighbors
Classification Tree Models
Bagging
Boosting
Neural Networks
SVMs
Random Forests
22. Machine Learning
Market Basket Analysis
Clustering
Distance Measures
Clustering Algorithms
23. Time Series Analysis
Autocorrelation Functions
Time Series Models
VI. Additional Topics
24. Optimizing R Programs
Measuring R Program Performance
Timing
Profiling
Monitor How Much Memory You Are Using
Profiling Memory Usage
Optimizing Your R Code
Using Vector Operations
Iterative algorithms and vector operations
Transforming problems to use built-in functions
Lookup Performance in R
Lookups and R objects
Using environment objects in place of vectors
Use a Database to Query Large Data Sets
Preallocate Memory
Cleaning Up Memory
Functions for Big Data Sets
Other Ways to Speed Up R
The R Byte Code Compiler
Manual compilation
Inspecting byte code
Just-in-time compilation
High-Performance R Binaries
Revolution R
Building your own
Building on Microsoft Windows
Building R on Unix-like systems
Building R on Mac OS X
25. Bioconductor
An Example
Loading Raw Expression Data
Loading Data from GEO
Matching Phenotype Data
Analyzing Expression Data
Key Bioconductor Packages
Data Structures
eSet
AssayData
AnnotatedDataFrame
MIAME
Other Classes Used by Bioconductor Packages
Where to Go Next
Resources Outside Bioconductor
Vignettes
Courses
Books
26. R and Hadoop
R and Hadoop
Overview of Hadoop
Map/Reduce
Distributed data storage
Managing a cluster of servers
Java framework
When should you consider Hadoop?
RHadoop
Make sure Hadoop is installed locally
Installing RHadoop locally
An example RHadoop application
Details of rmr
Learning more
Hadoop Streaming
Learning More
Other Packages for Parallel Computation with R
Segue
doMC
Where to Learn More
A. R Reference
base
Functions
Data Sets
boot
Functions
Data Sets
class
Functions
cluster
Functions
Data Sets
codetools
foreign
Functions
grDevices
Functions
Data Sets
graphics
Functions
grid
KernSmooth
Functions
lattice
Functions
Data Sets
MASS
Functions
Data Sets
methods
Functions
mgcv
nlme
nnet
Functions
rpart
Functions
Data Sets
spatial
Functions
splines
Functions
stats
Functions
Data Set
stats4
Functions
survival
Functions
Data Sets
tcltk
tools
Functions
Data Sets
utils
Functions
Bibliography
Index
About the Author
Colophon
Copyright