Contents

Cover

Title Page

Copyright

Preface

Acknowledgements

Chapter 1: Getting Started

1.1 How to use this book

1.2 Installing R

1.3 Running R

1.4 The Comprehensive R Archive Network

1.5 Getting help in R

1.6 Packages in R

1.7 Command line versus scripts

1.8 Data editor

1.9 Changing the look of the R screen

1.10 Good housekeeping

1.11 Linking to other computer languages

Chapter 2: Essentials of the R Language

2.1 Calculations

2.2 Logical operations

2.3 Generating sequences

2.4 Membership: Testing and coercing in R

2.5 Missing values, infinity and things that are not numbers

2.6 Vectors and subscripts

2.7 Vector functions

2.8 Matrices and arrays

2.9 Random numbers, sampling and shuffling

2.10 Loops and repeats

2.11 Lists

2.12 Text, character strings and pattern matching

2.13 Dates and times in R

2.14 Environments

2.15 Writing R functions

2.16 Writing from R to file

2.17 Programming tips

Chapter 3: Data Input

3.1 Data input from the keyboard

3.2 Data input from files

3.3 Input from files using scan

3.4 Reading data from a file using readLines

3.5 Warnings when you attach the dataframe

3.6 Masking

3.7 Input and output formats

3.8 Checking files from the command line

3.9 Reading dates and times from files

3.10 Built-in data files

3.11 File paths

3.12 Connections

3.13 Reading data from an external database

Chapter 4: Dataframes

4.1 Subscripts and indices

4.2 Selecting rows from the dataframe at random

4.3 Sorting dataframes

4.4 Using logical conditions to select rows from the dataframe

4.5 Omitting rows containing missing values, NA

4.6 Using order and !duplicated to eliminate pseudoreplication

4.7 Complex ordering with mixed directions

4.8 A dataframe with row names instead of row numbers

4.9 Creating a dataframe from another kind of object

4.10 Eliminating duplicate rows from a dataframe

4.11 Dates in dataframes

4.12 Using the match function in dataframes

4.13 Merging two dataframes

4.14 Adding margins to a dataframe

4.15 Summarizing the contents of dataframes

Chapter 5: Graphics

5.1 Plots with two variables

5.2 Plotting with two continuous explanatory variables: Scatterplots

5.3 Adding other shapes to a plot

5.4 Drawing mathematical functions

5.5 Shape and size of the graphics window

5.6 Plotting with a categorical explanatory variable

5.7 Plots for single samples

5.8 Plots with multiple variables

5.9 Special plots

5.10 Saving graphics to file

5.11 Summary

Chapter 6: Tables

6.1 Tables of counts

6.2 Summary tables

6.3 Expanding a table into a dataframe

6.4 Converting from a dataframe to a table

6.5 Calculating tables of proportions with prop.table

6.6 The scale function

6.7 The expand.grid function

6.8 The model.matrix function

6.9 Comparing table and tabulate

Chapter 7: Mathematics

7.1 Mathematical functions

7.2 Probability functions

7.3 Continuous probability distributions

7.4 Discrete probability distributions

7.5 Matrix algebra

7.6 Solving systems of linear equations using matrices

7.7 Calculus

Chapter 8: Classical Tests

8.1 Single samples

8.2 Bootstrap in hypothesis testing

8.3 Skew and kurtosis

8.4 Two samples

8.5 Tests on paired samples

8.6 The sign test

8.7 Binomial test to compare two proportions

8.8 Chi-squared contingency tables

8.9 Correlation and covariance

8.10 Kolmogorov–Smirnov test

8.11 Power analysis

8.12 Bootstrap

Chapter 9: Statistical Modelling

9.1 First things first

9.2 Maximum likelihood

9.3 The principle of parsimony (Occam's razor)

9.4 Types of statistical model

9.5 Steps involved in model simplification

9.6 Model formulae in R

9.7 Multiple error terms

9.8 The intercept as parameter 1

9.9 The update function in model simplification

9.10 Model formulae for regression

9.11 Box–Cox transformations

9.12 Model criticism

9.13 Model checking

9.14 Influence

9.15 Summary of statistical models in R

9.16 Optional arguments in model-fitting functions

9.17 Akaike's information criterion

9.18 Leverage

9.19 Misspecified model

9.20 Model checking in R

9.21 Extracting information from model objects

9.22 The summary tables for continuous and categorical explanatory variables

9.23 Contrasts

9.24 Model simplification by stepwise deletion

9.25 Comparison of the three kinds of contrasts

9.26 Aliasing

9.27 Orthogonal polynomial contrasts: contr.poly

9.28 Summary of statistical modelling

Chapter 10: Regression

10.1 Linear regression

10.2 Polynomial approximations to elementary functions

10.3 Polynomial regression

10.4 Fitting a mechanistic model to data

10.5 Linear regression after transformation

10.6 Prediction following regression

10.7 Testing for lack of fit in a regression

10.8 Bootstrap with regression

10.9 Jackknife with regression

10.10 Jackknife after bootstrap

10.11 Serial correlation in the residuals

10.12 Piecewise regression

10.13 Multiple regression

Chapter 11: Analysis of Variance

11.1 One-way ANOVA

11.2 Factorial experiments

11.3 Pseudoreplication: Nested designs and split plots

11.4 Variance components analysis

11.5 Effect sizes in ANOVA: aov or lm?

11.6 Multiple comparisons

11.7 Multivariate analysis of variance

Chapter 12: Analysis of Covariance

12.1 Analysis of covariance in R

12.2 ANCOVA and experimental design

12.3 ANCOVA with two factors and one continuous covariate

12.4 Contrasts and the parameters of ANCOVA models

12.5 Order matters in summary.aov

Chapter 13: Generalized Linear Models

13.1 Error structure

13.2 Linear predictor

13.3 Link function

13.4 Proportion data and binomial errors

13.5 Count data and Poisson errors

13.6 Deviance: Measuring the goodness of fit of a GLM

13.7 Quasi-likelihood

13.8 The quasi family of models

13.9 Generalized additive models

13.10 Offsets

13.11 Residuals

13.12 Overdispersion

13.13 Bootstrapping a GLM

13.14 Binomial GLM with ordered categorical variables

Chapter 14: Count Data

14.1 A regression with Poisson errors

14.2 Analysis of deviance with count data

14.3 Analysis of covariance with count data

14.4 Frequency distributions

14.5 Overdispersion in log-linear models

14.6 Negative binomial errors

Chapter 15: Count Data in Tables

15.1 A two-class table of counts

15.2 Sample size for count data

15.3 A four-class table of counts

15.4 Two-by-two contingency tables

15.5 Using log-linear models for simple contingency tables

15.6 The danger of contingency tables

15.7 Quasi-Poisson and negative binomial models compared

15.8 A contingency table of intermediate complexity

15.9 Schoener's lizards: A complex contingency table

15.10 Plot methods for contingency tables

15.11 Graphics for count data: Spine plots and spinograms

Chapter 16: Proportion Data

16.1 Analyses of data on one and two proportions

16.2 Count data on proportions

16.3 Odds

16.4 Overdispersion and hypothesis testing

16.5 Applications

16.6 Averaging proportions

16.7 Summary of modelling with proportion count data

16.8 Analysis of covariance with binomial data

16.9 Converting complex contingency tables to proportions

Chapter 17: Binary Response Variables

17.1 Incidence functions

17.2 Graphical tests of the fit of the logistic to data

17.3 ANCOVA with a binary response variable

17.4 Binary response with pseudoreplication

Chapter 18: Generalized Additive Models

18.1 Non-parametric smoothers

18.2 Generalized additive models

18.3 An example with strongly humped data

18.4 Generalized additive models with binary data

18.5 Three-dimensional graphic output from gam

Chapter 19: Mixed-Effects Models

19.1 Replication and pseudoreplication

19.2 The lme and lmer functions

19.3 Best linear unbiased predictors

19.4 Designed experiments with different spatial scales: Split plots

19.5 Hierarchical sampling and variance components analysis

19.6 Mixed-effects models with temporal pseudoreplication

19.7 Time series analysis in mixed-effects models

19.8 Random effects in designed experiments

19.9 Regression in mixed-effects models

19.10 Generalized linear mixed models

Chapter 20: Non-Linear Regression

20.1 Comparing Michaelis–Menten and asymptotic exponential

20.2 Generalized additive models

20.3 Grouped data for non-linear estimation

20.4 Non-linear time series models (temporal pseudoreplication)

20.5 Self-starting functions

20.6 Bootstrapping a family of non-linear regressions

Chapter 21: Meta-Analysis

21.1 Effect size

21.2 Weights

21.3 Fixed versus random effects

21.4 Random-effects meta-analysis of binary data

Chapter 22: Bayesian Statistics

22.1 Background

22.2 A continuous response variable

22.3 Normal prior and normal likelihood

22.4 Priors

22.5 Bayesian statistics for realistically complicated models

22.6 Practical considerations

22.7 Writing BUGS models

22.8 Packages in R for carrying out Bayesian analysis

22.9 Installing JAGS on your computer

22.10 Running JAGS in R

22.11 MCMC for a simple linear regression

22.12 MCMC for a model with temporal pseudoreplication

22.13 MCMC for a model with binomial errors

Chapter 23: Tree Models

23.1 Background

23.2 Regression trees

23.3 Using rpart to fit tree models

23.4 Tree models as regressions

23.5 Model simplification

23.6 Classification trees with categorical explanatory variables

23.7 Classification trees for replicated data

23.8 Testing for the existence of humps

Chapter 24: Time Series Analysis

24.1 Nicholson's blowflies

24.2 Moving average

24.3 Seasonal data

24.4 Built-in time series functions

24.5 Decompositions

24.6 Testing for a trend in the time series

24.7 Spectral analysis

24.8 Multiple time series

24.9 Simulated time series

24.10 Time series models

Chapter 25: Multivariate Statistics

25.1 Principal components analysis

25.2 Factor analysis

25.3 Cluster analysis

25.4 Hierarchical cluster analysis

25.5 Discriminant analysis

25.6 Neural networks

Chapter 26: Spatial Statistics

26.1 Point processes

26.2 Nearest neighbours

26.3 Tests for spatial randomness

26.4 Packages for spatial statistics

26.5 Geostatistical data

26.6 Regression models with spatially correlated errors: Generalized least squares

26.7 Creating a dot-distribution map from a relational database

Chapter 27: Survival Analysis

27.1 A Monte Carlo experiment

27.2 Background

27.3 The survivor function

27.4 The density function

27.5 The hazard function

27.6 The exponential distribution

27.7 Kaplan–Meier survival distributions

27.8 Age-specific hazard models

27.9 Survival analysis in R

27.10 Parametric analysis

27.11 Cox's proportional hazards

27.12 Models with censoring

Chapter 28: Simulation Models

28.1 Temporal dynamics: Chaotic dynamics in population size

28.2 Temporal and spatial dynamics: A simulated random walk in two dimensions

28.3 Spatial simulation models

28.4 Pattern generation resulting from dynamic interactions

Chapter 29: Changing the Look of Graphics

29.1 Graphs for publication

29.2 Colour

29.3 Cross-hatching

29.4 Grey scale

29.5 Coloured convex hulls and other polygons

29.6 Logarithmic axes

29.7 Different font families for text

29.8 Mathematical and other symbols on plots

29.9 Phase planes

29.10 Fat arrows

29.11 Three-dimensional plots

29.12 Complex 3D plots with wireframe

29.13 An alphabetical tour of the graphics parameters

29.14 Trellis graphics

References and Further Reading

Index