Variables

Like most other languages, R lets you assign values to variables and refer to them by name. In R, the assignment operator is <-. Usually, this is pronounced as “gets.” For example, the statement:

x <- 1

is usually read as “x gets 1.” (If you’ve ever done any work with theoretical computer science, you’ll probably like this notation: it looks just like algorithm pseudocode.)

After you assign a value to a variable, the R interpreter will substitute that value in place of the variable name when it evaluates an expression. Here’s a simple example:

> x <- 1
> y <- 2
> z <- c(x,y)
> # evaluate z to see what's stored as z
> z
[1] 1 2

Notice that the substitution happens when the value is assigned to z, not when z is evaluated. Suppose that you typed in the preceding three expressions and then changed the value of y. The value of z would not change:

> y <- 4
> z
[1] 1 2

I’ll talk more about the subtleties of variables and how they’re evaluated in Chapter 8.

R provides several different ways to refer to a member (or set of members) of a vector. You can refer to elements by their location in a vector (R counts positions starting at 1), or select them with a condition:

> b <- c(1,2,3,4,5,6,7,8,9,10,11,12)
> b
 [1]  1  2  3  4  5  6  7  8  9 10 11 12
> # let's fetch the 7th item in vector b
> b[7]
[1] 7
> # fetch items 1 through 6
> b[1:6]
[1] 1 2 3 4 5 6
> # fetch only members of b that are congruent to zero (mod 3)
> # (in non-math speak, members that are multiples of 3)
> b[b %% 3 == 0]
[1]  3  6  9 12

You can fetch multiple items in a vector by specifying the indices of each item as an integer vector:

> # fetch items 1 through 6
> b[1:6]
[1] 1 2 3 4 5 6
> # fetch 1, 6, 11
> b[c(1,6,11)]
[1]  1  6 11

You can fetch items out of order. Items are returned in the order they are referenced:

> b[c(8,4,9)]
[1] 8 4 9

You can also specify which items to fetch through a logical vector. As an example, let’s fetch only multiples of 3 (by selecting items that are congruent to 0 mod 3):

> b %% 3 == 0
 [1] FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE
[12]  TRUE
> b[b %% 3 == 0]
[1]  3  6  9 12
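As a further sketch of logical indexing, conditions can be combined with the element-wise & operator, and the which function returns the positions at which a logical vector is TRUE. The vector b is the one defined above; the names evens_over_6 and positions are made up here for illustration:

```r
b <- c(1,2,3,4,5,6,7,8,9,10,11,12)
# keep only members that are even and also greater than 6
evens_over_6 <- b[b %% 2 == 0 & b > 6]
evens_over_6
# [1]  8 10 12
# which() turns a logical vector into the integer positions that are TRUE
positions <- which(b %% 3 == 0)
positions
# [1]  3  6  9 12
```

Because b holds the numbers 1 through 12 in order, the positions returned by which happen to match the values themselves; in general they would differ.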

In R, there are two additional operators that can be used for assigning values to symbols. First, you can use a single equals sign (“=”) for assignment.^[8] This operator assigns the symbol on the left to the object on the right. In many other languages, all assignment statements use equals signs. If you are more comfortable with this notation, you are free to use it. However, I will be using only the <- assignment operator in this book because I think it is easier to read. Whichever notation you prefer, be careful: the = operator does not mean “equals.” To test whether two values are equal, you need to use the == operator:

> one <- 1
> two <- 2
> # This means: assign the value of "two" to the variable "one"
> one = two
> one
[1] 2
> two
[1] 2
> # let's start again
> one <- 1
> two <- 2
> # This means: does the value of "one" equal the value of "two"
> one == two
[1] FALSE

In R, you can also use the -> operator to assign an object on the left to a symbol on the right:

> 3 -> three
> three
[1] 3

In some programming contexts, this notation might help you write clearer code. (It may also be convenient if you type in a long expression and then realize that you have forgotten to assign the result to a symbol.)
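Here is a minimal sketch of that second case: a longer expression is typed first, and the name is supplied only at the end. The symbol average is a made-up name for illustration:

```r
# compute a mean by hand, then assign the result to a symbol on the right
sum(c(1, 2, 3)) / length(c(1, 2, 3)) -> average
average
# [1] 2
```

The whole expression to the left of -> is evaluated before the assignment takes place, so no extra parentheses are needed here.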

A function in R is just another object that is assigned to a symbol. You can define your own functions in R, assign them a name, and then call them just like the built-in functions:

> f <- function(x,y) {c(x+1, y+1)}
> f(1,2)
[1] 2 3

This leads to a very useful trick. You can often type the name of a function to see the code for it. Here’s an example:

> f
function(x,y) {c(x+1, y+1)}
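Because a function is an ordinary object, it can also be assigned to a second symbol and called through that name. The sketch below reuses f from above; the name g is hypothetical and chosen only for this illustration:

```r
f <- function(x,y) {c(x+1, y+1)}
# bind the same function object to a second symbol
g <- f
g(1,2)
# [1] 2 3
# the object stored under f has class "function"
class(f)
# [1] "function"
```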

^[8] Note that you cannot use the <- operator to name arguments when calling a function; you need to map values to argument names using the “=” symbol. Using the <- operator inside a function call will assign the value to the variable in the calling environment and then pass the resulting value to the function as an unnamed (positional) argument. This might be what you want, but it probably isn’t.
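The difference described in this note can be sketched with the function f defined earlier. The argument values 1 and 10 are arbitrary, picked only to make the two results easy to tell apart:

```r
f <- function(x,y) {c(x+1, y+1)}
# with =, each value is matched to the argument of that name
f(y = 1, x = 10)
# [1] 11  2
# with <-, variables y and x are created in the calling environment,
# and the values are passed by position: 1 becomes x and 10 becomes y
f(y <- 1, x <- 10)
# [1]  2 11
y
# [1] 1
x
# [1] 10
```

In the second call, the names on the left of <- have no effect on argument matching; they only leave new variables behind in the workspace.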