Seeing How R Works

To end this overview of the R language, I want to share a few functions that are convenient for seeing how R works. As you may recall, R expressions are themselves R objects. This means that you can parse an expression without evaluating it, or evaluate it only partially, and see how R interprets it. This is useful both for learning how R works and for debugging R code.

As noted above, the R interpreter goes through several steps when evaluating a statement. The first step is to parse the statement, translating it into functional form. You can inspect the result of this step to see how R interprets a given expression. As an example, let’s use the same R code fragment that we used in The R Interpreter (the output shown assumes that x has already been assigned a value no greater than 1):

> if (x > 1) "orange" else "apple"
[1] "apple"

To show how this expression is parsed, we can use the quote function. This function parses its argument but does not evaluate it. Calling quote on an R expression returns a “language” object:

> typeof(quote(if (x > 1) "orange" else "apple"))
[1] "language"
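
To make the difference between parsing and evaluating concrete, here is a minimal sketch. The names e, first, and second and the values assigned to x are chosen only for illustration; eval is the base R function that evaluates a language object.

```r
# Assumption: x is an ordinary numeric variable; e, first, and second
# are illustrative names, not part of the original example.
x <- 0
e <- quote(if (x > 1) "orange" else "apple")  # parsed, but not evaluated

typeof(e)          # "language"
first <- eval(e)   # evaluated with x equal to 0, giving "apple"

x <- 2
second <- eval(e)  # same object, evaluated with x equal to 2, giving "orange"
```

The quoted expression is stored unchanged in e; only the call to eval looks up the current value of x.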

Unfortunately, the print function for language objects is not very informative:

> quote(if (x > 1) "orange" else "apple")
if (x > 1) "orange" else "apple"

However, it is possible to convert a language object into a list. Displaying the language object as a list shows how R has parsed the expression. This is the parse tree for the expression:

> as(quote(if (x > 1) "orange" else "apple"),"list")
[[1]]
`if`

[[2]]
x > 1

[[3]]
[1] "orange"

[[4]]
[1] "apple"

We can also apply the typeof function to every element in the list to see the type of each object in the parse tree:^[17]

> lapply(as(quote(if (x > 1) "orange" else "apple"), "list"),typeof)
[[1]]
[1] "symbol"

[[2]]
[1] "language"

[[3]]
[1] "character"

[[4]]
[1] "character"

This output shows how the expression is interpreted. Notice that some parts of the if-then statement do not appear in the parsed expression (in particular, the else keyword). Also notice that the first item in the list is a symbol; here, the symbol refers to the if function. So, although the syntax of the if-then statement differs from that of a function call, the R parser translates the expression into a function call before it is evaluated. The function name is the first item in the list, and the arguments are the remaining items.
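
One way to check this reading is to call the if function directly and to rebuild the call from its list form. The following is a minimal sketch; the variable names are illustrative, and as.call is the base R function that turns a list back into a call.

```r
# Assumption: x is a numeric variable chosen for illustration.
x <- 0

# Ordinary syntax and the equivalent direct call to the if function
usual  <- if (x > 1) "orange" else "apple"
direct <- `if`(x > 1, "orange", "apple")

# Take the parse tree apart, then put it back together with as.call
parts   <- as.list(quote(if (x > 1) "orange" else "apple"))
rebuilt <- as.call(parts)

# The rebuilt call is the same language object as the quoted original
same <- identical(rebuilt, quote(if (x > 1) "orange" else "apple"))
```

Here usual and direct hold the same value, and the list has four items: the function name followed by its three arguments.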

For constants, there is only one item in the returned list:

> as.list(quote(1))
[[1]]
[1] 1
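
As a small illustration of the difference between constants, names, and calls, the sketch below applies typeof to a few quoted expressions. The variable names are hypothetical and exist only so the results can be inspected.

```r
# Quoting a constant returns the constant itself, not a call
num_type <- typeof(quote(1))        # "double"
chr_type <- typeof(quote("apple"))  # "character"

# Quoting a bare name returns a symbol
sym_type <- typeof(quote(x))        # "symbol"

# Only an actual function call has type "language"
call_type <- typeof(quote(x + 1))   # "language"

# The quoted constant is indistinguishable from the constant
same_constant <- identical(quote(1), 1)
```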

By using the quote function, you can see that many constructions in the R language are just syntactic sugar for function calls. For example, consider looking up the second item in a vector x. The standard way to do this is with R’s bracket notation, so the expression would be x[2]. An alternative way to write this expression is as a function call: `[`(x,2). (Function names that contain special characters need to be enclosed in backquotes.) R interprets both of these expressions the same way:

> as.list(quote(x[2]))
[[1]]
`[`

[[2]]
x

[[3]]
[1] 2

> as.list(quote(`[`(x,2)))
[[1]]
`[`

[[2]]
x

[[3]]
[1] 2
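
The same check can be made in code, and it extends to other operators. This is a sketch under the assumption that x is a small numeric vector; the variable names are illustrative.

```r
# Assumption: x is a numeric vector chosen for illustration.
x <- c(10, 20, 30)

# Bracket notation and the function form return the same value
bracket     <- x[2]
as_function <- `[`(x, 2)

# Both spellings produce the same parse tree
same_tree <- identical(quote(x[2]), quote(`[`(x, 2)))

# Arithmetic and assignment are function calls too: the first
# item of each parse tree is the symbol naming the function
plus_head   <- as.list(quote(x + 1))[[1]]       # the symbol `+`
assign_head <- as.list(quote(y <- x + 1))[[1]]  # the symbol `<-`
```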

As you can see, R translates both expressions into the same parse tree. This means that parsing is not reversible: from the parse tree alone, you cannot tell which form was originally typed. The deparse function takes a parse tree and turns it back into properly formatted R code, always using standard R syntax. Here’s how it acts on these two bits of code:

> deparse(quote(x[2]))
[1] "x[2]"
> deparse(quote(`[`(x,2)))
[1] "x[2]"
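
The sketch below shows the same behavior for an arithmetic operator and checks that deparsed text can be parsed back into the same tree. The variable names are illustrative; parse is the base R function that turns text into an expression vector, and taking its first element is one way to get at the call itself.

```r
# A call written in function form deparses to standard operator syntax
plus_text <- deparse(quote(`+`(x, 1)))   # "x + 1"

# Round trip: deparse a parse tree, parse the text, compare the trees
original <- quote(`[`(x, 2))
text     <- deparse(original)            # "x[2]"
reparsed <- parse(text = text)[[1]]
round_trip_ok <- identical(reparsed, original)
```

The text form changes from what was typed, but the parse tree survives the round trip.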

As you read through this book, you might want to try using quote, substitute, typeof, class, and methods to see how the R interpreter parses expressions.
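
For experiments of this kind, a small helper can save typing. The function show_parse below is hypothetical (it is not part of R); it is a sketch that uses substitute to capture its argument without evaluating it and then reports how R parsed that argument.

```r
# show_parse is a hypothetical helper written for this illustration.
show_parse <- function(expr) {
  e <- substitute(expr)  # capture the argument unevaluated
  list(
    code  = deparse(e),  # the expression as R code
    type  = typeof(e),   # storage type of the whole expression
    class = class(e),    # class of the whole expression
    # type of each item in the parse tree (or of the object itself)
    parts = if (is.call(e)) vapply(as.list(e), typeof, character(1)) else typeof(e)
  )
}

# x does not need to exist, because nothing is evaluated
r1 <- show_parse(x[2])
r2 <- show_parse(if (x > 1) "orange" else "apple")
```

For x[2], the helper reports the class "call" and the item types "symbol", "symbol", and "double"; for the if-then statement, it reports the class "if".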

^[17] As a convenient shorthand, you can omit the as function because R will automatically coerce the language object to a list. This means you can just use a command like:

> lapply(quote(if (x > 1) "orange" else "apple"),typeof)

Coercion is explained in Coercion.