Declarations

Typically, compiled languages require that you declare variables; that is, warn the interpreter/compiler of the variables’ existence before using them. This is the case in our earlier C example:

int x;
int y[3];

As with most scripting languages (such as Python and Perl), you do not declare variables in R. For instance, consider this code:

z <- 3

This code, with no previous reference to z, is perfectly legal (and commonplace).

However, if you reference specific elements of a vector, you must warn R. For instance, say we wish y to be a two-component vector with values 5 and 12. If y does not yet exist, the following will not work; R reports that the object y is not found:

> y[1] <- 5
> y[2] <- 12

Instead, you must create y first, for instance this way:

> y <- vector(length=2)
> y[1] <- 5
> y[2] <- 12

The following will also work:

> y <- c(5,12)

This approach is all right because on the right-hand side we are creating a new vector, to which we then bind y.
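As a minimal sketch of the difference, the following script first attempts the element assignment with no y in existence, then repeats it after creating y. It assumes that no object named y exists anywhere on the search path once the first line has run; the name msg and the use of tryCatch() to capture the error are illustrative choices, not part of the original example.

```r
# Start from a clean slate: remove any y in the current environment.
if (exists("y", inherits = FALSE)) rm(y)

# Assigning to an element of a nonexistent vector raises an error.
# tryCatch() captures the error message instead of halting the script.
msg <- tryCatch({
  y[1] <- 5
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# Creating the vector first makes the element assignments legal.
y <- vector(length = 2)   # a logical vector: FALSE FALSE
y[1] <- 5                 # y is coerced to numeric at this point
y[2] <- 12
print(y)
```

Note that vector(length=2) produces a logical vector by default; R converts it to numeric on the first numeric assignment, so the final y holds 5 and 12.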

The reason we cannot suddenly spring an expression like y[2] on R stems from R’s functional language nature. The reading and writing of individual vector elements are actually handled by functions (named [ and [<-, respectively). If R doesn’t already know that y is a vector, these functions have nothing on which to act.
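To make the point concrete, here is a small sketch that calls the element-access functions directly. In R, the function that writes an element is named `[<-` and the one that reads an element is named `[`; the variable second is a hypothetical name used only for illustration.

```r
y <- c(5, 12)

# The usual element assignment...
y[2] <- 8

# ...is handled by the replacement function `[<-`, which takes the
# vector, the index, and the new value, and returns a modified vector.
y <- `[<-`(y, 2, value = 20)
print(y)

# Reading an element is likewise a call to the function `[`.
second <- `[`(y, 2)
print(second)
```

Since `[<-` needs an existing vector as its first argument, there is nothing to pass when y has never been created.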

Speaking of binding, just as variables are not declared, they are not constrained in terms of mode. The following sequence of events is perfectly valid:

> x <- c(1,5)
> x
[1] 1 5
> x <- "abc"

First, x is associated with a numeric vector, then with a string. (Again, for C/C++ programmers: x is nothing more than a pointer, which can point to different types of objects at different times.)
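One way to watch the rebinding happen is to query the mode of x before and after the second assignment, as in this brief sketch. The names mode_before and mode_after are hypothetical and serve only to record the results.

```r
# Bind x to a numeric vector and record its mode.
x <- c(1, 5)
mode_before <- mode(x)
print(mode_before)

# Rebind the same name to a character string; no declaration is needed.
x <- "abc"
mode_after <- mode(x)
print(mode_after)
```

The name x itself carries no type; the mode belongs to whichever object x currently refers to.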