As a typical R session progresses, you tend to accumulate a large number of objects. Various tools are available to manage them. Here, we’ll look at the following:
The ls() function
The rm() function
The save() function
Several functions that tell you more about the structure of an object, such as class() and mode()
The exists() function
The ls() command will list all of your current objects. A useful named argument for this function is pattern, which enables wildcard-style matching by means of a regular expression. With it, you tell ls() to list only the objects whose names match a specified pattern. The following is an example.
> ls()
 [1] "acc"       "acc05"     "binomci"   "cmeans"    "divorg"    "dv"
 [7] "fit"       "g"         "genxc"     "genxnt"    "j"         "lo"
[13] "out1"      "out1.100"  "out1.25"   "out1.50"   "out1.75"   "out2"
[19] "out2.100"  "out2.25"   "out2.50"   "out2.75"   "par.set"   "prpdf"
[25] "ratbootci" "simonn"    "vecprod"   "x"         "zout"      "zout.100"
[31] "zout.125"  "zout3"     "zout5"     "zout.50"   "zout.75"
> ls(pattern="ut")
 [1] "out1"     "out1.100" "out1.25"  "out1.50"  "out1.75"  "out2"
 [7] "out2.100" "out2.25"  "out2.50"  "out2.75"  "zout"     "zout.100"
[13] "zout.125" "zout3"    "zout5"    "zout.50"  "zout.75"
In the second case, we asked for a list of all objects whose names include the string "ut".
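Because pattern is interpreted as a regular expression, you can also anchor the match to the start or end of a name. The following is a minimal sketch under stated assumptions: the object names are hypothetical, and they are created in a scratch environment (e) so that the example leaves your own workspace alone.

```r
# Hypothetical objects placed in a scratch environment so the
# example does not touch your workspace.
e <- new.env()
for (nm in c("out1", "out1.25", "out2", "zout", "zout.50", "x")) {
  assign(nm, 0, envir = e)
}

has_ut     <- ls(envir = e, pattern = "ut")      # names containing "ut"
starts_out <- ls(envir = e, pattern = "^out")    # names beginning with "out"
ends_50    <- ls(envir = e, pattern = "\\.50$")  # names ending in ".50"

print(has_ut)
print(starts_out)
print(ends_50)
```

Note that the dot must be escaped in the last pattern, since an unescaped dot matches any character in a regular expression.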
To remove objects you no longer need, use rm(). Here’s an example:
> rm(a,b,x,y,z,uuu)
This code removes the six specified objects (a, b, and so on).
One of the named arguments of rm() is list, which makes it easier to remove multiple objects. This code passes the names of all of our objects to list, thus removing everything:
> rm(list = ls())
Combined with the pattern argument of ls(), this tool becomes even more powerful. Here’s an example:
> ls()
 [1] "doexpt"           "notebookline"     "nreps"            "numcorrectcis"
 [5] "numnotebooklines" "numrules"         "observationpt"    "prop"
 [9] "r"                "rad"              "radius"           "rep"
[13] "s"                "s2"               "sim"              "waits"
[17] "wbar"             "x"                "y"                "z"
> ls(pattern="notebook")
[1] "notebookline"     "numnotebooklines"
> rm(list=ls(pattern="notebook"))
> ls()
 [1] "doexpt"        "nreps"         "numcorrectcis" "numrules"
 [5] "observationpt" "prop"          "r"             "rad"
 [9] "radius"        "rep"           "s"             "s2"
[13] "sim"           "waits"         "wbar"          "x"
[17] "y"             "z"
Here, we found two objects whose names include the string "notebook" and then asked to remove them, which was confirmed by the second call to ls().
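Since rm() deletes without asking, it can be prudent to store the matched names in a variable and inspect them before removing anything. Here is a small sketch of that habit. The names are hypothetical, and the work is done in a scratch environment (e) via the envir arguments of ls() and rm(), an assumption made only to keep the example self-contained.

```r
# Hypothetical objects in a scratch environment.
e <- new.env()
for (nm in c("notebookline", "numnotebooklines", "nreps", "x")) {
  assign(nm, 1, envir = e)
}

# Step 1: collect the candidate names and look at them first.
doomed <- ls(envir = e, pattern = "notebook")
print(doomed)

# Step 2: remove exactly those names, and nothing else.
rm(list = doomed, envir = e)

remaining <- ls(envir = e)
print(remaining)
```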
You may find the function browseEnv() helpful. It displays your global objects (or the objects in a different specified environment) in your web browser, with some details on each.
Calling save() on a collection of objects will write them to disk for later retrieval by load(). Here’s a quick example:
> z <- rnorm(100000)
> hz <- hist(z)
> save(hz, file="hzfile")
> ls()
[1] "hz" "z"
> rm(hz)
> ls()
[1] "z"
> load("hzfile")
> ls()
[1] "hz" "z"
> plot(hz)  # graph window pops up
Here, we generate some data and then draw a histogram of it. But we also save the output of hist() in a variable, hz. That variable is an object (of class "histogram", of course). Anticipating that we will want to reuse this object in a later R session, we use the save() function to save the object to the file hzfile. It can be reloaded in that future session via load(). To demonstrate this, we deliberately removed the hz object, then called load() to reload it, and then called ls() to show that it had indeed been reloaded.
I once needed to read in a very large data file, each record of which required processing. I then used save() to keep the R object version of the processed file for future R sessions.
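The round trip can also be written as a script that runs without opening a graphics window. This is a sketch under a few assumptions that are not part of the session above: the file is a temporary one created with tempfile(), the sample is smaller, and hist() is called with plot = FALSE so that only the "histogram" object is computed. Note that the file name is passed through the named argument file, and that load() returns the names of the objects it restored.

```r
# Assumption: a temporary file stands in for "hzfile".
tmp <- tempfile(fileext = ".RData")

z  <- rnorm(1000)
hz <- hist(z, plot = FALSE)   # compute the histogram object without drawing it

save(hz, file = tmp)          # the file name must be given as file=
rm(hz)
gone <- !exists("hz", inherits = FALSE)

restored <- load(tmp)         # returns the names of the restored objects
print(restored)
print(class(hz))

unlink(tmp)                   # clean up the temporary file
```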
Developers often need to know the exact structure of the object returned by a library function. If the documentation does not give sufficient details, what can we do?
The following R functions may be helpful:
class(), mode()
names(), attributes()
unclass(), str()
edit()
Let’s go through an example. R includes facilities for constructing contingency tables, which we discussed in Section 6.4. An example in that section involved an election survey in which five respondents are asked whether they intend to vote for candidate X and whether they voted for X in the last election. Here is the resulting table:
> cttab <- table(ct)
> cttab
          Voted.for.X.Last.Time
Vote.for.X No Yes
  No        2   0
  Not Sure  0   1
  Yes       1   1
For instance, two respondents answered no to both questions.
The object cttab was returned by the function table() and thus is likely of class "table". A check of the documentation (?table) confirms this. But what is in the class?
Let’s explore the structure of that object cttab of class "table".
> ctu <- unclass(cttab)
> ctu
          Voted.for.X.Last.Time
Vote.for.X No Yes
  No        2   0
  Not Sure  0   1
  Yes       1   1
> class(ctu)
[1] "matrix"
So, the counts portion of the object is a matrix. (If the data had involved three or more questions, rather than just two, this would have been a higher-dimensional array.) Note that the names of the dimensions and of the individual rows and columns are there, too; they are associated with the matrix.
The unclass() function is quite useful as a first step. If you simply print an object, you are at the mercy of the version of print() associated with that class, which may, in the name of succinctness, hide or distort some valuable information. Printing the result of calling unclass() allows you to work around this problem, though there was no difference in this example. (You saw an instance in which it did make a difference in the discussion of S3 generic functions in Section 9.1.1.) The function str() serves the same purpose, in a more compact manner.
Note, though, that applying unclass() to an object still results in an object with some basic class. Here, cttab had the class "table", but unclass(cttab) still had the class "matrix".
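To try this yourself, you need the survey data. The sketch below rebuilds a data frame ct that is consistent with the counts in the table shown above; the exact row order is an assumption. It then repeats the unclass() step and adds a call to str(). One caveat: in R 4.0.0 and later, class() applied to a matrix reports both "matrix" and "array", so the checks here use is.matrix() and inherits() rather than comparing class strings.

```r
# Assumed reconstruction of the five survey responses,
# chosen to reproduce the counts in the table above.
ct <- data.frame(
  Vote.for.X            = c("Yes", "Yes", "No", "Not Sure", "No"),
  Voted.for.X.Last.Time = c("Yes", "No",  "No", "Yes",      "No")
)

cttab <- table(ct)
ctu   <- unclass(cttab)        # strip the "table" class attribute

print(class(cttab))            # the S3 class that print() dispatches on
print(is.matrix(ctu))          # the underlying object is a matrix
print(inherits(ctu, "table"))  # no longer a "table"

str(ctu)                       # compact view: counts plus dimnames
```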
Let’s try looking at the code for table(), the library function that produced cttab. We could simply type table, but since this is a fairly long function, much of it would scroll by on the screen too fast for us to absorb. We could use page() to solve this problem, but I prefer edit():
> edit(table)
This allows you to browse through the code with your text editor. In doing so, you’ll find this code at the end:
y <- array(tabulate(bin, pd), dims, dimnames = dn)
class(y) <- "table"
y
Ah, interesting. This shows that table() is, to some extent, a wrapper for another function, tabulate(). But what might be more important here is that the structure of a "table" object is really pretty simple: It consists of an array created from the counts, with the class attribute tacked on. So, it’s essentially just an array.
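One way to convince yourself of this is to build such an object by hand. The following sketch mimics the last lines of table() with hardcoded counts taken from the survey table; it is an illustration of the structure, not a substitute for calling table().

```r
# Counts in column-major order, as array() expects.
counts <- c(2L, 0L, 1L,   # column "No":  rows No, Not Sure, Yes
            0L, 1L, 1L)   # column "Yes": rows No, Not Sure, Yes

y <- array(counts, dim = c(3, 2),
           dimnames = list(Vote.for.X            = c("No", "Not Sure", "Yes"),
                           Voted.for.X.Last.Time = c("No", "Yes")))
class(y) <- "table"       # tack on the class attribute

print(y)                  # now printed by the method for class "table"
is_table <- inherits(y, "table")
print(is_table)
```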
The function names() shows the components in an object, and attributes() gives you this and a bit more, notably the class name.
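As a brief illustration, here are both functions applied to a "histogram" object like the one saved earlier. The data are simulated, and plot = FALSE is an assumption made so that the sketch runs without a graphics device.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
h <- hist(rnorm(100), plot = FALSE)

comp  <- names(h)                 # the components of the object
attrs <- attributes(h)            # the component names plus the class

print(comp)
print(names(attrs))
print(attrs$class)
```

Each component can then be extracted in the usual way, for example with h$counts.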
The function exists() returns TRUE or FALSE, depending on whether an object with the given name exists. Be sure to put the argument in quotation marks.
For example, the following code shows that the acc object exists:
> exists("acc")
[1] TRUE
Why would this function be useful? Don’t we always know whether or not we’ve created an object and whether it’s still there? Not necessarily. If you are writing general-purpose code, say to be made available to the world in R’s CRAN code repository, your code may need to check whether a certain object exists, and if it doesn’t, then your code must create it. For example, as you learned in Section 9.4.3, you can save objects to disk files using save() and then later restore them to R’s memory space by calling load(). You might write general-purpose code that makes the latter call if the object is not already present, a condition you could check by calling exists().
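That load-if-missing idea might look like the following sketch. The object name processed_data, its contents, and the cache file location are all hypothetical, chosen only to make the example self-contained.

```r
# Hypothetical cache file in the session's temporary directory.
cache_file <- file.path(tempdir(), "processed_data.RData")

# Simulate an earlier session that processed the data and saved it.
processed_data <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))
save(processed_data, file = cache_file)
rm(processed_data)

# Later: restore the object only if it is not already present.
was_missing <- !exists("processed_data")
if (was_missing) {
  load(cache_file)
}

print(exists("processed_data"))
print(nrow(processed_data))

unlink(cache_file)   # clean up the temporary file
```

By default, exists() also searches enclosing environments and attached packages, so in general-purpose code you may want to choose a name that is unlikely to collide with anything else.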