An Overview of the Lattice Package

Lattice graphics consist of one or more rectangular drawing areas called panels. The data assigned to each panel is referred to as a packet. Lattice functions work by calling one or more panel functions, which actually plot the packets within panels. To change the appearance of a plot, you can specify arguments to the plotting function or change the panel function.

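In a typical lattice session, you call a high-level lattice function with a formula and a data frame. The function returns a lattice object, and printing or plotting that object draws the graphic by calling the panel function on each packet.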

Lattice graphics are extremely modular: the plotting functions share many high-level functions (like plot.trellis, the plot method for lattice objects) and low-level functions (like panel.axis, which draws axes). As a result, they share many common arguments, and you can customize the appearance of lattice graphics by substituting your own components.

There are many arguments to lattice functions, but in this section we’ll focus on a handful of key arguments for specifying what data to plot.

As you may have noticed, functions in the graphics package don’t have completely consistent arguments. Many of them share some common parameters (see Customizing Charts), but many of them have different names for arguments with the same purpose. (For example, data for barplot is specified with the height argument, while data for plot is specified with x and y.) Arguments within the lattice package are much more consistent.

You can always specify the data to plot using a formula and a data frame. Let’s create a simple data set and draw a scatter plot with xyplot:

> d <- data.frame(x=c(0:9), y=c(1:10), z=c(rep(c("a", "b"), times=5)))
> d
   x  y z
1  0  1 a
2  1  2 b
3  2  3 a
4  3  4 b
5  4  5 a
6  5  6 b
7  6  7 a
8  7  8 b
9  8  9 a
10 9 10 b

To plot this data frame, we’ll use the formula y~x and specify the data frame d. The first argument given is the formula. (The argument used to be called “formula” and is currently named x. The help files for lattice warn not to pass this as a named argument, possibly because the name may change again.) To specify the data frame containing the plotting data, we use the argument data:

> library(lattice)
> xyplot(y~x, data=d)

The resulting plot is shown in Figure 14-1. Formulas in the lattice package can also specify a conditioning variable, which is used to assign data points to different panels. For example, we can plot the same data in two panels, split by the conditioning variable z. To do this, we change the formula to y~x|z:

> xyplot(y~x|z, data=d)
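
To see how conditioning maps onto the panels and packets described at the start of this section, you can inspect the object that xyplot returns. The following is a minimal sketch, not a documented recipe: it assumes the panel.args component of the returned object, which is an internal detail of lattice and may change between versions. It also stores z as a factor explicitly, because recent versions of R no longer convert strings to factors in data.frame.

```r
library(lattice)

# Same data as above, with z stored explicitly as a factor
d <- data.frame(x = 0:9, y = 1:10, z = factor(rep(c("a", "b"), times = 5)))

# Build the conditioned plot without drawing it
p <- xyplot(y ~ x | z, data = d)

class(p)      # lattice objects have class "trellis"
dim(p)        # number of levels of each conditioning variable
dimnames(p)   # the levels themselves

# Assumption: panel.args holds one list of arguments (one packet) per panel
packets <- p$panel.args
length(packets)    # one packet per level of z
packets[[1]]$x     # x values assigned to the first panel
```

If the assumption holds, each element of packets contains only the rows of d that belong to one level of z, which is exactly what the panel function receives when the plot is drawn.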

The scatter plot with the conditioning variable is shown in Figure 14-2. As you can see, the data is now split into two panels. If you would prefer to see the two data series superimposed on the same plot, you can specify a grouping variable. To do this, use the argument groups to specify the grouping variable(s):

> xyplot(y~x, groups=z, data=d)

As shown in Figure 14-3, the two data series are represented by different symbols. (If you try this example yourself using the R console, the different groups will be plotted in different colors. To make the charts readable in black and white, I generated the charts using special settings.)
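
The exact settings used for the figures are not given here, but one way to get a similar black-and-white result is sketched below. It relies on two standard lattice features: the par.settings argument, which applies graphical settings to a single plot, and standard.theme, which can build a theme without color. The choice of the "pdf" device name and the use of auto.key to add a legend are assumptions made for this illustration.

```r
library(lattice)

d <- data.frame(x = 0:9, y = 1:10, z = factor(rep(c("a", "b"), times = 5)))

# A theme without color (assumption: built for the "pdf" device);
# groups should then be told apart by plotting symbol rather than by color
bw <- standard.theme("pdf", color = FALSE)

# Grouped scatter plot using the black-and-white theme, with a simple legend
p_bw <- xyplot(y ~ x, groups = z, data = d,
               par.settings = bw,
               auto.key = TRUE)

print(p_bw)
```

Because par.settings is attached to the plot object, the default color settings for the rest of the session are left unchanged.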

The easiest way to use lattice graphics is by calling a high-level plotting function. Most of these functions are the equivalent of a similar function in the graphics package. Here’s a table showing how standard graphics functions map to lattice functions.
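
A few of these correspondences can be sketched in code. The pairs below (plot and xyplot, hist and histogram, boxplot and bwplot) are only a sample, chosen for illustration; note that the lattice versions all take a formula and a data argument.

```r
library(lattice)

d <- data.frame(x = 0:9, y = 1:10, z = factor(rep(c("a", "b"), times = 5)))

# Scatter plot: plot() in graphics, xyplot() in lattice
plot(d$x, d$y)
p_scatter <- xyplot(y ~ x, data = d)

# Histogram: hist() in graphics, histogram() in lattice
hist(d$y)
p_hist <- histogram(~ y, data = d)

# Box plot: boxplot() in graphics, bwplot() in lattice
boxplot(y ~ z, data = d)
p_box <- bwplot(y ~ z, data = d)

# The lattice calls only build objects; print them to draw the charts
print(p_scatter)
print(p_hist)
print(p_box)
```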

When you call a high-level lattice function, it does not actually plot the data. Instead, each of these functions returns a lattice object. To actually show the graphic, you need to use a print or plot command. If you simply execute a lattice function on the R command line, R runs print automatically, so the graphic is shown. However, if you call a lattice function inside another function or inside a script and you want to show the results, make sure that you actually call print.
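
The difference is easiest to see inside a function. In the sketch below, the function names quiet_plot and draw_plot are hypothetical and exist only for this example. The first function builds a chart but never shows it, because the lattice call is not the value that reaches the command line. The second calls print explicitly.

```r
library(lattice)

d <- data.frame(x = 0:9, y = 1:10, z = factor(rep(c("a", "b"), times = 5)))

# Hypothetical helper: the lattice object is created and then discarded,
# so nothing is drawn
quiet_plot <- function(data) {
  xyplot(y ~ x, data = data)
  invisible(NULL)
}

# Hypothetical helper: print() draws the chart; the object is also
# returned invisibly so the caller can reuse it
draw_plot <- function(data) {
  p <- xyplot(y ~ x, data = data)
  print(p)
  invisible(p)
}

quiet_plot(d)        # no graphic appears
p <- draw_plot(d)    # graphic appears
class(p)             # the returned lattice object has class "trellis"
```

The same applies inside for loops and in scripts run with source: wrap the lattice call in print if you want to see the result.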

For some (but not all) lattice functions, it is possible to specify the source data in multiple forms. For example, the function histogram can also accept data arguments as factors or numeric vectors. These methods are provided for convenience where appropriate. For example, I frequently plot contingency tables as bar charts, so I often use the table method of barchart. Here is a table of data types accepted by different lattice functions.
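
As a short sketch of these convenience methods, the code below passes a numeric vector directly to histogram and a contingency table directly to barchart, with no formula in either call. The vector v and the table tab are made-up data for illustration only.

```r
library(lattice)

d <- data.frame(x = 0:9, y = 1:10, z = factor(rep(c("a", "b"), times = 5)))

# Made-up numeric vector, passed to histogram without a formula
v <- c(1, 2, 2, 3, 3, 3, 4, 4, 5)
h <- histogram(v)

# One-way contingency table of z, passed to barchart without a formula
tab <- table(z = d$z)
b <- barchart(tab)

print(h)
print(b)
```

Both calls return ordinary lattice objects, so everything said above about printing applies to them as well.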

For more details on arguments to lattice functions, see Customizing Lattice Graphics.

With standard graphics, you can easily superimpose points, lines, text, and other objects on existing charts. It’s possible to do the same thing with lattice graphics, but it’s a little trickier.

In order to add extra graphical elements to a lattice plot, you need to use a custom panel function. As we described above, low-level panel functions actually plot graphics. The high-level functions simply specify how data is divided between panels, and how different elements (legends, strips, axes, etc.) need to be added. To add extra elements to a lattice chart, you need to change the panel function.

As a simple example, let’s add a diagonal line to Figure 14-2. To do this, we’ll create a new custom panel function that calls both panel.xyplot and panel.abline. The new panel function will pass along its arguments to panel.xyplot. We’ll specify a line that crosses the y-axis at 1 (through the a=1 argument to panel.abline) and has slope 1 (through the b=1 argument to panel.abline). Here’s the code to generate this chart:

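xyplot(y~x|z, data=d,
   panel=function(...){
      # draw the reference line first: intercept a=1, slope b=1
      panel.abline(a=1,b=1)
      # then draw the points, passing all panel arguments through
      panel.xyplot(...)
   }
)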

As you can see, the chart with the custom panel function (Figure 14-4) is identical to the earlier multipanel chart (Figure 14-2), except for the added diagonal lines.
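
A custom panel function can also name the arguments it needs instead of passing everything through with the ... argument. The sketch below is a variation on the example above, not part of the original chart: it names x and y so that the packet for each panel can be handed to panel.lmline, which draws a least-squares line for that panel. The noisy data frame d2 is made up for this illustration so that the fitted lines differ between panels.

```r
library(lattice)

d <- data.frame(x = 0:9, y = 1:10, z = factor(rep(c("a", "b"), times = 5)))

# Made-up noisy copy of d, so that each panel gets a different fitted line
set.seed(1)
d2 <- transform(d, y = y + rnorm(10))

p_fit <- xyplot(y ~ x | z, data = d2,
   panel = function(x, y, ...) {
      # x and y hold only the packet for the current panel
      panel.xyplot(x, y, ...)
      # dashed least-squares line fitted to this panel's points
      panel.lmline(x, y, lty = 2)
   }
)

print(p_fit)
```

Naming x and y explicitly is only needed when the extra elements depend on the data. For a fixed line, as in the panel.abline example, passing ... through is enough.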