R includes some handy set operations, including these:
union(x,y): Union of the sets x and y
intersect(x,y): Intersection of the sets x and y
setdiff(x,y): Set difference between x and y, consisting of all elements of x that are not in y
setequal(x,y): Test for equality between x and y
c %in% y: Membership, testing whether c is an element of the set y
choose(n,k): Number of possible subsets of size k chosen from a set of size n
Here are some simple examples of using these functions:
> x <- c(1,2,5)
> y <- c(5,1,8,9)
> union(x,y)
[1] 1 2 5 8 9
> intersect(x,y)
[1] 1 5
> setdiff(x,y)
[1] 2
> setdiff(y,x)
[1] 8 9
> setequal(x,y)
[1] FALSE
> setequal(x,c(1,2,5))
[1] TRUE
> 2 %in% x
[1] TRUE
> 2 %in% y
[1] FALSE
> choose(5,2)
[1] 10
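The vectors above happen to contain no repeated values. As a small sketch of what happens when they do, the following script (with made-up input vectors) suggests that these functions treat their arguments as true sets: repeated elements are dropped from the results, and setequal() ignores both order and repetition.

```r
# Illustrative vectors with repeated values (not from the text above)
x <- c(1, 2, 2, 5)
y <- c(5, 1, 1, 8)

u <- union(x, y)       # each element appears once in the result
i <- intersect(x, y)   # elements common to both vectors
d <- setdiff(x, y)     # elements of x that are not in y

print(u)   # 1 2 5 8
print(i)   # 1 5
print(d)   # 2

# Order and repetition do not matter for set equality
print(setequal(c(1, 2, 2), c(2, 1)))   # TRUE
```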
Recall from Section 7.12 that you can write your own binary operations. For instance, consider coding the symmetric difference between two sets, that is, all the elements belonging to exactly one of the two operand sets. Because the symmetric difference between sets x and y consists exactly of those elements in x but not in y, together with those in y but not in x, the code consists of easy calls to setdiff() and union(), as follows:
> symdiff <- function(a,b) {
+    sdfxy <- setdiff(a,b)  # elements of a not in b
+    sdfyx <- setdiff(b,a)  # elements of b not in a
+    return(union(sdfxy,sdfyx))
+ }
> x
[1] 1 2 5
> y
[1] 5 1 8 9
> symdiff(x,y)
[1] 2 8 9
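Since the symmetric difference is naturally a binary operation, it can also be written in operator form. The sketch below uses a hypothetical operator name, %symdiff%, which is not part of base R, and checks the result against an equivalent formulation: the union of the two sets minus their intersection.

```r
# Hypothetical operator name; any name wrapped in percent signs can be
# used to define a binary operator
"%symdiff%" <- function(a, b) {
  # Elements in exactly one of the two sets
  union(setdiff(a, b), setdiff(b, a))
}

x <- c(1, 2, 5)
y <- c(5, 1, 8, 9)

sd1 <- x %symdiff% y
print(sd1)   # 2 8 9

# Equivalent formulation: everything in either set, minus what is in both
sd2 <- setdiff(union(x, y), intersect(x, y))
print(setequal(sd1, sd2))   # TRUE
```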
Here’s another example: a binary operator for determining whether one set u is a subset of another set v. A bit of thought shows that this property is equivalent to the intersection of u and v being equal to u. Hence we have another easily coded function:
> "%subsetof%" <- function(u,v) {
+    return(setequal(intersect(u,v),u))
+ }
> c(3,8) %subsetof% 1:10
[1] TRUE
> c(3,8) %subsetof% 5:10
[1] FALSE
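The subset test can also be phrased directly in terms of membership: u is a subset of v when every element of u is in v. The following is a minimal alternative sketch using %in% and all(); the operator name %subsetof2% is made up here to avoid clashing with the definition above.

```r
# Hypothetical alternative operator: TRUE when every element of u is in v
"%subsetof2%" <- function(u, v) {
  all(u %in% v)
}

r1 <- c(3, 8) %subsetof2% 1:10
r2 <- c(3, 8) %subsetof2% 5:10
# The empty set is a subset of any set; all() of an empty vector is TRUE
r3 <- numeric(0) %subsetof2% 1:3

print(r1)   # TRUE
print(r2)   # FALSE
print(r3)   # TRUE
```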
The function combn() generates combinations. Let’s find the subsets of {1,2,3} of size 2.
> c32 <- combn(1:3,2)
> c32
     [,1] [,2] [,3]
[1,]    1    1    2
[2,]    2    3    3
> class(c32)
[1] "matrix"
The results are in the columns of the output. We see that the subsets of {1,2,3} of size 2 are {1,2}, {1,3}, and {2,3}. (In R 4.0.0 and later, class(c32) reports both "matrix" and "array".)
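Because each column of the result is one subset, the number of columns should agree with choose(). The short sketch below, using a larger illustrative set {1,...,5}, checks this and pulls the columns out into a list of individual subsets; the variable names are chosen here for illustration only.

```r
# All 2-element subsets of {1,...,5}, one per column
c52 <- combn(1:5, 2)

# The column count matches the number of subsets given by choose()
n_subsets <- ncol(c52)
print(n_subsets == choose(5, 2))   # TRUE

# Extract each column as a separate vector
subsets <- lapply(seq_len(ncol(c52)), function(j) c52[, j])
print(subsets[[1]])           # 1 2
print(subsets[[n_subsets]])   # 4 5
```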
combn() also accepts a function as its third argument, which it calls on each combination. For example, we can find the sum of the numbers in each subset, like this:
> combn(1:3,2,sum)
[1] 3 4 5
The first subset, {1,2}, has a sum of 3, and so on.
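The function passed to combn() need not be a built-in one. As a brief sketch, the script below applies prod() and a user-defined helper (the name spread is hypothetical) to each subset, and also uses the simplify argument of combn() to get the subsets back as a list rather than a matrix.

```r
# Product of the numbers in each 2-element subset of {1,2,3}
prods <- combn(1:3, 2, prod)
print(prods)     # 2 3 6

# Hypothetical helper: gap between the largest and smallest element
spread <- function(s) max(s) - min(s)
spreads <- combn(1:3, 2, spread)
print(spreads)   # 1 2 1

# With simplify = FALSE, the subsets come back as a list of vectors
pairs <- combn(1:3, 2, simplify = FALSE)
print(length(pairs))   # 3
print(pairs[[2]])      # 1 3
```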