Ordinary numerical sorting of a vector can be done with the sort()
function, as in this example:
> x <- c(13,5,12,5)
> sort(x)
[1]  5  5 12 13
> x
[1] 13  5 12  5
Note that x itself did not change, in keeping with R’s functional language philosophy.
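Because sort() returns a new vector rather than modifying its argument, the result has to be assigned if it is to be kept. The following is a minimal sketch using the same x; the names x_sorted and x_desc are introduced here only for illustration.

```r
x <- c(13, 5, 12, 5)

# sort() returns a sorted copy; x itself is left untouched
x_sorted <- sort(x)
print(x_sorted)   # values: 5 5 12 13
print(x)          # values: 13 5 12 5

# descending order via the decreasing argument
x_desc <- sort(x, decreasing = TRUE)
print(x_desc)     # values: 13 12 5 5
```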
If you want the indices of the sorted values in the original vector, use the order()
function. Here’s an example:
> order(x)
[1] 2 4 3 1
This means that x[2] is the smallest value in x, x[4] is the second smallest, x[3] is the third smallest, and so on.
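One way to check this reading of order() is to index x by its result, which should reproduce what sort() returns. A small sketch, with idx as an illustrative name:

```r
x <- c(13, 5, 12, 5)

idx <- order(x)        # positions of x's elements, smallest value first
print(idx)             # values: 2 4 3 1

# indexing x by those positions yields the sorted vector
print(x[idx])                        # values: 5 5 12 13
print(identical(x[idx], sort(x)))    # TRUE
```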
You can use order(), together with indexing, to sort data frames, like this:
> y
    V1 V2
1  def  2
2   ab  5
3 zzzz  1
> r <- order(y$V2)
> r
[1] 3 1 2
> z <- y[r,]
> z
    V1 V2
3 zzzz  1
1  def  2
2   ab  5
What happened here? We called order() on the second column of y, yielding a vector r that tells us which rows of y come first, second, and third when sorted by that column. The 3 in this vector tells us that y[3,2] is the smallest number in y[,2]; the 1 tells us that y[1,2] is the second smallest; and the 2 tells us that y[2,2] is the third smallest. We then use indexing to produce the frame sorted by column 2, storing it in z.
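For readers who want to run this themselves, here is a minimal sketch that rebuilds the data frame from the printout. How y was originally created is not shown, so the data.frame() call is an assumption. The last step, resetting the row names, is an optional extra and not part of the example above.

```r
# rebuild the example data frame (contents taken from the printout;
# the construction itself is an assumption)
y <- data.frame(V1 = c("def", "ab", "zzzz"), V2 = c(2, 5, 1))

r <- order(y$V2)   # row numbers of y, smallest V2 first
z <- y[r, ]        # rows of y rearranged in that order
print(z)           # row names 3, 1, 2 travel with their rows

# optional: renumber the rows 1, 2, 3 after sorting
rownames(z) <- NULL
print(z)
```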
You can use order()
to sort according to character variables as well as numeric ones, as follows:
> d
   kids ages
1  Jack   12
2  Jill   10
3 Billy   13
> d[order(d$kids),]
   kids ages
3 Billy   13
1  Jack   12
2  Jill   10
> d[order(d$ages),]
   kids ages
2  Jill   10
1  Jack   12
3 Billy   13
A related function is rank(), which reports the rank of each element of a vector.
> x <- c(13,5,12,5)
> rank(x)
[1] 4.0 1.5 3.0 1.5
This says that 13 had rank 4 in x; that is, it is the fourth smallest. The value 5 appears twice in x, with those two being the first and second smallest, so the rank 1.5, the average of 1 and 2, is assigned to both. Optionally, other methods of handling ties can be specified through the ties.method argument.
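The tie-handling options can be compared side by side. The sketch below uses rank()'s ties.method argument on the same x; only three of the available methods are shown, chosen here for illustration.

```r
x <- c(13, 5, 12, 5)

print(rank(x))                          # values: 4.0 1.5 3.0 1.5 (default: ties averaged)
print(rank(x, ties.method = "min"))     # values: 4 1 3 1 (ties share the lowest rank)
print(rank(x, ties.method = "max"))     # values: 4 2 3 2 (ties share the highest rank)
print(rank(x, ties.method = "first"))   # values: 4 1 3 2 (earlier element ranks first)
```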