Using Rprof() to Find Slow Spots in Your Code

If you think your R code is running unnecessarily slowly, a handy tool for finding the culprit is Rprof(), which gives you a report of (approximately) how much time your code is spending in each of the functions it calls. This is important, as it may not be wise to optimize every section of your program. Optimization may come at the expense of coding time and code clarity, so it’s of value to know where optimization would really help.

Monitoring with Rprof()

Let’s demonstrate using Rprof() with our three versions of code to find a powers matrix from the previous extended example. We’ll call Rprof() to get the monitor started, run our code, and then call Rprof() with a NULL argument to stop the monitoring. Finally, we’ll call summaryRprof() to see the results.

> x <- runif(1000000)
> Rprof()
> invisible(powers1(x,8))
> Rprof(NULL)
> summaryRprof()
$by.self
          self.time self.pct total.time total.pct
"cbind"        0.74     86.0       0.74      86.0
"*"            0.10     11.6       0.10      11.6
"matrix"       0.02     2.3        0.02       2.3
"powers1"      0.00     0.0        0.86      100.0

$by.total
          total.time total.pct self.time self.pct
"powers1"       0.86     100.0     0.00       0.0
"cbind"         0.74     86.0      0.74      86.0
"*"             0.10     11.6      0.10      11.6
"matrix"        0.02      2.3      0.02       2.3

$sampling.time
[1] 0.86

We see immediately that the runtime of our code is dominated by calls to cbind(), which as we noted in the extended example is indeed slowing things down.

By the way, the call to invisible() in this example is used to suppress output. We certainly don’t want to see the 1,000,000-row matrix returned by powers1() here!

Profiling powers2() does not show any obvious bottlenecks.

> Rprof()
> invisible(powers2(x,8))
> Rprof(NULL)
> summaryRprof()
$by.self
          self.time self.pct total.time total.pct
"powers2"      0.38     67.9       0.56     100.0
"matrix"       0.14     25.0       0.14      25.0
"*"            0.04      7.1       0.04       7.1

$by.total
          total.time total.pct self.time self.pct
"powers2"       0.56     100.0      0.38     67.9
"matrix"        0.14      25.0      0.14     25.0
"*"             0.04       7.1      0.04      7.1

$sampling.time
[1] 0.56

What about powers3(), the promising approach that didn’t pan out?

> Rprof()
> invisible(powers3(x,8))
> Rprof(NULL)
> summaryRprof()
$by.self
           self.time self.pct total.time total.pct
"FUN"           0.94     56.6       0.94      56.6
"outer"         0.72     43.4       1.66     100.0
"powers3"       0.00      0.0       1.66     100.0

$by.total
           total.time total.pct self.time self.pct
"outer"          1.66     100.0      0.72     43.4
"powers3"        1.66     100.0      0.00      0.0
"FUN"            0.94      56.6      0.94     56.6

$sampling.time
[1] 1.66

The function logging the largest amount of time was FUN(), which as noted in our extended example is simply multiplication. For each pair of elements of x here, one of the elements is multiplied by the other; that is, a product of two scalars is found. In other words, no vectorization! No wonder it was slow.

How Rprof() Works

Let’s explore in a bit more detail what Rprof() does. Every 0.02 seconds (the default value), R inspects the call stack to determine which function calls are in effect at that time. It writes the result of each inspection to a file, by default Rprof.out. Here is an excerpt of that file from our run of powers3():

...
"outer" "powers3"
"outer" "powers3"
"outer" "powers3"
"FUN" "outer" "powers3"
"FUN" "outer" "powers3"
"FUN" "outer" "powers3"
"FUN" "outer" "powers3"
...

So, Rprof() often found that at inspection time, powers3() had called outer(), which in turn had called FUN(), the latter being the currently executing function. The function summaryRprof() conveniently summarizes all those lines in the file, but you may find that looking at the file itself reveals more insights in some cases.

Note, too, that Rprof() is no panacea. If the code you’re profiling produces many function calls (including indirect calls, triggered when your code calls some function that then calls another within R), the profiling output may be hard to decipher. This is arguably the case for the output from powers4():

$by.self
            self.time self.pct total.time total.pct
"apply"         19.46     67.5      27.56      95.6
"lapply"         4.02     13.9       5.68      19.7
"FUN"            2.56      8.9       2.56       8.9
"as.vector"      0.82      2.8       0.82       2.8
"t.default"      0.54      1.9       0.54       1.9
"unlist"         0.40      1.4       6.08      21.1
"!"              0.34      1.2       0.34       1.2
"is.null"        0.32      1.1       0.32       1.1
"aperm"          0.22      0.8       0.22       0.8
"matrix"         0.14      0.5       0.74       2.6
"!="             0.02      0.1       0.02       0.1
"powers4"        0.00      0.0      28.84     100.0
"t"              0.00      0.0      28.10      97.4
"array"          0.00      0.0       0.22       0.8

$by.total
            total.time total.pct self.time self.pct
"powers4"        28.84     100.0      0.00      0.0
"t"              28.10      97.4      0.00      0.0
"apply"          27.56      95.6     19.46     67.5
"unlist"          6.08      21.1      0.40      1.4
"lapply"          5.68      19.7      4.02     13.9
"FUN"             2.56       8.9      2.56      8.9
"as.vector"       0.82       2.8      0.82      2.8
"matrix"          0.74       2.6      0.14      0.5
"t.default"       0.54       1.9      0.54      1.9
"!"               0.34       1.2      0.34      1.2
"is.null"         0.32       1.1      0.32      1.1
"aperm"           0.22       0.8      0.22      0.8
"array"           0.22       0.8      0.00      0.0
"!="              0.02       0.1      0.02      0.1

$sampling.time
[1] 28.84