Introducing perf

perf is an abbreviation of the Linux performance event counter subsystem, perf_events, and also the name of the command-line tool for interacting with perf_events. Both have been part of the kernel since Linux 2.6.31. There is plenty of useful information in the Linux source tree in tools/perf/Documentation, and also at https://perf.wiki.kernel.org.

The initial impetus for developing perf was to provide a unified way to access the registers of the performance measurement unit (PMU), which is part of most modern processor cores. Once the API was defined and integrated into Linux, it became logical to extend it to cover other types of performance counters.

At its heart, perf is a collection of event counters with rules about when they actively collect data. By setting the rules, you can capture data from the whole system, or just the kernel, or just one process and its children, and do it across all CPUs or just one CPU. It is very flexible. With this one tool you can start by looking at the whole system, then zero in on a device driver that seems to be causing problems, or an application that is running slowly, or a library function that seems to be taking longer to execute than you thought.

The code for the perf command-line tool is part of the kernel source tree, in the tools/perf directory. The tool and the kernel subsystem are developed hand-in-hand, meaning that they must be from the same version of the kernel. perf can do a lot. In this chapter, I will examine it only as a profiler. For a description of its other capabilities, read the perf man pages and refer to the documentation mentioned at the start of this section.

You need a kernel that is configured for perf_events, and you need the perf command cross-compiled to run on the target. The relevant kernel configuration option is CONFIG_PERF_EVENTS, found in the menu General setup | Kernel Performance Events And Counters.

If you want to profile using tracepoints (more on this subject later), also enable the options described in the section about Ftrace. While you are there, it is worthwhile enabling CONFIG_DEBUG_INFO as well.
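Before building, it can help to confirm that the options are actually set. The following is a minimal sketch, assuming you have a kernel .config file to inspect; the helper name kernel_option_enabled is hypothetical, and it only recognizes options that are built in (=y).

```shell
# Hypothetical helper: succeed if a kernel option is built in (=y).
# $1 is the path to a kernel .config file, $2 is the option name.
kernel_option_enabled() {
    grep -q "^$2=y\$" "$1"
}

# Example, run from the top of a configured kernel tree:
#   kernel_option_enabled .config CONFIG_PERF_EVENTS && echo "perf_events is enabled"
#   kernel_option_enabled .config CONFIG_DEBUG_INFO  && echo "debug info is enabled"
```

The function returns a shell exit status rather than printing anything, so it can be used directly in an if statement or a build script.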

The perf command has many dependencies, which makes cross-compiling it quite messy. However, both the Yocto Project and Buildroot have target packages for it.

You will also need debug symbols on the target for the binaries that you are interested in profiling; otherwise, perf will not be able to resolve addresses to meaningful symbols. Ideally, you want debug symbols for the whole system, including the kernel. For the latter, remember that the debug symbols for the kernel are in the vmlinux file.

If you are using the standard linux-yocto kernel, perf_events is enabled already, so there is nothing more to do.

To build the perf tool, you can add it explicitly to the target image dependencies, or you can add the tools-profile feature, which also brings in gprof. As I mentioned previously, you will probably want debug symbols on the target image, and also the kernel vmlinux image. In total, this is what you will need in conf/local.conf:
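As a sketch of such a configuration, and not a definitive listing, the fragment below enables the tools-profile and dbg-pkgs image features and installs a package containing vmlinux. The package name kernel-vmlinux is an assumption to check against your kernel recipe, and the :append override syntax applies to Yocto Project 3.4 and later (older releases use _append instead).

```
# conf/local.conf fragment (a sketch; verify names against your release)
EXTRA_IMAGE_FEATURES:append = " dbg-pkgs tools-profile"
# Assumption: the kernel recipe provides vmlinux in a package named kernel-vmlinux
IMAGE_INSTALL:append = " kernel-vmlinux"
```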

Many Buildroot kernel configurations do not include perf_events, so you should begin by checking that your kernel includes the options mentioned in the preceding section.

To cross-compile perf, run the Buildroot menuconfig and select the following:
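As a rough guide, the resulting Buildroot .config would contain entries along the following lines. This is a sketch under assumptions: the symbol names are taken from recent Buildroot releases and have changed over time, so confirm them in menuconfig for the release you are using.

```
# Buildroot .config fragment (a sketch; symbol names vary between releases)
# Build the perf tool from the kernel source
BR2_PACKAGE_LINUX_TOOLS_PERF=y
# Build packages with debugging symbols
BR2_ENABLE_DEBUG=y
# Do not strip binaries installed on the target
# BR2_STRIP_strip is not set
```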

Then, run make clean, followed by make.

When you have built everything, you will have to copy vmlinux into the target image manually.

You can use perf to sample the state of a program using one of the event counters and accumulate samples over a period of time to create a profile. This is another example of statistical profiling. The default event counter is called cycles, which is a generic hardware counter that is mapped to a PMU register representing a count of cycles at the core clock frequency.

Creating a profile using perf is a two-stage process: the perf record command captures samples and writes them to a file named perf.data (by default), and then perf report analyzes the results. Both commands are run on the target. Samples are collected only for the command you specify and its children. Here is an example profiling a shell script that searches for the string linux:
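A minimal sketch of the recording step follows. The wrapper name record_profile and the PERF variable are hypothetical conveniences, and the search command in the comment is only illustrative; the underlying call is perf record with a command to run.

```shell
# PERF names the perf binary; override it if perf is not on the PATH.
PERF="${PERF:-perf}"

# Hypothetical wrapper: profile a shell command line ($2) and its children,
# writing the samples to the file named by $1.
record_profile() {
    "$PERF" record -o "$1" -- sh -c "$2"
}

# Example, run on the target:
#   record_profile perf.data "find /usr/share | xargs grep linux > /dev/null"
```

Running the pipeline through sh -c means that find, xargs, and grep are all children of the profiled process, so their samples end up in the same data file.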

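Now you can show the results from perf.data using the command perf report. There are three user interfaces, which you can select on the command line: --stdio, a plain-text interface with no user interaction; --tui, a text-based menu interface; and --gtk, a graphical interface.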

The default is TUI, as shown in this example:

[Figure: Profiling with perf]

perf is able to record the kernel functions executed on behalf of the processes because it collects samples in kernel space.

The list is ordered with the most active functions first. In this example, all but one are captured while grep is running. Some are in a library, libc-2.20, some in a program, busybox.nosuid, and some are in the kernel. We have symbol names for program and library functions because all the binaries have been installed on the target with debug information, and kernel symbols are being read from /boot/vmlinux. If you have vmlinux in a different location, add -k <path> to the perf report command. Rather than storing samples in perf.data, you can save them to a different file using perf record -o <file name> and analyze them using perf report -i <file name>.
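The options mentioned above can be combined in one invocation. The following is a sketch, assuming the --stdio option of perf report for plain-text output; the wrapper name report_profile and the PERF variable are hypothetical.

```shell
# PERF names the perf binary; override it if perf is not on the PATH.
PERF="${PERF:-perf}"

# Hypothetical wrapper: print the profile stored in the file named by $1 as
# plain text, reading kernel symbols from the vmlinux file named by $2.
report_profile() {
    "$PERF" report --stdio -i "$1" -k "$2"
}

# Example, run on the target:
#   report_profile /tmp/grep.data /boot/vmlinux
```

Plain-text output is convenient when the target has only a serial console, or when you want to redirect the report to a file.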

By default, perf record samples the cycles counter at a fixed frequency: 1000 Hz in the version described here (more recent releases of perf default to 4000 Hz).
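Both the sampling frequency and the event can be set explicitly. A minimal sketch, assuming the -F (frequency) and -e (event) options of perf record; the wrapper name record_at_frequency and the PERF variable are hypothetical.

```shell
# PERF names the perf binary; override it if perf is not on the PATH.
PERF="${PERF:-perf}"

# Hypothetical wrapper: sample the event named by $2 at $1 samples per second
# while running the shell command line in $3.
record_at_frequency() {
    "$PERF" record -F "$1" -e "$2" -- sh -c "$3"
}

# Example, run on the target:
#   record_at_frequency 250 cycles "find /usr/share | xargs grep linux > /dev/null"
```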

This is still not really making life easy; the functions at the top of the list are mostly low-level memory operations, and you can be fairly sure that they have already been optimized. It would be nice to step back and see where these functions are being called from. You can do that by capturing the backtrace from each sample with the -g option to perf record.
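Recording with backtraces might look like the following sketch. The wrapper name record_call_graph and the PERF variable are hypothetical; the only change from a plain recording is the -g option.

```shell
# PERF names the perf binary; override it if perf is not on the PATH.
PERF="${PERF:-perf}"

# Hypothetical wrapper: like a plain perf record, but -g also captures the
# call chain (backtrace) for every sample. Samples go to the file named by $1.
record_call_graph() {
    "$PERF" record -g -o "$1" -- sh -c "$2"
}

# Example, run on the target:
#   record_call_graph perf.data "find /usr/share | xargs grep linux > /dev/null"
```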

Now perf report shows a plus sign (+) where the function is part of a call chain. You can expand the trace to see the functions lower down in the chain:

[Figure: Call graphs]

Now that you know which functions to look at, it would be nice to step inside and see the code and to have hit counts for each instruction. That is what perf annotate does, by calling down to a copy of objdump installed on the target. You just need to use perf annotate in place of perf report.
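As a sketch of how this might be invoked for a single function, assuming that perf annotate accepts a symbol name as its final argument along with the -i and --stdio options; the wrapper name annotate_symbol and the PERF variable are hypothetical.

```shell
# PERF names the perf binary; override it if perf is not on the PATH.
PERF="${PERF:-perf}"

# Hypothetical wrapper: print the annotated disassembly of the symbol named
# by $2, using the samples stored in the file named by $1.
annotate_symbol() {
    "$PERF" annotate --stdio -i "$1" "$2"
}

# Example, run on the target (the symbol name is only illustrative):
#   annotate_symbol perf.data memchr
```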

perf annotate requires symbol tables for the executables and vmlinux. Here is an example of an annotated function:

[Figure: perf annotate]

If you want to see the source code interleaved with the assembler, you can copy the relevant parts to the target device. If you are using the Yocto Project and build with the extra image feature dbg-pkgs, or have installed the individual -dbg package, then the source will have been installed for you in /usr/src/debug. Otherwise, you can examine the debug information to see the location of the source code:
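A minimal sketch of such a check follows, assuming an objdump that supports the --dwarf=info option. The helper name list_comp_dirs is hypothetical, and for a cross-compiled binary OBJDUMP would normally be set to your toolchain's prefixed objdump.

```shell
# OBJDUMP names the objdump to use; for a cross-compiled binary, set it to
# the toolchain's prefixed objdump (the prefix depends on your toolchain).
OBJDUMP="${OBJDUMP:-objdump}"

# Hypothetical helper: list the distinct compilation directories recorded in
# the DWARF debug information of the binary named by $1.
list_comp_dirs() {
    "$OBJDUMP" --dwarf=info "$1" | grep DW_AT_comp_dir | sed 's/.*: //' | sort -u
}

# Example, run on the development host:
#   list_comp_dirs path/to/library.so
```

The sed expression keeps only the text after the last ": " on each line, which is where objdump prints the directory, and sort -u collapses the many repeated entries into one line per directory.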

The path on the target should be exactly the same as the path you can see in DW_AT_comp_dir.

Here is an example of annotation with source and assembler code:

[Figure: perf annotate, with source and assembler code]