Measuring scheduling latencies

All the configuration and tuning you may do will be pointless if you cannot show that your device meets the deadlines. You will need your own benchmarks for the final testing but I will describe here two important measurement tools: cyclictest and Ftrace.

cyclictest was originally written by Thomas Gleixner and is now available on most platforms in a package named rt-tests. If you are using the Yocto Project, you can create a target image that includes rt-tests by building the real-time image recipe like this:

If you are using Buildroot, you need to add the package, BR2_PACKAGE_RT_TESTS in the menu Target packages | Debugging, profiling and benchmark | rt-tests.

cyclictest measures scheduling latencies by comparing the actual time taken for a sleep to the requested time. If there was no latency they would be the same and the reported latency would be zero. cyclictest assumes a timer resolution of less than one microsecond.

It has a large number of command-line options. To start with, you might try running this command as root on the target:

The options selected are as follows:

The result line shows the following, reading from left to right:

This was obtained on an idle system running an unmodified linux-yocto kernel as a quick demonstration of the tool. To be of real use, you would run tests over a 24 hour period or more while running a load representative of the maximum you expect.

Of the numbers produced by cyclictest, the maximum latency is the most interesting, but it would be nice to get an idea of the spread of the values. You can get that by adding -h <N> to obtain a histogram of samples that are up to N microseconds late. Using this technique, I obtained three traces for the same target board running kernels with no preemption, with standard preemption, and with RT preemption while being loaded with Ethernet traffic from a flood ping. The command line was as shown here:

The following is the output generated with no preemption:

cyclictest

Without preemption, most samples are within 100 microseconds of the deadline, but there are some outliers of up to 500 microseconds, which is pretty much what you would expect.

This is the output generated with standard preemption:

cyclictest

With preemption, the samples are spread out at the lower end but there is nothing beyond 120 microseconds.

Here is the output generated with RT preemption:

cyclictest

The RT kernel is a clear winner because everything is tightly bunched around the 20 microsecond mark and there is nothing later than 35 microseconds.

cyclictest, then, is a standard metric for scheduling latencies. However, it cannot help you identify and resolve specific problems with kernel latency. To do that, you need Ftrace.

The kernel function tracer has tracers to help track down kernel latencies—that is what it was originally written for, after all. These tracers capture the trace for the worst case latency detected during a run, showing the functions that caused the delay. The tracers of interest, together with the kernel configuration parameters, are as follows:

Be aware that running Ftrace adds a lot of latency, in the order of tens of milliseconds, every time it captures a new maximum which Ftrace itself can ignore. However, it skews the results of user-space tracers such as cyclictest. In other words, ignore the results of cyclictest if you run it while capturing traces.

Selecting the tracer is the same as for the function tracer we looked at in Chapter 13, Profiling and Tracing. Here is an example of capturing a trace for the maximum period with preemption disabled for a period of 60 seconds:

# echo preemptoff > /sys/kernel/debug/tracing/current_tracer
# echo 0 > /sys/kernel/debug/tracing/tracing_max_latency
# echo 1  > /sys/kernel/debug/tracing/tracing_on
# sleep 60
# echo 0  > /sys/kernel/debug/tracing/tracing_on

The resulting trace, heavily edited, looks like this:

# cat /sys/kernel/debug/tracing/trace
# tracer: preemptoff
#
# preemptoff latency trace v1.1.5 on 3.14.19-yocto-standard
# --------------------------------------------------------------------
# latency: 1160 us, #384/384, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0)
#    -----------------
#    | task: init-1 (uid:0 nice:0 policy:0 rt_prio:0)
#    -----------------
#  => started at: ip_finish_output
#  => ended at:   __local_bh_enable_ip
#
#
#                  _------=> CPU#
#                 / _-----=> irqs-off
#                | / _----=> need-resched
#                || / _---=> hardirq/softirq
#                ||| / _--=> preempt-depth
#                |||| /     delay
#  cmd     pid   ||||| time  |   caller
#     \   /      |||||  \    |   /
    init-1       0..s.    1us+: ip_finish_output
    init-1       0d.s2   27us+: preempt_count_add <-cpdma_chan_submit
    init-1       0d.s3   30us+: preempt_count_add <-cpdma_chan_submit
    init-1       0d.s4   37us+: preempt_count_sub <-cpdma_chan_submit

[...]

    init-1       0d.s2 1152us+: preempt_count_sub <-__local_bh_enable
    init-1       0d..2 1155us+: preempt_count_sub <-__local_bh_enable_ip
    init-1       0d..1 1158us+: __local_bh_enable_ip
    init-1       0d..1 1162us!: trace_preempt_on <-__local_bh_enable_ip
    init-1       0d..1 1340us : <stack trace>

Here, you can see that the longest period with kernel preemption disabled while running the trace was 1,160 microseconds. This simple fact is available by reading /sys/kernel/debug/tracing/tracing_max_latency, but the trace above goes further and gives you the sequence of kernel function calls that lead up to that measurement. The column marked delay shows the point on the trail where each function was called, ending with the call to trace_preempt_on() at 1162us, at which point kernel preemption is once again enabled. With this information, you can look back through the call chain and (hopefully) work out if this is a problem or not.

The other tracers mentioned work in the same way.

If cyclictest reports unexpectedly long latencies you can use the breaktrace option to abort the program and trigger Ftrace to obtain more information.

You invoke breaktrace using -b<N> or --breaktrace=<N> where N is the number of microseconds of latency that will trigger the trace. You select the Ftrace tracer using -T[tracer name] or one of the following:

For example, this will trigger the Ftrace function tracer when a latency greater than 100 microseconds is measured: