When a process is rescheduled to run on a multiprocessor system, it doesn’t necessarily run on the same CPU on which it last executed. The usual reason it may run on another CPU is that the original CPU is already busy.
When a process changes CPUs, there is a performance impact: in order for a line of the process’s data to be loaded into the cache of the new CPU, it must first be invalidated (i.e., either discarded if it is unmodified, or flushed to main memory if it was modified), if present in the cache of the old CPU. (To prevent cache inconsistencies, multiprocessor architectures allow data to be kept in only one CPU cache at a time.) This invalidation costs execution time. Because of this performance impact, the Linux (2.6) kernel tries to ensure soft CPU affinity for a process—wherever possible, the process is rescheduled to run on the same CPU.
A cache line is the cache analog of a page in a virtual memory management system. It is the size of the unit used for transfers between the CPU cache and main memory. Typical line sizes range from 32 to 128 bytes. For further information, see [Schimmel, 1994] and [Drepper, 2007].
One of the fields in the Linux-specific /proc/PID/stat file displays the number of the CPU on which a process is currently executing or last executed. See the proc(5) manual page for details.
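As a rough illustration of reading that field, the following sketch parses /proc/PID/stat. The function name last_cpu_of() is hypothetical, and the code assumes the layout documented in proc(5), where the CPU number is field 39 ("processor").

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the CPU number recorded in /proc/PID/stat, or -1 on error.
   Assumes the layout documented in proc(5), where the CPU number is
   field 39 ("processor"). */
int last_cpu_of(pid_t pid)
{
    char path[64], buf[4096];
    FILE *fp;
    size_t n;
    char *p, *tok, *save;
    int field, cpu;

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long) pid);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, fp);
    fclose(fp);
    buf[n] = '\0';

    /* Field 2 (the command name) is parenthesized and may itself contain
       spaces or parentheses, so resume parsing after the last ')' */
    p = strrchr(buf, ')');
    if (p == NULL)
        return -1;

    field = 2;                      /* Tokens after ')' start at field 3 */
    for (tok = strtok_r(p + 1, " \n", &save); tok != NULL;
            tok = strtok_r(NULL, " \n", &save)) {
        if (++field == 39)
            return (sscanf(tok, "%d", &cpu) == 1) ? cpu : -1;
    }
    return -1;                      /* Fewer fields than expected */
}
```

Calling last_cpu_of(getpid()) reports the CPU on which the caller was running when the kernel generated the file contents. The value may already be stale by the time it is examined.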
Sometimes, it is desirable to set hard CPU affinity for a process, so that it is explicitly restricted to always running on one, or a subset, of the available CPUs. Among the reasons we may want to do this are the following:
We can avoid the performance impacts caused by invalidation of cached data.
If multiple threads (or processes) are accessing the same data, then we may obtain performance benefits by confining them all to the same CPU, so that they don’t contend for the data and thus cause cache misses.
For a time-critical application, it may be desirable to confine most processes on the system to other CPUs, while reserving one or more CPUs for the time-critical application.
The isolcpus kernel boot option can be used to isolate one or more CPUs from the normal kernel scheduling algorithms. The only way to move a process on or off a CPU that has been isolated is via the CPU affinity system calls described in this section. The isolcpus boot option is the preferred method of implementing the last of the scenarios listed above. For details, see the kernel source file Documentation/kernel-parameters.txt.
Linux also provides a cpuset kernel option, which can be used on systems containing large numbers of CPUs to achieve more sophisticated control over how the CPUs and memory are allocated to processes. For details, see the kernel source file Documentation/cpusets.txt.
Linux 2.6 provides a pair of nonstandard system calls to modify and retrieve the hard CPU affinity of a process: sched_setaffinity() and sched_getaffinity().
Many other UNIX implementations provide interfaces for controlling CPU affinity. For example, HP-UX and Solaris provide a pset_bind() system call.
The sched_setaffinity() system call sets the CPU affinity of the process specified by pid. If pid is 0, the CPU affinity of the calling process is changed.
#define _GNU_SOURCE
#include <sched.h>
int sched_setaffinity(pid_t pid, size_t len, cpu_set_t *set);
Returns 0 on success, or -1 on error
The CPU affinity to be assigned to the process is specified in the cpu_set_t structure pointed to by set.
CPU affinity is actually a per-thread attribute that can be adjusted independently for each of the threads in a thread group. If we want to change the CPU affinity of a specific thread in a multithreaded process, we can specify pid as the value returned by a call to gettid() in that thread. Specifying pid as 0 means the calling thread.
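A minimal sketch of the per-thread usage follows. The helper name set_calling_thread_affinity() is hypothetical. The thread ID is obtained with syscall(SYS_gettid) on the assumption that the C library in use may not supply a gettid() wrapper, as is the case with older versions of glibc.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Applies 'set' to the calling thread only, naming the thread by its
   kernel thread ID. Returns 0 on success, or -1 on error (errno is set). */
int set_calling_thread_affinity(const cpu_set_t *set)
{
    pid_t tid = (pid_t) syscall(SYS_gettid);    /* Kernel ID of this thread */

    return sched_setaffinity(tid, sizeof(cpu_set_t), set);
}
```

This function must be called from within the thread whose affinity is to be changed. Passing 0 as pid would have the same effect. The explicit thread ID is useful mainly when one thread records its ID so that another thread can adjust its affinity later.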
Although the cpu_set_t data type is implemented as a bit mask, we should treat it as an opaque structure. All manipulations of the structure should be done using the macros CPU_ZERO(), CPU_SET(), CPU_CLR(), and CPU_ISSET().
#define _GNU_SOURCE
#include <sched.h>
void CPU_ZERO(cpu_set_t *set);
void CPU_SET(int cpu, cpu_set_t *set);
void CPU_CLR(int cpu, cpu_set_t *set);
int CPU_ISSET(int cpu, cpu_set_t *set);
Returns true (1) if cpu is in set, or false (0) otherwise
These macros operate on the CPU set pointed to by set as follows:
CPU_ZERO() initializes set to be empty.
CPU_SET() adds the CPU cpu to set.
CPU_CLR() removes the CPU cpu from set.
CPU_ISSET() returns true if the CPU cpu is a member of set.
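The four macros can be exercised together as in the following sketch. The function name build_example_set() and the CPU numbers chosen are purely illustrative.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Builds the set {1, 2, 3} and then removes CPU 2, leaving {1, 3} */
void build_example_set(cpu_set_t *set)
{
    CPU_ZERO(set);          /* Start from an empty set */
    CPU_SET(1, set);
    CPU_SET(2, set);
    CPU_SET(3, set);
    CPU_CLR(2, set);        /* Remove CPU 2 again */
}
```

After the call, CPU_ISSET() is true for CPUs 1 and 3 and false for every other CPU number. Note that CPU_ZERO() must come first, since an automatic cpu_set_t variable otherwise has indeterminate contents.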
The GNU C library also provides a number of other macros for working with CPU sets. See the CPU_SET(3) manual page for details.
The CPUs in a CPU set are numbered starting at 0. The <sched.h> header file defines the constant CPU_SETSIZE to be one greater than the maximum CPU number that can be represented in a cpu_set_t variable. CPU_SETSIZE has the value 1024.
The len argument given to sched_setaffinity() should specify the number of bytes in the set argument (i.e., sizeof(cpu_set_t)).
The following code confines the process identified by pid to running on any CPU other than the first CPU of a four-processor system:
cpu_set_t set;
CPU_ZERO(&set);
CPU_SET(1, &set);
CPU_SET(2, &set);
CPU_SET(3, &set);
sched_setaffinity(pid, sizeof(set), &set);
If the CPUs specified in set don’t correspond to any CPUs on the system, then sched_setaffinity() fails with the error EINVAL.
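The simplest set that matches no CPU on the system is an empty one, which gives an easy way of observing this error. The following is a sketch only, and the function name errno_for_empty_set() is hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>

/* Tries to apply an empty CPU set to the calling thread. Returns the
   resulting errno value, or 0 if the call unexpectedly succeeds. */
int errno_for_empty_set(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);                 /* No CPUs at all */
    if (sched_setaffinity(0, sizeof(set), &set) == -1)
        return errno;
    return 0;
}
```

On Linux, the returned value is expected to be EINVAL, and the existing affinity of the caller is left unchanged by the failed call.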
If set doesn’t include the CPU on which the target process is currently running, then the process is migrated to one of the CPUs in set.
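The following sketch shows a call with error checking. It restricts the caller to a single CPU chosen from the CPUs it is already permitted to use, so that the request is valid on any system. The function name pin_self_to_first_allowed_cpu() is hypothetical. The current mask is fetched using sched_getaffinity(), which is described later in this section, and sched_getcpu() is a glibc function that reports the CPU on which the caller is running.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Restricts the calling thread to the lowest-numbered CPU that it is
   currently allowed to use. Returns that CPU number, or -1 on error. */
int pin_self_to_first_allowed_cpu(void)
{
    cpu_set_t allowed, target;
    int cpu;

    if (sched_getaffinity(0, sizeof(allowed), &allowed) == -1)
        return -1;

    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)     /* Find first allowed CPU */
        if (CPU_ISSET(cpu, &allowed))
            break;
    if (cpu == CPU_SETSIZE)
        return -1;

    CPU_ZERO(&target);
    CPU_SET(cpu, &target);
    if (sched_setaffinity(0, sizeof(target), &target) == -1)
        return -1;
    return cpu;
}
```

Once the call succeeds, sched_getcpu() in the same thread should return the chosen CPU number, whichever CPU the thread was running on beforehand.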
An unprivileged process may set the CPU affinity of another process only if its effective user ID matches the real or effective user ID of the target process. A privileged (CAP_SYS_NICE) process may set the CPU affinity of any process.
The sched_getaffinity() system call retrieves the CPU affinity mask of the process specified by pid. If pid is 0, the CPU affinity mask of the calling process is returned.
#define _GNU_SOURCE
#include <sched.h>
int sched_getaffinity(pid_t pid, size_t len, cpu_set_t *set);
Returns 0 on success, or -1 on error
The CPU affinity mask is returned in the cpu_set_t structure pointed to by set. The len argument should be set to indicate the number of bytes in this structure (i.e., sizeof(cpu_set_t)). We can use the CPU_ISSET() macro to determine which CPUs are in the returned set.
If the CPU affinity mask of the target process has not otherwise been modified, sched_getaffinity() returns a set containing all of the CPUs on the system.
No permission checking is performed by sched_getaffinity(); an unprivileged process can retrieve the CPU affinity mask of any process on the system.
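A minimal sketch of retrieving and displaying an affinity mask follows. The function name print_affinity() is hypothetical. The loop tests every CPU number that a cpu_set_t can represent.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/* Prints the CPUs in the affinity mask of 'pid' (0 means the caller).
   Returns the number of CPUs in the mask, or -1 on error. */
int print_affinity(pid_t pid)
{
    cpu_set_t set;
    int cpu, count = 0;

    if (sched_getaffinity(pid, sizeof(set), &set) == -1) {
        perror("sched_getaffinity");
        return -1;
    }

    for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set)) {
            printf("CPU %d\n", cpu);
            count++;
        }
    }
    return count;
}
```

Calling print_affinity(0) lists the CPUs on which the caller may run. Because no permission checking is performed, the same function can be given the ID of any other process on the system.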
A child process created by fork() inherits its parent’s CPU affinity mask, and this mask is preserved across an exec().
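The inheritance across fork() can be checked with a sketch such as the following. The function name child_inherits_affinity() is hypothetical, and CPU_EQUAL() is one of the additional glibc macros documented in CPU_SET(3).

```c
#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a child created by fork() has the same affinity mask as
   its parent, 0 if the masks differ, or -1 on error. */
int child_inherits_affinity(void)
{
    cpu_set_t parent_set, child_set;
    pid_t child;
    int status;

    if (sched_getaffinity(0, sizeof(parent_set), &parent_set) == -1)
        return -1;

    child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {               /* Child: compare own mask with parent's */
        if (sched_getaffinity(0, sizeof(child_set), &child_set) == -1)
            _exit(2);
        _exit(CPU_EQUAL(&parent_set, &child_set) ? 0 : 1);
    }

    /* Parent: translate the child's exit status into a return value */
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    switch (WEXITSTATUS(status)) {
    case 0:  return 1;
    case 1:  return 0;
    default: return -1;
    }
}
```

The child reports the result of the comparison through its exit status, which keeps the example free of any interprocess communication beyond waitpid().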
The sched_setaffinity() and sched_getaffinity() system calls are Linux-specific.