In this section, we provide details on each of the resource limits available on Linux, noting those that are Linux-specific.
The RLIMIT_AS limit specifies the maximum size for the process’s virtual memory (address space), in bytes. Attempts to exceed this limit (via brk(), sbrk(), mmap(), mremap(), and shmat()) fail with the error ENOMEM. In practice, the most common place where a program may hit this limit is in calls to functions in the malloc package, which make use of sbrk() and mmap(). Upon encountering this limit, stack growth can also fail, with the consequences listed below for RLIMIT_STACK.
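The effect on the malloc package can be observed with a small experiment. The following is a minimal sketch, not a definitive test: the helper name try_alloc_under_as_limit() is hypothetical, and it assumes a malloc implementation (such as glibc's) that sets errno to ENOMEM on failure.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Lower the soft RLIMIT_AS to `limit` bytes (clamped to the hard limit),
   attempt one malloc() of `request` bytes, then restore the old soft limit.
   Returns 0 if the allocation succeeded, the errno value left by malloc()
   if it failed, or -1 if the limit could not be read or changed. */
int try_alloc_under_as_limit(rlim_t limit, size_t request)
{
    struct rlimit old, rl;
    void *volatile p;       /* volatile: keep the compiler from eliding malloc() */
    int result;

    if (getrlimit(RLIMIT_AS, &old) == -1)
        return -1;

    rl = old;
    rl.rlim_cur = (limit < old.rlim_max) ? limit : old.rlim_max;
    if (setrlimit(RLIMIT_AS, &rl) == -1)
        return -1;

    errno = 0;
    p = malloc(request);
    result = (p == NULL) ? errno : 0;
    free(p);

    /* Raising the soft limit back up to its old value needs no privilege,
       because the hard limit was left untouched. */
    setrlimit(RLIMIT_AS, &old);
    return result;
}
```

With a 512 MiB limit and a 1 GiB request, the call would be expected to return ENOMEM on a typical 64-bit Linux system.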
The RLIMIT_CORE limit specifies the maximum size, in bytes, for core dump files produced when a process is terminated by certain signals (see Core Dump Files). Production of a core dump file will stop at this limit. Specifying a limit of 0 prevents creation of core dump files, which is sometimes useful because core dump files can be very large, and end users usually don’t know what to do with them. Another reason for disabling core dumps is security: it prevents the contents of a program’s memory from being dumped to disk. If the RLIMIT_FSIZE limit is lower than this limit, core dump files are limited to RLIMIT_FSIZE bytes.
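Disabling core dumps takes a single setrlimit() call. The sketch below uses a hypothetical helper name, disable_core_dumps(), and sets the hard limit to 0 as well as the soft limit, on the assumption that the program never wants an unprivileged process to re-enable core dumps.

```c
#include <sys/resource.h>

/* Set both the soft and hard RLIMIT_CORE limits to 0 so that no core
   dump file is produced.  Because the hard limit is lowered too, an
   unprivileged process cannot raise the limit again afterward.
   Returns 0 on success, or -1 on error (errno is set by setrlimit()). */
int disable_core_dumps(void)
{
    struct rlimit rl = { .rlim_cur = 0, .rlim_max = 0 };

    return setrlimit(RLIMIT_CORE, &rl);
}
```

A program that handles sensitive data might call this early in main(), before any secrets are loaded into memory.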
The RLIMIT_CPU limit specifies the maximum number of seconds of CPU time (in both system and user mode) that can be used by the process. SUSv3 requires that the SIGXCPU signal be sent to the process when the soft limit is reached, but leaves other details unspecified. (The default action for SIGXCPU is to terminate a process with a core dump.) It is possible to establish a handler for SIGXCPU that does whatever processing is desired and then returns control to the main program. Thereafter, on Linux, SIGXCPU is sent once per second of consumed CPU time. If the process continues executing until the hard CPU limit is reached, then the kernel sends it a SIGKILL signal, which always terminates the process.
UNIX implementations vary in the details of how they deal with processes that continue consuming CPU time after handling a SIGXCPU signal. Most continue to deliver SIGXCPU at regular intervals. If aiming for portable use of this signal, we should code an application so that, on first receipt of this signal, it does whatever cleanup is required and terminates. (Alternatively, the program could change the resource limit after receiving the signal.)
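The portable pattern described above (clean up and terminate on the first SIGXCPU) might look like the following sketch. The function name run_cpu_limited_child() and the exit status CPU_LIMIT_EXIT_STATUS are arbitrary choices for illustration; the work is done in a child process so that the caller's own limits are unaffected.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CPU_LIMIT_EXIT_STATUS 3     /* arbitrary status chosen for this sketch */

/* SIGXCPU handler: perform minimal async-signal-safe cleanup, then exit. */
static void on_sigxcpu(int sig)
{
    static const char msg[] = "CPU limit reached; exiting\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);

    (void) n;
    (void) sig;
    _exit(CPU_LIMIT_EXIT_STATUS);
}

/* Fork a child that spins until its soft RLIMIT_CPU limit of `secs`
   seconds (clamped to the hard limit) is reached.
   Returns the child's exit status, or -1 on error. */
int run_cpu_limited_child(rlim_t secs)
{
    pid_t pid = fork();
    int status;

    if (pid == -1)
        return -1;

    if (pid == 0) {                         /* child */
        struct sigaction sa;
        struct rlimit rl;

        sa.sa_handler = on_sigxcpu;
        sa.sa_flags = 0;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGXCPU, &sa, NULL) == -1)
            _exit(1);

        if (getrlimit(RLIMIT_CPU, &rl) == -1)
            _exit(1);
        rl.rlim_cur = (secs < rl.rlim_max) ? secs : rl.rlim_max;
        if (setrlimit(RLIMIT_CPU, &rl) == -1)
            _exit(1);

        for (volatile unsigned long i = 0; ; i++)
            ;                               /* burn CPU time until signaled */
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_cpu_limited_child(1) should return CPU_LIMIT_EXIT_STATUS after the child has consumed about one second of CPU time.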
The RLIMIT_DATA limit specifies the maximum size, in bytes, of the process’s data segment (the sum of the initialized data, uninitialized data, and heap segments described in Memory Layout of a Process). Attempts to extend the data segment (program break) beyond this limit, using sbrk() or brk(), fail with the error ENOMEM. As with RLIMIT_AS, the most common place where a program may hit this limit is in calls to functions in the malloc package.
The RLIMIT_FSIZE limit specifies the maximum size of files that the process may create, in bytes. If a process attempts to extend a file beyond the soft limit, it is sent a SIGXFSZ signal, and the system call (e.g., write() or truncate()) fails with the error EFBIG. The default action for SIGXFSZ is to terminate a process and produce a core dump. It is possible to instead catch this signal and return control to the main program. However, any further attempt to extend the file will yield the same signal and error.
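The combination of signal and error can be seen by catching SIGXFSZ and then writing at the limit. This is a sketch under stated assumptions: the helper write_past_fsize_limit(), the flag got_sigxfsz, and the temporary file template under /tmp are all hypothetical names chosen for the example.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigxfsz = 0;

/* Catch SIGXFSZ so the process is not terminated; just record delivery. */
static void on_sigxfsz(int sig)
{
    (void) sig;
    got_sigxfsz = 1;
}

/* With the soft RLIMIT_FSIZE lowered to `limit` bytes (clamped to the
   hard limit), try to write one byte at file offset `limit` in a
   temporary file.  Returns the errno from the failed write, 0 if the
   write unexpectedly succeeded, or -1 on a setup error.  The previous
   limit and signal disposition are restored before returning. */
int write_past_fsize_limit(rlim_t limit)
{
    char path[] = "/tmp/fsize-demo-XXXXXX";
    struct sigaction sa, old_sa;
    struct rlimit old_rl, rl;
    int fd, result;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                   /* file disappears once fd is closed */

    sa.sa_handler = on_sigxfsz;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGXFSZ, &sa, &old_sa) == -1 ||
            getrlimit(RLIMIT_FSIZE, &old_rl) == -1) {
        close(fd);
        return -1;
    }

    rl = old_rl;
    rl.rlim_cur = (limit < old_rl.rlim_max) ? limit : old_rl.rlim_max;
    if (setrlimit(RLIMIT_FSIZE, &rl) == -1) {
        sigaction(SIGXFSZ, &old_sa, NULL);
        close(fd);
        return -1;
    }

    got_sigxfsz = 0;
    errno = 0;
    result = (pwrite(fd, "x", 1, (off_t) rl.rlim_cur) == -1) ? errno : 0;

    setrlimit(RLIMIT_FSIZE, &old_rl);
    sigaction(SIGXFSZ, &old_sa, NULL);
    close(fd);
    return result;
}
```

The write is placed exactly at the limit because a write that merely straddles the limit may instead succeed partially, transferring only the bytes that fit.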
The RLIMIT_MEMLOCK limit (BSD-derived; absent from SUSv3 and available only on Linux and the BSDs) specifies the maximum number of bytes of virtual memory that a process may lock into physical memory, to prevent the memory from being swapped out. This limit affects the mlock() and mlockall() system calls, and the locking options for the mmap() and shmctl() system calls. We describe the details in Section 50.2.
If the MCL_FUTURE flag is specified when calling mlockall(), then the RLIMIT_MEMLOCK limit may also cause later calls to brk(), sbrk(), mmap(), or mremap() to fail.
The RLIMIT_MSGQUEUE limit (Linux-specific; since Linux 2.6.8) specifies the maximum number of bytes that can be allocated for POSIX message queues for the real user ID of the calling process. When a POSIX message queue is created using mq_open(), bytes are deducted against this limit according to the following formula:
bytes = attr.mq_maxmsg * sizeof(struct msg_msg *) + attr.mq_maxmsg * attr.mq_msgsize;
In this formula, attr is the mq_attr structure that is passed as the fourth argument to mq_open(). The addend that includes sizeof(struct msg_msg *) ensures that the user can’t queue an unlimited number of zero-length messages. (The msg_msg structure is a data type used internally by the kernel.) This is necessary because, although zero-length messages contain no data, they do consume some system memory for bookkeeping overhead.
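As a worked illustration of the formula above, the following sketch computes the charge for a given mq_attr structure. The function name mq_bytes_charged() is hypothetical, and sizeof(void *) stands in for sizeof(struct msg_msg *) because struct msg_msg is private to the kernel (this assumes all data pointers have the same size). The getrlimit(2) manual page documents a different formula for Linux 3.5 and later, so treat this as an illustration of the formula as given here rather than of current kernel accounting.

```c
#include <mqueue.h>
#include <stddef.h>

/* Bytes deducted against RLIMIT_MSGQUEUE for a queue created with
   `attr`, following the formula in the text.  sizeof(void *) is an
   assumed stand-in for the kernel-internal sizeof(struct msg_msg *). */
size_t mq_bytes_charged(const struct mq_attr *attr)
{
    size_t maxmsg = (size_t) attr->mq_maxmsg;
    size_t msgsize = (size_t) attr->mq_msgsize;

    return maxmsg * sizeof(void *) + maxmsg * msgsize;
}
```

Note that a queue of zero-length messages (mq_msgsize of 0) still yields a nonzero charge, which is the purpose of the first addend.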
The RLIMIT_MSGQUEUE limit affects only the calling process. Other processes belonging to this user are not affected unless they also set this limit or inherit it.
The RLIMIT_NICE limit (Linux-specific; since Linux 2.6.12) specifies a ceiling on the nice value that may be set for this process using sched_setscheduler() and nice(). The ceiling is calculated as 20 - rlim_cur, where rlim_cur is the current RLIMIT_NICE soft resource limit. Refer to Process Priorities (Nice Values) for further details.
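The ceiling calculation can be written out directly. In this sketch, nice_ceiling_for() and current_nice_ceiling() are hypothetical helper names; the code simply applies 20 - rlim_cur and does not treat RLIM_INFINITY specially.

```c
#include <sys/resource.h>

/* Ceiling implied by an RLIMIT_NICE soft limit: 20 - rlim_cur.
   For example, a soft limit of 40 gives a ceiling of -20, and a soft
   limit of 1 gives a ceiling of 19. */
long nice_ceiling_for(rlim_t rlim_cur)
{
    return 20 - (long) rlim_cur;
}

/* Compute the ceiling for the calling process from its current soft
   limit.  Returns 0 on success, or -1 if getrlimit() fails. */
int current_nice_ceiling(long *ceiling)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NICE, &rl) == -1)
        return -1;
    *ceiling = nice_ceiling_for(rl.rlim_cur);
    return 0;
}
```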
The RLIMIT_NOFILE limit specifies a number one greater than the maximum file descriptor number that a process may allocate. Attempts to allocate descriptors beyond this limit (e.g., using open(), pipe(), socket(), accept(), shm_open(), dup(), dup2(), fcntl(F_DUPFD), and epoll_create()) fail. In most cases, the error is EMFILE, but for dup2(fd, newfd) it is EBADF, and for fcntl(fd, F_DUPFD, newfd) with newfd greater than or equal to the limit, it is EINVAL.
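The three different errors can be provoked in one short experiment. The sketch below is illustrative only: probe_nofile_errors(), struct nofile_errors, and the limit of 32 are hypothetical choices, and the code assumes the process has /dev/null available and fewer than 32 descriptors in use.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

#define DEMO_NOFILE_LIMIT 32        /* arbitrary small limit for this sketch */

struct nofile_errors {
    int dup_errno;      /* dup() once no descriptor below the limit is free */
    int dup2_errno;     /* dup2(fd, limit) */
    int fcntl_errno;    /* fcntl(fd, F_DUPFD, limit) */
};

/* Temporarily lower the soft RLIMIT_NOFILE and record the errno produced
   by each kind of failing call.  Returns 0 on success, -1 on setup error. */
int probe_nofile_errors(struct nofile_errors *out)
{
    struct rlimit old, rl;
    int extra[DEMO_NOFILE_LIMIT];
    int n = 0, fd, base;

    if (getrlimit(RLIMIT_NOFILE, &old) == -1)
        return -1;
    base = open("/dev/null", O_RDONLY);
    if (base == -1)
        return -1;

    rl = old;
    rl.rlim_cur = DEMO_NOFILE_LIMIT;
    if (base >= DEMO_NOFILE_LIMIT || setrlimit(RLIMIT_NOFILE, &rl) == -1) {
        close(base);
        return -1;
    }

    fd = dup2(base, DEMO_NOFILE_LIMIT);
    out->dup2_errno = (fd == -1) ? errno : 0;
    if (fd != -1)
        close(fd);

    fd = fcntl(base, F_DUPFD, DEMO_NOFILE_LIMIT);
    out->fcntl_errno = (fd == -1) ? errno : 0;
    if (fd != -1)
        close(fd);

    /* Use up every free descriptor below the limit, then note why the
       next dup() fails. */
    errno = 0;
    while (n < DEMO_NOFILE_LIMIT && (fd = dup(base)) != -1)
        extra[n++] = fd;
    out->dup_errno = (fd == -1) ? errno : 0;

    while (n > 0)
        close(extra[--n]);
    close(base);
    setrlimit(RLIMIT_NOFILE, &old);     /* restore the original soft limit */
    return 0;
}
```

On Linux, the three recorded values would be expected to be EMFILE, EBADF, and EINVAL, respectively.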
Changes to the RLIMIT_NOFILE limit are reflected in the value returned by sysconf(_SC_OPEN_MAX). SUSv3 permits, but doesn’t require, an implementation to return different values for a call to sysconf(_SC_OPEN_MAX) before and after changing the RLIMIT_NOFILE limit; other implementations may not behave the same as Linux on this point.
SUSv3 states that if an application sets the soft or hard RLIMIT_NOFILE limit to a value less than or equal to the number of the highest file descriptor that the process currently has open, unexpected behavior may occur.
On Linux, we can check which file descriptors a process currently has open by using readdir() to scan the contents of the /proc/PID/fd directory, which contains symbolic links for each of the file descriptors currently opened by the process.
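A minimal sketch of such a scan follows. It inspects the calling process through /proc/self/fd (the self link refers to the process's own /proc/PID directory), and count_open_fds() is a hypothetical helper name. The descriptor that opendir() itself opens is excluded from the count.

```c
#include <dirent.h>
#include <stdlib.h>

/* Count the file descriptors open in the calling process by listing
   /proc/self/fd.  The descriptor used internally by opendir() is not
   counted.  Returns the count, or -1 if the directory cannot be opened. */
int count_open_fds(void)
{
    DIR *dir = opendir("/proc/self/fd");
    struct dirent *ent;
    int count = 0;

    if (dir == NULL)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(ent->d_name, &end, 10);

        if (end == ent->d_name || *end != '\0')
            continue;                   /* skip "." and ".." */
        if (fd == dirfd(dir))
            continue;                   /* skip our own directory stream */
        count++;
    }

    closedir(dir);
    return count;
}
```

To find the highest open descriptor instead (for example, before lowering RLIMIT_NOFILE), the same loop could track the maximum value of fd rather than counting entries.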
The kernel imposes a ceiling on the value to which the RLIMIT_NOFILE limit may be raised. In kernels before 2.6.25, this ceiling is a hard-coded value defined by the kernel constant NR_OPEN, whose value is 1,048,576. (A kernel rebuild is required to raise this ceiling.) Since kernel 2.6.25, the limit is defined by the value in the Linux-specific /proc/sys/fs/nr_open file. The default value in this file is 1,048,576; this can be modified by the superuser. Attempts to set the soft or hard RLIMIT_NOFILE limit higher than the ceiling value yield the error EPERM.
There is also a system-wide limit on the total number of files that may be opened by all processes. This limit can be retrieved and modified via the Linux-specific /proc/sys/fs/file-max file. (Referring to Relationship Between File Descriptors and Open Files, we can define file-max more precisely as a system-wide limit on the number of open file descriptions.) Only privileged (CAP_SYS_ADMIN) processes can exceed the file-max limit. In an unprivileged process, a system call that encounters the file-max limit fails with the error ENFILE.
The RLIMIT_NPROC limit (BSD-derived; absent from SUSv3 and available only on Linux and the BSDs) specifies the maximum number of processes that may be created for the real user ID of the calling process. Attempts to exceed this limit (using fork(), vfork(), and clone()) fail with the error EAGAIN.
The RLIMIT_NPROC limit affects only the calling process. Other processes belonging to this user are not affected unless they also set or inherit this limit. This limit is not enforced for privileged (CAP_SYS_ADMIN or CAP_SYS_RESOURCE) processes.
Linux also imposes a system-wide limit on the number of processes that can be created by all users. On Linux 2.4 and later, the Linux-specific /proc/sys/kernel/threads-max file can be used to retrieve and modify this limit.
To be precise, the RLIMIT_NPROC resource limit and the threads-max file are actually limits on the number of threads that can be created, rather than the number of processes.
The manner in which the default value for the RLIMIT_NPROC resource limit is set has varied across kernel versions. In Linux 2.2, it was calculated according to a fixed formula. In Linux 2.4 and later, it is calculated using a formula based on the amount of available physical memory.
SUSv3 doesn’t specify the RLIMIT_NPROC resource limit. The SUSv3-mandated method for retrieving (but not changing) the maximum number of processes permitted to a user ID is via the call sysconf(_SC_CHILD_MAX). This sysconf() call is supported on Linux, but in kernel versions before 2.6.23, the call does not return accurate information: it always returns the value 999. Since Linux 2.6.23 (and with glibc 2.4 and later), this call correctly reports the limit (by checking the value of the RLIMIT_NPROC resource limit).
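The two ways of retrieving the limit can be compared side by side. In the sketch below, child_max_both_ways() is a hypothetical helper name. The expectation that sysconf() returns -1 when the soft limit is RLIM_INFINITY is an assumption about the C library's behavior, not something the text states.

```c
#include <sys/resource.h>
#include <unistd.h>

/* Retrieve the per-user process limit both via sysconf(_SC_CHILD_MAX)
   and via the RLIMIT_NPROC soft limit.  On a sufficiently recent kernel
   and C library, the two values are expected to agree (with sysconf()
   assumed to return -1 when the limit is RLIM_INFINITY).
   Returns 0 on success, or -1 if getrlimit() fails. */
int child_max_both_ways(long *from_sysconf, rlim_t *from_rlimit)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NPROC, &rl) == -1)
        return -1;

    *from_rlimit = rl.rlim_cur;
    *from_sysconf = sysconf(_SC_CHILD_MAX);
    return 0;
}
```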
There is no portable way of discovering how many processes have already been created for a specific user ID. On Linux, we can try scanning all of the /proc/PID/status files on the system and examining the information under the Uid entry (which lists the four process user IDs in the order: real, effective, saved set, and file system) in order to estimate the number of processes currently owned by a user. Be aware, however, that by the time we have completed such a scan, this information may already have changed.
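The scan described above could be sketched as follows. The function names count_processes_for_uid() and count_my_processes() are hypothetical. The result is only an estimate, both because processes may come and go during the scan and because each /proc/PID entry represents a whole process rather than an individual thread.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Estimate how many processes have real user ID `uid` by reading the
   Uid line of every /proc/PID/status file.  Returns the count, or -1
   if /proc cannot be opened. */
long count_processes_for_uid(uid_t uid)
{
    DIR *proc = opendir("/proc");
    struct dirent *ent;
    long count = 0;

    if (proc == NULL)
        return -1;

    while ((ent = readdir(proc)) != NULL) {
        char path[288], line[256];
        FILE *fp;

        if (!isdigit((unsigned char) ent->d_name[0]))
            continue;                   /* not a PID directory */

        snprintf(path, sizeof(path), "/proc/%s/status", ent->d_name);
        fp = fopen(path, "r");
        if (fp == NULL)
            continue;                   /* process may already have exited */

        while (fgets(line, sizeof(line), fp) != NULL) {
            unsigned long real_uid;

            /* The first number on the Uid line is the real user ID. */
            if (sscanf(line, "Uid: %lu", &real_uid) == 1) {
                if ((uid_t) real_uid == uid)
                    count++;
                break;
            }
        }
        fclose(fp);
    }

    closedir(proc);
    return count;
}

/* Convenience wrapper: estimate for the caller's own real user ID. */
long count_my_processes(void)
{
    return count_processes_for_uid(getuid());
}
```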
The RLIMIT_RSS limit (BSD-derived; absent from SUSv3, but widely available) specifies the maximum number of pages in the process’s resident set; that is, the total number of virtual memory pages currently in physical memory. This limit is provided on Linux, but it currently has no effect.
In older Linux 2.4 kernels (up to and including 2.4.29), RLIMIT_RSS did have an effect on the behavior of the madvise() MADV_WILLNEED operation (Advising Future Memory Usage Patterns: madvise()). If this operation could not be performed as a result of encountering the RLIMIT_RSS limit, the error EIO was returned in errno.
The RLIMIT_RTPRIO limit (Linux-specific; since Linux 2.6.12) specifies a ceiling on the realtime priority that may be set for this process using sched_setscheduler() and sched_setparam(). Refer to Modifying and Retrieving Policies and Priorities for further details.
The RLIMIT_RTTIME limit (Linux-specific; since Linux 2.6.25) specifies the maximum amount of CPU time, in microseconds, that a process running under a realtime scheduling policy may consume without sleeping (i.e., without performing a blocking system call). The behavior if this limit is reached is the same as for RLIMIT_CPU: if the process reaches the soft limit, then a SIGXCPU signal is sent to the process, and further SIGXCPU signals are sent for each additional second of CPU time consumed. On reaching the hard limit, a SIGKILL signal is sent. Refer to Modifying and Retrieving Policies and Priorities for further details.
The RLIMIT_SIGPENDING limit (Linux-specific; since Linux 2.6.8) specifies the maximum number of signals that may be queued for the real user ID of the calling process. Attempts to exceed this limit (using sigqueue()) fail with the error EAGAIN.
The RLIMIT_SIGPENDING limit affects only the calling process. Other processes belonging to this user are not affected unless they also set or inherit this limit.
As initially implemented, the default value for the RLIMIT_SIGPENDING limit was 1024. Since kernel 2.6.12, the default value is the same as the default value for RLIMIT_NPROC.
For the purposes of checking the RLIMIT_SIGPENDING limit, the count of queued signals includes both realtime and standard signals. (Standard signals can be queued only once to a process.) However, this limit is enforced only for sigqueue(). Even if the number of signals specified by this limit has already been queued to processes belonging to this real user ID, it is still possible to use kill() to queue one instance of each of the signals (including realtime signals) that are not already queued to a process.
From kernel 2.6.12 onward, the SigQ field of the Linux-specific /proc/PID/status file displays the current and maximum number of queued signals for the real user ID of the process.
The RLIMIT_STACK limit specifies the maximum size of the process stack, in bytes. Attempts to grow the stack beyond this limit result in the generation of a SIGSEGV signal for the process. Since the stack is exhausted, the only way to catch this signal is by establishing an alternate signal stack, as described in Section 21.3.
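Catching the SIGSEGV that results from stack exhaustion might be sketched as follows. All names here (run_stack_overflow_child(), overflow_stack(), OVERFLOW_EXIT_STATUS) are hypothetical, the 64 KiB alternate stack size is an assumption that it exceeds MINSIGSTKSZ on the target system, and the child lowers its own soft RLIMIT_STACK to 1 MiB merely to make the demonstration quick.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define OVERFLOW_EXIT_STATUS 42     /* arbitrary status chosen for this sketch */

static char alt_stack[1 << 16];     /* 64 KiB; assumed to be >= MINSIGSTKSZ */
static volatile int keep_recursing = 1;

/* Runs on the alternate stack, since the normal stack is exhausted. */
static void on_sigsegv(int sig)
{
    static const char msg[] = "stack limit reached\n";
    ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);

    (void) n;
    (void) sig;
    _exit(OVERFLOW_EXIT_STATUS);
}

/* Recurse without bound.  The volatile buffer and the addition after
   the call keep each frame alive and prevent tail-call elimination. */
static int overflow_stack(int depth)
{
    volatile char pad[1024];

    pad[0] = (char) depth;
    if (!keep_recursing)
        return depth;
    return overflow_stack(depth + 1) + pad[0];
}

/* Fork a child that overflows its stack and handles the resulting
   SIGSEGV on an alternate signal stack.  Returns the child's exit
   status, or -1 on error. */
int run_stack_overflow_child(void)
{
    pid_t pid = fork();
    int status;

    if (pid == -1)
        return -1;

    if (pid == 0) {                         /* child */
        stack_t ss;
        struct sigaction sa;
        struct rlimit rl;

        ss.ss_sp = alt_stack;
        ss.ss_size = sizeof(alt_stack);
        ss.ss_flags = 0;
        if (sigaltstack(&ss, NULL) == -1)
            _exit(1);

        sa.sa_handler = on_sigsegv;
        sa.sa_flags = SA_ONSTACK;           /* deliver on the alternate stack */
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGSEGV, &sa, NULL) == -1)
            _exit(1);

        if (getrlimit(RLIMIT_STACK, &rl) == 0 &&
                rl.rlim_cur > ((rlim_t) 1 << 20)) {
            rl.rlim_cur = (rlim_t) 1 << 20; /* 1 MiB soft limit */
            setrlimit(RLIMIT_STACK, &rl);
        }

        _exit(overflow_stack(0) != 0);      /* not expected to return */
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Without the SA_ONSTACK flag and the sigaltstack() call, the kernel would have no room to build the handler's stack frame, and the child would simply be killed by SIGSEGV.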
Since Linux 2.6.23, the RLIMIT_STACK limit also determines the amount of space available for holding the process’s command-line arguments and environment variables. See the execve(2) manual page for details.