The standard memory allocation policy is to over-commit, which means that the kernel allows applications to allocate more memory than there is physical memory. Most of the time this works fine, because it is common for applications to request more memory than they really need. It also helps in the implementation of fork(2): it is safe to make a copy of a large program because the pages of memory are shared with the copy-on-write flag set. In the majority of cases, fork is followed by an exec function call, which unshares the memory and then loads a new program.
However, there is always the possibility that a particular workload will cause a group of processes to try to cash in on the allocations they have been promised at the same time, and so demand more than there really is. This is an out-of-memory (OOM) situation. At this point, there is no alternative but to kill off processes until the problem goes away, which is the job of the out-of-memory killer.
Before we get to that, there is a tuning parameter for kernel allocations in /proc/sys/vm/overcommit_memory, which you can set to:

0: heuristic over-commit (this is the default)
1: always over-commit, never check
2: always check, never over-commit

Option 1 is only really useful if you run programs that work with large, sparse arrays and so allocate large areas of memory but write to only a small proportion of them. Such programs are rare in the context of embedded systems.
Option 2, never over-commit, seems to be a good choice if you are worried about running out of memory, perhaps in a mission-critical or safety-critical application. It will fail allocations that would take the total committed memory above the commit limit, which is the size of the swap space plus a proportion of the total memory set by the over-commit ratio. The over-commit ratio is controlled by /proc/sys/vm/overcommit_ratio and has a default value of 50%.
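If you decide that never over-committing is right for your device, you could switch to option 2 at boot time. Here is a minimal sketch (where exactly you put it depends on your init scripts):

# cat /proc/sys/vm/overcommit_memory
0
# echo 2 > /proc/sys/vm/overcommit_memory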
As an example, suppose you have a device with 512 MB of system RAM and you set a really conservative ratio of 25%:
# echo 25 > /proc/sys/vm/overcommit_ratio
# grep -e MemTotal -e CommitLimit /proc/meminfo
MemTotal:         509016 kB
CommitLimit:      127252 kB

There is no swap, so the commit limit is 25% of MemTotal, as expected.
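You can reproduce that number with a little shell arithmetic. The kernel calculates the limit in whole pages, which is probably why the result falls a couple of kilobytes short of exactly 25% of MemTotal; assuming 4 KiB pages on this device:

# echo $(( 509016 / 4 * 25 / 100 * 4 ))
127252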
There is another important variable in /proc/meminfo: Committed_AS. This is the total amount of memory that is needed to fulfill all the allocations made so far. I found the following on one system:

# grep -e MemTotal -e Committed_AS /proc/meminfo
MemTotal:         509016 kB
Committed_AS:     741364 kB
In other words, the kernel has already promised more memory than is available. Consequently, setting overcommit_memory to 2 would mean that all allocations fail, regardless of the overcommit_ratio. To get to a working system, I would have to either install double the amount of RAM or severely reduce the number of running processes, of which there are about 40.
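As a rough way of keeping an eye on how close a device is to its commit limit, you could compare the two values periodically. Here is one possible one-liner (the percentage format is my own invention):

# awk '/CommitLimit/ {l=$2} /Committed_AS/ {c=$2} END {printf "%.0f%%\n", c*100/l}' /proc/meminfo
583%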
In all cases, the final defense is the OOM killer. It uses a heuristic method to calculate a badness score between 0 and 1,000 for each process and then terminates those with the highest score until there is enough free memory. You should see something like this in the kernel log:
[44510.490320] eatmem invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0 ...
You can force an OOM event using echo f > /proc/sysrq-trigger.
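You can also read the badness score the kernel currently assigns to a process from /proc/<PID>/oom_score. For example, for the current shell (the value you see will almost certainly differ):

# cat /proc/$$/oom_score
0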
You can influence the badness score for a process by writing an adjustment value to /proc/<PID>/oom_score_adj. A value of -1000 means that the badness score can never be greater than zero, and so the process will never be killed; a value of +1000 means that the score will always be greater than 1,000, and so the process will always be killed.
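For example, a start-up script might exempt a critical daemon from the OOM killer and mark a less important process as the preferred victim. The process names here are made up for the sake of illustration:

# echo -1000 > /proc/$(pidof my-critical-daemon)/oom_score_adj   # never kill this one
# echo 1000 > /proc/$(pidof my-cache-filler)/oom_score_adj       # kill this one first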