Semaphore Undo Values

Suppose that, having adjusted the value of a semaphore (e.g., decreased the semaphore value so that it is now 0), a process then terminates, either deliberately or accidentally. By default, the semaphore’s value is left unchanged. This may constitute a problem for other processes using the semaphore, since they may be blocked waiting on that semaphore—that is, waiting for the now-terminated process to undo the change it made.

To avoid such problems, we can employ the SEM_UNDO flag when changing the value of a semaphore via semop(). When this flag is specified, the kernel records the effect of the semaphore operation, and then undoes the operation if the process terminates. The undo happens regardless of whether the process terminates normally or abnormally.
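As a minimal sketch of how the flag is supplied, the following helper wraps a single semop() call and sets SEM_UNDO in the sem_flg field. The function name adjust_with_undo() is our own invention for illustration and is not part of any library.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical helper (not a library function): add 'delta' to semaphore
   'semnum' of the set 'semid', asking the kernel to record the change so
   that it is undone if the calling process terminates.
   Returns 0 on success, or -1 with errno set by semop(). */
int adjust_with_undo(int semid, unsigned short semnum, short delta)
{
    struct sembuf sop = {
        .sem_num = semnum,
        .sem_op  = delta,
        .sem_flg = SEM_UNDO     /* Kernel tracks this change for undo */
    };

    return semop(semid, &sop, 1);
}
```

A caller that reserves a resource might use adjust_with_undo(semid, 0, -1) and later release it with adjust_with_undo(semid, 0, 1). If the process dies between the two calls, the kernel reverses the reservation.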

The kernel doesn’t need to keep a record of all operations performed using SEM_UNDO. It suffices to record the sum of all of the semaphore adjustments performed using SEM_UNDO in a per-semaphore, per-process integer total called the semadj (semaphore adjustment) value. When the process terminates, all that is necessary is to subtract this total from the semaphore’s current value.
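The effect of the semadj total can be observed with a small experiment, sketched below under the assumption of a Linux/glibc system (where the caller must define union semun). A child applies +3 and then -1 to a semaphore, both with SEM_UNDO, so that its semadj total records a net adjustment of 2. When the child exits, the kernel removes that net adjustment and the semaphore returns to its initial value. The function name semadj_demo() is hypothetical.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: Linux/glibc, where the caller must define this union */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical demo: create a private semaphore with value 'initial', let a
   child adjust it by +3 and -1 using SEM_UNDO, and return the value seen by
   the parent after the child has terminated (or -1 on error). */
int semadj_demo(int initial)
{
    union semun arg;
    int status, result = -1;
    pid_t pid;
    int semid = semget(IPC_PRIVATE, 1, 0600);

    if (semid == -1)
        return -1;

    arg.val = initial;
    if (semctl(semid, 0, SETVAL, arg) == 0 && (pid = fork()) != -1) {
        if (pid == 0) {                 /* Child: net adjustment is +2 */
            struct sembuf up   = { .sem_num = 0, .sem_op = 3,  .sem_flg = SEM_UNDO };
            struct sembuf down = { .sem_num = 0, .sem_op = -1, .sem_flg = SEM_UNDO };

            if (semop(semid, &up, 1) == -1 || semop(semid, &down, 1) == -1)
                _exit(1);
            _exit(0);                   /* Kernel applies the undo here */
        }

        /* Parent: the undo is complete by the time waitpid() returns */
        if (waitpid(pid, &status, 0) == pid &&
                WIFEXITED(status) && WEXITSTATUS(status) == 0)
            result = semctl(semid, 0, GETVAL);
    }

    semctl(semid, 0, IPC_RMID);         /* Remove the set in all cases */
    return result;
}
```

Calling semadj_demo(5) should return 5: the value passes through 8 and 7 while the child runs, and the net adjustment of 2 is reversed on exit.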

Note

Since Linux 2.6, processes (threads) created using clone() share semadj values if the CLONE_SYSVSEM flag is employed. Such sharing is required for a conforming implementation of POSIX threads. The NPTL threading implementation employs CLONE_SYSVSEM for the implementation of pthread_create().

When a semaphore value is set using the semctl() SETVAL or SETALL operation, the corresponding semadj values are cleared (i.e., set to 0) in all processes using the semaphore. This makes sense, since absolutely setting the value of a semaphore destroys the value associated with the historical record maintained in the semadj total.
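The following sketch illustrates the clearing of semadj values, again assuming Linux/glibc and using a hypothetical function name. A child decrements a semaphore with SEM_UNDO and then sets its value absolutely with SETVAL. Because SETVAL clears the semadj total, nothing is undone when the child exits.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: Linux/glibc, where the caller must define this union */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical demo: a child performs -1 with SEM_UNDO on a semaphore whose
   value is 5, then sets the value to 10 with SETVAL and exits. Returns the
   value seen by the parent afterward (or -1 on error). */
int setval_clears_semadj_demo(void)
{
    union semun arg;
    int status, result = -1;
    pid_t pid;
    int semid = semget(IPC_PRIVATE, 1, 0600);

    if (semid == -1)
        return -1;

    arg.val = 5;
    if (semctl(semid, 0, SETVAL, arg) == 0 && (pid = fork()) != -1) {
        if (pid == 0) {
            struct sembuf down = { .sem_num = 0, .sem_op = -1, .sem_flg = SEM_UNDO };
            union semun set;

            set.val = 10;
            if (semop(semid, &down, 1) == -1)           /* Value is now 4 */
                _exit(1);
            if (semctl(semid, 0, SETVAL, set) == -1)    /* Clears semadj */
                _exit(1);
            _exit(0);
        }

        if (waitpid(pid, &status, 0) == pid &&
                WIFEXITED(status) && WEXITSTATUS(status) == 0)
            result = semctl(semid, 0, GETVAL);
    }

    semctl(semid, 0, IPC_RMID);
    return result;
}
```

The function should return 10. If the semadj total had survived the SETVAL operation, the undo at exit would have raised the value to 11.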

A child created via fork() doesn’t inherit its parent’s semadj values; it doesn’t make sense for a child to undo its parent’s semaphore operations. On the other hand, semadj values are preserved across an exec(). This permits us to adjust a semaphore value using SEM_UNDO, and then exec() a program that performs no operation on the semaphore, but does automatically adjust the semaphore on process termination. (This can be used as a technique that allows another process to discover when this process terminates.)
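The fork() half of this rule can be checked with a sketch along the following lines (Linux/glibc assumed; the function name is hypothetical). The parent adds 1 to a semaphore with SEM_UNDO and then creates a child that exits immediately. Since the child has no semadj value of its own, its termination leaves the semaphore untouched.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: Linux/glibc, where the caller must define this union */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical demo: the calling process performs +1 with SEM_UNDO on a
   semaphore whose value is 0, then forks a child that exits at once.
   Returns the semaphore value after the child's exit (or -1 on error). */
int fork_semadj_demo(void)
{
    union semun arg;
    struct sembuf up = { .sem_num = 0, .sem_op = 1, .sem_flg = SEM_UNDO };
    int status, result = -1;
    pid_t pid;
    int semid = semget(IPC_PRIVATE, 1, 0600);

    if (semid == -1)
        return -1;

    arg.val = 0;
    if (semctl(semid, 0, SETVAL, arg) == 0 &&
            semop(semid, &up, 1) == 0 &&        /* Parent's semadj is set */
            (pid = fork()) != -1) {
        if (pid == 0)
            _exit(0);           /* Child has no semadj, so nothing is undone */

        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            result = semctl(semid, 0, GETVAL);
    }

    semctl(semid, 0, IPC_RMID); /* Also discards the parent's undo record */
    return result;
}
```

The function should return 1, showing that the parent's adjustment is still in place after the child terminates.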

We conclude by noting that the SEM_UNDO flag is less useful than it first appears, for two reasons. One is that because modifying a semaphore typically corresponds to acquiring or releasing some shared resource, the use of SEM_UNDO on its own may be insufficient to allow a multiprocess application to recover in the event that a process unexpectedly terminates. Unless process termination also automatically returns the shared resource state to a consistent state (unlikely in many scenarios), undoing a semaphore operation is probably insufficient to allow the application to recover.

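The second factor limiting the utility of SEM_UNDO is that, in some cases, it is not possible to perform semaphore adjustments when a process terminates. Consider the following scenario, applied to a semaphore whose initial value is 0:

1. Process A increases the value of the semaphore by 2, specifying the SEM_UNDO flag for the operation.
2. Process B decreases the value of the semaphore by 1, so that it has the value 1.
3. Process A terminates.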

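At this point, it is impossible to completely undo the effect of process A’s operation in step 1, since the value of the semaphore is too low. There are three possible ways to resolve this situation:

1. Force the process to block until the semaphore adjustment is possible.
2. Decrease the semaphore value as far as possible (i.e., to 0) and exit.
3. Exit without performing any semaphore adjustment.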

The first solution is infeasible since it might force a terminating process to block forever. Linux adopts the second solution. Some other UNIX implementations adopt the third solution. SUSv3 is silent on what an implementation should do in this situation.
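The Linux behavior can be observed with the following sketch, which assumes Linux/glibc and, purely for illustration, that process A adds 2 to the semaphore and process B subtracts 1. A pipe keeps process A alive until process B has performed its operation. On Linux, the semop(2) manual page documents that the value is decreased as far as possible (to 0), so the function should return 0 there; other implementations may give a different result. The function name is hypothetical.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: Linux/glibc, where the caller must define this union */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical demo: child (process A) adds 2 with SEM_UNDO, parent
   (process B) subtracts 1, then A terminates. Returns the final semaphore
   value (or -1 on error). */
int undo_below_zero_demo(void)
{
    union semun arg;
    int pfd[2], result = -1;
    pid_t pid;
    int semid = semget(IPC_PRIVATE, 1, 0600);

    if (semid == -1)
        return -1;

    arg.val = 0;
    if (semctl(semid, 0, SETVAL, arg) == -1 || pipe(pfd) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    pid = fork();
    if (pid == 0) {                             /* Process A */
        struct sembuf up = { .sem_num = 0, .sem_op = 2, .sem_flg = SEM_UNDO };
        char c;

        close(pfd[1]);                          /* Keep only the read end */
        if (semop(semid, &up, 1) == -1)
            _exit(1);
        if (read(pfd[0], &c, 1) == -1)          /* Sees EOF once B is done */
            _exit(1);
        _exit(0);                               /* Undo of +2 attempted here */
    }

    close(pfd[0]);
    if (pid != -1) {                            /* Process B */
        struct sembuf down = { .sem_num = 0, .sem_op = -1, .sem_flg = 0 };
        int ok = semop(semid, &down, 1) == 0;   /* Blocks until A's +2 */

        close(pfd[1]);                          /* Let A terminate */
        if (waitpid(pid, NULL, 0) == pid && ok)
            result = semctl(semid, 0, GETVAL);
    } else {
        close(pfd[1]);
    }

    semctl(semid, 0, IPC_RMID);
    return result;
}
```

Process B's decrement blocks until the semaphore value is at least 1, which guarantees that process A's increment has already happened without any further synchronization.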

Note

An undo operation that attempts to raise a semaphore’s value above its permitted maximum value of 32,767 (the SEMVMX limit, described in Semaphore Limits) also causes anomalous behavior. In this case, the kernel always performs the adjustment, thus (illegitimately) raising the semaphore’s value above SEMVMX.