Suppose that, having adjusted the value of a semaphore (e.g., decreased the semaphore value so that it is now 0), a process then terminates, either deliberately or accidentally. By default, the semaphore’s value is left unchanged. This may constitute a problem for other processes using the semaphore, since they may be blocked waiting on that semaphore—that is, waiting for the now-terminated process to undo the change it made.
To avoid such problems, we can employ the SEM_UNDO flag when changing the value of a semaphore via semop(). When this flag is specified, the kernel records the effect of the semaphore operation, and then undoes the operation if the process terminates. The undo happens regardless of whether the process terminates normally or abnormally.
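The following is a minimal sketch of this behavior, not one of the example programs used elsewhere in this chapter. The helper name value_after_child_exit() is hypothetical. It creates a private one-semaphore set, lets a child add 1 to the semaphore using the flags supplied by the caller, and reports the semaphore's value once the child has terminated.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux, the caller must define this union for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical helper: returns the semaphore value seen after a child
   that added 1 to it has terminated, or -1 on error. */
int value_after_child_exit(short flags)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semun arg;
    arg.val = 0;                        /* start from a known value */
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    if (pid == 0) {                     /* child: adjust, then terminate */
        struct sembuf op = { .sem_num = 0, .sem_op = 1, .sem_flg = flags };
        _exit(semop(semid, &op, 1) == -1 ? 1 : 0);
    }

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        result = semctl(semid, 0, GETVAL);

    semctl(semid, 0, IPC_RMID);         /* remove the temporary set */
    return result;
}
```

Calling value_after_child_exit(SEM_UNDO) should yield 0, because the kernel undoes the child's operation, while value_after_child_exit(0) should yield 1, because the adjustment is left in place.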
The kernel doesn’t need to keep a record of all operations performed using SEM_UNDO. It suffices to record the sum of all of the semaphore adjustments performed using SEM_UNDO in a per-semaphore, per-process integer total called the semadj (semaphore adjustment) value. When the process terminates, all that is necessary is to subtract this total from the semaphore’s current value.
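The bookkeeping just described can be illustrated with a toy model. This is not kernel code: the structure sem_model and the functions model_semop() and model_exit() are hypothetical names, the model covers a single semaphore and a single process, and it follows the sign convention used in the text above (semadj is the sum of the adjustments, and it is subtracted on termination).

```c
#include <assert.h>

/* Toy model, not kernel code: one semaphore plus one process's
   semadj total for that semaphore. */
struct sem_model {
    int semval;     /* current semaphore value */
    int semadj;     /* sum of this process's SEM_UNDO adjustments */
};

/* Apply an adjustment that is assumed not to block. */
void model_semop(struct sem_model *m, int sem_op, int undo)
{
    assert(m->semval + sem_op >= 0);
    m->semval += sem_op;
    if (undo)
        m->semadj += sem_op;    /* only SEM_UNDO operations are recorded */
}

/* On termination, subtract the accumulated total in a single step. */
void model_exit(struct sem_model *m)
{
    m->semval -= m->semadj;
    m->semadj = 0;
}
```

For example, starting from a value of 3, the operations +2 (with undo), -1 (with undo), and +4 (without undo) leave the semaphore at 8 with a semadj of 1; model_exit() then brings the value to 7, removing only the net effect of the two undoable operations.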
Since Linux 2.6, processes (threads) created using clone() share semadj values if the CLONE_SYSVSEM flag is employed. Such sharing is required for a conforming implementation of POSIX threads. The NPTL threading implementation employs CLONE_SYSVSEM for the implementation of pthread_create().
When a semaphore value is set using the semctl() SETVAL or SETALL operation, the corresponding semadj values are cleared (i.e., set to 0) in all processes using the semaphore. This makes sense, since absolutely setting the value of a semaphore destroys the value associated with the historical record maintained in the semadj total.
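A small experiment can make the clearing of semadj visible. In the sketch below, the helper name value_after_setval_then_exit() is hypothetical: a child adds 1 to a semaphore with SEM_UNDO, then sets the semaphore to 5 with SETVAL, and terminates. If the child's semadj were still 1 at exit, we would expect to see 4 afterward; since SETVAL clears it, we expect 5.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux, the caller must define this union for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical helper: returns the semaphore value after the child
   described above has terminated, or -1 on error. */
int value_after_setval_then_exit(void)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semun arg;
    arg.val = 0;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    if (pid == 0) {
        /* Child: semadj becomes 1 after this operation */
        struct sembuf op = { .sem_num = 0, .sem_op = 1, .sem_flg = SEM_UNDO };
        if (semop(semid, &op, 1) == -1)
            _exit(1);

        /* Setting the value absolutely resets semadj to 0 */
        union semun set;
        set.val = 5;
        if (semctl(semid, 0, SETVAL, set) == -1)
            _exit(1);
        _exit(0);
    }

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        result = semctl(semid, 0, GETVAL);

    semctl(semid, 0, IPC_RMID);
    return result;
}
```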
A child created via fork() doesn’t inherit its parent’s semadj values; it doesn’t make sense for a child to undo its parent’s semaphore operations. On the other hand, semadj values are preserved across an exec(). This permits us to adjust a semaphore value using SEM_UNDO, and then exec() a program that performs no operation on the semaphore, but does automatically adjust the semaphore on process termination. (This can be used as a technique that allows another process to discover when this process terminates.)
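The fork() half of this rule can be checked with a short sketch. The helper name value_after_forked_child_exit() is hypothetical. The calling process itself adds 1 to a semaphore with SEM_UNDO and then forks a child that terminates immediately; because the child has no semadj value of its own, its termination should leave the semaphore at 1. The sketch does not exercise the exec() case.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux, the caller must define this union for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical helper: returns the semaphore value after a forked
   child has terminated, or -1 on error. */
int value_after_forked_child_exit(void)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semun arg;
    arg.val = 0;
    struct sembuf op = { .sem_num = 0, .sem_op = 1, .sem_flg = SEM_UNDO };

    /* The parent (this process) performs the undoable operation */
    if (semctl(semid, 0, SETVAL, arg) == -1 ||
        semop(semid, &op, 1) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    if (pid == 0)
        _exit(0);       /* child: no semadj inherited, so nothing to undo */

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid)
        result = semctl(semid, 0, GETVAL);

    semctl(semid, 0, IPC_RMID);     /* remove the temporary set */
    return result;
}
```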
The following shell session log shows the effect of performing operations on two semaphores: one operation with the SEM_UNDO flag and one without. We begin by creating a set containing two semaphores:
$ ./svsem_create -p 2
131073
Next, we execute a command that adds 1 to both semaphores and then terminates. The operation on semaphore 0 specifies the SEM_UNDO flag:
$ ./svsem_op 131073 0+1u 1+1
2248, 06:41:56: about to semop()
2248, 06:41:56: semop() completed
Now, we use the program in Example 47-3 to check the state of the semaphores:
$ ./svsem_mon 131073
Semaphore changed: Sun Jul 25 06:41:34 2010
Last semop(): Sun Jul 25 06:41:56 2010
Sem #  Value  SEMPID  SEMNCNT  SEMZCNT
  0       0     2248      0        0
  1       1     2248      0        0
Looking at the semaphore values in the last two lines of the above output, we can see that the operation on semaphore 0 was undone, but the operation on semaphore 1 was not undone.
We conclude by noting that the SEM_UNDO flag is less useful than it first appears, for two reasons. One is that because modifying a semaphore typically corresponds to acquiring or releasing some shared resource, the use of SEM_UNDO on its own may be insufficient to allow a multiprocess application to recover in the event that a process unexpectedly terminates. Unless process termination also automatically returns the shared resource state to a consistent state (unlikely in many scenarios), undoing a semaphore operation is probably insufficient to allow the application to recover.
The second factor limiting the utility of SEM_UNDO is that, in some cases, it is not possible to perform semaphore adjustments when a process terminates. Consider the following scenario, applied to a semaphore whose initial value is 0:
1. Process A increases the value of a semaphore by 2, specifying the SEM_UNDO flag for the operation.
2. Process B decreases the value of the semaphore by 1, so that it has the value 1.
3. Process A terminates.
At this point, it is impossible to completely undo the effect of process A’s operation in step 1, since the value of the semaphore is too low. There are three possible ways to resolve this situation:
- Force the process to block until the semaphore adjustment is possible.
- Decrease the semaphore value as far as possible (i.e., to 0) and exit.
- Exit without performing any semaphore adjustment.
The first solution is infeasible since it might force a terminating process to block forever. Linux adopts the second solution. Some other UNIX implementations adopt the third solution. SUSv3 is silent on what an implementation should do in this situation.
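The scenario above can be reproduced with the following sketch, in which a child plays the role of process A and the caller plays process B. The helper name value_after_partial_undo() is hypothetical, and the use of SIGSTOP and SIGCONT to order the steps is just one convenient choice. On Linux, we would expect a result of 0; an implementation that skips the adjustment would presumably leave the value at 1.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux, the caller must define this union for semctl(). */
union semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical helper: returns the semaphore value after process A
   has terminated, or -1 on error. */
int value_after_partial_undo(void)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union semun arg;
    arg.val = 0;                            /* initial value is 0 */
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    if (pid == 0) {                         /* process A */
        struct sembuf up = { .sem_num = 0, .sem_op = 2, .sem_flg = SEM_UNDO };
        if (semop(semid, &up, 1) == -1)     /* step 1: value becomes 2 */
            _exit(1);
        raise(SIGSTOP);                     /* pause until B has run step 2 */
        _exit(0);                           /* step 3: terminate */
    }

    int status;
    int result = -1;

    /* Process B: wait until A has stopped, which means step 1 is done */
    if (waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status)) {
        struct sembuf down = { .sem_num = 0, .sem_op = -1,
                               .sem_flg = IPC_NOWAIT };
        int ok = semop(semid, &down, 1) == 0;   /* step 2: value becomes 1 */

        kill(pid, SIGCONT);                 /* let A proceed to step 3 */
        if (waitpid(pid, &status, 0) == pid && ok)
            result = semctl(semid, 0, GETVAL);
    }

    semctl(semid, 0, IPC_RMID);
    return result;
}
```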
An undo operation that attempts to raise a semaphore’s value above its permitted maximum value of 32,767 (the SEMVMX limit, described in Semaphore Limits) also causes anomalous behavior. In this case, the kernel always performs the adjustment, thus (illegitimately) raising the semaphore’s value above SEMVMX.