Previous chapters have covered various techniques that processes can use to synchronize their actions, including signals (Chapters 20 to 22) and semaphores (Chapters 47 and 53). In this chapter, we look at further synchronization techniques designed specifically for use with files.
A frequent application requirement is to read data from a file, make some change to that data, and then write it back to the file. As long as only one process at a time uses a file in this way, there are no problems. However, problems can arise if multiple processes update a file simultaneously. Suppose, for example, that each process performs the following steps to update a file containing a sequence number:
Read the sequence number from the file.
Use the sequence number for some application-defined purpose.
Increment the sequence number and write it back to the file.
The problem here is that, in the absence of any synchronization technique, two processes could perform the above steps at the same time with (for example) the consequences shown in Figure 55-1 (here, we assume that the initial value of the sequence number is 1000).
The outcome is clearly wrong: at the end of these steps, the file contains the value 1001, when it should contain the value 1002, because both processes read the same initial value and the second write simply overwrote the first. (This is an example of a race condition.) To prevent such possibilities, we need some form of interprocess synchronization.
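To make the race window concrete, the three steps can be sketched in C roughly as follows. This is only an illustrative sketch: the function name next_seq_unlocked() and the file format (a non-negative decimal number stored as text at the start of the file) are assumptions made for the example, not something defined elsewhere in this chapter.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (name and file format are assumptions): read the
   sequence number stored as decimal text in 'path', write back the
   incremented value, and return the value that was read. Returns -1 on
   error. No locking is performed. */
long next_seq_unlocked(const char *path)
{
    char buf[32];
    long result = -1;

    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;

    /* Step 1: read the sequence number from the file */
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        long seq = strtol(buf, NULL, 10);

        /* Step 2: use 'seq' for some application-defined purpose.
           Nothing stops another process from performing step 1 at this
           point and obtaining the same value. */

        /* Step 3: increment the number and write it back */
        int len = snprintf(buf, sizeof(buf), "%ld\n", seq + 1);
        if (lseek(fd, 0, SEEK_SET) == 0 &&
                write(fd, buf, (size_t) len) == len)
            result = seq;
    }

    close(fd);
    return result;
}
```

When a single process calls this function repeatedly, each call returns a distinct number. When two processes call it at about the same time, both may return the same number and the file is advanced only once.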
Although we could use (say) semaphores to perform the required synchronization, using file locks is usually preferable, because the kernel automatically associates locks with files.
[Stevens & Rago, 2005] dates the first UNIX file locking implementation to 1980, and notes that fcntl() locking, upon which we primarily focus in this chapter, appeared in System V Release 2 in 1984.
In this chapter, we describe two different APIs for placing file locks:
flock(), which places locks on entire files; and
fcntl(), which places locks on regions of a file.
The flock() system call originated on BSD; fcntl() originated on System V.
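The difference in granularity between the two APIs shows up directly in the calls. The following is a minimal sketch for Linux; the wrapper names lock_whole_file() and lock_region() are hypothetical and serve only to place the two calls side by side.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical wrapper: place an exclusive flock() lock, which always
   covers the entire file. Blocks until the lock is granted.
   Returns 0 on success, -1 on error. */
int lock_whole_file(int fd)
{
    return flock(fd, LOCK_EX);
}

/* Hypothetical wrapper: apply an fcntl() lock of the given type
   (F_RDLCK, F_WRLCK, or F_UNLCK) to 'len' bytes starting at byte
   offset 'start'. A 'len' of 0 means "through to end of file".
   Blocks until the lock is granted. Returns 0 on success, -1 on error. */
int lock_region(int fd, short type, off_t start, off_t len)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;     /* 'start' is relative to start of file */
    fl.l_start = start;
    fl.l_len = len;

    return fcntl(fd, F_SETLKW, &fl);
}
```

For example, lock_region(fd, F_WRLCK, 0, 10) requests a write lock on the first 10 bytes of the file, and lock_region(fd, F_UNLCK, 0, 10) releases it. A whole-file flock() lock is released with flock(fd, LOCK_UN).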
The general method of using flock() and fcntl() is as follows:
Place a lock on the file.
Perform file I/O.
Unlock the file so that another process can lock it.
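Applied to the sequence-number update described at the start of this chapter, these three steps might look like the following sketch, which uses flock(). The function name next_seq_locked() and the file format (a non-negative decimal number stored as text) are assumptions made for illustration, and every process that updates the file is assumed to go through the same function.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper (name and file format are assumptions): return the
   current sequence number stored in 'path' and advance the stored value
   by one, holding an exclusive lock for the whole read-modify-write.
   Returns -1 on error. */
long next_seq_locked(const char *path)
{
    char buf[32];
    long result = -1;

    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;

    /* 1. Place a lock on the file. The call blocks while another
          process holds a conflicting lock. */
    if (flock(fd, LOCK_EX) == -1) {
        close(fd);
        return -1;
    }

    /* 2. Perform file I/O: read, increment, write back */
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    if (n > 0) {
        buf[n] = '\0';
        long seq = strtol(buf, NULL, 10);
        int len = snprintf(buf, sizeof(buf), "%ld\n", seq + 1);
        if (lseek(fd, 0, SEEK_SET) == 0 &&
                write(fd, buf, (size_t) len) == len)
            result = seq;
    }

    /* 3. Unlock the file so that another process can lock it */
    flock(fd, LOCK_UN);
    close(fd);
    return result;
}
```

Because the lock is held from before the read until after the write, a second process calling the same function waits in flock() until the first has finished its update, so the two can no longer obtain the same number.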
Although file locking is normally used in conjunction with file I/O, we can also use it as a more general synchronization technique. Cooperating processes can follow a convention that locking all or part of a file indicates access by a process to some shared resource other than the file itself (e.g., a shared memory region).
Because of the user-space buffering performed by the stdio library, we should be cautious when using stdio functions with the locking techniques described in this chapter. The problem is that an input buffer might be filled before a lock is placed, or an output buffer may be flushed after a lock is removed. There are a few ways to avoid these problems:
Perform file I/O using read() and write() (and related system calls) instead of the stdio library.
Flush the stdio stream immediately after placing a lock on the file, and flush it once more immediately before releasing the lock.
Perhaps at the cost of some efficiency, disable stdio buffering altogether using setbuf() or setvbuf().
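The second of these approaches can be sketched as follows for an output stream. The helper name append_line_locked() is hypothetical, and the sketch assumes that the stream was opened for appending and that the lock is placed with flock() on the stream's underlying file descriptor.

```c
#include <stdio.h>
#include <sys/file.h>

/* Hypothetical helper: append one line of text to 'fp' while holding an
   exclusive flock() lock on the underlying file.
   Returns 0 on success, -1 on error. */
int append_line_locked(FILE *fp, const char *line)
{
    int fd = fileno(fp);
    int status = 0;

    if (flock(fd, LOCK_EX) == -1)
        return -1;

    /* Flush immediately after placing the lock, so that nothing
       buffered before the lock was acquired lingers in the stream */
    if (fflush(fp) == EOF)
        status = -1;

    if (status == 0 && fprintf(fp, "%s\n", line) < 0)
        status = -1;

    /* Flush again immediately before releasing the lock, so that the
       new data reaches the kernel while the lock is still held */
    if (fflush(fp) == EOF)
        status = -1;

    if (flock(fd, LOCK_UN) == -1)
        status = -1;

    return status;
}
```

The third approach avoids the explicit flushes: calling setvbuf(fp, NULL, _IONBF, 0) right after opening the stream makes every stdio output call result in an immediate write to the file.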
In the remainder of this chapter, we’ll distinguish locks as being either advisory or mandatory. By default, file locks are advisory. This means that a process can simply ignore a lock placed by another process. In order for an advisory locking scheme to be workable, each process accessing the file must cooperate, by placing a lock before performing file I/O. By contrast, a mandatory locking system forces a process performing I/O to abide by the locks held by other processes. We say more about this distinction in Section 55.4.
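The advisory nature of the default locks can be observed with a small experiment, sketched below. The function name advisory_lock_is_ignorable() is hypothetical. The sketch assumes Linux and a local file system, and relies on the fact that flock() locks obtained through two separate open() calls conflict with each other, even within a single process.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical demonstration: hold an exclusive lock via one open of
   'path', then, via a second open, (a) request the lock and (b) write
   to the file without holding the lock. Returns 1 if the lock request
   was refused but the write nevertheless succeeded, 0 otherwise, and
   -1 if the setup failed. Note: overwrites the first byte of the file. */
int advisory_lock_is_ignorable(const char *path)
{
    int result = -1;
    int holder = open(path, O_RDWR | O_CREAT, 0600);
    int other = open(path, O_RDWR);

    if (holder != -1 && other != -1 && flock(holder, LOCK_EX) == 0) {
        /* A cooperating process asks for the lock first and is refused */
        int refused = flock(other, LOCK_EX | LOCK_NB) == -1 &&
                      errno == EWOULDBLOCK;

        /* A noncooperating process skips the lock and just writes;
           an advisory lock does nothing to prevent this */
        int wrote = write(other, "x", 1) == 1;

        result = refused && wrote;
    }

    if (other != -1)
        close(other);
    if (holder != -1)
        close(holder);          /* closing also releases the lock */
    return result;
}
```

Under advisory locking, we would expect this function to return 1: the lock request through the second descriptor fails with EWOULDBLOCK, yet the write through that same descriptor goes ahead unhindered.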