Interactions Between fork(), stdio Buffers, and _exit()

The output yielded by the program in Example 25-2 demonstrates a phenomenon that is at first puzzling. When we run this program with standard output directed to the terminal, we see the expected result:

$ ./fork_stdio_buf
Hello world
Ciao

However, when we redirect standard output to a file, we see the following:

$ ./fork_stdio_buf > a
$ cat a
Ciao
Hello world
Hello world

In the above output, we see two strange things: the line written by printf() appears twice, and the output of write() precedes that of printf().

Example 25-2. Interaction of fork() and stdio buffering

procexec/fork_stdio_buf.c
#include "tlpi_hdr.h"

int
main(int argc, char *argv[])
{
    printf("Hello world\n");              /* Buffered by stdio in user space */
    write(STDOUT_FILENO, "Ciao\n", 5);    /* Passed directly to the kernel */

    if (fork() == -1)
        errExit("fork");

    /* Both child and parent continue execution here */

    exit(EXIT_SUCCESS);
}
     procexec/fork_stdio_buf.c

To understand why the message written with printf() appears twice, recall that the stdio buffers are maintained in a process’s user-space memory (refer to Buffering in the stdio Library). Therefore, these buffers are duplicated in the child by fork(). When standard output is directed to a terminal, it is line-buffered by default, with the result that the newline-terminated string written by printf() appears immediately. However, when standard output is directed to a file, it is block-buffered by default. Thus, in our example, the string written by printf() is still in the parent’s stdio buffer at the time of the fork(), and this string is duplicated in the child. When the parent and the child later call exit(), they both flush their copies of the stdio buffers, resulting in duplicate output.

We can prevent this duplicated output from occurring in one of the following ways:

- As a specific solution to the stdio buffering issue, we can use fflush() to flush the stdio buffer prior to a fork() call. Alternatively, we could use setvbuf() or setbuf() to disable buffering on the stdio stream.
- Instead of calling exit(), the child can call _exit(), so that it doesn’t flush stdio buffers. This technique exemplifies a more general principle: in an application that creates child processes that don’t exec new programs, typically only one of the processes (most often the parent) should terminate via exit(), while the other processes should terminate via _exit(). This ensures that only one process calls exit handlers and flushes stdio buffers, which is usually desirable.
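
The two remedies can be compared with a minimal sketch such as the following. It is an illustration only, not one of the example programs: it writes to an ordinary file instead of standard output so that the result can be checked by counting lines. The names fork_with_buffered_output(), count_lines(), and the enum fix values are hypothetical. The explicit setvbuf() call is an assumption added to force block buffering regardless of the implementation's default, and the parent calls fclose(), which flushes the stream much as exit() would.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical labels for the approaches being compared */
enum fix { NO_FIX, FLUSH_BEFORE_FORK, CHILD_SKIPS_FLUSH };

/* Hypothetical helper: return the number of newlines in 'path', or -1 on error */
static int
count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");
    int c, n = 0;

    if (fp == NULL)
        return -1;
    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            n++;
    fclose(fp);
    return n;
}

/* Write one line via stdio, fork, let both processes finish, and
   return the number of lines that reached the file (-1 on error) */
static int
fork_with_buffered_output(const char *path, enum fix fix)
{
    FILE *fp = fopen(path, "w");
    pid_t pid;

    if (fp == NULL)
        return -1;
    if (setvbuf(fp, NULL, _IOFBF, BUFSIZ) != 0) {   /* Force block buffering */
        fclose(fp);
        return -1;
    }

    fprintf(fp, "Hello world\n");       /* Remains in the user-space buffer */

    if (fix == FLUSH_BEFORE_FORK)
        fflush(fp);                     /* Buffer is empty when duplicated */

    pid = fork();
    if (pid == -1) {
        fclose(fp);
        return -1;
    }

    if (pid == 0) {                     /* Child */
        if (fix == CHILD_SKIPS_FLUSH)
            _exit(EXIT_SUCCESS);        /* Child's buffer copy is discarded */
        exit(EXIT_SUCCESS);             /* Child's buffer copy is flushed */
    }

    if (waitpid(pid, NULL, 0) == -1) {  /* Parent waits for the child */
        fclose(fp);
        return -1;
    }
    if (fclose(fp) == EOF)              /* Flushes the parent's buffer */
        return -1;

    return count_lines(path);
}
```

Under these assumptions, we would expect a call with NO_FIX to leave two copies of the line in the file, and a call with either FLUSH_BEFORE_FORK or CHILD_SKIPS_FLUSH to leave one.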

Note

Other approaches that allow both the parent and child to call exit() are possible (and sometimes necessary). For example, it may be possible to design exit handlers so that they operate correctly even if called from multiple processes, or to have the application install exit handlers only after the call to fork(). Furthermore, sometimes we may actually want all processes to flush their stdio buffers after a fork(). In this case, we may choose to terminate the processes using exit(), or use explicit calls to fflush() in each process, as appropriate.
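
As one illustration of the first approach mentioned in this note, an exit handler can record the ID of the process that installed it and do nothing when it runs in any other process. The following is a sketch of that idea under stated assumptions, not a recommended design for every application. The names guarded_handler(), install_guarded_handler(), and run_child_that_calls_exit(), as well as the marker file that the handler appends to, are all hypothetical.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pid_t owner_pid;             /* Process that installed the handler */
static const char *marker_path;     /* Hypothetical file the handler appends to */

/* Exit handler that acts only in the process that registered it */
static void
guarded_handler(void)
{
    int fd;
    ssize_t n;

    if (getpid() != owner_pid)
        return;                     /* Inherited copy running in a child */

    fd = open(marker_path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd != -1) {
        n = write(fd, "cleanup\n", 8);
        (void) n;                   /* Nothing useful to do on failure here */
        close(fd);
    }
}

/* Register the handler; returns 0 on success, like atexit() */
static int
install_guarded_handler(const char *path)
{
    owner_pid = getpid();
    marker_path = path;
    return atexit(guarded_handler);
}

/* Fork a child that terminates with exit(), so that it runs the
   inherited handler; returns the child's exit status, or -1 on error */
static int
run_child_that_calls_exit(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        exit(EXIT_SUCCESS);         /* Handler runs, but finds a different PID */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this arrangement, a child that calls exit() still invokes the handler, but the handler returns immediately, so the marker file is touched only when the installing process itself terminates via exit().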

The output of the write() in the program in Example 25-2 doesn’t appear twice, because write() transfers data directly to a kernel buffer, and this buffer, unlike the stdio buffers held in the process’s user-space memory, is not duplicated during a fork().

By now, the reason for the second strange aspect of the program’s output when redirected to a file should be clear. The output of write() appears before that from printf() because the output of write() is immediately transferred to the kernel buffer cache, while the output from printf() is transferred only when the stdio buffers are flushed by the call to exit(). (In general, care is required when mixing stdio functions and system calls to perform I/O on the same file, as described in Section 13.7.)
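
The ordering effect can be reproduced without fork(), as in the following sketch, which again writes to an ordinary file so that the result can be inspected. The function name write_output_comes_first() is hypothetical, the setvbuf() call is an assumption added to force block buffering, and fclose() stands in for the flush that exit() would perform. Calling fflush() before write() is one way of keeping the two kinds of output in program order.

```c
#define _POSIX_C_SOURCE 200809L     /* Expose fileno() under strict compilers */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write "Hello world" via stdio and then "Ciao" via write() to 'path'.
   Returns 1 if the write() output ended up first in the file, 0 if the
   stdio output ended up first, or -1 on error. */
static int
write_output_comes_first(const char *path, int flush_before_write)
{
    char buf[32];
    size_t n;
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    if (setvbuf(fp, NULL, _IOFBF, BUFSIZ) != 0) {   /* Force block buffering */
        fclose(fp);
        return -1;
    }

    fprintf(fp, "Hello world\n");       /* Held in the stdio buffer */

    if (flush_before_write && fflush(fp) == EOF) {
        fclose(fp);
        return -1;
    }

    if (write(fileno(fp), "Ciao\n", 5) != 5) {      /* Bypasses the stdio buffer */
        fclose(fp);
        return -1;
    }

    if (fclose(fp) == EOF)              /* Flushes whatever is still buffered */
        return -1;

    fp = fopen(path, "r");              /* Read the file back to check the order */
    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, fp);
    buf[n] = '\0';
    fclose(fp);

    return strncmp(buf, "Ciao\n", 5) == 0;
}
```

Without the fflush() call, we would expect the function to return 1, matching the redirected output shown earlier; with it, the stdio output reaches the kernel first and the function returns 0.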