Using Pipes to Connect Filters

When a pipe is created, the file descriptors used for the two ends of the pipe are the next lowest-numbered descriptors available. Since, in normal circumstances, descriptors 0, 1, and 2 are already in use for a process, some higher-numbered descriptors will be allocated for the pipe. So how do we bring about the situation shown in Figure 44-1, where two filters (i.e., programs that read from stdin and write to stdout) are connected using a pipe, such that the standard output of one program is directed into the pipe and the standard input of the other is taken from the pipe? And in particular, how can we do this without modifying the code of the filters themselves?

The answer is to use the descriptor-duplication techniques described in Duplicating File Descriptors. Traditionally, the following series of calls was used to accomplish the desired result:

int pfd[2];

pipe(pfd);          /* Allocates (say) file descriptors 3 and 4 for pipe */

/* Other steps here, e.g., fork() */

close(STDOUT_FILENO);           /* Free file descriptor 1 */
dup(pfd[1]);                    /* Duplication uses lowest free file
                                   descriptor, i.e., fd 1 */

The end result of the above steps is that the process’s standard output is bound to the write end of the pipe. A corresponding set of calls can be used to bind a process’s standard input to the read end of the pipe.
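As a rough sketch of that corresponding set of calls, the following helper binds standard input to the read end of a pipe using the same close() and dup() sequence. The function name bindStdinTraditional() is hypothetical and is used here only for illustration. Like the calls above, it assumes that descriptors 0, 1, and 2 were all open beforehand.

```c
#include <unistd.h>

/* Hypothetical helper (not part of any library): bind standard input to
   the read end of the pipe 'pfd' using the traditional close() + dup()
   sequence. Assumes descriptors 0, 1, and 2 were open on entry, so that
   0 is the lowest free descriptor after the close().
   Returns 0 on success, or -1 on failure. */

int
bindStdinTraditional(int pfd[2])
{
    if (close(STDIN_FILENO) == -1)      /* Free file descriptor 0 */
        return -1;

    if (dup(pfd[0]) != STDIN_FILENO)    /* Duplication should yield fd 0 */
        return -1;

    return 0;
}
```

After a successful call, both descriptor 0 and pfd[0] refer to the read end of the pipe, so the caller would normally go on to close pfd[0].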

Note that these steps depend on the assumption that file descriptors 0, 1, and 2 for a process are already open. (The shell normally ensures this for each program it executes.) If file descriptor 0 was already closed prior to the above steps, then dup() would return descriptor 0, since that is the lowest unused descriptor, and we would erroneously bind the process’s standard input to the write end of the pipe. To avoid this possibility, we can replace the calls to close() and dup() with the following dup2() call, which allows us to explicitly specify the descriptor to be bound to the pipe end:

dup2(pfd[1], STDOUT_FILENO);    /* Close descriptor 1, and reopen bound
                                   to write end of pipe */

After duplicating pfd[1], we now have two file descriptors referring to the write end of the pipe: descriptor 1 and pfd[1]. Since unused pipe file descriptors should be closed, after the dup2() call, we close the superfluous descriptor:

close(pfd[1]);

The code we have shown so far relies on standard output having been previously open. Suppose that, prior to the pipe() call, standard input and standard output had both been closed. In this case, pipe() would have allocated these two descriptors to the pipe, perhaps with pfd[0] having the value 0 and pfd[1] having the value 1. Consequently, the preceding dup2() and close() calls would be equivalent to the following:

dup2(1, 1);         /* Does nothing */
close(1);           /* Closes sole descriptor for write end of pipe */
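To make the hazard concrete, here is a minimal sketch of the unguarded sequence wrapped in a function. The name naiveRedirect() is hypothetical. When oldfd and newfd happen to be the same descriptor, dup2() leaves it untouched and the close() then discards the only descriptor for that end of the pipe.

```c
#include <unistd.h>

/* Hypothetical helper illustrating the problem: performs dup2() followed
   by close() with no check for 'oldfd' being equal to 'newfd'.
   Returns 0 on success, or -1 on error. */

int
naiveRedirect(int oldfd, int newfd)
{
    if (dup2(oldfd, newfd) == -1)   /* Does nothing if oldfd == newfd */
        return -1;

    return close(oldfd);            /* If oldfd == newfd, this closes the
                                       descriptor we wanted to keep */
}
```

If this function is called with the write end of a pipe as both arguments, a later write() on that descriptor fails, and a read() from the other end of the (empty) pipe sees end-of-file, since no write descriptors remain open.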

Therefore, it is good defensive programming practice to bracket these calls with an if statement of the following form:

if (pfd[1] != STDOUT_FILENO) {
    dup2(pfd[1], STDOUT_FILENO);
    close(pfd[1]);
}
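Since the same check is needed for each descriptor that we rebind, one option is to package it in a small function. The following is only a sketch; the name redirectFd() is hypothetical and not part of any standard library.

```c
#include <unistd.h>

/* Hypothetical helper: make 'newfd' refer to the same open file as
   'oldfd', and then close 'oldfd'. If the two descriptors are already
   the same, do nothing, so that we don't close the only descriptor
   referring to the file. Returns 0 on success, or -1 on error. */

int
redirectFd(int oldfd, int newfd)
{
    if (oldfd == newfd)             /* Defensive check */
        return 0;

    if (dup2(oldfd, newfd) == -1)   /* Closes newfd first, if it was open */
        return -1;

    return close(oldfd);            /* Close the superfluous descriptor */
}
```

With such a helper, a child that is about to exec a filter could call redirectFd(pfd[1], STDOUT_FILENO) to bind its standard output to the write end of the pipe, or redirectFd(pfd[0], STDIN_FILENO) to bind its standard input to the read end.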

Example program

The program in Example 44-4 uses the techniques described in this section to bring about the setup shown in Figure 44-1. After building a pipe, this program creates two child processes. The first child binds its standard output to the write end of the pipe and then execs ls. The second child binds its standard input to the read end of the pipe and then execs wc.

Example 44-4. Using a pipe to connect ls and wc

pipes/pipe_ls_wc.c
#include <sys/wait.h>
#include "tlpi_hdr.h"

int
main(int argc, char *argv[])
{
    int pfd[2];                                     /* Pipe file descriptors */

    if (pipe(pfd) == -1)                            /* Create pipe */
        errExit("pipe");

    switch (fork()) {
    case -1:
        errExit("fork");

    case 0:             /* First child: exec 'ls' to write to pipe */
        if (close(pfd[0]) == -1)                    /* Read end is unused */
            errExit("close 1");

        /* Duplicate stdout on write end of pipe; close duplicated descriptor */

        if (pfd[1] != STDOUT_FILENO) {              /* Defensive check */
            if (dup2(pfd[1], STDOUT_FILENO) == -1)
                errExit("dup2 1");
            if (close(pfd[1]) == -1)
                errExit("close 2");
        }

        execlp("ls", "ls", (char *) NULL);          /* Writes to pipe */
        errExit("execlp ls");

    default:            /* Parent falls through to create next child */
        break;
    }

    switch (fork()) {
    case -1:
        errExit("fork");

    case 0:             /* Second child: exec 'wc' to read from pipe */
        if (close(pfd[1]) == -1)                    /* Write end is unused */
            errExit("close 3");

        /* Duplicate stdin on read end of pipe; close duplicated descriptor */

        if (pfd[0] != STDIN_FILENO) {               /* Defensive check */
            if (dup2(pfd[0], STDIN_FILENO) == -1)
                errExit("dup2 2");
            if (close(pfd[0]) == -1)
                errExit("close 4");
        }

        execlp("wc", "wc", "-l", (char *) NULL);    /* Reads from pipe */
        errExit("execlp wc");

    default:            /* Parent falls through */
        break;
    }

    /* Parent closes unused file descriptors for pipe, and waits for children */

    if (close(pfd[0]) == -1)
        errExit("close 5");
    if (close(pfd[1]) == -1)
        errExit("close 6");
    if (wait(NULL) == -1)
        errExit("wait 1");
    if (wait(NULL) == -1)
        errExit("wait 2");

    exit(EXIT_SUCCESS);
}
     pipes/pipe_ls_wc.c

When we run the program in Example 44-4, we see the following:

$ ./pipe_ls_wc
     24
$ ls | wc -l                    Verify the results using shell commands
     24