We now start to look in earnest at the system call API. Files are a good place to start, since they are central to the UNIX philosophy. The focus of this chapter is the system calls used for performing file input and output.
We introduce the concept of a file descriptor, and then look at the system calls that constitute the so-called universal I/O model. These are the system calls that open and close a file, and read and write data.
We focus on I/O on disk files. However, much of the material covered here is relevant for later chapters, since the same system calls are used for performing I/O on all types of files, such as pipes and terminals.
Chapter 5 extends the discussion in this chapter with further details on file I/O. One other aspect of file I/O, buffering, is complex enough to deserve its own chapter. Chapter 13 covers I/O buffering in the kernel and in the stdio library.
All system calls for performing I/O refer to open files using a file descriptor, a (usually small) nonnegative integer. File descriptors are used to refer to all types of open files, including pipes, FIFOs, sockets, terminals, devices, and regular files. Each process has its own set of file descriptors.
By convention, most programs expect to be able to use the three standard file descriptors listed in Table 4-1. These three descriptors are opened on the program’s behalf by the shell, before the program is started. Or, more precisely, the program inherits copies of the shell’s file descriptors, and the shell normally operates with these three file descriptors always open. (In an interactive shell, these three file descriptors normally refer to the terminal under which the shell is running.) If I/O redirections are specified on a command line, then the shell ensures that the file descriptors are suitably modified before starting the program.
Table 4-1. Standard file descriptors
File descriptor | Purpose | POSIX name | stdio stream |
---|---|---|---|
0 | standard input |
| stdin |
1 | standard output |
| stdout |
2 | standard error |
| stderr |
When referring to these file descriptors in a program, we can use either the numbers (0, 1, or 2) or, preferably, the POSIX standard names defined in <unistd.h>
.
Although the variables stdin, stdout, and stderr initially refer to the process’s standard input, output, and error, they can be changed to refer to any file by using the freopen() library function. As part of its operation, freopen() may change the file descriptor underlying the reopened stream. In other words, after an freopen() on stdout, for example, it is no longer safe to assume that the underlying file descriptor is still 1.
The following are the four key system calls for performing file I/O (programming languages and software packages typically employ these calls only indirectly, via I/O libraries):
fd = open (pathname, flags, mode) opens the file identified by pathname, returning a file descriptor used to refer to the open file in subsequent calls. If the file doesn’t exist, open() may create it, depending on the settings of the flags bit-mask argument. The flags argument also specifies whether the file is to be opened for reading, writing, or both. The mode argument specifies the permissions to be placed on the file if it is created by this call. If the open() call is not being used to create a file, this argument is ignored and can be omitted.
numread = read (fd, buffer, count) reads at most count bytes from the open file referred to by fd and stores them in buffer. The read() call returns the number of bytes actually read. If no further bytes could be read (i.e., end-of-file was encountered), read() returns 0.
numwritten = write (fd, buffer, count) writes up to count bytes from buffer to the open file referred to by fd. The write() call returns the number of bytes actually written, which may be less than count.
status = close (fd) is called after all I/O has been completed, in order to release the file descriptor fd and its associated kernel resources.
Before we launch into the details of these system calls, we provide a short demonstration of their use in Example 4-1. This program is a simple version of the cp(1) command. It copies the contents of the existing file named in its first command-line argument to the new file named in its second command-line argument.
We can use the program in Example 4-1 as follows:
$ ./copy oldfile newfile
Example 4-1. Using I/O system calls
fileio/copy.c
#include <sys/stat.h> #include <fcntl.h> #include "tlpi_hdr.h" #ifndef BUF_SIZE /* Allow "cc -D" to override definition */ #define BUF_SIZE 1024 #endif int main(int argc, char *argv[]) { int inputFd, outputFd, openFlags; mode_t filePerms; ssize_t numRead; char buf[BUF_SIZE]; if (argc != 3 || strcmp(argv[1], "--help") == 0) usageErr("%s old-file new-file\n", argv[0]); /* Open input and output files */ inputFd = open(argv[1], O_RDONLY); if (inputFd == -1) errExit("opening file %s", argv[1]); openFlags = O_CREAT | O_WRONLY | O_TRUNC; filePerms = S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH; /* rw-rw-rw- */ outputFd = open(argv[2], openFlags, filePerms); if (outputFd == -1) errExit("opening file %s", argv[2]); /* Transfer data until we encounter end of input or an error */ while ((numRead = read(inputFd, buf, BUF_SIZE)) > 0) if (write(outputFd, buf, numRead) != numRead) fatal("couldn't write whole buffer"); if (numRead == -1) errExit("read"); if (close(inputFd) == -1) errExit("close input"); if (close(outputFd) == -1) errExit("close output"); exit(EXIT_SUCCESS); }fileio/copy.c