I/O on Large Files

The off_t data type used to hold a file offset is typically implemented as a signed long integer. (A signed data type is required because the value -1 is used for representing error conditions.) On 32-bit architectures (such as x86-32), this would limit the size of files to 2^31 - 1 bytes (i.e., 2 GB).
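The arithmetic behind this limit can be sketched in a few lines of C. The function name maxSignedOffset() is a hypothetical helper introduced only for illustration, and the sketch assumes a two's-complement representation for signed integers:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the largest value representable in a signed
   two's-complement integer that is 'bits' wide (2 <= bits <= 64), that is,
   2^(bits-1) - 1. The shift is performed on an unsigned type in order to
   avoid signed overflow when 'bits' is 64. */

int64_t
maxSignedOffset(int bits)
{
    assert(bits >= 2 && bits <= 64);

    return (int64_t) ((UINT64_C(1) << (bits - 1)) - 1);
}
```

For a 32-bit off_t, maxSignedOffset(32) yields 2147483647, one byte short of 2 GB; for a 64-bit off_t, maxSignedOffset(64) yields the same value as INT64_MAX.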

However, the capacity of disk drives long ago exceeded this limit, and thus the need arose for 32-bit UNIX implementations to handle files larger than this size. Since this is a problem common to many implementations, a consortium of UNIX vendors cooperated on the Large File Summit (LFS) to enhance the SUSv2 specification with the extra functionality required to access large files. We outline the LFS enhancements in this section. (The complete LFS specification, finalized in 1996, can be found at http://opengroup.org/platform/lfs.html.)

Linux has provided LFS support on 32-bit systems since kernel 2.4 (glibc 2.2 or later is also required). In addition, the corresponding file system must also support large files. Most native Linux file systems provide this support, but some nonnative file systems do not (notable examples are Microsoft’s VFAT and NFSv2, both of which impose hard limits of 2 GB, regardless of whether the LFS extensions are employed).

Note

Because long integers use 64 bits on 64-bit architectures (e.g., Alpha, IA-64), these architectures generally don’t suffer the limitations that the LFS enhancements were designed to address. Nevertheless, the implementation details of some native Linux file systems mean that the theoretical maximum size of a file may be less than 2^63 - 1 bytes, even on 64-bit systems. In most cases, these limits are much higher than current disk sizes, so they don’t impose a practical limitation on file sizes.

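We can write applications requiring LFS functionality in one of two ways: by using the transitional LFS API, which provides 64-bit variants of the relevant functions and data types, or by defining the _FILE_OFFSET_BITS macro with the value 64 when compiling our programs. We describe each approach in turn.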

To use the transitional LFS API, we must define the _LARGEFILE64_SOURCE feature test macro when compiling our program, either on the command line, or within the source file before including any header files. This API provides functions capable of handling 64-bit file sizes and offsets. These functions have the same names as their 32-bit counterparts, but have the suffix 64 appended to the function name. Among these functions are fopen64(), open64(), lseek64(), truncate64(), stat64(), mmap64(), and setrlimit64(). (We’ve already described some of the 32-bit counterparts of these functions; others are described in later chapters.)

In order to access a large file, we simply use the 64-bit version of the function. For example, to open a large file, we could write the following:

fd = open64(name, O_CREAT | O_RDWR, mode);
if (fd == -1)
    errExit("open");

Note

Calling open64() is equivalent to specifying the O_LARGEFILE flag when calling open(). On a 32-bit system, attempts to open a file larger than 2 GB by calling open() without this flag fail with the error EOVERFLOW.
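As a minimal sketch of this equivalence, the following function requests large-file support by passing O_LARGEFILE to open() rather than by calling open64(). The function name touchLarge() is hypothetical, and the sketch assumes a glibc-style environment in which defining _LARGEFILE64_SOURCE exposes the O_LARGEFILE constant:

```c
#define _LARGEFILE64_SOURCE     /* Expose O_LARGEFILE (and the *64 names) */
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <assert.h>

/* Hypothetical helper: create (or open) 'pathname' with large-file support
   requested explicitly via O_LARGEFILE, then close it again. Returns 0 on
   success, or -1 on error (with errno set by the failing call). */

int
touchLarge(const char *pathname)
{
    int fd;

    assert(pathname != NULL);

    fd = open(pathname, O_CREAT | O_RDWR | O_LARGEFILE, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    return close(fd);
}
```

On 64-bit platforms, where off_t is already 64 bits wide, the flag typically has no effect, so this sketch matters mainly when building for a 32-bit target.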

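In addition to the aforementioned functions, the transitional LFS API adds some new data types, including struct stat64 (an analog of the stat structure that allows for large file sizes) and off64_t (a 64-bit type for representing file offsets).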

The off64_t data type is used with (among others) the lseek64() function, as shown in Example 5-3. This program takes two command-line arguments: the name of a file to be opened and an integer value specifying a file offset. The program opens the specified file, seeks to the given file offset, and then writes a string. The following shell session demonstrates the use of the program to seek to a very large offset in the file (greater than 10 GB) and then write some bytes:

$ ./large_file x 10111222333
$ ls -l x                                   Check size of resulting file
-rw-------    1 mtk      users    10111222337 Mar  4 13:34 x
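The core of such a program can be sketched as follows. This is not the listing from Example 5-3 but an illustrative reconstruction under stated assumptions: the function name writeAtOffset() is hypothetical, the string written is assumed to be 4 bytes long (consistent with the file size shown in the shell session above), and error handling is reduced to returning -1.

```c
#define _LARGEFILE64_SOURCE     /* Expose off64_t, open64(), lseek64(), ... */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <assert.h>

/* Hypothetical helper: open 'pathname' (creating it if necessary), seek to
   'offset' bytes from the start of the file, and write the 4 bytes "test".
   Returns the resulting file size, or -1 on error. */

off64_t
writeAtOffset(const char *pathname, off64_t offset)
{
    struct stat64 sb;
    off64_t size = -1;
    int fd;

    assert(pathname != NULL && offset >= 0);

    fd = open64(pathname, O_RDWR | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1)
        return -1;

    /* Each step runs only if the previous one succeeded */

    if (lseek64(fd, offset, SEEK_SET) != -1 &&
            write(fd, "test", 4) == 4 &&
            fstat64(fd, &sb) == 0)
        size = sb.st_size;

    if (close(fd) == -1)
        return -1;

    return size;
}
```

A call such as writeAtOffset("x", 10111222333LL) corresponds to the shell session above. On a file system that supports large files, the range before the written bytes is normally a file hole, so little disk space is actually consumed.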

The recommended method of obtaining LFS functionality is to define the macro _FILE_OFFSET_BITS with the value 64 when compiling a program. One way to do this is via a command-line option to the C compiler:

$ cc -D_FILE_OFFSET_BITS=64 prog.c

Alternatively, we can define this macro in the C source before including any header files:

#define _FILE_OFFSET_BITS 64

This automatically converts all of the relevant 32-bit functions and data types into their 64-bit counterparts. Thus, for example, calls to open() are actually converted into calls to open64(), and the off_t data type is defined to be 64 bits long. In other words, we can recompile an existing program to handle large files without needing to make any changes to the source code.

Using _FILE_OFFSET_BITS is clearly simpler than using the transitional LFS API, but this approach relies on applications being cleanly written (e.g., correctly using off_t to declare variables holding file offsets, rather than using a native C integer type).
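To illustrate what "cleanly written" means here, the following sketch shows the kind of check that legacy code might need if it still stores file offsets in 32-bit variables. Both function names are hypothetical, and int32_t is used as a stand-in for the long type of a 32-bit architecture so that the example behaves the same on 64-bit hosts:

```c
#define _FILE_OFFSET_BITS 64
#include <sys/types.h>
#include <stdint.h>
#include <assert.h>

/* Hypothetical helper: return 1 if 'offset' can be stored in a 32-bit
   signed integer without changing its value, or 0 otherwise. */

int
fitsIn32Bits(off_t offset)
{
    return offset >= INT32_MIN && offset <= INT32_MAX;
}

/* Hypothetical helper for legacy interfaces that accept only 32-bit
   offsets: fail loudly (via assert) instead of silently truncating. */

int32_t
narrowOffset(off_t offset)
{
    assert(fitsIn32Bits(offset));

    return (int32_t) offset;
}
```

An offset such as 10111222333 does not fit in 32 bits, so code that assigned it to an int or a 32-bit long would end up with a different value. Declaring such variables as off_t avoids the problem entirely.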

The _FILE_OFFSET_BITS macro is not required by the LFS specification, which merely mentions this macro as an optional method of specifying the size of the off_t data type. Some UNIX implementations use a different feature test macro to obtain this functionality.

One problem that the LFS extensions don’t solve for us is how to pass off_t values to printf() calls. In System Data Types, we noted that the portable method of displaying values of one of the predefined system data types (e.g., pid_t or uid_t) was to cast that value to long, and use the %ld printf() specifier. However, if we are employing the LFS extensions, then this is often not sufficient for the off_t data type, because it may be defined as a type larger than long, typically long long. Therefore, to display a value of type off_t, we cast it to long long and use the %lld printf() specifier, as in the following:

#define _FILE_OFFSET_BITS 64

off_t offset;           /* Will be 64 bits, the size of 'long long' */

/* Other code assigning a value to 'offset' */

printf("offset=%lld\n", (long long) offset);

Similar remarks also apply for the related blkcnt_t data type, which is employed in the stat structure (described in Retrieving File Information: stat()).
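The same cast-and-print technique can be applied to both types at once, as in the following sketch. The function name formatSizeAndBlocks() and the output format are hypothetical, chosen only for illustration:

```c
#define _FILE_OFFSET_BITS 64
#include <sys/types.h>
#include <stdio.h>
#include <assert.h>

/* Hypothetical helper: format a file size (off_t) and a block count
   (blkcnt_t) into 'buf', casting each value to 'long long' so that the
   %lld specifier is correct whatever the underlying widths of the types.
   Returns the value returned by snprintf(). */

int
formatSizeAndBlocks(char *buf, size_t len, off_t size, blkcnt_t blocks)
{
    assert(buf != NULL && len > 0);

    return snprintf(buf, len, "size=%lld blocks=%lld",
                    (long long) size, (long long) blocks);
}
```

In practice, the two values would usually come from the st_size and st_blocks fields of a stat structure.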

Note

If we are passing function arguments of the types off_t or stat between separately compiled modules, then we need to ensure that both modules use the same sizes for these types (i.e., either both were compiled with _FILE_OFFSET_BITS set to 64 or both were compiled without this setting).
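One way of guarding against such a mismatch is a compile-time check placed in a header that every module includes. The following is a sketch only, assuming a C11 compiler; the function offsetWidthBits() is a hypothetical addition for run-time inspection:

```c
#include <sys/types.h>
#include <limits.h>
#include <assert.h>

/* Intended for a header shared by all modules of a program (hypothetical).
   On a 32-bit platform, compilation fails for any module that was built
   without -D_FILE_OFFSET_BITS=64. On 64-bit platforms, where off_t is
   already 64 bits wide, the check passes trivially. */

static_assert(sizeof(off_t) == 8,
              "compile every module with -D_FILE_OFFSET_BITS=64");

/* Hypothetical helper: report the width of off_t, in bits, as seen by the
   module in which this function is compiled. */

size_t
offsetWidthBits(void)
{
    return sizeof(off_t) * CHAR_BIT;
}
```

A check of this kind catches the mismatch at build time rather than leaving it to show up as corrupted offsets when the modules exchange data.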