In this section, we consider the topic of writing portable system programs. We introduce feature test macros and the standard system data types defined by SUSv3, and then look at some other portability issues.
Various standards govern the behavior of the system call and library function APIs (see Standardization). Some of these standards are defined by standards bodies such as The Open Group (Single UNIX Specification), while others are defined by the two historically important UNIX implementations: BSD and System V Release 4 (and the associated System V Interface Definition).
Sometimes, when writing a portable application, we may want the various header files to expose only the definitions (constants, function prototypes, and so on) that follow a particular standard. To do this, we define one or more of the feature test macros listed below when compiling a program. One way that we can do this is by defining the macro in the program source code before including any header files:
#define _BSD_SOURCE 1
Alternatively, we can use the -D option to the C compiler:
$ cc -D_BSD_SOURCE prog.c
The term feature test macro may seem confusing, but it makes sense if we look at things from the perspective of the implementation. The implementation decides which of the features available in each header it should make visible, by testing (with #if) which values the application has defined for these macros.
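To make the mechanism concrete, the following minimal sketch mimics what a header might do internally. The names DEMO_EXPOSE_SUSV3 and demo_susv3_visible() are hypothetical and exist only for illustration; real headers, such as glibc's <features.h>, use their own internal macros.

```c
/* The application states its request before including any header */
#define _XOPEN_SOURCE 600

#include <stdio.h>

/*
 * Hypothetical header fragment: the implementation tests the macro
 * with #if and decides what to make visible. DEMO_EXPOSE_SUSV3 is an
 * illustrative name, not a real glibc macro.
 */
#if defined(_XOPEN_SOURCE) && _XOPEN_SOURCE >= 600
#define DEMO_EXPOSE_SUSV3 1
#else
#define DEMO_EXPOSE_SUSV3 0
#endif

/* Returns 1 if the (pretend) SUSv3 declarations would be visible */
static int
demo_susv3_visible(void)
{
    return DEMO_EXPOSE_SUSV3;
}
```

Because the test happens at preprocessing time, the macro must be defined before the first header is included; defining it afterward is too late to influence what that header exposed.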
The following feature test macros are specified by the relevant standards, and consequently their usage is portable to all systems that support these standards:
_POSIX_SOURCE
If defined (with any value), expose definitions conforming to POSIX.1-1990 and ISO C (1990). This macro is superseded by _POSIX_C_SOURCE.
_POSIX_C_SOURCE
If defined with the value 1, this has the same effect as _POSIX_SOURCE. If defined with a value greater than or equal to 199309, also expose definitions for POSIX.1b (realtime). If defined with a value greater than or equal to 199506, also expose definitions for POSIX.1c (threads). If defined with the value 200112, also expose definitions for the POSIX.1-2001 base specification (i.e., the XSI extension is excluded). (Prior to version 2.3.3, the glibc headers don’t interpret the value 200112 for _POSIX_C_SOURCE.) If defined with the value 200809, also expose definitions for the POSIX.1-2008 base specification. (Prior to version 2.10, the glibc headers don’t interpret the value 200809 for _POSIX_C_SOURCE.)
_XOPEN_SOURCE
If defined (with any value), expose POSIX.1, POSIX.2, and X/Open (XPG4) definitions. If defined with the value 500 or greater, also expose SUSv2 (UNIX 98 and XPG5) extensions. Setting to 600 or greater additionally exposes SUSv3 XSI (UNIX 03) extensions and C99 extensions. (Prior to version 2.2, the glibc headers don’t interpret the value 600 for _XOPEN_SOURCE.) Setting to 700 or greater also exposes SUSv4 XSI extensions. (Prior to version 2.10, the glibc headers don’t interpret the value 700 for _XOPEN_SOURCE.) The values 500, 600, and 700 for _XOPEN_SOURCE were chosen because SUSv2, SUSv3, and SUSv4 are Issues 5, 6, and 7, respectively, of the X/Open specifications.
The following feature test macros are glibc-specific:
_BSD_SOURCE
If defined (with any value), expose BSD definitions. Defining this macro also defines _POSIX_C_SOURCE with the value 199506. Explicitly setting just this macro causes BSD definitions to be favored in a few cases where standards conflict.
_SVID_SOURCE
If defined (with any value), expose System V Interface Definition (SVID) definitions.
_GNU_SOURCE
If defined (with any value), expose all of the definitions provided by setting all of the preceding macros, as well as various GNU extensions.
When the GNU C compiler is invoked without special options, _POSIX_SOURCE, _POSIX_C_SOURCE=200809 (200112 with glibc versions 2.5 to 2.9, or 199506 with glibc versions earlier than 2.4), _BSD_SOURCE, and _SVID_SOURCE are defined by default.
If individual macros are defined, or the compiler is invoked in one of its standard modes (e.g., cc -ansi or cc -std=c99), then only the requested definitions are supplied. There is one exception: if _POSIX_C_SOURCE is not otherwise defined, and the compiler is not invoked in one of its standard modes, then _POSIX_C_SOURCE is defined with the value 200809 (200112 with glibc versions 2.4 to 2.9, or 199506 with glibc versions earlier than 2.4).
Defining multiple macros is additive, so that we could, for example, use the following cc command to explicitly select the same macro settings as are provided by default:
$ cc -D_POSIX_SOURCE -D_POSIX_C_SOURCE=200809 \
        -D_BSD_SOURCE -D_SVID_SOURCE prog.c
The <features.h> header file and the feature_test_macros(7) manual page provide further information on precisely what values are assigned to each of the feature test macros.
Only the _POSIX_C_SOURCE and _XOPEN_SOURCE feature test macros are specified in POSIX.1-2001/SUSv3, which requires that these macros be defined with the values 200112 and 600, respectively, in conforming applications. Defining _POSIX_C_SOURCE as 200112 provides conformance to the POSIX.1-2001 base specification (i.e., POSIX conformance, excluding the XSI extension). Defining _XOPEN_SOURCE as 600 provides conformance to SUSv3 (i.e., XSI conformance, the base specification plus the XSI extension). Analogous statements apply for POSIX.1-2008/SUSv4, which requires that the two macros be defined with the values 200809 and 700, respectively.
SUSv3 specifies that setting _XOPEN_SOURCE to 600 should supply all of the features that are enabled if _POSIX_C_SOURCE is set to 200112. Thus, an application needs to define only _XOPEN_SOURCE for SUSv3 (i.e., XSI) conformance. SUSv4 makes an analogous specification that setting _XOPEN_SOURCE to 700 should supply all of the features that are enabled if _POSIX_C_SOURCE is set to 200809.
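As a rough way of observing this relationship, the following sketch defines only _XOPEN_SOURCE and then reports whatever value of _POSIX_C_SOURCE the headers derived from it. The function name implied_posix_c_source() is our own invention. We assume here that the implementation expresses the rule by defining _POSIX_C_SOURCE itself, as we believe glibc's <features.h> does; an implementation may instead supply the features without defining the macro, in which case the function returns 0.

```c
#define _XOPEN_SOURCE 600       /* Request SUSv3 (XSI) conformance */

#include <unistd.h>

/*
 * Returns the value of _POSIX_C_SOURCE that the headers derived from
 * our _XOPEN_SOURCE setting, or 0 if the implementation does not
 * define the macro on our behalf.
 */
static long
implied_posix_c_source(void)
{
#ifdef _POSIX_C_SOURCE
    return (long) _POSIX_C_SOURCE;
#else
    return 0L;
#endif
}
```

On a glibc system, we would expect the returned value to be at least 200112, but this is an observation about one implementation rather than something a portable program should rely on.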
The manual pages describe which feature test macro(s) must be defined in order to make a particular constant definition or function declaration visible from a header file.
All of the source code examples in this book are written so that they will compile using either the default GNU C compiler options or the following options:
$ cc -std=c99 -D_XOPEN_SOURCE=600
The prototype of each function shown in this book indicates any feature test macro(s) that must be defined in order to employ that function in a program compiled with either the default compiler options or the options in the cc command just shown. The manual pages provide more precise descriptions of the feature test macro(s) required to expose the declaration of each function.
Various implementation data types (for example, process IDs, user IDs, and file offsets) are represented using standard C types. Although it would be possible to use the C fundamental types such as int and long to declare variables storing such information, this reduces portability across UNIX systems, for the following reasons:
The sizes of these fundamental types vary across UNIX implementations (e.g., a long may be 4 bytes on one system and 8 bytes on another), or sometimes even in different compilation environments on the same implementation. Furthermore, different implementations may use different types to represent the same information. For example, a process ID might be an int on one system but a long on another.
Even on a single UNIX implementation, the types used to represent information may differ between releases of the implementation. Notable examples on Linux are user and group IDs. On Linux 2.2 and earlier, these values were represented in 16 bits. On Linux 2.4 and later, they are 32-bit values.
To avoid such portability problems, SUSv3 specifies various standard system data types, and requires an implementation to define and use these types appropriately. Each of these types is defined using the C typedef feature. For example, the pid_t data type is intended for representing process IDs, and on Linux/x86-32 this type is defined as follows:
typedef int pid_t;
Most of the standard system data types have names ending in _t. Many of them are declared in the header file <sys/types.h>, although a few are defined in other header files.
An application should employ these type definitions to portably declare the variables it uses. For example, the following declaration would allow an application to correctly represent process IDs on any SUSv3-conformant system:
pid_t mypid;
Table 3-1 lists some of the system data types we’ll encounter in this book. For certain types in this table, SUSv3 requires that the type be implemented as an arithmetic type. This means that the implementation may choose the underlying type as either an integer or a floating-point (real or complex) type.
Table 3-1. Selected system data types
Data type | SUSv3 type requirement | Description |
---|---|---|
blkcnt_t | signed integer | File block count (Retrieving File Information: stat()) |
blksize_t | signed integer | File block size (Retrieving File Information: stat()) |
cc_t | unsigned integer | Terminal special character (Terminal Special Characters) |
clock_t | integer or real-floating | System time in clock ticks (Process Time) |
clockid_t | an arithmetic type | Clock identifier for POSIX.1b clock and timer functions (POSIX Interval Timers) |
comp_t | not in SUSv3 | Compressed clock ticks (Process Accounting) |
dev_t | an arithmetic type | Device number, consisting of major and minor numbers (Retrieving File Information: stat()) |
DIR | no type requirement | Directory stream (Reading Directories: opendir() and readdir()) |
fd_set | structure type | File descriptor set for select() (The select() System Call) |
fsblkcnt_t | unsigned integer | File-system block count (Obtaining Information About a File System: statvfs()) |
fsfilcnt_t | unsigned integer | File count (Obtaining Information About a File System: statvfs()) |
gid_t | integer | Numeric group identifier (The Group File: /etc/group) |
id_t | integer | A generic type for holding identifiers; large enough to hold at least pid_t, uid_t, and gid_t |
in_addr_t | 32-bit unsigned integer | IPv4 address (Internet Socket Addresses) |
in_port_t | 16-bit unsigned integer | IP port number (Internet Socket Addresses) |
ino_t | unsigned integer | File i-node number (Retrieving File Information: stat()) |
key_t | an arithmetic type | System V IPC key (IPC Keys) |
mode_t | integer | File permissions and type (Retrieving File Information: stat()) |
mqd_t | no type requirement, but shall not be an array type | POSIX message queue descriptor |
msglen_t | unsigned integer | Number of bytes allowed in System V message queue (Message Queue Associated Data Structure) |
msgqnum_t | unsigned integer | Counts of messages in System V message queue (Message Queue Associated Data Structure) |
nfds_t | unsigned integer | Number of file descriptors for poll() (The poll() System Call) |
nlink_t | integer | Count of (hard) links to a file (Retrieving File Information: stat()) |
off_t | signed integer | File offset or size (Changing the File Offset: lseek() and Retrieving File Information: stat()) |
pid_t | signed integer | Process ID, process group ID, or session ID (Process ID and Parent Process ID, Process Groups, and Sessions) |
ptrdiff_t | signed integer | Difference between two pointer values, as a signed integer |
rlim_t | unsigned integer | Resource limit (Process Resource Limits) |
sa_family_t | unsigned integer | Socket address family (Generic Socket Address Structures: struct sockaddr) |
shmatt_t | unsigned integer | Count of attached processes for a System V shared memory segment (Shared Memory Associated Data Structure) |
sig_atomic_t | integer | Data type that can be atomically accessed (Global Variables and the sig_atomic_t Data Type) |
siginfo_t | structure type | Information about the origin of a signal (The SA_SIGINFO Flag) |
sigset_t | integer or structure type | Signal set (Signal Sets) |
size_t | unsigned integer | Size of an object in bytes |
socklen_t | integer type of at least 32 bits | Size of a socket address structure in bytes (Binding a Socket to an Address: bind()) |
speed_t | unsigned integer | Terminal line speed (Terminal Line Speed (Bit Rate)) |
ssize_t | signed integer | Byte count or (negative) error indication |
stack_t | structure type | Description of an alternate signal stack (Handling a Signal on an Alternate Stack: sigaltstack()) |
suseconds_t | signed integer allowing range [-1, 1000000] | Microsecond time interval (Calendar Time) |
tcflag_t | unsigned integer | Terminal mode flag bit mask (Retrieving and Modifying Terminal Attributes) |
time_t | integer or real-floating | Calendar time in seconds since the Epoch (Calendar Time) |
timer_t | an arithmetic type | Timer identifier for POSIX.1b interval timer functions (POSIX Interval Timers) |
uid_t | integer | Numeric user identifier (The Password File: /etc/passwd) |
When discussing the data types in Table 3-1 in later chapters, we’ll often make statements that some type “is an integer type [specified by SUSv3].” This means that SUSv3 requires the type to be defined as an integer, but doesn’t require that a particular native integer type (e.g., short, int, or long) be used. (Often, we won’t say which particular native data type is actually used to represent each of the system data types in Linux, because a portable application should be written so that it doesn’t care which data type is used.)
When printing values of one of the numeric system data types shown in Table 3-1 (e.g., pid_t and uid_t), we must be careful not to include a representation dependency in the printf() call. A representation dependency can occur because C’s argument promotion rules convert values of type short to int, but leave values of type int and long unchanged. This means that, depending on the definition of the system data type, either an int or a long is passed in the printf() call. However, because printf() has no way to determine the types of its arguments at run time, the caller must explicitly provide this information using the %d or %ld format specifier. The problem is that simply coding one of these specifiers within the printf() call creates an implementation dependency. The usual solution is to use the %ld specifier and always cast the corresponding value to long, like so:
pid_t mypid;

mypid = getpid();           /* Returns process ID of calling process */
printf("My PID is %ld\n", (long) mypid);
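The same cast can be wrapped in a small helper so that the representation dependency is handled in one place. The following is a minimal sketch; format_pid() is a hypothetical name of our own, and it uses snprintf() so that the result can be inspected rather than printed.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Format a process ID without assuming which native type underlies
 * pid_t. Returns the number of characters that snprintf() reports.
 */
static int
format_pid(char *buf, size_t size, pid_t pid)
{
    /* The cast makes the argument match %ld on every implementation */
    return snprintf(buf, size, "%ld", (long) pid);
}
```

A caller would pass a buffer and the value returned by getpid(). The same pattern should serve for other integer system data types that fit in a long.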
We make one exception to the above technique. Because the off_t data type is the size of long long in some compilation environments, we cast off_t values to this type and use the %lld specifier, as described in I/O on Large Files.
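The exception for off_t can be sketched in the same way. The helper name format_offset() is hypothetical; the only point of the example is the (long long) cast paired with %lld.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Format a file offset. off_t may be wider than long in some
 * compilation environments, so we cast to long long and use %lld.
 * Returns the number of characters that snprintf() reports.
 */
static int
format_offset(char *buf, size_t size, off_t offset)
{
    return snprintf(buf, size, "%lld", (long long) offset);
}
```

Casting to long long costs nothing where off_t is narrower, and avoids truncation where it is 64 bits wide.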
The C99 standard defines the z length modifier for printf(), to indicate that the following integer conversion corresponds to a size_t or ssize_t type. Thus, we could write %zu (for size_t) or %zd (for ssize_t) instead of using %ld plus a cast for these types. Although this specifier is available in glibc, we avoid it because it is not available on all UNIX implementations.
The C99 standard also defines the j length modifier, which specifies that the corresponding argument is of type intmax_t (or uintmax_t), an integer type that is guaranteed to be large enough to be able to represent an integer of any type. Ultimately, the use of an (intmax_t) cast plus the %jd specifier should replace the (long) cast plus the %ld specifier as the best way of printing numeric system data type values, since the former approach also handles long long values and any extended integer types such as int128_t. However, (again) we avoid this technique since it is not available on all UNIX implementations.
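For completeness, here is a minimal sketch of the two C99 length modifiers on an implementation that supports them. The helper names are hypothetical. Note that the unsigned size_t pairs with %zu, while %zd is the form for the signed ssize_t.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* %zu takes a size_t directly, so no cast is needed */
static int
format_size(char *buf, size_t size, size_t value)
{
    return snprintf(buf, size, "%zu", value);
}

/* %jd takes an intmax_t, so any signed integer type can be cast to it */
static int
format_signed_id(char *buf, size_t size, pid_t value)
{
    return snprintf(buf, size, "%jd", (intmax_t) value);
}
```

Both helpers return the length reported by snprintf(). They assume a C99-conforming printf() family, which is exactly the assumption that the rest of this book avoids making.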
In this section, we consider a few other portability issues that we may encounter when writing system programs.
Each UNIX implementation specifies a range of standard structures that are used in various system calls and library functions. As an example, consider the sembuf structure, which is used to represent a semaphore operation to be performed by the semop() system call:
struct sembuf {
    unsigned short sem_num;     /* Semaphore number */
    short          sem_op;      /* Operation to be performed */
    short          sem_flg;     /* Operation flags */
};
Although SUSv3 specifies structures such as sembuf, it is important to realize the following:
In general, the order of field definitions within such structures is not specified.
In some cases, extra implementation-specific fields may be included in such structures.
Consequently, it is not portable to use a structure initializer such as the following:
struct sembuf s = { 3, -1, SEM_UNDO };
Although this initializer will work on Linux, it won’t work on another implementation where the fields in the sembuf structure are defined in a different order. To portably initialize such structures, we must use explicit assignment statements, as in the following:
struct sembuf s;

s.sem_num = 3;
s.sem_op  = -1;
s.sem_flg = SEM_UNDO;
If we are using C99, then we can employ that language’s new syntax for structure initializers to write an equivalent initialization:
struct sembuf s = { .sem_num = 3, .sem_op = -1, .sem_flg = SEM_UNDO };
Considerations about the order of the members of standard structures also apply if we want to write the contents of a standard structure to a file. To do this portably, we can’t simply do a binary write of the structure. Instead, the structure fields must be written individually (probably in text form) in a specified order.
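One possible way of doing this is sketched below: each standard field is written as text, in an order that the application itself fixes, and read back the same way. The function names and the space-separated text format are our own assumptions for the purposes of illustration, not part of any standard.

```c
#define _XOPEN_SOURCE 600       /* Expose the System V semaphore API */

#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/*
 * Write the three standard fields as text, in an order that we
 * choose, rather than relying on the implementation's in-memory
 * layout. Returns the length that snprintf() reports.
 */
static int
sembuf_to_text(char *buf, size_t size, const struct sembuf *s)
{
    return snprintf(buf, size, "%u %d %d", (unsigned int) s->sem_num,
                    (int) s->sem_op, (int) s->sem_flg);
}

/* Parse the same text form back; returns 0 on success, -1 on error */
static int
sembuf_from_text(const char *buf, struct sembuf *s)
{
    unsigned int num;
    int op, flg;

    if (sscanf(buf, "%u %d %d", &num, &op, &flg) != 3)
        return -1;

    /* Assign by field name, so field order and extra fields don't matter */
    s->sem_num = (unsigned short) num;
    s->sem_op = (short) op;
    s->sem_flg = (short) flg;
    return 0;
}
```

Because both functions refer to the fields only by name, the text form stays the same even on an implementation that orders the fields differently or adds fields of its own.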
In some cases, a macro may not be defined on all UNIX implementations. For example, the WCOREDUMP() macro (which checks whether a child process produced a core dump file) is widely available, but it is not specified in SUSv3. Therefore, this macro might not be present on some UNIX implementations. In order to portably handle such possibilities, we can use the C preprocessor #ifdef directive, as in the following example:
#ifdef WCOREDUMP
    /* Use WCOREDUMP() macro */
#endif
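Filled out into a complete function, the technique might look like the following sketch. The function name child_dumped_core() and its three-way return convention are assumptions of ours; the status argument is a wait status of the kind returned by wait() or waitpid().

```c
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Given a wait status, report whether the child dumped core:
 *    1  child was killed by a signal and dumped core
 *    0  child did not dump core
 *   -1  child was killed by a signal, but WCOREDUMP() is unavailable
 */
static int
child_dumped_core(int status)
{
    if (!WIFSIGNALED(status))
        return 0;               /* Only signal deaths can dump core */

#ifdef WCOREDUMP
    return WCOREDUMP(status) ? 1 : 0;
#else
    return -1;                  /* No portable way to tell */
#endif
}
```

Returning a distinct value when the macro is missing lets the caller print something like "core dump status unknown" instead of silently reporting that no core dump occurred.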
In some cases, the header files required to prototype various system calls and library functions vary across UNIX implementations. In this book, we show the requirements on Linux and note any variations from SUSv3.
Some of the function synopses in this book show a particular header file with the accompanying comment /* For portability */. This indicates that the header file is not required on Linux or by SUSv3, but because some other (especially older) implementations may require it, we should include it in portable programs.
For many of the functions that it specified, POSIX.1-1990 required that the header <sys/types.h> be included before any other headers associated with the function. However, this requirement was redundant, because most contemporary UNIX implementations did not require applications to include this header for these functions. Consequently, SUSv1 removed this requirement. Nevertheless, when writing portable programs, it is wise to make this one of the first header files included. (However, we omit this header from our example programs because it is not required on Linux and omitting it allows us to make the example programs one line shorter.)