Portability Issues

In this section, we consider the topic of writing portable system programs. We introduce feature test macros and the standard system data types defined by SUSv3, and then look at some other portability issues.

Various standards govern the behavior of the system call and library function APIs (see Standardization). Some of these standards are defined by standards bodies such as The Open Group (Single UNIX Specification), while others are defined by the two historically important UNIX implementations: BSD and System V Release 4 (and the associated System V Interface Definition).

Sometimes, when writing a portable application, we may want the various header files to expose only the definitions (constants, function prototypes, and so on) that follow a particular standard. To do this, we define one or more of the feature test macros listed below when compiling a program. One way that we can do this is by defining the macro in the program source code before including any header files:

#define _BSD_SOURCE 1

Alternatively, we can use the -D option to the C compiler:

$ cc -D_BSD_SOURCE prog.c
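As a minimal sketch of the first approach, the fragment below places the definition ahead of every #include, which is where the headers expect to find it. The choice of _XOPEN_SOURCE with the value 600 is only for illustration, and the helper names requested_xopen_level() and print_requested_level() are hypothetical.

```c
/* A feature test macro must be defined before the first #include,
   because the headers consult it when deciding what to expose. */
#define _XOPEN_SOURCE 600

#include <stdio.h>

/* Hypothetical helper: report the level this source file asked for. */
long requested_xopen_level(void)
{
    return _XOPEN_SOURCE;
}

/* Hypothetical helper: print the requested level to the given stream.
   Returns the number of characters written, or a negative value on error. */
int print_requested_level(FILE *out)
{
    return fprintf(out, "_XOPEN_SOURCE = %ld\n", requested_xopen_level());
}
```

Defining the macro after a header has already been included would typically have no effect on what that header exposed, which is why the definition comes first.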

The following feature test macros are specified by the relevant standards, and consequently their usage is portable to all systems that support these standards:

_POSIX_SOURCE

If defined (with any value), expose definitions conforming to POSIX.1-1990 and ISO C (1990). This macro is superseded by _POSIX_C_SOURCE.

_POSIX_C_SOURCE

If defined with the value 1, this has the same effect as _POSIX_SOURCE. If defined with a value greater than or equal to 199309, also expose definitions for POSIX.1b (realtime). If defined with a value greater than or equal to 199506, also expose definitions for POSIX.1c (threads). If defined with the value 200112, also expose definitions for the POSIX.1-2001 base specification (i.e., the XSI extension is excluded). (Prior to version 2.3.3, the glibc headers don’t interpret the value 200112 for _POSIX_C_SOURCE.) If defined with the value 200809, also expose definitions for the POSIX.1-2008 base specification. (Prior to version 2.10, the glibc headers don’t interpret the value 200809 for _POSIX_C_SOURCE.)

_XOPEN_SOURCE

If defined (with any value), expose POSIX.1, POSIX.2, and X/Open (XPG4) definitions. If defined with the value 500 or greater, also expose SUSv2 (UNIX 98 and XPG5) extensions. Setting to 600 or greater additionally exposes SUSv3 XSI (UNIX 03) extensions and C99 extensions. (Prior to version 2.2, the glibc headers don’t interpret the value 600 for _XOPEN_SOURCE.) Setting to 700 or greater also exposes SUSv4 XSI extensions. (Prior to version 2.10, the glibc headers don’t interpret the value 700 for _XOPEN_SOURCE.) The values 500, 600, and 700 for _XOPEN_SOURCE were chosen because SUSv2, SUSv3, and SUSv4 are Issues 5, 6, and 7, respectively, of the X/Open specifications.
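The thresholds given above for _POSIX_C_SOURCE can be summarized in code. The following is a sketch only: the enum and both function names are hypothetical, and values between 1 and 199309 are simply treated as selecting POSIX.1-1990, which is a simplification.

```c
#include <stdio.h>   /* on glibc, including a libc header applies the default macro settings */

/* Hypothetical labels for the levels described in the text. */
enum posix_level {
    POSIX_NONE,     /* macro undefined or not positive */
    POSIX_1990,     /* POSIX.1-1990 */
    POSIX_1B,       /* adds POSIX.1b (realtime) */
    POSIX_1C,       /* adds POSIX.1c (threads) */
    POSIX_2001,     /* adds POSIX.1-2001 base specification */
    POSIX_2008      /* adds POSIX.1-2008 base specification */
};

/* Map a _POSIX_C_SOURCE value to the newest level it selects. */
enum posix_level posix_level_for(long value)
{
    if (value >= 200809L) return POSIX_2008;
    if (value >= 200112L) return POSIX_2001;
    if (value >= 199506L) return POSIX_1C;
    if (value >= 199309L) return POSIX_1B;
    if (value >= 1L)      return POSIX_1990;
    return POSIX_NONE;
}

/* Level in effect for this translation unit. */
enum posix_level current_posix_level(void)
{
#ifdef _POSIX_C_SOURCE
    return posix_level_for(_POSIX_C_SOURCE);
#else
    return POSIX_NONE;
#endif
}
```

Because each higher value also exposes everything selected by the lower ones, the checks run from newest to oldest and return on the first match.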

The feature test macros _BSD_SOURCE, _SVID_SOURCE, and _GNU_SOURCE are glibc-specific.

When the GNU C compiler is invoked without special options, _POSIX_SOURCE, _POSIX_C_SOURCE=200809 (200112 with glibc versions 2.4 to 2.9, or 199506 with glibc versions earlier than 2.4), _BSD_SOURCE, and _SVID_SOURCE are defined by default.

If individual macros are defined, or the compiler is invoked in one of its standard modes (e.g., cc -ansi or cc -std=c99), then only the requested definitions are supplied. There is one exception: if _POSIX_C_SOURCE is not otherwise defined, and the compiler is not invoked in one of its standard modes, then _POSIX_C_SOURCE is defined with the value 200809 (200112 with glibc versions 2.4 to 2.9, or 199506 with glibc versions earlier than 2.4).

Defining multiple macros is additive, so that we could, for example, use the following cc command to explicitly select the same macro settings as are provided by default:

$ cc -D_POSIX_SOURCE -D_POSIX_C_SOURCE=200809 \
        -D_BSD_SOURCE -D_SVID_SOURCE prog.c

The <features.h> header file and the feature_test_macros(7) manual page provide further information on precisely what values are assigned to each of the feature test macros.

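Various implementation data types, such as process IDs, user IDs, and file offsets, are represented using standard C types. Although it would be possible to use the C fundamental types such as int and long to declare variables storing such information, this reduces portability across UNIX systems, because the sizes of these fundamental types vary across implementations, and the type used to represent a given kind of information may differ from one implementation, or from one release of an implementation, to another.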

To avoid such portability problems, SUSv3 specifies various standard system data types, and requires an implementation to define and use these types appropriately. Each of these types is defined using the C typedef feature. For example, the pid_t data type is intended for representing process IDs, and on Linux/x86-32 this type is defined as follows:

typedef int pid_t;

Most of the standard system data types have names ending in _t. Many of them are declared in the header file <sys/types.h>, although a few are defined in other header files.

An application should employ these type definitions to portably declare the variables it uses. For example, the following declaration would allow an application to correctly represent process IDs on any SUSv3-conformant system:

pid_t mypid;

Table 3-1 lists some of the system data types we’ll encounter in this book. For certain types in this table, SUSv3 requires that the type be implemented as an arithmetic type. This means that the implementation may choose the underlying type as either an integer or a floating-point (real or complex) type.

Table 3-1. Selected system data types

| Data type | SUSv3 type requirement | Description |
|---|---|---|
| blkcnt_t | signed integer | File block count (Retrieving File Information: stat()) |
| blksize_t | signed integer | File block size (Retrieving File Information: stat()) |
| cc_t | unsigned integer | Terminal special character (Terminal Special Characters) |
| clock_t | integer or real-floating | System time in clock ticks (Process Time) |
| clockid_t | an arithmetic type | Clock identifier for POSIX.1b clock and timer functions (POSIX Interval Timers) |
| comp_t | not in SUSv3 | Compressed clock ticks (Process Accounting) |
| dev_t | an arithmetic type | Device number, consisting of major and minor numbers (Retrieving File Information: stat()) |
| DIR | no type requirement | Directory stream (Reading Directories: opendir() and readdir()) |
| fd_set | structure type | File descriptor set for select() (The select() System Call) |
| fsblkcnt_t | unsigned integer | File-system block count (Obtaining Information About a File System: statvfs()) |
| fsfilcnt_t | unsigned integer | File count (Obtaining Information About a File System: statvfs()) |
| gid_t | integer | Numeric group identifier (The Group File: /etc/group) |
| id_t | integer | A generic type for holding identifiers; large enough to hold at least pid_t, uid_t, and gid_t |
| in_addr_t | 32-bit unsigned integer | IPv4 address (Internet Socket Addresses) |
| in_port_t | 16-bit unsigned integer | IP port number (Internet Socket Addresses) |
| ino_t | unsigned integer | File i-node number (Retrieving File Information: stat()) |
| key_t | an arithmetic type | System V IPC key (IPC Keys) |
| mode_t | integer | File permissions and type (Retrieving File Information: stat()) |
| mqd_t | no type requirement, but shall not be an array type | POSIX message queue descriptor |
| msglen_t | unsigned integer | Number of bytes allowed in System V message queue (Message Queue Associated Data Structure) |
| msgqnum_t | unsigned integer | Counts of messages in System V message queue (Message Queue Associated Data Structure) |
| nfds_t | unsigned integer | Number of file descriptors for poll() (The poll() System Call) |
| nlink_t | integer | Count of (hard) links to a file (Retrieving File Information: stat()) |
| off_t | signed integer | File offset or size (Changing the File Offset: lseek() and Retrieving File Information: stat()) |
| pid_t | signed integer | Process ID, process group ID, or session ID (Process ID and Parent Process ID, Process Groups, and Sessions) |
| ptrdiff_t | signed integer | Difference between two pointer values, as a signed integer |
| rlim_t | unsigned integer | Resource limit (Process Resource Limits) |
| sa_family_t | unsigned integer | Socket address family (Generic Socket Address Structures: struct sockaddr) |
| shmatt_t | unsigned integer | Count of attached processes for a System V shared memory segment (Shared Memory Associated Data Structure) |
| sig_atomic_t | integer | Data type that can be atomically accessed (Global Variables and the sig_atomic_t Data Type) |
| siginfo_t | structure type | Information about the origin of a signal (The SA_SIGINFO Flag) |
| sigset_t | integer or structure type | Signal set (Signal Sets) |
| size_t | unsigned integer | Size of an object in bytes |
| socklen_t | integer type of at least 32 bits | Size of a socket address structure in bytes (Binding a Socket to an Address: bind()) |
| speed_t | unsigned integer | Terminal line speed (Terminal Line Speed (Bit Rate)) |
| ssize_t | signed integer | Byte count or (negative) error indication |
| stack_t | structure type | Description of an alternate signal stack (Handling a Signal on an Alternate Stack: sigaltstack()) |
| suseconds_t | signed integer allowing range [-1, 1000000] | Microsecond time interval (Calendar Time) |
| tcflag_t | unsigned integer | Terminal mode flag bit mask (Retrieving and Modifying Terminal Attributes) |
| time_t | integer or real-floating | Calendar time in seconds since the Epoch (Calendar Time) |
| timer_t | an arithmetic type | Timer identifier for POSIX.1b interval timer functions (POSIX Interval Timers) |
| uid_t | integer | Numeric user identifier (The Password File: /etc/passwd) |

When discussing the data types in Table 3-1 in later chapters, we’ll often make statements that some type “is an integer type [specified by SUSv3].” This means that SUSv3 requires the type to be defined as an integer, but doesn’t require that a particular native integer type (e.g., short, int, or long) be used. (Often, we won’t say which particular native data type is actually used to represent each of the system data types in Linux, because a portable application should be written so that it doesn’t care which data type is used.)
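Where a program does depend on one of the properties that SUSv3 guarantees, the dependency can be stated explicitly rather than assumed. The sketch below checks the signedness of two types from Table 3-1 at compile time. The macro name IS_SIGNED_TYPE is hypothetical, and _Static_assert assumes a C11 compiler (or one that accepts it as an extension).

```c
#include <stddef.h>
#include <sys/types.h>

/* Nonzero if arithmetic type T is signed: converting -1 to an unsigned
   type yields a large positive value, so the comparison is false. */
#define IS_SIGNED_TYPE(T) ((T) -1 < (T) 0)

/* Compile-time checks of two requirements listed in Table 3-1. */
_Static_assert(IS_SIGNED_TYPE(pid_t), "pid_t must be a signed integer type");
_Static_assert(!IS_SIGNED_TYPE(size_t), "size_t must be an unsigned integer type");
```

Note that the checks say nothing about which native type underlies pid_t or size_t; they test only the property the code relies on.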

When printing values of one of the numeric system data types shown in Table 3-1 (e.g., pid_t and uid_t), we must be careful not to include a representation dependency in the printf() call. A representation dependency can occur because C’s argument promotion rules convert values of type short to int, but leave values of type int and long unchanged. This means that, depending on the definition of the system data type, either an int or a long is passed in the printf() call. However, because printf() has no way to determine the types of its arguments at run time, the caller must explicitly provide this information using the %d or %ld format specifier. The problem is that simply coding one of these specifiers within the printf() call creates an implementation dependency. The usual solution is to use the %ld specifier and always cast the corresponding value to long, like so:

pid_t mypid;

mypid = getpid();           /* Returns process ID of calling process */
printf("My PID is %ld\n", (long) mypid);

We make one exception to the above technique. Because the off_t data type is the size of long long in some compilation environments, we cast off_t values to this type and use the %lld specifier, as described in I/O on Large Files.
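Both conventions can be wrapped in small helpers so that the casts appear in one place. This is a sketch under the assumptions stated in the text (long is wide enough for pid_t, and long long is wide enough for off_t); the function names format_pid() and format_offset() are hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>

/* Format a process ID using the %ld-plus-cast convention.
   Returns the value returned by snprintf(). */
int format_pid(char *buf, size_t size, pid_t pid)
{
    return snprintf(buf, size, "%ld", (long) pid);
}

/* Format a file offset using the %lld-plus-cast convention,
   since off_t may be as wide as long long. */
int format_offset(char *buf, size_t size, off_t offset)
{
    return snprintf(buf, size, "%lld", (long long) offset);
}
```

A caller might write format_pid(buf, sizeof(buf), getpid()) and then print buf, without needing to know how pid_t is defined on the system at hand.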

In the remainder of this section, we consider a few other portability issues that we may encounter when writing system programs.

Each UNIX implementation specifies a range of standard structures that are used in various system calls and library functions. As an example, consider the sembuf structure, which is used to represent a semaphore operation to be performed by the semop() system call:

struct sembuf {
    unsigned short sem_num;         /* Semaphore number */
    short          sem_op;          /* Operation to be performed */
    short          sem_flg;         /* Operation flags */
};

Although SUSv3 specifies structures such as sembuf, it is important to realize the following:

  • In general, the order of field definitions within such structures is not specified.

  • In some cases, extra implementation-specific fields may be included in such structures.

Consequently, it is not portable to use a structure initializer such as the following:

struct sembuf s = { 3, -1, SEM_UNDO };

Although this initializer will work on Linux, it won’t work on another implementation where the fields in the sembuf structure are defined in a different order. To portably initialize such structures, we must use explicit assignment statements, as in the following:

struct sembuf s;

s.sem_num = 3;
s.sem_op  = -1;
s.sem_flg = SEM_UNDO;

If we are using C99, then we can employ that language’s new syntax for structure initializers to write an equivalent initialization:

struct sembuf s = { .sem_num = 3, .sem_op = -1, .sem_flg = SEM_UNDO };
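One difference between the two forms is worth noting. A C99 designated initializer sets every member that is not named to zero, whereas a series of assignments leaves any extra implementation-specific fields uninitialized. The following sketch shows one way to get the same effect with assignments, by clearing the structure first. The helper name make_sembuf() is hypothetical, and _XOPEN_SOURCE is defined on the assumption that the System V IPC declarations may otherwise be hidden.

```c
#define _XOPEN_SOURCE 600   /* assumption: ensures the System V IPC declarations are exposed */

#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Build a sembuf without relying on the order of its fields. */
struct sembuf make_sembuf(unsigned short num, short op, short flg)
{
    struct sembuf s;

    memset(&s, 0, sizeof(s));   /* clear any implementation-specific fields */
    s.sem_num = num;
    s.sem_op  = op;
    s.sem_flg = flg;
    return s;
}
```

With this helper, the earlier example becomes struct sembuf s = make_sembuf(3, -1, SEM_UNDO);.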

Considerations about the order of the members of standard structures also apply if we want to write the contents of a standard structure to a file. To do this portably, we can’t simply do a binary write of the structure. Instead, the structure fields must be written individually (probably in text form) in a specified order.
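As an illustration of writing fields individually in text form, the sketch below stores a sembuf as one line of three numbers and reads it back. The function names and the space-separated line format are hypothetical choices made for this example, not a standard format.

```c
#define _XOPEN_SOURCE 600   /* assumption: ensures the System V IPC declarations are exposed */

#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Write the three standard fields as one text line, in a fixed order.
   Returns 0 on success, -1 on error. */
int write_sembuf_text(FILE *fp, const struct sembuf *s)
{
    int n = fprintf(fp, "%hu %hd %hd\n", s->sem_num, s->sem_op, s->sem_flg);

    return (n < 0) ? -1 : 0;
}

/* Read a line produced by write_sembuf_text().
   Returns 0 on success, -1 on error. */
int read_sembuf_text(FILE *fp, struct sembuf *s)
{
    unsigned short num;
    short op, flg;

    if (fscanf(fp, "%hu %hd %hd", &num, &op, &flg) != 3)
        return -1;

    s->sem_num = num;           /* assign by name, never by position */
    s->sem_op  = op;
    s->sem_flg = flg;
    return 0;
}
```

Because the file records values by position in the text line rather than by position in memory, it can be read back on an implementation that orders or pads the structure differently.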