Thread-Specific Data

The most efficient way of making a function thread-safe is to make it reentrant. All new library functions should be implemented in this way. However, for an existing nonreentrant library function (one that was perhaps designed before the use of threads became common), this approach usually requires changing the function’s interface, which means modifying all of the programs that use the function.

Thread-specific data is a technique for making an existing function thread-safe without changing its interface. A function that uses thread-specific data may be slightly less efficient than a reentrant function, but allows us to leave the programs that call the function unchanged.

Thread-specific data allows a function to maintain a separate copy of a variable for each thread that calls the function, as illustrated in Figure 31-1. Thread-specific data is persistent; each thread’s variable continues to exist between the thread’s invocations of the function. This allows the function to maintain per-thread information between calls to the function, and allows the function to pass distinct result buffers (if required) to each calling thread.

Figure 31-1. Thread-specific data (TSD) provides per-thread storage for a function

Thread-Specific Data from the Library Function’s Perspective

In order to understand the use of the thread-specific data API, we need to consider things from the point of view of a library function that uses thread-specific data:

The function must allocate a separate block of storage for each thread that calls the function. This block needs to be allocated once, the first time the thread calls the function.
On each subsequent call from the same thread, the function needs to be able to obtain the address of the storage block that was allocated the first time this thread called the function. The function can’t maintain a pointer to the block in an automatic variable, since automatic variables disappear when the function returns; nor can it store the pointer in a static variable, since only one instance of each static variable exists in the process. The Pthreads API provides functions to handle this task.
Different (i.e., independent) functions may each need thread-specific data. Each function needs a method of identifying its thread-specific data (a key), as distinct from the thread-specific data used by other functions.
The function has no direct control over what happens when the thread terminates. When the thread terminates, it is probably executing code outside the function. Nevertheless, there must be some mechanism (a destructor) to ensure that the storage block allocated for this thread is automatically deallocated when the thread terminates. If this is not done, then a memory leak could occur as threads are continuously created, call the function, and then terminate.

Overview of the Thread-Specific Data API

The general steps that a library function performs in order to use thread-specific data are as follows:

The function creates a key, which is the means of differentiating the thread-specific data item used by this function from the thread-specific data items used by other functions. The key is created by calling the pthread_key_create() function. Creating a key needs to be done only once, when the first thread calls the function. For this purpose, pthread_once() is employed. Creating a key doesn’t allocate any blocks of thread-specific data.
The call to pthread_key_create() serves a second purpose: it allows the caller to specify the address of the programmer-defined destructor function that is used to deallocate each of the storage blocks allocated for this key (see the next step). When a thread that has thread-specific data terminates, the Pthreads API automatically invokes the destructor, passing it a pointer to the data block for this thread.
The function allocates a thread-specific data block for each thread from which it is called. This is done using malloc() (or a similar function). This allocation is done once for each thread, the first time the thread calls the function.
In order to save a pointer to the storage allocated in the previous step, the function employs two Pthreads functions: pthread_setspecific() and pthread_getspecific(). A call to pthread_setspecific() is a request to the Pthreads implementation to say “save this pointer, recording the fact that it is associated with a particular key (the one for this function) and a particular thread (the calling thread).” Calling pthread_getspecific() performs the complementary task, returning the pointer previously associated with a given key for the calling thread. If no pointer was previously associated with a particular key and thread, then pthread_getspecific() returns NULL. This is how a function can determine that it is being called for the first time by this thread, and thus must allocate the storage block for the thread.
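The steps above can be sketched in a few lines of C. This is a minimal illustration rather than code from the text: the function callCount(), the key counterKey, and the helper countInNewThread() are hypothetical names chosen for this sketch, and errors are handled simply by calling abort().

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_once_t counterOnce = PTHREAD_ONCE_INIT;
static pthread_key_t counterKey;        /* Hypothetical key for this sketch */

static void                             /* Destructor: frees one thread's block */
freeCounter(void *p)
{
    free(p);
}

static void                             /* Create key and register destructor */
createCounterKey(void)
{
    if (pthread_key_create(&counterKey, freeCounter) != 0)
        abort();
}

/* Hypothetical library function: returns the number of times the
   calling thread has called it (1 on the first call, 2 on the next) */
int
callCount(void)
{
    int *count;

    if (pthread_once(&counterOnce, createCounterKey) != 0)
        abort();

    count = pthread_getspecific(counterKey);
    if (count == NULL) {                /* First call from this thread */
        count = malloc(sizeof(*count)); /* Allocate this thread's block */
        if (count == NULL)
            abort();
        *count = 0;
        if (pthread_setspecific(counterKey, count) != 0)
            abort();
    }

    return ++(*count);
}

static void *                           /* Thread start function for the helper */
countTwice(void *arg)
{
    int *result = arg;

    callCount();
    *result = callCount();
    return NULL;
}

/* Helper: call callCount() twice in a new thread; return the second result */
int
countInNewThread(void)
{
    pthread_t t;
    int result = -1;

    if (pthread_create(&t, NULL, countTwice, &result) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return result;
}
```

Because each thread has its own counter, a call to countInNewThread() should yield 2 regardless of how many times the calling thread has already invoked callCount().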

Details of the Thread-Specific Data API

In this section, we provide details of each of the functions mentioned in the previous section, and elucidate the operation of thread-specific data by describing how it is typically implemented. The next section shows how to use thread-specific data to write a thread-safe implementation of the standard C library function strerror().

Calling pthread_key_create() creates a new thread-specific data key that is returned to the caller in the buffer pointed to by key.

#include <pthread.h>

int pthread_key_create(pthread_key_t *key, void (*destructor)(void *));

Note

Returns 0 on success, or a positive error number on error

Because the returned key is used by all threads in the process, key should point to a global variable.

The destructor argument points to a programmer-defined function of the following form:

void
dest(void *value)
{
    /* Release storage pointed to by 'value' */
}

Upon termination of a thread that has a non-NULL value associated with key, the destructor function is automatically invoked by the Pthreads API and given that value as its argument. The passed value is normally a pointer to this thread’s thread-specific data block for this key. If a destructor is not required, then destructor can be specified as NULL.

Note

If a thread has multiple thread-specific data blocks, then the order in which the destructors are called is unspecified. Destructor functions should be designed to operate independently of one another.
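The following sketch shows the destructor being invoked only for a thread that has a non-NULL value associated with the key. The names demoKey, countingDestructor(), and destructorCallsFor() are hypothetical, and the sketch assumes that only one such test thread runs at a time, since the call counter is an ordinary static variable that is read only after pthread_join() has returned.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_once_t demoOnce = PTHREAD_ONCE_INIT;
static pthread_key_t demoKey;           /* Hypothetical key for this sketch */
static int destructorCalls;             /* Read only after pthread_join() */

static void                             /* Frees the block and counts the call */
countingDestructor(void *value)
{
    free(value);
    destructorCalls++;
}

static void
createDemoKey(void)
{
    if (pthread_key_create(&demoKey, countingDestructor) != 0)
        abort();
}

static void *
worker(void *arg)
{
    int storeValue = *(int *) arg;

    if (storeValue) {                   /* Associate a block with demoKey */
        void *block = malloc(16);
        if (block == NULL || pthread_setspecific(demoKey, block) != 0)
            abort();
    }

    return NULL;                        /* Thread terminates here */
}

/* Run one thread; return the number of destructor calls it triggered */
int
destructorCallsFor(int storeValue)
{
    pthread_t t;

    if (pthread_once(&demoOnce, createDemoKey) != 0)
        return -1;

    destructorCalls = 0;
    if (pthread_create(&t, NULL, worker, &storeValue) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;

    return destructorCalls;
}
```

Under these assumptions, destructorCallsFor(1) should return 1, while destructorCallsFor(0) should return 0, because the second thread never stores a value for the key.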

Looking at the implementation of thread-specific data helps us to understand how it is used. A typical implementation (NPTL is typical) involves the following arrays:

a single global (i.e., process-wide) array of information about thread-specific data keys; and

a set of per-thread arrays, each containing pointers to all of the thread-specific data blocks allocated for a particular thread (i.e., this array contains the pointers stored by calls to pthread_setspecific()).

In this implementation, the pthread_key_t value returned by pthread_key_create() is simply an index into the global array, which we label pthread_keys, whose form is shown in Figure 31-2. Each element of this array is a structure containing two fields. The first field indicates whether this array element is in use (i.e., has been allocated by a previous call to pthread_key_create()). The second field is used to store the pointer to the destructor function for the thread-specific data blocks for this key (i.e., it is a copy of the destructor argument to pthread_key_create()).

Figure 31-2. Implementation of thread-specific data keys

The pthread_setspecific() function requests the Pthreads API to save a copy of value in a data structure that associates it with the calling thread and with key, a key returned by a previous call to pthread_key_create(). The pthread_getspecific() function performs the converse operation, returning the value that was previously associated with the given key for this thread.

#include <pthread.h>

int pthread_setspecific(pthread_key_t key, const void *value);

Note

Returns 0 on success, or a positive error number on error

void *pthread_getspecific(pthread_key_t key);

Note

Returns pointer, or NULL if no thread-specific data is associated with key

The value argument given to pthread_setspecific() is normally a pointer to a block of memory that has previously been allocated by the caller. This pointer will be passed as the argument for the destructor function for this key when the thread terminates.

Note

The value argument doesn’t need to be a pointer to a block of memory. It could be some scalar value that can be assigned (with a cast) to void *. In this case, the earlier call to pthread_key_create() would specify destructor as NULL.
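As a sketch of the scalar case, the functions below store a small integer tag directly in the thread-specific slot. The names tagKey, setThreadTag(), and getThreadTag() are hypothetical. The sketch assumes that an integer of type intptr_t survives a round trip through void *, which holds on common platforms but is implementation-defined in C.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

static pthread_once_t tagOnce = PTHREAD_ONCE_INIT;
static pthread_key_t tagKey;            /* Hypothetical key for this sketch */

static void
createTagKey(void)
{
    /* No destructor: the stored value is not a pointer to allocated memory */
    if (pthread_key_create(&tagKey, NULL) != 0)
        abort();
}

/* Store an integer tag for the calling thread; returns 0 on success */
int
setThreadTag(intptr_t tag)
{
    if (pthread_once(&tagOnce, createTagKey) != 0)
        abort();
    return pthread_setspecific(tagKey, (void *) tag);
}

/* Return the calling thread's tag, or 0 if none has been set */
intptr_t
getThreadTag(void)
{
    if (pthread_once(&tagOnce, createTagKey) != 0)
        abort();
    return (intptr_t) pthread_getspecific(tagKey);
}
```

One consequence of this design is that a tag of 0 cannot be distinguished from "no value set", since both appear as NULL to pthread_getspecific().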

Figure 31-3 shows a typical implementation of the data structure used to store value. In this diagram, we assume that pthread_keys[1] was allocated to a function named myfunc(). For each thread, the Pthreads API maintains an array of pointers to thread-specific data blocks. The elements of each of these thread-specific arrays have a one-to-one correspondence with the elements of the global pthread_keys array shown in Figure 31-2. The pthread_setspecific() function sets the element corresponding to key in the array for the calling thread.

Figure 31-3. Data structure used to implement thread-specific data (TSD) pointers
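The arrays shown in Figure 31-2 and Figure 31-3 can be modeled with a small amount of C. This is a toy model for illustration only, not the NPTL source: all names beginning with toy are hypothetical, the key limit is arbitrary, an explicit struct toyThread stands in for the per-thread state, and no locking is performed.

```c
#include <stdlib.h>                     /* NULL */

#define TOY_KEYS_MAX 8                  /* Arbitrary limit for this sketch */

struct toyKeyInfo {                     /* One element of the global array */
    int inUse;                          /* Has this slot been allocated? */
    void (*destructor)(void *);         /* Copy of the destructor argument */
};

struct toyThread {                      /* Stand-in for per-thread state */
    void *tsd[TOY_KEYS_MAX];            /* One pointer per key; NULL initially */
};

static struct toyKeyInfo toyKeys[TOY_KEYS_MAX];     /* Process-wide array */

/* Model of pthread_key_create(): the key is an index into toyKeys.
   Returns 0 on success, or -1 if no slot is free. */
int
toyKeyCreate(int *key, void (*destructor)(void *))
{
    for (int i = 0; i < TOY_KEYS_MAX; i++) {
        if (!toyKeys[i].inUse) {
            toyKeys[i].inUse = 1;
            toyKeys[i].destructor = destructor;
            *key = i;
            return 0;
        }
    }
    return -1;
}

/* Model of pthread_setspecific(): set the slot for 'key' in one thread */
void
toySetSpecific(struct toyThread *thr, int key, void *value)
{
    thr->tsd[key] = value;
}

/* Model of pthread_getspecific(): fetch the slot for 'key' in one thread */
void *
toyGetSpecific(const struct toyThread *thr, int key)
{
    return thr->tsd[key];
}

/* Model of thread termination: call the destructor for each non-NULL value */
void
toyThreadExit(struct toyThread *thr)
{
    for (int i = 0; i < TOY_KEYS_MAX; i++) {
        void *value = thr->tsd[i];

        if (toyKeys[i].inUse && value != NULL &&
                toyKeys[i].destructor != NULL) {
            thr->tsd[i] = NULL;
            toyKeys[i].destructor(value);
        }
    }
}
```

In this model, a value set through one struct toyThread is invisible through another, which mirrors the one-array-per-thread layout in the figure.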

When a thread is first created, all of its thread-specific data pointers are initialized to NULL. This means that when our library function is called by a thread for the first time, it must begin by using pthread_getspecific() to check whether the thread already has an associated value for key. If it does not, then the function allocates a block of memory and saves a pointer to the block using pthread_setspecific(). We show an example of this in the thread-safe strerror() implementation presented in the next section.

Employing the Thread-Specific Data API

When we first described the standard strerror() function in Handling Errors from System Calls and Library Functions, we noted that it may return a pointer to a statically allocated string as its function result. This means that strerror() may not be thread-safe. In the next few pages, we look at a non-thread-safe implementation of strerror(), and then show how thread-specific data can be used to make this function thread-safe.

Note

On many UNIX implementations, including Linux, the strerror() function provided by the standard C library is thread-safe. However, we use the example of strerror() anyway, because SUSv3 doesn’t require this function to be thread-safe, and its implementation provides a simple example of the use of thread-specific data.

Example 31-1 shows a simple non-thread-safe implementation of strerror(). This function makes use of a pair of global variables defined by glibc: _sys_errlist is an array of pointers to strings corresponding to the error numbers in errno (thus, for example, _sys_errlist[EINVAL] points to the string Invalid argument), and _sys_nerr specifies the number of elements in _sys_errlist. (Programs newly linked against glibc 2.32 or later can no longer reference these symbols, so building this example requires an older glibc.)

Example 31-1. An implementation of strerror() that is not thread-safe

threads/strerror.c

#define _GNU_SOURCE                 /* Get '_sys_nerr' and '_sys_errlist'
                                       declarations from <stdio.h> */

#include <stdio.h>
#include <string.h>           /* Get declaration of strerror() */

#define MAX_ERROR_LEN 256            /* Maximum length of string
                                        returned by strerror() */

static char buf[MAX_ERROR_LEN];     /* Statically allocated return buffer */

char *
strerror(int err)
{
    if (err < 0 || err >= _sys_nerr || _sys_errlist[err] == NULL) {
        snprintf(buf, MAX_ERROR_LEN, "Unknown error %d", err);
    } else {
        strncpy(buf, _sys_errlist[err], MAX_ERROR_LEN - 1);
        buf[MAX_ERROR_LEN - 1] = '\0';          /* Ensure null termination */
    }

    return buf;
}

      threads/strerror.c

We can use the program in Example 31-2 to demonstrate the consequences of the fact that the strerror() implementation in Example 31-1 is not thread-safe. This program calls strerror() from two different threads, but displays the returned value only after both threads have called strerror(). Even though each thread specifies a different value (EINVAL and EPERM) as the argument to strerror(), this is what we see when we compile and link this program with the version of strerror() shown in Example 31-1:

$ ./strerror_test
Main thread has called strerror()
Other thread about to call strerror()
Other thread: str (0x804a7c0) = Operation not permitted
Main thread:  str (0x804a7c0) = Operation not permitted

Both threads displayed the errno string corresponding to EPERM, because the call to strerror() by the second thread (in threadFunc()) overwrote the buffer that was written by the call to strerror() in the main thread. Inspection of the output shows that the local variable str in the two threads points to the same memory address.

Example 31-2. Calling strerror() from two different threads

threads/strerror_test.c
#include <stdio.h>
#include <string.h>                 /* Get declaration of strerror() */
#include <pthread.h>
#include "tlpi_hdr.h"

static void *
threadFunc(void *arg)
{
    char *str;

    printf("Other thread about to call strerror()\n");
    str = strerror(EPERM);
    printf("Other thread: str (%p) = %s\n", str, str);

    return NULL;
}

int
main(int argc, char *argv[])
{
    pthread_t t;
    int s;
    char *str;

    str = strerror(EINVAL);
    printf("Main thread has called strerror()\n");

    s = pthread_create(&t, NULL, threadFunc, NULL);
    if (s != 0)
        errExitEN(s, "pthread_create");

    s = pthread_join(t, NULL);
    if (s != 0)
        errExitEN(s, "pthread_join");

    printf("Main thread:  str (%p) = %s\n", str, str);

    exit(EXIT_SUCCESS);
}

      threads/strerror_test.c

Example 31-3 shows a reimplementation of strerror() that uses thread-specific data to ensure thread safety.

The first step performed by the revised strerror() is to call pthread_once() to ensure that the first invocation of this function (from any thread) calls createKey(). The createKey() function calls pthread_key_create() to allocate a thread-specific data key that is stored in the global variable strerrorKey. The call to pthread_key_create() also records the address of the destructor that will be used to free the thread-specific buffers corresponding to this key.

The strerror() function then calls pthread_getspecific() to retrieve the address of this thread’s unique buffer corresponding to strerrorKey. If pthread_getspecific() returns NULL, then this thread is calling strerror() for the first time, and so the function allocates a new buffer using malloc(), and saves the address of the buffer using pthread_setspecific(). If the pthread_getspecific() call returns a non-NULL value, then that pointer refers to an existing buffer that was allocated when this thread previously called strerror().

The remainder of this strerror() implementation is similar to the implementation that we showed earlier, with the difference that buf is the address of a thread-specific data buffer, rather than a static variable.

Example 31-3. A thread-safe implementation of strerror() using thread-specific data

threads/strerror_tsd.c
#define _GNU_SOURCE                 /* Get '_sys_nerr' and '_sys_errlist'
                                       declarations from <stdio.h> */
#include <stdio.h>
#include <string.h>                 /* Get declaration of strerror() */
#include <pthread.h>
#include "tlpi_hdr.h"

static pthread_once_t once = PTHREAD_ONCE_INIT;
static pthread_key_t strerrorKey;

#define MAX_ERROR_LEN 256           /* Maximum length of string in per-thread
                                       buffer returned by strerror() */

static void                         /* Free thread-specific data buffer */
destructor(void *buf)
{
    free(buf);
}

static void                         /* One-time key creation function */
createKey(void)
{
    int s;

    /* Allocate a unique thread-specific data key and save the address
       of the destructor for thread-specific data buffers */

    s = pthread_key_create(&strerrorKey, destructor);
    if (s != 0)
        errExitEN(s, "pthread_key_create");
}

char *
strerror(int err)
{
    int s;
    char *buf;

    /* Make first caller allocate key for thread-specific data */

    s = pthread_once(&once, createKey);
    if (s != 0)
        errExitEN(s, "pthread_once");

    buf = pthread_getspecific(strerrorKey);
    if (buf == NULL) {          /* If first call from this thread, allocate
                                   buffer for thread, and save its location */
        buf = malloc(MAX_ERROR_LEN);
        if (buf == NULL)
            errExit("malloc");

        s = pthread_setspecific(strerrorKey, buf);
        if (s != 0)
            errExitEN(s, "pthread_setspecific");
    }

    if (err < 0 || err >= _sys_nerr || _sys_errlist[err] == NULL) {
        snprintf(buf, MAX_ERROR_LEN, "Unknown error %d", err);
    } else {
        strncpy(buf, _sys_errlist[err], MAX_ERROR_LEN - 1);
        buf[MAX_ERROR_LEN - 1] = '\0';          /* Ensure null termination */
    }

    return buf;
}

        threads/strerror_tsd.c

If we compile and link our test program (Example 31-2) with the new version of strerror() (Example 31-3) to create an executable file, strerror_test_tsd, then we see the following results when running the program:

$ ./strerror_test_tsd
Main thread has called strerror()
Other thread about to call strerror()
Other thread: str (0x804b158) = Operation not permitted
Main thread:  str (0x804b008) = Invalid argument

From this output, we see that the new version of strerror() is thread-safe. We also see that the address pointed to by the local variable str in the two threads is different.

Thread-Specific Data Implementation Limits

As implied by our description of how thread-specific data is typically implemented, an implementation may need to impose limits on the number of thread-specific data keys that it supports. SUSv3 requires that an implementation support at least 128 (_POSIX_THREAD_KEYS_MAX) keys. An application can determine how many keys an implementation actually supports either via the definition of PTHREAD_KEYS_MAX (defined in <limits.h>) or by calling sysconf(_SC_THREAD_KEYS_MAX). Linux supports up to 1024 keys.
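A program could query the limit at run time along the following lines. This is a sketch: the function name maxThreadKeys() is hypothetical, and the fallback logic assumes that a return of -1 from sysconf() means the limit could not be determined, in which case the compile-time constant (if defined) or the SUSv3 minimum of 128 is used instead.

```c
#define _POSIX_C_SOURCE 200809L         /* Expose POSIX limits in <limits.h> */
#include <limits.h>
#include <unistd.h>

/* Hypothetical helper: best available estimate of the number of
   thread-specific data keys supported by this implementation */
long
maxThreadKeys(void)
{
    long lim = sysconf(_SC_THREAD_KEYS_MAX);

    if (lim != -1)
        return lim;                     /* Run-time value reported by system */

#ifdef PTHREAD_KEYS_MAX
    return PTHREAD_KEYS_MAX;            /* Compile-time limit, if defined */
#else
    return 128;                         /* SUSv3 minimum (_POSIX_THREAD_KEYS_MAX) */
#endif
}
```

Whichever branch is taken, the result on a conforming system should be at least 128.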

Even 128 keys should be more than sufficient for most applications. This is because each library function should employ only a small number of keys—often just one. If a function requires multiple thread-specific data values, these can usually be placed in a single structure that has just one associated thread-specific data key.
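The single-structure approach might look like the following sketch, in which one key covers both a per-thread result buffer and a per-thread call counter. The structure labelState, the key labelKey, and the function nextLabel() are hypothetical names invented for this illustration.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct labelState {                     /* All per-thread values in one block */
    char buf[64];                       /* Per-thread result buffer */
    int calls;                          /* Per-thread call counter */
};

static pthread_once_t labelOnce = PTHREAD_ONCE_INIT;
static pthread_key_t labelKey;          /* The only key this function needs */

static void
createLabelKey(void)
{
    /* free() matches the destructor signature, so it can be used directly */
    if (pthread_key_create(&labelKey, free) != 0)
        abort();
}

/* Hypothetical function: returns "prefix-N", where N counts the calls
   made by the calling thread; the string lives in per-thread storage */
char *
nextLabel(const char *prefix)
{
    struct labelState *st;

    if (pthread_once(&labelOnce, createLabelKey) != 0)
        abort();

    st = pthread_getspecific(labelKey);
    if (st == NULL) {                   /* First call from this thread */
        st = calloc(1, sizeof(*st));    /* Zero-filled, so calls starts at 0 */
        if (st == NULL)
            abort();
        if (pthread_setspecific(labelKey, st) != 0)
            abort();
    }

    st->calls++;
    snprintf(st->buf, sizeof(st->buf), "%s-%d", prefix, st->calls);
    return st->buf;
}
```

Adding a further per-thread value to this function would mean adding a field to struct labelState, not creating another key.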