Chapter 7. Memory Allocation

Many system programs need to be able to allocate extra memory for dynamic data structures (e.g., linked lists and binary trees), whose size depends on information that is available only at run time. This chapter describes the functions that are used to allocate memory on the heap or the stack.

A process can allocate memory by increasing the size of the heap, a variable-size segment of contiguous virtual memory that begins just after the uninitialized data segment of a process and grows and shrinks as memory is allocated and freed (see Figure 6-1 in Virtual Memory Management). The current limit of the heap is referred to as the program break.

To allocate memory, C programs normally use the malloc family of functions, which we describe shortly. However, we begin with a description of brk() and sbrk(), upon which the malloc functions are based.

Resizing the heap (i.e., allocating or deallocating memory) is actually as simple as telling the kernel to adjust its idea of where the process’s program break is. Initially, the program break lies just past the end of the uninitialized data segment (i.e., the same location as &end, shown in Figure 6-1).

After the program break is increased, the program may access any address in the newly allocated area, but no physical memory pages are allocated yet. The kernel automatically allocates new physical pages on the first attempt by the process to access addresses in those pages.

Traditionally, the UNIX system has provided two system calls for manipulating the program break, and these are both available on Linux: brk() and sbrk(). Although these system calls are seldom used directly in programs, understanding them helps clarify how memory allocation works.

#define _BSD_SOURCE             /* Or: #define _SVID_SOURCE */
#include <unistd.h>

int brk(void *end_data_segment);

Note

Returns 0 on success, or -1 on error

void *sbrk(intptr_t increment);

Note

Returns previous program break on success, or (void *) -1 on error

The brk() system call sets the program break to the location specified by end_data_segment. Since virtual memory is allocated in units of pages, end_data_segment is effectively rounded up to the next page boundary.

Attempts to set the program break below its initial value (i.e., below &end) are likely to result in unexpected behavior, such as a segmentation fault (the SIGSEGV signal, described in Signal Types and Default Actions) when trying to access data in now nonexistent parts of the initialized or uninitialized data segments. The precise upper limit on where the program break can be set depends on a range of factors, including: the process resource limit for the size of the data segment (RLIMIT_DATA, described in Details of Specific Resource Limits); and the location of memory mappings, shared memory segments, and shared libraries.

A call to sbrk() adjusts the program break by adding increment to it. (On Linux, sbrk() is a library function implemented on top of brk().) The intptr_t type used to declare increment is an integer data type. On success, sbrk() returns the previous address of the program break. In other words, if we have increased the program break, the return value points to the start of the newly allocated block of memory.

The call sbrk(0) returns the current setting of the program break without changing it. This can be useful if we want to track the size of the heap, perhaps in order to monitor the behavior of a memory allocation package.

In general, C programs use the malloc family of functions to allocate and deallocate memory on the heap. These functions offer several advantages over brk() and sbrk(). In particular, they:

The malloc() function allocates size bytes from the heap and returns a pointer to the start of the newly allocated block of memory. The allocated memory is not initialized.

#include <stdlib.h>

void *malloc(size_t size);

Note

Returns pointer to allocated memory on success, or NULL on error

Because malloc() returns void *, we can assign it to any type of C pointer. The block of memory returned by malloc() is always aligned on a byte boundary suitable for any type of C data structure. In practice, this means that it is allocated on an 8-byte or 16-byte boundary on most architectures.

Note

SUSv3 specifies that the call malloc(0) may return either NULL or a pointer to a small piece of memory that can (and should) be freed with free(). On Linux, malloc(0) follows the latter behavior.

If memory could not be allocated (perhaps because we reached the limit to which the program break could be raised), then malloc() returns NULL and sets errno to indicate the error. Although the possibility of failure in allocating memory is small, all calls to malloc(), and the related functions that we describe later, should check for this error return.

The free() function deallocates the block of memory pointed to by its ptr argument, which should be an address previously returned by malloc() or one of the other heap memory allocation functions that we describe later in this chapter.

#include <stdlib.h>

void free(void *ptr);

In general, free() doesn’t lower the program break, but instead adds the block of memory to a list of free blocks that are recycled by future calls to malloc(). This is done for several reasons:

If the argument given to free() is a NULL pointer, then the call does nothing. (In other words, it is not an error to give a NULL pointer to free().)

Making any use of ptr after the call to free()—for example, passing it to free() a second time—is an error that can lead to unpredictable results.

The program in Example 7-1 can be used to illustrate the effect of free() on the program break. This program allocates multiple blocks of memory and then frees some or all of them, depending on its (optional) command-line arguments.

The first two command-line arguments specify the number and size of blocks to allocate. The third command-line argument specifies the loop step unit to be used when freeing memory blocks. If we specify 1 here (which is also the default if this argument is omitted), then the program frees every memory block; if 2, then every second allocated block; and so on. The fourth and fifth command-line arguments specify the range of blocks that we wish to free. If these arguments are omitted, then all allocated blocks (in steps given by the third command-line argument) are freed.

Running the program in Example 7-1 with the following command line causes the program to allocate 1000 blocks of memory and then free every second block:

$ ./free_and_sbrk 1000 10240 2

The output shows that after these blocks have been freed, the program break is left unchanged from the level it reached when all memory blocks were allocated:

Initial program break:           0x804a6bc
Allocating 1000*10240 bytes
Program break is now:            0x8a13000
Freeing blocks from 1 to 1000 in steps of 2
After free(), program break is:  0x8a13000

The following command line specifies that all but the last of the allocated blocks should be freed. Again, the program break remains at its “high-water mark.”

$ ./free_and_sbrk 1000 10240 1 1 999

Initial program break:           0x804a6bc
Allocating 1000*10240 bytes
Program break is now:            0x8a13000
Freeing blocks from 1 to 999 in steps of 1
After free(), program break is:  0x8a13000

If, however, we free a complete set of blocks at the top end of the heap, we see that the program break decreases from its peak value, indicating that free() has used sbrk() to lower the program break. Here, we free the last 500 blocks of allocated memory:

$ ./free_and_sbrk 1000 10240 1 500 1000

Initial program break:           0x804a6bc
Allocating 1000*10240 bytes
Program break is now:            0x8a13000
Freeing blocks from 500 to 1000 in steps of 1
After free(), program break is:  0x852b000

In this case, the (glibc) free() function is able to recognize that an entire region at the top end of the heap is free, since, when releasing blocks, it coalesces neighboring free blocks into a single larger block. (Such coalescing is done to avoid having a large number of small fragments on the free list, all of which may be too small to satisfy subsequent malloc() requests.)

Although malloc() and free() provide an interface for allocating memory that is much easier to use than brk() and sbrk(), it is still possible to make various programming errors when using them. Understanding how malloc() and free() are implemented provides us with insights into the causes of these errors and how we can avoid them.

The implementation of malloc() is straightforward. It first scans the list of memory blocks previously released by free() in order to find one whose size is larger than or equal to its requirements. (Different strategies may be employed for this scan, depending on the implementation; for example, first-fit or best-fit.) If the block is exactly the right size, then it is returned to the caller. If it is larger, then it is split, so that a block of the correct size is returned to the caller and a smaller free block is left on the free list.

If no block on the free list is large enough, then malloc() calls sbrk() to allocate more memory. To reduce the number of calls to sbrk(), rather than allocating exactly the number of bytes required, malloc() increases the program break in larger units (some multiple of the virtual memory page size), putting the excess memory onto the free list.

Looking at the implementation of free(), things start to become more interesting. When free() places a block of memory onto the free list, how does it know what size that block is? This is done via a trick. When malloc() allocates the block, it allocates extra bytes to hold an integer containing the size of the block. This integer is located at the beginning of the block; the address actually returned to the caller points to the location just past this length value, as shown in Figure 7-1.

When a block is placed on the (doubly linked) free list, free() uses the bytes of the block itself in order to add the block to the list, as shown in Figure 7-2.

As blocks are deallocated and reallocated over time, the blocks of the free list will become intermingled with blocks of allocated, in-use memory, as shown in Figure 7-3.

Now consider the fact that C allows us to create pointers to any location in the heap, and modify the locations they point to, including the length, previous free block, and next free block pointers maintained by free() and malloc(). Add this to the preceding description, and we have a fairly combustible mix when it comes to creating obscure programming bugs. For example, if, via a misdirected pointer, we accidentally increase one of the length values preceding an allocated block of memory, and subsequently deallocate that block, then free() will record the wrong size block of memory on the free list. Subsequently, malloc() may reallocate this block, leading to a scenario where the program has pointers to two blocks of allocated memory that it understands to be distinct, but which actually overlap. Numerous other pictures of what could go wrong can be drawn.

To avoid these types of errors, we should observe the following rules:

Failure to observe the rules listed above can lead to the creation of bugs that are obscure and difficult to reproduce. The task of finding such bugs can be eased considerably by using the malloc debugging tools provided by glibc or one of a number of malloc debugging libraries that are designed for this purpose.

Among the malloc debugging tools provided by glibc are the following:

Further information about all of the above features can be found in the glibc manual.

A malloc debugging library offers the same API as the standard malloc package, but does extra work to catch memory allocation bugs. In order to use such a library, we link our application against that library instead of the malloc package in the standard C library. Because these libraries typically operate at the cost of slower run-time operation, increased memory consumption, or both, we should use them only for debugging purposes, and then return to linking with the standard malloc package for the production version of an application. Among such libraries are Electric Fence (http://www.perens.com/FreeSoftware/), dmalloc (http://dmalloc.com/), Valgrind (http://valgrind.org/), and Insure++ (http://www.parasoft.com/).

As well as malloc(), the C library provides a range of other functions for allocating memory on the heap, and we describe those functions here.

The calloc() function allocates memory for an array of identical items.

#include <stdlib.h>

void *calloc(size_t numitems, size_t size);

Note

Returns pointer to allocated memory on success, or NULL on error

The numitems argument specifies how many items to allocate, and size specifies their size. After allocating a block of memory of the appropriate size, calloc() returns a pointer to the start of the block (or NULL if the memory could not be allocated). Unlike malloc(), calloc() initializes the allocated memory to 0.

Here is an example of the use of calloc():

struct { /* Some field definitions */ } myStruct;
struct myStruct *p;

p = calloc(1000, sizeof(struct myStruct));
if (p == NULL)
    errExit("calloc");

The realloc() function is used to resize (usually enlarge) a block of memory previously allocated by one of the functions in the malloc package.

#include <stdlib.h>

void *realloc(void *ptr, size_t size);

Note

Returns pointer to allocated memory on success, or NULL on error

The ptr argument is a pointer to the block of memory that is to be resized. The size argument specifies the desired new size of the block.

On success, realloc() returns a pointer to the location of the resized block. This may be different from its location before the call. On error, realloc() returns NULL and leaves the block pointed to by ptr untouched (SUSv3 requires this).

When realloc() increases the size of a block of allocated memory, it doesn’t initialize the additionally allocated bytes.

Memory allocated using calloc() or realloc() should be deallocated with free().

Note

The call realloc(ptr, 0) is equivalent to calling free(ptr) followed by malloc(0). If ptr is specified as NULL, then realloc() is equivalent to calling malloc(size).

For the usual case, where we are increasing the size of the block of memory, realloc() attempts to coalesce the block with an immediately following block of memory on the free list, if one exists and is large enough. If the block lies at the end of the heap, then realloc() expands the heap. If the block of memory lies in the middle of the heap, and there is insufficient free space immediately following it, realloc() allocates a new block of memory and copies all existing data from the old block to the new block. This last case is common and CPU-intensive. In general, it is advisable to minimize the use of realloc().

Since realloc() may relocate the block of memory, we must use the returned pointer from realloc() for future references to the memory block. We can employ realloc() to reallocate a block pointed to by the variable ptr as follows:

nptr = realloc(ptr, newsize);
if (nptr == NULL) {
    /* Handle error */
} else {                /* realloc() succeeded */
    ptr = nptr;
}

In this example, we didn’t assign the return value of realloc() directly to ptr because, if realloc() had failed, then ptr would have been set to NULL, making the existing block inaccessible.

Because realloc() may move the block of memory, any pointers that referred to locations inside the block before the realloc() call may no longer be valid after the call. The only type of reference to a location within the block that is guaranteed to remain valid is one formed by adding an offset to the pointer to the start of the block. We discuss this point in more detail in Section 48.6.

The memalign() and posix_memalign() functions are designed to allocate memory starting at an address aligned at a specified power-of-two boundary, a feature that is useful for some applications (see, for example, Example 13-1, in Alignment restrictions for direct I/O).

#include <malloc.h>

void *memalign(size_t boundary, size_t size);

Note

Returns pointer to allocated memory on success, or NULL on error

The memalign() function allocates size bytes starting at an address aligned to a multiple of boundary, which must be a power of two. The address of the allocated memory is returned as the function result.

The memalign() function is not present on all UNIX implementations. Most other UNIX implementations that provide memalign() require the inclusion of <stdlib.h> instead of <malloc.h> in order to obtain the function declaration.

SUSv3 doesn’t specify memalign(), but instead specifies a similar function, named posix_memalign(). This function is a recent creation of the standards committees, and appears on only a few UNIX implementations.

#include <stdlib.h>

int posix_memalign(void **memptr, size_t alignment, size_t size);

Note

Returns 0 on success, or a positive error number on error

The posix_memalign() function differs from memalign() in two respects:

  • The address of the allocated memory is returned in memptr.

  • The memory is aligned to a multiple of alignment, which must be a power-of-two multiple of sizeof(void *) (4 or 8 bytes on most hardware architectures).

Note also the unusual return value of this function—rather than returning -1 on error, it returns an error number (i.e., a positive integer of the type normally returned in errno).

If sizeof(void *) is 4, then we can use posix_memalign() to allocate 65,536 bytes of memory aligned on a 4096-byte boundary as follows:

int s;
void *memptr;

s = posix_memalign(&memptr, 1024 * sizeof(void *), 65536);
if (s != 0)
    /* Handle error */

Blocks of memory allocated using memalign() or posix_memalign() should be deallocated with free().

Note

On some UNIX implementations, it is not possible to call free() on a block of memory allocated via memalign(), because the memalign() implementation uses malloc() to allocate a block of memory, and then returns a pointer to an address with a suitable alignment in that block. The glibc implementation of memalign() doesn’t suffer this limitation.