Chapter 50. Virtual Memory Operations

This chapter looks at various system calls that perform operations on a process’s virtual address space:

The mprotect() system call changes the protection on a region of virtual memory.
The mlock() and mlockall() system calls lock a region of virtual memory into physical memory, thus preventing it from being swapped out.
The mincore() system call allows a process to determine whether the pages in a region of virtual memory are resident in physical memory.
The madvise() system call allows a process to advise the kernel about its future patterns of usage of a virtual memory region.

Some of these system calls find particular use in conjunction with shared memory regions (Chapter 48, Chapter 49, and Chapter 54), but they can be applied to any region of a process’s virtual memory.

Note

The techniques described in this chapter are not in fact about IPC at all, but we include them in this part of the book because they are sometimes used with shared memory.

Changing Memory Protection: mprotect()

The mprotect() system call changes the protection on the virtual memory pages in the range starting at addr and continuing for length bytes.

#include <sys/mman.h>

int mprotect(void *addr, size_t length, int prot);

Returns 0 on success, or -1 on error

The value given in addr must be a multiple of the system page size (as returned by sysconf(_SC_PAGESIZE)). (SUSv3 specified that addr must be page-aligned. SUSv4 says that an implementation may require this argument to be page-aligned.) Because protections are set on whole pages, length is, in effect, rounded up to the next multiple of the system page size.
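On Linux, a misaligned addr causes mprotect() to fail with the error EINVAL, so a caller that wants to protect an arbitrary byte range must first round the start address down to a page boundary. The following is a minimal sketch of that adjustment. The helper names page_floor() and protect_range() are our own illustrations, not part of any standard API, and the sketch assumes that the page size is a power of two.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round addr down to the start of the page that contains it.
   Assumes page_size is a power of two. */
static void *
page_floor(const void *addr, long page_size)
{
    assert(page_size > 0 && (page_size & (page_size - 1)) == 0);
    return (void *) ((uintptr_t) addr & ~((uintptr_t) page_size - 1));
}

/* Hypothetical helper: apply prot to every page that overlaps the
   byte range [addr, addr + length). Returns 0 on success, or -1 on
   error, in the same way as mprotect(). */
static int
protect_range(void *addr, size_t length, int prot)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size == -1)
        return -1;

    char *start = page_floor(addr, page_size);

    /* Extend the length by the distance we moved the start address back */
    size_t span = (size_t) ((char *) addr - start) + length;

    /* mprotect() rounds span up to a whole number of pages */
    return mprotect(start, span, prot);
}
```

A call such as protect_range(p + 100, 10, PROT_READ) changes the protection of the whole page containing those 10 bytes, not just the bytes themselves, so any unrelated data that shares the page is affected as well.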

The prot argument is a bit mask specifying the new protection for this region of memory. It must be specified as either PROT_NONE or a combination created by ORing together one or more of PROT_READ, PROT_WRITE, and PROT_EXEC. All of these values have the same meaning as for mmap() (Table 49-2, in Creating a Mapping: mmap()).

If a process attempts to access a region of memory in a manner that violates the memory protection, the kernel generates a SIGSEGV signal for the process.
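To observe this behavior without terminating the process, a program can install a SIGSEGV handler and jump out of the faulting access. The following is a rough sketch of that idea on Linux. The helper names write_faults() and probe_readonly_page() are hypothetical, and the approach of escaping the handler with siglongjmp() is one possible technique chosen for illustration.

```c
#define _DEFAULT_SOURCE     /* Expose MAP_ANONYMOUS and sigsetjmp() on glibc */
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf probe_env;

/* SIGSEGV handler: abandon the faulting access and return to the probe */
static void
segv_handler(int sig)
{
    (void) sig;
    siglongjmp(probe_env, 1);
}

/* Hypothetical helper: attempt a one-byte write to p. Returns 0 if the
   write succeeded, 1 if it raised SIGSEGV, or -1 if the handler could
   not be installed. */
static int
write_faults(volatile char *p)
{
    struct sigaction sa, old;
    int result;

    sa.sa_handler = segv_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGSEGV, &sa, &old) == -1)
        return -1;

    if (sigsetjmp(probe_env, 1) == 0) {
        *p = 'x';               /* Faults if the page is not writable */
        result = 0;
    } else {
        result = 1;             /* Arrived here via siglongjmp() */
    }

    sigaction(SIGSEGV, &old, NULL);     /* Restore previous disposition */
    return result;
}

/* Map one page read-write, make it read-only with mprotect(), and probe
   it. Returns the result of write_faults(), or -1 on any other error. */
static int
probe_readonly_page(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size == -1)
        return -1;

    char *addr = mmap(NULL, (size_t) page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return -1;

    if (mprotect(addr, (size_t) page_size, PROT_READ) == -1) {
        munmap(addr, (size_t) page_size);
        return -1;
    }

    int result = write_faults(addr);
    munmap(addr, (size_t) page_size);
    return result;
}
```

Passing 1 as the second argument of sigsetjmp() saves the signal mask, so that SIGSEGV is unblocked again after the jump out of the handler.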

One use of mprotect() is to change the protection of a region of mapped memory originally set in a call to mmap(), as shown in Example 50-1. This program creates an anonymous mapping that initially has all access denied (PROT_NONE). The program then changes the protection on the region to read plus write. Before and after making the change, the program uses the system() function to execute a shell command that displays the line from the /proc/PID/maps file corresponding to the mapped region, so that we can see the change in memory protection. (We could have obtained the mapping information by directly parsing /proc/self/maps, but we used the call to system() because it results in a shorter program.) When we run this program, we see the following:

$ ./t_mprotect
Before mprotect()
b7cde000-b7dde000 ---s 00000000 00:04 18258    /dev/zero (deleted)
After mprotect()
b7cde000-b7dde000 rw-s 00000000 00:04 18258    /dev/zero (deleted)

From the last line of output, we can see that mprotect() has changed the permissions of the memory region to PROT_READ | PROT_WRITE. (For an explanation of the (deleted) string that appears after /dev/zero in the shell output, refer to Section 48.5.)
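As an alternative to calling system(), a program can read /proc/self/maps itself. The following is a minimal sketch of that approach, not part of the original program. The helper names mapping_perms() and show_protection_change() are hypothetical, and the sketch assumes the Linux maps line format, in which each line begins with a start-end address range followed by a four-character permission field.

```c
#define _DEFAULT_SOURCE     /* Expose MAP_ANONYMOUS on glibc */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: find the /proc/self/maps entry whose address range
   contains addr and copy its permission field (for example, "rw-s") into
   perms. Returns 0 if an entry was found, or -1 otherwise. Lines longer
   than the buffer are read in pieces; the trailing pieces are skipped
   because they normally fail to parse. */
static int
mapping_perms(const void *addr, char perms[5])
{
    FILE *fp = fopen("/proc/self/maps", "r");
    if (fp == NULL)
        return -1;

    char line[512];
    int found = -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        unsigned long start, end;
        char field[5];

        if (sscanf(line, "%lx-%lx %4s", &start, &end, field) != 3)
            continue;

        if ((uintptr_t) addr >= start && (uintptr_t) addr < end) {
            strcpy(perms, field);
            found = 0;
            break;
        }
    }

    fclose(fp);
    return found;
}

/* Create a shared anonymous mapping with all access denied, change it to
   read plus write, and print the permission field before and after.
   Returns 0 on success, or -1 on error. */
static int
show_protection_change(size_t len)
{
    char before[5], after[5];

    char *addr = mmap(NULL, len, PROT_NONE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return -1;

    if (mapping_perms(addr, before) == -1 ||
            mprotect(addr, len, PROT_READ | PROT_WRITE) == -1 ||
            mapping_perms(addr, after) == -1) {
        munmap(addr, len);
        return -1;
    }

    printf("Before mprotect(): %s\n", before);
    printf("After mprotect():  %s\n", after);

    munmap(addr, len);
    return 0;
}
```

Looking up the entry by address avoids depending on the /dev/zero name that the grep command in the shell version matches, at the cost of a somewhat longer program.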

Example 50-1. Changing memory protection with mprotect()

vmem/t_mprotect.c
#define _DEFAULT_SOURCE     /* Get MAP_ANONYMOUS definition from <sys/mman.h>
                               (_BSD_SOURCE is deprecated since glibc 2.20) */
#include <sys/mman.h>
#include "tlpi_hdr.h"

#define LEN (1024 * 1024)

#define SHELL_FMT "cat /proc/%ld/maps | grep zero"
#define CMD_SIZE (sizeof(SHELL_FMT) + 20)
                            /* Allow extra space for integer string */

int
main(int argc, char *argv[])
{
    char cmd[CMD_SIZE];
    char *addr;

    /* Create an anonymous mapping with all access denied */

    addr = mmap(NULL, LEN, PROT_NONE, MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        errExit("mmap");

    /* Display line from /proc/PID/maps corresponding to mapping */

    printf("Before mprotect()\n");
    snprintf(cmd, CMD_SIZE, SHELL_FMT, (long) getpid());
    system(cmd);

    /* Change protection on memory to allow read and write access */

    if (mprotect(addr, LEN, PROT_READ | PROT_WRITE) == -1)
        errExit("mprotect");

    printf("After mprotect()\n");
    system(cmd);                /* Review protection via /proc/PID/maps */

    exit(EXIT_SUCCESS);
}