Creating a Mapping: mmap()

The mmap() system call creates a new mapping in the calling process’s virtual address space.

#include <sys/mman.h>

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

Note

Returns starting address of mapping on success, or MAP_FAILED on error

The addr argument indicates the virtual address at which the mapping is to be located. If we specify addr as NULL, the kernel chooses a suitable address for the mapping. This is the preferred way of creating a mapping. Alternatively, we can specify a non-NULL value in addr, which the kernel takes as a hint about the address at which the mapping should be placed. In practice, the kernel will at the very least round the address to a nearby page boundary. In either case, the kernel will choose an address that doesn’t conflict with any existing mapping. (If the value MAP_FIXED is included in flags, then addr must be page-aligned. We describe this flag in The MAP_FIXED Flag.)
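The following sketch shows the two ways of supplying addr. The helper names mapOnePage() and isPageAligned() are hypothetical, and the sketch assumes the MAP_ANONYMOUS flag (covered with the other mmap() flags, and not part of SUSv3) so that no file is needed.

```c
#define _DEFAULT_SOURCE         /* Expose MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create a one-page private anonymous mapping,
   passing 'hint' as the addr argument (NULL lets the kernel choose).
   Returns the address of the mapping, or MAP_FAILED on error. */

static void *
mapOnePage(void *hint)
{
    long pageSize = sysconf(_SC_PAGESIZE);

    assert(pageSize > 0);

    return mmap(hint, (size_t) pageSize, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Return 1 if 'p' lies on a page boundary, or 0 otherwise */

static int
isPageAligned(const void *p)
{
    return ((uintptr_t) p % (uintptr_t) sysconf(_SC_PAGESIZE)) == 0;
}
```

Calling mapOnePage(NULL) and then mapOnePage((char *) first + 1) should yield two distinct page-aligned addresses: the second hint is unaligned and collides with the first mapping, so the kernel is free to place the mapping elsewhere.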

On success, mmap() returns the starting address of the new mapping. On error, mmap() returns MAP_FAILED.

Note

On Linux (and on most other UNIX implementations), the MAP_FAILED constant equates to ((void *) -1). However, SUSv3 specifies this constant because the C standards can’t guarantee that ((void *) -1) is distinct from a successful mmap() return value.
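A common mistake is to test the result of mmap() against NULL. The sketch below, built around a hypothetical wrapper named tryMap(), shows the comparison against MAP_FAILED instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical wrapper: map the first 'length' bytes of 'fd' read-only.
   Returns 0 and stores the address in *addrp on success, or returns -1
   with errno set by mmap() on failure. */

static int
tryMap(int fd, size_t length, void **addrp)
{
    void *addr;

    assert(addrp != NULL);

    addr = mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED)     /* Failure is MAP_FAILED, never NULL */
        return -1;

    *addrp = addr;
    return 0;
}
```

For example, tryMap(-1, 4096, &p) fails with errno set to EBADF, since -1 is not a valid file descriptor for a file mapping.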

The length argument specifies the size of the mapping in bytes. Although length doesn’t need to be a multiple of the system page size (as returned by sysconf(_SC_PAGESIZE)), the kernel creates mappings in units of this size, so that length is, in effect, rounded up to the next multiple of the page size.
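The effective size of a mapping can be computed with the usual round-up arithmetic. The function name roundUpToPage() below is hypothetical; in a real program, pageSize would come from sysconf(_SC_PAGESIZE).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round 'length' up to the next multiple of
   'pageSize', which is the amount of address space that the kernel
   effectively uses for a mapping of 'length' bytes */

static size_t
roundUpToPage(size_t length, size_t pageSize)
{
    assert(pageSize > 0);

    return (length + pageSize - 1) / pageSize * pageSize;
}
```

With a page size of 4096 bytes, a request for 5000 bytes occupies 8192 bytes of address space, while a request for exactly 4096 bytes occupies a single page.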

The prot argument is a bit mask specifying the protection to be placed on the mapping. It can be either PROT_NONE or a combination (ORing) of any of the other three flags listed in Table 49-2.

Table 49-2. Memory protection values

Value	Description
`PROT_NONE`	The region may not be accessed
`PROT_READ`	The contents of the region can be read
`PROT_WRITE`	The contents of the region can be modified
`PROT_EXEC`	The contents of the region can be executed

The flags argument is a bit mask of options controlling various aspects of the mapping operation. Exactly one of the following values must be included in this mask:

MAP_PRIVATE: Create a private mapping. Modifications to the contents of the region are not visible to other processes employing the same mapping, and, in the case of a file mapping, are not carried through to the underlying file.
MAP_SHARED: Create a shared mapping. Modifications to the contents of the region are visible to other processes mapping the same region with the MAP_SHARED attribute and, in the case of a file mapping, are carried through to the underlying file. Updates to the file are not guaranteed to be immediate; see the discussion of the msync() system call in Section 49.5.
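The difference between the two flags can be observed by storing a byte through a file mapping and then reading the file with pread(). The following is a minimal sketch; makeOneByteFile() and fileByteAfterStore() are hypothetical helpers, and the temporary file path is an assumption.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create an unlinked temporary file containing the
   single byte 'initial'. Returns a read-write descriptor, or -1. */

static int
makeOneByteFile(char initial)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";
    int fd = mkstemp(path);

    if (fd == -1)
        return -1;
    unlink(path);               /* File disappears when fd is closed */

    if (write(fd, &initial, 1) != 1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: map the first byte of 'fd' using 'flags'
   (MAP_PRIVATE or MAP_SHARED), store 'newValue' via the mapping, and
   return the byte that pread() then finds in the file (-1 on error) */

static int
fileByteAfterStore(int fd, int flags, char newValue)
{
    char *addr;
    char c;

    assert(flags == MAP_PRIVATE || flags == MAP_SHARED);

    addr = mmap(NULL, 1, PROT_READ | PROT_WRITE, flags, fd, 0);
    if (addr == MAP_FAILED)
        return -1;

    addr[0] = newValue;

    /* For a shared mapping, force the update through to the file */

    if (flags == MAP_SHARED && msync(addr, 1, MS_SYNC) == -1) {
        munmap(addr, 1);
        return -1;
    }
    if (munmap(addr, 1) == -1)
        return -1;

    if (pread(fd, &c, 1, 0) != 1)
        return -1;
    return (unsigned char) c;
}
```

Starting from a file that contains 'a', a store of 'b' through a MAP_PRIVATE mapping leaves 'a' in the file, whereas a store of 'c' through a MAP_SHARED mapping changes the file contents to 'c'.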

Aside from MAP_PRIVATE and MAP_SHARED, other flag values can optionally be ORed in flags. We discuss these flags in Additional mmap() Flags and The MAP_FIXED Flag.

The remaining arguments, fd and offset, are used with file mappings (they are ignored for anonymous mappings). The fd argument is a file descriptor identifying the file to be mapped. The offset argument specifies the starting point of the mapping in the file, and must be a multiple of the system page size. To map the entire file, we would specify offset as 0 and length as the size of the file. We say more about file mappings in Section 49.5.
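Because offset must be a multiple of the page size, mapping data that begins at an arbitrary file position requires rounding the offset down and skipping the extra leading bytes. The sketch below illustrates one way of doing this; copyViaMmap() and the fixture makePatternFile() are hypothetical names, and the caller is assumed to request a range that lies within the file.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy 'length' bytes starting at the arbitrary
   file position 'offset' into 'buf', using a private file mapping.
   Returns 0 on success, or -1 on error. */

static int
copyViaMmap(int fd, off_t offset, void *buf, size_t length)
{
    off_t pageSize = sysconf(_SC_PAGESIZE);
    off_t alignedOffset;
    size_t delta, mapLen;
    char *base;

    assert(offset >= 0 && length > 0);

    alignedOffset = offset - offset % pageSize;   /* Round down */
    delta = (size_t) (offset - alignedOffset);    /* Bytes to skip */
    mapLen = length + delta;

    base = mmap(NULL, mapLen, PROT_READ, MAP_PRIVATE, fd, alignedOffset);
    if (base == MAP_FAILED)
        return -1;

    memcpy(buf, base + delta, length);
    return munmap(base, mapLen);
}

/* Hypothetical fixture: create an unlinked temporary file of 'size'
   bytes in which byte i has the value i % 251. Returns fd, or -1. */

static int
makePatternFile(size_t size)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";
    int fd = mkstemp(path);
    size_t i;

    if (fd == -1)
        return -1;
    unlink(path);

    for (i = 0; i < size; i++) {
        unsigned char b = (unsigned char) (i % 251);

        if (write(fd, &b, 1) != 1) {
            close(fd);
            return -1;
        }
    }
    return fd;
}
```

Passing the unaligned offset directly to mmap() would instead fail on Linux with the error EINVAL.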

Memory protection in more detail

As noted above, the mmap() prot argument specifies the protection on a new memory mapping. It can contain the value PROT_NONE, or a mask of one or more of the flags PROT_READ, PROT_WRITE, and PROT_EXEC. If a process attempts to access a memory region in a way that violates the protection on the region, then the kernel delivers the SIGSEGV signal to the process.

Note

Although SUSv3 specifies that SIGSEGV should be used to signal memory protection violations, on some implementations, SIGBUS is used instead.

One use of pages of memory marked PROT_NONE is as guard pages at the start or end of a region of memory that a process has allocated. If the process accidentally steps into one of the pages marked PROT_NONE, the kernel informs it of that fact by generating a SIGSEGV signal.

Memory protections reside in process-private virtual memory tables. Thus, different processes may map the same memory region with different protections.

Memory protection can be changed using the mprotect() system call (Changing Memory Protection: mprotect()).
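The guard-page technique described above can be sketched as follows. The helpers allocWithGuards() and storeFaults() are hypothetical, and the sketch assumes MAP_ANONYMOUS to obtain memory without a file. The whole region is first mapped PROT_NONE, and mprotect() then opens up only the usable pages in the middle.

```c
#define _DEFAULT_SOURCE         /* Expose MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return a readable and writable region of at
   least 'size' bytes that has one PROT_NONE guard page immediately
   before it and one immediately after it. Returns NULL on error. */

static char *
allocWithGuards(size_t size)
{
    size_t pageSize = (size_t) sysconf(_SC_PAGESIZE);
    size_t usable = (size + pageSize - 1) / pageSize * pageSize;
    char *base;

    assert(size > 0);

    base = mmap(NULL, usable + 2 * pageSize, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    if (mprotect(base + pageSize, usable, PROT_READ | PROT_WRITE) == -1) {
        munmap(base, usable + 2 * pageSize);
        return NULL;
    }
    return base + pageSize;
}

/* Hypothetical helper: fork a child that stores one byte at 'p'.
   Returns 1 if the child was killed by SIGSEGV (or SIGBUS), 0 if the
   store succeeded, or -1 on error. */

static int
storeFaults(volatile char *p)
{
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0) {
        *p = 'x';               /* Kills the child if 'p' is protected */
        _exit(0);
    }

    if (waitpid(child, &status, 0) == -1)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status) == SIGSEGV || WTERMSIG(status) == SIGBUS;
    return 0;
}
```

Given r = allocWithGuards(100), a store to r[0] succeeds, while stores to r[-1] and to the first byte after the usable pages terminate the storing process with a signal.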

On some UNIX implementations, the actual protections placed on the pages of a mapping may not be exactly those specified in prot. In particular, limitations of the protection granularity of the underlying hardware (e.g., older x86-32 architectures) mean that, on many UNIX implementations, PROT_READ implies PROT_EXEC and vice versa, and on some implementations, specifying PROT_WRITE implies PROT_READ. However, applications should not rely on such behavior; prot should always specify exactly the memory protections that are required.

Note

Modern x86-32 architectures provide hardware support for marking page tables as NX (no execute), and, since kernel 2.6.8, Linux makes use of this feature to properly separate PROT_READ and PROT_EXEC permissions on Linux/x86-32.

Alignment restrictions specified in standards for offset and addr

SUSv3 specifies that the offset argument of mmap() must be page-aligned, and that the addr argument must also be page-aligned if MAP_FIXED is specified. Linux conforms to these requirements. However, it was later noted that the SUSv3 requirements differed from earlier standards, which imposed looser requirements on these arguments. The consequence of the SUSv3 wording was to (unnecessarily) render some formerly standards-conformant implementations nonconforming. SUSv4 returns to the looser requirement:

An implementation may require that offset be a multiple of the system page size.
If MAP_FIXED is specified, then an implementation may require that addr be page-aligned.
If MAP_FIXED is specified, and addr is nonzero, then addr and offset shall have the same remainder modulo the system page size.

Note

A similar situation arose for the addr argument of mprotect(), msync(), and munmap(). SUSv3 specified that this argument must be page-aligned. SUSv4 says that an implementation may require this argument to be page-aligned.

Example program

Example 49-1 demonstrates the use of mmap() to create a private file mapping. This program is a simple version of cat(1). It maps the (entire) file named in its command-line argument, and then writes the contents of the mapping to standard output.

Example 49-1. Using mmap() to create a private file mapping

mmap/mmcat.c
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include "tlpi_hdr.h"

int
main(int argc, char *argv[])
{
    char *addr;
    int fd;
    struct stat sb;

    if (argc != 2 || strcmp(argv[1], "--help") == 0)
        usageErr("%s file\n", argv[0]);

    fd = open(argv[1], O_RDONLY);
    if (fd == -1)
        errExit("open");

    /* Obtain the size of the file and use it to specify the size of
       the mapping and the size of the buffer to be written */

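    if (fstat(fd, &sb) == -1)
        errExit("fstat");

    /* Handle a zero-length file specially: on Linux, mmap() with a
       length of 0 fails with the error EINVAL */

    if (sb.st_size == 0)
        exit(EXIT_SUCCESS);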

    addr = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (addr == MAP_FAILED)
        errExit("mmap");

    if (write(STDOUT_FILENO, addr, sb.st_size) != sb.st_size)
        fatal("partial/failed write");
    exit(EXIT_SUCCESS);
}
     mmap/mmcat.c