Creating a Mapping: mmap()

The mmap() system call creates a new mapping in the calling process’s virtual address space.

#include <sys/mman.h>

void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

Returns starting address of mapping on success, or MAP_FAILED on error

The addr argument indicates the virtual address at which the mapping is to be located. If we specify addr as NULL, the kernel chooses a suitable address for the mapping. This is the preferred way of creating a mapping. Alternatively, we can specify a non-NULL value in addr, which the kernel takes as a hint about the address at which the mapping should be placed. In practice, the kernel will at the very least round the address to a nearby page boundary. In either case, the kernel will choose an address that doesn’t conflict with any existing mapping. (If the value MAP_FIXED is included in flags, then addr must be page-aligned. We describe this flag in The MAP_FIXED Flag.)

On success, mmap() returns the starting address of the new mapping. On error, mmap() returns MAP_FAILED.

Note

On Linux (and on most other UNIX implementations), the MAP_FAILED constant equates to ((void *) -1). However, SUSv3 specifies this constant because the C standards can’t guarantee that ((void *) -1) is distinct from a successful mmap() return value.
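As an illustration of the error check described above, here is a minimal sketch of a wrapper that compares the return value against MAP_FAILED rather than NULL. The helper name map_anon() is hypothetical, and the sketch assumes a system (such as Linux) that provides the MAP_ANONYMOUS flag, so that no file is needed.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1       /* Assumed necessary to expose MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical helper: create a private, anonymous, read-write mapping of
   'length' bytes, letting the kernel choose the address (addr == NULL).
   Returns the start of the mapping, or NULL on error. */
void *
map_anon(size_t length)
{
    void *addr = mmap(NULL, length, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (addr == MAP_FAILED) {   /* Compare against MAP_FAILED, not NULL */
        perror("mmap");
        return NULL;
    }

    return addr;
}
```

A caller would test the result of map_anon() against NULL and eventually release the region with munmap(), passing the same length.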

The length argument specifies the size of the mapping in bytes. Although length doesn’t need to be a multiple of the system page size (as returned by sysconf(_SC_PAGESIZE)), the kernel creates mappings in units of this size, so that length is, in effect, rounded up to the next multiple of the page size.
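The rounding described above can be sketched as follows. The helper name round_up_to_page() is an assumption made for illustration; it computes the size that a mapping of a given length would effectively occupy, and ignores the possibility of arithmetic overflow for very large lengths.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: round 'length' up to the next multiple of the
   system page size. Returns 0 if the page size can't be determined.
   (Overflow for lengths near SIZE_MAX is not handled in this sketch.) */
size_t
round_up_to_page(size_t length)
{
    long pageSize = sysconf(_SC_PAGESIZE);

    if (pageSize <= 0)
        return 0;

    size_t ps = (size_t) pageSize;
    return ((length + ps - 1) / ps) * ps;
}
```

For example, on a system where the page size is 4096 bytes, a length of 5000 would yield 8192.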

The prot argument is a bit mask specifying the protection to be placed on the mapping. It can be either PROT_NONE or a combination (ORing) of any of the other three flags listed in Table 49-2.

Table 49-2. Memory protection values

Value        Description
PROT_NONE    The region may not be accessed
PROT_READ    The contents of the region can be read
PROT_WRITE   The contents of the region can be modified
PROT_EXEC    The contents of the region can be executed

The flags argument is a bit mask of options controlling various aspects of the mapping operation. Exactly one of the following values must be included in this mask:

MAP_PRIVATE

Create a private mapping. Modifications to the contents of the region are not visible to other processes employing the same mapping, and, in the case of a file mapping, are not carried through to the underlying file.

MAP_SHARED

Create a shared mapping. Modifications to the contents of the region are visible to other processes mapping the same region with the MAP_SHARED attribute and, in the case of a file mapping, are carried through to the underlying file. Updates to the file are not guaranteed to be immediate; see the discussion of the msync() system call in Section 49.5.

Aside from MAP_PRIVATE and MAP_SHARED, other flag values can optionally be ORed in flags. We discuss these flags in Additional mmap() Flags and The MAP_FIXED Flag.
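To make the difference between MAP_PRIVATE and MAP_SHARED concrete, the following sketch modifies the first byte of a small temporary file through each kind of mapping and then reads the file back with pread(). All three helper names, and the temporary file template under /tmp, are assumptions made for illustration. The call to msync() is included only to force the shared update out to the file before the check.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1       /* Assumed necessary for mkstemp() and pread() on glibc */
#endif
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create an unlinked temporary file containing "hello".
   Returns an open read-write descriptor, or -1 on error. */
int
make_temp_file(void)
{
    char path[] = "/tmp/mmap_demo_XXXXXX";
    int fd = mkstemp(path);

    if (fd == -1)
        return -1;
    unlink(path);               /* File disappears when fd is closed */
    if (write(fd, "hello", 5) != 5) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: map the first 'len' bytes of 'fd' using 'sharing'
   (MAP_PRIVATE or MAP_SHARED) and overwrite the first byte with 'c'.
   Returns 0 on success, or -1 on error. */
int
poke_first_byte(int fd, size_t len, int sharing, char c)
{
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, sharing, fd, 0);
    int rc = 0;

    if (p == MAP_FAILED)
        return -1;

    p[0] = c;                   /* Modify the mapped region */

    if (sharing == MAP_SHARED)  /* Flush the shared update to the file */
        rc = msync(p, len, MS_SYNC);
    if (munmap(p, len) == -1)
        rc = -1;
    return rc;
}

/* Hypothetical helper: return the first byte of the file, or -1 on error */
int
first_byte_of_file(int fd)
{
    char c;

    return (pread(fd, &c, 1, 0) == 1) ? (unsigned char) c : -1;
}
```

After poke_first_byte(fd, 5, MAP_PRIVATE, 'J'), the file should still begin with 'h', since the change was made to a private copy. After the same call with MAP_SHARED, the file should begin with 'J'.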

The remaining arguments, fd and offset, are used with file mappings (they are ignored for anonymous mappings). The fd argument is a file descriptor identifying the file to be mapped. The offset argument specifies the starting point of the mapping in the file, and must be a multiple of the system page size. To map the entire file, we would specify offset as 0 and length as the size of the file. We say more about file mappings in Section 49.5.
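Mapping an entire file, as just described, might look like the following sketch. The helper name map_whole_file() is hypothetical. It obtains the file size with fstat(), maps the file read-only from offset 0, and treats an empty file as an error, since a mapping of length 0 can't be created.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map the whole of the file 'path' read-only.
   On success, returns the start of the mapping and stores its length in
   '*lengthOut'. Returns NULL on error or if the file is empty. */
void *
map_whole_file(const char *path, size_t *lengthOut)
{
    struct stat sb;
    void *addr;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return NULL;

    if (fstat(fd, &sb) == -1 || sb.st_size == 0) {
        close(fd);
        return NULL;
    }

    /* offset 0 and length equal to the file size map the entire file */
    addr = mmap(NULL, (size_t) sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);

    close(fd);                  /* The mapping remains valid after close() */

    if (addr == MAP_FAILED)
        return NULL;

    *lengthOut = (size_t) sb.st_size;
    return addr;
}
```

The caller can then read the file contents directly through the returned pointer and should pass the stored length to munmap() when finished.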

As noted above, the mmap() prot argument specifies the protection on a new memory mapping. It can contain the value PROT_NONE, or a mask of one or more of the flags PROT_READ, PROT_WRITE, and PROT_EXEC. If a process attempts to access a memory region in a way that violates the protection on the region, then the kernel delivers the SIGSEGV signal to the process.

One use of pages of memory marked PROT_NONE is as guard pages at the start or end of a region of memory that a process has allocated. If the process accidentally steps into one of the pages marked PROT_NONE, the kernel informs it of that fact by generating a SIGSEGV signal.
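One possible way of setting up such guard pages is sketched below. It reserves a region with PROT_NONE and then uses mprotect() to open up only the interior pages, leaving one inaccessible page at each end. Both helper names are hypothetical, and the sketch assumes a system that provides MAP_ANONYMOUS. The second helper performs the access in a child process so that the caller can observe the resulting SIGSEGV without being killed itself.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1       /* Assumed necessary to expose MAP_ANONYMOUS on glibc */
#endif
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: allocate 'npages' usable pages surrounded by one
   PROT_NONE guard page on each side. Returns the start of the usable
   region, or NULL on error. */
void *
alloc_with_guards(size_t npages)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page, total;
    char *base;

    if (ps <= 0 || npages == 0)
        return NULL;
    page = (size_t) ps;
    total = (npages + 2) * page;

    /* Reserve the whole range, initially inaccessible */
    base = mmap(NULL, total, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Make only the interior pages readable and writable */
    if (mprotect(base + page, npages * page, PROT_READ | PROT_WRITE) == -1) {
        munmap(base, total);
        return NULL;
    }

    return base + page;
}

/* Hypothetical helper: write one byte at 'p' in a child process.
   Returns 1 if the child was killed by SIGSEGV, 0 if it was not,
   or -1 on error. */
int
access_faults(volatile char *p)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {             /* Child: attempt the access */
        *p = 1;
        _exit(0);
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

With a buffer obtained from alloc_with_guards(2), access_faults() should report 0 for any byte inside the two usable pages and 1 for the byte immediately before or after them.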

Memory protections reside in process-private virtual memory tables. Thus, different processes may map the same memory region with different protections.

Memory protection can be changed using the mprotect() system call (Changing Memory Protection: mprotect()).

On some UNIX implementations, the actual protections placed on the pages of a mapping may not be exactly those specified in prot. In particular, limitations of the protection granularity of the underlying hardware (e.g., older x86-32 architectures) mean that, on many UNIX implementations, PROT_READ implies PROT_EXEC and vice versa, and on some implementations, specifying PROT_WRITE implies PROT_READ. However, applications should not rely on such behavior; prot should always specify exactly the memory protections that are required.

Note

Modern x86-32 architectures provide hardware support for marking page tables as NX (no execute), and, since kernel 2.6.8, Linux makes use of this feature to properly separate PROT_READ and PROT_EXEC permissions on Linux/x86-32.