File mappings created with mmap() are linear: there is a sequential, one-to-one correspondence between the pages of the mapped file and the pages of the memory region. For most applications, a linear mapping suffices. However, some applications need to create large numbers of nonlinear mappings—mappings where the pages of the file appear in a different order within contiguous memory. We show an example of a nonlinear mapping in Figure 49-5.
We described one way of creating nonlinear mappings in the previous section: using multiple calls to mmap() with the MAP_FIXED flag. However, this approach doesn’t scale well. The problem is that each of these mmap() calls creates a separate kernel virtual memory area (VMA) data structure. Each VMA takes time to set up and consumes some nonswappable kernel memory. Furthermore, the presence of a large number of VMAs can degrade the performance of the virtual memory manager; in particular, the time taken to process each page fault can significantly increase when there are tens of thousands of VMAs. (This was a problem for some large database management systems that maintain multiple different views in a database file.)
Each line in the /proc/PID/maps file (see Location of Shared Memory in Virtual Memory) represents one VMA.
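To make the cost of the MAP_FIXED approach concrete, the following is a minimal sketch, not taken from the original text. It maps the pages of a small temporary file in reverse order, using one mmap() call per page, and counts the lines in /proc/self/maps to observe the VMAs. The helper names count_vmas(), make_test_file(), and map_reversed_fixed() are hypothetical, as is the choice of /tmp for the test file.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count the lines (one per VMA) in /proc/self/maps; returns -1 on error */
int count_vmas(void)
{
    FILE *fp = fopen("/proc/self/maps", "r");
    if (fp == NULL)
        return -1;

    int count = 0, c;
    while ((c = getc(fp)) != EOF)
        if (c == '\n')
            count++;
    fclose(fp);
    return count;
}

/* Hypothetical helper: create an unlinked temporary file of 'npages' pages,
   where page i is filled with the byte 'A' + i. Returns a file descriptor,
   or -1 on error. */
int make_test_file(long ps, int npages)
{
    char path[] = "/tmp/nonlinXXXXXX";      /* Assumed location for the file */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);                           /* File vanishes when fd is closed */

    char *buf = malloc(ps);
    if (buf == NULL) {
        close(fd);
        return -1;
    }
    for (int i = 0; i < npages; i++) {
        memset(buf, 'A' + i, ps);
        if (write(fd, buf, ps) != ps) {
            free(buf);
            close(fd);
            return -1;
        }
    }
    free(buf);
    return fd;
}

/* Map the file's pages in reverse order, using one mmap(MAP_FIXED) call
   per page. Returns the start of the region, or MAP_FAILED on error. */
char *map_reversed_fixed(int fd, long ps, int npages)
{
    assert(npages > 0);

    /* Reserve a contiguous address range first, so that the MAP_FIXED
       calls below replace only pages that we own */
    char *base = mmap(NULL, npages * ps, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return MAP_FAILED;

    for (int i = 0; i < npages; i++) {
        off_t off = (off_t) (npages - 1 - i) * ps;  /* Reversed file offset */
        if (mmap(base + i * ps, ps, PROT_READ, MAP_SHARED | MAP_FIXED,
                 fd, off) == MAP_FAILED) {
            munmap(base, npages * ps);
            return MAP_FAILED;
        }
    }
    return base;
}
```

Calling count_vmas() before and after map_reversed_fixed() should show the count growing with the number of pages mapped, since file pages that are out of order cannot be merged into a single VMA.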
From kernel 2.6 onward, Linux provides the remap_file_pages() system call to create nonlinear mappings without creating multiple VMAs. We do this as follows:
Create a mapping with mmap().
Use one or more calls to remap_file_pages() to rearrange the correspondence between the pages of memory and the pages of the file. (All that remap_file_pages() is doing is manipulating process page tables.)
It is possible to use remap_file_pages() to map the same page of a file into multiple locations within the mapped region.
#define _GNU_SOURCE
#include <sys/mman.h>
int remap_file_pages(void *addr, size_t size, int prot, size_t pgoff, int flags);
Returns 0 on success, or -1 on error
The pgoff and size arguments identify a file region whose position in memory is to be changed. The pgoff argument specifies the start of the file region in units of the system page size (as returned by sysconf(_SC_PAGESIZE)). The size argument specifies the length of the file region, in bytes. The addr argument serves two purposes:
It identifies the existing mapping whose pages we want to rearrange. In other words, addr must be an address that falls somewhere within a region that was previously mapped with mmap().
It specifies the memory address at which the file pages identified by pgoff and size are to be located.
Both addr and size should be specified as multiples of the system page size. If they are not, they are rounded down to the nearest multiple of the page size.
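The rounding described above amounts to clearing the low-order bits of the value. The following is a sketch of that arithmetic only; it is not the kernel's code, and the function name page_round_down() is hypothetical. It assumes that the page size is a power of two, which holds for the page sizes returned by sysconf(_SC_PAGESIZE) on Linux.

```c
#include <assert.h>
#include <stdint.h>

/* Round 'value' down to the nearest multiple of 'page_size'.
   Assumes that 'page_size' is a nonzero power of two. */
uintptr_t page_round_down(uintptr_t value, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return value & ~(page_size - 1);
}
```

With a 4096-byte page size, for example, a size of 4096 + 100 bytes would be treated as 4096 bytes, and an address of 0x4001a123 would be treated as 0x4001a000.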
Suppose that we use the following call to mmap() to map three pages of the open file referred to by the descriptor fd, and that the call assigns the returned address 0x4001a000 to addr:
ps = sysconf(_SC_PAGESIZE);             /* Obtain system page size */
addr = mmap(0, 3 * ps, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
The following calls would then create the nonlinear mapping shown in Figure 49-5:
remap_file_pages(addr, ps, 0, 2, 0);
                        /* Maps page 2 of file into page 0 of region */
remap_file_pages(addr + 2 * ps, ps, 0, 0, 0);
                        /* Maps page 0 of file into page 2 of region */
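The following sketch wraps these calls in a complete function so that the effect can be observed. It is an illustration under stated assumptions, not a definitive implementation: the function name demo_remap() and the use of a temporary file in /tmp are hypothetical, and the three file pages are filled with the bytes 'A', 'B', and 'C' so that each page is recognizable. Note also that the remap_file_pages(2) manual page marks the call as deprecated since Linux 3.16 and describes it as emulated by the kernel since Linux 4.0, so the function reports failure rather than assuming that the call is available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical demonstration: map a three-page file whose pages contain
   'A', 'B', and 'C', then swap the first and last pages of the region.
   On success, returns 0 and stores the first byte of each page of the
   region in first_bytes[0..2]. Returns -1 on any error. */
int demo_remap(char first_bytes[3])
{
    long ps = sysconf(_SC_PAGESIZE);        /* Obtain system page size */
    assert(ps > 0);

    char path[] = "/tmp/remapXXXXXX";       /* Assumed location for the file */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);

    char *buf = malloc(ps);
    if (buf == NULL) {
        close(fd);
        return -1;
    }
    for (int i = 0; i < 3; i++) {           /* Page i is filled with 'A' + i */
        memset(buf, 'A' + i, ps);
        if (write(fd, buf, ps) != ps) {
            free(buf);
            close(fd);
            return -1;
        }
    }
    free(buf);

    /* Step 1: create an ordinary (linear) shared mapping */
    char *addr = mmap(NULL, 3 * ps, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Step 2: rearrange the pages. File page 2 is placed at page 0 of the
       region, and file page 0 is placed at page 2 of the region. */
    int rc = 0;
    if (remap_file_pages(addr, ps, 0, 2, 0) == -1 ||
            remap_file_pages(addr + 2 * ps, ps, 0, 0, 0) == -1) {
        rc = -1;
    } else {
        for (int i = 0; i < 3; i++)
            first_bytes[i] = addr[i * ps];
    }

    munmap(addr, 3 * ps);
    close(fd);
    return rc;
}
```

If both calls succeed, the region's pages begin with 'C', 'B', and 'A' respectively: the middle page is untouched, while the outer two pages have exchanged places.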
There are two other arguments to remap_file_pages() that we haven’t yet described:
The prot argument is ignored, and must be specified as 0. In the future, it may be possible to use this argument to change the protection of the memory region affected by remap_file_pages(). In the current implementation, the protection remains the same as that on the entire VMA.
Virtual machines and garbage collectors are other applications that employ multiple VMAs. Some of these applications need to be able to write-protect individual pages. It was intended that remap_file_pages() would allow permissions on individual pages within a VMA to be changed, but this facility has not so far been implemented.
The flags argument is currently unused.
As currently implemented, remap_file_pages() can be applied only to shared (MAP_SHARED) mappings.
The remap_file_pages() system call is Linux-specific; it is not specified in SUSv3 and is not available on other UNIX implementations.