File and anonymous mappings

The next point to understand is that there are broadly two types of mappings: a file-mapped region or an anonymous region. A file-mapped region quite obviously maps the (full, or partial) content of a file (as shown in the previous figure). We think of the region as being backed by a file; that is, if the OS runs short of memory and decides to reclaim some of the file-mapped pages, it need not write them to the swap partition, as they're already available within the file that was mapped. On the other hand, an anonymous mapping is a mapping whose content is dynamic; the data segments (initialized data, BSS, heap), the data sections of library mappings, and the process (or thread) stack(s) are excellent examples of anonymous mappings. Think of them as not being file-backed; thus, if memory runs short, their pages may indeed be written to swap by the OS.

Also, recall what we learned back in Chapter 4, Dynamic Memory Allocation, regarding malloc(3): the glibc malloc(3) engine uses the heap segment to service an allocation only when it's for a small amount, that is, less than MMAP_THRESHOLD (defaults to 128 KB). Any malloc(3) above that will result in mmap(2) being internally invoked to set up an anonymous memory region (a mapping!) of the required size. These mappings (or segments) will live in the available virtual address space between the top of the heap and the stack of main.
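To make the idea of an anonymous mapping concrete, here is a minimal sketch of setting one up directly with mmap(2), much as a large malloc(3) request would lead glibc to do internally. The helper names anon_region_create() and region_is_zeroed() are our own, purely for illustration; they are not part of any library.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: create a private, anonymous, read-write mapping
 * of len bytes. There is no backing file, so fd is -1 and offset is 0.
 * Returns NULL on failure. */
void *anon_region_create(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/* Hypothetical helper: returns 1 if every byte in the region is zero.
 * Fresh anonymous pages are zero-filled by the kernel. */
int region_is_zeroed(const void *p, size_t len)
{
    assert(p != NULL);
    const unsigned char *b = p;
    for (size_t i = 0; i < len; i++)
        if (b[i] != 0)
            return 0;
    return 1;
}
```

A caller would typically request a region with anon_region_create(256 * 1024), use it like any other buffer, and release it with munmap(2), passing the same length.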

Back to the mmap(2): the fourth parameter is a bitmask called flags; there are several flags, and they affect many attributes of the mapping. Among them, two flags determine the privacy of the mapping and are mutually exclusive (you can use only one of them at a time):

  • MAP_SHARED: The mapping is a shared one; other processes might work on the same mapping simultaneously (this, in fact, is the generic manner in which a common IPC mechanism, shared memory, can be implemented). In the case of a file mapping, if the memory region is written to, the underlying file is updated! (You can use the msync(2) to control the flushing of in-memory writes to the underlying file.)
  • MAP_PRIVATE: This sets up a private mapping; if it's writable, it implies COW semantics (leading to optimal memory usage, as explained in Chapter 10, Process Creation). A file-mapped region that is private will not carry through writes to the underlying file. Actually, a private file-mapping is very common on Linux: this is precisely how, at the time of starting to execute a process, the loader (see the information box) brings in the text and data of both the binary executable as well as the text and data of all shared libraries that the process uses.
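The difference between the two flags can be observed with a small experiment: write one byte through a file mapping, then read the file back with ordinary file I/O. The following is a minimal sketch under stated assumptions; the helpers poke_first_byte() and file_first_byte() are hypothetical names of our own, and the file passed in must already exist and be at least one byte long (touching a mapped page that lies wholly beyond end-of-file raises SIGBUS).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map the start of the file at path with the given
 * sharing flag (MAP_SHARED or MAP_PRIVATE) and store ch at offset 0
 * through the mapping. Returns 0 on success, -1 on failure. */
int poke_first_byte(const char *path, int share_flag, char ch)
{
    assert(path != NULL);
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    char *p = mmap(NULL, 1, PROT_READ | PROT_WRITE, share_flag, fd, 0);
    close(fd);                   /* the mapping remains valid after close */
    if (p == MAP_FAILED)
        return -1;

    p[0] = ch;                   /* private: COW copy; shared: file page */

    int rc = 0;
    if (share_flag == MAP_SHARED)
        rc = msync(p, 1, MS_SYNC);   /* flush the write to the file */
    if (munmap(p, 1) != 0)
        rc = -1;
    return rc;
}

/* Hypothetical helper: read the first byte of the file with pread(2).
 * Returns the byte value, or -1 on failure. */
int file_first_byte(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    unsigned char c;
    ssize_t n = pread(fd, &c, 1, 0);
    close(fd);
    return (n == 1) ? c : -1;
}
```

With a file whose first byte is 'A', poking 'P' via MAP_PRIVATE should leave the file unchanged, whereas poking 'S' via MAP_SHARED should be visible to a subsequent file_first_byte() call.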
The reality is that when a process runs, control first goes to the loader (ld.so or ld-linux[-*].so), a program whose pathname is embedded in your a.out binary executable. It performs the key work of setting up the C runtime environment: it memory maps (via the mmap(2)) the text (code) and initialized data segments from the binary executable file into the process, thereby creating the segments in the VAS that we have been talking about since Chapter 2, Virtual Memory. Further, it sets up the initialized data segment, the BSS, the heap, and the stack (of main()), and then it looks for and memory maps all shared libraries into the process VAS.

Try performing a strace(1) on a program; you will see (early in the execution) all the mmap(2) system calls setting up the process VAS! The mmap(2) is critical to Linux: in effect, the entire setup of the process VAS (the segments or mappings, both at process startup as well as later) is done via the mmap(2) system call.

To help get these important facts clear, we show some (truncated) output of running strace(1) upon ls(1). For example, see how the open(2) is done upon glibc, file descriptor 3 is returned, and that in turn is used by the mmap(2) to create a private, file-mapped, read-only mapping of glibc's code in the process VAS (we can tell by seeing that the offset in the first mmap is 0)! (A detail: the open(2) becomes the openat(2) function within the kernel; ignore that, just as quite often on Linux, the mmap(2) becomes mmap2(2).) The strace(1) (truncated) output follows:

$ strace -e trace=openat,mmap ls > /dev/null
...
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
mmap(NULL, 4131552, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7f963d8a5000
mmap(0x7f963dc8c000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e7000) = 0x7f963dc8c000
...
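The open-then-mmap pattern seen in the preceding strace(1) output can be reproduced in a few lines of C. The sketch below is ours, not the loader's actual code: the hypothetical helper file_is_elf() opens a file read-only, maps it privately at offset 0, and inspects the first four bytes for the ELF magic number.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the file at path begins with the
 * ELF magic bytes, 0 if it does not, and -1 on error. */
int file_is_elf(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size < 4) {        /* too small to hold the magic bytes */
        close(fd);
        return 0;
    }

    /* Private, read-only file mapping starting at offset 0 */
    const unsigned char *p = mmap(NULL, (size_t)st.st_size, PROT_READ,
                                  MAP_PRIVATE, fd, 0);
    close(fd);                   /* the mapping outlives the descriptor */
    if (p == MAP_FAILED)
        return -1;

    int is_elf = (memcmp(p, "\177ELF", 4) == 0);
    munmap((void *)p, (size_t)st.st_size);
    return is_elf;
}
```

Calling file_is_elf("/proc/self/exe") checks the currently running binary itself; running the resulting program under strace(1) should show an openat/mmap pair similar in shape to the one above.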
The kernel maintains a data structure called the virtual memory area (VMA) for each such mapping per process; the proc filesystem reveals all mappings to us in user space via /proc/PID/maps. Do take a look; you will literally see the virtual memory map of the process user space. (Try cat /proc/self/maps to see the map of the cat process itself; no root privileges are needed for a process to read its own map.) The man page on proc(5) explains in detail how to interpret this map; please take a look.
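A process can also inspect its own map programmatically. As a rough sketch (the helper describe_mapping() is a hypothetical name of ours, and the parsing assumes the line layout documented in proc(5): address range, permissions, offset, device, inode, optional pathname), the following looks up which mapping contains a given address and reports its permissions and backing pathname:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: find the line in /proc/self/maps whose address
 * range contains addr. Copies the permission string (for example "r-xp")
 * into perms and the pathname column into path; path is left empty for
 * an unnamed anonymous mapping. Returns 0 if found, -1 otherwise. */
int describe_mapping(const void *addr, char perms[5],
                     char *path, size_t path_len)
{
    assert(perms != NULL && path != NULL && path_len > 0);
    FILE *fp = fopen("/proc/self/maps", "r");
    if (!fp)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[512];
    int found = -1;

    while (fgets(line, sizeof line, fp)) {
        unsigned long start, end;
        char p[5], name[256];
        name[0] = '\0';
        /* offset, device and inode are parsed but not stored */
        int n = sscanf(line, "%lx-%lx %4s %*x %*x:%*x %*u %255[^\n]",
                       &start, &end, p, name);
        if (n < 3)
            continue;            /* skip lines we cannot parse */
        if (target >= start && target < end) {
            snprintf(perms, 5, "%s", p);
            snprintf(path, path_len, "%s", name);
            found = 0;
            break;
        }
    }
    fclose(fp);
    return found;
}
```

Passing the address of a function should typically report an executable, file-backed mapping (the binary or a shared library), while the address of a local variable should land in a readable and writable region.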