A memory-mapped region often requires its attributes to be queried or changed in some fashion. This section looks at four system calls designed for this purpose:
mprotect(2) | Change the access of the indicated memory pages. |
madvise(2) | Advise the UNIX kernel how you intend to use your memory region. |
mincore(2) | Determine if pages of mapped memory are currently in memory. |
msync(2) | Where modifications exist, indicate which regions of memory should be written back to the mapped file. |
A memory-mapped region, entirely or in part, may have its access protections changed by the mprotect(2) system call. Its function synopsis is as follows:
#include <sys/types.h>
#include <sys/mman.h>
int mprotect(const void *addr, size_t len, int prot);
The function mprotect(2) allows the application to change the region starting at address addr for a length of len bytes, so as to use the protection specified by the argument prot. The prot flags permitted are
PROT_NONE | Region grants no access (this flag excludes use of the other flags). |
PROT_READ | Region grants read access. |
PROT_WRITE | Region grants write access. |
PROT_EXEC | Program instructions may be executed in the memory-mapped region. |
The function mprotect(2) returns the value 0 when successful. Otherwise, -1 is returned, and the error code is found in errno.
Warning
Not all UNIX implementations permit the caller to change memory region protection on a page-by-page basis. For maximum portability, the entire memory region should be specified.
The messages.c program was modified to call mprotect(2) in the file mprotect.c. The changes made to the program are shown in the context diff(1) form in Listing 26.3.
The mprotect(2) call follows the parse_messages() function call in Listing 26.3. At this point, it is desirable to make the region read-only, since this prevents buggy code from altering the message text. If an attempt is made to change the error message text, the write does not take place and a signal is raised instead (SIGBUS on FreeBSD; many other platforms raise SIGSEGV).
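The fill-then-protect pattern described above can be sketched independently of messages.c. This is a minimal illustration, not the book's listing: the helper name make_readonly_text() is hypothetical, an anonymous mapping stands in for the mapped message file, and MAP_ANON is assumed to be available. The whole mapping is protected in one call, in keeping with the portability warning.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANON and getpagesize() on glibc */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one page, copy text into it, then make the
 * entire mapping read-only. Returns the mapping (length in *lenp),
 * or NULL on failure with errno set by mmap(2) or mprotect(2).
 */
void *make_readonly_text(const char *text, size_t *lenp) {
    size_t len;
    char *p;

    assert(text != NULL && lenp != NULL);
    len = (size_t)getpagesize();

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* The page starts zero-filled, so copying at most len-1 bytes
     * leaves the text NUL-terminated. */
    strncpy(p, text, len - 1);

    /* Protect the whole region, not just part of it */
    if (mprotect(p, len, PROT_READ) == -1) {
        munmap(p, len);
        return NULL;
    }
    *lenp = len;
    return p;
}
```

After this call returns, reading the text works as before, while any store into the region raises a signal until the protection is changed back with another mprotect(2) call.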
To achieve maximum performance, you may find it desirable for your application to inform the UNIX kernel about the status of a memory region or about its usage patterns. The system call madvise(2) permits this to be accomplished:
#include <sys/types.h>
#include <sys/mman.h>
int madvise(void *addr, size_t len, int behavior);
The madvise(2) function returns 0 when successful. The value -1 is returned when the call fails, leaving the error code in the variable errno.
The madvise(2) system call allows you to hint to the kernel about the memory region starting at addr for a length of len bytes. The behavior is specified by one of the following values:
MADV_NORMAL | No special treatment; the default paging behavior applies. |
MADV_RANDOM | Pages are expected to be referenced in random order, so read-ahead is of little benefit. |
MADV_SEQUENTIAL | Pages are expected to be referenced sequentially, from lower to higher addresses. |
MADV_WILLNEED | The pages are expected to be needed soon. |
MADV_DONTNEED | The pages are not expected to be needed soon. |
MADV_FREE | The contents of the pages are no longer needed; the kernel may discard them rather than write them out. |
In addition to these, some platforms support the following behavior:
MADV_SPACEAVAIL | Ensures that the necessary resources are reserved. |
At the time of writing, Linux and UnixWare 7 did not support the madvise(2) function at all. Table 26.2 provides a cross-reference grid of supported behaviors.
madvise(2) Behavior | Platform | ||||||
---|---|---|---|---|---|---|---|
FreeBSD | SGI IRIX 6.5 | HPUX 11 | UnixWare 7 | Solaris 8 | IBM AIX 4.3 | Linux | |
MADV_NORMAL | X | X | X | X | |||
MADV_RANDOM | X | X | X | X | |||
MADV_SEQUENTIAL | X | X | X | X | |||
MADV_WILLNEED | X | X | X | X | |||
MADV_DONTNEED | X | X | X | X | |||
MADV_FREE | X | X | |||||
MADV_SPACEAVAIL | X | X |
Listing 26.4 shows a context diff(1) listing, illustrating the changes between mprotect.c and madvise.c. In madvise.c, calls to madvise(2) have been added.
The first madvise(2) call occurs before the error message file is parsed, to indicate sequential access with MADV_SEQUENTIAL. Recall that the parsing of the messages is sequential from the start to the end of the mapped message file.
Once the messages have been parsed, however, the access pattern changes to that of a random nature, since any error message may be called upon demand. Hence, the second call to madvise(2) selects behavior MADV_RANDOM.
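The two-phase advice described above can be sketched as follows. This is an illustration under stated assumptions rather than the contents of madvise.c: the helper name scan_then_randomize() is hypothetical, counting newlines stands in for the real parsing, and base is assumed to be the page-aligned start of a mapping. Because the advice is only a hint, a failed madvise(2) call is reported but not treated as fatal.

```c
#define _DEFAULT_SOURCE          /* exposes madvise() and MADV_* on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: scan a mapped region front to back under
 * MADV_SEQUENTIAL, then switch the region to MADV_RANDOM for later
 * on-demand lookups. Returns the number of newline characters seen.
 */
long scan_then_randomize(const char *base, size_t len) {
    long lines = 0;
    size_t i;

    assert(base != NULL);

    /* Phase 1: the scan below reads from start to end */
    if (madvise((void *)base, len, MADV_SEQUENTIAL) == -1)
        perror("madvise(2)");    /* only a hint: carry on regardless */

    for (i = 0; i < len; ++i)
        if (base[i] == '\n')
            ++lines;

    /* Phase 2: from here on, any part may be referenced at any time */
    if (madvise((void *)base, len, MADV_RANDOM) == -1)
        perror("madvise(2)");

    return lines;
}
```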
It is possible to query the kernel to determine which memory pages are currently in memory. This is accomplished by the mincore(2) system call, and its synopsis is as follows:
#include <sys/types.h>
#include <sys/mman.h>
int mincore(const void *addr, size_t len, char *vec);
The mincore(2) function accepts a starting address addr and a length of len bytes. All pages within this range are then reported by setting values in the vec character array. The array vec is expected to be large enough to contain all the values that must be reported. Each byte receives 1 if the page is in memory or 0 if the page is not in memory. The number of bytes required depends on the length of the region and the page size returned by the function getpagesize(3).
When successful, the value 0 is returned by mincore(2). Otherwise, -1 is returned, and the error is found in the variable errno.
The following shows a call to mincore(2):
char vec[32]; /* Reports for up to 32 pages */

if ( mincore(addr,len,&vec[0]) == -1 )
    perror("mincore(2)"); /* Report error */
Table 26.3 shows that support for mincore(2) is not available on many platforms. Also, note that the argument addr is type caddr_t on non-BSD platforms.
mincore(2) Support | Platform | ||||||
---|---|---|---|---|---|---|---|
FreeBSD | SGI IRIX 6.5 | HPUX 11 | UnixWare 7 | Solaris 8 | IBM AIX 4.3 | Linux | |
mincore(2) | X | X | X | X | |||
const void *addr | X | ||||||
caddr_t addr | X | X | X |
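Rather than guessing a fixed size for vec, the array can be sized from the region length and the page size, as the description of mincore(2) suggests. The sketch below is one way to do this; the helper names pages_spanned() and count_resident() are hypothetical. It assumes addr is page-aligned, and it passes vec through a void * cast because the element type of vec differs between platforms (char on BSD, unsigned char on Linux). Platforms that declare addr as caddr_t would need a further cast. Only the low-order bit of each byte is tested.

```c
#define _DEFAULT_SOURCE          /* exposes mincore() and getpagesize() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>

/* Number of vec bytes needed: one per page, rounding up */
size_t pages_spanned(size_t len, size_t pagesz) {
    assert(pagesz > 0);
    return (len + pagesz - 1) / pagesz;
}

/*
 * Hypothetical helper: count the pages of [addr, addr+len) that are
 * currently in memory. Returns the count, or -1 on failure.
 */
long count_resident(void *addr, size_t len) {
    size_t n = pages_spanned(len, (size_t)getpagesize());
    unsigned char *vec = malloc(n ? n : 1);
    long resident = 0;
    size_t i;

    if (vec == NULL)
        return -1;

    /* void * converts to either char * or unsigned char * */
    if (mincore(addr, len, (void *)vec) == -1) {
        free(vec);
        return -1;
    }

    for (i = 0; i < n; ++i)
        if (vec[i] & 1)          /* low bit set: page is in memory */
            ++resident;

    free(vec);
    return resident;
}
```

The result is a snapshot only: a page reported as resident may be paged out again before the caller acts on the answer.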
When changes are made to writable mapped regions of memory, there are various timing choices for recording changes into the file. The msync(2) system call provides a degree of control over this choice. Its function synopsis is as follows:
#include <sys/types.h>
#include <sys/mman.h>
int msync(void *addr, size_t len, int flags);
The msync(2) call affects the region starting at addr for a length of len bytes. When len is 0, all of the pages of the region are affected. Argument flags determines what synchronization choice is to take effect:
MS_ASYNC | Request all changes to be written out, but return immediately. (Not implemented for FreeBSD release 3.4.) |
MS_SYNC | Perform synchronous writes of all outstanding changes. |
MS_INVALIDATE | Immediately invalidate all cached modifications to pages. Future references to these pages require the pages to be fetched from the file. |
The MS_SYNC flag is similar to calling fsync(2) on an open file descriptor. It forces all changes out to the disk media and returns once this has been accomplished. The MS_INVALIDATE flag allows the application to discard all changes that have been made. This saves the kernel from synchronizing the memory region with the file.
The function msync(2) returns 0 when successful. Otherwise, -1 is returned with the error code deposited in errno. The following shows an example of an msync(2) call that causes all changes to be written to the file before the call returns:
if ( msync(addr,0,MS_SYNC) == -1 )
    perror("msync(2)");
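For a fuller picture, the sketch below modifies a file through a shared mapping and then flushes the change with MS_SYNC. It is an illustration only: the helper name patch_file() is hypothetical, and it passes an explicit length to msync(2) on the assumption that the zero-length shorthand is not honored in the same way on every platform.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: overwrite the first strlen(text) bytes of an
 * existing file through a MAP_SHARED mapping, then force the change
 * out with MS_SYNC. Returns 0 on success, or -1 on failure
 * (including a file too short to hold the text).
 */
int patch_file(const char *path, const char *text) {
    size_t n = strlen(text);
    struct stat sb;
    char *p;
    int fd, rc;

    assert(n > 0);

    fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;

    /* Refuse to write past the end of the file */
    if (fstat(fd, &sb) == -1 || (size_t)sb.st_size < n) {
        close(fd);
        return -1;
    }

    p = mmap(NULL, n, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                   /* the mapping remains valid after close */
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, text, n);          /* modify the file through memory */

    rc = msync(p, n, MS_SYNC);   /* returns once the write is complete */
    munmap(p, n);
    return rc;
}
```

A MAP_SHARED mapping is required here; changes made through a MAP_PRIVATE mapping are never written back to the file, so msync(2) would have nothing to flush.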
Table 26.4 shows the support available for msync(2) on the different platforms.