The Swap Cache

The swap cache is crucial to avoid race conditions among processes trying to access pages that are being swapped.

If a page is owned by a single process (or better, if the page belongs to an address space that is owned by one or more clone processes), there is just one race condition to be considered: the process attempts to address a page that is being swapped out. An array of semaphores, one per page slot, could be used to block the process until the I/O data transfer completes.

In many cases, however, a page is owned by several processes. Again, the same array of semaphores could suffice to avoid race conditions, provided that the kernel is able to quickly locate all Page Table entries that refer to the page to be swapped out. In that case, the kernel could ensure that either all processes see the same page frame or all of them see the swapped-out page identifier.

Unfortunately, there is no quick way in Linux 2.4 to derive from the page frame the list of processes that own it.[110] Scanning all Page Table entries of all processes looking for an entry with a given physical address is very costly, and it is done only on rare occasions (for instance, when deactivating a swap area).

As a result, the same page may be swapped out for some processes and present in memory for others. The kernel avoids the race conditions induced by this peculiar scenario by means of the swap cache.

Before describing how the swap cache works, let’s recall when a page frame may be shared among several processes:

  • The page frame is associated with a shared nonanonymous memory mapping (see Chapter 15).

  • The page frame is handled by means of Copy On Write, typically because a new process has been forked or because the page frame belongs to a private memory mapping (see Section 8.4.4).

  • The page frame is allocated to an IPC shared memory resource (see Section 19.3.5) or to a shared anonymous memory mapping.

Of course, a page frame is also shared by several processes if they share the memory descriptor and thus the whole set of Page Tables. Recall that such processes are created by passing the CLONE_VM flag to the clone( ) system call (see Section 3.4.1). All clone processes, however, count as a single process as far as the swapping algorithm is concerned. Therefore, here we use the term “processes” to mean “processes owning different memory descriptors.”

As we shall see later in this chapter, page frames used for shared nonanonymous memory mappings are never swapped out. Instead, they are handled by another kernel function that writes their data to the proper files and discards them. However, the other two kinds of shared page frames must be carefully handled by the swapping algorithm by means of the swap cache.

The swap cache collects shared page frames that have been copied to swap areas. It does not exist as a data structure on its own; instead, the pages in the regular page cache are considered to be in the swap cache if certain fields are set.

Shared page swapping works in the following manner: consider a page P that is shared between two processes, A and B. Suppose that the swapping algorithm scans the page frames of process A and selects P for swapping out: it allocates a new page slot and copies the data stored in P into the new page slot. It then puts the swapped-out page identifier in the corresponding Page Table entry of process A. Finally, it invokes __free_page( ) to release the page frame. However, the page’s usage counter does not become 0 since P is still owned by B. Thus, the swapping algorithm succeeds in transferring the page into the swap area, but fails to reclaim the corresponding page frame.

Suppose now that the swapping algorithm scans the page frames of process B at a later time and selects P for swapping out. The kernel must recognize that P has already been transferred into a swap area so the page won’t be swapped out a second time. Moreover, it must be able to derive the swapped-out page identifier so it can increase the page slot usage counter.

Figure 16-3 illustrates schematically the actions performed by the kernel on a shared page that is swapped out from multiple processes at different times. The numbers inside the swap area and inside P represent the page slot usage counter and the page usage counter, respectively. Notice that each usage count includes every process that is using the page or page slot, plus the swap cache if the page is included in it. Four stages are shown:

  1. In (a), P is present in the Page Tables of both A and B.

  2. In (b), P has been swapped out from A’s address space.

  3. In (c), P has been swapped out from both the address spaces of A and B, but is still included in the swap cache.

  4. Finally, in (d), P has been released to the buddy system.

Figure 16-3. The role of the swap cache
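
The counter bookkeeping across the four stages can be traced with a small user-space model. This is only a sketch derived from the counting rule stated above (each counter includes every process using the page or page slot, plus the swap cache); the structure and the model_* function names are hypothetical and are not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct shared_page_model {
    int page_count;      /* usage counter of page frame P */
    int slot_count;      /* usage counter of the page slot */
    bool in_swap_cache;  /* true while P is included in the swap cache */
    bool frame_freed;    /* true once P went back to the buddy system */
};

/* Stage (a): P is mapped by nprocs processes and no page slot is in use. */
void model_init(struct shared_page_model *m, int nprocs)
{
    m->page_count = nprocs;
    m->slot_count = 0;
    m->in_swap_cache = false;
    m->frame_freed = false;
}

/* Swap P out of the address space of one process that still maps it. */
void model_swap_out_one(struct shared_page_model *m)
{
    /* At least one process (not just the swap cache) must still map P. */
    assert(!m->frame_freed);
    assert(m->page_count - (m->in_swap_cache ? 1 : 0) > 0);

    if (!m->in_swap_cache) {
        /* First swap-out: the swap cache starts using frame and slot. */
        m->in_swap_cache = true;
        m->page_count++;
        m->slot_count++;
    }
    m->slot_count++;  /* the Page Table entry now refers to the slot */
    m->page_count--;  /* the process no longer refers to the frame */
}

/*
 * Stage (d): once only the swap cache still uses P, drop the page from
 * the cache and release the frame. Returns false if P is still in use.
 */
bool model_release_frame(struct shared_page_model *m)
{
    if (!m->in_swap_cache || m->page_count != 1)
        return false;
    m->in_swap_cache = false;
    m->page_count--;  /* the swap cache stops using the frame */
    m->slot_count--;  /* ... and the page slot */
    m->frame_freed = true;
    return true;
}
```

Starting from model_init( ) with two processes, two calls to model_swap_out_one( ) followed by model_release_frame( ) walk through stages (a) to (d); in this model the frame cannot be released while B still maps P.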

The swap cache is implemented by the page cache data structures and procedures, which are described in Section 14.1. Recall that the core of the page cache is a hash table that allows the algorithm to quickly derive the address of a page descriptor from the address of an address_space object identifying the owner of the page as well as from an offset value.

Pages in the swap cache are stored like any other page in the page cache, with the following special treatment:

  • The mapping field of the page descriptor points to an address_space object stored in the swapper_space variable.

  • The index field stores the swapped-out page identifier associated with the page.

Moreover, when the page is put in the swap cache, both the count field of the page descriptor and the page slot usage counters are incremented, since the swap cache uses both the page frame and the page slot.
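
To make the two fields concrete, here is a minimal sketch of how membership in the swap cache could be tested and how the swapped-out page identifier could be recovered from a page descriptor. The types are simplified stand-ins that model only the mapping and index fields; every name ending in _sketch is an assumption made for illustration, not a kernel definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; only the fields named in the text are modeled. */
struct address_space_sketch {
    int placeholder;
};

struct page_sketch {
    struct address_space_sketch *mapping; /* owner of the page in the page cache */
    unsigned long index;                  /* offset inside the owner; for a swap
                                             cache page, the swapped-out page
                                             identifier */
};

/* Plays the role of the kernel's swapper_space variable. */
struct address_space_sketch swapper_space_sketch;

/* A page is in the swap cache if its owner is the swapper address space. */
bool in_swap_cache_sketch(const struct page_sketch *page)
{
    return page->mapping == &swapper_space_sketch;
}

/*
 * Stores the swapped-out page identifier in *id and returns true if the
 * page is in the swap cache; otherwise returns false and leaves *id alone.
 */
bool swapped_out_id_sketch(const struct page_sketch *page, unsigned long *id)
{
    if (!in_swap_cache_sketch(page))
        return false;
    *id = page->index;
    return true;
}
```

This is the property that lets the kernel recognize, when it later scans process B, that P already has a page slot and which slot it is.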

Swap Cache Helper Functions

The kernel uses several functions to handle the swap cache; they are based mainly on those discussed in Section 14.1. We show later how these relatively low-level functions are invoked by higher-level functions to swap pages in and out as needed.

The main functions that handle the swap cache are:

lookup_swap_cache( )

Finds a page in the swap cache through its swapped-out page identifier passed as a parameter and returns the address of the page descriptor. It returns 0 if the page is not present in the cache. It invokes find_get_page( ), passing as parameters the address of the swapper_space address_space object and the swapped-out page identifier to find the required page.

add_to_swap_cache( )

Inserts a page into the swap cache. It essentially invokes swap_duplicate( ) to check whether the page slot passed as a parameter is valid and to increment the page slot usage counter; find_get_page( ) to make sure that no other page with the same address_space object and offset already exists; add_to_page_cache( ) to insert the page into the cache; and lru_cache_add( ) to insert the page in the inactive list (see the later Section 16.7.2).

delete_from_swap_cache( )

Removes a page from the swap cache by flushing its content to disk, clearing the PG_dirty flag, and invoking remove_page_from_inode_queue( ) and remove_page_from_hash_queue( ) (see Section 14.1.2).

free_page_and_swap_cache( )

Releases a page by invoking __free_page( ). If the caller is the only process that owns the page, this function also removes the page from the active or inactive list (see the later Section 16.7.2), removes the page from the swap cache by invoking delete_from_swap_cache( ), and frees the page slot on the swap area by flushing the page contents to disk and invoking swap_free( ).
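
The way these helpers cooperate can be sketched with a toy, user-space swap cache. The code below is an illustration under loud assumptions: a fixed array replaces the page cache hash table, disk I/O and the active and inactive lists are left out, the lookup takes no extra reference on the page, and undoing the slot increment when a duplicate is found is a choice of this sketch. All names ending in _toy are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 8   /* page slots in the toy swap area */

struct page_toy {
    int count;            /* page usage counter */
    int in_cache;         /* nonzero while the page is in the toy swap cache */
    unsigned long index;  /* swapped-out page identifier while cached */
};

static int slot_count_toy[TOY_SLOTS];          /* slot usage counters, 0 = free */
static struct page_toy *cache_toy[TOY_SLOTS];  /* identifier -> cached page */

/* Stand-in for allocating a page slot: its first user gets a count of 1. */
int slot_alloc_toy(unsigned long id)
{
    if (id >= TOY_SLOTS || slot_count_toy[id] != 0)
        return -1;
    slot_count_toy[id] = 1;
    return 0;
}

/* Role of swap_duplicate( ): validate the slot, then add one user. */
int swap_duplicate_toy(unsigned long id)
{
    if (id >= TOY_SLOTS || slot_count_toy[id] == 0)
        return 0;
    slot_count_toy[id]++;
    return 1;
}

/* Role of swap_free( ): drop one user of the slot. */
void swap_free_toy(unsigned long id)
{
    assert(id < TOY_SLOTS && slot_count_toy[id] > 0);
    slot_count_toy[id]--;
}

/* Role of lookup_swap_cache( ): returns NULL if the page is not cached. */
struct page_toy *lookup_swap_cache_toy(unsigned long id)
{
    return id < TOY_SLOTS ? cache_toy[id] : NULL;
}

/* Role of add_to_swap_cache( ): returns 0 on success, -1 on failure. */
int add_to_swap_cache_toy(struct page_toy *page, unsigned long id)
{
    if (!swap_duplicate_toy(id))
        return -1;                      /* invalid page slot */
    if (lookup_swap_cache_toy(id) != NULL) {
        swap_free_toy(id);              /* undo the increment (sketch's choice) */
        return -1;                      /* identifier already cached */
    }
    cache_toy[id] = page;
    page->in_cache = 1;
    page->index = id;
    page->count++;                      /* the cache now uses the page frame */
    return 0;
}

/* Role of delete_from_swap_cache( ), without the flush to disk. */
void delete_from_swap_cache_toy(struct page_toy *page)
{
    assert(page->in_cache);
    cache_toy[page->index] = NULL;
    page->in_cache = 0;
    page->count--;                      /* the cache stops using the frame */
    swap_free_toy(page->index);         /* ... and the page slot */
}

/*
 * Role of free_page_and_swap_cache( ): drop the caller's reference; if the
 * caller was the only owner besides the cache, also empty the cache entry.
 * Returns nonzero if the page frame ends up unused.
 */
int free_page_and_swap_cache_toy(struct page_toy *page)
{
    if (page->in_cache && page->count == 2)
        delete_from_swap_cache_toy(page);
    page->count--;
    return page->count == 0;
}
```

In this toy, a page owned by one process and added to the cache under slot 3 has a page counter of 2 and a slot counter of 2; releasing it through free_page_and_swap_cache_toy( ) leaves the frame unused and the slot counter at 1, the reference held by the Page Table entry that points to the slot.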



[110] One of the hot features of Linux 2.5 consists of a data structure that allows the kernel to quickly get a list of all processes that share a given page.
