26.7OSAPIIdeas 425
26.7OSAPIIdeas
All interactions that the memory manager has with the underlying operating sys-
tem are tucked away in the
OSAPI. The OSAPI has to be able to tell immediately
what kind of page an allocation comes from and be able to return the pointer to
the start of that page. This has to be very fast, or it becomes the bottleneck for all
other memory operations. It also must be able to allocate new pages and delete
pages that are empty and no longer needed.
There are many possible approaches to implementing the
OSAPI. Our re-
search on this has not yet reached a final best solution, and indeed, we believe
that the right solution depends on the project and platform. Even so, there are a
few ways to implement the
OSAPI’s page tracking, so we discuss some here.
The
OSAPIForWindows is implemented such that all SBA and MBA pages
are tracked in a
std::set, which is implemented as a red-black tree. While theo-
retically fast, each node requires its own allocation, which can both fragment the
underlying heap and be slow. Pages are allocated using the
VirtualAlloc()
function, which has some implications for ideal page size, especially on 32-bit
systems where the amount of address space mapped should be equal to the physi-
cal memory being mapped. Performance is not great, but it does work well as a
rudimentary example of how to implement the features required by the interface.
It also suffers due to the many calls to
VirtualAlloc() and VirtualFree(),
causing thunks into the kernel to allocate memory. With this technique, operating
system functions show up on performance profiles—never a good sign.
The
OSAPIFastForWindows is an improved version that preallocates a big
chunk of memory and assigns pages from this chunk. To maximize flexible use
of this space, SBA and MBA pages are set to the same size and are allowed to
change from small- to medium-sized pages while the page is empty. This is ac-
complished simply by tracking the type of page with an array of bits and tracking
whether the page is in use with another array of bits. To limit time spent search-
ing for free pages, a simple start index is stored in the class that remembers the
last index to a freed page. It is not required to be accurate, but it does give a good
place to start searching and usually returns a hit on the first attempt to allocate a
page. However, this implementation has the downside of being rather static, and
if there are times when too many large allocations are made, or too many small or
medium allocations are made, the system could catastrophically fail.
Future systems could be written to have completely different allocation strat-
egies. For instance, it could be smart about allocating pages from lower memory
addresses and allocating large blocks from higher memory addresses until the
two collide somewhere in the middle.