26. A Highly Optimized Portable Memory Manager (1/4)

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

 409

AHighlyOptimizedPortable

MemoryManager

Jason Hughes

Steel Penny Games, Inc.

26.1Introduction

Every game has a memory manager of some sort. On PCs, this tends to be Mi-

crosoft’s C run-time library memory manager. On the various consoles, it’s likely

to either be the platform-specific memory manager that was written by the hard-

ware vendor or the one provided by the compiler company in its run-time library.

Do they all work the same way? No. Do they exhibit the same performance char-

acteristics? Absolutely no. Some allow the heap to become fragmented very

quickly, while others may be very slow for small allocations or when the number

of allocations becomes quite large, and still others may have a high per-allocation

overhead that invisibly eats away at your memory.

Memory allocation is a fundamental operation, thus, it has to satisfy a wide

number of use cases robustly and efficiently. This is a serious technical chal-

lenge. Even a good implementation can harm a game’s performance if exercised

in just the wrong way. A naive implementation can utterly cripple performance

or cause crashes due to artificial low-memory situations (e.g., fragmentation or

overhead). The good news is that most of the provided memory managers are

relatively efficient and work well enough for simple cases with few allocations.

After enough experiences where cross-platform stability and performance

came down strictly to the memory manager, however, you may be tempted to

implement your own that is scalable and easy to tune for best performance. These

days, with so many platforms to support, it’s a mark of quality for an engine to

run well across all machines.

410 26.AHighlyOptimizedPortableMemoryManager

AlternativestoRollingYourOwn

Surely someone has already written a good heap. Doug Lea’s dlmalloc is one of

the best and is typically the memory manager of choice on Linux. There are

many derivatives on the internet that claim some improvements, but there are

many flavors to choose from. dlmalloc is a good choice primarily because it’s

very stable and well-tested, it’s public domain code that is freely available, and it

runs on 32-bit and 64-bit systems of either endianness. Very compelling.

The main gripes with dlmalloc is that it is a mess to look at, it’s horrible to

debug in case of heap corruption, and it’s easily corrupted in case of a buffer

overrun. As for usability, there’s no way to create one-off heaps with it that have

been optimized for special use cases, where you can predict sizes and counts of

allocations. dlmalloc also has a fairly high per-allocation overhead that chews up

memory. As general-purpose heaps go, it’s excellent, but for games, we some-

times need custom solutions with more bells and whistles.

DesirableProperties

After writing about eight or nine different memory managers, a list of priorities

emerges from recognizing the importance of certain attributes that make a

memory manager good. Here are a few:

1. Must not thrash the CPU cache. High performance comes with limiting the

number of bytes that have to interact with RAM. Whatever is touched should

fit within a minimum number of cache lines, both to reduce memory latency

and to limit the amount of cache disruption to the rest of the program.

2. No searching for free space. The naive implementation of a memory manag-

er is a linked list of blocks that are marked free or empty. Scanning this list is

hugely expensive and slow under all circumstances. Good memory managers

have lookup tables of some sort that point to places where free blocks of

memory of various sizes are.

3. Minimum overhead per allocation. Reducing overhead in a memory manager

is almost like adding compression to memory—you can fit more data in the

same physical space.

4. Should be easy to debug. Sometimes memory problems happen, and it’s im-

portant to consider the difficulty in tracking down such issues. Sometimes

this means temporarily adding features to the memory manager. Ideally, the

debug build should do some basic instrumentation as it runs that determines

whether memory has been trampled without slowing down the system.

5. Should resist corruption. Most memory managers are organized such that

blocks of program data are sandwiched between heap tracking information.

26.2Overview 411

This is particularly fragile because buffer overruns tend to be utterly fatal to

the system, but not immediately. The system keeps running until the memory

manager has to allocate or free memory in that region and attempts to use a

pointer that is garbage. However, determining what exactly is smashing the

heap is a challenge, but one made less critical if the memory manager itself

doesn’t crash in response to a programming error. Ideally, program data

should only be able to trample other program data, and heap tracking infor-

mation is separated and thus protected.

6. Should not fragment easily. Fragmentation leads to situations where a small

allocation in the middle of memory prevents a large allocation from succeed-

ing, even though the total number of available bytes (if you add the separate

free chunks together) is more than enough to accommodate the allocation.

It’s also a common failure of memory managers to spend a lot of run-time

cycles reducing existing fragmentation or actively attempting to prevent it.

The wrong approach is creating complex policies that define where alloca-

tions should go, in what order allocations should be placed in the heap, or

how to coalesce adjacent free blocks. Some systems even go to the extent of

relocating memory to manually defragment the heap, similar to Java’s

memory model. In the end, preventing fragmentation should be a design cri-

terion that is inherent in the system, not paid for at run time.

26.2Overview

The following design is one such system that satisfies the minimum criteria, as

specified in the introduction, though there are many others possible. There are

certain trade-offs that are often made during large design processes that ultimate-

ly shape the final software product. Although we describe a specific instance of

the memory manager, the version found on the website is actually a template-

based solution that allows reconfiguration of many parameters so you can easily

experiment with what is right for your specific situation. In fact, it would be easy

to capture all allocations and deletions in a log file with your current memory

manager, then replay the operations in a unit test setting to measure exactly what

performance and memory fragmentation would be under realistic heap

conditions.

FightingFragmentationIsJob#1

From experience, the greatest problem that needs to be addressed by a games

memory manager is fragmentation. There are two types of fragmentation: inter-

nal and external.

412 26.AHighlyOptimizedPortableMemoryManager

■ Internal fragmentation occurs when a single allocation is requested for M

bytes, but the memory manager cannot allocate exactly M bytes, so it rounds

up by an extra N bytes, so



bytes are reserved for the allocation. Inter-

nal fragmentation can be considered a factor of systemic overhead, and is

tolerable if kept to a minimum, because all that unusable memory will be ful-

ly reclaimed when the block is freed.

■

External fragmentation occurs when memory is known to be free and availa-

ble by the memory manager but has been subdivided by allocations so that it

is not contiguous and, thus, cannot be allocated in a single large block. The

degree of fragmentation can be measured in many ways, but developers are

traditionally concerned with the “big block,” that is, the single largest alloca-

tion possible at any given time. The ratio of the big block to total available

memory is another meaningful way to express how fragmented memory is.

As that ratio tends toward zero, the number of free regions increases while

the sizes of free spaces decrease, and the largest single allocation possible

becomes a fraction of total available memory.

External fragmentation, henceforth referred to simply as fragmentation, is the

source of many apparent stability problems in otherwise well-written programs.

For games, this is particularly challenging because in this industry, memory tends

Figure 26.1. From left to right: the heap is free and unfragmented, one large allocation is

made, one small allocation is made, the large allocation is freed. In the rightmost dia-

gram, the heap is partitioned in half and exhibits an approximately 0.5 fragmentation

ratio.

FreeSpace

LargeAlloc

SmallAlloc Sm allAlloc

FreeSpace

26.2Overview 413

to be considerably more limited, and players are more likely to play the game

without resetting for hours or even days on end. Thus, the dreaded “uptime

crash” is almost unavoidable without special precautions, careful planning, and

perhaps a custom memory management strategy that attempts to avoid fragmen-

tation automatically.

For further clarification of how fragmentation can occur within a program,

see Figure 26.1. Fragmentation can be demonstrated in three simple operations:

allocate twice and free once. If one allocation is quite large and is followed by a

small allocation, then once you release the large block back to the system, the

memory manager now has an approximate 0.5 fragmentation ratio. If the next

allocation is slightly larger than 50 percent of the total available memory, it will

fail.

Most decent memory managers have allocation strategies that react different-

ly based on the size of the allocation. Often, there is a specific handler for tiny

allocations (under 256 bytes, for instance) as well as a general allocation method

for everything larger. The idea behind this is primarily to reduce overhead in al-

locating very small blocks of memory that tend to represent the lion’s share of

allocations in most C/C++ programs. One happy consequence is that it prevents

that group of allocations from possibly splitting the big block in half and causing

fragmentation by preventing them from coming from the same memory

pool.

By extending this understanding to the design of a memory manager, it is

possible to reduce external fragmentation, at some small expense of internal

fragmentation. To illuminate by exaggeration, if you round up all of your alloca-

tions to the largest size you’ll ever allocate, then no external fragmentation is

important because no allocation will ever fail due to a fragmented heap, and any

allocation that gets freed can always fit any other allocation that may come in the

future. Of course, all of your memory would be exhausted before that could hap-

pen! Preposterous as it is, scaling back the idea and applying it to smaller alloca-

tions, where rounding up the size is less significant, makes the concept viable.

The trick is to make it fast and limit the amount of internal fragmentation (i.e.,

wasted space) the strategy produces.

Our approach is to make a very fast small block allocator (SBA) for alloca-

tions under 256 bytes, a reasonably fast medium block allocator (MBA) for allo-

cations that are larger than 256 bytes but smaller than a large allocation, and a

large block allocator (LBA) for any allocation of at least 4096 bytes. As men-

tioned above, the code on the website is templatized, so these numbers can be

modified trivially for your purposes, and when set to equal sizes, you can com-

pletely remove the SBA, the MBA, or both.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 26. A Highly Optimized Portable Memory Manager (1/4)

Create new playlist

Sign In

Sign Up

Table of Contents for
26. A Highly Optimized Portable Memory Manager (1/4)