Most implementations of malloc
and free
use memory-management algorithms that are necessarily based on the sizes of objects. The first-fit algorithm used in the previous chapter is an example. In some applications, deallocations are grouped and occur at the same time. Graphical user interfaces are an example. Space for scroll bars, buttons, and so forth, is allocated when a window is created, and deallocated when the window is destroyed. A compiler is another example. lcc
, for example, allocates memory as it compiles a function and deallocates all of that memory at once when it finishes compiling the function.
Memory-management algorithms based on the lifetimes of objects are often better for these kinds of applications. Stack-based allocation is an example of this class of allocation algorithms, but it can be used only if object lifetimes are nested, which often is not the case.
This chapter describes a memory-management interface and an implementation that uses arena-based algorithms, which allocate memory from an arena and deallocate entire arenas at once. Calling malloc
requires a subsequent call to free
. As discussed in the previous chapter, it’s easy to forget to call free
or, worse, to deallocate an object that has already been deallocated, or one that shouldn’t be deallocated.
With the arena-based allocator, there’s no obligation to call free
for every call to malloc
; there’s only a single call that deallocates all the memory allocated in an arena since the last deallocation. Both allocation and deallocation are more efficient, and there are no storage leaks. But the most important benefit of this scheme is that it simplifies code. Applicative algorithms allocate new data structures instead of changing existing ones. The arena-based allocator encourages simple applicative algorithms in place of algorithms that might be more space-efficient but are always more complex because they must remember when to call free
.
There are two disadvantages of the arena-based scheme: It can use more memory, and it can create dangling pointers. If an object is allocated in the wrong arena and that arena is deallocated before the program is done with the object, the program will reference either unallocated memory or memory that has been reused for another, perhaps unrelated, arena. It’s also possible to allocate objects in an arena that isn’t deallocated as early as expected, which creates a storage leak. In practice, however, arena management is so easy that these problems rarely occur.
The Arena
interface specifies two exceptions and functions that manage arenas and allocate memory from them:
<arena.h>≡ #ifndef ARENA_INCLUDED #define ARENA_INCLUDED #include "except.h" #define T Arena_T typedef struct T *T; extern const Except_T Arena_NewFailed; extern const Except_T Arena_Failed; <exported functions 91> #undef T #endif
Arenas are created and destroyed by
<exported functions 91>≡
extern T Arena_new (void);
extern void Arena_dispose(T *ap);
Arena_new
creates a new arena and returns an opaque pointer to the newly created arena. These pointers are passed to the other functions to specify an arena. If Arena_new
cannot allocate the arena, it raises the exception Arena_NewFailed
. Arena_dispose
frees the memory associated with the arena *ap
, disposes of the arena itself, and clears *ap
. It is a checked runtime error to pass a null ap
or *ap
to Arena_dispose
.
The allocation functions Arena_alloc
and Arena_calloc
are like the functions with similar names in the Mem
interface, except they allocate memory from an arena.
<exported functions 91>+≡
extern void *Arena_alloc (T arena, long nbytes,
const char *file, int line);
extern void *Arena_calloc(T arena, long count,
long nbytes, const char *file, int line);
extern void Arena_free (T arena);
Arena_alloc
allocates a block of at least nbytes
in arena
and returns a pointer to the first byte. The block is aligned on an addressing boundary that is suitable for the data with the strictest alignment requirement. The contents of the block are uninitialized. Arena_calloc
allocates a block large enough to hold an array of count
elements, each of size nbytes
, in arena
, and returns a pointer to the first byte. The block is aligned as for Arena_alloc
, and is initialized to zeros. It is a checked runtime error for count
or nbytes
to be nonpositive.
The last two arguments to Arena_alloc
and Arena_calloc
are the file name and the line number of the location of the call. If Arena_alloc
and Arena_calloc
cannot allocate the memory requested, they raise Arena_Failed
and pass file
and line
to Except_raise
so that the exception reports the location of the call. If file
is the null pointer, they supply the source locations within their implementations that raise Arena_Failed
.
Arena_free
deallocates all the storage in arena
, which amounts to deallocating everything that has been allocated in arena
since arena
was created or since the last call to Arena_free
for that arena.
It is a checked runtime error to pass a null T
to any routine in this interface. The routines in this interface can be used with those in the Mem
interface and with other allocators based on malloc
and free
.
<arena.c>≡ #include <stdlib.h> #include <string.h> #include "assert.h" #include "except.h" #include "arena.h" #define T Arena_T const Except_T Arena_NewFailed = { "Arena Creation Failed" }; const Except_T Arena_Failed = { "Arena Allocation Failed" }; <macros 98> <types 92> <data 96> <functions 93>
An arena describes a chunk of memory:
<types 92>≡
struct T {
T prev;
char *avail;
char *limit;
};
The prev
field points to the head of the chunk, which begins with an arena structure as described below, and the limit
field points just past the end of the chunk. The avail
field points to the chunk’s first free location; the space beginning at avail
and up to limit
is available for allocation.
To allocate N
bytes when N
does not exceed limit-avail
, avail
is incremented by N
and its previous value is returned. If N
exceeds limit-avail
, a new chunk is allocated by calling malloc
; the current value of *arena
is “pushed” by storing it at the beginning of the new chunk, the fields of arena
are initialized so they describe the new chunk, and allocation proceeds.
The arena structure thus heads a linked list of chunks in which the links are the prev
fields in copies of the arena structures that begin each chunk. Figure 6.1 shows the state of an arena after three chunks have been allocated. The shading denotes allocated space; chunks can vary in size and may end with unallocated space if allocations don’t exactly fill the chunks.
Arena_new
allocates and returns an arena structure with its fields set to null pointers, which denotes an empty arena:
<functions 93>≡
T Arena_new(void) {
T arena = malloc(sizeof (*arena));
if (arena == NULL)
RAISE(Arena_NewFailed);
arena->prev = NULL;
arena->limit = arena->avail = NULL;
return arena;
}
Arena_dispose
calls Arena_free
to deallocate the chunks in the arena; it then frees the arena structure itself and clears the pointer to the arena:
<functions 93>+≡
void Arena_dispose(T *ap) {
assert(ap && *ap);
Arena_free(*ap);
free(*ap);
*ap = NULL;
}
Arena
uses malloc
and free
instead of, say, Mem_alloc
and Mem_free
, so that it’s independent of other allocators.
Most allocations are trivial: They round the request amount up to the proper alignment boundary, increment the avail
pointer by the amount of the rounded request, and return the previous value.
<functions 93>+≡ void *Arena_alloc(T arena, long nbytes, const char *file, int line) { assert(arena); assert(nbytes > 0); <round nbytes up to an alignment boundary 95> while (nbytes > arena->limit - arena->avail) { <get a new chunk 95> } arena->avail += nbytes; return arena->avail - nbytes; }
As in the checking implementation of the Mem
interface, the size of the union
<types 92>+≡
union align {
int i;
long l;
long *lp;
void *p;
void (*fp)(void);
float f;
double d;
long double ld;
};
gives the minimum alignment on the host machine. Its fields are those that are most likely to have the strictest alignment requirements, and it is used to round up nbytes
:
<round nbytes up to an alignment boundary 95>≡ nbytes = ((nbytes + sizeof (union align) - 1)/ (sizeof (union align)))*(sizeof (union align));
For most calls, nbytes
is less than arena->limit - arena->avail
; that is, the chunk has at least nbytes
of free space, so the body of the while loop in Arena_alloc
above is not executed. If the request cannot be satisfied from the current chunk, a new chunk must be allocated. This wastes the free space at the end of current chunk, which is illustrated in the second chunk on the list shown in Figure 6.1.
After a new chunk is allocated, the current value of *arena
is saved at the beginning of the new chunk, and arena
’s fields are initialized so that allocation can continue:
<get a new chunk 95>≡ T ptr; char *limit; <ptr ← a new chunk 96> *ptr = *arena; arena->avail = (char *)((union header *)ptr + 1); arena->limit = limit; arena->prev = ptr; <types 92>+≡ union header { struct T b; union align a; };
The structure assignment *ptr = *arena
pushes *arena
by saving it at the beginning of the new chunk. The union header
ensures that arena->avail
is set to a properly aligned address for the first allocation in this new chunk.
As shown below, Arena_free
keeps a few free chunks on a free list emanating from freechunks
to reduce the number of times it must call malloc
. This list is threaded through the prev
fields of the chunks’ initial arena structures, and the limit
fields of those structures point just past the ends of their chunks. nfree
is the number of chunks on the list. Arena_alloc
gets a free chunk from this list or by calling malloc
, and it sets the local variable limit
for use in <get a new chunk
95
> above:
<data 96>+≡ static T freechunks; static int nfree; <ptr ← a new chunk 96>≡ if ((ptr = freechunks) != NULL) { freechunks = freechunks->prev; nfree--; limit = ptr->limit; } else { long m = sizeof (union header) + nbytes + 10*1024; ptr = malloc(m); if (ptr == NULL) <raise Arena_Failed 96> limit = (char *)ptr + m; }
If a new chunk must be allocated, one is requested that is large enough to hold an arena structure plus nbytes
, and have 10K bytes of available space left over. If malloc
returns null, allocation fails and Arena_alloc
raises Arena_Failed
:
<raise Arena_Failed 96>≡
{
if (file == NULL)
RAISE(Arena_Failed);
else
Except_raise(&Arena_Failed, file, line);
}
Once arena
points to the new chunk, the while loop in Arena_alloc
tries the allocation again. It still might fail: If the new chunk came from freechunks
, it might be too small to fill the request, which is why there’s a while loop instead of an if statement.
Arena_calloc
simply calls Arena_alloc
:
<functions 93>+≡
void *Arena_calloc(T arena, long count, long nbytes,
const char *file, int line) {
void *ptr;
assert(count > 0);
ptr = Arena_alloc(arena, count*nbytes, file, line);
memset(ptr, ' ', count*nbytes);
return ptr;
}
An arena is deallocated by adding its chunks to the list of free chunks, which also restores *arena
to its initial state as the list is traversed.
<functions 93>+≡ void Arena_free(T arena) { assert(arena); while (arena->prev) { struct T tmp = *arena->prev; <free the chunk described by arena 98> *arena = tmp; } assert(arena->limit == NULL); assert(arena->avail == NULL); }
The structure assignment to tmp
copies to tmp
all of the fields of the arena structure pointed to by arena->prev
. This assignment and the assignment *arena = tmp
thus “pops” the stack of arena structures formed by the list of chunks. Once the entire list is traversed, all of the fields of arena
should be null.
freechunks
accumulates free chunks from all arenas and thus could get large. The length of the list isn’t a problem, but the free storage it holds might be. Chunks on freechunks
look like allocated memory to other allocators and thus might make calls to malloc
fail, for example. To avoid tying up too much storage, Arena_free
keeps no more than
<macros 98>≡
#define THRESHOLD 10
free chunks on freechunks
. Once nfree
reaches THRESHOLD
, subsequent chunks are deallocated by calling free
:
<free the chunk described by arena 98>≡
if (nfree < THRESHOLD) {
arena->prev->prev = freechunks;
freechunks = arena->prev;
nfree++;
freechunks->limit = arena->limit;
} else
free(arena->prev);
In Figure 6.2, the chunk on the left is to be deallocated. When nfree
is less than THRESHOLD
, the chunk is added to freechunks
. The deallocated chunk is shown on the right, and the dotted lines depict the pointers planted by the three assignments in the code above.
Arena-based allocators are also known as pool allocators, and have been described several times. Arena
’s allocator (Hanson 1990) was originally developed for use in lcc
(Fraser and Hanson 1995). lcc
’s allocator is slightly simpler than Arena
’s: Its arenas are allocated statically, and its deallocator doesn’t call free
. In its initial versions, allocation was done by macros that manipulated arena structures directly and called a function only when a new chunk was needed.
Barrett and Zorn (1993) describe how to choose the appropriate arena automatically. Their experiments suggest that the execution path to an allocation site is a good predictor of the lifetime of the block allocated at that site. This information includes the call chain and the address of the allocation site, and it is used to choose one of several application-specific arenas.
Vmalloc (Vo 1996) is a more general allocator that can be used to implement both the Mem
and Arena
interfaces. Vmalloc permits clients to organize memory into regions and to provide functions that manage the memory in each region. The Vmalloc library includes an implementation of the malloc
interface that provides memory checks similar to those done by Mem
’s checking implementation, and these checks can be controlled by setting environment variables.
Arena-based allocation collapses many explicit deallocations into one. Garbage collectors go one step further: They avoid all explicit deallocations. In languages with garbage collectors, programmers can almost ignore storage allocation, and storage allocation bugs can’t occur. The advantages of this property are hard to overstate.
With a garbage collector, space is reclaimed automatically as necessary, usually when an allocation request can’t be filled. A garbage collector finds all blocks that are referenced by program variables, and all blocks that are referenced by fields in these blocks, and so on. These are the accessible blocks; the rest are inaccessible and can be reused. There is a large body of literature on garbage collection: Appel (1991) is a brief survey that emphasizes recent algorithms, and Knuth (1973a) and Cohen (1981) cover the older algorithms in more depth.
To find accessible blocks, most collectors must know which variables point to blocks and which fields in blocks point to other blocks. Collectors are usually used in languages that have enough compile-time or runtime data to supply the necessary information. Examples include LISP, Icon, SmallTalk, ML, and Modula-3. Conservative collectors (Boehm and Weiser 1988) can deal with languages that don’t provide enough type information, such as C and C++. They assume that any properly aligned bit pattern that looks like a pointer is one and that the block it points to is accessible. A conservative collector thus identifies some inaccessible blocks as accessible and therefore busy, but has no choice but to overestimate the set of accessible blocks. Despite this apparent handicap, conservative collectors work surprising well in some programs (Zorn 1993).
18.222.111.24