Chapter 6. More Memory Management

Most implementations of malloc and free use memory-management algorithms that are necessarily based on the sizes of objects. The first-fit algorithm used in the previous chapter is an example. In some applications, deallocations are grouped and occur at the same time. Graphical user interfaces are an example. Space for scroll bars, buttons, and so forth, is allocated when a window is created, and deallocated when the window is destroyed. A compiler is another example. lcc, for example, allocates memory as it compiles a function and deallocates all of that memory at once when it finishes compiling the function.

Memory-management algorithms based on the lifetimes of objects are often better for these kinds of applications. Stack-based allocation is an example of this class of allocation algorithms, but it can be used only if object lifetimes are nested, which often is not the case.

This chapter describes a memory-management interface and an implementation that uses arena-based algorithms, which allocate memory from an arena and deallocate entire arenas at once. Calling malloc requires a subsequent call to free. As discussed in the previous chapter, it’s easy to forget to call free or, worse, to deallocate an object that has already been deallocated, or one that shouldn’t be deallocated.

With the arena-based allocator, there’s no obligation to call free for every call to malloc; there’s only a single call that deallocates all the memory allocated in an arena since the last deallocation. Both allocation and deallocation are more efficient, and there are no storage leaks. But the most important benefit of this scheme is that it simplifies code. Applicative algorithms allocate new data structures instead of changing existing ones. The arena-based allocator encourages simple applicative algorithms in place of algorithms that might be more space-efficient but are always more complex because they must remember when to call free.

There are two disadvantages of the arena-based scheme: It can use more memory, and it can create dangling pointers. If an object is allocated in the wrong arena and that arena is deallocated before the program is done with the object, the program will reference either unallocated memory or memory that has been reused for another, perhaps unrelated, arena. It’s also possible to allocate objects in an arena that isn’t deallocated as early as expected, which creates a storage leak. In practice, however, arena management is so easy that these problems rarely occur.

Interface

The Arena interface specifies two exceptions and functions that manage arenas and allocate memory from them:

<arena.h>≡
   #ifndef ARENA_INCLUDED
   #define ARENA_INCLUDED
   #include "except.h"

   #define T Arena_T
   typedef struct T *T;

   extern const Except_T Arena_NewFailed;
   extern const Except_T Arena_Failed;

   <exported functions 91>

   #undef T
   #endif

Arenas are created and destroyed by

<exported functions 91>≡
   extern T    Arena_new    (void);
   extern void Arena_dispose(T *ap);

Arena_new creates a new arena and returns an opaque pointer to the newly created arena. These pointers are passed to the other functions to specify an arena. If Arena_new cannot allocate the arena, it raises the exception Arena_NewFailed. Arena_dispose frees the memory associated with the arena *ap, disposes of the arena itself, and clears *ap. It is a checked runtime error to pass a null ap or *ap to Arena_dispose.

The allocation functions Arena_alloc and Arena_calloc are like the functions with similar names in the Mem interface, except they allocate memory from an arena.

<exported functions 91>+≡
   extern void *Arena_alloc (T arena, long nbytes,
       const char *file, int line);
   extern void *Arena_calloc(T arena, long count,
       long nbytes, const char *file, int line);
   extern void  Arena_free  (T arena);

Arena_alloc allocates a block of at least nbytes in arena and returns a pointer to the first byte. The block is aligned on an addressing boundary that is suitable for the data with the strictest alignment requirement. The contents of the block are uninitialized. Arena_calloc allocates a block large enough to hold an array of count elements, each of size nbytes, in arena, and returns a pointer to the first byte. The block is aligned as for Arena_alloc, and is initialized to zeros. It is a checked runtime error for count or nbytes to be nonpositive.

The last two arguments to Arena_alloc and Arena_calloc are the file name and the line number of the location of the call. If Arena_alloc and Arena_calloc cannot allocate the memory requested, they raise Arena_Failed and pass file and line to Except_raise so that the exception reports the location of the call. If file is the null pointer, they supply the source locations within their implementations that raise Arena_Failed.

Arena_free deallocates all the storage in arena, which amounts to deallocating everything that has been allocated in arena since arena was created or since the last call to Arena_free for that arena.

It is a checked runtime error to pass a null T to any routine in this interface. The routines in this interface can be used with those in the Mem interface and with other allocators based on malloc and free.

Implementation

<arena.c>≡
   #include <stdlib.h>
   #include <string.h>
   #include "assert.h"
   #include "except.h"
   #include "arena.h"
   #define T Arena_T

   const Except_T Arena_NewFailed =
       { "Arena Creation Failed" };
   const Except_T Arena_Failed    =
       { "Arena Allocation Failed" };

   <macros 98>
   <types 92>
   <data 96>
   <functions 93>

An arena describes a chunk of memory:

<types 92>≡
   struct T {
       T prev;
       char *avail;
       char *limit;
   };

The prev field points to the head of the chunk, which begins with an arena structure as described below, and the limit field points just past the end of the chunk. The avail field points to the chunk’s first free location; the space beginning at avail and up to limit is available for allocation.

To allocate N bytes when N does not exceed limit-avail, avail is incremented by N and its previous value is returned. If N exceeds limit-avail, a new chunk is allocated by calling malloc; the current value of *arena is “pushed” by storing it at the beginning of the new chunk, the fields of arena are initialized so they describe the new chunk, and allocation proceeds.

The arena structure thus heads a linked list of chunks in which the links are the prev fields in copies of the arena structures that begin each chunk. Figure 6.1 shows the state of an arena after three chunks have been allocated. The shading denotes allocated space; chunks can vary in size and may end with unallocated space if allocations don’t exactly fill the chunks.

An arena with three chunks

Figure 6.1. An arena with three chunks

Arena_new allocates and returns an arena structure with its fields set to null pointers, which denotes an empty arena:

<functions 93>≡
   T Arena_new(void) {
       T arena = malloc(sizeof (*arena));
       if (arena == NULL)
           RAISE(Arena_NewFailed);
       arena->prev = NULL;
       arena->limit = arena->avail = NULL;
       return arena;
   }

Arena_dispose calls Arena_free to deallocate the chunks in the arena; it then frees the arena structure itself and clears the pointer to the arena:

<functions 93>+≡
   void Arena_dispose(T *ap) {
       assert(ap && *ap);
       Arena_free(*ap);
       free(*ap);
       *ap = NULL;
   }

Arena uses malloc and free instead of, say, Mem_alloc and Mem_free, so that it’s independent of other allocators.

Most allocations are trivial: They round the request amount up to the proper alignment boundary, increment the avail pointer by the amount of the rounded request, and return the previous value.

<functions 93>+≡
   void *Arena_alloc(T arena, long nbytes,
       const char *file, int line) {
       assert(arena);
       assert(nbytes > 0);
       <round nbytes up to an alignment boundary 95>
       while (nbytes > arena->limit - arena->avail) {
           <get a new chunk 95>
       }
       arena->avail += nbytes;
       return arena->avail - nbytes;
   }

As in the checking implementation of the Mem interface, the size of the union

<types 92>+≡
   union align {
       int i;
       long l;
       long *lp;
       void *p;
       void (*fp)(void);
       float f;
       double d;
       long double ld;
   };

gives the minimum alignment on the host machine. Its fields are those that are most likely to have the strictest alignment requirements, and it is used to round up nbytes:

<round nbytes up to an alignment boundary 95>≡
   nbytes = ((nbytes + sizeof (union align) - 1)/
       (sizeof (union align)))*(sizeof (union align));

For most calls, nbytes is less than arena->limit - arena->avail; that is, the chunk has at least nbytes of free space, so the body of the while loop in Arena_alloc above is not executed. If the request cannot be satisfied from the current chunk, a new chunk must be allocated. This wastes the free space at the end of current chunk, which is illustrated in the second chunk on the list shown in Figure 6.1.

After a new chunk is allocated, the current value of *arena is saved at the beginning of the new chunk, and arena’s fields are initialized so that allocation can continue:

<get a new chunk 95>≡
   T ptr;
   char *limit;
   <ptr ← a new chunk 96>
   *ptr = *arena;
   arena->avail = (char *)((union header *)ptr + 1);
   arena->limit = limit;
   arena->prev  = ptr;

<types 92>+≡
   union header {
       struct T b;
       union align a;
   };

The structure assignment *ptr = *arena pushes *arena by saving it at the beginning of the new chunk. The union header ensures that arena->avail is set to a properly aligned address for the first allocation in this new chunk.

As shown below, Arena_free keeps a few free chunks on a free list emanating from freechunks to reduce the number of times it must call malloc. This list is threaded through the prev fields of the chunks’ initial arena structures, and the limit fields of those structures point just past the ends of their chunks. nfree is the number of chunks on the list. Arena_alloc gets a free chunk from this list or by calling malloc, and it sets the local variable limit for use in <get a new chunk 95> above:

<data 96>+≡
  static T freechunks;
  static int nfree;

<ptr ← a new chunk 96>≡
  if ((ptr = freechunks) != NULL) {
      freechunks = freechunks->prev;
      nfree--;
      limit = ptr->limit;
  } else {
      long m = sizeof (union header) + nbytes + 10*1024;
      ptr = malloc(m);
      if (ptr == NULL)
          <raise Arena_Failed 96>
      limit = (char *)ptr + m;
  }

If a new chunk must be allocated, one is requested that is large enough to hold an arena structure plus nbytes, and have 10K bytes of available space left over. If malloc returns null, allocation fails and Arena_alloc raises Arena_Failed:

<raise Arena_Failed 96>≡
   {
        if (file == NULL)
        RAISE(Arena_Failed);
   else
        Except_raise(&Arena_Failed, file, line);
   }

Once arena points to the new chunk, the while loop in Arena_alloc tries the allocation again. It still might fail: If the new chunk came from freechunks, it might be too small to fill the request, which is why there’s a while loop instead of an if statement.

Arena_calloc simply calls Arena_alloc:

<functions 93>+≡
   void *Arena_calloc(T arena, long count, long nbytes,
       const char *file, int line) {
       void *ptr;

       assert(count > 0);
       ptr = Arena_alloc(arena, count*nbytes, file, line);
       memset(ptr, '', count*nbytes);
       return ptr;
   }

An arena is deallocated by adding its chunks to the list of free chunks, which also restores *arena to its initial state as the list is traversed.

<functions 93>+≡
   void Arena_free(T arena) {
       assert(arena);
       while (arena->prev) {
           struct T tmp = *arena->prev;
           <free the chunk described by arena 98>
           *arena = tmp;
       }
       assert(arena->limit == NULL);
       assert(arena->avail == NULL);
   }

The structure assignment to tmp copies to tmp all of the fields of the arena structure pointed to by arena->prev. This assignment and the assignment *arena = tmp thus “pops” the stack of arena structures formed by the list of chunks. Once the entire list is traversed, all of the fields of arena should be null.

freechunks accumulates free chunks from all arenas and thus could get large. The length of the list isn’t a problem, but the free storage it holds might be. Chunks on freechunks look like allocated memory to other allocators and thus might make calls to malloc fail, for example. To avoid tying up too much storage, Arena_free keeps no more than

<macros 98>≡
  #define THRESHOLD 10

free chunks on freechunks. Once nfree reaches THRESHOLD, subsequent chunks are deallocated by calling free:

<free the chunk described by arena 98>≡
   if (nfree < THRESHOLD) {
       arena->prev->prev = freechunks;
       freechunks = arena->prev;
       nfree++;
       freechunks->limit = arena->limit;
   } else
       free(arena->prev);

In Figure 6.2, the chunk on the left is to be deallocated. When nfree is less than THRESHOLD, the chunk is added to freechunks. The deallocated chunk is shown on the right, and the dotted lines depict the pointers planted by the three assignments in the code above.

Deallocating a chunk when nfree < THRESHOLD

Figure 6.2. Deallocating a chunk when nfree < THRESHOLD

Further Reading

Arena-based allocators are also known as pool allocators, and have been described several times. Arena’s allocator (Hanson 1990) was originally developed for use in lcc (Fraser and Hanson 1995). lcc’s allocator is slightly simpler than Arena’s: Its arenas are allocated statically, and its deallocator doesn’t call free. In its initial versions, allocation was done by macros that manipulated arena structures directly and called a function only when a new chunk was needed.

Barrett and Zorn (1993) describe how to choose the appropriate arena automatically. Their experiments suggest that the execution path to an allocation site is a good predictor of the lifetime of the block allocated at that site. This information includes the call chain and the address of the allocation site, and it is used to choose one of several application-specific arenas.

Vmalloc (Vo 1996) is a more general allocator that can be used to implement both the Mem and Arena interfaces. Vmalloc permits clients to organize memory into regions and to provide functions that manage the memory in each region. The Vmalloc library includes an implementation of the malloc interface that provides memory checks similar to those done by Mem’s checking implementation, and these checks can be controlled by setting environment variables.

Arena-based allocation collapses many explicit deallocations into one. Garbage collectors go one step further: They avoid all explicit deallocations. In languages with garbage collectors, programmers can almost ignore storage allocation, and storage allocation bugs can’t occur. The advantages of this property are hard to overstate.

With a garbage collector, space is reclaimed automatically as necessary, usually when an allocation request can’t be filled. A garbage collector finds all blocks that are referenced by program variables, and all blocks that are referenced by fields in these blocks, and so on. These are the accessible blocks; the rest are inaccessible and can be reused. There is a large body of literature on garbage collection: Appel (1991) is a brief survey that emphasizes recent algorithms, and Knuth (1973a) and Cohen (1981) cover the older algorithms in more depth.

To find accessible blocks, most collectors must know which variables point to blocks and which fields in blocks point to other blocks. Collectors are usually used in languages that have enough compile-time or runtime data to supply the necessary information. Examples include LISP, Icon, SmallTalk, ML, and Modula-3. Conservative collectors (Boehm and Weiser 1988) can deal with languages that don’t provide enough type information, such as C and C++. They assume that any properly aligned bit pattern that looks like a pointer is one and that the block it points to is accessible. A conservative collector thus identifies some inaccessible blocks as accessible and therefore busy, but has no choice but to overestimate the set of accessible blocks. Despite this apparent handicap, conservative collectors work surprising well in some programs (Zorn 1993).

Exercises

6.1

Arena_alloc looks only in the chunk described by arena. If there’s not enough free space in that chunk, it allocates a new chunk even if there is enough space in some other chunk further down the list. Change Arena_alloc so that it allocates space in an existing chunk if there’s one that has enough space, and measure the resulting benefits. Can you find an application whose memory use is reduced significantly by this change?

6.2

When Arena_alloc needs a new chunk, it takes the first one on the free list, if there is one. A better choice would be to find the largest free chunk that satisfies the request, allocating a new one only if freechunks doesn’t hold a suitable chunk. Keeping track of the largest chunk in freechunks would avoid fruitless traversals in this scheme. With this change, the while loop in Arena_alloc could be replaced with an if statement. Implement this scheme and measure its benefits. Does it make Arena_alloc noticeably slower? Does it use memory more efficiently?

6.3

Setting THRESHOLD to 10 means that free list will never hold more than about 100K bytes of memory, since Arena_alloc allocates chunks of at least 10K bytes. Devise a way for Arena_alloc and Arena_free to monitor allocation and deallocation patterns and to compute THRESHOLD dynamically based on these patterns. The goal is to keep the free list as small as possible and to minimize the number of calls to malloc.

6.4

Explain why the Arena interface doesn’t support the function

void *Arena_resize(void **ptr, long nbytes,
    const char *file, int line)

which, like Mem_resize, would change the size of the block pointed to by ptr to nbytes and return a pointer to the resized block, which would reside in the same arena (but not necessarily the same chunk) as the block given by ptr. How would you change the implementation to support this function? What checked runtime errors would you support?

6.5

In a stack allocator, an allocation pushes the new space onto the top of a specified stack and returns the address of its first byte. Marking a stack returns a value that encodes the current height of that stack, and deallocation pops a stack back to a previously marked height. Design and implement an interface for a stack allocator. What checked runtime errors can you provide that will catch deallocation errors? Examples of such errors are deallocating at a point higher than the current the top of a stack, or deallocating at a point that has already been deallocated and subsequently reallocated.

6.6

One problem with having more than one memory-allocation interface is that other interfaces must choose between them without knowing the best one for a particular application. Design and implement a single interface that supports both kinds of allocators. This interface might, for example, provide an allocation function that is like Mem_alloc but that operates in an “allocation environment,” which can be changed by other functions. This environment would specify memory-management details, such as which allocator and which arena to use, if it specified arena-based allocation. Other functions might, for example, push the current environment on an internal stack and establish a new environment, and pop the stack to reestablish a previous environment. Investigate these and other variations in your design.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.222.111.24