Fundamental heap management

Before addressing actual algorithms for garbage collection, we need to talk about allocation and deallocation of objects. We will also need to know which specific objects on the heap to garbage collect, and we need to briefly discuss how they get there and how they are removed.

Allocating and releasing objects

Allocation on a per-object basis normally, in the common case, never takes place directly on the heap. Rather, it is performed in thread local buffers or similar constructs that are promoted to the heap from time to time. However, in the end, allocation is still about finding appropriate space on the heap for the newly allocated objects or collections of objects.

In order to put allocated objects on the heap, the memory management system must keep track of which sections of the heap are free (that is, those which contain no live objects). Free heap space is usually managed by maintaining a free list—a linked list of the free memory chunks on the heap, prioritized in some order that makes sense. A best fit or first fit can then be performed on the free list in order to find a heap address where enough free space is available for the new object. There are many different algorithms for this, with different advantages.

Fragmentation and compaction

It is not enough to just keep track of free space in a useful manner. Fragmentation is also an issue for the memory manager. When dead objects are garbage collected all over the heap, we end up with a lot of holes from where objects have been removed.

Fragmentation is a serious scalability issue for garbage collection, as we can have a very large amount of free space on the heap that, even though it is free, is virtually unusable. This is because there is not enough contiguous space for the allocation of new objects—no hole is big enough. Typically, this will lead to the runtime system triggering more and more GCs in order to attempt clean up of the mess, but still isn't able to reclaim enough contiguous space for new objects. Untreated fragmentation is a classic performance death spiral.

The following figure shows a heap occupied by several objects:

Fragmentation and compaction

The heap section shown is completely occupied by live objects. The object A is two heap units in size and the other objects are just one heap unit in size. The application has two objects reachable from a program point where garbage collection takes place, A that references the object graph ABCD and E that references the object graph EFGH, where ABCD and EFGH are mutually independent.

If E is assigned null and thus removed from the scope of the program, E and its children can be garbage collected. The resulting heap after garbage collection will look like in the following figure:

Fragmentation and compaction

There is now free space on the heap, but even though there is a total amount of four free heap units, there is no place on the heap with more than one free unit in sequence. If the memory manager now attempts to find space for a new object that is, say, two heap units in size, an OutOfMemoryError will be thrown, even though there are four free units on the heap. This illustrates why fragmentation is problematic.

It follows that a memory management system needs some logic for moving objects around on the heap, in order to create larger contiguous free regions. This process is called compaction, and involves a separate GC stage where the heap is defragmented by moving live objects so that they are next to one another on the heap.

The following figure shows the heap from our example after compaction has taken place:

Fragmentation and compaction

Now, we have four consecutive free heap units, and so we are able to allocate larger objects than before with the same amount of free space.

Compaction is difficult to do without stopping the world, but we will discuss some ways of making it more efficient later in this chapter (and to some extent in Chapter 5 and 13).

By looking at the object reference graph and by gambling that objects referencing each other are likely to be accessed in sequence, the compaction algorithm may move these objects so that they are next to one another on the heap. This is beneficial for the cache, and hopefully, the object lifetimes are similar so that larger free heap holes are created upon reclamation.

Intrinsic properties of different garbage collection algorithms also prevent some degree of fragmentation (generational GCs) or allow for automatic compaction (stop and copy). These are discussed later in this chapter.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.147.85.181