Chapter 17. Garbage Collection and Memory

 

Civilization is a limitless multiplication of unnecessary necessaries.

 
 --Mark Twain

The Java virtual machine uses a technique known as garbage collection to determine when an object is no longer referenced within a program, and so can be safely reclaimed to free up memory space. This chapter teaches the basic ideas behind garbage collection, how the programmer can be involved in the garbage collection process, and how special reference objects can be used to influence when an object may be considered garbage.

Garbage Collection

Objects are created with new, but there is no corresponding delete operation to reclaim the memory used by an object. When you are finished with an object, you simply stop referring to it—change your reference to refer to another object or to null, or return from a method so its local variables no longer exist and hence refer to nothing. Objects that are no longer referenced are termed garbage, and the process of finding and reclaiming these objects is known as garbage collection.

The Java virtual machine uses garbage collection both to ensure any referenced object will remain in memory, and to free up memory by deallocating objects that are no longer reachable from references in executing code. This is a strong guarantee—an object will not be collected if it can be reached by following a chain of references starting with a root reference, that is, a reference directly accessible from executing code.

In simple terms, when an object is no longer reachable from any executable code, the space it occupies can be reclaimed. We use the phrase “can be” because space is reclaimed at the garbage collector's discretion, usually only if more space is needed or if the collector wants to avoid running out of memory. A program may exit without running out of space or even coming close and so may never need to perform garbage collection. An object is “no longer reachable” when no reference to the object exists in any variable of any currently executing method, nor can a reference to the object be found by starting from such variables and then following each field or array element, and so on.

Garbage collection means never having to worry about dangling references. In systems in which you directly control when objects are deleted, you can delete an object to which some other object still has a reference. That other reference is now dangling, meaning it refers to space that the system considers free. Space that is thought to be free might be allocated to a new object, and the dangling reference would then reference something completely different from what the object thought it referenced. This situation could cause all manner of havoc when the program uses the values in that space as if they were part of something they are not. Garbage collection solves the dangling reference problem for you because an object that's still referenced somewhere will never be garbage-collected and so will never be considered free. Garbage collection also solves the problem of accidentally deleting an object multiple times—something that can also cause havoc.

Garbage is collected without your intervention, but collecting garbage still takes work. Creating and collecting large numbers of objects can interfere with time-critical applications. You should design such systems to be judicious in the number of objects they create and so reduce the amount of garbage to be collected.

Garbage collection is not a guarantee that memory will always be available for new objects. You could create objects indefinitely, place them in lists, and continue doing so until there is no more space and no unreferenced objects to reclaim. You could create a memory leak by, for example, allowing a list of objects to refer to objects you no longer need. Garbage collection solves many, but not all, memory allocation problems.
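The kind of leak described above can be sketched in a few lines. The class RequestLog and its methods are hypothetical names invented for this illustration; the point is only that a long-lived list keeps every element strongly reachable until the references are dropped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, used only to illustrate unintended retention.
class RequestLog {
    // A long-lived list: everything added here stays strongly reachable.
    private final List<byte[]> handled = new ArrayList<>();

    void handle(byte[] request) {
        // ... process the request ...
        handled.add(request);   // retained even though it is never read again
    }

    int retainedCount() {
        return handled.size();
    }

    // Dropping the references is what makes the buffers eligible for collection.
    void forgetHandled() {
        handled.clear();
    }

    public static void main(String[] args) {
        RequestLog log = new RequestLog();
        for (int i = 0; i < 1000; i++)
            log.handle(new byte[1024]);
        System.out.println("retained: " + log.retainedCount());
        log.forgetHandled();
        System.out.println("retained: " + log.retainedCount());
    }
}
```

As long as the RequestLog object is reachable and forgetHandled is never invoked, no amount of garbage collection can reclaim the buffers.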

A Simple Model

Garbage collection is easier to understand with an explicit model so this section describes a simple one, but practical garbage collectors are far more sophisticated. Garbage collection is logically split into two phases: separating live objects from dead objects and then reclaiming the storage of the dead ones. Live objects are those that are reachable from running code—the objects that some action of your code can still potentially use. Dead objects are the garbage that can be reclaimed.

One obvious model of garbage collection is reference counting: When object X references object Y, the system increments a counter on Y, and when X drops its reference to Y, the system decrements the counter. When the counter reaches zero, Y is no longer live and can be collected, which will decrement the counts of any other objects to which Y refers.

Reference counting fails in the face of cycles, in which loops are created in the references. If X and Y reference each other, neither object's counter will ever become zero, and so neither X nor Y will ever be collected, nor will anything to which either object refers, directly or indirectly. Most garbage collectors do not use reference counting for this and other reasons.
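To make the cycle problem concrete, here is a toy model of reference counting. It is a sketch for illustration only: CountedObject, retain, pointTo, and release are invented names, and the counters are maintained by hand rather than by any virtual machine.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; not how any real virtual machine works.
class CountedObject {
    final String name;
    final List<CountedObject> fields = new ArrayList<>();
    int count = 0;          // number of references to this object
    boolean freed = false;

    CountedObject(String name) {
        this.name = name;
    }

    // A root (such as a local variable) starts referring to this object.
    void retain() {
        count++;
    }

    // This object stores a reference to target in one of its fields.
    void pointTo(CountedObject target) {
        fields.add(target);
        target.count++;
    }

    // One reference to this object is dropped.
    void release() {
        count--;
        if (count == 0) {
            freed = true;
            // Freeing this object drops every reference it held.
            for (CountedObject target : fields)
                target.release();
            fields.clear();
        }
    }

    public static void main(String[] args) {
        // A simple chain: dropping the root frees both objects.
        CountedObject a = new CountedObject("a");
        CountedObject b = new CountedObject("b");
        a.retain();
        a.pointTo(b);
        a.release();
        System.out.println(a.freed + " " + b.freed);

        // A cycle: dropping the root leaves both counters at one.
        CountedObject x = new CountedObject("x");
        CountedObject y = new CountedObject("y");
        x.retain();
        x.pointTo(y);
        y.pointTo(x);
        x.release();
        System.out.println(x.freed + " " + y.freed);
    }
}
```

In the second case nothing outside the cycle refers to x or y, yet each keeps the other's counter above zero, so neither is ever freed.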

The simplest model of garbage collection not subject to this problem is called mark-and-sweep. The name refers to the way the two phases of garbage collection are implemented. To find which objects are live, the garbage collector first determines a set of roots that contains the directly reachable objects: References in local variables on the stack, for example, are reachable because you can use those variables to manipulate the object. Objects referred to by local variables are therefore clearly live.

Once a set of roots is determined, the collector will mark the objects referenced by those roots as reachable. It will then examine references in each of those objects. If an object referred to by such a reference is already marked reachable from the first step, it is ignored. Otherwise, the object is marked reachable and its references are examined. This process continues until no more reachable objects remain unmarked. After this marking process is complete, the collector can reclaim the dead objects (those which are not marked) by sweeping them away.
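The two phases can be sketched on a toy object graph. Everything here (ToyHeap, Obj, allocate, collect) is a hypothetical model built for this illustration; it simulates marking and sweeping over ordinary Java objects and says nothing about how a real collector is implemented.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy mark-and-sweep over a simulated heap; a mental model only.
class ToyHeap {
    static final class Obj {
        final String name;
        final List<Obj> fields = new ArrayList<>();
        boolean marked;

        Obj(String name) {
            this.name = name;
        }
    }

    final List<Obj> allObjects = new ArrayList<>();
    final List<Obj> roots = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        allObjects.add(o);
        return o;
    }

    // Runs one collection and returns the names of the swept objects.
    List<String> collect() {
        // Mark phase: start from the roots and follow every reference.
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked)
                continue;           // already visited, ignore it
            o.marked = true;
            pending.addAll(o.fields);
        }

        // Sweep phase: anything left unmarked is dead.
        List<String> swept = new ArrayList<>();
        Iterator<Obj> it = allObjects.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (o.marked) {
                o.marked = false;   // reset for the next collection
            } else {
                swept.add(o.name);
                it.remove();
            }
        }
        return swept;
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        Obj a = heap.allocate("a");
        Obj b = heap.allocate("b");
        Obj c = heap.allocate("c");
        Obj d = heap.allocate("d");
        heap.roots.add(a);
        a.fields.add(b);
        c.fields.add(d);            // c and d form a cycle
        d.fields.add(c);            // that no root can reach
        System.out.println(heap.collect());
    }
}
```

Unlike reference counting, the unreachable cycle formed by c and d is swept, because marking never arrives at either object.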

Any change to the interconnection of objects during a run of mark-and-sweep will clearly interfere with the collection process. A marking run can miss an object that was unreachable at the beginning of the marking process, but which is assigned to a reachable reference in the middle. Running a basic mark-and-sweep pass requires freezing execution of the program, at least during the marking phase.

There are other problems with mark-and-sweep. Garbage collection is a complex area of research with no easy or universal answers. We present mark-and-sweep as a relatively simple mental model for you to use to understand garbage collection. Each virtual machine has its own collection strategy, and some let you choose among several. Use this mark-and-sweep model as a mental model only—do not assume that this is how any particular virtual machine actually works.

Finalization

You won't normally notice when an orphaned object's space is reclaimed—“it just works.” But a class can implement a finalize method that is executed before an object's space is reclaimed—see “Strengths of Reference and Reachability” on page 455. Such a finalize method gives you a chance to use the state contained in the object to reclaim other non-memory resources. The finalize method is declared in the Object class:

  • protected void finalize() throws Throwable

    • Is invoked by the garbage collector after it determines that this object is no longer reachable and its space is to be reclaimed. This method might clean up any non-memory resources used by this object. It is invoked at most once per object, even if execution of this method causes the object to become reachable again and later it becomes unreachable again. There is no guarantee, however, that finalize will be called in any specific time period; it may never be called at all. This method is declared to throw any exception but if an exception occurs it is ignored by the garbage collector. The virtual machine makes no guarantees about which thread will execute the finalize method of any given object, but it does guarantee that the thread will not hold any user-visible synchronization locks.

You should only rarely need to write a finalize method, and when you do, you should write it with great care. If your object has become garbage it is quite possible that other objects to which it refers are also garbage. As garbage, they may have been finalized before your finalize method is invoked and may therefore be in an unexpected state.

Garbage collection collects only memory. When you are dealing with non-memory resources that are not reclaimed by garbage collection, finalizers look like a neat solution. For example, open files are usually a limited resource, so closing them when you can is good behavior. But this usually cannot wait until the finalize phase of garbage collection. The code that asks you to perform an operation that opens a file should tell you when it's done—there is no guarantee that your object holding the open file will be collected before all the open file resources are used up.

Still, your objects that allocate external resources could provide a finalize method that cleans them up so that the class doesn't itself create a resource leak. For example, a class that opens a file to do its work should have some form of close method to close the file, enabling programmers using that class to explicitly manage the number-of-open-files resource. The finalize method can then invoke close. Just don't rely on this to prevent users of the class from having problems. They might get lucky and have the finalizer executed before they run out of open files, but that is risky—finalization is a safety-net to be used as a last resort, after the programmer has failed to release the resource manually. If you were to write such a method, it might look like this:

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

public class ProcessFile {
    private FileReader file;

    public ProcessFile(String path) throws
        FileNotFoundException
    {
        file = new FileReader(path);
    }

    // ...

    public synchronized void close() throws IOException {
        if (file != null) {
            file.close();
            file = null;
        }
    }

    protected void finalize() throws Throwable {
        try {
            close();
        } finally {
            super.finalize();
        }
    }
}

Note that close is carefully written to be correct if it is invoked more than once. Otherwise, if someone invoked close, finalizing the object would cause another close on the file, which might not be allowed.
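Explicit release, rather than finalization, is what users of such a class should normally rely on. The following sketch shows the calling pattern with a try-finally block. ReaderUser is a hypothetical stand-in that wraps any Reader (a StringReader here, so the example needs no file), and the underlyingCloses counter exists only to show that a repeated close is harmless.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for a class that owns an external resource.
class ReaderUser {
    private Reader in;
    private int closeCalls = 0;     // counts real closes, for demonstration

    ReaderUser(Reader in) {
        this.in = in;
    }

    synchronized int readOne() throws IOException {
        if (in == null)
            throw new IOException("already closed");
        return in.read();
    }

    // Safe to invoke more than once: only the first call closes the reader.
    synchronized void close() throws IOException {
        if (in != null) {
            in.close();
            in = null;
            closeCalls++;
        }
    }

    synchronized int underlyingCloses() {
        return closeCalls;
    }

    // The caller releases the resource itself, whether or not reading fails.
    static int firstChar(String text) throws IOException {
        ReaderUser user = new ReaderUser(new StringReader(text));
        try {
            return user.readOne();
        } finally {
            user.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println((char) firstChar("abc"));
    }
}
```

With this pattern the resource is released at a known point in the program, and any later cleanup attempt finds nothing left to do.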

Note also that, in this example, finalize invokes super.finalize in a finally clause. Train yourself so that you always write that invocation in any finalize method you write. If you don't invoke super.finalize, you may correctly finalize your own part of the object, but the superclass's part will not get finalized. Invoking super.finalize is one of those good habits you should adopt even when your class doesn't extend any other class. In addition to being good training, invoking super.finalize in such a case means that you can always add a superclass to a class like ProcessFile without remembering to examine its finalize method for correctness. Invoking the superclass's finalize method in a finally clause ensures that the superclass's cleanup will happen even if your cleanup causes an exception.

The garbage collector may reclaim objects in any order or it may never reclaim them. Memory resources are reclaimed when the garbage collector thinks the time is appropriate. Not being bound to an ordering guarantee, the garbage collector can operate in whatever manner is most efficient, and that helps minimize the overhead of garbage collection. You can, if necessary, invoke the garbage collector to try to force earlier collection using System.gc or Runtime.gc, as you'll see in the next section, but there is no guarantee that garbage collection will actually occur.

When an application exits, no further garbage collection is performed, so any objects that have not yet been collected will not have their finalize methods invoked. In many cases this will not be a problem. For example, on most systems when the virtual machine exits, the underlying system automatically closes all open files and sockets. However, for non-system resources you will have to invent other solutions. (Temporary files can be marked as “delete on exit,” which solves one of the more common issues—see “The File Class” on page 543.)

Reference queues, discussed on page 459, provide a better way of performing clean-up actions when an object is about to be, or has been, reclaimed.

Resurrecting Objects during finalize

A finalize method can “resurrect” an object by making it referenced again—for example, by adding it to a static list of objects. Resurrection is discouraged, but there is nothing the system can do to stop you.

However, the virtual machine invokes finalize at most once on any object, even if that object becomes unreachable more than once because a previous finalize resurrected it. If resurrecting objects is important to your design, the object would be resurrected only once—probably not the behavior you wanted.

If you think you need to resurrect objects, you should review your design carefully—you may uncover a flaw. If your design review convinces you that you need something like resurrection, the best solution is to clone the object or create a new object, not to resurrect it. The finalize method can insert a reference to a new object that will continue the state of the dying object rather than a reference to the dying object itself. Being new, the cloned object's finalize method will be invoked in the future (if needed), enabling it to insert yet another copy of itself in yet another list, ensuring the survival, if not of itself, at least of its progeny.

Interacting with the Garbage Collector

Although the language has no explicit way to dispose of unwanted objects, you can directly invoke the garbage collector to look for unused objects. The Runtime class, together with some convenience methods in the System class, allows you to invoke the garbage collector, request that any pending finalizers be run, or query the current memory state:

  • public void gc()

    • Asks the virtual machine to expend effort toward recycling unused objects so that their memory can be reused.

  • public void runFinalization()

    • Asks the virtual machine to expend effort running the finalizers of objects that it has found to be unreachable but have not yet had their finalizers run.

  • public long freeMemory()

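    • Returns an estimate of the free bytes in the memory currently held by the virtual machine.

  • public long totalMemory()

    • Returns the total bytes of memory currently held by the virtual machine. This value may vary over time.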

  • public long maxMemory()

    • Returns the maximum amount of memory, in bytes, that the virtual machine will ever attempt to use. If there is no limit, Long.MAX_VALUE is returned. There is no method to set the maximum; a virtual machine will typically have a command-line or other configuration option to set the maximum.

To invoke these methods you need to obtain a reference to the current Runtime object via the static method Runtime.getRuntime. The System class supports static gc and runFinalization methods that invoke the corresponding methods on the current Runtime; in other words, System.gc() is equivalent to Runtime.getRuntime().gc().
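As a brief illustration of these methods, the following sketch takes a snapshot of the three memory figures before and after a collection request. MemoryReport and snapshot are names invented for this example, and the numbers printed will differ from one virtual machine and run to the next.

```java
// Hypothetical helper that queries the current Runtime's memory figures.
class MemoryReport {
    // Returns { free, total, max } in bytes at the moment of the call.
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.freeMemory(), rt.totalMemory(), rt.maxMemory() };
    }

    public static void main(String[] args) {
        long[] before = snapshot();
        System.gc();                // a request only; it may do nothing
        long[] after = snapshot();
        System.out.println("free before gc: " + before[0]);
        System.out.println("free after gc:  " + after[0]);
        System.out.println("total: " + after[1] + ", max: " + after[2]);
    }
}
```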

The garbage collector may not be able to free any additional memory when Runtime.gc is invoked. There may be no garbage to collect, and not all garbage collectors can find collectable objects on demand. So invoking the garbage collector may have no effect whatsoever. However, before creating a large number of objects—especially in a time-critical application that might be affected by garbage-collection overhead—invoking gc may be advisable. Doing so has two potential benefits: You start with as much free memory as possible, and you reduce the likelihood of the garbage collector running during the task. Here is a method that aggressively frees everything it can at the moment:

public static void fullGC() {
    Runtime rt = Runtime.getRuntime();
    long isFree = rt.freeMemory();
    long wasFree;
    do {
        wasFree = isFree;
        rt.runFinalization();
        rt.gc();
        isFree = rt.freeMemory();
    } while (isFree > wasFree);
}

This method loops while the amount of freeMemory is being increased by successive calls to runFinalization and gc. When the amount of free memory doesn't increase, further calls will likely do nothing.

You will not usually need to invoke runFinalization, because finalize methods are called asynchronously by the garbage collector. Under some circumstances, such as running out of a resource that a finalize method reclaims, it is useful to force as much finalization as possible. But remember, there is no guarantee that any object actually awaiting finalization is using some of that resource, so runFinalization may be of no help.

The fullGC method is too aggressive for most purposes. In the unusual circumstance that you need to force garbage collection, a single invocation of the System.gc method will gather most if not all of the available garbage. Repeated invocations are progressively less productive—on many systems they will be completely unproductive.

Exercise 17.1: Write a program to examine the amount of memory available on startup and after allocation of a number of objects. Try invoking the garbage collector explicitly to see how the amount of free memory changes; make sure you don't hold references to the newly allocated objects, of course.

Reachability States and Reference Objects

An object can be garbage collected only when there are no references to it, but sometimes you would like an object to be garbage collected even though you may have a specific reference to it. For example, suppose you are writing a web browser. There will be images that have been seen by the user but that are not currently visible. If memory becomes tight, you can theoretically free up some memory by writing those images to disk, or even by forgetting about them since you can presumably refetch them later if needed. But since the objects representing the images are referenced from running code (and hence reachable) they will not be released. You would like to be able to have a reference to an object that doesn't force the object to remain reachable if that is the only reference to the object. Such special references are provided by reference objects.

A reference object is an object whose sole purpose is to maintain a reference to another object, called the referent. Instead of maintaining direct references to objects, via fields or local variables, you maintain a direct reference to a reference object that wraps the actual object you are interested in. The garbage collector can determine that the only references to an object are through reference objects and so can decide whether to reclaim that object. If the object is reclaimed, then the reference object is usually cleared so that it no longer refers to that object. The strength of a reference object determines how the garbage collector will behave; normal references are the strongest references.

The Reference Class

The classes for the reference object types are contained in the package java.lang.ref. The primary class is the generic, abstract class Reference<T>, which is the superclass of all the specific reference classes. It has four methods:

  • public T get()

    • Returns this reference object's referent object.

  • public void clear()

    • Clears this reference object so it has no referent object.

  • public boolean enqueue()

    • Adds this reference object to the reference queue with which it is registered, if any. Returns true if the reference object was enqueued and false if there is no registered queue or this reference object was already enqueued.

  • public boolean isEnqueued()

    • Returns true if this reference object has been enqueued (either by the programmer or the garbage collector), and false otherwise.

We defer a discussion of reference queues until Section 17.5.3 on page 459.

Subclasses of Reference provide ways to bind the referent object to the reference object—the existing subclasses do this with a constructor argument. Once an object has been wrapped in a reference object you can retrieve the object via get (and thus have a normal strong reference to it) or you can clear the reference, perhaps making the referent unreachable. There is no means to change the object referred to by the reference object and you cannot subclass Reference directly.
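A short sketch of wrapping, retrieving, and clearing follows. It uses WeakReference only because Reference itself is abstract; the class name ReferenceBasics and its methods are invented for this example. The referent is also held in an ordinary local variable, so it remains strongly reachable throughout and the results do not depend on when the collector runs.

```java
import java.lang.ref.WeakReference;

// Hypothetical demonstration of get and clear on a reference object.
class ReferenceBasics {
    // True if get returns the very object that was wrapped.
    static boolean getReturnsReferent() {
        StringBuilder referent = new StringBuilder("payload");
        WeakReference<StringBuilder> ref = new WeakReference<>(referent);
        // referent is still in use here, so it cannot have been collected.
        return ref.get() == referent;
    }

    // True if get returns null once the reference has been cleared.
    static boolean clearRemovesReferent() {
        StringBuilder referent = new StringBuilder("payload");
        WeakReference<StringBuilder> ref = new WeakReference<>(referent);
        ref.clear();
        // The object itself is untouched; only the reference object forgot it.
        return ref.get() == null && referent.length() == 7;
    }

    public static void main(String[] args) {
        System.out.println(getReturnsReferent());
        System.out.println(clearRemovesReferent());
    }
}
```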

Strengths of Reference and Reachability

In decreasing order of strength, the kinds of reference objects available to you are SoftReference<T>, WeakReference<T>, and PhantomReference<T>. These correspond to the reachability stages an object can pass through:

  • An object is strongly reachable if it can be reached through at least one chain of strong references (the normal kind of references).

  • An object is softly reachable if it is not strongly reachable, but is reachable through at least one chain containing a soft reference.

  • An object is weakly reachable if it is not softly reachable, but is reachable through at least one chain containing a weak reference.

  • An object is phantom reachable when it is not weakly reachable, has been finalized (if necessary), but is reachable through at least one chain containing a phantom reference.

  • Finally, an object is unreachable if it is not reachable through any chain.

Once an object becomes weakly reachable (or less), it can be finalized. If after finalization the object is unreachable, it can be reclaimed.

Objects need not actually go through all these stages. For example, an object that is reachable only through strong references becomes unreachable when it is no longer strongly reachable.

The reachability stages of an object trigger behavior in the garbage collector appropriate to the corresponding reference object types:

  • A softly reachable object may be reclaimed at the discretion of the garbage collector. If memory is low, the collector may clear a SoftReference object so that its referent can be reclaimed. There are no specific rules for the order in which this is done (but a good implementation will prefer keeping recently used or created references, where “used” is defined as “invoked get”). You can be sure that all SoftReferences to softly reachable objects will be cleared before an OutOfMemoryError is thrown.

  • A weakly reachable object will be reclaimed by the garbage collector. When the garbage collector determines that an object is weakly reachable, all WeakReference objects that refer to that object will be cleared. The object then becomes finalizable and after finalization will be reclaimed (assuming it is not resurrected) unless it is phantom reachable.

  • A phantom reachable object isn't really reachable in the normal sense because the referent object cannot be accessed via a PhantomReference: get always returns null. But the existence of the phantom reference prevents the object from being reclaimed until the phantom reference is explicitly cleared. Phantom references allow you to deal with objects whose finalize methods have been invoked and so can safely be considered “dead.” Phantom references are used in conjunction with the reference queues we discuss in the next section.

Both SoftReference and WeakReference declare a constructor that takes a single referent object. All three classes declare a two-argument constructor that takes a referent object and a ReferenceQueue.

Soft references provide you with a kind of caching behavior, clearing older references while trying not to clear new or used ones. Consider our web browser scenario. If you maintain your images in soft references, they will be reclaimed as memory runs low. Images are probably relatively unimportant to keep in memory should memory run low, and clearing the oldest or least used images would be a reasonable approach. In contrast, if you used weak references then all images would be reclaimed as soon as memory got low—this would probably induce a lot of overhead as you reload images that it may not have been necessary to get rid of in the first place.
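A cache along these lines might be sketched as follows. SoftImageCache, its load method, and the use of a byte array to stand for an image are all assumptions made for this illustration; a real browser would fetch and decode actual image data.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical image cache whose entries may be cleared when memory is low.
class SoftImageCache {
    private final Map<String, SoftReference<byte[]>> cache = new HashMap<>();
    private int loads = 0;

    // Stand-in for fetching an image; a real version would go to the network.
    private byte[] load(String url) {
        loads++;
        return new byte[url.length() * 1024];
    }

    synchronized byte[] get(String url) {
        SoftReference<byte[]> ref = cache.get(url);
        byte[] image = (ref != null) ? ref.get() : null;
        if (image == null) {
            // Never cached, or cleared by the collector: fetch it again.
            image = load(url);
            cache.put(url, new SoftReference<>(image));
        }
        return image;   // the caller now holds a strong reference
    }

    synchronized int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        SoftImageCache images = new SoftImageCache();
        byte[] first = images.get("logo.png");
        byte[] second = images.get("logo.png");
        System.out.println(first == second);
        System.out.println(images.loadCount());
    }
}
```

While a caller still holds the returned array, the image is strongly reachable and cannot be cleared. Once only the cache refers to it, the collector is free to clear the soft reference if memory runs low, and the next request simply reloads the image.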

Weak references are a way of holding a reference to an object but saying “reclaim this object if this is the only type of reference to it.”

Consider the following method that returns data read into memory from a file. The method has been optimized under the assumption that the same file is often named more than once in a row and that reading the data is expensive:

import java.lang.ref.*;
import java.io.File;

class DataHandler {
    private File lastFile;          // last file read
    private WeakReference<byte[]> 
                          lastData; // last data (maybe)

    byte[] readFile(File file) {
        byte[] data;

        // check to see if we remember the data
        if (file.equals(lastFile)) {
            data = lastData.get();
            if (data != null)
                return data;
        }

        // don't remember it, read it in
        // (readBytesFromFile is a helper method that is not shown)
        data = readBytesFromFile(file);
        lastFile = file;
        lastData = new WeakReference<byte[]>(data);


        return data;
    }
}

When readFile is called it first checks to see if the last file read was the same as the one being requested. If it is, readFile retrieves the reference stored in lastData, which is a weak reference to the last array of bytes returned. If the reference returned by get is null, the data has been garbage collected since it was last returned and so it must be re-read. The data is then wrapped in a new WeakReference. If get returns a non-null reference, the array has not been collected and can be returned.

If lastData were a direct, strong reference to the last data returned, that data would not be collected, even if the invoking program had long since dropped all references to the array. Such a strong reference would keep the object alive as long as the DataHandler object itself was reachable. This could be quite unfortunate. Using a WeakReference allows the space to be reclaimed, at the cost of occasionally re-reading the data from disk.

Notice that invoking get on lastData makes the byte array strongly reachable once again because its value is bound to an active local variable—a strong reference. If get returns a non-null reference there is no possibility of the byte array being reclaimed as long as readFile is executing, or the block of code to which readFile returns the reference is executing. That invoking code can store the reference in a reachable place or return it to another method, thereby ensuring the data's reachability. An object can only be less than strongly reachable when none of its references are strong. Storing a non-null reference fetched from a reference object's get method creates a strong reference that will be treated like any other strong reference.

Weak references typically store information about an object that might be time consuming to compute but that need not outlive the object itself. For example, if you had some complicated information built up from using reflection on an object, you might use the java.util.WeakHashMap class, which is a hashtable for mapping a weakly-held object to your information. If the object becomes unreachable the WeakHashMap will clean up the associated information, which presumably is no longer useful (since the object will not be used in any future computation because of its weak reachability). You will learn about WeakHashMap with the rest of the collection classes in Chapter 21, specifically in Section 21.9 on page 594.
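As a rough sketch of that usage, the following class caches a small reflective summary per object in a WeakHashMap. The class name ReflectionInfoCache and the contents of the summary are invented for this example. Note two assumptions: the cached value must not itself refer strongly to the key, or the key could never become weakly reachable, and WeakHashMap compares keys with equals and hashCode rather than by identity.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical cache of per-object information that need not outlive the object.
class ReflectionInfoCache {
    private final Map<Object, String> info = new WeakHashMap<>();

    synchronized String describe(Object target) {
        String description = info.get(target);
        if (description == null) {
            // Stand-in for an expensive reflective computation.
            Class<?> type = target.getClass();
            description = type.getName() + " with "
                + type.getDeclaredFields().length + " declared fields";
            info.put(target, description);
        }
        return description;
    }

    synchronized int size() {
        return info.size();
    }

    public static void main(String[] args) {
        ReflectionInfoCache cache = new ReflectionInfoCache();
        Object key = new Object();
        System.out.println(cache.describe(key));
        System.out.println(cache.size());
    }
}
```

Once nothing else refers to a key, its entry becomes eligible for removal from the map, although when that happens is up to the garbage collector.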

Exercise 17.2: Modify DataHandler so that lastFile is also stored weakly.

Reference Queues

When an object changes reachability state, references to the object may be placed on a reference queue. These queues are used by the garbage collector to communicate with your code about reachability changes. They are usually the best way to detect such changes, although you can sometimes poll for changes as well, by seeing if get returns null.

Reference objects can be associated with a particular queue when they are constructed. Each of the Reference subclasses provides a constructor of the form

  • public StrengthReference(T referent, ReferenceQueue<? super T> q)

    • Creates a new reference object with the given referent and registered with the given queue.

Both weak and soft references are enqueued at some point after the garbage collector determines that their referent has entered that particular reachability state, and in both cases they are cleared before being enqueued. Phantom references are also enqueued at some point after the garbage collector determines the referent is phantom reachable, but they are not cleared. Once a reference object has been queued by the garbage collector, it is guaranteed that get will return null, and so the object cannot be resurrected.

Registering a reference object with a reference queue does not create a reference between the queue and the reference object. If your reference object itself becomes unreachable, then it will never be enqueued. So your application needs to keep a strong reference to all reference objects.

The ReferenceQueue class provides three methods for removing references from the queue:

  • public Reference<? extends T> poll()

    • Removes and returns the next reference object from this queue, or null if the queue is empty.

  • public Reference<? extends T> remove() throws InterruptedException

    • Removes and returns the next reference object from this queue. This method blocks indefinitely until a reference object is available from the queue.

  • public Reference<? extends T> remove(long timeout) throws InterruptedException

    • Removes and returns the next reference object from this queue. This method blocks until a reference object is available from the queue or the specified time-out period elapses. If the time-out expires, null is returned. A time-out of zero means wait indefinitely.

The poll method allows a thread to query the existence of a reference in a queue, taking action only if one is present, as in the example. The remove methods are intended for more complex (and rare) situations in which a dedicated thread is responsible for removing references from the queue and taking the appropriate action—the blocking behavior of these methods is the same as that defined by Object.wait (as discussed from page 354). You can ask whether a particular reference is in a queue via its isEnqueued method. You can force a reference into its queue by calling its enqueue method, but usually this is done by the garbage collector.
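The queue operations can be tried out without waiting for the garbage collector by enqueuing a reference manually. The sketch below does so; QueueBasics and demo are invented names, and the referent is kept strongly reachable so that the collector has no reason to enqueue the reference itself.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical walk-through of poll, remove, enqueue, and isEnqueued.
class QueueBasics {
    // Returns true if every step behaved as the method descriptions say.
    static boolean demo() throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();

        // An empty queue: poll returns at once, remove gives up after 20 ms.
        boolean emptyPoll = queue.poll() == null;
        boolean timedOut = queue.remove(20) == null;

        Object referent = new Object();
        WeakReference<Object> ref = new WeakReference<>(referent, queue);
        boolean notYetQueued = !ref.isEnqueued();

        // Force the reference into its queue by hand.
        boolean enqueued = ref.enqueue();
        boolean nowQueued = ref.isEnqueued();

        // Removing it from the queue hands back the same reference object.
        Reference<?> polled = queue.poll();
        boolean sameObject = polled == ref;

        // A reference object can be enqueued only once.
        boolean secondAttempt = ref.enqueue();

        // Using referent here keeps it strongly reachable up to this point.
        return referent.hashCode() == referent.hashCode()
            && emptyPoll && timedOut && notYetQueued
            && enqueued && nowQueued && sameObject && !secondAttempt;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```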

Reference queues are used with phantom references to determine when an object is about to be reclaimed. A phantom reference never lets you reach the object, even when it is otherwise reachable: Its get method always returns null. In effect it is the safest way to find out about a collected object: a weak or soft reference will be enqueued after an object is finalizable, whereas a phantom reference is enqueued after the referent has been finalized and therefore only after the last possible time that the object can do something. If you can, you should generally use a phantom reference because the other references still allow the possibility that a finalize method will use the object.
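The first of these properties is easy to confirm. In this small sketch (PhantomBasics is an invented name) the referent is still strongly reachable through a local variable, yet the phantom reference refuses to hand it back.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

// Hypothetical check that a phantom reference never exposes its referent.
class PhantomBasics {
    static boolean getIsAlwaysNull() {
        Object referent = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);
        // The referent is alive and well, but get still returns null.
        return ref.get() == null && referent != null;
    }

    public static void main(String[] args) {
        System.out.println(getIsAlwaysNull());
    }
}
```

Because get is of no use, the only way to learn anything from a phantom reference is to register it with a queue and watch for it to appear there.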

Consider a resource manager that controls access to some set of external resources. Objects can request access to a resource and use it until they are done, after which they should return the resource back to the resource manager. If the resource is shared, and use of it is passed from object to object, perhaps even across multiple threads, then it can be difficult to determine which use of the resource is the last use. That makes it difficult to determine which piece of code is responsible for returning the resource. To deal with this situation, the resource manager can automate the recovery of the resource by associating with it a special object called the key. As long as the key object is reachable, the resource is considered in use. As soon as the key object can be reclaimed, the resource is automatically released. Here's an abstract representation of such a resource:

interface Resource {
    void use(Object key, Object... args);
    void release();
}

When a resource is obtained, a key must be presented to the resource manager. The Resource instance that is given back will only allow use of the resource when presented with that key. This ensures that the resource cannot be used after the key has been reclaimed, even though the Resource object itself may still be reachable. Note that it is important that the Resource object not store a strong reference to the key object, since that would prevent the key from ever becoming unreachable, and so the resource could never be recovered. A Resource implementation class might be nested in the resource manager:

private static class ResourceImpl implements Resource {
    int keyHash;
    boolean needsRelease = false;

    ResourceImpl(Object key) {
        keyHash = System.identityHashCode(key);

        // ... set up the external resource ...

        needsRelease = true;
    }

    public void use(Object key, Object... args) {
        if (System.identityHashCode(key) != keyHash)
            throw new IllegalArgumentException("wrong key");

        // ... use the resource ...
    }

    public synchronized void release() {
        if (needsRelease) {
            needsRelease = false;

            // ... release the resource ...
        }
    }
}

When the resource is created it stores the identity hash code of the key, and whenever use is called, it checks that the same key was provided. Actually using the resource may require additional synchronization, but for simplicity this is elided. The release method is responsible for releasing the resource. It can either be called directly by the users of the resource when they have finished, or it will be called through the resource manager when the key object is no longer referenced. Because we will be using a separate thread to watch the reference queue, release has to be synchronized and it has to tolerate being called more than once.

The actual resource manager looks like this:

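import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashMap;
import java.util.Map;

public final class ResourceManager {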

    final ReferenceQueue<Object> queue;
    final Map<Reference<?>, Resource> refs;
    final Thread reaper;
    boolean shutdown = false;

    public ResourceManager() {
        queue = new ReferenceQueue<Object>();
        refs = new HashMap<Reference<?>, Resource>();
        reaper = new ReaperThread();
        reaper.start();

        // ... initialize resources ...
    }

    public synchronized void shutdown() {
        if (!shutdown) {
            shutdown = true;
            reaper.interrupt();
        }
    }

    public synchronized Resource getResource(Object key) {
        if (shutdown)
            throw new IllegalStateException();
        Resource res = new ResourceImpl(key);
        Reference<?> ref =
            new PhantomReference<Object>(key, queue);
        refs.put(ref, res);
        return res;
    }
}

The key object can be an arbitrary object, which gives great flexibility to the users of the resource, compared to having the resource manager assign a key. When getResource is invoked, a new ResourceImpl object is created, passing in the supplied key. A phantom reference is then created, with the key as the referent, and using the resource manager's reference queue. The phantom reference and the resource object are then stored into a map. This map serves two purposes: First, it keeps all the phantom reference objects reachable; second, it provides an easy way to find the actual resource object associated with each phantom reference. (The alternative would be to subclass PhantomReference and store the Resource object in a field.)
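The subclassing alternative mentioned in parentheses might look something like the following sketch. The names ResourceRef and RefTracker are made up for illustration, and the Resource interface is repeated only so that the sketch compiles on its own. Note that a collection is still needed to keep the reference objects themselves reachable; only the lookup from reference to resource goes away.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

// Same shape as the Resource interface shown earlier, repeated for self-containment.
interface Resource {
    void use(Object key, Object... args);
    void release();
}

// Hypothetical subclass: each phantom reference carries its own Resource.
class ResourceRef extends PhantomReference<Object> {
    final Resource resource;

    ResourceRef(Object key, Resource resource, ReferenceQueue<Object> queue) {
        super(key, queue);
        this.resource = resource;
    }
}

// Hypothetical stand-in for the bookkeeping part of a resource manager.
class RefTracker {
    final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // A set replaces the map: it only has to keep the references reachable.
    final Set<ResourceRef> refs = new HashSet<>();

    synchronized ResourceRef track(Object key, Resource resource) {
        ResourceRef ref = new ResourceRef(key, resource, queue);
        refs.add(ref);
        return ref;
    }

    /** Handles one reference that has been taken off the queue. */
    void handle(Reference<?> dequeued) {
        // Only ResourceRef objects are ever registered with this queue.
        ResourceRef ref = (ResourceRef) dequeued;
        synchronized (this) {
            refs.remove(ref);
        }
        ref.resource.release();
        ref.clear();
    }
}
```

The cast in handle is the price of this design: the queue hands back a plain Reference, so the code relies on the convention that nothing else is registered with that queue.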

The resource manager uses a separate “reaper” thread to process resources when the key has become unreachable. The shutdown method “turns off” the resource manager by allowing the reaper to terminate (in response to the interruption) and causing getResource calls to throw IllegalStateException. In this simple design, any references enqueued after shutdown will not be processed. The actual reaper thread looks like this:

class ReaperThread extends Thread {
    public void run() {
        // run until interrupted
        while (true) {
            try {
                Reference<?> ref = queue.remove();
                Resource res = null;
                synchronized(ResourceManager.this) {
                    res = refs.get(ref);
                    refs.remove(ref);
                }
                res.release();
                ref.clear();
            }
            catch (InterruptedException ex) {
                break; // all done
            }
        }
    }
}

ReaperThread is an inner class, and a given reaper thread runs until its associated resource manager is shut down. It blocks on remove until a phantom reference associated with a particular key has been enqueued. The phantom reference is used to get a reference to the Resource object from the map, and then that entry is removed from the map. The Resource object then has release invoked on it to release the resource. Finally, the phantom reference is cleared so that the key can actually be reclaimed.

As an alternative to using a separate thread, the getResource method could instead poll the queue whenever it is called and release any resources whose key has become unreachable. The shutdown method could then do a final poll, too. The semantics for the resource manager necessarily depend on the actual kind of resource and its usage patterns.

A design that uses reference queues can be more reliable than direct use of finalization—particularly with phantom references—but remember that there are no guarantees as to exactly when or in what order a reference object will be enqueued. There are also no guarantees that by the time an application terminates, all enqueuable reference objects will have been enqueued. If you need to guarantee that all resources are released before the application terminates, you must install the necessary shutdown hooks (see “Shutdown” on page 672) or other application-defined protocols to ensure this happens.
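As a rough illustration of the shutdown hook approach, the sketch below keeps a list of release actions and runs them from a hook thread registered with Runtime.addShutdownHook. The class ShutdownRelease and its methods are made up for illustration and are not part of the resource manager shown above; the lambda assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the name and methods are not from the text.
class ShutdownRelease {
    private final List<Runnable> releasers = new ArrayList<>();
    private Thread hook;

    /** Remembers an action that releases one resource. */
    synchronized void register(Runnable releaser) {
        releasers.add(releaser);
    }

    /** Runs and then forgets every registered action; returns how many ran. */
    synchronized int releaseAll() {
        int count = releasers.size();
        for (Runnable releaser : releasers)
            releaser.run();
        releasers.clear();
        return count;
    }

    /** Arranges for releaseAll to run when the virtual machine shuts down. */
    synchronized void installHook() {
        if (hook == null) {
            hook = new Thread(() -> releaseAll(), "resource-release-hook");
            Runtime.getRuntime().addShutdownHook(hook);
        }
    }
}
```

Because releaseAll clears the list after running, calling it explicitly before exit and then again from the hook does no harm, which mirrors the requirement that release tolerate repeated calls.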

Exercise 17.3: Rework the resource implementation class so that it uses a reference object to keep track of the key instead of using the hash code.

Exercise 17.4: Modify the reaper thread so that it stays alive after shutdown until all the allocated resources can be released.

Exercise 17.5: Redesign the resource manager to not use a reaper thread. Be clear on what semantics the resource manager has and on when resources will be released.

Finalization and Reachability

An object becomes finalizable when it becomes weakly reachable (or less). It would seem that to determine reachability you need simply examine the source code for your applications. Unfortunately, such an examination would often be wrong. Reachability is not determined by the statements in your source code, but by the actual state of the virtual machine at runtime. The virtual machine can perform certain optimizations that make an object unreachable much sooner than a simple examination of the source code would indicate.

For example, suppose a method creates an object that controls an external resource and that the only reference is through a local variable. The method then uses the external resource and finally nulls the local reference or simply returns, allowing the object to be finalized and release the external resource. The programmer might assume that the object is reachable until that reference is set to null or the method returns. But the virtual machine can detect that the object is not referenced within the method and may deem it unreachable and finalizable the instant after it was constructed, causing immediate finalization and release of the resource, which was not something the programmer intended. Even the reference queue designs that we have discussed depend on the referent remaining reachable as long as the programmer expects it to be reachable.

The optimizations under consideration only apply to references held on the stack—that is, references stored in local variables or parameters. If a reference is stored in a field, then the object remains reachable at least until the field no longer refers to it. This approach also covers the case of inner class objects—an inner class instance is reachable whenever its enclosing instance is reachable.
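Besides storing the reference in a field, later releases of the platform offer an explicit way to keep a stack-held object reachable: Reference.reachabilityFence, available in Java 9 and later. The sketch below assumes such a release; Handle, work, and FenceDemo are made-up stand-ins for an object that controls an external resource and the code that uses it.

```java
import java.lang.ref.Reference;

// Hypothetical demo class; none of these names come from the text.
class FenceDemo {
    // Stand-in for an object whose cleanup would release an external resource.
    static class Handle {
        final int id;
        Handle(int id) { this.id = id; }
    }

    // Stand-in for work that needs only a raw value taken from the handle.
    static int work(int rawId) {
        return rawId * 2;
    }

    static int useResource() {
        Handle handle = new Handle(21);
        int rawId = handle.id;   // last ordinary use of handle in this method
        try {
            return work(rawId);
        } finally {
            // Keeps handle strongly reachable at least until this point,
            // so it cannot be treated as garbage while work is running.
            Reference.reachabilityFence(handle);
        }
    }
}
```

Placing the fence in a finally clause means it applies on both normal and exceptional exits from the work.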

There are further subtle interactions between finalization, synchronization, and the memory model, but these are advanced topics unneeded by most programmers and are beyond the scope of this book.

 

Don't ever take a fence down until you know the reason why it was put up.

 
 --G.K. Chesterton