Chapter 26. Tuning Modules for Performance & Memory Footprint

NetBeans is, by definition, a large application, with a significant memory footprint. It is incumbent on anyone writing modules to write them responsibly, so that the user experience is not degraded. This chapter will outline some ways to do this in the context of NetBeans. For general optimization of Java code, we refer you to some of the excellent articles and books on the subject, such as Java 2 Performance and Idiom Guide by Craig Larman and Rhett Guthrie.

The limiting factor in NetBeans’ performance is memory. As objects are created in memory and then are no longer referenced, the memory footprint grows until garbage collection is run in the JVM. This means three things:

  • Modules should not create objects until they are about to be used.

  • Care should be taken not to hold references to objects that may not be used again or for a long time.

  • Once created, consider whether your objects need to exist until NetBeans is shut down or can be disposed of sooner, and what you can softly cache.[20]

NetBeans has a number of coding conventions and utility classes to optimize for memory footprint, which will be outlined throughout this chapter. Use them and your modules will run well in any reasonable environment. Some of the advice about the use of threads depends on the target platform, but unless you’re far better than the authors at predicting the future, you probably want your code to work in any environment in which it might be deployed. This chapter is your guide to common pitfalls and how to avoid them when coding for NetBeans.

Startup Performance vs. Runtime Performance

Clearly you don’t want to add any more to startup time than absolutely necessary. One of the reasons for the declarative XML syntax in module layers for instantiating objects is that, in addition to the flexibility it offers, it means that classes referenced in XML will not even be loaded into the JVM unless something requests an instance of the class referred to.

Loading classes into a Java VM is typically expensive. A trivial class takes at least 1K of memory, and the process of resolving a class, loading it, linking it and running static initializers takes time. A large chunk of NetBeans startup time is traceable directly to class loading. Most of the performance tricks in NetBeans are focused on letting a module avoid loading classes until they’re actually going to be used. You can easily check how you’re doing by running the VM in verbose class loading mode or using a profiler.

That being said, there will be times when doing some initialization on startup is the right thing to do. Such cases are rare; one example is the Code Editor, which aggressively loads some of its configuration because the alternative would be an unacceptable delay the first time the user tries to edit a file. Most modules, and probably yours, will not need to do this.

Operating Principles

While there are many places performance optimizations can be made, the following sections highlight a few ways of thinking and places to pay attention to that will assist you.

“Nobody Will Ever Use My Module”

Okay, that’s a depressing thought. You wouldn’t be writing a module if you didn’t want people to use it! But it’s also a good way to think about things when you’re writing a module. If you’ve written fantastic Intercal (http://www.tuxedo.org/~esr/intercal/) support for NetBeans, you still probably don’t want to assume that editing Intercal programs is the only thing the user is likely to be doing with NetBeans. This is doubly true for modules implementing a limited set of functionality (however difficult to implement) such as profiling or UML diagramming. The common activity in NetBeans is likely to be editing code, and no matter how revolutionary any module is, it needs to be a good citizen and stay out of the way when the user is not paying attention to it. Code on the assumption that the user is not using your module, and you will write a module that behaves well both when it is and when it is not being used.

For evidence of this, consider modules such as CORBA or RMI. How often do you use them? For some people it may be quite a bit, but even though most users of NetBeans will have these modules loaded, they will rarely, if ever, use them. It is certainly not appropriate for users to have to disable a module to improve performance; an idle module should load nothing except, ideally, a few XML entries that allow NetBeans to bootstrap the classes that are needed on demand when the module finally goes to work.

What Does My Module Really Need to Do on Startup?

It is tempting to bring your module to an initialized state when it is first loaded. But the real question is what you actually need to initialize. Anything done when a module is first loaded is going to add to startup time, and while this is in some ways less important than performing well at runtime, bear in mind that in a sense, the model for NetBeans modules is one of cooperative multitasking. The number of other modules your module will need to coexist with cannot be determined, but any module can potentially degrade the performance of others. Therefore, it is important that each module be a good citizen of the NetBeans runtime, or everyone can suffer, and this applies equally to startup time and runtime performance. For better or worse, startup time is the user’s first experience with an application. If you’re building a module that runs inside NetBeans or an entire application based on NetBeans, you want the user to have a good first experience and a satisfying ongoing experience with your code.

There are cases where the time at which an object will be demanded cannot be determined. The HTTP Server module is an example of this, since on startup there may be an existing external HTTP client that expects to be able to interact with it. But these cases are few and far between, and even in these cases, there is the possibility for optimization (such as a dummy HTTP server that, when a connection is attempted, starts the real HTTP server and forwards the request).
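
To make the idea concrete, here is a rough sketch of such a placeholder server in plain Java; it glosses over the hand-off of the pending connection, and startRealServer( ) merely marks where the expensive bootstrap would happen:

import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

/** Hold the port with a cheap placeholder; start the real server only on first use. */
public class LazyServer implements Runnable {
  private final int port;

  public LazyServer(int port) {
    this.port = port;
  }

  public void run() {
    try {
      ServerSocket placeholder = new ServerSocket(port);
      Socket firstClient = placeholder.accept(); // blocks cheaply until a client connects
      placeholder.close();                       // free the port for the real server
      startRealServer(port, firstClient);        // only now pay the real startup cost
    } catch (IOException ioe) {
      ioe.printStackTrace();
    }
  }

  private void startRealServer(int port, Socket pendingClient) throws IOException {
    // hypothetical: bootstrap the real HTTP server here and hand it the pending connection
    pendingClient.close();
  }
}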

What Are the Critical Paths?

Beyond startup performance, there are critical paths through the code where it is essential to optimize for performance. This applies mainly to commonly used API-specified objects. The constructor of any object is the most basic example of a critical path through the code. Constructors should not assume that the object being created will actually be used, or that it will be used in the near term, and thus should do no more work than is absolutely necessary. Ideally, a constructor should simply create an uninitialized object that will initialize itself when something tries to use it.

One particularly important example is Nodes. Nodes must be very inexpensive to create. One of the principal reasons a Node does not directly create its child nodes, but delegates this to a Children object, is to delay the creation of additional nodes until the last possible moment. This is done on the assumption that if the user doesn’t actually expand the node, the child Nodes should never come into being. This principle of lazy instantiation is a good idea almost everywhere, but it is particularly critical in the case of Nodes, since creating Nodes is something that happens very frequently within the NetBeans environment. Note that this means you should avoid using Children.Array, since that class requires that the children be created when the Node is.
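
A minimal sketch of the lazy pattern using Children.Keys, the usual alternative to Children.Array: the keys here are simple strings supplied by the parent, no child node is created until the parent is expanded, and the keys are dropped again in removeNotify( ) so the children can be collected:

import java.util.Collection;
import java.util.Collections;
import org.openide.nodes.AbstractNode;
import org.openide.nodes.Children;
import org.openide.nodes.Node;

/** Child nodes are not created until the parent node is actually expanded. */
public class SectionChildren extends Children.Keys {
  private final Collection sectionNames;   // e.g. Strings computed cheaply by the parent

  public SectionChildren(Collection sectionNames) {
    this.sectionNames = sectionNames;
  }

  protected void addNotify() {
    // Called when the node is about to be expanded; only now are keys set.
    setKeys(sectionNames);
  }

  protected void removeNotify() {
    // Drop the keys so the child nodes can be garbage collected when not displayed.
    setKeys(Collections.EMPTY_SET);
  }

  protected Node[] createNodes(Object key) {
    AbstractNode n = new AbstractNode(Children.LEAF);
    n.setName((String) key);
    return new Node[] { n };
  }
}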

DataLoaders need to be able to recognize files fast. Your DataLoader is going to be called for every file in the system. This process must be optimized so that the negative test is as fast as possible, and your recognition code will return false at the earliest possible moment. If you’re simply recognizing files by extension, this should already be fast. Register a MIMEResolver that can assign files relevant to your code to your DataLoader.

For identifying XML files by DTD, create a specialized MIME type such as text/x-mytype+xml and have your MIMEResolver assign that type. You could accomplish this by subclassing MIMEResolver, but it is more efficient to declare an XML MIME resolver in your module layer, as the Minicomposer example does. (For an example, see Chapter 21.) This allows you to declare a set of standard rules that determine whether a file matches, using a magic byte sequence, file extension, XML DOCTYPE, or root element. The optimization worries then belong to NetBeans, and you can rest assured that recognition of your document type will be as fast as possible. Another advantage of declaring your MIMEResolver in XML is that, as of NetBeans 4.0, a DataLoader will never be loaded into the JVM until a file that needs to be represented by it is encountered.
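
As a sketch, a declarative resolver registered from your layer (conventionally under a Services/MIMEResolver folder; the extension and MIME type below are hypothetical) might look like the following. The same DTD also lets you match on magic byte sequences, XML DOCTYPEs, and root elements; check the MIME Resolver DTD for your release for the exact element names.

<!DOCTYPE MIME-resolver PUBLIC "-//NetBeans//DTD MIME Resolver 1.0//EN"
          "http://www.netbeans.org/dtds/mime-resolver-1_0.dtd">
<MIME-resolver>
  <file>
    <ext name="mytype"/>
    <resolver mime="text/x-mytype+xml"/>
  </file>
</MIME-resolver>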

Cookie creation in Nodes and DataObjects should happen on demand. If the presence or absence of a given cookie depends on the state of the object the Node represents, override getCookie( ) to create the cookie and do the state calculations on demand. There is also a helper class CookieSet.Factory that can be used for this purpose.
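
A minimal sketch of on-demand cookie creation by overriding getCookie( ) in a node; the OpenCookie body is a placeholder, and CookieSet.Factory offers an equivalent approach when you are working with a CookieSet:

import org.openide.cookies.OpenCookie;
import org.openide.nodes.AbstractNode;
import org.openide.nodes.Children;
import org.openide.nodes.Node;

/** Creates its OpenCookie only when someone actually asks for it. */
public class ScoreNode extends AbstractNode {
  private OpenCookie openCookie;   // created lazily

  public ScoreNode() {
    super(Children.LEAF);
  }

  public Node.Cookie getCookie(Class type) {
    if (type == OpenCookie.class) {
      if (openCookie == null) {
        openCookie = new OpenCookie() {   // placeholder implementation
          public void open() {
            // open an editor or viewer for the underlying object
          }
        };
      }
      return openCookie;
    }
    return super.getCookie(type);
  }
}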

Techniques

A number of best practices for performance have evolved over the course of NetBeans’ lifespan. Indeed, as the size and complexity of the NetBeans IDE has grown, what was once a best practice (such as aggressively loading classes at startup for the sake of responsiveness) has sometimes become a very bad practice later. By and large, these guidelines rest on the assumption that the application is a large one, and they are geared toward conserving memory and performing work on demand rather than in anticipation of demand.

Lazy Initialization

This simple approach is a good coding practice in almost any application. When writing a class, don’t create objects until something actually tries to use them. In its simplest form, this adds up to the difference between:

public class Foo extends Object {
  Bar myBar = new Bar( );
  public void doSomething( ) {
    myBar.doSomething( );
  }
}

and this code, which will load faster and potentially consume less memory:

public class Foo extends Object {
  Bar myBar;
  public void doSomething( ) {
    getBar( ).doSomething( );
  }
  private Bar getBar( ) {
    if (myBar == null) {
      myBar = new Bar( );
    }
    return myBar;
  }
}

This approach is nothing revolutionary, and there are more reasons than just optimization to take it. If you’ve ever been coding and had an insight that you actually ought to do one more piece of work when you initialize something like the Bar object above, you probably know the pain of going through your code and replacing the references to the variable for your object with a method call. Even with the niftiest refactoring support in the world, it’s work that could have been avoided in the first place.

Avoid Static Initializers

Similar to the preceding example, it is generally preferable to avoid static initializer sections in your code, such as this:

public class Foo extends Object {
  protected static Bar[] myConstants = new Bar[200];
  static {
    for (int i = 0; i < 200; i++) {
      myConstants[i] = new Bar(i);
    }
  }
}

This is true for two reasons: first, as soon as the Foo class is loaded, Bar is also yanked into memory; second, your code is assuming that at some point the user is going to do something that makes use of this array. You don’t know that! Depending on how you write your code, it may be that your Foo class is referenced directly by a SystemAction that your module installs as a menu item. This means that as soon as NetBeans creates its main menu, Foo gets loaded, which in turn drags in Bar and creates 200 instances of it.

Avoid ModuleInstall Classes—Use XML Layers Instead

Another piece of the lazy initialization puzzle is avoiding ModuleInstall classes. A ModuleInstall class can be specified in a module’s manifest and has methods that are run when the module is installed, uninstalled, restored (that is, when NetBeans is restarted), or shut down. NetBeans offers XML layers specifically as a way to avoid doing unnecessary work during startup; the older infrastructure that lets these same things be specified in the module manifest causes classes to be loaded on startup and should be avoided anywhere it is possible to use XML instead. For the details of doing this, see Chapter 15.

As of NetBeans 3.3, only three kinds of sections must still be handled via the manifest: actions, data loaders, and nodes. In NetBeans 3.4, nodes will be taken off this list, and in NetBeans 4.0, it will be possible to handle actions and loaders declaratively in XML as well, effectively eliminating the need for any classes from a module to be loaded on startup. The specification for doing this is not crystallized at the time of this writing. Check the APIs for your version of NetBeans to find out how to do this.

.settings files

There is an extension to the Lookup mechanism: .settings files. They are similar to .instance files, except that when an instance is created, the system also attaches a PropertyChangeListener to the object. If any properties of the object change, those changes are written to disk and will be reflected in instances created from this .settings file in the future. This gives you persistent objects and storage for free, while retaining the advantage that an instance of your settings class is never instantiated, and never begins taking up memory, unless the user or module code demands it.

As with .instance files, one advantage of using XML layers is that no work needs to be done to deregister your objects and clean up if your module is disabled; your module’s XML layer is simply removed from the system. A well-behaved module offers its services to the system via XML layers.

Batching results of expensive operations

There is no getting around it—some operations occasionally take quite a bit of time. Some kinds of Nodes will require intensive calculation to instantiate their children (such as parsing a large document). The same is true of properties of objects. Sometimes delaying instantiation until the last possible moment results in an unacceptably long delay for the user. This is a bad thing for two reasons: first, the user can’t do any other work while they wait for NetBeans to finish chewing; second, it simply creates a bad impression that makes people want to find a faster solution, one that may not involve your module or NetBeans.

Fortunately, such situations have at least a partial solution. In most cases you can do a partial calculation in a background thread, return the results and continue the work, while the user remains able to interact with NetBeans. The basic technique is this: if you are creating a Node with many children or properties, create a threshold number of them (for example, enough properties to fill most of a property sheet or subnodes in a typical Explorer tree view). Return those, and in the background create the rest in batched sets, firing events along the way to notify the system that the children or properties have changed and the UI should be updated.
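
Here is one sketch of that batching pattern using Children.Keys and RequestProcessor.postRequest( ). The example treats the lines of a potentially huge file as keys; each call to setKeys( ) fires the change events that let the view update incrementally:

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.openide.nodes.AbstractNode;
import org.openide.nodes.Children;
import org.openide.nodes.Node;
import org.openide.util.RequestProcessor;

/** Publishes child keys in batches from a background request. */
public class LineChildren extends Children.Keys implements Runnable {
  private static final int BATCH = 100;
  private final File file;   // a potentially huge file whose lines become child nodes

  public LineChildren(File file) {
    this.file = file;
  }

  protected void addNotify() {
    // Do the expensive reading off the calling thread.
    RequestProcessor.postRequest(this);
  }

  public void run() {
    List keys = new ArrayList();
    try {
      BufferedReader r = new BufferedReader(new FileReader(file));
      try {
        String line;
        while ((line = r.readLine()) != null) {
          keys.add((keys.size() + 1) + ": " + line);   // line number keeps keys unique
          if (keys.size() % BATCH == 0) {
            setKeys(new ArrayList(keys));   // publish what we have; the view updates incrementally
          }
        }
      } finally {
        r.close();
      }
    } catch (IOException ioe) {
      // ignored for the purposes of this sketch
    }
    setKeys(keys);   // the final, complete set
  }

  protected Node[] createNodes(Object key) {
    AbstractNode n = new AbstractNode(Children.LEAF);
    n.setName((String) key);
    return new Node[] { n };
  }
}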

Partial Loading Considerations—InstanceCookie and InstanceDataObject

If someone opens the Options window, they see a large number of Nodes that represent user-settable settings. The last thing anybody wants is to cause classes from every module that installs a settings Node to get dragged into memory, simply because the user is being presented with a dialog. After all, the user most likely wants to change one or two things when opening the settings dialog box. So it’s practically guaranteed that most elements in the Options tree will not be touched in a given invocation of the dialog box.

NetBeans contains infrastructure for name-based lookup and loading of classes. By using XML layers and InstanceDataObject, you can know the name of the class you want to get and have the appropriate icon and localized display name for it, without actually forcing it to be loaded unless the class needs to do some work, such as displaying its properties. Use this pattern in your own classes for greater efficiency.
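
As a hedged illustration, a layer fragment along these lines registers a service purely by name (the class, bundle, and icon paths are hypothetical); the dashes in the file name stand in for dots in the class name:

<folder name="Services">
  <file name="com-example-books-BookIndexer.instance">
    <attr name="SystemFileSystem.localizingBundle"
          stringvalue="com.example.books.Bundle"/>
    <attr name="SystemFileSystem.icon"
          urlvalue="nbresloc:/com/example/books/indexer.gif"/>
  </file>
</folder>

The display name is typically looked up in that bundle under a key matching the file’s full path in the system filesystem, and the class itself is loaded only when something finally asks the file’s InstanceCookie for an instance.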

Use URLs Instead of CDATA Sections in XML Layers

The specification for XML layers includes the ability to include file data inline, inside <file> tags. However, this imposes a memory penalty, since the string representation of that data will be held in memory. It is equally possible to provide a URL in the XML layer that points to the actual content, as a path inside your module JAR. This is the better approach: no memory is taken up by data that might never be used (except for the URL string itself), and the file at the other end of the URL is read only if its contents are actually needed by the system.

Additionally, pointing to external content using a URL makes it simpler to work with non-ASCII characters or binary files and avoids possible confusion about whitespace surrounding the content.
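
For comparison, here are two hedged layer fragments; the file names are hypothetical, and the url value is ordinarily resolved relative to the layer file inside your module JAR:

<!-- Inline content: the text is held in memory as part of the layer -->
<file name="Templates/Other/notes.txt">
  <![CDATA[TODO: replace with real template text]]>
</file>

<!-- URL to a resource in the module JAR: only the URL string stays in memory -->
<file name="Templates/Other/notes.txt" url="templates/notes.txt"/>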

Reduce the Number of Classes You Create

Current JVMs are still a bit inefficient when loading classes. Where possible, aggregate. For example, if your class attaches property change listeners to several types of objects, it is more efficient to test for the source of the change event and handle it accordingly than to create several anonymous inner property change listeners to do the job.
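
For example, one listener class can serve several sources by branching on the event source; the class below is illustrative and uses only standard Nodes and DataObjects APIs:

import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import org.openide.loaders.DataObject;
import org.openide.nodes.Node;

/** One listener class for two different sources, instead of two anonymous inner classes. */
public class ChangeTracker implements PropertyChangeListener {
  private final DataObject dob;
  private final Node node;

  public ChangeTracker(DataObject dob, Node node) {
    this.dob = dob;
    this.node = node;
    dob.addPropertyChangeListener(this);
    node.addPropertyChangeListener(this);
  }

  public void propertyChange(PropertyChangeEvent evt) {
    // Branch on the source rather than creating one listener class per source.
    if (evt.getSource() == dob) {
      // react to changes in the DataObject
    } else if (evt.getSource() == node) {
      // react to changes in the Node
    }
  }
}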

Another example is the case where you need to run some work in another thread and listen for its completion. One typically thinks of the listener as distinct from the Runnable being run, but having a single class play both roles, as in the following code, is actually more efficient:

// "cookie" is assumed to be some cookie whose prepare( ) method returns an
// org.openide.util.Task that completes asynchronously.
Task t = cookie.prepare( );
// One class serves as both the TaskListener and the Runnable that finishes
// the job on the event thread, instead of two separate classes.
class MyRun implements TaskListener, Runnable {
  public void taskFinished(Task t2) {
    SwingUtilities.invokeLater(this);
  }
  public void run( ) {
    // do the work...
  }
}
t.addTaskListener(new MyRun( ));

GUI Components—Wait for addNotify( )

When creating GUI components, there is a temptation to create all the subcomponents of a component in the constructor. This is, however, a comparatively expensive process, and your component has no guarantee that it will ever be displayed until it actually appears onscreen. java.awt.Component and its descendants provide convenient hooks into the process of displaying and hiding a component: the methods addNotify( ) and removeNotify( ). If you need to do significant work adding and initializing subcomponents, do it in addNotify( ), and release any resources you can in removeNotify( ). Remember that you must call the super method when overriding addNotify( ) and removeNotify( ), or bizarre bugs may result.
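
A minimal sketch of the pattern; buildModel( ) stands in for whatever expensive work actually populates your component:

import java.awt.BorderLayout;
import javax.swing.JPanel;
import javax.swing.JScrollPane;
import javax.swing.JTree;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.DefaultTreeModel;
import javax.swing.tree.TreeModel;

/** Defers building an expensive component tree until the panel is actually shown. */
public class LazyPanel extends JPanel {
  private JScrollPane content;

  public LazyPanel() {
    setLayout(new BorderLayout());
    // deliberately empty: nothing expensive in the constructor
  }

  public void addNotify() {
    super.addNotify();   // always call super first
    if (content == null) {
      content = new JScrollPane(new JTree(buildModel()));
      add(content, BorderLayout.CENTER);
    }
  }

  public void removeNotify() {
    super.removeNotify();
    // release anything that can cheaply be re-created next time
    removeAll();
    content = null;
  }

  private TreeModel buildModel() {
    // stands in for hypothetical expensive work, e.g. parsing a file to build the tree
    return new DefaultTreeModel(new DefaultMutableTreeNode("root"));
  }
}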

Using the addNotify( ) and removeNotify( ) Pattern Where Exposed by Non-GUI Classes

Nodes mirror the addNotify( ) and removeNotify( ) pattern found in java.awt.Component. When a Node is first expanded, addNotify( ) is called on its Children object, indicating that the child nodes will soon be required and that it needs to calculate its keys, or whatever initial data it needs in order to provide them. If the Node is collapsed, removeNotify( ) will eventually be called, at which time the Children object should dispose of its keys and return to its uninitialized state. This way, the objects created by expanding the Node can be garbage collected if the JVM runs low on memory.

SystemActions also have addNotify( ) and removeNotify( ) methods (implemented in the SharedClassObject ancestor class). These methods are called based on the presence of listeners on the action, and the presence of listeners implies that some component is presenting the action to the user. Thus any code in which your action attaches listeners to other objects, or does other initialization, can be postponed until listeners are attached to the action itself; an action that is not being presented to the user should not incur any overhead in the system. Most of the time, however, you will not subclass SystemAction directly, but rather one of the convenience subclasses for context-sensitive actions, such as NodeAction, CookieAction, or CallbackSystemAction, which handle this logic for you. If you are implementing addNotify( ) and removeNotify( ) to do something particularly clever with your action, it is worth putting in some debug code that logs calls to these methods using ErrorManager, to make sure you are not doing excessive work there.
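
A sketch of such instrumentation in a plain CallableSystemAction subclass; the class name and log messages are illustrative:

import org.openide.ErrorManager;
import org.openide.util.HelpCtx;
import org.openide.util.actions.CallableSystemAction;

/** An action instrumented to verify how often addNotify( )/removeNotify( ) run. */
public class MyAction extends CallableSystemAction {
  protected void addNotify() {
    super.addNotify();
    ErrorManager.getDefault().log("MyAction.addNotify: something is presenting this action");
    // attach listeners or do other initialization here, not in the constructor
  }

  protected void removeNotify() {
    ErrorManager.getDefault().log("MyAction.removeNotify: action no longer presented");
    // detach listeners and release resources
    super.removeNotify();
  }

  public void performAction() {
    // the action's real work
  }

  public String getName() {
    return "My Action";   // normally pulled from a resource bundle
  }

  public HelpCtx getHelpCtx() {
    return HelpCtx.DEFAULT_HELP;
  }
}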

While DataObjects lack the addNotify( )/removeNotify( ) pattern, with a little care you can nonetheless defer any expensive operations with the data object. Most commonly, the real functionality is available from some cookie served by the object. Do not initiate work such as parsing unless and until this cookie is actually requested.

Use Weak and Soft References for Objects

The java.lang.ref package contains a number of helper classes that can help you save memory by not forcing a hard reference to an object. In Java, an object is retained in memory as long as it is referenced by another object; objects that are not reachable from any other object in the system are no longer needed, and their memory can be reclaimed. When garbage collection runs, it does a pass through all objects in an area of memory and marks those that are not reachable as eligible to be collected, then takes a second pass to free the memory they were consuming. One of the primary sources of memory leaks in Java code is holding references to objects that will never be touched again, or that probably won’t be touched again and could simply be re-created if needed.

A weak reference is a reference that the garbage collector does not count when deciding whether an object is still reachable. If an object is only weakly reachable, it will generally be garbage collected soon, just like a completely unreachable object. To create one, you wrap the object in a java.lang.ref.WeakReference, which holds the reference for you. WeakReference has a method, Object get( ), that returns the referenced object, or null if it has already been garbage collected.

If you are creating an object that may or may not be needed or used for long periods of time, use either java.lang.ref.WeakReference or java.lang.ref.SoftReference to hold these objects, and provide a getter method that dereferences and returns the weakly referenced object. As long as some other code is using your object, it can’t be garbage collected because there will be a hard reference to it. If there are no hard references, it can safely be garbage collected, and your getter can re-create it on demand.

To show this in action, let’s look at some examples:

public class MySingletonObject {
  public static MySingletonObject defaultInstance =
    new MySingletonObject( );
}

The preceding example is what you absolutely don’t want to do—force creation of your object when the class is first loaded, whether or not it will be used! The following example is slightly better:

public class MySingletonObject {
  private static MySingletonObject defaultInstance = null;
  public static MySingletonObject getDefault( ) {
    if (defaultInstance == null) {
      defaultInstance = new MySingletonObject( );
    }
    return defaultInstance;
  }
}

But this is still not ideal—what if your object were used only once in an entire session of running NetBeans? Your object would spend the rest of its life taking up memory and waiting for the phone to ring. This is where weak references come in:

public class MySingletonObject {
  private static Reference defaultInstance = null;
  public static synchronized MySingletonObject getDefault( ) {
    MySingletonObject obj;
    if (defaultInstance != null) {
      obj = (MySingletonObject)defaultInstance.get( );
    } else {
      obj = null;
    }
    if (obj == null) {
      obj = new MySingletonObject( );
      defaultInstance = new WeakReference(obj);
    }
    return obj;
  }
}

This example gets your singleton, but without requiring it to remain in memory if it’s not actually being used.

If your object is something that should live longer but can be re-created, use java.lang.ref.SoftReference instead. It is functionally similar to WeakReference, except that the referenced object will typically not be collected, even when nothing holds a strong reference to it, until JVM memory runs very low. In some situations a weak reference is the logical choice and in others a soft reference is; they are not usually interchangeable.

Note that overuse of weak references can hurt performance in some circumstances by causing excessive object re-creation. Soft references do not suffer from this problem, but might consume memory for a long time holding objects no one will ask for. Always use a profiler and/or logging code to examine how often your object is really disposed and re-created.

Utility classes that can help

The Open APIs contain several helper classes that can assist you. Of particular interest is org.openide.util.WeakSet, a Set implementation whose members can disappear on garbage collection if they are not referenced elsewhere. If you need to keep a list of all instances of a class, use WeakSet to do so. The Java platform class java.util.WeakHashMap can also be helpful.

Use WeakListener

org.openide.util.WeakListener is probably the most commonly used helper class for saving memory in NetBeans. Say that you have a long-lived object such as a data model that holds a structural representation of some document you’ve parsed. A common situation is that you also want to attach a listener to it from a more peripheral piece of code, such as a GUI component, so that that component can update its state if the model is changed. The chances are that the model will outlive the GUI component, which the user may close. If the GUI component directly attaches a listener to the model, you’ve introduced a memory leak of sorts: The GUI component can never be garbage collected unless the model is as well because the model is still holding a reference to it—the listener. The model could end up holding references to many defunct objects this way. Worse, it is still firing changes to them.

WeakListener is the solution to this problem. It allows you to attach a listener that is only weakly referenced by the object it is attached to, so events can still be fired, but the listening object can be garbage collected once nothing else references it. The WeakListener class has inner classes that implement a number of the standard JDK and NetBeans listener interfaces, such as PropertyChangeListener, VetoableChangeListener, and so forth, and it provides static factory methods for attaching each of them; these essentially create a proxy listener that holds a weak reference to the real listener. To attach a PropertyChangeListener to the data model we just discussed, the code will look like this (assuming our short-lived GUI component implements PropertyChangeListener):

  model.addPropertyChangeListener(WeakListener.propertyChange(this, model));

If the GUI component in question gets garbage collected, the WeakListener removes itself from the model. If you need to implement a custom listener WeakListener does not support, you can subclass WeakListener to accomplish this.

If you have a listening object with addNotify( ) and removeNotify( ) methods (for example, Children or Component), you do not need to use WeakListener. It is simpler and more efficient to attach a listener directly in addNotify( ) and detach it in removeNotify( ).

Avoid Excessive Event Firing

You never know what may be listening to your objects, possibly champing at the bit to do time- and memory-consuming work just because a property of your object changed. A practical example is a Node that can be renamed by the user. It is not difficult for a user to click and hover the mouse a moment too long over a Node in the Explorer window, opening the in-place rename editor and then closing it without actually changing the name. Confirm that the name has actually changed before firing a change that could galvanize ponderous parsing processes into action.
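
A minimal sketch of the guard; AbstractNode may already perform a similar check internally, but the same test belongs in any class of your own that fires changes from a setter:

import org.openide.nodes.AbstractNode;
import org.openide.nodes.Children;

/** Only fires (and triggers downstream work) if the name really changed. */
public class RenamableNode extends AbstractNode {
  public RenamableNode() {
    super(Children.LEAF);
    setName("initial");
  }

  public void setName(String name) {
    if (name.equals(getName())) {
      return;   // in-place editor closed without a real change; do nothing
    }
    super.setName(name);   // fires the change; expensive listeners run only when needed
  }
}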

Avoid Overuse of Threads

Overuse of threads can be a problem on some operating systems and not on others. Since you (presumably) want your code to run smoothly and cleanly on as many platforms as possible, it’s always a good practice to optimize. On Windows this does not present a tremendous problem; on Linux overuse of threads causes a minor performance hit; on Solaris the effect can be severe.

In practice, two pieces of advice will help: avoid switching between threads more than is absolutely necessary, and if you need to do something in a different thread, create it and do as much work as you can in a single pass. Creating a thread, synchronizing with it, and switching to it are all expensive in Java.

The utility org.openide.util.RequestProcessor.postRequest( ) provides a convenient technique for creating a new runnable and dumping it on a queue to be run later. This is useful in many parts of NetBeans programming where you need to do something but doing so at the present time would be dangerous (for example, if you’re holding a lock and you’re in an arbitrary thread and need to change the GUI). You can also use SwingUtilities.invokeLater( ) for similar purposes, and the advice is the same. A common mistake is to post too many Runnables for trivial purposes. The next section covers one of the primary causes of this.

Note that for NetBeans 3.4, it is deprecated to post requests into the singleton public request processor instance, as such requests are run serially and can deadlock one another unless written carefully to avoid this. In NetBeans 3.3 you can create a private request processor to ensure that such deadlocks do not occur as a result of your code, but the price is the creation of a dedicated thread; in 3.4 creating a private request processor is lightweight and recommended more broadly.
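
A sketch of a module-private processor, assuming the 3.4-era RequestProcessor constructor and post( ) method:

import org.openide.util.RequestProcessor;

/** A module-private RequestProcessor so background work cannot deadlock
    with unrelated tasks posted to the shared default queue. */
public class BackgroundWork {
  // one private processor (and thread) for this module's background tasks
  private static final RequestProcessor RP = new RequestProcessor("MyModule tasks");

  public static void scheduleRefresh() {
    RP.post(new Runnable() {
      public void run() {
        // the slow, non-GUI part of the work goes here
      }
    });
  }
}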

Another thing to avoid is too much synchronization when using threads. Wherever possible, use a single lock or mutex (the class org.openide.util.Mutex exists to help with this). Synchronization is expensive!

Batching Events

A typical mistake is to have a class with possibly hundreds of instances, all listening to the same object. It is a far better practice to batch the changes: maintain a WeakSet containing all instances of your class, attach a single listener to the source of changes, and when a change the instances need to know about occurs, iterate through that set and notify the objects that need it, as in the sketch below. This is particularly important in cases where each notification could involve thread creation.
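
A sketch of this pattern; SomeModel is a hypothetical shared source of change events, and WeakSet keeps the registry from pinning instances in memory:

import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import java.util.Set;
import org.openide.util.WeakSet;

/** One shared listener notifies every live instance, instead of one listener per instance. */
public class Thumbnail {
  private static final Set instances = new WeakSet();
  private static boolean listening = false;

  public Thumbnail() {
    synchronized (Thumbnail.class) {
      instances.add(this);
      if (!listening) {
        // SomeModel is hypothetical; attach a single listener for all instances
        SomeModel.getDefault().addPropertyChangeListener(new PropertyChangeListener() {
          public void propertyChange(PropertyChangeEvent evt) {
            notifyInstances(evt);
          }
        });
        listening = true;
      }
    }
  }

  private static void notifyInstances(PropertyChangeEvent evt) {
    Object[] targets;
    synchronized (Thumbnail.class) {
      targets = instances.toArray();
    }
    for (int i = 0; i < targets.length; i++) {
      ((Thumbnail) targets[i]).update(evt);   // one pass over all live instances
    }
  }

  void update(PropertyChangeEvent evt) {
    // react to the change here
  }
}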

Another case in which you will want to batch events is when a change that will have large effects on subsidiary objects is being made. Filesystem operations are typical of this class of event. If you’re in the NetBeans IDE and you rename or move a Java package, this has a ripple effect on all of the DataObjects in the system that represent Java sources in that package, and you definitely don’t want the effects of the rename to start causing cascading events of their own before the renaming work is complete. FileSystem.runAtomicAction( ) exists for this reason. If you need to do a large number of file operations at once, it is best to run them as an atomic block. This avoids some synchronization costs and fires changes only once, at the end of the operation; all changes are queued and duplicates are pruned.
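
A sketch of an atomic batch of file operations; the folder is assumed to be a FileObject the caller already has, and the file names are illustrative:

import java.io.IOException;
import org.openide.filesystems.FileObject;
import org.openide.filesystems.FileSystem;

public class BatchFileOps {
  /** Creates several files as one atomic action on someFolder's filesystem. */
  public static void createFiles(final FileObject someFolder) throws IOException {
    FileSystem fs = someFolder.getFileSystem();
    fs.runAtomicAction(new FileSystem.AtomicAction() {
      public void run() throws IOException {
        for (int i = 0; i < 10; i++) {
          // each createData would normally fire its own events;
          // inside the atomic action they are queued and fired once at the end
          someFolder.createData("generated" + i, "txt");
        }
      }
    });
  }
}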

Swing Performance

Excellent resources are available on Swing performance, so we won’t cover this topic in great detail. Most of the preceding advice applies to Swing programming as well, along with such advice as using a lazy TableModel to load large tables of data on demand. Generally it is inadvisable to cache dialog boxes, with one caveat: a dialog that is expensive to create (because heavy parsing or complex graphical operations are required to display it) may be better off weakly cached, which avoids heavy memory consumption if multiple instances are created serially.

Once you have a highly-tuned module packed with features, how can you get it to your users? You will need to take steps to package it into a convenient form and learn how to maintain it for the long haul. The next chapter explains how to do so.



[20] Soft caching is a means of holding a reference to an object, but still allowing it to be garbage collected by the JVM if memory is required. In that case it will need to be re-created on demand if requested again.
