Code generation strategies

There are several ways of executing bytecode in a JVM, from just emulating the bytecode in a pure bytecode interpreter to converting everything to native code for a particular platform.

Pure bytecode interpretation

Early JVMs contained only simple bytecode interpreters as a means of executing Java code. To simplify this a little, a bytecode interpreter is just a main function with a large switch construct on the possible opcodes. The function is called with a state representing the contents of the Java evaluation stack and the local variables. Interpreting a bytecode operation uses this state as input and output. All in all, the fundamentals of a working interpreter shouldn't amount to more than a couple of thousand lines of code.

Using a pure interpreter has several benefits stemming from its simplicity. An interpreting JVM just needs to have its interpreter recompiled to support a new hardware architecture; no new native compiler needs to be written. Also, a native compiler for even a single platform is probably much larger than our simple switch construct.

A pure bytecode interpreter also needs little bookkeeping. A JVM that compiles some or all methods to native code must keep track of all compiled code. If a method is changed at runtime, which Java allows, it needs to be scheduled for regeneration, as the old code is obsolete. In a pure interpreter, the method's new bytecode is simply interpreted again from the start the next time the method is called.

It follows that the amount of bookkeeping in a completely interpreted model is minimal. This lends itself well to being used in an adaptive runtime such as a JVM, where things change all the time.

Naturally, there is a significant performance penalty for purely interpreted execution when comparing the execution time of an interpreted method with a native code version of the same code. Sun Microsystems' Classic Virtual Machine started out as a pure bytecode interpreter.

Running our previous add method, with its four bytecode instructions, might easily require the execution of ten times as many native instructions in an interpreter written in C, whereas a native version of add would most likely consist of just two assembly instructions: an add and a return.
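For reference, an add method of the kind discussed (the exact source from the earlier example is not shown here, so this is a representative reconstruction) might look as follows in Java, with the comments showing the four bytecodes javac produces for its body — the same four opcodes the interpreter below handles:

```java
class Adder {
    // For an instance method, local variable slot 0 holds 'this', so the
    // two int arguments land in slots 1 and 2. javac compiles the body to:
    //   iload_1   // push the first argument onto the evaluation stack
    //   iload_2   // push the second argument
    //   iadd      // pop both operands, push their sum
    //   ireturn   // pop the sum and return it
    int add(int a, int b) {
        return a + b;
    }
}
```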

int evaluate(int opcode, int* stack, int* localvars) {
    switch (opcode) {
    ...
    case iload_1:
    case iload_2: {
        int lslot = opcode - iload_1;   // local variable slot (1 or 2)
        stack[sp++] = localvars[lslot]; // push the local onto the stack
        break;
    }
    case iadd: {
        int a = stack[--sp];            // pop the two operands one at a
        int b = stack[--sp];            // time, in a well-defined order
        stack[sp++] = a + b;            // push their sum
        break;
    }
    case ireturn:
        return stack[--sp];             // pop and return the top of stack
    ...
    }
}

The previous example shows simple pseudo code for a bytecode interpreter with just enough functionality to execute our add method. Even this simple code snippet amounts to tens of assembly instructions in the natively compiled JVM. Considering that a natively compiled version of the add method would just be two instructions, this illustrates the performance problem with pure bytecode interpretation.

JIT compiling the add method on x86 yields us:

add eax, edx // eax = eax+edx
ret // return eax

Note

Note that this book will sometimes show assembly code examples to illustrate points. No prior knowledge of assembly code on any platform is needed to reap the full benefits of the text, though the concept of low-level languages should be familiar to the reader. If you find yourself breaking out in a cold sweat over the assembly listings displayed in a few places throughout the text, don't worry too much. They are not necessary for understanding the big picture.

Static compilation

In the early days of Java, several simple "brute force" approaches to getting around the bytecode performance problem were tried. These usually involved static compilation in some form: typically, an entire Java program was compiled into native code before execution. This is known as ahead-of-time compilation. Basically, ahead-of-time compilation is what your average C++ compiler does all the time.

As a limited subset of the problem of static compilation for Java is easy to solve, a number of products appeared in the late nineties, using methodologies such as turning bytecode into naive C code and then passing it to a C compiler. Most of the time, the resulting code ran significantly faster than purely interpreted bytecode. However, these kinds of products rarely supported the full dynamic nature of the Java language and were unable to gracefully handle things like code being replaced at runtime without large workarounds.

The obvious disadvantage of static compilation for Java is that the benefits of platform independence immediately disappear. The JVM is removed from the equation.

Another disadvantage is that the automatic memory management of Java has to be handled more or less explicitly, leading to limited implementations with scalability issues.

As Java gradually moved more and more towards server side applications, where its dynamic nature was put to even more use, static solutions became impractical. For example, an application server generating plenty of Java Server Pages (JSPs) on the fly reduces a static compiler to a JIT compiling JVM, only slower and less adaptive.

Note

Note that static ahead-of-time solutions, while unsuitable for implementing Java, can be useful in certain other contexts, for example ahead-of-time analysis. Program analysis is a time consuming business. If some of it can be done offline, before program execution, and communicated to the JVM, there may be performance benefits to be had. For example, .class files may be annotated with offline profiling data, perhaps in the form of Java Annotations.
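As a sketch of the idea mentioned in the note, offline profiling data could hypothetically be attached to compiled classes through an annotation. The annotation name and its fields below are invented for illustration; no real JVM interface of this kind is implied:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Hypothetical annotation: an offline profiling tool could attach this
// to methods it found hot, and a JVM aware of it could read the data
// from the .class file to skip part of the warm-up phase.
@Retention(RetentionPolicy.CLASS)
@interface OfflineProfile {
    long invocationCount();   // calls observed during the offline run
    boolean compileEagerly(); // hint: JIT compile when first loaded
}

class HotPath {
    @OfflineProfile(invocationCount = 1_000_000, compileEagerly = true)
    int add(int a, int b) {
        return a + b;
    }
}
```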

Total JIT compilation

Another way to speed up bytecode execution is to not use an interpreter at all, and JIT compile all Java methods to native code immediately when they are first encountered. The compilation takes place at runtime, inside the JVM, not ahead-of-time.

Unlike completely static ahead-of-time compilation, on-the-fly compilation fits the mobile and adaptive nature of Java much better.

Total JIT compilation has the advantage that we do not need to maintain an interpreter, but the disadvantage is that compile time becomes a factor in the total runtime. While we definitely see benefits in JIT compiling hot methods, we also unnecessarily spend expensive compile time on cold methods and methods that are run only once. Those methods might as well have been interpreted instead.

Note

A frequently executed method is said to be hot. A method that is not frequently executed and doesn't contribute to the overall program performance regardless of its implementation is said to be cold.

This can be remedied by implementing different levels of compiler quality in the JIT compiler, starting out with every method as a quick and dirty version. When the JVM knows that a method is hot, for example if the number of invocations of the method reaches a certain threshold value, it can be queued for recompilation with more optimizations applied. This naturally takes longer.
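The trigger described above can be sketched as follows. This is a minimal illustration of an invocation-counter policy, not any real JVM's implementation; the threshold value and class names are invented, and a real JVM would do the counting in generated profiling code rather than at the Java level:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Sketch of an invocation-counter based recompilation trigger.
class CompilationPolicy {
    static final int HOT_THRESHOLD = 10_000; // illustrative value

    private final Map<String, Integer> invocationCounts = new HashMap<>();
    private final Queue<String> recompileQueue = new ArrayDeque<>();

    // Notionally called on every invocation of a method.
    void onInvoke(String method) {
        int count = invocationCounts.merge(method, 1, Integer::sum);
        if (count == HOT_THRESHOLD) {
            // The method just became hot: queue it for recompilation
            // with more optimizations applied.
            recompileQueue.add(method);
        }
    }

    boolean isQueuedForOptimization(String method) {
        return recompileQueue.contains(method);
    }
}
```

Cold methods never reach the threshold, so no compile time is spent on them until their invocation counts prove they matter.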

The main disadvantage of total JIT compilation is still low code generation speed. In the same way that an interpreted method executes hundreds of times slower than a native one, a native method that has to be generated from Java bytecodes takes hundreds of times longer to get ready for execution than an interpreted method. When using total JIT compilation, it is extremely important to spend clock cycles on optimizing code only where it will pay off in better execution time. The mechanism that detects hot methods has to be very advanced, indeed. Even a quick and dirty JIT compiler is still significantly slower at getting code ready for execution than a pure interpreter. The interpreter never needs to translate bytecodes into anything else.

Another issue that becomes more important with total JIT compilation is the large amount of throwaway code that is produced. If a method is regenerated, for example because assumptions made by the compiler are no longer valid, the old code takes up precious memory. The same is true for a method that has been replaced by a more optimized version. Therefore, the JVM requires some kind of "garbage collection" for generated code, or a system with large amounts of JIT compilation would slowly run out of native memory as code buffers grow.

JRockit is an example of a JVM that uses an advanced variant of total JIT compilation as its code generation strategy.

Mixed mode interpretation

The first workable solution that both increased execution speed and did not compromise the dynamic nature of Java was mixed mode interpretation.

In a JVM using mixed mode interpretation, all methods start out as interpreted when they are first encountered. However, when a method is found to be hot, it is scheduled for JIT compilation and turned into more efficient native code. This adaptive approach is similar to that of keeping different code quality levels in the JIT, described in the previous section.

Detecting hot methods is a fundamental functionality of every modern JVM, regardless of code execution model, and it will be covered to a greater extent later in this chapter. Early mixed mode interpreters typically detected the hotness of a method by counting the number of times it was invoked. If this number was large enough, optimizing JIT compilation would be triggered for the method.

As with total JIT compilation, if the mechanism that determines whether a method is hot is good enough, the JVM spends compilation time only on the methods where it makes the most difference. If a method is seldom executed, the JVM wastes no time turning it into native code, but rather keeps interpreting it each time that it is called.

Bookkeeping JIT code is a simple problem with mixed mode interpretation. If a compiled version of a method needs to be regenerated or an assumption is invalidated, its native code is simply thrown out. The next time the method is called, it will once again be interpreted. If the method is still hot, it will eventually be recompiled with the changed model of the world incorporated.

Note

Sun Microsystems was the first vendor to embrace mixed mode interpretation in the HotSpot compiler, available both in a client version and a server side version, the latter with more advanced code optimizations. HotSpot in turn, was based on technology acquired from Longview Technologies LLC (which started out as Animorphic).
