Appendix A. Application Evolution

 

With every passing hour our solar system comes 43,000 miles closer to globular cluster M13 in the constellation Hercules, and still there are some misfits who continue to insist that there is no such thing as progress.

 
 --Ransom K. Ferm

The Java platform has undergone a number of changes since it was first introduced, but none more significant than those that occurred with the 5.0 release. While in general we try to avoid issues concerning the evolution of the platform, it is impossible to ignore those issues now. For some time to come, there will be applications and libraries that still comply with the older 1.3 version of the platform (as documented in the third edition of this book), those that support the additional features introduced in the 1.4 release, and finally those moving forward to take full advantage of the new features of the 5.0 release. The technical issues involved in application evolution and migration can be complex, and many of the changes to the language and libraries for the 5.0 release were designed with backward compatibility and migration in mind. A full coverage of the technical issues, let alone the management and logistics issues involved, is beyond the scope of this book, but we do want to highlight a couple of areas for your consideration.

The first section gives a brief overview of how the compiler and virtual machine can deal with different versions of the language and runtime. Next, we look at some ways of dealing with the different dialects that language changes produce, using assertions as an example. Finally, we look at some of the issues involving the integration of generic and non-generic code.

Language, Library, and Virtual Machine Versions

Each new release of the Java platform potentially changes three things:

  • The language itself, through the addition of new keywords or extensions to the use of existing keywords

  • The libraries: new types and/or new methods on existing types

  • The virtual machine, and in particular the format of the compiled classes that the virtual machine expects

Not every new platform release changes all three—in particular, changes to the class format occur rarely because of the major impact such changes can have.

The compiler that comes with the Java Development Kit (JDK) tracks changes to the language and virtual machine through the use of “source” and “target” options that can be passed to it. By supplying an appropriate source and target pair you can compile code that will run on current or older versions of the virtual machine—assuming the code is compatible with the source version of the language chosen.

The different source versions and their approximate meaning are as follows:

  • 1.1 — The oldest recognized definition of the language, which included the original language definition together with nested types and blank final variables. This source version is no longer supported by the compiler in the 5.0 release.

  • 1.2 — Introduced the strictfp modifier.

  • 1.3 — Same as 1.2; this version number was added for consistency.

  • 1.4 — Introduced assert.

  • 1.5 — Introduced generics, enums, annotations, and the enhanced for loop. Also uses StringBuilder rather than StringBuffer for string concatenation.

Similarly, the target versions are

  • 1.1 — Compliant with the first edition of the Java Virtual Machine Specification (JVMS)

  • 1.2 — Compliant with the second edition of the JVMS. The main change involved the way static member accesses were encoded in the compiled class and resolved at runtime.

  • 1.3 — No known change

  • 1.4 — No known change

  • 1.5 — Compliant with the third edition of the JVMS. This enables support for enums, annotations, and some aspects of generics. It also modifies the way class literals are implemented.

Because some language changes require new library classes, a change in the source version may imply a minimum target value. For example, a source of 1.4 requires a target of 1.4 because an earlier version of the virtual machine won't have the classes or methods that support assertions. Not surprisingly, a source of 1.5 requires a target of 1.5 as well. Naturally, source and target versions that came into existence after a given compiler was released are not supported.

Each target version causes a change in the class file version number that is encoded in the compiled classes. Each virtual machine knows which class versions it supports and will generally refuse to load any class files that have a higher version number. This means, for example, that a class compiled with a newer target cannot be loaded on an older virtual machine, even if the class uses no functionality that the older version lacks.

The default settings for the compiler in the 5.0 release are source 1.5 and target 1.5, enabling all the latest features of the language. If you compile code to run on earlier versions (code that, of course, doesn't use the new features), you'll need to explicitly set the target to a suitable earlier version.
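For example, the source and target settings are passed on the javac command line. This is a sketch only; the file names are ours, and the exact set of versions a given compiler accepts depends on its release (cross-compiling to an older target also generally requires pointing -bootclasspath at that release's class libraries):

```shell
# Compile Legacy.java as 1.4-dialect source for a 1.4 virtual machine
javac -source 1.4 -target 1.4 Legacy.java

# With the 5.0 compiler the defaults are -source 1.5 -target 1.5,
# so this accepts generics, enums, and the other new features
javac Modern.java
```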

Dealing with Multiple Dialects

Adding new keywords to the language runs the risk of breaking existing code, especially if the word is an identifier as common as “assert” or “enum.”[1] The choice to add a new keyword creates some interesting and important compatibility issues. Taking assertions as an example, you can choose whether or not assert is recognized as a keyword in your code by selecting a source version of 1.4 or higher. This allows you to keep your old use of “assert” in one class while using language assertions in another—provided the different uses are independent. When assertions were introduced in the 1.4 release, the default source version was left at 1.3, so by default nothing changed and you had to explicitly tell the compiler when you were ready to use the new language feature. In the 5.0 release the presumption is that everyone will want to use at least some of the new features, so they are enabled by default—if you don't want them you must either specify the appropriate source version or use an older version of the compiler.
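As a reminder of what the new keyword does under a 1.4-or-later source version, here is a small sketch of a language assertion (the class and method names are ours; the assert is checked only when the virtual machine is started with -ea):

```java
public class AssertExample {
    // Returns the integer square root; the assert documents the precondition.
    static int isqrt(int n) {
        assert n >= 0 : "negative argument: " + n;  // checked only under -ea
        return (int) Math.sqrt(n);
    }

    public static void main(String[] args) {
        System.out.println(isqrt(49));  // prints 7
    }
}
```

Under a 1.3 source version this file would be rejected, while a file using “assert” as a method name would compile cleanly; with 1.4 or later the situation is exactly reversed.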

These side-by-side dialects can create problems. With only one dialect you could use the same compiler, and the same compiler options, on all your source code. With two or more dialects you might have some source files that require assert or enum as keywords and others that reject them. You will have to compile your source carefully to handle such a situation. At this point, there are three dialects: with both assert and enum (5.0), with assert but not enum (1.4), and with neither (1.3 and earlier).

The obvious solution is to use a single dialect for all your source. But you will not always have that kind of control. If you are using source produced by another group they may have moved on while you cannot, or vice versa.

This is deeper than you might hope. Suppose you have two source files: Newer.java, which uses the assert keyword, and Older.java, which uses “assert” as a method name. If the newer class depends on the older and neither has yet been compiled, compiling Newer.java will attempt to compile Older.java as well. With only one dialect this is a feature. But with incompatible dialects you will be unable to compile Newer.java until you have already compiled Older.java under the non-assert dialect. This complicates your build process because the order in which you compile things has become significant.

Of course, the ultimate problem is if Newer.java and Older.java depend on each other. Now you have a loop you cannot solve without modifying one source file. Presumably you would upgrade Older.java to use the new dialect, but if you cannot do so you will have to regress Newer.java to the older dialect.

Woe betide if you can modify neither.

Generics: Reification, Erasure, and Raw Types

The generic type system was added to the Java programming language for the 5.0 release. The design was strongly constrained by the need to maintain compatibility between independently developed code modules that may or may not include the use of generics. The key design decision was that generic types could not be reified at runtime. Rather, for each generic type declaration there would be a single type that was defined in terms of the erasure of its type variables. As we discussed in Chapter 11, in simple terms the type is defined such that each type variable is replaced by its bound—commonly Object. The reasoning behind this decision is beyond the scope of this book, but to quote JLS 4.7 (The Java Language Specification, Third Edition, Section 4.7):

  • …the design of the generic type system seeks to support migration compatibility. Migration compatibility allows the evolution of existing code to take advantage of generics without imposing dependencies between independently developed software modules. The price of migration compatibility is that full reification of the generic type system is not possible, at least while the migration is taking place.

Note that in the future, this design decision might be changed.

Raw Types, “Unchecked” Warnings, and Bridge Methods

A consequence of the reification design decision is the existence and use of raw types. Raw types represent the erasure of a specific generic type. They exist so that code that was written to use a non-generic version of a class or interface can execute in a virtual machine that uses a generic version of that class or interface. Some of these uses may cause “unchecked” warnings to be issued by the compiler, but the uses are still permitted. “Unchecked” warnings are emitted in the following circumstances:

  • Assignment from a raw type variable to a parameterized type variable—the raw type variable might not refer to an instance of the expected parameterized type.

  • Casts involving type variables—the cast cannot be checked at runtime because the erasure of the type variable is used.

  • Invocation of a method or constructor of a raw type if erasure changes the type of a parameter.

  • Access to a field of a raw type if erasure changes the type of the field.

The conversion between a raw type and a parameterized type adds an additional conversion to those defined in Section 9.4 on page 216: the unchecked conversion. These unchecked conversions can be applied as the final step in most of the conversion contexts—for example, as part of an assignment. An unchecked conversion is always accompanied by an “unchecked” warning, unless it has been suppressed by using the annotation SuppressWarnings("unchecked")—if your compiler supports this.
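The danger behind these warnings can be sketched in a few lines (the class and variable names are ours): the unchecked conversion compiles, but the type mismatch it hides surfaces only later, as a ClassCastException at the point where the compiler inserted a cast on our behalf.

```java
import java.util.ArrayList;
import java.util.List;

public class UncheckedDemo {
    // Returns true if reading through the wrongly typed reference fails.
    @SuppressWarnings("unchecked")
    static boolean readFails() {
        List<Integer> ints = new ArrayList<Integer>();
        ints.add(42);

        List raw = ints;               // raw type: the type argument is lost
        List<String> strings = raw;    // unchecked conversion ("unchecked" warning)

        try {
            String s = strings.get(0); // hidden cast to String fails here
            return false;
        } catch (ClassCastException e) {
            return true;               // the element was an Integer, not a String
        }
    }

    public static void main(String[] args) {
        System.out.println(readFails());  // prints true
    }
}
```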

Use of raw types can also lead to other complications. Consider a variation of the passThrough example from Chapter 11 repackaged as a generic interface:

interface PassThrough<T> {
    T passThrough(T t);
}

Now consider an implementation of that interface:

class PassThroughString implements PassThrough<String> {
    public String passThrough(String t) {
        return t;
    }
}

and the following incorrect use:

public static void main(String[] args) {
    PassThrough s = new PassThroughString();
    s.passThrough(args);
}

This results in an “unchecked” warning at compile time because the use of the raw type means that we don't know what type passThrough should take or return. But we can see that s.passThrough should only be accepting a String argument, and args is a String[], not a String. If the compiler rejected the above, we could appease it by casting args to String, and then we should expect, and would get, a ClassCastException at runtime. The compiler doesn't reject this code, however, yet if we execute this method a ClassCastException is still thrown. This behavior is specified by a special rule of method invocation that deals with this situation in JLS 15.12.4.5:

  • If the erasure of the type of the method being invoked differs in its signature from the erasure of the type of the compile-time declaration for the method invocation, then if any of the argument values is an object which is not an instance of a subclass or subinterface of the erasure of the corresponding formal parameter type in the compile-time declaration for the method invocation, then a ClassCastException is thrown.

In simple terms, the compiler has to accept incorrect invocations like the above at compile time, but it has to ensure that at runtime the bad argument is not actually passed to the method. The ClassCastException must be thrown before invoking the method (as if we had put in the bad cast ourselves). One way a compiler can ensure this is to introduce what is known as a bridge method. A suitable bridge method in this case would be the following:

public Object passThrough(Object o) {
    return passThrough((String)o);
}

The compiler would insert calls to this bridge method, rather than calls to the actual method. Thus, the cast will fail when it should and succeed otherwise. Because bridge methods are introduced by the compiler, they will be marked as synthetic.
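Both effects can be observed directly. The following sketch repeats the PassThrough types from above so that it is self-contained, lists the declared methods via standard java.lang.reflect calls (Method.isBridge reports the compiler-generated bridge), and then triggers the specified ClassCastException through the raw type:

```java
import java.lang.reflect.Method;

interface PassThrough<T> {
    T passThrough(T t);
}

class PassThroughString implements PassThrough<String> {
    public String passThrough(String t) { return t; }
}

public class BridgeDemo {
    public static void main(String[] args) {
        // The class file contains passThrough(String) plus a synthetic
        // passThrough(Object) bridge inserted by the compiler.
        for (Method m : PassThroughString.class.getDeclaredMethods()) {
            System.out.println(m.getName()
                + "(" + m.getParameterTypes()[0].getSimpleName() + ")"
                + (m.isBridge() ? "  [bridge]" : ""));
        }

        // Invoking through the raw type routes the call via the bridge,
        // whose cast to String fails before the real method is entered.
        PassThrough s = new PassThroughString();
        try {
            s.passThrough(args);   // args is a String[], not a String
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as JLS 15.12.4.5 requires");
        }
    }
}
```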

Bridge methods also fill another role in maintaining backward compatibility. For example, prior to generics the Comparable interface's compareTo method took an Object argument, but now it takes a T, whatever that may be. However, code compiled against non-generified Comparable implementations contains bytecodes to invoke a version of the method that takes an Object. No such version is defined in the source code, so the compiler generates a compareTo(Object) bridge method that casts the argument to the expected type and invokes the new compareTo method.
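You can see this Comparable bridge at work by invoking compareTo through a raw reference, exactly as pre-generics bytecode would (the variable names are ours; String implements Comparable&lt;String&gt;, so its class file carries a compareTo(Object) bridge):

```java
public class ComparableBridgeDemo {
    public static void main(String[] args) {
        Comparable raw = "hello";   // raw use of Comparable<String>

        // The bridge casts the argument to String and delegates.
        System.out.println(raw.compareTo("hello"));  // prints 0

        try {
            raw.compareTo(Integer.valueOf(42));  // bridge's cast to String fails
        } catch (ClassCastException e) {
            System.out.println("ClassCastException from the bridge method");
        }
    }
}
```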

The use of raw types in code written after generics was added is strongly discouraged, as they exist only for backward compatibility and may be removed in a future version of the language.

API Issues

A second compatibility issue concerns the changes to the APIs themselves, within the class libraries. Many, if not all, of the generic types in the class libraries would have been written slightly differently had they been designed from scratch as generic types. For example, a number of methods in the collections classes that define a collection of T still take parameters of type Object. That was what the old signature specified, and changing it would break compatibility.
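The contains and remove methods of Collection are the usual illustration: they still take Object, so passing an argument of the “wrong” type compiles without complaint and simply reports false (a small sketch; the names are ours):

```java
import java.util.ArrayList;
import java.util.List;

public class ObjectParamDemo {
    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        names.add("duke");

        // Both calls compile cleanly even though an Integer can never be
        // present: the pre-generics signatures take Object, not E.
        // (Integer.valueOf is used to select remove(Object), not remove(int).)
        System.out.println(names.contains(Integer.valueOf(42)));  // false
        System.out.println(names.remove(Integer.valueOf(42)));    // false
        System.out.println(names.contains("duke"));               // true
    }
}
```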

As another example, the reflection method in java.lang.reflect.Array to create a new array is still defined as

public static Object newInstance(Class<?> type, int length)

which has the unfortunate consequence that use of this method always results in an “unchecked” warning by the compiler. A generic version might be specified as

public static <T> T[] newInstance(Class<T> type, int length)

but not only is that not backward compatible, it precludes creating arrays where the component type is a primitive type. So even with a type-safe generic method, you'd still need a second method to allow primitive array creation.
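In practice you confine the warning to a single place by wrapping the cast in a generic helper. The helper below is our own idiom, not part of the platform; note that it cannot be used for primitive component types, which still require the raw method:

```java
import java.lang.reflect.Array;

public class ArrayDemo {
    // Hypothetical helper: centralizes the unavoidable unchecked cast.
    @SuppressWarnings("unchecked")
    static <T> T[] newArray(Class<T> type, int length) {
        return (T[]) Array.newInstance(type, length);
    }

    public static void main(String[] args) {
        String[] strings = newArray(String.class, 3);
        System.out.println(strings.length);   // prints 3

        // Primitive component types need the raw method and an exact cast:
        int[] ints = (int[]) Array.newInstance(int.class, 4);
        System.out.println(ints.length);      // prints 4
    }
}
```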

There is a similar problem with implementing the clone method: You cannot define a clone method for a parameterized type without getting an “unchecked” warning when you cast the Object result of Object.clone to the correct type. This is one of those rare occasions in which a cast using the expected parameterized type is the right thing to do. For example, given a generic class Cell<E>, the clone method should be declared to return an instance of Cell<E> because that will permit the caller of clone to use the returned object without the need for any casts. To declare clone in that way requires that the result of super.clone be cast to Cell<E>, which incurs the “unchecked” warning. But be warned, an erroneous or malicious clone implementation could still return an object of the wrong type.
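A sketch of such a Cell class (our own illustration, in the spirit of Chapter 11's Cell) shows where the warning arises and how declaring the covariant return spares the caller any cast:

```java
public class Cell<E> implements Cloneable {
    private E element;

    public Cell(E element) { this.element = element; }
    public E get()         { return element; }

    @Override
    @SuppressWarnings("unchecked")  // the cast of super.clone() is unchecked
    public Cell<E> clone() {
        try {
            return (Cell<E>) super.clone();  // incurs the "unchecked" warning
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);     // cannot happen: we are Cloneable
        }
    }

    public static void main(String[] args) {
        Cell<String> cell = new Cell<String>("duke");
        Cell<String> copy = cell.clone();    // no cast needed by the caller
        System.out.println(copy.get());      // prints duke
    }
}
```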

As you generify your own applications and libraries, you must be aware of these issues in order to maintain the right level of compatibility with users of your code.

 

Things will get better despite our efforts to improve them.

 
 --Will Rogers


[1] The keyword strictfp was added to the language in the 1.2 release but caused almost no problems because hardly anyone had ever used it as an identifier.
