7 Generics and Collections

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

7
Generics and Collections

CERTIFICATION OBJECTIVES

Design Using Collections

Override equals() and hashCode(), Distinguish == and equals()

Use Generic Versions of Collections Including Set, List, and Map

Use Type Parameters, Write Generics methods

Use java.util to Sort and Search Use Comparable and Comparator

Two-Minute Drill

Q&A Self Test

Generics are possibly the most talked about feature of Java 5. Some people love 'em, some people hate'em, but they're here to stay. At their simplest, they can help make code easier to write, and more robust. At their most complex, they can be very, very hard to create, and maintain. Luckily, the exam creators stuck to the simple end of generics, covering the most common and useful features, and leaving out most of the especially tricky bits. Coverage of collections in this exam has expanded in two ways from the previous exam: the use of generics in collections, and the ability to sort and search through collections.

CERTIFICATION OBJECTIVES

overriding hashCode() and equals() (objective 6.2)

6.2 Distinguish between correct and incorrect overrides of corresponding hashCode and equals methods, and explain the difference between == and the equals method.

You're an object. Get used to it. You have state, you have behavior, you have a job. (Or at least your chances of getting one will go up after passing the exam.) If you exclude primitives, everything in Java is an object. Not just an object, but an Object with a capital O. Every exception, every event, every array extends from java.lang.Object. For the exam, you don't need to know every method in Object, but you will need to know about the methods listed in Table 7-1.

TABLE 7-1 Methods of Class Object Covered on the Exam

Chapter 9 covers wait(), notify(), and notifyAll(). The finalize() method was covered in Chapter 3. So in this section We'll look at just the hashCode() and equals() methods. Oh, that leaves out toString(), doesn't it. Okay, We'll cover that right now because it takes two seconds.

The Tostring() Method Override toString() when you want a mere mortal to be able to read something meaningful about the objects of your class. Code can call toString() on your object when it wants to read useful details about your object. When you pass an object reference to the System.out.println() method, for example, the object's toString() method is called, and the return of toString() is shown in the following example:

Running the HardToRead class gives us the lovely and meaningful,

The preceding output is what you get when you don't override the toString() method of class Object. It gives you the class name (at least that's meaningful) followed by the @ symbol, followed by the unsigned hexadecimal representation of the object's hashcode.

Trying to read this output might motivate you to override the toString() method in your classes, for example,

This ought to be a bit more readable:

Some people affectionately refer to toString() as the "spill-your-guts method," because the most common implementations of toString() simply spit out the object's state (in other words, the current values of the important instance variables). That's it for toString(). Now We'll tackle equals() and hashCode().

Overriding equals()

You learned about the equals() method in earlier chapters, where we looked at the wrapper classes. We discussed how comparing two object references using the == operator evaluates to true only when both references refer to the same object (because == simply looks at the bits in the variable, and they're either identical or they're not). You saw that the String class and the wrapper classes have overridden the equals() method (inherited from class Object), so that you could compare two different objects (of the same type) to see if their contents are meaningfully equivalent. If two different Integer instances both hold the int value 5, as far as you're concerned they are equal. The fact that the value 5 lives in two separate objects doesn't matter.

When you really need to know if two references are identical, use ==. But when you need to know if the objects themselves (not the references) are equal, use the equals() method. For each class you write, you must decide if it makes sense to consider two different instances equal. For some classes, you might decide that two objects can never be equal. For example, imagine a class Car that has instance variables for things like make, model, year, configuration—you certainly don't want your car suddenly to be treated as the very same car as someone with a car that has identical attributes. Your car is your car and you don't want your neighbor Billy driving off in it just because, "hey, it's really the same car; the equals() method said so." So no two cars should ever be considered exactly equal. If two references refer to one car, then you know that both are talking about one car, not two cars that have the same attributes. So in the case of a Car you might not ever need, or want, to override the equals() method. Of course, you know that isn't the end of the story.

What It Means If You Don't Override equals()

There's a potential limitation lurking here: if you don't override a class's equals() method, you won't be able to use those objects as a key in a hashtable and you probably won't get accurate Sets, such that there are no conceptual duplicates.

The equals() method in class Object uses only the == operator for comparisons, so unless you override equals(), two objects are considered equal only if the two references refer to the same object.

Let's look at what it means to not be able to use an object as a hashtable key. Imagine you have a car, a very specific car (say, John's red Subaru Outback as opposed to Mary's purple Mini) that you want to put in a HashMap (a type of hashtable We'll look at later in this chapter), so that you can search on a particular car and retrieve the corresponding Person object that represents the owner. So you add the car instance as the key to the HashMap (along with a corresponding Person object as the value). But now what happens when you want to do a search? You want to say to the HashMap collection, "Here's the car, now give me the Person object that goes with this car." But now you're in trouble unless you still have a reference to the exact object you used as the key when you added it to the Collection. In other words, you can't make an identical Car object and use it for the search.

The bottom line is this: if you want objects of your class to be used as keys for a hashtable (or as elements in any data structure that uses equivalency for searching for—and/or retrieving—an object), then you must override equals() so that two different instances can be considered the same. So how would we fix the car? You might override the equals() method so that it compares the unique VIN (Vehicle Identification Number) as the basis of comparison. That way, you can use one instance when you add it to a Collection, and essentially re-create an identical instance when you want to do a search based on that object as the key. Of course, overriding the equals() method for Car also allows the potential that more than one object representing a single unique car can exist, which might not be safe in your design. Fortunately, the String and wrapper classes work well as keys in hashtables—they override the equals() method. So rather than using the actual car instance as the key into the car/owner pair, you could simply use a String that represents the unique identifier for the car. That way, you'll never have more than one instance representing a specific car, but you can still use the car—or rather, one of the car's attributes—as the search key.

Implementing an equals() Method

Let's say you decide to override equals() in your class. It might look like this:

Let's look at this code in detail. In the main() method of EqualsTest, we create two Moof instances, passing the same value 8 to the Moof constructor. Now look at the Moof class and let's see what it does with that constructor argument—it assigns the value to the moofValue instance variable. Now imagine that you've decided two Moof objects are the same if their moofValue is identical. So you override the equals() method and compare the two moofValues. It is that simple. But let's break down what's happening in the equals() method:

First of all, you must observe all the rules of overriding, and in line 1 we are indeed declaring a valid override of the equals() method we inherited from Object.

Line 2 is where all the action is. Logically, we have to do two things in order to make a valid equality comparison.

First, be sure that the object being tested is of the correct type! It comes in polymorphically as type Object, so you need to do an instanceof test on it. Having two objects of different class types be considered equal is usually not a good idea, but that's a design issue we won't go into here. Besides, you'd still have to do the instanceof test just to be sure that you could cast the object argument to the correct type so that you can access its methods or variables in order to actually do the comparison. Remember, if the object doesn't pass the instanceof test, then you'll get a runtime ClassCastException. For example:

The (Moof)o cast will fail if o doesn't refer to something that IS-A Moof.

Second, compare the attributes we care about (in this case, just moofValue). Only the developer can decide what makes two instances equal. (For best performance, you're going to want to check the fewest number of attributes.)

In case you were a little surprised by the whole ((Moof)o).getMoofValue() syntax, we're simply casting the object reference, o, just-in-time as we try to call a method that's in the Moof class but not in Object. Remember, without the cast, you can't compile because the compiler would see the object referenced by o as simply, well, an Object. And since the Object class doesn't have a getMoofValue() method, the compiler would squawk (technical term). But then as we said earlier, even with the cast, the code fails at runtime if the object referenced by o isn't something that's castable to a Moof. So don't ever forget to use the instanceof test first. Here's another reason to appreciate the short circuit & & operator—if the instanceof test fails, We'll never get to the code that does the cast, so we're always safe at runtime with the following:

So that takes care of equals()…

Whoa…not so fast. If you look at the Object class in the Java API spec, you'll find what we call a contract specified in the equals() method. A Java contract is a set of rules that should be followed, or rather must be followed if you want to provide a "correct" implementation as others will expect it to be. Or to put it another way, if you don't follow the contract, your code may still compile and run, but your code (or someone else's) may break at runtime in some unexpected way.

Remember that the equals(), hashCode(), and toString() methods are all public. The following would not be a valid override of the equals() method, although it might appear to be if you don't look closely enough during the exam:

And watch out for the argument types as well. The following method is an overload, but not an override of the equals() method:

Be sure you're very comfortable with the rules of overriding so that you can identify whether a method from Object is being overridden, overloaded, or illegally redeclared in a class. The equals() method in class Boo changes the argument from Object to Boo, so it becomes an overloaded method and won't be called unless it's from your own code that knows about this new, different method that happens to also be named equals().

The equals() Contract

Pulled straight from the Java docs, the equals() contract says

It is reflexive. For any reference value x, x.equals(x) should return true.

It is symmetric. For any reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true.

It is transitive. For any reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) must return true.

It is consistent. For any reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the object is modified.

For any non-null reference value x, x.equals(null) should return false.

And you're so not off the hook yet. We haven't looked at the hashCode() method, but equals() and hashCode() are bound together by a joint contract that specifies if two objects are considered equal using the equals() method, then they must have identical hashcode values. So to be truly safe, your rule of thumb should be, if you override equals(), override hashCode() as well. So let's switch over to hashCode() and see how that method ties in to equals().

Overriding hashCode()

Hashcodes are typically used to increase the performance of large collections of data. The hashcode value of an object is used by some collection classes (We'll look at the collections later in this chapter). Although you can think of it as kind of an object ID number, it isn't necessarily unique. Collections such as HashMap and HashSet use the hashcode value of an object to determine how the object should be stored in the collection, and the hashcode is used again to help locate the object in the collection. For the exam you do not need to understand the deep details of how the collection classes that use hashing are implemented, but you do need to know which collections use them (but, um, they all have "hash" in the name so you should be good there). You must also be able to recognize an appropriate or correct implementation of hashCode(). This does not mean legal and does not even mean efficient. It's perfectly legal to have a terribly inefficient hashcode method in your class, as long as it doesn't violate the contract specified in the Object class documentation (We'll look at that contract in a moment). So for the exam, if you're asked to pick out an appropriate or correct use of hashcode, don't mistake appropriate for legal or efficient.

Understanding Hashcodes

In order to understand what's appropriate and correct, we have to look at how some of the collections use hashcodes.

Imagine a set of buckets lined up on the floor. Someone hands you a piece of paper with a name on it. You take the name and calculate an integer code from it by using A is 1, B is 2, and so on, and adding the numeric values of all the letters in the name together. A given name will always result in the same code; see Figure 7-1.

FIGURE 7-1 A simplified hashcode example Key Hashcode Algorithm Hashcode Dirk Fred HashMap Collection

We don't introduce anything random, we simply have an algorithm that will always run the same way given a specific input, so the output will always be identical for any two identical inputs. So far so good? Now the way you use that code (and We'll call it a hashcode now) is to determine which bucket to place the piece of paper into (imagine that each bucket represents a different code number you might get). Now imagine that someone comes up and shows you a name and says, "Please retrieve the piece of paper that matches this name." So you look at the name they show you, and run the same hashcode-generating algorithm. The hashcode tells you in which bucket you should look to find the name.

You might have noticed a little flaw in our system, though. Two different names might result in the same value. For example, the names Amy and May have the same letters, so the hashcode will be identical for both names. That's acceptable, but it does mean that when someone asks you (the bucket-clerk) for the Amy piece of paper, you'll still have to search through the target bucket reading each name until we find Amy rather than May. The hashcode tells you only which bucket to go into, but not how to locate the name once we're in that bucket.

In real-life hashing, it's not uncommon to have more than one entry in a bucket. Hashing retrieval is a two-step process.

1. Find the right bucket (using hashCode() )

2. Search the bucket for the right element (using equals() ).

So for efficiency, your goal is to have the papers distributed as evenly as possible across all buckets. Ideally, you might have just one name per bucket so that when someone asked for a paper you could simply calculate the hashcode and just grab the one paper from the correct bucket (without having to go flipping through different papers in that bucket until you locate the exact one you're looking for). The least efficient (but still functional) hashcode generator would return the same hashcode (say, 42) regardless of the name, so that all the papers landed in the same bucket while the others stood empty. The bucket-clerk would have to keep going to that one bucket and flipping painfully through each one of the names in the bucket until the right one was found. And if that's how it works, they might as well not use the hashcodes at all but just go to the one big bucket and start from one end and look through each paper until they find the one they want.

This distributed-across-the-buckets example is similar to the way hashcodes are used in collections. When you put an object in a collection that uses hashcodes, the collection uses the hashcode of the object to decide in which bucket/slot the object should land. Then when you want to fetch that object (or, for a hashtable, retrieve the associated value for that object), you have to give the collection a reference to an object that the collection compares to the objects it holds in the collection. As long as the object (stored in the collection, like a paper in the bucket) you're trying to search for has the same hashcode as the object you're using for the search (the name you show to the person working the buckets), then the object will be found. But…and this is a Big One, imagine what would happen if, going back to our name example, you showed the bucket-worker a name and they calculated the code based on only half the letters in the name instead of all of them. They'd never find the name in the bucket because they wouldn't be looking in the correct bucket!

Now can you see why if two objects are considered equal, their hashcodes must also be equal? Otherwise, you'd never be able to find the object since the default hashcode method in class Object virtually always comes up with a unique number for each object, even if the equals() method is overridden in such a way that two or more objects are considered equal. It doesn't matter how equal the objects are if their hashcodes don't reflect that. So one more time: If two objects are equal, their hashcodes must be equal as well.

Implementing hashCode()

What the heck does a real hashcode algorithm look like? People get their PhDs on hashing algorithms, so from a computer science viewpoint, it's beyond the scope of the exam. The part we care about here is the issue of whether you follow the contract. And to follow the contract, think about what you do in the equals() method. You compare attributes. Because that comparison almost always involves instance variable values (remember when we looked at two Moof objects and considered them equal if their int moofValues were the same?). Your hashCode() implementation should use the same instance variables. Here's an example:

This equals() method says two objects are equal if they have the same x value, so objects with the same x value will have to return identical hashcodes.

A hashCode() that returns the same value for all instances whether they're equal or not is still a legal—even appropriate— hashCode() method! For example,

This does not violate the contract. Two objects with an x value of 8 will have the same hashcode. But then again, so will two unequal objects, one with an x value of 12 and the other a value of -920. This hashCode() method is horribly inefficient, remember, because it makes all objects land in the same bucket, but even so, the object can still be found as the collection cranks through the one and only bucket—using equals()—trying desperately to finally, painstakingly, locate the correct object. In other words, the hashcode was really no help at all in speeding up the search, even though improving search speed is hashcode's intended purpose! Nonetheless, this one-hash-fits-all method would be considered appropriate and even correct because it doesn't violate the contract. Once more, correct does not necessarily mean good.

Typically, you'll see hashCode() methods that do some combination of ^-ing (XOR-ing) a class's instance variables (in other words, twiddling their bits), along with perhaps multiplying them by a prime number. In any case, while the goal is to get a wide and random distribution of objects across buckets, the contract (and whether or not an object can be found) requires only that two equal objects have equal hashcodes. The exam does not expect you to rate the efficiency of a hashCode() method, but you must be able to recognize which ones will and will not work (work meaning "will cause the object to be found in the collection").

Now that we know that two equal objects must have identical hashcodes, is the reverse true? Do two objects with identical hashcodes have to be considered equal? Think about it—you might have lots of objects land in the same bucket because their hashcodes are identical, but unless they also pass the equals() test, they won't come up as a match in a search through the collection. This is exactly what you'd get with our very inefficient everybody-gets-the-same-hashcode method. It's legal and correct, just slooooow.

So in order for an object to be located, the search object and the object in the collection must have both identical hashcode values and return true for the equals() method. So there's just no way out of overriding both methods to be absolutely certain that your objects can be used in Collections that use hashing.

The hashCode() Contract

Now coming to you straight from the fabulous Java API documentation for class Object, may we present (drum roll) the hashCode() contract:

Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode() method must consistently return the same integer, provided no information used in equals() comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

If two objects are equal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce the same integer result.

It is NOT required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode() method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

And what this means to you is…

So let's look at what else might cause a hashCode() method to fail. What happens if you include a transient variable in your hashCode() method? While that's legal (compiler won't complain), under some circumstances an object you put in a collection won't be found. As you know, serialization saves an object so that it can be reanimated later by deserializing it back to full objectness. But danger Will Robinson—remember that transient variables are not saved when an object is serialized. A bad scenario might look like this:

Here's what could happen using code like the preceding example:

1. Give an object some state (assign values to its instance variables).

2. Put the object in a HashMap, using the object as a key.

3. Save the object to a file using serialization without altering any of its state.

4. Retrieve the object from the file through deserialization.

5. Use the deserialized (brought back to life on the heap) object to get the object out of the HashMap.

Oops. The object in the collection and the supposedly same object brought back to life are no longer identical. The object's transient variable will come back with a default value rather than the value the variable had at the time it was saved (or put into the HashMap). So using the preceding SaveMe code, if the value of x is 9 when the instance is put in the HashMap, then since x is used in the calculation of the hashcode, when the value of x changes, the hashcode changes too. And when that same instance of SaveMe is brought back from deserialization, x == 0, regardless of the value of x at the time the object was serialized. So the new hashcode calculation will give a different hashcode, and the equals() method fails as well since x is used to determine object equality.

Bottom line: transient variables can really mess with your equals() and hashCode() implementations. Keep variables non-transient or, if they must be marked transient, don't use them to determine hashcodes or equality.

CERTIFICATION OBJECTIVES

Collections (Exam Objective 6.1)

6.1 Given a design scenario, determine which collection classes and/or interfaces should be used to properly implement that design, including the use of the Comparable interface.

Can you imagine trying to write object-oriented applications without using data structures like hashtables or linked lists? What would you do when you needed to maintain a sorted list of, say, all the members in your Simpsons fan club? Obviously you can do it yourself; Amazon.com must have thousands of algorithm books you can buy. But with the kind of schedules programmers are under today, it's almost too painful to consider.

The Collections Framework in Java, which took shape with the release of JDK 1.2 and was expanded in 1.4 and again in Java 5, and yet again in Java 6, gives you lists, sets, maps, and queues to satisfy most of your coding needs. They've been tried, tested, and tweaked. Pick the best one for your job and you'll get—at the least—reasonable performance. And when you need something a little more custom, the Collections Framework in the java.util package is loaded with interfaces and utilities.

So What Do You Do with a Collection?

There are a few basic operations you'll normally use with collections:

Add objects to the collection.

Remove objects from the collection.

Find out if an object (or group of objects) is in the collection.

Retrieve an object from the collection (without removing it).

Iterate through the collection, looking at each element (object) one after another.

Key interfaces and Classes of the Collections framework

For the exam you'll need to know which collection to choose based on a stated requirement. The collections API begins with a group of interfaces, but also gives you a truckload of concrete classes. The core interfaces you need to know for the exam (and life in general) are the following nine:

The core concrete implementation classes you need to know for the exam are the following 13 (there are others, but the exam doesn't specifically cover them):

Not all collections in the Collections Framework actually implement the Collection interface. In other words, not all collections pass the IS-A test for Collection. Specifically, none of the Map-related classes and interfaces extend from Collection. So while SortedMap, Hashtable, HashMap, TreeMap, and LinkedHashMap are all thought of as collections, none are actually extended from Collection-with-a-capital-C (see Figure 7-2). To make things a little more confusing, there are really three overloaded uses of the word "collection':

collection (lowercase c), which represents any of the data structures in which objects are stored and iterated over.

Collection (capital C), which is actually the java.util.Collection interface from which Set, List, and Queue extend. (That's right, extend, not implement. There are no direct implementations of Collection.)

Collections (capital C and ends with s) is the java.util.Collections class that holds a pile of static utility methods for use with collections.

FIGURE 7-2 The interface and class hierarchy for collections

You can so easily mistake "Collections" for "Collection'—be careful. Keep in mind that Collections is a class, with static utility methods, while Collection is an interface with declarations of the methods common to most collections including add(), remove(), contains() , size(), and iterator()

Collections come in four basic flavors:

Lists Lists of things (classes that implement List).

Sets Unique things (classes that implement Set).

Maps Things with a unique ID (classes that implement Map).

Queues Things arranged by the order in which they are to be processed.

Figure 7-3 illustrates the structure of a List, a Set, and a Map

FIGURE 7-3 The structure of a List, a Set, and a Map

But there are sub-flavors within those four flavors of collections:

An implementation class can be unsorted and unordered, ordered but unsorted, or both ordered and sorted. But an implementation can never be sorted but unordered, because sorting is a specific type of ordering, as you'll see in a moment. For example, a HashSet is an unordered, unsorted set, while a LinkedHashSet is an ordered (but not sorted) set that maintains the order in which objects were inserted.

Maybe we should be explicit about the difference between sorted and ordered, but first we have to discuss the idea of iteration. When you think of iteration, you may think of iterating over an array using, say, a for loop to access each element in the array in order ([0], [1], [2], and so on). Iterating through a collection usually means walking through the elements one after another starting from the first element. Sometimes, though, even the concept of first is a little strange—in a Hashtable there really isn't a notion of first, second, third, and so on. In a Hashtable, the elements are placed in a (seemingly) chaotic order based on the hashcode of the key. But something has to go first when you iterate; thus, when you iterate over a Hashtable, there will indeed be an order. But as far as you can tell, it's completely arbitrary and can change in apparently random ways as the collection changes

Ordered When a collection is ordered, it means you can iterate through the collection in a specific (not-random) order. A Hashtable collection is not ordered. Although the Hashtable itself has internal logic to determine the order (based on hashcodes and the implementation of the collection itself), you won't find any order when you iterate through the Hashtable. An ArrayList, however, keeps the order established by the elements'index position (just like an array). LinkedHashSet keeps the order established by insertion, so the last element inserted is the last element in the LinkedHashSet (as opposed to an ArrayList, where you can insert an element at a specific index position). Finally, there are some collections that keep an order referred to as the natural order of the elements, and those collections are then not just ordered, but also sorted. Let's look at how natural order works for sorted collections.

Sorted A sorted collection means that the order in the collection is determined according to some rule or rules, known as the sort order. A sort order has nothing to do with when an object was added to the collection, or when was the last time it was accessed, or what "position" it was added at. Sorting is done based on properties of the objects themselves. You put objects into the collection, and the collection will figure out what order to put them in, based on the sort order. A collection that keeps an order (such as any List, which uses insertion order) is not really considered sorted unless it sorts using some kind of sort order. Most commonly, the sort order used is something called the natural order. What does that mean?

You know how to sort alphabetically—A comes before B, F comes before G, and so on. For a collection of String objects, then, the natural order is alphabetical. For Integer objects, the natural order is by numeric value—1 before 2, and so on. And for Foo objects, the natural order is…um…we don't know. There is no natural order for Foo unless or until the Foo developer provides one, through an interface (Comparable)that defines how instances of a class can be compared to one another (does instance a come before b, or does instance b come before a?). If the developer decides that Foo objects should be compared using the value of some instance variable (let's say there's one called bar), then a sorted collection will order the Foo objects according to the rules in the Foo class for how to use the bar instance variable to determine the order. Of course, the Foo class might also inherit a natural order from a superclass rather than define its own order, in some cases.

Aside from natural order as specified by the Comparable interface, it's also possible to define other, different sort orders using another interface: Comparator. We will discuss how to use both Comparable and Comparator to define sort orders later in this chapter. But for now, just keep in mind that sort order (including natural order) is not the same as ordering by insertion, access, or index.

Now that we know about ordering and sorting, We'll look at each of the four interfaces, and We'll dive into the concrete implementations of those interfaces.

List Interface

A List cares about the index. The one thing that List has that non-lists don't have is a set of methods related to the index. Those key methods include things like get(int index), indexOf(Object o), add(int index, Object obj), and so on. All three List implementations are ordered by index position—a position that you determine either by setting an object at a specific index or by adding it without specifying position, in which case the object is added to the end. The three List implementations are described in the following sections.

Arraylist Think of this as a growable array. It gives you fast iteration and fast random access. To state the obvious: it is an ordered collection (by index), but not sorted. You might want to know that as of version 1.4, ArrayList now implements the new RandomAccess interface—a marker interface (meaning it has no methods) that says, "this list supports fast (generally constant time) random access." Choose this over a LinkedList when you need fast iteration but aren't as likely to be doing a lot of insertion and deletion.

Vector Vector is a holdover from the earliest days of Java; Vector and Hashtable were the two original collections, the rest were added with Java 2 versions 1.2 and 1.4. A Vector is basically the same as an ArrayList, but Vector methods are synchronized for thread safety. you'll normally want to use ArrayList instead of Vector because the synchronized methods add a performance hit you might not need. And if you do need thread safety, there are utility methods in class Collections that can help. Vector is the only class other than ArrayList to implement RandomAccess.

Linkedlist A LinkedList is ordered by index position, like ArrayList, except that the elements are doubly-linked to one another. This linkage gives you new methods (beyond what you get from the List interface) for adding and removing from the beginning or end, which makes it an easy choice for implementing a stack or queue. Keep in mind that a LinkedList may iterate more slowly than an ArrayList, but it's a good choice when you need fast insertion and deletion. As of Java 5, the LinkedList class has been enhanced to implement the java.util.Queue interface. As such, it now supports the common queue methods: peek(), poll(), and offer().

Set Interface

A Set cares about uniqueness—it doesn't allow duplicates. Your good friend the equals() method determines whether two objects are identical (in which case only one can be in the set). The three Set implementations are described in the following sections.

Hashset A HashSet is an unsorted, unordered Set. It uses the hashcode of the object being inserted, so the more efficient your hashCode() implementation the better access performance you'll get. Use this class when you want a collection with no duplicates and you don't care about order when you iterate through it.

Linkedhashset A LinkedHashSet is an ordered version of HashSet that maintains a doubly-linked List across all elements. Use this class instead of HashSet when you care about the iteration order. When you iterate through a HashSet the order is unpredictable, while a LinkedHashSet lets you iterate through the elements in the order in which they were inserted.

When using HashSet or LinkedHashSet, the objects you add to them must override hashCode(). If they don't override hashCode(), the default Object. hashCode() method will allow multiple objects that you might consider "meaningfully equal" to be added to your "no duplicates allowed" set.

TreeSet The TreeSet is one of two sorted collections (the other being TreeMap). It uses a Red-Black tree structure (but you knew that), and guarantees that the elements will be in ascending order, according to natural order. Optionally, you can construct a TreeSet with a constructor that lets you give the collection your own rules for what the order should be (rather than relying on the ordering defined by the elements'class) by using a Comparable or Comparator. As of Java 6, TreeSet

Map Interface

A Map cares about unique identifiers. You map a unique key (the ID) to a specific value, where both the key and the value are, of course, objects. You're probably quite familiar with Maps since many languages support data structures that use a key/value or name/value pair. The Map implementations let you do things like search for a value based on the key, ask for a collection of just the values, or ask for a collection of just the keys. Like Sets, Maps rely on the equals() method to determine whether two keys are the same or different.

HashMap The HashMap gives you an unsorted, unordered Map. When you need a Map and you don't care about the order (when you iterate through it), then HashMap is the way to go; the other maps add a little more overhead. Where the keys land in the Map is based on the key's hashcode, so, like HashSet, the more efficient your hashCode() implementation, the better access performance you'll get. HashMap allows one null key and multiple null values in a collection.

Hashtable Like Vector, Hashtable has existed from prehistoric Java times. For fun, don't forget to note the naming inconsistency: HashMap vs. Hashtable. Where's the capitalization of t? Oh well, you won't be expected to spell it. Anyway, just as Vector is a synchronized counterpart to the sleeker, more modern ArrayList, Hashtable is the synchronized counterpart to HashMap. Remember that you don't synchronize a class, so when we say that Vector and Hashtable are synchronized, we just mean that the key methods of the class are synchronized. Another difference, though, is that while HashMap lets you have null values as well as one null key, a Hashtable doesn't let you have anything that's null.

LinkedHashMap Like its Set counterpart, LinkedHashSet, the LinkedHash-Map collection maintains insertion order (or, optionally, access order). Although it will be somewhat slower than HashMap for adding and removing elements, you can expect faster iteration with a LinkedHashMap.

TreeMap You can probably guess by now that a TreeMap is a sorted Map. And you already know that by default, this means "sorted by the natural order of the elements." Like TreeSet, TreeMap lets you define a custom sort order (via a Comparable or Comparator) when you construct a TreeMap, that specifies how the elements should be compared to one another when they're being ordered. As of Java 6, TreeMap implements NavigableMap.

Queue Interface

A Queue is designed to hold a list of "to-dos," or things to be processed in some way. Although other orders are possible, queues are typically thought of as FIFO (first-in, first-out). Queues support all of the standard Collection methods and they also add methods to add and subtract elements and review queue elements.

PriorityQueue This class is new with Java 5. Since the LinkedList class has been enhanced to implement the Queue interface, basic queues can be handled with a LinkedList. The purpose of a PriorityQueue is to create a "priority-in, priority out" queue as opposed to a typical FIFO queue. A PriorityQueue's elements are ordered either by natural ordering (in which case the elements that are sorted first will be accessed first) or according to a Comparator. In either case, the elements'ordering represents their relative priority.

You can easily eliminate some answers right away if you recognize that, for example, a Map can't be the class to choose when you need a name/value pair collection, since Map is an interface and not a concrete implementation class. The wording on the exam is explicit when it matters, so if you're asked to choose an interface, choose an interface rather than a class that implements that interface. The reverse is also true—if you're asked to choose a class, don't choose an interface type.

Table 7-2 summarizes the 11 of the 13 concrete collection-oriented classes you'll need to understand for the exam. (Arrays and Collections are coming right up!)

TABLE 7-2 Collection Interface Concrete Implementation Classes

Be sure you know how to interpret Table 7-2 in a practical way. For the exam, you might be expected to choose a collection based on a particular requirement, where that need is expressed as a scenario. For example, which collection would you use if you needed to maintain and search on a list of parts, identified by their unique alphanumeric serial number where the part would be of type Part? Would you change your answer at all if we modified the requirement such that you also need to be able to print out the parts in order, by their serial number? For the first question, you can see that since you have a Part class, but need to search for the objects based on a serial number, you need a Map. The key will be the serial number as a String, and the value will be the Part instance. The default choice should be HashMap, the quickest Map for access. But now when we amend the requirement to include getting the parts in order of their serial number, then we need a TreeMap—which maintains the natural order of the keys. Since the key is a String, the natural order for a String will be a standard alphabetical sort. If the requirement had been to keep track of which part was last accessed, then we'd probably need a LinkedHashMap. But since a LinkedHashMap loses the natural order (replacing it with last-accessed order), if we need to list the parts by serial number, we'll have to explicitly sort the collection, using a utility method.

CERTIFICATION OBJECTIVE

Using the Collections framework (objectives 6.3 and 6.5)

6.3 Write code that uses the NavigableSet and NavigableMap interfaces.

6.5 Use capabilities in the java.util package to write code to manipulate a list by sorting, performing a binary search, or converting the list to an array. Use capabilities in the java.util package to write code to manipulate an array by sorting, performing a binary search, or converting the array to a list. Use the java.util.Comparator and java.lang. Comparable interfaces to affect the sorting of lists and arrays. Furthermore, recognize the effect of the "natural ordering" of primitive wrapper classes and java.lang.String on sorting.

We've taken a high-level, theoretical look at the key interfaces and classes in the Collections Framework, now let's see how they work in practice.

ArrayList Basics

The java.util.ArrayList class is one of the most commonly used of all the classes in the Collections Framework. It's like an array on vitamins. Some of the advantages ArrayList has over arrays are

It can grow dynamically.

It provides more powerful insertion and search mechanisms than arrays.

Let's take a look at using an ArrayList that contains Strings. A key design goal of the Collections Framework was to provide rich functionality at the level of the main interfaces: List, Set, and Map. In practice, you'll typically want to instantiate an ArrayList polymorphically like this:

As of Java 5 you'll want to say

This kind of declaration follows the object oriented programming principle of "coding to an interface", and it makes use of generics. We'll say lots more about generics later in this chapter, but for now just know that, as of Java 5, the <String> syntax is the way that you declare a collection's type. (Prior to Java 5 there was no way to specify the type of a collection, and when we cover generics, We'll talk about the implications of mixing Java 5 (typed) and pre-Java 5 (untyped) collections.)

In many ways, ArrayList <String> is similar to a String[] in that it declares a container that can hold only Strings, but it's more powerful than a String[]. Let's look at some of the capabilities that an ArrayList has

which produces

There's lots going on in this small program. Notice that when we declared the ArrayList we didn't give it a size. Then we were able to ask the ArrayList for its size, we were able to ask it whether it contained specific objects, we removed an object right out from the middle of it, and then we rechecked its size.

Autoboxing with Collections

In general, collections can hold Objects but not primitives. Prior to Java 5, a very common use for the wrapper classes was to provide a way to get a primitive into a collection. Prior to Java 5, you had to wrap a primitive by hand before you could put it into a collection. With Java 5, primitives still have to be wrapped, but autoboxing takes care of it for you.

In this last example, we are still adding an Integer object to myInts (not an int primitive); it's just that autoboxing handles the wrapping for us.

Sorting Collections and Arrays

Sorting and searching topics have been added to the exam for Java 5. Both collections and arrays can be sorted and searched using methods in the API.

Sorting Collections

Let's start with something simple like sorting an ArrayList of Strings alphabetically. What could be easier? Okay, We'll wait while you go find ArrayList's sort() method…got it? Of course, ArrayList doesn't give you any way to sort its contents, but the java.util.Collections class does

This produces something like this:

Line 1 is declaring an ArrayList of Strings, and line 2 is sorting the ArrayList alphabetically. We'll talk more about the Collections class, along with the Arrays class in a later section, for now let's keep sorting stuff.

Let's imagine we're building the ultimate home-automation application. Today we're focused on the home entertainment center, and more specifically the DVD control center. We've already got the file I/O software in place to read and write data between the dvdInfo.txt file and instances of class DVDInfo. Here are the key aspects of the class:

Here's the DVD data that's in the dvdinfo.txt file:

In our home-automation application, we want to create an instance of DVDInfo for each line of data we read in from the dvdInfo.txt file. For each instance, we will parse the line of data (remember String.split()?) and populate DVDInfo's three instance variables. Finally, we want to put all of the DVDInfo instances into an ArrayList. Imagine that the populateList() method (below) does all of this. Here is a small piece of code from our application:

You might get output like this:

(Note: We overrode DVDInfo's toString() method, so when we invoked println() on the ArrayList it invoked toString() for each instance.)

Now that we've got a populated ArrayList, let's sort it:

Oops!, you get something like this:

What's going on here? We know that the Collections class has a sort() method, yet this error implies that Collections does NOT have a sort() method that can take a dvdlist. That means there must be something wrong with the argument we're passing (dvdlist).

If you've already figured out the problem, our guess is that you did it without the help of the obscure error message shown above…How the heck do you sort instances of DVDInfo Why were we able to sort instances of String? When you look up Collections.sort() in the API your first reaction might be to panic. Hang tight, once again the generics section will help you read that weird looking method signature. If you read the description of the one-arg sort() method, you'll see that the sort() method takes a List argument, and that the objects in the List must implement an interface called Comparable. It turns out that String implements Comparable, and that's why we were able to sort a list of Strings using the Collections.sort() method.

The Comparable Interface

The Comparable interface is used by the Collections.sort() method and the java.util.Arrays.sort() method to sort Lists and arrays of objects, respectively. To implement Comparable, a class must implement a single method, compareTo(). Here's an invocation of compareTo():

The compareTo() method returns an int with the following characteristics:

negative If thisObject <anotherObject

zero If thisObject == anotherObject

positive If thisObject > anotherObject

The sort() method uses compareTo() to determine how the List or object array should be sorted. Since you get to implement compareTo() for your own classes, you can use whatever weird criteria you prefer, to sort instances of your classes. Returning to our earlier example for class DVDInfo, we can take the easy way out and use the String class's implementation of compareTo():

In line 1 we declare that class DVDInfo implements Comparable in such a way that DVDInfo objects can be compared to other DVDInfo objects. In line 2 we implement compareTo() by comparing the two DVDInfo object's titles. Since we know that the titles are Strings, and that String implements Comparable, this is an easy way to sort our DVDInfo objects, by title. Before generics came along in Java 5, you would have had to implement Comparable something like this:

This is still legal, but you can see that it's both painful and risky, because you have to do a cast, and you need to verify that the cast will not fail before you try it.

It's important to remember that when you override equals() you MUst take an argument of type Object , but that when you override compareTo() you should take an argument of the type you're sorting.

Putting it all together, our DVDInfo class should now look like this:

Now, when we invoke Collections.sort(dvdList); we get

Hooray! Our ArrayList has been sorted by title. Of course, if we want our home automation system to really rock, We'll probably want to sort DVD collections in lots of different ways. Since we sorted our ArrayList by implementing the compareTo() method, we seem to be stuck. We can only implement compareTo() once in a class, so how do we go about sorting our classes in an order different than what we specify in our compareTo() method? Good question. As luck would have it, the answer is coming up next.

Sorting with Comparator

While you were looking up the Collections.sort() method you might have noticed that there is an overloaded version of sort() that takes both a List AND something called a Comparator. The Comparator interface gives you the capability to sort a given collection any number of different ways. The other handy thing about the Comparator interface is that you can use it to sort instances of any class—even classes you can't modify—unlike the Comparable interface, which forces you to change the class whose instances you want to sort. The Comparator interface is also very easy to implement, having only one method, compare(). Here's a small class that can be used to sort a List of DVDInfo instances, by genre.

The Comparator.compare() method returns an int whose meaning is the same as the Comparable.compareTo() method's return value. In this case we're taking advantage of that by asking compareTo() to do the actual comparison work for us. Here's a test program that lets us test both our Comparable code and our new Comparator code:

You've already seen the first two output lists, here's the third:

Because the Comparable and Comparator interfaces are so similar, expect the exam to try to confuse you. For instance you might be asked to implement the compareTo() method in the Comparator interface. Study Table 7-3 to burn in the differences between these two interfaces.

TABLE 7-3 Comparing Comparable to Comparator

Sorting with the Arrays Class

We've been using the java.util.Collections class to sort collections; now let's look at using the java.util.Arrays class to sort arrays. The good news is that sorting arrays of objects is just like sorting collections of objects. The Arrays.sort() method is overridden in the same way the Collections.sort() method is.

Arrays.sort(arrayToSort)

Arrays.sort(arrayToSort, Comparator)

In addition, the Arrays.sort() method is overloaded about a million times to provide a couple of sort methods for every type of primitive. The Arrays.sort() methods that sort primitives always sort based on natural order. Don't be fooled by an exam question that tries to sort a primitive array using a Comparator.

Finally, remember that the sort() methods for both the Collections class and the Arrays class are static methods, and that they alter the objects they are sorting, instead of returning a different sorted object.

We've talked a lot about sorting by natural order and using Comparators to sort. The last rule you'll need to burn in is that, whenever you want to sort an array or a collection, the elements inside must all be mutually comparable. In other words, if you have an Object[] and you put Cat and Dog objects into it, you won't be able to sort it. In general, objects of different types should be considered NOT mutually comparable, unless specifically stated otherwise.

Searching Arrays and Collections

The Collections class and the Arrays class both provide methods that allow you to search for a specific element. When searching through collections or arrays, the following rules apply:

Searches are performed using the binarySearch() method.

Successful searches return the int index of the element being searched.

Unsuccessful searches return an int index that represents the insertion point. The insertion point is the place in the collection/array where the element would be inserted to keep the collection/array properly sorted. Because positive return values and 0 indicate successful searches, the binarySearch() method uses negative numbers to indicate insertion points. Since 0 is a valid result for a successful search, the first available insertion point is −1. Therefore, the actual insertion point is represented as (-(insertion point) −1). For instance, if the insertion point of a search is at element 2, the actual insertion point returned will be −3.

The collection/array being searched must be sorted before you can search it.

If you attempt to search an array or collection that has not already been sorted, the results of the search will not be predictable.

If the collection/array you want to search was sorted in natural order, it must be searched in natural order. (Usually this is accomplished by NOT sending a Comparator as an argument to the binarySearch() method.)

If the collection/array you want to search was sorted using a Comparator, it must be searched using the same Comparator, which is passed as the second argument to the binarySearch() method. Remember that Comparators cannot be used when searching arrays of primitives.

Let's take a look at a code sample that exercises the binarySearch() method:

which produces something like this:

Here's what happened:

Line 1	Sort the sa array, alphabetically (the natural order).

Line 2	Search for the location of element "one", which is 1.

Line 3	Make a Comparator instance. On the next line we re-sort the array using the Comparator.

Line 4	Attempt to search the array. We didn't pass the binarySearch() method the Comparator we used to sort the array, so we got an incorrect (undefined) answer.

Line 5	Search again, passing the Comparator to binarySearch(). This time we get the correct answer, 2

Line 6	We define the Comparator; it's okay for this to be an inner class.

Line 7	By switching the use of the arguments in the invocation of compareTo(), we get an inverted sort.

When solving searching and sorting questions, two big gotchas are

1. Searching an array or collection that hasn't been sorted.

2. Using a Comparator in either the sort or the search, but not both.

Converting arrays to lists to arrays

There are a couple of methods that allow you to convert arrays to Lists, and Lists to arrays. The List and Set classes have toArray() methods, and the Arrays class has a method called asList().

The Arrays.asList() method copies an array into a List. The API says, "Returns a fixed-size list backed by the specified array. (Changes to the returned list' write through' to the array.)" When you use the asList() method, the array and the List become joined at the hip. When you update one of them, the other gets updated automatically. Let's take a look:

This produces

Notice that when we print the final state of the array and the List, they have both been updated with each other's changes. Wouldn't something like this behavior make a great exam question?

Now let's take a look at the toArray() method. There's nothing too fancy going on with the toArray() method; it comes in two flavors: one that returns a new Object array, and one that uses the array you send it as the destination array:

Using Lists

Remember that Lists are usually used to keep things in some kind of order. You can use a LinkedList to create a first-in, first-out queue. You can use an ArrayList to keep track of what locations were visited, and in what order. Notice that in both of these examples it's perfectly reasonable to assume that duplicates might occur. In addition, Lists allow you to manually override the ordering of elements by adding or removing elements via the element's index. Before Java 5, and the enhanced for loop, the most common way to examine a List "element by element" was by the use of an Iterator. you'll still find Iterators in use in the Java code you encounter, and you might just find an Iterator or two on the exam. An Iterator is an object that's associated with a specific collection. It lets you loop through the collection step by step. The two Iterator methods you need to understand for the exam are

boolean hasNext() Returns true if there is at least one more element in the collection being traversed. Invoking hasNext() does NOT move you to the next element of the collection.

Object next() This method returns the next object in the collection, AND moves you forward to the element after the element just returned.

Let's look at a little code that uses a List and an Iterator:

This produces

First off, we used generics syntax to create the Iterator (an Iterator of type Dog). Because of this, when we used the next() method, we didn't have to cast the Object returned by next() to a Dog. We could have declared the Iterator like this:

But then we would have had to cast the returned value:

The rest of the code demonstrates using the size(), get(), indexOf(), and toArray() methods. There shouldn't be any surprises with these methods. In a few pages Table 7-5 will list all of the List, Set, and Map methods you should be familiar with for the exam. As a last warning, remember that List is an interface!

Using sets

Remember that Sets are used when you don't want any duplicates in your collection. If you attempt to add an element to a set that already exists in the set, the duplicate element will not be added, and the add() method will return false. Remember, HashSets tend to be very fast because, as we discussed earlier, they use hashcodes.

You can also create a TreeSet, which is a Set whose elements are sorted. You must use caution when using a TreeSet (we're about to explain why):

If you insert the following line of code you'll get output something like this:

It's important to know that the order of objects printed in the second for loop is not predictable: HashSets do not guarantee any ordering. Also, notice that the fourth invocation of add() failed, because it attempted to insert a duplicate entry (a String with the value a) into the Set.

If you insert this line of code you'll get something like this:

The issue is that whenever you want a collection to be sorted, its elements must be mutually comparable. Remember that unless otherwise specified, objects of different types are not mutually comparable.

Using Maps

Remember that when you use a class that implements Map, any classes that you use as a part of the keys for that map must override the hashCode() and equals() methods. (Well, you only have to override them if you're interested in retrieving stuff from your Map. Seriously, it's legal to use a class that doesn't override equals() and hashCode() as a key in a Map; your code will compile and run, you just won't find your stuff.) Here's some crude code demonstrating the use of a HashMap:

which produces something like this:

Let's review the output. The first value retrieved is a Dog object (your value will vary). The second value retrieved is an enum value (DOG). The third value retrieved is a String; note that the key was an enum value. Pop quiz: What's the implication of the fact that we were able to successfully use an enum as a key?

The implication of this is that enums override equals() and hashCode(). And, if you look at the java.lang.Enum class in the API, you will see that, in fact, these methods have been overridden.

The fourth output is a String. The important point about this output is that the key used to retrieve the String was made of a Dog object. The fifth output is null. The important point here is that the get() method failed to find the Cat object that was inserted earlier. (The last line of output confirms that indeed, 5 key/value pairs exist in the Map.) Why didn't we find the Cat key String? Why did it work to use an instance of Dog as a key, when using an instance of Cat as a key failed?

It's easy to see that Dog overrode equals() and hashCode() while Cat didn't.

Let's take a quick look at hashcodes. We used an incredibly simplistic hashcode formula in the Dog class—the hashcode of a Dog object is the length of the instance's name. So in this example the hashcode = 6. Let's compare the following two hashCode() methods:

Time for another pop quiz: Are the preceding two hashcodes legal? Will they successfully retrieve objects from a Map? Which will be faster?

The answer to the first two questions is Yes and Yes. Neither of these hashcodes will be very efficient (in fact they would both be incredibly inefficient), but they are both legal, and they will both work. The answer to the last question is that the first hashcode will be a little bit faster than the second hashcode. In general, the more unique hashcodes a formula creates, the faster the retrieval will be. The first hashcode formula will generate a different code for each name length (for instance the name Robert will generate one hashcode and the name Benchley will generate a different hashcode). The second hashcode formula will always produce the same result, 4, so it will be slower than the first.

Our last Map topic is what happens when an object used as a key has its values changed? If we add two lines of code to the end of the earlier MapTest.main(),

we get something like this:

The Dog that was previously found now cannot be found. Because the Dog.name variable is used to create the hashcode, changing the name changed the value of the hashcode. As a final quiz for hashcodes, determine the output for the following lines of code if they're added to the end of MapTest.main():

Remember that the hashcode is equal to the length of the name variable. When you study a problem like this, it can be useful to think of the two stages of retrieval:

1. Use the hashCode() method to find the correct bucket

2. Use the equals() method to find the object in the bucket

In the first call to get(), the hashcode is 8 (magnolia) and it should be 6 (clover), so the retrieval fails at step 1 and we get null. In the second call to get(), the hashcodes are both 6, so step 1 succeeds. Once in the correct bucket (the "length of name = 6" bucket), the equals() method is invoked, and since Dog's equals() method compares names, equals() succeeds, and the output is Dog key. In the third invocation of get(), the hashcode test succeeds, but the equals() test fails because arthur is NOT equal to clover.

Navigating (Searching) TreeSets and TreeMaps

We've talked about searching lists and arrays. Let's turn our attention to searching TreeSets and TreeMaps. Java 6 introduced (among others) two new interfaces: java.util.NavigableSet and java.util.NavigableMap. For the purposes of the exam, you're interested in how TreeSet and TreeMap implement these interfaces.

Imagine that the Santa Cruz– Monterey ferry has an irregular schedule. Let's say that we have the daily Santa Cruz departure times stored, in military time, in a TreeSet. Let's look at some code that determines two things:

1. The last ferry that leaves before 4 pm (1600 hours)

2. The first ferry that leaves after 8 pm (2000 hours)

This should produce the following:

As you can see in the preceding code, before the addition of the NavigableSet interface, zeroing in on an arbitrary spot in a Set—using the methods available in Java 5—was a compute-expensive and clunky proposition. On the other hand, using the new Java 6 methods lower() and higher(), the code becomes a lot cleaner.

For the purpose of the exam, the NavigableSet methods related to this type of navigation are lower(), floor(), higher(), ceiling(), and the mostly parallel NavigableMap methods are lowerKey(), floorKey(), ceilingKey(), and higherKey(). The difference between lower() and floor() is that lower() returns the element less than the given element, and floor() returns the element less than or equal to the given element. Similarly, higher() returns the element greater than the given element, and ceiling() returns the element greater than or equal to the given element. Table 7-4 summarizes the methods you should know for the exam.

TABLE 7-4 Important "Navigation" Related Methods

Other Navigation Methods

In addition to the methods we just discussed there are a few more new Java 6 methods that could be considered "navigation" methods. (Okay, it's a little stretch to call these "navigation" methods, but just play along.)

Polling

Although the idea of polling isn't new to Java 6 (as you'll see in a minute, PriorityQueue had a poll() method before Java 6), it is new to TreeSet and TreeMap. The idea of polling is that we want both to retrieve and remove an element from either the beginning or the end of a collection. In the case of TreeSet, pollFirst() returns and removes the first entry in the set, and pollLast() returns and removes the last. Similarly, TreeMap now provides pollFirstEntry() and pollLastEntry() to retrieve and remove key-value pairs.

Descending Order

Also new to Java 6 for TreeSet and TreeMap are methods that return a collection in the reverse order of the collection on which the method was invoked. The important methods for the exam are TreeSet.descendingSet() and TreeMap.descendingMap().

Table 7-4 summarizes the "navigation" methods you'll need to know for the exam.

Backed Collections

Some of the classes in the java.util package support the concept of "backed collections". We'll use a little code to help explain the idea:

This should produce something like this:

The important method in this code is the TreeMap.subMap() method. It's easy to guess (and it's correct), that the subMap() method is making a copy of a portion of the TreeMap named map. The first line of output verifies the conclusions we've just drawn.

What happens next is powerful and a little bit unexpected (now we're getting to why they're called backed collections). When we add key-value pairs to either the original TreeMap or the partial-copy SortedMap, the new entries were automatically added to the other collection—sometimes. When submap was created, we provided a value range for the new collection. This range defines not only what should be included when the partial copy is created, but also defines the range of values that can be added to the copy. As we can verify by looking at the second line of output, we can add new entries to either collection within the range of the copy, and the new entries will show up in both collections. In addition, we can add a new entry to the original collection, even if it's outside the range of the copy. In this case, the new entry will show up only in the original—it won't be added to the copy because it's outside the copy's range. Notice that we commented out line #6. If you attempt to add an out-of-range entry to the copied collection an exception will be thrown.

For the exam, you'll need to understand the basics just explained, plus a few more details about three methods from TreeSet—(headSet(), subSet(), and tailSet()), and three methods from TreeMap (headMap(), subMap(), and tailMap()). As with the navigation-oriented methods we just discussed, we can see a lot of parallels between the TreeSet and the TreeMap methods. The headSet() / headMap() methods create a subset that starts at the beginning of the original collection and ends at the point specified by the method's argument. The tailSet() / tailMap() methods create a subset that starts at the point specified by the method's argument and goes to the end of the original collection. Finally, the subSet() / subMap() methods allow you to specify both the start and end points for the subset collection you're creating.

As you might expect, the question of whether the subsetted collection's end points are inclusive or exclusive is a little tricky. The good news is that for the exam you have to remember only that when these methods are invoked with endpoint and boolean arguments, the boolean always means "is inclusive?". A little more good news is that all you have to know for the exam is that unless specifically indicated by a boolean argument, a subset's starting point will always be inclusive. Finally, you'll notice when you study the API that all of the methods we've been discussing here have an overloaded version that's new to Java 6. The older methods return either a SortedSet or a SortedMap, the new Java 6 methods return either a NavigableSet or a NavigableMap. Table 7-5 summarizes these methods.

TABLE 7-5 Important "Backed Collection" Methods for TreeSet and TreeMap

* NOTE: These boolean arguments are optional. If they exist it's a Java 6 method that lets you specify whether the endpoint is exclusive, and these methods return a NavigableXxx. If the boolean argument(s) don't exist, the method returns either a SortedSet or a SortedMap.

Let's say that you've created a backed collection using either a tailXxx() or subXxx() method. Typically in these cases the original and copy collections have different "first" elements. For the exam it's important that you remember that the pollFirstXxx() methods will always remove the first entry from the collection on which they're invoked, but they will remove an element from the other collection only if it has same value. So it's most likely that invoking pollFirstXxx() on the copy will remove an entry from both collections, but invoking pollFirstXxx() on the original will remove only the entry from the original.

Using the PriorityQueue Class

The last collection class you'll need to understand for the exam is the PriorityQueue. Unlike basic queue structures that are first-in, first-out by default, a PriorityQueue orders its elements using a user-defined priority. The priority can be as simple as natural ordering (in which, for instance, an entry of 1 would be a higher priority than an entry of 2). In addition, a PriorityQueue can be ordered using a Comparator, which lets you define any ordering you want. Queues have a few methods not found in other collection interfaces: peek(), poll(), and offer().

This code produces something like this:

Let's look at this in detail. The first for loop iterates through the ia array, and uses the offer() method to add elements to the PriorityQueue named pq1. The second for loop iterates through pq1 using the poll() method, which returns the highest priority entry in pq1 AND removes the entry from the queue. Notice that the elements are returned in priority order (in this case, natural order). Next, we create a Comparator—in this case, a Comparator that orders elements in the opposite of natural order. We use this Comparator to build a second PriorityQueue, pq2, and we load it with the same array we used earlier. Finally, we check the size of pq2 before and after calls to peek() and poll(). This confirms that peek() returns the highest priority element in the queue without removing it, and poll() returns the highest priority element, AND removes it from the queue. Finally, we review the remaining elements in the queue.

Method Overview for Arrays and Collections

For these two classes, we've already covered the trickier methods you might encounter on the exam. Table 7-6 lists a summary of the methods you should be aware of. (Note: The T[] syntax will be explained later in this chapter; for now, think of it as meaning "any array that's NOT an array of primitives.")

TABLE 7-6 Key Methods in Arrays and Collections

Method Overview For List, Set, Map, and Queue

For these four interfaces, we've already covered the trickier methods you might encounter on the exam. Table 7-7 lists a summary of the List, Set, and Map methods you should be aware of, but don't forget the new "Navigable" methods floor, lower, ceiling, and higher that we discussed a few pages back.

TABLE 7-7 Key Methods in List, Set, and Map

For the exam, the PriorityQueue methods that are important to understand are offer() (which is similar to add()), peek() (which retrieves the element at the head of the queue, but doesn't delete it), and poll() (which retrieves the head element and removes it from the queue).

It's important to know some of the details of natural ordering. The following code will help you understand the relative positions of uppercase characters, lowercase characters, and spaces in a natural ordering:

This produces

If you remember that spaces sort before characters and that uppercase letters sort before lowercase characters, you should be good to go for the exam.

CERTIFICATION OBJECTIVE

Generic Types (Objectives 6.3 and 6.4)

6.3 Write code that uses the generic versions of the Collections API, in particular the Set, List, and Map interfaces and implementation classes. Recognize the limitations of the non-generic Collections API and how to refactor code to use the generic versions.

6.4 Develop code that makes proper use of type parameters in class/interface declarations, instance variables, method arguments, and return types; and write generic methods or methods that make use of wildcard types and understand the similarities and differences between these two approaches. Write code that uses the NavigableSet and NavigableMap interfaces.

Arrays in Java have always been type safe—an array declared as type String (String []) can't accept Integers (or ints), Dogs, or anything other than Strings. But remember that before Java 5 there was no syntax for declaring a type safe collection. To make an ArrayList of Strings, you said,

or, the polymorphic equivalent

There was no syntax that let you specify that myList will take Strings and only Strings. And with no way to specify a type for the ArrayList, the compiler couldn't enforce that you put only things of the specified type into the list. As of Java 5, we can use generics, and while they aren't only for making type safe collections, that's just about all most developers use generics for. So, while generics aren't just for collections, think of collections as the overwhelming reason and motivation for adding generics to the language.

And it was not an easy decision, nor has it been an entirely welcome addition. Because along with all the nice happy type safety, generics come with a lot of baggage—most of which you'll never see or care about, but there are some gotchas that come up surprisingly quickly. We'll cover the ones most likely to show up in your own code, and those are also the issues that you'll need to know for the exam.

The biggest challenge for Sun in adding generics to the language (and the main reason it took them so long) was how to deal with legacy code built without generics. Sun's Java engineers obviously didn't want to break everyone's existing Java code, so they had to find a way for Java classes with both type safe (generic) and non-type safe (non-generic/pre-Java 5) collections to still work together. Their solution isn't the friendliest, but it does let you use older non-generic code, as well as use generic code that plays with non-generic code. But notice we said "plays," and not "plays WELL."

While you can integrate Java 5 and Java 6 generic code with legacy non-generic code, the consequences can be disastrous, and unfortunately, most of the disasters happen at runtime, not compile time. Fortunately, though, most compilers will generate warnings to tell you when you're using unsafe (meaning non-generic) collections.

The Java 5 exam covers both pre-Java 5 (non-generic) and Java 5 style collections, and you'll see questions that expect you to understand the tricky problems that can come from mixing non-generic and generic code together. And like some of the other topics in this book, you could fill an entire book if you really wanted to cover every detail about generics, but the exam (and this book) covers more than most developers will ever need to use.

The Legacy Way to Do Collections

Here's a review of a pre-Java 5 ArrayList intended to hold Strings. (We say "intended" because that's about all you had—good intentions—to make sure that the ArrayList would hold only Strings).

A non-generic collection can hold any kind of object! A non-generic collection is quite happy to hold anything that is NOT a primitive.

This meant it was entirely up to the programmer to be…careful. Having no way to guarantee collection type wasn't very programmer-friendly for such a strongly typed language. We're so used to the compiler stopping us from, say, assigning an int to a boolean reference or a String to a Dog reference, but with collections, it was, "Come on in! The door is always open! All objects are welcome here any time!"

And since a collection could hold anything, the methods that get objects out of the collection could have only one kind of return type—java.lang.Object. That meant that getting a String back out of our only-Strings-intended list required a cast:

And since you couldn't guarantee that what was coming out really was a String (since you were allowed to put anything in the list), the cast could fail at runtime.

So, generics takes care of both ends (the putting in and getting out) by enforcing the type of your collections. Let's update the String list:

Perfect. That's exactly what we want. By using generics syntax—which means putting the type in angle brackets <String>, we're telling the compiler that this collection can hold only String objects. The type in angle brackets is referred to as either the "parameterized type," "type parameter," or of course just old-fashioned "type." In this chapter, We'll refer to it both new ways.

So, now that what you put IN is guaranteed, you can also guarantee what comes OUT, and that means you can get rid of the cast when you get something from the collection. Instead of

we can now just say

The compiler already knows that myList contains only things that can be assigned to a String reference, so now there's no need for a cast. So far, it seems pretty simple. And with the new for loop, you can of course iterate over the guaranteed-to-be-String list:

And of course you can declare a type parameter for a method argument, which then makes the argument a type safe reference:

The method above would NOT compile if we changed it to

Return types can obviously be declared type safe as well:

The compiler will stop you from returning anything not compatible with a List <Dog> (although what is and is not compatible is going to get very interesting in a minute). And since the compiler guarantees that only a type safe Dog List is returned, those calling the method won't need a cast to take Dogs from the List:

With pre-Java 5, non-generic code, the getDogList() method would be

and the caller would need a cast:

(The cast in this example applies to what comes from the List's get() method; we aren't casting what is returned from the getDogList() method, which is a List.)

But what about the benefit of a completely heterogeneous collection? In other words, what if you liked the fact that before generics you could make an ArrayList that could hold any kind of object?

is almost identical to

Declaring a List with a type parameter of <Object> makes a collection that works in almost the same way as the original pre-Java 5, non-generic collection—you can put ANY Object type into the collection. you'll see a little later that non-generic collections and collections of type <Object> aren't entirely the same, but most of the time the differences do not matter.

Oh, if only this were the end of the story…but there are still a few tricky issues with methods arguments, polymorphism, and integrating generic and non-generic code, so we're just getting warmed up here.

Generics and Legacy Code

The easiest generics thing you'll need to know for the exam is how to update non-generic code to make it generic. You just add a type in angle brackets ( <> ) immediately following the collection type in BOTH the variable declaration and the constructor call, including any place you declare a variable (so that means arguments and return types too). A pre-Java 5 List meant to hold only Integers:

becomes

and a list meant to hold only Strings goes from

to this:

Easy. And if there's code that used the earlier non-generic version and performed a cast to get things out, that won't break anyone's code:

Mixing Generic and Non-generic Collections

Now here's where it starts to get interesting…imagine we have an ArrayList, of type Integer, and we're passing it into a method from a class whose source code we don't have access to. Will this work?

The older, non-generics class we want to use:

Yes, this works just fine. You can mix correct generic code with older non-generic code, and everyone is happy.

In the previous example, the addAll() legacy method assumed (trusted? hoped?) that the list passed in was indeed restricted to Integers, even though when the code was written, there was no guarantee. It was up to the programmers to be careful.

Since the addAll() method wasn't doing anything except getting the Integer (using a cast) from the list and accessing its value, there were no problems. In that example, there was no risk to the caller's code, but the legacy method might have blown up if the list passed in contained anything but Integers (which would cause a ClassCastException).

But now imagine that you call a legacy method that doesn't just read a value but adds something to the ArrayList? Will this work?

Sure, this code works. It compiles, and it runs. The insert() method puts an Integer into the list that was originally typed as <Integer>, so no problem.

But…what if we modify the insert() method like this:

Will that work? Yes, sadly, it does! It both compiles and runs. No runtime exception. Yet, someone just stuffed a String into a supposedly type safe ArrayList of type <Integer>. How can that be?

Remember, the older legacy code was allowed to put anything at all (except primitives) into a collection. And in order to support legacy code, Java 5 and Java 6 allows your newer type safe code to make use of older code (the last thing Sun wanted to do was ask several million Java developers to modify all their existing code).

So, the Java 5 or Java 6 compiler is forced into letting you compile your new type safe code even though your code invokes a method of an older class that takes a non-type safe argument and does who knows what with it.

However, just because the Java 5 compiler allows this code to compile doesn't mean it has to be HAPPY about it. In fact the compiler will warn you that you're taking a big, big risk sending your nice protected ArrayList <Integer> into a dangerous method that can have its way with your list and put in Floats, Strings, or even Dogs.

When you called the addAll() method in the earlier example, it didn't insert anything to the list (it simply added up the values within the collection), so there was no risk to the caller that his list would be modified in some horrible way. It compiled and ran just fine. But in the second version, with the legacy insert() method that adds a String, the compiler generated a warning:

Remember that compiler warnings are NOT considered a compiler failure. The compiler generated a perfectly valid class file from the compilation, but it was kind enough to tell you by saying, in so many words, "I seriously hope you know what you are doing because this old code has NO respect (or even knowledge) of your <Integer> typing, and can do whatever the heck it wants to your precious ArrayList <Integer>."

Be sure you know the difference between "compilation fails" and "compiles without error" and "compiles without warnings" and "compiles with warnings." In most questions on the exam, you care only about compiles vs. compilation fails—compiler warnings don't matter for most of the exam. But when you are using generics, and mixing both typed and untyped code, warnings matter.

Back to our example with the legacy code that does an insert, keep in mind that for BOTH versions of the insert() method (one that adds an Integer and one that adds a String) the compiler issues warnings. The compiler does NOT know whether the insert() method is adding the right thing (Integer) or wrong thing (String). The reason the compiler produces a warning is because the method is ADDING something to the collection! In other words, the compiler knows there's a chance the method might add the wrong thing to a collection the caller thinks is type safe.

For the purposes of the exam, unless the question includes an answer that mentions warnings, then even if you know compilation will produce warnings, that is still a successful compile! Compiling with warnings is NEVER considered a compilation failure.

One more time—if you see code that you know will compile with warnings, you must NOT choose "Compilation fails." as an answer. The bottom line is this: code that compiles with warnings is still a successful compile. If the exam question wants to test your knowledge of whether code will produce a warning (or what you can do to the code to ELIMINATE warnings), the question (or answer) will explicitly include the word "warnings."

So far, we've looked at how the compiler will generate warnings if it sees that there's a chance your type safe collection could be harmed by older, non-type-safe code. But one of the questions developers often ask is, "Okay, sure, it compiles, but why does it RUN? Why does the code that inserts the wrong thing into my list work at runtime?" In other words, why does the JVM let old code stuff a String into your ArrayList <Integer>, without any problems at all? No exceptions, nothing. Just a quiet, behind-the-scenes, total violation of your type safety that you might not discover until the worst possible moment.

There's one Big Truth you need to know to understand why it runs without problems—the JVM has no idea that your ArrayList was supposed to hold only Integers. The typing information does not exist at runtime! All your generic code is strictly for the compiler. Through a process called "type erasure," the compiler does all of its verifications on your generic code and then strips the type information out of the class bytecode. At runtime, ALL collection code—both legacy and new Java 5 code you write using generics—looks exactly like the pre-generic version of collections. None of your typing information exists at runtime. In other words, even though you WROTE

By the time the compiler is done with it, the JVM sees what it always saw before Java 5 and generics:

The compiler even inserts the casts for you—the casts you had to do to get things out of a pre-Java 5 collection.

Think of generics as strictly a compile-time protection. The compiler uses generic type information (the <type> in the angle brackets) to make sure that your code doesn't put the wrong things into a collection, and that you do not assign what you get from a collection to the wrong reference type. But NONE of this protection exists at runtime.

This is a little different from arrays, which give you BOTH compile-time protection and runtime protection. Why did they do generics this way? Why is there no type information at runtime? To support legacy code. At runtime, collections are collections just like the old days. What you gain from using generics is compile-time protection that guarantees that you won't put the wrong thing into a typed collection, and it also eliminates the need for a cast when you get something out, since the compiler already knows that only an Integer is coming out of an Integer list.

The fact is, you don't NEED runtime protection…until you start mixing up generic and non-generic code, as we did in the previous example. Then you can have disasters at runtime. The only advice we have is to pay very close attention to those compiler warnings:

This compiler warning isn't very descriptive, but the second note suggests that you recompile with -Xlint:unchecked. If you do, you'll get something like this:

When you compile with the -Xlint:unchecked flag, the compiler shows you exactly which method(s) might be doing something dangerous. In this example, since the list argument was not declared with a type, the compiler treats it as legacy code and assumes no risk for what the method puts into the "raw" list.

On the exam, you must be able to recognize when you are compiling code that will produce warnings but still compile. And any code that compiles (even with warnings) will run! No type violations will be caught at runtime by the JVM, until those type violations mess with your code in some other way. In other words, the act of adding a String to an <Integer> list won't fail at runtime until you try to treat that String-you-think-is-an-Integer as an Integer.

For example, imagine you want your code to pull something out of your supposedly type safe ArrayList <Integer> that older code put a String into. It compiles (with warnings). It runs…or at least the code that actually adds the String to the list runs. But when you take the String-that-wasn't-supposed-to-be-there out of the list, and try to assign it to an Integer reference or invoke an Integer method, you're dead.

Keep in mind, then, that the problem of putting the wrong thing into a typed (generic) collection does not show up at the time you actually do the add () to the collection. It only shows up later, when you try to use something in the list and it doesn't match what you were expecting. In the old (pre-Java 5) days, you always assumed that you might get the wrong thing out of a collection (since they were all non-type safe), so you took appropriate defensive steps in your code. The problem with mixing generic with non-generic code is that you won't be expecting those problems if you have been lulled into a false sense of security by having written type safe code. Just remember that the moment you turn that type safe collection over to older, non-type safe code, your protection vanishes.

Again, pay very close attention to compiler warnings, and be prepared to see issues like this come up on the exam.

When using legacy (non-type safe) collections—watch out for unboxing problems! If you declare a non-generic collection, the get() method ALWAYS returns a reference of type java.lang.Object. Remember that unboxing can't convert a plain old Object to a primitive, even if that Object reference refers to an Integer (or some other wrapped primitive) on the heap. Unboxing converts only from a wrapper class reference (like an Integer or a Long) to a primitive.

Unboxing gotcha, continued:

Watch out for missing casts associated with pre-Java 5, non-generic collections.

Polymorphism and Generics

Generic collections give you the same benefits of type safety that you've always had with arrays, but there are some crucial differences that can bite you if you aren't prepared. Most of these have to do with polymorphism.

You've already seen that polymorphism applies to the "base" type of the collection:

In other words, we were able to assign an ArrayList to a List reference, because List is a supertype of ArrayList. Nothing special there—this polymorphic assignment works the way it always works in Java, regardless of the generic typing.

But what about this?

Think about it for a minute.

Keep thinking…

No, it doesn't work. There's a very simple rule here—the type of the variable declaration must match the type you pass to the actual object type. If you declare List <Foo> foo then whatever you assign to the foo reference MUST be of the generic type <Foo>. Not a subtype of <Foo>. Not a supertype of <Foo>. Just <Foo>.

These are wrong:

But these are fine:

So far so good. Just keep the generic type of the reference and the generic type of the object to which it refers identical. In other words, polymorphism applies here to only the "base" type. And by "base," we mean the type of the collection class itself—the class that can be customized with a type. In this code,

List and ArrayList are the base type and JButton is the generic type. So an ArrayList can be assigned to a List, but a collection of <JButton> cannot be assigned to a reference of <Object>, even though JButton is a subtype of Object.

The part that feels wrong for most developers is that this is NOT how it works with arrays, where you are allowed to do this,

which means you're also allowed to do this

but not this:

Why are the rules for typing of arrays different from the rules for generic typing? We'll get to that in a minute. For now, just burn it into your brain that polymorphism does not work the same way for generics as it does with arrays.

Generic Methods

If you weren't already familiar with generics, you might be feeling very uncomfortable with the implications of the previous no-polymorphic-assignment-for-generic-types thing. And why shouldn't you be uncomfortable? One of the biggest benefits of polymorphism is that you can declare, say, a method argument of a particular type and at runtime be able to have that argument refer to any subtype—including those you'd never known about at the time you wrote the method with the supertype argument.

For example, imagine a classic (simplified) polymorphism example of a veterinarian (AnimalDoctor) class with a method checkup(). And right now, you have three Animal subtypes—Dog, Cat, and Bird—each implementing the abstract checkup() method from Animal:

Forgetting collections/arrays for a moment, just imagine what the AnimalDoctor class needs to look like in order to have code that takes any kind of Animal and invokes the Animal checkup () method. Trying to overload the AnimalDoctor class with checkup() methods for every possible kind of animal is ridiculous, and obviously not extensible. You'd have to change the AnimalDoctor class every time someone added a new subtype of Animal.

So in the AnimalDoctor class, you'd probably have a polymorphic method:

And of course we do want the AnimalDoctor to also have code that can take arrays of Dogs, Cats, or Birds, for when the vet comes to the dog, cat, or bird kennel. Again, we don't want overloaded methods with arrays for each potential Animal subtype, so we use polymorphism in the AnimalDoctor class:

Here is the entire example, complete with a test of the array polymorphism that takes any type of animal array (Dog[], Cat[], Bird[]).

This works fine, of course (we know, we know, this is old news). But here's why we brought this up as refresher—this approach does NOT work the same way with type safe collections!

In other words, a method that takes, say, an ArrayList <Animal> will NOT be able to accept a collection of any Animal subtype! That means ArrayList <Dog> cannot be passed into a method with an argument of ArrayList <Animal>, even though we already know that this works just fine with plain old arrays.

Obviously this difference between arrays and ArrayList is consistent with the polymorphism assignment rules we already looked at—the fact that you cannot assign an object of type ArrayList <JButton> to a List <Object>. But this is where you really start to feel the pain of the distinction between typed arrays and typed collections.

We know it won't work correctly, but let's try changing the AnimalDoctor code to use generics instead of arrays:

So what does happen?

The compiler stops us with errors, not warnings. You simply CANNOT assign the individual ArrayLists of Animal subtypes ( <Dog>, <Cat>, or <Bird> ) to an ArrayList of the supertype <Animal>, which is the declared type of the argument.

This is one of the biggest gotchas for Java programmers who are so familiar with using polymorphism with arrays, where the same scenario (Animal[] can refer to Dog[], Cat[], or Bird[]) works as you would expect. So we have two real issues:

1. Why doesn't this work?

2. How do you get around it?

You'd hate us and all of the Sun engineers if we told you that there wasn't a way around it—that you had to accept it and write horribly inflexible code that tried to anticipate and code overloaded methods for each specific <type>. Fortunately, there is a way around it.

But first, why can't you do it if it works for arrays? Why can't you pass an ArrayList <Dog> into a method with an argument of ArrayList <Animal> ?

We'll get there, but first let's step way back for a minute and consider this perfectly legal scenario:

Part of the benefit of declaring an array using a more abstract supertype is that the array itself can hold objects of multiple subtypes of the supertype, and then you can manipulate the array assuming everything in it can respond to the Animal interface (in other words, everything in the array can respond to method calls defined in the Animal class). So here, we're using polymorphism not for the object that the array reference points to, but rather what the array can actually HOLD—in this case, any subtype of Animal. You can do the same thing with generics:

So this part works with both arrays and generic collections—we can add an instance of a subtype into an array or collection declared with a supertype. You can add Dogs and Cats to an Animal array (Animal[]) or an Animal collection (ArrayList <Animal> ).

And with arrays, this applies to what happens within a method:

So if this is true, and if you can put Dogs into an ArrayList <Animal>, then why can't you use that same kind of method scenario? Why can't you do this?

Actually, you CAN do this under certain conditions. The code above WILL compile just fine IF what you pass into the method is also an ArrayList <Animal>. This is the part where it differs from arrays, because in the array version, you COULD pass a Dog[] into the method that takes an Animal[].

The ONLY thing you can pass to a method argument of ArrayList <Animal> is an ArrayList <Animal> ! (Assuming you aren't trying to pass a subtype of ArrayList, since remember—the "base" type can be polymorphic.)

The question is still out there—why is this bad? And why is it bad for ArrayList but not arrays? Why can't you pass an ArrayList <Dog> to an argument of ArrayList <Animal> ? Actually, the problem IS just as dangerous whether you're using arrays or a generic collection. It's just that the compiler and JVM behave differently for arrays vs. generic collections.

The reason it is dangerous to pass a collection (array or ArrayList) of a subtype into a method that takes a collection of a supertype, is because you might add something. And that means you might add the WRONG thing! This is probably really obvious, but just in case (and to reinforce), let's walk through some scenarios. The first one is simple:

This is no problem. We passed a Dog[] into the method, and added a Dog to the array (which was allowed since the method parameter was type Animal[], which can hold any Animal subtype). But what if we changed the calling code to

and the original method stays the same:

The compiler thinks it is perfectly fine to add a Dog to an Animal[] array, since a Dog can be assigned to an Animal reference. The problem is, if you passed in an array of an Animal subtype (Cat, Dog, or Bird), the compiler does not know. The compiler does not realize that out on the heap somewhere is an array of type Cat[], not Animal[], and you're about to try to add a Dog to it. To the compiler, you have passed in an array of type Animal, so it has no way to recognize the problem.

THIS is the scenario we're trying to prevent, regardless of whether it's an array or an ArrayList. The difference is, the compiler lets you get away with it for arrays, but not for generic collections.

The reason the compiler won't let you pass an ArrayList <Dog> into a method that takes an ArrayList <Animal>, is because within the method, that parameter is of type ArrayList <Animal>, and that means you could put any kind of Animal into it. There would be no way for the compiler to stop you from putting a Dog into a List that was originally declared as <Cat>, but is now referenced from the <Animal> parameter.

We still have two questions…how do you get around it and why the heck does the compiler allow you to take that risk for arrays but not for ArrayList (or any other generic collection)?

The reason you can get away with compiling this for arrays is because there is a runtime exception (ArrayStoreException) that will prevent you from putting the wrong type of object into an array. If you send a Dog array into the method that takes an Animal array, and you add only Dogs (including Dog subtypes, of course) into the array now referenced by Animal, no problem. But if you DO try to add a Cat to the object that is actually a Dog array, you'll get the exception.

But there IS no equivalent exception for generics, because of type erasure! In other words, at runtime the JVM KNOWS the type of arrays, but does NOT know the type of a collection. All the generic type information is removed during compilation, so by the time it gets to the JVM, there is simply no way to recognize the disaster of putting a Cat into an ArrayList <Dog> and vice versa (and it becomes exactly like the problems you have when you use legacy, non-type safe code).

So this actually IS legal code:

As long as the only thing you pass to the addAnimals (List <Animal>) is an ArrayList <Animal>, the compiler is pleased—knowing that any Animal subtype you add will be valid (you can always add a Dog to an Animal collection, yada, yada, yada). But if you try to invoke addAnimal() with an argument of any OTHER ArrayList type, the compiler will stop you, since at runtime the JVM would have no way to stop you from adding a Dog to what was created as a Cat collection.

For example, this code that changes the generic type to <Dog>, but without changing the addAnimal() method, will NOT compile:

The compiler says something like:

Notice that this message is virtually the same one you'd get trying to invoke any method with the wrong argument. It's saying that you simply cannot invoke addAnimal (List <Animal> ) using something whose reference was declared as List <Dog>. (It's the reference type, not the actual object type that matters—but remember—the generic type of an object is ALWAYS the same as the generic type declared on the reference. List <Dog> can refer ONLY to collections that are subtypes of List, but which were instantiated as generic type <Dog>.)

Once again, remember that once inside the addAnimals() method, all that matters is the type of the parameter—in this case, List <Animal>. (We changed it from ArrayList to List to keep our "base" type polymorphism cleaner.)

Back to the key question—how do we get around this? If the problem is related only to the danger of adding the wrong thing to the collection, what about the checkup() method that used the collection passed in as read-only? In other words, what about methods that invoke Animal methods on each thing in the collection, which will work regardless of which kind of ArrayList subtype is passed in?

And that's a clue! It's the add() method that is the problem, so what we need is a way to tell the compiler, "Hey, I'm using the collection passed in just to invoke methods on the elements—and I promise not to ADD anything into the collection." And there IS a mechanism to tell the compiler that you can take any generic subtype of the declared argument type because you won't be putting anything in the collection. And that mechanism is the wildcard <?>.

The method signature would change from

By saying <? extends Animal>, we're saying, "I can be assigned a collection that is a subtype of List and typed for <Animal> or anything that extends Animal. And oh yes, I SWEAR that I will not ADD anything into the collection." (There's a little more to the story, but We'll get there.)

So of course the addAnimal() method above won't actually compile even with the wildcard notation, because that method DOES add something.

you'll get a very strange error that might look something like this:

which basically says, "you can't add a Dog here." If we change the method so that it doesn't add anything, it works.

But wait—there's more. (And by the way, everything we've covered in this generics section is likely to be tested for on the exam, with the exception of "type erasure," for which you aren't required to know any details.)

First, the <? extends Animal> means that you can take any subtype of Animal; however, that subtype can be EITHER a subclass of a class (abstract or concrete) OR a type that implements the interface after the word extends. In other words, the keyword extends in the context of a wildcard represents BOTH subclasses and interface implementations. There is no <? implements Serializable> syntax. If you want to declare a method that takes anything that is of a type that implements Serializable, you'd still use extends like this:

This looks strange since you would never say this in a class declaration because Serializable is an interface, not a class. But that's the syntax, so burn it in!

One more time—there is only ONE wildcard keyword that represents both interface implementations and subclasses. And that keyword is extends. But when you see it, think "Is-a", as in something that passes the instanceof test.

However, there is another scenario where you can use a wildcard AND still add to the collection, but in a safe way—the keyword super.

Imagine, for example, that you declared the method this way:

Now what you've said in this line

is essentially, "Hey compiler, please accept any List with a generic type that is of type Dog, or a supertype of Dog. Nothing lower in the inheritance tree can come in, but anything higher than Dog is OK."

You probably already recognize why this works. If you pass in a list of type Animal, then it's perfectly fine to add a Dog to it. If you pass in a list of type Dog, it's perfectly fine to add a Dog to it. And if you pass in a list of type Object, it's STILL fine to add a Dog to it. When you use the <? super …> syntax, you are telling the compiler that you can accept the type on the right-hand side of super or any of its supertypes, since—and this is the key part that makes it work—a collection declared as any supertype of Dog will be able to accept a Dog as an element. List <Object> can take a Dog. List <Animal> can take a Dog. And List <Dog> can take a Dog. So passing any of those in will work. So the super keyword in wildcard notation lets you have a restricted, but still possible way to add to a collection.

So, the wildcard gives you polymorphic assignments, but with certain restrictions that you don't have for arrays. Quick question: are these two identical?

If there IS a difference (and we're not yet saying there is), what is it?

There IS a huge difference. List <?>, which is the wildcard <?> without the keywords extends or super, simply means "any type." So that means any type of List can be assigned to the argument. That could be a List of <Dog>, <Integer>, <JButton>, <Socket>, whatever. And using the wildcard alone, without the keyword super (followed by a type), means that you cannot ADD anything to the list referred to as List<?>.

List <Object> is completely different from List <?>. List <Object> means that the method can take ONLY a List <Object>. Not a List <Dog>, or a List <Cat>. It does, however, mean that you can add to the list, since the compiler has already made certain that you're passing only a valid List <Object> into the method.

Based on the previous explanations, figure out if the following will work:

If not, where is the problem?

The problem is in the list.add() method within doInsert(). The <?> wildcard allows a list of ANY type to be passed to the method, but the add() method is not valid, for the reasons we explored earlier (that you could put the wrong kind of thing into the collection). So this time, the TestWildcards class is fine, but the Bar class won't compile because it does an add() in a method that uses a wildcard (without super). What if we change the doInsert() method to this:

Now will it work? If not, why not?

This time, class Bar, with the doInsert() method, compiles just fine. The problem is that the TestWildcards code is trying to pass a List <Integer> into a method that can take ONLY a List <Object>. And nothing else can be substituted for <Object>.

By the way, List <? extends Object> and List <?> are absolutely identical! They both say, "I can refer to any type of object." But as you can see, neither of them are the same as List <Object>. One way to remember this is that if you see the wildcard notation (a question mark ?), this means "many possibilities". If you do NOT see the question mark, then it means the <type> in the brackets, and absolutely NOTHING ELSE. List <Dog> means List <Dog> and not List <Beagle>, List <Poodle>, or any other subtype of Dog. But List <? extends Dog> could mean List <Beagle>, List <Poodle>, and so on. Of course List <?> could be…anything at all.

Keep in mind that the wildcards can be used only for reference declarations (including arguments, variables, return types, and so on). They can't be used as the type parameter when you create a new typed collection. Think about that—while a reference can be abstract and polymorphic, the actual object created must be of a specific type. You have to lock down the type when you make the object using new.

As a little review before we move on with generics, look at the following statements and figure out which will compile:

The correct answers (the statements that compile) are 1, 2, and 5. The three that won't compile are

Statement: List <?> foo = new ArrayList <? extends Animal> ();
Problem: you cannot use wildcard notation in the object creation. So the new ArrayList <? extends Animal> () will not compile.

Statement: List <? extends Dog> cList = new ArrayList <Integer> ();
Problem: You cannot assign an Integer list to a reference that takes only a Dog (including any subtypes of Dog, of course).

Statement: List <? super Animal> dList = new ArrayList <Dog> ();
Problem: You cannot assign a Dog to <? super Animal>. The Dog is too "low" in the class hierarchy. Only <Animal> or <Object> would have been legal.

Generic Declarations

Until now, we've talked about how to create type safe collections, and how to declare reference variables including arguments and return types using generic syntax. But here are a few questions: How do we even know that we're allowed/ supposed to specify a type for these collection classes? And does generic typing work with any other classes in the API? And finally, can we declare our own classes as generic types? In other words, can we make a class that requires that someone pass a type in when they declare it and instantiate it?

First, the one you obviously know the answer to—the API tells you when a parameterized type is expected. For example, this is the API declaration for the java.util.List interface:

The <E> is a placeholder for the type you pass in. The List interface is behaving as a generic "template" (sort of like C++ templates), and when you write your code, you change it from a generic List to a List <Dog> or List <Integer>, and so on.

The E, by the way, is only a convention. Any valid Java identifier would work here, but E stands for "Element," and it's used when the template is a collection. The other main convention is T (stands for "type"), used for, well, things that are NOT collections.

Now that you've seen the interface declaration for List, what do you think the add() method looks like?

In other words, whatever E is when you declare the List, that' what you can add to it. So imagine this code:

The E in the List API suddenly has its waveform collapsed, and goes from the abstract <your type goes here>, to a List of Animals. And if it's a List of Animals, then the add() method of List must obviously behave like this:

When you look at an API for a generics class or interface, pick a type parameter (Dog, JButton, even Object) and do a mental find and replace on each instance of E (or whatever identifier is used as the placeholder for the type parameter).

Making Your Own Generic Class

Let's try making our own generic class, to get a feel for how it works, and then We'll look at a few remaining generics syntax details. Imagine someone created a class Rental, that manages a pool of rentable items.

Now imagine you wanted to make a subclass of Rental that was just for renting cars. You might start with something like this:

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 7 Generics and Collections

Create new playlist

Sign In

Sign Up

7 Generics and Collections

overriding hashCode() and equals() (objective 6.2)

Overriding equals()

What It Means If You Don't Override equals()

Implementing an equals() Method

The equals() Contract

Overriding hashCode()

Understanding Hashcodes

Implementing hashCode()

The hashCode() Contract

Collections (Exam Objective 6.1)

So What Do You Do with a Collection?

Key interfaces and Classes of the Collections framework

List Interface

Set Interface

Map Interface

Queue Interface

Using the Collections framework (objectives 6.3 and 6.5)

ArrayList Basics

Autoboxing with Collections

Sorting Collections and Arrays

Sorting Collections

The Comparable Interface

Sorting with Comparator

Sorting with the Arrays Class

Searching Arrays and Collections

Converting arrays to lists to arrays

Using Lists

Using sets

Using Maps

Navigating (Searching) TreeSets and TreeMaps

Other Navigation Methods

Polling

Descending Order

Backed Collections

Using the PriorityQueue Class

Method Overview for Arrays and Collections

Method Overview For List, Set, Map, and Queue

Generic Types (Objectives 6.3 and 6.4)

The Legacy Way to Do Collections

Generics and Legacy Code

Mixing Generic and Non-generic Collections

Polymorphism and Generics

Generic Methods

Generic Declarations

Making Your Own Generic Class

Table of Contents for
7 Generics and Collections

7
Generics and Collections