16: Collections

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

CHAPTER

Collections

f you have taken a course on data structures in your computer science curriculum, you have used collections. In our earlier chapters, we used the Vector class and ArrayList. These classes are nothing but a part of the Collections framework. A collection is simply an object that groups multiple objects into a single unit, making it easier to perform group operations on them, to search quickly through thousands of sorted items, to insert and remove elements in the middle of an ordered sequence, and so on. Collections store and aggregate multiple data items so that they can be retrieved and manipulated with ease. In the earlier versions of Java (pre–J2SE 1.2) only the Vector and Hashtable classes were provided as part of Collections. The built-in Array class was used for creating arrays. Now, the Java platform provides a full-fledged Collections framework, which is a unified architecture for representing and manipulating various collections. You will learn to use the Collections framework in this chapter. In particular, you will learn the following:

What the Collections framework is

Interfaces defined in the Collections framework

Various implementation classes of the framework

Lists

Sets

Queues

Maps

Polymorphic algorithms

What Is the Java Collections Framework?

Many languages provide a collections framework, so it is very likely you have already used collections in other languages. For example, the Standard Template Library (STL) in C++ is a collections framework. Java introduced the Collections framework beginning in J2SE 1.2. Earlier to that it only had Vector, Hashtable, and built-in arrays. Each of these had different syntax and methods for accessing its members. Arrays use square brackets, Vector uses the elementAt method, and Hashtable uses get and put methods to access the members. Besides this, some of the methods in Vector are marked final and therefore cannot be inherited. Arrays have a fixed size, making it tricky to deal with those situations where the number of elements can dynamically vary at runtime.

What is desired is a standard interface for member access, a more powerful set of classes, and some built-in algorithms for sorting, searching, and so on. This is now achieved through the Collections framework. The Java Collections framework is not restricted to the aforementioned preexisting classes but rather contains much more—namely, interfaces, implementations, and algorithms.

The Hashtable and Vector classes have now been updated to implement the Collection interfaces. Many new collection implementations have been added, including HashSet and TreeSet, ArrayList and LinkedList, and HashMap and TreeMap. Each of these provides a unique advantage over the others. For example, TreeSet and TreeMap implicitly support ordering, making it lot easier to maintain a sorted list with no effort. Finding the smallest and largest element in a sorted list is very easy. You may sort the elements based on their natural sort order or provide your own comparator for sorting. Also, searching is made easier with the implementation of a binary search algorithm. The ordered insertion in collections usually results in a performance penalty, and in certain situations you may not need to order the collection elements. For example, in the previous chapter, we used ArrayList to maintain a list of graphics objects, where the ordering was not at all important. In some situations, you may want to maintain a key/value pair like the one in a word dictionary. For this, you would use HashMap. Thus, the Collections framework provides you with several types of collections to meet your current needs. Each type has a unique purpose, as you will learn as you read on.

Benefits of the Collections Framework

One of the biggest benefits that comes out of using the Collections framework is the set of standard interfaces it provides. Whether you are working on a Set, List, or Map, the interface remains the same. All the classes in the framework conform to a common API and thus become more regular and easily understood. The application type does not matter, and the user sees the same interface, whether he is developing a chat application, working on a SQL database, or creating a graphics editor like the one in Chapter 15. The standard interfaces make it easier to pass collection objects between methods as parameters or return values. Thus, a method can work on a wider variety of collections. All the collection classes have a common implementation that makes your code shorter and quicker to download. Also, any changes to this core implementation to enhance performance or add features are immediately available to your program code.

Earlier collections classes used Enumeration to traverse their elements. The Collections framework, on the other hand, introduced Iterator, which allows for element operations such as insertion and deletion. The Iterator is fail-fast, ensuring that you get an exception if the list you are iterating is modified by another user. Also, iterators such as ListIterator that operate on a list-based collection such as Vector allow bidirectional iteration and updating. As mentioned earlier, some of the collections allow you to maintain a sorted collection of objects, making it easy to find the first and the last based on a certain sort order and to perform quick searches.

The Collections framework also provides a static class called Collections that provides read-only and synchronized versions of existing collections. The read-only collections protect you from accidental changes to collection items, and synchronized versions are useful in developing multithreaded applications. You create a read-only set by calling the unmodifiableSet static method of the Collections class. You learn about the use of synchronized collection classes in thread programming later in the chapter.

What the Collections Framework Offers

The Java Collections framework consists of interfaces, implementations, and algorithms:

Interfaces Just like the Java interfaces you studied in the earlier chapters, these interfaces provide a uniform interface to the collections independent of the objects they represent. The various collection classes implement a common interface. The interface hierarchy is designed to account for the different types of data structures, such as lists, sets, queues, and maps.

Implementations These are concrete implementations for the various types of data structures, such as lists, sets, queues, and maps.

Algorithms These provide useful methods for performing common operations such as sorting and searching that are applied polymorphically to collections regardless of the object types they store.

In this chapter, you learn about all three of these aspects of the Collections framework.

The Collections Framework Interfaces

The interface hierarchy consists of two distinct trees. List, Set, and Queue fall under the Iterable tree. Map is somewhat different from these collections and therefore falls under a different tree all its own. Besides these, you have the Iterator interface that allows you to iterate through the elements of a collection. Finally, the RandomAccess interface is a marker to indicate that the implementing data structure class provides fast random access to its data. The interface hierarchy defined in the Collections framework is depicted in Figure 16-1.

FIGURE 16-1. Interface hierarchy in the Collections framework

List is an ordered collection that can contain duplicates. In a list, when you insert an element, you have control over where the element is inserted; therefore, you are able to retrieve an element by knowing its index. A Vector (which we used in previous chapters) falls under this category.

Set is a collection that cannot contain duplicate elements. More precisely, a set cannot contain a pair of elements, e1 and e2, such that e1.equals(e2) is true. Also, it can contain at most one null element. This models the mathematical set abstraction. As an example, you could use a set to create a timetable of departing trains from a particular train station or to create a schedule of courses offered at the sophomore level. The SortedSet and NavigableSet interfaces extend the functionality defined by Set.

SortedSet is a set that provides a total ordering of its elements. It is used for naturally ordered sets and exposes the comparator object used for sorting. This interface provides methods to obtain subset views of the collection and the iterator that traverses the set in ascending element order. For ease of searching and traversal in such sorted sets, the NavigableSet interface was introduced in Java SE 6. Given a set of ordered numbers such as […, 10, 15, 20, 35, 50, …] you can easily determine the element greater than or less than 20 in a single method call of this interface. Methods such as lower, higher, floor, and ceiling are provided for this purpose. You can also use the headSet and tailSet methods to get a head or tail subset with respect to any element in the sorted set. ConcurrentSkipListSet and TreeSet are the two implementing classes for this interface.

Queue is a collection of multiple objects that you would like to insert into a collection before processing them. Just like in real life, we first form a queue of objects and then process them, maybe on a FIFO (first in, first out) basis. The interface provides methods for inserting, extracting, and inspecting elements; it is not necessary to extract elements in FIFO order. The interface provides a means of ordering elements as per their natural ordering or a supplied comparator.

Deque represents a double-ended queue, meaning that you are allowed to add and remove elements at both ends. The implementation can be used for LIFO (last in, first out) order, as in the case of a Stack, or FIFO order, as in the case of a Queue. The Deque provides methods to insert, remove, and examine the element. Each of these methods exists in two forms—one that throws an exception and the other that returns a null or false value, depending on the operation. For Deque implementations having capacity restrictions, the latter form is used. For example, the offerFirst method inserts the element specified in the method parameter at the front of the deque. If the insertion fails due to capacity restriction, a false value is returned. This is probably better than having an exception generated when you use the corresponding addFirst method.

BlockingDeque is a sub-interface of Deque that supports blocking operations during an insert or retrieval. Removing an element operation would block for the deque to become non-empty (if the deque is empty, the remove operation blocks until somebody inserts an element in it), and similarly inserting an element operation waits (blocks) for the space to become available in the deque. It supports four forms of methods:

Methods that throw exceptions

Methods that time out

Methods that block (wait indefinitely)

Methods that return a special value

Map provides a collection of key/value pairs. This is useful in creating dictionaries and such, where the objects are accessed using their keys. For example, a database of employee objects may be accessed using the employee IDs. Therefore, the ID in this case becomes the key, and the employee record becomes the value associated with the key. Just the way you have sorted and navigable sets, you have interfaces for creating sorted maps and navigating them using a tree-like structure. SortedMap provides a total ordering on its keys. The ordering is based on either the natural ordering of its keys or on a provided comparator. Along with the NavigableSet you saw earlier, Java SE 6 introduced the NavigableMap interface to provide similar navigational methods on a map that return the closest matches for given search targets. Methods such as headMap and tailMap are similar to headSet and tailSet, which you saw earlier for sets. In the case of NavigableSet, the methods typically returned a single value; in the case of NavigableMap, they return the key/value pair.

The Iterator interface provides methods for iterating through the elements of a collection. Finally, as mentioned earlier, RandomAccess is a marker interface that indicates that the implementing List class supports fast random access.

In summary, we say that the Collection interface is a group of objects. Set extends Collection but forbids duplicates. List extends Collection and also allows duplicates and introduces positional indexing. Map extends neither Set nor Collection. You can easily see from this discussion the amount of variety you have in creating different collections of objects, depending on your needs.

The Collections Framework Classes

The class diagram in Figure 16-2 shows the various classes you have for working with different kinds of collections. We will now discuss the four types of data structures individually—list, set, queue, and map—illustrated in this figure, along with code examples.

FIGURE 16-2. Implementation classes in the Collections framework

List

As stated earlier, the List data structure defines an ordered collection of elements. The user can precisely control the position where a new element is inserted in the list. To retrieve elements from the list, the user can use the index, which is the position in the list. Lists allow duplicates, meaning that a pair of elements, e1 and e2, such that e1.equals(e2), is allowed. Also, you can have multiple null elements, provided a null element is permitted in the first place (some implementations prohibit them). Note that sets do not allow such duplicates. A special iterator, called ListIterator, defined in this interface, besides its typical operations, allows element insertion and replacement as well as bidirectional access. The interface defines an indexOf method that takes an Object argument and, after searching the list, returns the index of the object, if found. The get method accepts an index as its argument and returns the element at the specified position. Such searches should be used with caution because many implementations perform costly linear searches. The interface also provides methods to efficiently insert and remove multiple elements at an arbitrary point in the list.

Java libraries define several implementations of this interface—a few examples are ArrayList, LinkedList, Stack, and Vector. We will now consider the use of the List interface through a code example based on the LinkedList implementation. The LinkedList class allows duplicate elements. A simple example of a list might be maintaining the names of players on a soccer team. The team can have two players with the same name (in some cases you might use a prefix such as Jr. to distinguish between the two). We will now write a program to create and manipulate a list of players. We will create two teams—one for men and the other for women. After adding five members to each team, we will print out the teams. We will then merge the two teams to create a mixed team. Later on, we will disqualify a few members and remove them from the mixed team. All this will certainly expose you to the various methods of the List interface. The program is given in Listing 16-1.

Listing 16-1 Soccer Team Builder Based on the LinkedList Data Structure

In the main method, we first create the list with the following declaration:

Note that the collection classes now use generics. Therefore, we specify the type of object our list is going to hold in angular brackets. The maleTeam is the list that holds String-type elements. The instance of the LinkedList class is assigned to a List-type variable, which is a super-interface of the LinkedList class. After creating a list, we add a few items to it by calling the List’s add method:

To print the list of created members, we use our regular SOP (System.out.println) method:

Likewise, we create a list of female team members called femaleTeam and add a few members of the String type to it. We’ll now merge the two teams. While merging the teams, we will mix the members of the two teams in an alternating fashion, adding the female players to the team of male players. Therefore, the mixed team will be stored in the maleTeam list. For this, we create two iterators:

Both iterator classes use generics; therefore, we need to provide the appropriate data type while creating them. The ListIterator class provides an add method that allows for the insertion of an object in the list. The Iterator class does not allow the insertion; it allows only the removal of an element. Thus, the maleListIterator is of type ListIterator, and the femaleListIterator is of type Iterator. After creating iterators, we define the following loop to copy the elements of the female team into the male team in alternating positions:

The hasNext method returns null on encountering the end-of-list, and the next method returns the object from the list and advances the pointer in the list. After the while loop completes its execution, the mixed team is available in the maleTeam list, which we print to the terminal:

Now, to define a list of disqualified members, we create another list called disqualify and add the desired members to it:

To remove the disqualified members from the mixed team, we use the removeAll method and pass the list of disqualified members to it as a parameter:

We then print the modified team for verification:

When you run the program, you will see the following output:

From this simple example, you can see how easy it is to create and manipulate a list of objects using the List interface.

Optional Operations of the List Interface

The methods we used in Listing 16-1 were the mandatory implementations for an implementing class. The List interface also defines several methods that are optional in the sense that implementing classes need not use them, but must throw a runtime exception if they are not supported.

The interface defines four overloaded add methods and three overloaded remove methods that obviously take different sets of parameters. The clear method removes all the elements from the list. The set method takes two arguments—the index and the element—and replaces the element at the specified position in the list with the specified element. The retainAll method takes a Collection as an argument and retains only the elements in the list that are contained in the specified collection. We will now discuss a few implementations using some or all of these optional methods so that you get a feel for the various implementing classes of the List interface.

The ArrayList class used in the previous chapter is a resizable-array implementation of the List interface that implements all the aforementioned optional operations. One of the benefits of using this class is that the size, isEmpty, get, set, iterator, and listIterator operations run in constant time. The add operation runs in amortized constant time; adding n elements requires O(n) time. The constant factor here is low compared to the corresponding LinkedList implementation. All other operations roughly run in linear time. Each instance of ArrayList has a capacity, which defines the size of the array used to store the elements in the list. This capacity grows dynamically as you add more elements to the list. Before adding a large number of elements, use the ensureCapacity operation to possibly reduce the amount of incremental allocation. This class is somewhat equivalent to the java.util.Vector class we used earlier, except that it is unsynchronized.

The CopyOnWriteArrayList defined in the java.util.concurrent package provides a thread-safe variant of ArrayList. In this class, all mutative operations, such as add, set, and so on, are implemented by creating a fresh copy of the underlying array, making it generally costly to use. Ordinarily, when you have more traversal operations than mutations, this implementation is preferred over synchronizing traversals yourself. This class permits null.

The Vector is another implementation of the List interface. It implements a “growable” array of objects. The size grows or shrinks as needed. Storage management can be optimized by using the capacity and ensureCapacity operations and the capacityIncrement field. The major benefit of using this class lies in the fact that the iterators returned by the iterator and listIterator methods are fail-fast. Concurrent external modifications to Vector’s structure result in the iterator throwing a ConcurrentModificationException. The iterator fails quickly and cleanly, saving you from encountering nondeterministic behavior in the future. Unlike the LinkedList and ArrayList from earlier, the Vector is synchronized.

The Stack is yet another implementation of the List interface. The Stack extends the Vector. Conventionally, it represents a LIFO stack of objects. When you create a Stack object, it contains no elements. The Stack class has existed since JDK 1.0. For a more complete and consistent set of LIFO operations, you should use the implementations of the Deque interface, discussed later in this chapter.

Having studied the various implementations of the List interface, we will now move ahead to study the Set data structure.

Set

The Set interface represents a collection that does not permit duplicate elements. The definition of duplicates in terms of the object identity remains the same as described previously. Due to this restriction, the interface places additional stipulations on its constructors and the add, equals, and hashCode methods. This interface models the mathematical set abstraction. HashSet, LinkedHashSet, TreeSet, EnumSet, and CopyOnWriteArraySet are some of the concrete implementations of this interface. We will discuss just two implementations: HashSet and TreeSet. Refer to javadocs for details about the other implementations.

HashSet

The HashSet implementation is backed by a hash table (a HashMap instance) and does not permit null. This class offers constant-time performance for basic operations such as add, remove, contains, and size, assuming the hash function disperses the elements properly among the buckets. The iteration time depends largely on the number of elements and the number of buckets in the backing HashMap instance; do not set the initial capacity too high and the load factor too low if iteration performance is important.

The implementation does not guarantee the iteration order. As a matter of fact, it does not even guarantee that the iteration order will remain constant over time.

This class is not synchronized. You can provide an external synchronization by wrapping the set as illustrated here:

The iterators are fail-fast, as described earlier for the Vector.

To illustrate the use of this class, let’s look at a concrete example. Suppose we have been asked to write a program that accepts election voter names from the user. We will store these names in a set, with no duplicates allowed. In the case of a long list of names, the user might unknowingly enter the same name twice. This program should therefore ensure that the resultant set contains only distinct names and all duplicates are rejected without warning to the user. The program that does all this is given in Listing 16-2.

Listing 16-2 Building a Set of Distinct Words Using the HashSet Class

In the main method, we create an instance of HashSet that stores String-type objects:

Next, we read the names from the user. We use the Scanner class (introduced in J2SE 5.0) to read input from the keyboard. The parameter to the class constructor is System.in, which is the default keyboard input:

We read multiple names from the user until he inputs a “blank” name:

The nextLine method of the Scanner class returns a String. As long as this string does not equal a blank string, we keep on asking for a new name from the user. We add the input string to the Set we created earlier:

In case of a duplicate entry, the add method simply refuses to add it to the set. After the while loop terminates, the set contains only the distinct words. We print both the total number of input words and the number of distinct words to the terminal:

We also print the contents of the created set to the terminal for verification by creating and using an iterator:

Sample output from the program is shown here:

Note that the user has entered a total of six names, out of which five are unique. The name Jack has been entered twice. The final list contains only one occurrence of Jack.

TreeSet

TreeSet uses a tree for storage. Objects are stored in sorted, ascending order. The elements are ordered using their natural ordering or by a comparator provided at the set creation time. Therefore, when you iterate over the set elements, the order of elements is constant. By passing the comparator object to the constructor, you can set a different ordering than the natural ordering of the elements. You create a comparator object by instantiating a built-in Comparator class. You will need to define your own comparison in its compare method. The operations, such as add, remove, and contains, are guaranteed to perform in log(n) time, making this a better choice over ArrayList for fast retrievals from large amounts of sorted information. Like HashSet, this class, too, is not synchronized.

To illustrate this class, let’s look at a concrete example. Suppose we have to create a team of players. We will store this list of players in a TreeSet. For each player, we store his name and age.

We will create a Player class for storing this information. We will define the default ordering based on the players’ age. Later on, we will create another sorted list based on the existing list that sorts the elements of the list by player name. You will understand how easy it is to sort the team on different fields when you use this class. The program is given in Listing 16-3.

Listing 16-3 Building a Sortable Team Based on TreeSet

Let’s first look at the Player class. We will store the instances of this class in the team we create in the main application:

The class is declared private and defined as an inner class to the SortableTeam class. Because we will not use the Player class outside this sample application, we declare it private so as to shield its definition from other classes in the application.

The Player class implements the Comparable interface that operates on the Player data type. As part of the Comparable interface implementation, we need to implement the compareTo method:

The compareTo method returns the difference between the age of the current object and the age of the received object. This defines the default ordering while inserting the objects of the Player type in the tree. Besides this compareTo method, the class defines the conventional constructor that takes two parameters and initializes the object’s state. We also create two getter methods to retrieve the age and name fields. Now, let’s look at the main application.

In the main method, we create an instance of TreeSet that operates on a Player data type:

We add a few elements to the ageSortedTeam:

Note that when the elements are inserted, they will be ordered based on the comparator provided in the class definition; they will be arranged in the ascending order of each player’s age. To verify this, we print the set by calling the printSet method, which is discussed later:

Now, we create another list that is sorted alphabetically by player name. To do this, we create another instance of the TreeSet class:

To the constructor, we pass an instance of Comparator (remember the use of anonymous classes from previous chapters). The compare method receives two parameters of the Player type:

We obtain the players’ names from the two received objects by calling the getter method. The compareTo method compares two strings and returns their alphabetical ordering:

After we have constructed the ordered set with a new ordering method defined, we need to add elements to it. We do this by calling the addAll method:

The addAll method takes a parameter, which is an existing set of players. All the elements of the ageSortedTeam will now be added to the new set; however, during insertion, the new ordering mechanism is used. We verify this ordering by printing the list to the terminal:

Finally, we discuss the implementation of the printSet method. This is a static method of the class that accepts a Set-type parameter:

In the method body, we first create an iterator:

We iterate through all the elements of the set by calling the hasNext method:

The next method retrieves the object from the set. We typecast this to the Player type:

We get the name and the age attributes of the Player object and print them to the terminal:

When we run the program, we see the following output:

Note that the first list is arranged by the players’ age, whereas the second one is arranged alphabetically by the players’ name.

TIP

Here are some pointers for deciding between using Set and List. A TreeSet consumes a little bit more memory than an ArrayList. This is because a TreeSet uses a tree data structure to store its information, where each node in the tree is an object that keeps pointers to its parent, the left branch, the right branch, the element, and more. In comparison, an ArrayList is a simple array with the elements. Also, inserting an element in a TreeSet is faster than the insertion in ArrayList. This is because when you insert an element in an arbitrary position in an ArrayList, on average half the list will have to be shifted by one position. This takes O(n) time, which means inserting into a list containing 1,000,000 elements will take 1,000 times as long as the time taken to insert into a list of 1,000 elements. On the other hand, a TreeSet needs to traverse the depth of the tree, which takes O(log(n)) time. Therefore, if the set contains 1,000,000 elements, it takes only twice as long as when the set contains 1,000 elements.

We’ll now move on to the next type of data structure—a queue.

Queue

A queue is a collection that is designed for holding elements prior to processing. Queues typically order elements in FIFO order, although this is not always true. The PriorityQueue implementation, described next, orders elements based on their natural ordering or according to the supplied comparator. Besides basic operations, Queue provides additional methods for insertion, extraction, and inspection. To insert an element, you use the offer method instead of the typical add method.

To remove an element, you use poll instead of the typical remove, and to inspect you use peek instead of the regular element method. The offer, poll, and peek methods return a special value instead of throwing an exception in case of failure. The Queue interface does not define blocking methods; however, the BlockingQueue interface that extends the Queue interface blocks on an element to appear or for the space to become available. The Queue implementation generally does not allow null insertion; the exception is the LinkedList implementation. You should generally avoid the insertion of null because it is also used as a special return value by the poll method.

Java libraries provide several implementations of the Queue interface—a few examples are PriorityQueue, PriorityBlockingQueue, LinkedList, and ArrayDeque. Refer to the javadocs for a detailed study of the different implementations. In this section, we discuss just one implementation: PriorityQueue.

The PriorityQueue class represents an unbounded priority queue based on a priority heap. The elements are ordered according to their natural ordering or by a comparator provided at the queue construction time. A sorted order is always permanently imposed on the elements it contains. The highest-priority element is at the head of the queue, and the lowest is at its tail. Removing the highest-priority element and adding a new element operation are both efficient. Searching for an element that is not at the head of the queue is usually inefficient.

You would typically use priority queues in situations where you want to add elements in any order but retrieve them in sorted order, and where you do not necessarily retrieve the elements all at once. A typical application of this is building a Huffman compression tree, where you need to sort something gradually, occasionally peeking at the topmost element, altering it, and placing it back into the tree. Other typical applications might be performing a heapsort and creating a job scheduler.

An example where a priority queue might be useful is in the allocation of the CPU to waiting threads. Generally, several processes running on the system create threads that are posted in a queue for execution. The threads have different priorities. The operating system executes all threads with the highest priority first, followed by the threads at the next priority level, and so on. We will now write a program that does this allocation of CPU time. In this example, the thread priorities range from 0 to 9, with 0 being the highest priority and 9 being the lowest.

CAUTION

Generally, for thread priorities, 0 is the lowest priority and 9 (or whatever is the highest level) is considered the highest priority. We use the reverse order in this example because this is the natural ordering for integers. To change to descending ordering, you need to provide your comparator, which is discussed in later examples in this chapter.

We assume that the processes will create threads and put them in a queue for execution. The size of this queue is 100. Therefore, at any moment, we may have up to 100 threads for allocation. We will rearrange all these threads based on their priorities in a priority queue. We will then simply allocate the CPU cycles by picking the element at the top of the queue, one after another. For all threads having the same priority level, we do not further distinguish between them on the basis of their time of creation. Therefore, it’s possible for a thread created later than another thread (both with the same priority) to get the CPU earlier than the other one.

The program that does this allocation is given in Listing 16-4.

Listing 16-4 Building a Thread Scheduler Using the PriorityQueue Class

In the main function, we first declare an array to store the waiting threads:

We create 100 threads with a random priority in the range of 0 to 9 assigned to each one:

Next, we create a PriorityQueue class instance that operates on the Integer data type:

We fill this queue with the elements of our thread list by calling the addAll method:

We now print the list of all waiting threads:

To allocate the CPU, we remove the thread from the top of the queue and print its priority level to the user terminal:

When we run the program, we see output similar to the following:

Note that the threads are deployed strictly according to their priorities, and until the threads with higher priorities finish, threads at lower priorities starve.

Map

A Map represents an object that maps keys to values. It cannot contain duplicate keys; each key can map to at most one value. This is typically used in building word dictionary applications. Java libraries define several implementations of the Map interface. We just discuss one implementation in this section: HashMap. As usual, refer to javadocs for information on the other implementations. The HashMap permits the use of null values and keys. The HashMap does not make any guarantee that the retrieval order will remain constant over time. It provides constant-time performance for basic operations such as get and put, assuming that the hash function disperses the elements properly among the buckets. To have the HashMap work efficiently, we use its two parameters— initial capacity and load factor. The load factor controls when the capacity should be increased. Note that HashMap stores only object references; if you want to store primitive data types, use wrapper classes. This class is not synchronized. For synchronized access, use java.util.Hashtable instead. You will learn the pros and cons of using synchronized access in Chapter 17.

We will now discuss the use of the HashMap class through a concrete example. Suppose we are building a word list typically used by students taking the Graduate Record Examination (GRE), where for each word we want to store its type, synonym, and antonym. The list should be searchable by a keyword. After the search, the aforementioned three attributes of the word should be displayed. This user of the program should be able to modify any of the entries as well as print the entire word list on demand. The program in Listing 16-5 does all this.

Listing 16-5 Building a GRE Word List Using HashMap

In the declaration, we first create an instance of the HashMap class:

The HashMap class uses two generic parameters. We set the first parameter to the type String and the second parameter to the type Word, which is a user-defined class (discussed later). In the main method, we add a few entries to the map, as follows:

The first parameter to the put method is the key, and the second parameter is its value, which in this case is an instance of the Word class. Note that the Word constructor requires three parameters—the first parameter specifies the type of word, the second parameter specifies synonyms, and the third parameter specifies antonyms. After building the list, we print the entire list by calling the printList method (discussed later). To look up an entry in the list, we call the get method:

The key to search for becomes the parameter to the get method. On return, it gives the value stored in the map. This value is printed by using the overridden toString method of the Word class.

To modify an entry, we simply call the put method with the desired key and the new value. This will replace the value for the existing key:

To remove an entry from the list, we call the remove method with the desired key as its parameter:

The implementation of the printList method is very straightforward. We use the for-each loop to iterate the list. For each entry, we use the getter methods to retrieve the key and its value and then print them to the terminal using SOP:

The Word class declaration is very straightforward. It contains three class variables, with the corresponding getter methods and a constructor to initialize these variables at the time of construction. The overridden toString method provides the formatted output of the three fields.

Algorithms

The Collections class in the framework provides implementations of several useful algorithms. These are polymorphic and therefore can be applied to any type of collection. This class provides many popular algorithms, such as binary search, sort, shuffle, max, min, frequency, and so on. These algorithms are defined as static methods of the Collections class.

We’ll cover the use of some of these algorithms with the help of a program. We will first create a set of 100 random numbers as sample data for our program. We will sort these numbers based on their natural ordering by calling the built-in sort algorithm. We will then apply a binary search algorithm to locate a particular test number. We will also find the largest and the smallest number in the set. Then, we will determine the frequency of occurrence of a certain test number by using the provided frequency algorithm. We will then proceed to find the number of distinct elements by creating a sorted HashSet. We shuffle this newly created set and sort it in ascending order to determine top 10 picks. Isn’t this exciting? The program given in Listing 16-6 teaches you all these techniques.

Listing 16-6 Demonstrating the Power of Built-in Collection Algorithms

To demonstrate the use of algorithms, we first create an array of integers:

We fill this array with 100 random numbers in the range 0 to 99:

We then sort the contents of this array by calling the sort method of the Collections class:

The sort method stores back the result in the same array. We print this array to verify the sorting:

Next, we perform a binary search on this sorted array to locate a desired number:

The binarySearch method takes two arguments: The first argument specifies the list on which a search is to be performed, and the second argument specifies the search element. On completion, it returns the index at which the specified number is found. If the search element is not found, it returns a negative number.

To determine the largest number in the array, we use the max method:

Likewise, to determine the smallest number, we use the min method. Both methods take the list to be searched as the argument.

To determine the frequency of occurrence of a certain number in the list, we use the frequency method:

The first parameter to the frequency method is the list to be searched, and the second argument is the search element.

Next, we create a set of distinct elements from our set. For this, we use the HashSet class:

We add all the elements of our list into this new set by calling the addAll method:

The addAll method adds only the distinct elements to the set. We determine and print the size of this set by calling its size method:

Next, we shuffle the elements of this set and list the top 10 numbers from the shuffled set as the winners. For this, we use the shuffle method. Because the shuffle works only on the List interface, we need to copy the elements of our set into a list. We first clear the existing contents of the list by calling its clear method and then add all the elements of the set into it:

The shuffle method now shuffles the contents of the modified list:

To pick the top 10 numbers from this new list, we call the subList method. The first parameter specifies the start index and the second parameter specifies the end index for the sublist:

We now sort this new set and print it to the user console as a set of top 10:

When we run the program, typical output would be as follows:

This simple example demonstrates the power of just a few algorithms. The Collections class provides several more of them. You are encouraged to go through the documentation to learn more about these algorithms.

TIP

The Collections framework also makes it easy to write your own custom algorithms that can operate on a generic collection.

Summary

In this chapter, you studied Java’s Collections framework. The framework defines interfaces, implementation classes, and algorithms that operate polymorphically on various collection classes. The interfaces provide a unified look to the various collection implementations. The classes provide the implementations of several popular data structures. You studied the use of lists, sets, queues, and maps. The Collections class in the framework provides implementations of several useful algorithms as the static methods of the class. These algorithms work polymorphically on the collection classes and therefore can be applied to them easily without considering the data type they operate upon. The Collections framework also makes it viable for you to develop custom algorithms that operate polymorphically on various collection classes.

The next three chapters cover another very important feature of Java language programming— threads. As you might have guessed, thread programming is a vast subject.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 16: Collections

Create new playlist

Sign In

Sign Up

Table of Contents for
16: Collections