Chapter 7. Structuring Data with Java

7.0 Introduction

Almost every application beyond “Hello World” needs to keep track of some structured data. A simple numeric problem might work with three or four numbers only, but most applications have groups of similar data items. A GUI-based application may need to keep track of a number of dialog windows. A personal information manager, or PIM, needs to keep track of a number of, well, persons. An operating system needs to keep track of who is allowed to log in, who is currently logged in, and what those users are doing. A library needs to keep track of who has books checked out and when they’re due. A network server may need to keep track of its active clients. A pattern emerges here, and it revolves around variations of what has traditionally been called data structuring.

There are data structures in the memory of a running program; there is structure in the data in a file on disk, and there is structure in the information stored in a database. In this chapter, we concentrate on the first aspect: in-memory data. We’ll cover the second aspect in Chapter 10; the third is out of scope for this book.

If you had to think about in-memory data, you might want to compare it to a collection of index cards in a filing box, or to a treasure hunt where each clue leads to the next. Or you might think of it like my desk—apparently scattered, but actually a very powerful collection filled with meaningful information. Each of these is a good analogy for a type of data structuring that Java provides. An array is a fixed-length linear collection of data items, like the card filing box: it can only hold so much, then it overflows. The treasure hunt is like a data structure called a linked list. The first release of Java had no standard linked list class, but you could write your own “traditional data structure” classes (and still can; you see a DIY Linked List implementation in Recipe 7.8). The complex collection represents Java’s Collection classes. A document entitled Collections Framework Overview, distributed with the Java Development Kit documentation (and stored therein as file …/docs/guide/collections/overview.html and # This URL was not brought forward past Java 8(?), do not update. online), provides a detailed discussion of the Collections Framework. The framework aspects of Java collections are summarized in Recipe 7.3.

Beware of some typographic issues. The word Arrays (in constant width font) refers to the class java.util.Arrays, but in the normal typeface, the word “arrays” is simply the plural of “array” (and will be found capitalized at the beginning of a sentence). Also, note that HashMap and HashSet follow the rule of having a “midcapital” at each word boundary, whereas the older Hashtable does not (the “t” is not capitalized).

The java.util package has become something of a catch-all over the years. Besides the legacy Date/Time API covered in Recipe 6.9, several other classes from java.util are not covered in this chapter. All the classes whose names begin with Abstract are, in fact, abstract, and we discuss their nonabstract subclasses. The StringTokenizer class is covered in Recipe 3.1. BitSet is used less frequently than some of the classes discussed here and is simple enough to learn on your own. BitSet stores the bits very compactly in memory, but because it predates the Collection API and wasn’t retrofitted, it doesn’t implement any of the standard collection interfaces. Also not covered here are EnumSet and EnumMap, specialized for efficient storage/retrieval of enums. These are newer than BitSet and do implement the modern collection interfaces.

We start our discussion of data structuring techniques with one of the oldest structures, the array. We discuss the overall structure of java.util’s Collections Framework. Then we’ll go through a variety of structuring techniques using classes from java.util.

7.1 Using Arrays for Data Structuring

Problem

You need to keep track of a fixed amount of information and retrieve it (usually) sequentially.

Solution

Use an array.

Discussion

Arrays can be used to hold any linear collection of data. The items in an array must all be of the same type. You can make an array of any primitive type or any object type. For arrays of primitive types, such as ints, booleans, etc., the data is stored in the array. For arrays of objects, a reference is stored in the array, so the normal rules of reference variables and casting apply. Note in particular that if the array is declared as Object[], object references of any type can be stored in it without casting, although a valid cast is required to take an Object reference out and use it as its original type. I’ll say a bit more on two-dimensional arrays in Recipe 7.17; otherwise, you should treat this as a review example:

public class Array1  {
    @SuppressWarnings("unused")
    public static void main(String[] argv) {
        int[] monthLen1;            // declare a reference
        monthLen1 = new int[12];        // construct it
        int[] monthLen2 = new int[12];    // short form
        // even shorter is this initializer form:
        int[] monthLen3 = {
                31, 28, 31, 30,
                31, 30, 31, 31,
                30, 31, 30, 31,
        };

        final int MAX = 10;
        LocalDate[] days = new LocalDate[MAX];
        for (int i=0; i<MAX; i++) {
            days[i] = LocalDate.of(2022, 02, i + 1);
        }

        // Two-Dimensional Arrays
        // Want a 10-by-24 array
        int[][] me = new int[10][];
        for (int i=0; i<10; i++)
            me[i] = new int[24];

        // Remember that an array has a ".length" attribute
        System.out.println(me.length);
        System.out.println(me[0].length);

    }
}

Arrays in Java work nicely. The type checking provides reasonable integrity, and array bounds are always checked by the runtime system, further contributing to reliability.

The only problem with arrays is: what if the array fills up and you still have data coming in? See Recipe 7.2.

7.2 Resizing an Array

Problem

The array filled up, and you got an ArrayIndexOutOfBoundsException.

Solution

Make the array bigger. Or, use an ArrayList.

Discussion

One approach is to allocate the array at a reasonable size to begin with, but if you find yourself with more data than will fit, reallocate a new, bigger array and copy the elements into it.1 Here is code that does so:

public class Array2  {
    public final static int INITIAL = 10,   1
        GROW_FACTOR = 2;                    2

    public static void main(String[] argv) {
        int nDates = 0;
        LocalDateTime[] dates = new LocalDateTime[INITIAL];
        StructureDemo source = new StructureDemo(21);
        LocalDateTime c;
        while ((c=source.getDate()) != null) {

            // if (nDates >= dates.length) {
            //     throw new RuntimeException(
            //         "Too Many Dates! Simplify your life!!");
            // }

            // better: reallocate, making data structure dynamic
            if (nDates >= dates.length) {
                LocalDateTime[] tmp =
                    new LocalDateTime[dates.length * GROW_FACTOR];
                System.arraycopy(dates, 0, tmp, 0, dates.length);
                dates = tmp;    // copies the array reference
                // old array will be garbage collected soon...
            }
            dates[nDates++] = c;
        }
        System.out.println("Final array size = " + dates.length);
    }
}
1

A good guess is necessary; know your data!

2

The growth factor is arbitary; 2 is a good value here but will continue to double exponentially. You might want to use a factor like 1.5, which would mean more allocations at the low end but less explosive growth. You need to manage this somehow!

This technique works reasonably well for simple or relatively small linear collections of data. For data with a more variable structure, you probably want to use a more dynamic approach, as in Recipe 7.4.

7.3 The Collections Framework

Problem

You’re having trouble keeping track of all these lists, sets, and iterators.

Solution

There’s a pattern to it. See Figure 7-1 and Table 7-1.

Discussion

List, Set, Map and Queue are the four fundamental data structures of the Collections Framework. List and Set are both sequences, with the difference that List preserves order and allows duplicate entries, whereas Set, true to the mathematical concept behind it, does not. Map is a key/value store, also known as a “hash,” a “dictionary,” or an “associative store.” Queues are, as the same suggests, structures that you can push into at one end and pull out from the other.

Figure 7-1 shows some of the important collection-based classes from package java.util. It is intenally not 100% complete due to space limitations.

See Also

The javadoc documentation on Collections, Arrays, List, Set, and the classes that implement them provides more details than there’s room for here. Table 7-1 may further help you to absorb the regularity of the Collections Framework.

Table 7-1. Java collections
Interfaces Resizable array Hashed table Linked list Balanced tree

Implementations

List

ArrayList, Vector

LinkedList

Set

HashSet

TreeSet

Map

HashMap, HashTable

TreeMap

Queue

+Deque+s, +BlockingQueue+s, etc

Figure 7-1 shows the relationships among several of these types.

jcb4 0701
Figure 7-1. The Collections Framework

Legend: Rectangles are interfaces; ovals classes. Solid lines are inheritance; dashed lines represent implements.

Note that Queue and its subtypes are treated in Chapter 16.

7.4 Like an Array, but More Dynamic

Problem

You don’t want to worry about storage reallocation (often because you don’t know how big the incoming dataset is going to be); you want a standard class to handle it for you. You want to store your data in any of the Collection classes defined in Chapter 7 with type safety, and without having to write downcasts when retrieving data from the collection.

Solution

Use a List implementation or one of the other Collections classes, along with Java’s Generic Types mechanism, declaring the Collection with a “type parameter” identifying the type of your data. The type parameter name appears in angle brackets after the declaration and instantiation.

Discussion

The first of the Collections classes we will discuss, ArrayList is a standard class from java.util that encapsulates the functionality of an array but allows it to expand automatically. You can just keep on adding things to it, and each addition behaves the same. If you watch really closely, you might notice a brief extra pause once in a while when adding objects as the ArrayList reallocates and copies. But you don’t have to think about it.

However, because ArrayList is a class and isn’t part of the syntax of Java, you can’t use Java’s array syntax; you must use methods to access the ArrayList’s data. It has methods to add objects, retrieve objects, find objects, and tell you how big the List is and how big it can become without having to reallocate (note that the ArrayList class is but one implementation of the List interface; more on that later). Like the other collection classes in java.util, ArrayList’s storing and retrieval methods were originally defined to have parameters and return values of java.lang.Object. Because Object is the ancestor of every defined type, you can store objects of any type in a List (or any collection) and cast it when retrieving it. If you need to store a small number of built-ins (like int, float, etc.) into a collection containing other data, use the appropriate wrapper class (see the Introduction to Chapter 5). To store booleans, either store them directly in a java.util.BitSet (see the online documentation) or store them in a List using the Boolean wrapper class.

Because Object is usually too general for accurate work, all modern versions of Java provide the “generic types” mechanism. Nowadays, you declare an ArrayList (or other collection) with a “type parameter” in angle brackets, and the parameters and returns are treated as being of that type by the compiler, ensuring that objects of the wrong type don’t make it into your collections, and avoiding the need to write casts when retrieving objects. For example, to declare an ArrayList for holding String object references:

List<String> myList = new ArrayList<>();

It is a good practice to declare the variable as the interface type List, even though you are defining it (constructing it) as an ArrayList. This makes it easier to change from one List implementation to another, and avoids accidentally depending on an implementation-specific method not in the List interface (which would also make it harder to change the implementation).

The <> in the definition part is a vestige of legacy Java versions, in which you had to repeat the type definition, so you’d write new ArrayList<String>() in that example. Nowadays just use <> (as in the example) to indicate that you want the type copied from the declaration. The “<>” is called the “diamond operator.”

As of Java 13, you can simplify by use of the new var keyword, though you lose the benefit of the List<T> declaration syntax:

var myList = new ArrayList<String>();

Table 7-2 shows some of the most important methods of the List interface, which is implemented by ArrayList and other List implementations. This means that the exact same methods can be used with the older Vector class and several other implementing classes. You’d just have to change the name used in the constructor call.

Table 7-2. Common List<T> Methods
Method signature Usage

add(T o)

Add the given element at the end

add(int i, T o)

Insert the given element at the specified position

clear( )

Remove all element references from the Collection

contains(T o)

True if the List contains the given object

forEach(lambda)

Perform the lambda for each element

get(int i)

Return the object reference at the specified position

indexOf(T o)

Return the index where the given object is found, or –1

of(T t, …)

Create a list from multiple objects

remove(T o), remove(int i)

Remove an object by reference or by position

toArray( )

Return an array containing the objects in the Collection

ArrayListDemo stores data in an ArrayList and retrieves it for processing:

public class ArrayListDemo {
    public static void main(String[] argv) {
        List<LocalDate> editions = new ArrayList<>();

        // Add lots of elements to the ArrayList...
        editions.add(LocalDate.of(2001, 06, 01));
        editions.add(LocalDate.of(2004, 06, 01));
        editions.add(LocalDate.of(2014, 06, 20));

        // Use old-style 'for' loop to get index number.
        System.out.println("Retrieving by index:");
        for (int i = 0; i<editions.size(); i++) {
            System.out.printf("Edition %d was %s
", i + 1, editions.get(i));
        }
        // Use normal 'for' loop for simpler access
        System.out.println("Retrieving by Iterable:");
        for (LocalDate dt : editions) {
            System.out.println("Edition " + dt);
        }

    }
}

The older Vector and Hashtable classes predate the Collections framework, so they offer additional methods with different names: Vector provides addElement() and elementAt(). You may still run across these in legacy code, but you should use the Collection methods add() and get() instead. Another difference is that the methods of Vector are synchronized, meaning that they can be accessed safely from multiple threads (see Recipe 16.5). This does mean more overhead, though, so for single-threaded access it is faster to use an ArrayList (see timing results in Recipe 7.19).

There are various conversion methods. Table 7-2 mentions toArray(), which will expose the contents of a List as an array. The List interface in Java 9+ features a static of() method, which converts in the other direction, from an array into a List. In conjunction with the Variable Arguments feature of modern Java, you can create and populate a list in one call to List.of(). For example:

List<String> firstNames = List.of("Robin", "Jaime", "Joey");

In legacy code that you will find in older apps and in web searches, Arrays.asList() provided this functionality, so you will come across code like this:

List<String> lastNames = Arrays.asList("Smith", "Jones", "MacKenzie");
// or even
List<String> lastNames =
    Arrays.asList(new String[]{"Smith", "Jones", "MacKenzie"});

Java does indeed get less verbose as time goes by!

You can still instantiate classes such as ArrayList without using a specific type. In this case, you will get a compiler warning, and the class will behave as in the old days— that is, the objects returned from a Collection or Iterator will be of type java.lang.Object and must be downcast before you can call any class-specific methods or use them in any application-specific method calls.

As a further example, consider the Map interface mentioned in Chapter 7. A Map requires a key and a value in its put() method. A Map, therefore, has two parameterized types. To set up a Map whose keys are Person objects and whose values are Address objects (assuming these two classes exist in your application), you could define it as:

Map<Person, Address> addressMap = new HashMap<>();

This Map expects a Person as its key and an Address as its value in the put() method; the get() method returns an Address object, the keySet() method returns Set<Person> (i.e., a Set specialized for Person objects), Map.of(key,value,key,value…) similar to List.of() (but limited to 10 pairs), and so on.

See Also

Although the generics avoid your having to write downcasts, the casts still occur at runtime; they are just provided by the compiler. The compiler techniques used in compiling these new constructs in a backward-compatible way include erasure and bridging, topics discussed in the O’Reilly book Java Generics by Maurice Naftalin and Philip Wadler.

7.5 Using Generic Types In Your Own Class

Problem

You wish to define your own container classes using the Generic Type mechanism to avoid needless casting.

Solution

Define a class using < TypeName > where the container type is declared, and TypeName where it is used.

Discussion

Consider the very simple Stack class in Example 7-1. (We discuss the nature and uses of stack classes in Recipe 7.16).

This version has been parameterized to take a type whose local name is T. This type T will be the type of the argument of the push() method, the return type of the pop() method, and so on. Because of this return type—more specific than the Object return type of the original Collections—the return value from pop() does not need to be downcasted. All containers in the Collections Framework (java.util) are parameterized similarly.

Example 7-1. main/src/main/java/structure/MyStack.java
public class MyStack<T> implements SimpleStack<T> {

    private int depth = 0;
    public static final int DEFAULT_INITIAL = 10;
    private T[] stack;

    public MyStack() {
        this(DEFAULT_INITIAL);
    }

    public MyStack(int howBig) {
        if (howBig <= 0) {
            throw new IllegalArgumentException(
            howBig + " must be positive, but was " + howBig);
        }
        stack = (T[])new Object[howBig];
    }

    @Override
    public boolean empty() {
        return depth == 0;
    }

    /** push - add an element onto the stack */
    @Override
    public void push(T obj) {
        // Could check capacity and expand
        stack[depth++] = obj;
    }

    /* pop - return and remove the top element */
    @Override
    public T pop() {
        --depth;
        T tmp = stack[depth];
        stack[depth] = null;
        return tmp;
    }

    /** peek - return the top element but don't remove it */
    @Override
    public T peek() {
        if (depth == 0) {
            return null;
        }
        return stack[depth-1];
    }

    public boolean hasNext() {
        return depth > 0;
    }

    public boolean hasRoom() {
        return depth < stack.length;
    }

    public int getStackDepth() {
        return depth;
    }
}

The association of a particular type is done at the time the class is instantiated. For example, to instantiate a MyStack specialized for holding BankAccount objects, you would need to code only the following:

MyStack<BankAccount> theAccounts = new MyStack<>( );

If you don’t provide a type parameter T, this collection, like the ones in java.util, will behave as they did in the days before Generic Collections—accepting input arguments of any type, returning java.lang.Object from getter methods, and requiring downcasting—as their default, backward-compatible behavior. Example 7-2 shows a program that creates two instances of MyStack, one specialized for Strings and one left general. The general one, called ms2, is loaded up with the same two String objects as ms1 but also includes a Date object. The printing code is now “broken,” because it will throw a ClassCastException: a Date is not a String. I handle this case specially for pedantic purposes: it is illustrative of the kinds of errors you can get into when using nonparameterized container classes.

Example 7-2. main/src/main/java/structure/MyStackDemo.java
public class MyStackDemo {

    @SuppressWarnings({"rawtypes","unchecked"})
    public static void main(String[] args) {
        MyStack<String> ms1 = new MyStack<>();
        ms1.push("billg");
        ms1.push("scottm");

        while (ms1.hasNext()) {
            String name = ms1.pop();
            System.out.println(name);
        }

        // Old way of using Collections: not type safe.
        // DO NOT GENERICIZE THIS
        MyStack ms2 = new MyStack();
        ms2.push("billg");               // EXPECT WARNING
        ms2.push("scottm");              // EXPECT WARNING
        ms2.push(new java.util.Date());  // EXPECT WARNING

        // Show that it is broken
        try {
            String bad = (String)ms2.pop();
            System.err.println("Didn't get expected exception, popped " + bad);
        } catch (ClassCastException ex) {
            System.out.println("Did get expected exception.");
        }

        // Removed the brokenness, print rest of it.
        while (ms2.hasNext()) {
            String name = (String)ms2.pop();
            System.out.println(name);
        }
    }
}

Because of this potential for error, the compiler warns that you have unchecked raw types. Like the deprecation warnings discussed in Recipe 1.9, by default, these warnings are not printed in detail by the javac compiler (they will appear in most IDEs). You ask for them, with the rather lengthy option -Xlint:unchecked:

C:> javac -source 1.5 structure/MyStackDemo.java
Note: MyStackDemo.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
C:> javac -source 1.5 -Xlint:unchecked structure/MyStackDemo.java
MyStackDemo.java:14: warning: unchecked call to push(T) as a member of the raw
type MyStack
                ms2.push("billg");
                   ^
MyStackDemo.java:15: warning: unchecked call to push(T) as a member of the raw
type MyStack
                ms2.push("scottm");
                   ^
MyStackDemo.java:16: warning: unchecked call to push(T) as a member of the raw
type MyStack
                ms2.push(new java.util.Date( ));
                   ^
3 warnings
C:>

I say more about the development and evolution of MyStack in Recipe 7.16.

7.6 How Shall I Iterate Thee? Let Me Enumerate the Ways

Problem

You need to iterate over some structured data.

Solution

Java provides many ways to iterate over collections of data. In newest-first order:

  • Stream.forEach() method (Java 8)

  • Iterable.forEach() method (Java 8)

  • Java “foreach” loop (Java 5)

  • java.util.Iterator (Java 2)

  • Three-part for loop

  • “while” loop

  • Enumeration

Pick one and use it. Or learn them all and save!

Discussion

A few words on each of the iteration methods are given here. Note that the first few are the most common.

Stream.forEach method (Java 8)

The Stream mechanism introduced as part of Java’s Functional Programming provides one of two most-recent ways of iterating, Stream.forEach(), and is discussed in Recipe 9.4. For now, here’s a quick example, using the BufferedReader method lines() that returns a Stream:

$ jshell
jshell> import java.io.*;
jshell> BufferedReader is = new BufferedReader(new FileReader("/home/ian/.profile"));
is ==> java.io.BufferedReader@58651fd0
jshell> is.lines().forEach(System.out::println)
... prints the lines of the file ...

Iterable.forEach method (Java 8)

The other recent iteration technique is the Iterable.forEach() method, added in Java 8. This method can be called on any Iterable (unfortunately, the array class does not yet implement Iterable), and takes one argument implementing the functional interface java.util.function.Consumer. Functional Interfaces are discussed in Chapter 9, but here is one example:

public class IterableForEach {

    public static void main(String[] args) {
        Collection<String> c =                    1
                List.of("One", "Two", "Three");   2
        c.forEach(s -> System.out.println(s));    3
    }
}
1

Declare a Collection (a Collection is an Iterable).

2

Populate it with Arrays.of() with an array or sequence of object (see Recipe 7.4 for how this arbitrary argument list becomes an array).

3

Invoke the collection’s forEach() method, passing a lambda expression (see Chapter 9 for a discussion of how s→System.out.println(s) gets mapped to a Consumer interface implementation without your even having to import this interface).

This style of iteration—sometimes called “internal iteration”—inverts the control from the traditional for loop; the collection is in charge of when and how the iteration works.

Tip

Both Stream.forEach and Iterable.forEach() take one argument, of type java.util.function.Consumer, so they work largely the same way. This is intentional.

Java “foreach” loop (Java 5)

The “foreach” loop syntax is:

for (Type var : Iterable<Type>) {
	// do something with "var"
}

This is probably the most common style of loop in modern Java code. The Iterable can be an array, or anything that implements Iterable (the Collection implementations are included in this).

This style is used throughout the book. As well, many third-party frameworks/libraries provide their own types that implement Iterable for use with the for loop.

java.util.Iterator (Java 2)

The older Iterator interface has three methods:

public interface java.util.Iterator<E> {
  public abstract boolean hasNext();
  public abstract E next();
  public default void remove();
}

It was once common to write code like this, which you’ll still find occasionally in older code:

Iterator it = ...; // legacy code; might not even have type parameter
while (it.hasNext()) {
	(MyDataType) c = it.next();
	// Do something with c
}

The remove() method throws an UnsupportedOperationException if called on a read-only collection. In conjunction with Streams and default methods, there is now a fourth method:

public default void forEachRemaining(java.util.function.Consumer<? super E>);

Three-part for loop

This is the traditional for loop invented by Dennis Ritchie in the early 1970s for the C language. The syntax is:

for (init; test; change) {
	// do something
}

Its most common form is with an int “index variable” or “loop variable”:

MyDataType[] data = ...
for (int i = 0; i < data.length; i++)
	MyDataType d = data[i];
	// do something with it
}

while loop

A while loop executes its loop body as long as (“while”) the test condition is true. Commonly used in conjunction with the Enumeration or Iterator, as in

Iterator<MyData> iterator = ...
while (iterator.hasNext()) {
	MyData md = iterator.next();
	//
}

Enumeration

An Enumeration is like an Iterator (shown earlier), but it lacks the remove() method, and the control methods have longer names—for example, hasMoreElements() and nextElement(). There is little to recommend implementing Enumeration in new code.

7.7 Eschewing Duplicates with a Set

Problem

You want a structure that will avoid storing duplicates.

Solution

Use a Set implementation instead of a List (e.g., Set<String> myNames = new HashSet<>()).

Discussion

The Set interface is similar to the List interface,2 with methods like add(), remove(), contains(), size(), isEmpty(), and the like. The differences are that it does not preserve order; instead, it enforces uniqueness—if you add the “same” item (as considered by its equals() method) twice or more, it will only be present once in the set. For this reason, the index-based methods such as add(int, Object) and get(int), are missing from the Set implementation: you might “know” that you’ve added seven objects but only five of those were unique, so calling get() to retrieve the sixth one would have to throw an ArrayIndexOutOfBoundsException! Better that you don’t think of a Set as being indexed.

Warning

As the Java 7 Set document states: “Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set. A special case of this prohibition is that it is not permissible for a set to contain itself as an element.”

This code shows a duplicate entry being made to a Set, which will contain only one copy of the strong "One":.

        Set<String> hashSet = new HashSet<>();
        hashSet.add("One");
        hashSet.add("Two");
        hashSet.add("One"); // DUPLICATE
        hashSet.add("Three");
        hashSet.forEach(s -> System.out.println(s));

Not surprisingly, only the three distinct values are printed.

If you need a sorted Set, there is in fact a SortedSet interface, of which the most common implementation is a TreeSet; see a TreeSet example in Recipe 7.12.

As with Lists, the Set interface offers the of method as of Java 9:

Set<Double> nums = Set.of(Math.PI, 22D/7, Math.E);
Set<String> firstNames = Set.of("Robin", "Jaime", "Joey");

7.8 Structuring Data in a Linked List

Problem

Your data isn’t suitable for use in an array.

Solution

Use a Linked List; Java’s LinkedList class is quite suitable.

Discussion

Anybody who’s taken Computer Science 101 (or any computer science course) should be familiar with data structuring, such as linked lists, binary trees, and the like. Though this is not the place to discuss the details of such things, I’ll give a brief illustration of the common linked list. A linked list is commonly used when you have an unpredictably large number of data items, you wish to allocate just the right amount of storage, and usually want to access them in the same order that you created them. Figure 7-2 is a diagram showing the normal arrangement.

jcb4 0702
Figure 7-2. Linked list structure

Of course, the Collections API provides a LinkedList class; here is a simple program that uses it:

public class LinkedListDemo {
    public static void main(String[] argv) {
        System.out.println("Here is a demo of Java's LinkedList class");
        LinkedList<String> l = new LinkedList<>();
        l.add(new Object().toString());
        l.add("Hello");
        l.add("end of the list");

        System.out.println("Here is a list of all the elements");
        l.forEach(o ->
            System.out.println("Next element: " + o));

        if (l.indexOf("Hello") < 0)
            System.err.println("Lookup does not work");
        else
            System.err.println("Lookup works");

        // Now, for added fun, let's walk the linked list backwards.
        ListIterator<String> li = l.listIterator();
        while (li.hasPrevious()) {
            System.out.println("Back to: " + li.previous());
        }
    }
}

The ListIterator used here is a subinterface of Iterator, which was discussed in Recipe 7.6.

Just to show how this kind of list works, here is code that shows part of the implemention of a simple linked list:

public class LinkList<T> implements List<T> {

    /* A TNode stores one node or item in a Linked List */
    private static class TNode<T> {
        private TNode<T> next;
        private T data;
        TNode(T o, TNode<T> next) {
            data = o;
            this.next = next;
        }
        @Override
        public String toString() {
            return String.format("TNode: data='%s', next='%d'", data,
                    next == null ? 0 : next.hashCode());
        }
    }

    private boolean DIAGNOSTIC = false;

    /** The root or first TNode in the list; is a dummy pointer,
     * so its data will always be null. Simpler this way.
     */
    protected TNode<T> first;
    /**
     * For certain optimizations: A second ref to the last TNode in the list;
     * initially == first; always valid (never null), always has next == null.
     */
    protected TNode<T> last;

    /** Construct a LinkList: initialize the first and last nodes */
    public LinkList() {
        clear();
    }

    /** Construct a LinkList given another Collection.
     * This method is recommended by the general contract of List.
     */
    public LinkList(Collection<T> c) {
        this();
        addAll(c);
    }

    /** Set the List (back) to its initial state.
     * Any references held will be discarded.
     */
    @Override
    public void clear() {
        first = new TNode<T>(null, null);
        last = first;
    }

    /** Add one object to the end of the list. Update the "next"
     * reference in the previous end, to refer to the new node.
     * Update "last" to refer to the new node.
     */
    @Override
    public boolean add(T o) {
        last.next = new TNode<T>(o, null);
        last = last.next;
        return true;
    }

    @Override
    public void add(int where, T o) {
        TNode<T> t = first;
        for (int i=0; i<=where; i++) {
            t = t.next;
            if (t == null) {
                throw new IndexOutOfBoundsException(
                    "'add(n,T) went off end of list");
            }
            if (DIAGNOSTIC) {
                System.out.printf("in add(int,T): i = %d, t = %s%n", i, t);
            }
        }
        if (DIAGNOSTIC) {
            System.out.printf("in add(int,T): to insert before %s
", t);
        }
        final TNode<T> nn = new TNode<>(o, t.next);
        t.next = nn;
        if (DIAGNOSTIC) {
            System.out.printf("add(%d,%s)
", where, o);
            dump("add(int,T)");
        }
    }

    @Override
    public boolean addAll(Collection<? extends T> c) {
        c.forEach(o -> add((T) o));
        return false;
    }

    @Override
    public boolean addAll(int i, Collection<? extends T> c) {
        AtomicInteger j = new AtomicInteger(i);
        c.forEach(o -> { add(j.getAndIncrement(), o); });
        return true;
    }

    @Override
    public boolean contains(Object o) {
        TNode<T> t = first;
        while ((t = t.next) != null) {
            if (t.data.equals(o)) {
                return true;
            }
        }
        return false;
    }

    @Override
    public T get(int where) {
        TNode<T> t = first;
        int i=0;
        // If we get to the end of list before 'where', error out
        while (i++<=where) {
            if (t.next == null) {
                throw new IndexOutOfBoundsException();
            }
            t = t.next;
        }
        return t.data;
    }

    @Override
    public boolean isEmpty() {
        return first == last;
    }

    public Iterator<T> iterator() {
        return new Iterator<T>() {
            final int size = size();
            int n = 0;
            TNode<T> t = first;
            /**
             * Two cases in which next == null:
             * 1) The list is empty, we are at first
             * 2) The list is not empty, we are at last.
             */
            public boolean hasNext() {
                return n < size;
            }

            public T next() {
                if (t == first) {
                    t = t.next;
                }
                TNode<T> result = t;
                t = t.next;
                ++n;
                return result.data;
            }
            public void remove() {
                throw new UnsupportedOperationException("remove");
            }
        };
    }

    @Override
    public boolean remove(Object o) {
        TNode<T> p = first, prev = null;
        while (p != null) {
            if (p.data == o) {
                prev.next = p.next;
                return true;
            }
            prev = p; p = p.next;
        }
        return false;
    }

    @Override
    public T set(int i, T o) {
        TNode<T> tmp = find(i);
        tmp.data = o;
        return o;
    }

    @Override
    public int size() {
        TNode<T> t = first;
        int i;
        for (i=0; ; i++) {
            if (t == null)
                break;
            t = t.next;
        }
        return i - 1;    // subtract one for mandatory head node.
    }

    @SuppressWarnings("unchecked")
    public T[] toArray(Object[] data) {
        // First is an empty anchor, start at its next
        TNode<T> p = first.next;
        for (int i = 0; p != null && i < data.length; i++) {
            data[i] = p.data;
            p = p.next;
        }
        return (T[]) data;
    }

    public Object[] toArray() {
        Object[] data = new Object[size()];
        return toArray(data);
    }
Warning

This is just to show how the implementation of a linked list might work. Do not use the simple LinkList class shown here; use the real one, java.util.LinkedList, shown in action in the first example.

7.9 Mapping with Hashtable and HashMap

Problem

You need a one-way mapping from one data item to another.

Solution

Use a HashMap

Discussion

HashMap provides a one-way mapping from one set of object references to another. They are completely general purpose. I’ve used them to map from Swing push buttons to the URL that is to be opened when the button is pushed, to map names to addresses, and to implement a simple in-memory cache in a web server. You can map from anything to anything. In the following example, we map from company names to addresses; the addresses here are String objects, but in real life they’d probably be Address objects:

public class HashMapDemo {

    public static void main(String[] argv) {

        // Construct and load the hash. This simulates loading a
        // database or reading from a file, or wherever the data is.

        Map<String,String> map = new HashMap<String,String>();

        // The hash maps from company name to address.
        // In real life this might map to an Address object...
        map.put("Adobe", "Mountain View, CA");
        map.put("IBM", "White Plains, NY");
        map.put("Learning Tree", "Los Angeles, CA");
        map.put("Microsoft", "Redmond, WA");
        map.put("Netscape", "Mountain View, CA");
        map.put("O'Reilly", "Sebastopol, CA");
        map.put("Sun", "Mountain View, CA");

        // Two versions of the "retrieval" phase.
        // Version 1: get one pair's value given its key
        // (presumably the key would really come from user input):
        String queryString = "O'Reilly";
        System.out.println("You asked about " + queryString + ".");
        String resultString = map.get(queryString);
        System.out.println("They are located in: " + resultString);
        System.out.println();

        // Version 2: get ALL the keys and values
        // (maybe to print a report, or to save to disk)
        for( String key : map.keySet()) {
            System.out.println("Key " + key +
                "; Value " + map.get(key));
        }

        // Version 3: Same but using a Map.Entry lambda
        map.entrySet().forEach(mE ->
            System.out.println("Key + " + mE.getKey()+
                "; Value " +mE.getValue()));
    }
}

For this version we used both a for loop and a forEach() loop; the latter uses the return from entrySet(), a set of Map.Entry, each of which contains one key and one value (this may be faster on large maps because it avoids going back into the map to get the value each time through the loop). If you are modifying the list as you are going through it (e.g., removing elements), either inside the loop or in another thread, then these forms will fail with a ConcurrentModificationException. You then need to use the Iterator explicitly to control the loop:

        // Version 2: get ALL the keys and values
        // with concurrent modification
        Iterator<String> it = map.keySet().iterator();
        while (it.hasNext()) {
            String key = it.next();
            if (key.equals("Sun") || key.equals("Netscape")) {
                it.remove();
                continue;
            }
            System.out.println("Company " + key + "; " +
                "Address " + map.get(key));
        }

A more functional (see Chapter 9) way of writing the removal, not involving explicit looping, would be

        // Alternate to just do the removals, without explicit looping
        map.keySet().removeIf(key -> Set.of("Netscape", "Sun").contains(key));
        // or
        map .entrySet()
            .removeIf(entry -> Set.of("Netscape", "Sun")
            .contains(entry.getKey()));
        map.entrySet().forEach(System.out::println);
Tip

HashMap methods are not synchronized. The older and similar Hashtable methods are synchronized, for use with multiple threads.

7.10 Storing Strings in Properties and Preferences

Problem

You need to store keys and values that are both strings, possibly with persistence across runs of a program—for example, program customization.

Solution

Use a java.util.prefs.Preferences object or a java.util.Properties object.

Discussion

Here are three approaches to customization based on the user’s environment. Java offers Preferences and Properties for cross-platform customizations.

Preferences

The Preferences class java.util.prefs.Preferences provides an easy-to-use mechanism for storing user customizations in a system-dependent way (which might mean dot files on Unix, a preferences file on the Mac, or the registry on Windows systems). This class provides a hierarchical set of nodes representing a user’s preferences. Data is stored in the system-dependent storage format but can also be exported to or imported from an XML format. Here is a simple demonstration of Preferences:

public class PrefsDemo {

    public static void main(String[] args) throws Exception {

        // Setup the Preferences for this application, by class.
        Preferences prefs = Preferences.userNodeForPackage(PrefsDemo.class);

        // Retrieve some preferences previously stored, with defaults in case
        // this is the first run.
        String text    = prefs.get("textFontName", "lucida-bright");
        String display = prefs.get("displayFontName", "lucida-blackletter");
        System.out.println(text);
        System.out.println(display);

        // Assume the user chose new preference values: Store them back.
        prefs.put("textFontName", "times-roman");
        prefs.put("displayFontName", "helvetica");

        // Toss in a couple more values for the curious who want to look
        // at how Preferences values are actually stored.
        Preferences child = prefs.node("a/b");
        child.putInt("meaning", 42);
        child.putDouble("pi", Math.PI);

        // And dump the subtree from our first node on down, in XML.
        prefs.exportSubtree(System.out);
    }
}

When you run the PrefsDemo program the first time, of course, it doesn’t find any settings, so the calls to preferences.get( ) return the default values:

$ java -cp target/classes structure.PrefsDemo
lucida-bright
lucida-blackletter
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE preferences SYSTEM "http://java.sun.com/dtd/preferences.dtd">
<preferences EXTERNAL_XML_VERSION="1.0">
  <root type="user">
    <map/>
    <node name="structure">
      <map>
        <entry key="displayFontName" value="helvetica"/>
        <entry key="textFontName" value="times-roman"/>
      </map>
      <node name="a">
        <map/>
        <node name="b">
          <map>
            <entry key="meaning" value="42"/>
            <entry key="pi" value="3.141592653589793"/>
          </map>
        </node>
      </node>
    </node>
  </root>
</preferences>

On subsequent runs, it finds and returns the “user provided” settings (I’ve elided the XML output from the second run, because most of the XML output is the same):

> java structure.PrefsDemo
times-roman
helvetica
...
>

Properties

The Properties class is similar to a HashMap or Hashtable (it extends the latter), but with methods defined specifically for string storage and retrieval and for loading/saving. Properties objects are used throughout Java, for everything from setting the platform font names to customizing user applications into different Locale settings as part of internationalization and localization. When stored on disk, a Properties object looks just like a series of name=value assignments, with optional comments. Comments are added when you edit a Properties file by hand, ignored when the Properties object reads itself, and lost when you ask the Properties object to save itself to disk. Here is an example of a Properties file that could be used to internationalize the menus in a GUI-based program:

# Default properties for MenuIntl
program.title=Demonstrate I18N (MenuIntl)
program.message=Welcome to an English-localized Java Program
#
# The File Menu
#
file.label=File Menu
file.new.label=New File
file.new.key=N
file.open.label=Open...
file.open.key=O
file.save.label=Save
file.save.key=S
file.exit.label=Exit
file.exit.key=Q

Here is another example, showing some personalization properties:

name=Ian Darwin
favorite_popsicle=cherry
favorite_rock group=Fleetwood Mac
favorite_programming_language=Java
pencil_color=green

A Properties object can be loaded from a file. The rules are flexible: either =, :, or spaces can be used after a key name and its values. Spaces after a nonspace character are ignored in the key. A backslash can be used to continue lines or to escape other characters. Comment lines may begin with either # or !. Thus, a Properties file containing the previous items, if prepared by hand, could look like this:

# Here is a list of properties
! first, my name
name Ian Darwin
favorite_popsicle = cherry
favorite_rock group 
 Fleetwood Mac
favorite_programming_language=Java
pencil_color green

Fortunately, when a Properties object writes itself to a file, it uses the simple format:

key=value

Here is an example of a program that creates a Properties object and adds into it the list of companies and their locations from Recipe 7.9. It then loads additional properties from disk. To simplify the I/O processing, the program assumes that the Properties file to be loaded is contained in the standard input, as would be done using a command-line redirection on either Unix or DOS:

public class PropsCompanies {

    public static void main(String[] argv) throws java.io.IOException {

        Properties props = new Properties();

        // Get my data
        props.put("Adobe", "Mountain View, CA");
        props.put("IBM", "White Plains, NY");
        props.put("Learning Tree", "Los Angeles, CA");
        props.put("Microsoft", "Redmond, WA");
        props.put("Netscape", "Mountain View, CA");
        props.put("O'Reilly", "Sebastopol, CA");
        props.put("Sun", "Mountain View, CA");

        // Now load additional properties
        props.load(System.in);

        // List merged properties, using System.out
        props.list(System.out);
    }
}

Running it as:

java structure.PropsCompanies < PropsDemo.out

produces the following output in the file PropsDemo.out:

-- listing properties --
Sony=Japan
Sun=Mountain View, CA
IBM=White Plains, NY
Netscape=Mountain View, CA
Nippon_Kogaku=Japan
Acorn=United Kingdom
Adobe=Mountain View, CA
Ericsson=Sweden
O'Reilly & Associates=Sebastopol, CA
Learning Tree=Los Angeles, CA

In case you didn’t notice in either the HashMap or the Properties examples, the order in which the outputs appear in these examples is neither sorted nor in the same order we put them in. The hashing classes and the Properties subclass make no claim about the order in which objects are retrieved. If you need them sorted, see Recipe 7.11.

As a convenient shortcut, my FileProperties class includes a constructor that takes a filename, as in:

import com.darwinsys.util.FileProperties;
...
Properties p = new FileProperties("PropsDemo.out");

Note that constructing a FileProperties object causes it to be loaded, and therefore the constructor may throw a checked exception of class IOException.

7.11 Sorting a Collection

Problem

You put your data into a collection in random order or used a Properties object that doesn’t preserve the order, and now you want it sorted.

Solution

Use the static method Arrays.sort() or Collections.sort(), optionally providing a Comparator.

Discussion

If your data is in an array, then you can sort it using the static sort() method of the Arrays utility class. If it is in a Collection, you can use the static sort() method of the Collections class. Here is a set of strings being sorted in-place in an Array:

public class SortArray {
    public static void main(String[] unused) {
        String[] strings = {
            "painful",
            "mainly",
            "gaining",
            "raindrops"
        };
        Arrays.sort(strings);
        for (int i=0; i<strings.length; i++) {
            System.out.println(strings[i]);
        }
    }
}

What if the default sort order isn’t what you want? Well, you can create an object that implements the Comparator<T> interface and pass that as the second argument to sort. Fortunately, for the most common ordering next to the default, you don’t have to: a public constant String.CASE_INSENSITIVE_ORDER can be passed as this second argument. The String class defines it as “a Comparator<String> that orders String objects as by compareToIgnoreCase.” But if you need something fancier, you probably need to write a Comparator<T>. In some cases you may be able to use the Comparator.comparing() method and other static methods on Comparator to create a custom comparator without having to create a class. Suppose that, for some strange reason, you need to sort strings using all but the first character of the string. One way to do this would be to write this Comparator<String>:

/** Comparator for comparing strings ignoring first character.
 */
public class SubstringComparator implements Comparator<String> {
    @Override
    public int compare(String s1, String s2) {
        s1 = s1.substring(1);
        s2 = s2.substring(1);
        return s1.compareTo(s2);
        // or, more concisely:
        // return s1.substring(1).compareTo(s2.substring(1));
    }
}

Using it is just a matter of passing it as the Comparator argument to the correct form of sort(), as shown here:

public class SubstringComparatorDemo {
    public static void main(String[] unused) {
        String[] strings = {
            "painful",
            "mainly",
            "gaining",
            "raindrops"
        };
        Arrays.sort(strings);
        dump(strings, "Using Default Sort");
        Arrays.sort(strings, new SubstringComparator());
        dump(strings, "Using SubstringComparator");

        // tag::functional[]
        System.out.println("Functional approach:");
        Arrays.stream(strings)
            .sorted(Comparator.comparing(s->s.substring(1)))
            .forEach(System.out::println);
        // end::functional[]
    }

    static void dump(String[] args, String title) {
        System.out.println(title);
        for (String s : args)
            System.out.println(s);
    }
}

Again, a more functional (see Chapter 9) way of writing this might be:

        System.out.println("Functional approach:");
        Arrays.stream(strings)
            .sorted(Comparator.comparing(s->s.substring(1)))
            .forEach(System.out::println);

Here is the output of running it:

$ java structure.SubstrCompDemo
Using Default Sort
gaining
mainly
painful
raindrops
Using SubstringComparator
raindrops
painful
gaining
mainly

And this is all as it should be.

On the other hand, you may be writing a class and want to build in the comparison functionality so that you don’t always have to remember to pass the Comparator with it. In this case, you can directly implement the java.lang.Comparable interface, as is done by many classes in the standard API. These include String class; the wrapper classes Byte, Character, Double, Float, Long, Short, and Integer, BigInteger, and BigDecimal from java.math; most objects in the date/time API in java.time; and java.text.CollationKey all implement this interface. Arrays or Collections of these types can be sorted without providing a Comparator. Classes that implement Comparable are said to have a “natural” ordering. The documentation strongly recommends that a class’s natural ordering be consistent with its equals() method, and it is consistent with equals() if and only if e1.compareTo((Object)e2)==0 has the same Boolean value as e1.equals((Object)e2) for every instance, e1 and e2 of the given class. This means that if you implement Comparable, you should also implement equals(), and the logic of equals() should be consistent with the logic of the compareTo() method. If you implement equals(), incidentally, you also should implement +hashCode() (as discussed in “hashCode() and equals()”). Here, for example, is part of the appointment class Appt from a hypothetical scheduling program. The class has a LocalDate date variable and a LocalTime time variable; the latter may be null (for e.g., an all-day appointment or a TODO item); this complicates the compareTo() function a little.

// public class Appt implements Comparable {
    // Much code and variables omitted - see online version
    //-----------------------------------------------------------------
    //    METHODS - COMPARISON
    //-----------------------------------------------------------------
    /** compareTo method, from Comparable interface.
     * Compare this Appointment against another, for purposes of sorting.
     * <P>Only date and time, then text, participate, not repetition!
     * (Repetition has to do with recurring events, e.g.,
     *  "Meeting every Tuesday at 9").
     * This methods is consistent with equals().
     * @return -1 if this<a2, +1 if this>a2, else 0.
     */
    @Override
    public int compareTo(Appt a2) {
        // If dates not same, trigger on their comparison
        int dateComp = date.compareTo(a2.date);
        if (dateComp != 0)
            return dateComp;
        // Same date. If times not same, trigger on their comparison
        if (time != null && a2.time != null) {
            // Neither time is null
            int timeComp = time.compareTo(a2.time);
            if (timeComp != 0)
                return timeComp;
        } else /* At least one time is null */ {
            if (time == null && a2.time != null) {
                return -1; // All-day appts sort low to appear first
            } else if (time != null && a2.time == null)
                return +1;
                // else both have no time set, so carry on,
        }
        // Same date & time, trigger on text
        return text.compareTo(a2.text);
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((date == null) ? 0 : date.hashCode());
        result = prime * result + ((text == null) ? 0 : text.hashCode());
        result = prime * result + ((time == null) ? 0 : time.hashCode());
        return result;
    }

    @Override
    public boolean equals(Object o2) {
        if (this == o2)
            return true;
        if (o2.getClass() != Appt.class)
            return false;
        Appt a2 = (Appt) o2;
        if (!date.equals(a2.date))
            return false;
        if (time != null && !time.equals(a2.time))
            return false;
        return text.equals(a2.text);
    }

    /** Return a String representation of this Appt.
     * Output is intended for debugging, not presentation!
     */
    @Override
    public String toString() {
        var sb = new StringBuilder();
        sb.append(date).append(' ');
        if (time != null) {
            sb.append(time.getHour())
            .append(':')
            .append(time.getMinute())
            .append(' ');
        } else {
            sb.append("(All day)").append(' ');
        }
        sb.append(text).toString();
        return sb.toString();
    }

If you’re still confused between Comparable and Comparator, you’re probably not alone. Table 7-3 summarizes the two “comparison” interfaces.

Table 7-3. Comparable compared with Comparator
Interface name Description Method(s)

java.lang.Comparable<T>

Provides a natural ordering to objects. Written in the class whose objects are being sorted.

int compareTo(T o);

java.util.Comparator<T>

Provides total control over sorting objects of another class. Standalone Strategy object; pass to sort() method or Collection constructor.

int compare(T o1, T o2); boolean equals(T c2)

7.12 Avoiding the Urge to Sort

Problem

Your data needs to be sorted, but you don’t want to stop and sort it periodically.

Solution

Not everything that requires order requires an explicit sort operation. Just keep the data sorted at all times.

Discussion

You can avoid the overhead and elapsed time of an explicit sorting operation by ensuring that the data is in the correct order at all times, though this may or may not be faster overall, depending on your data and how you choose to keep it sorted. You can keep it sorted either manually or by using a TreeSet or a TreeMap. First, here is some code from a call tracking program that I first wrote on the very first public release of Java (the code has been modernized slightly!) to keep track of people I had extended contact with. Far less functional than a Rolodex, my CallTrack program maintained a list of people sorted by last name and first name. It also had the city, phone number, and email address of each person. Here is a very small portion of the code surrounding the event handling for the New User push button:

public class CallTrack {

    /** The list of Person objects. */
    protected List<Person> usrList = new ArrayList<>();

    /** The scrolling list */
    protected java.awt.List visList = new java.awt.List();

    /** Add one (new) Person to the list, keeping the list sorted. */
    protected void add(Person p) {
        String lastName = p.getLastName();
        int i;
        // Find in "i" the position in the list where to insert this person
        for (i=0; i<usrList.size(); i++)
            if (lastName.compareTo((usrList.get(i)).getLastName()) <= 0)
                break; // If we don't break, OK, will insert at end of list.
        usrList.add(i, p);

        // Now insert them in the scrolling list, in the same position.
        visList.add(p.getFullName(), i);
        visList.select(i);      // ensure current
    }

}

This code uses the String class compareTo(String) routine.

Warning

This code uses a linear search, which was fine for the original application, but could get very slow on large lists (it is O(n)). You’d need to use hashing or a binary search to find where to put the values on large lists.

If I were writing this code today, I might well use a TreeSet (which keeps objects in order) or a TreeMap (which keeps the keys in order and maps from keys to values; the keys would be the name and the values would be the Person objects). Both insert the objects into a tree in the correct order, so an Iterator that traverses the tree always returns the objects in sorted order. In addition, they have methods such as headSet() and headMap(), which give a new Set or Map of objects of the same class, containing the objects lexically before a given value. The tailSet() and tailMap() methods, similarly, return objects greater than a given value, and subSet() and subMap() return a range. The first() and last() methods retrieve the obvious components from the collection. The following program uses a TreeSet to sort some names:

        // A TreeSet keeps objects in sorted order. Use a Comparator
        // published by String for case-insensitive sorting order.
        TreeSet<String> theSet = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        theSet.add("Gosling");
        theSet.add("da Vinci");
        theSet.add("van Gogh");
        theSet.add("Java To Go");
        theSet.add("Vanguard");
        theSet.add("Darwin");
        theSet.add("Darwin");    // TreeSet is Set, ignores duplicates.

        System.out.printf("Our set contains %d elements", theSet.size());

        // Since it is sorted we can easily get various subsets
        System.out.println("Lowest (alphabetically) is " + theSet.first());

        // Print how many elements are greater than "k"
        // Should be 2 - "van Gogh" and "Vanguard"
        System.out.println(theSet.tailSet("k").toArray().length +
            " elements higher than "k"");

        // Print the whole list in sorted order
        System.out.println("Sorted list:");
        theSet.forEach(name -> System.out.println(name));

One last point to note is that if you have a Hashtable or HashMap, you can convert it to a TreeMap, and therefore get it sorted, just by passing it to the TreeMap constructor:

TreeMap sorted = new TreeMap(unsortedHashMap);

7.13 Finding an Object in a Collection

Problem

You need to see whether a given collection contains a particular value.

Solution

Ask the collection if it contains an object of the given value.

Discussion

If you have created the contents of a collection, you probably know what is in it and what is not. But if the collection is prepared by another part of a large application, or even if you’ve just been putting objects into it and now need to find out if a given value was found, this recipe’s for you. There is quite a variety of methods, depending on which collection class you have. The methods in Table 7-4 can be used.

Table 7-4. Finding objects in a collection
Method(s) Meaning Implementing classes

binarySearch()

Fairly fast search

Arrays, Collections

contains( )

Search

ArrayList, HashSet, Hashtable, LinkList, Properties, Vector

containsKey( ), containsValue( )

Checks if the collection contains the object as a Key or as a Value

HashMap, Hashtable, Properties, TreeMap

indexOf( )

Returns location where object is found

ArrayList, LinkedList, List, Stack, Vector

search( )

Search

Stack

The methods whose names start with contains will use a linear search if the collection is a collection (List, Set), but will be quite fast if the collection is hashed (HashSet, HashMap). So you do have to know what implementation is being used to think about performance, particularly when the collection is (or is likely to grow) large.

The next example plays a little game of “find the hidden number” (or “needle in a haystack”): the numbers to look through are stored in an array. As games go, it’s fairly pathetic: the computer plays against itself, so you probably know who’s going to win. I wrote it that way so I would know that the data array contains valid numbers. The interesting part is not the generation of the random numbers (discussed in Recipe 5.9). The array to be used with Arrays.binarySearch() must be in sorted order, but because we just filled it with random numbers, it isn’t initially sorted. Hence, we call Arrays.sort() on the array. Then we are in a position to call Arrays.binarySearch(), passing in the array and the value to look for. If you run the program with a number, it runs that many games and reports on how it fared overall. If you don’t bother, it plays only one game:

public class ArrayHunt  {
    /** the maximum (and actual) number of random ints to allocate */
    protected final static int MAX    = 4000;
    /** the value to look for */
    protected final static int NEEDLE = 1999;
    int[] haystack;
    Random r;

    public static void main(String[] argv) {
        ArrayHunt h = new ArrayHunt();
        if (argv.length == 0)
            h.play();
        else {
            int won = 0;
            int games = Integer.parseInt(argv[0]);
            for (int i=0; i<games; i++)
                if (h.play())
                    ++won;
            System.out.println("Computer won " + won +
                " out of " + games + ".");
        }
    }

    /** Construct the hunting ground */
    public ArrayHunt() {
        haystack = new int[MAX];
        r = new Random();
    }

    /** Play one game. */
    public boolean play() {
        int i;

        // Fill the array with random data (hay?)
        for (i=0; i<MAX; i++) {
            haystack[i] = (int)(r.nextFloat() * MAX);
        }

        // Precondition for binary search is that data be sorted!
        Arrays.sort(haystack);

        // Look for needle in haystack
        i = Arrays.binarySearch(haystack, NEEDLE);

        if (i >= 0) {        // Found it, we win.
            System.out.println("Value " + NEEDLE +
                " occurs at haystack[" + i + "]");
            return true;
        } else {        // Not found, we lose.
            System.out.println("Value " + NEEDLE +
                " does not occur in haystack; nearest value is " +
                haystack[-(i+2)] + " (found at " + -(i+2) + ")");
            return false;
        }
    }
}

The Collections.binarySearch() works almost exactly the same way, except it looks in a Collection, which must be sorted (presumably using Collections.sort, as discussed in Recipe 7.11).

7.14 Converting a Collection to an Array

Problem

You have a Collection but you need a Java language array.

Solution

Use the Collection method toArray().

Discussion

If you have an ArrayList or other Collection and you need an array, you can get it just by calling the Collection’s toArray() method. With no arguments, you get an array whose type is Object[]. You can optionally provide an array argument, which is used for two purposes:

  • The type of the array argument determines the type of array returned.

  • If the array is big enough (and you can ensure that it is by allocating the array based on the Collection’s size() method), then this array is filled and returned. If the array is not big enough, a new array is allocated instead. If you provide an array and objects in the Collection cannot be cast to this type, then you will get an ArrayStoreException.

Example 7-3 shows code for converting an ArrayList to an array of type Object.

Example 7-3. main/src/main/java/structure/ToArray.java
        List<String> list = new ArrayList<>();
        list.add("Blobbo");
        list.add("Cracked");
        list.add("Dumbo");

        // Convert a collection to Object[], which can store objects
        // of any type.
        Object[] ol = list.toArray();
        System.out.println("Array of Object has length " + ol.length);

        String[] sl = (String[]) list.toArray(new String[0]);
        System.out.println("Array of String has length " + sl.length);

7.15 Making Your Data Iterable

Problem

You have written your own data structure, and want to publish the data to be iterable so it can be used in the for-each loop.

Solution

Make your data class Iterable: this interace has only one method, iterator(). Write your own Iterator. Just implement (or provide an inner class that implements) the Iterator interface.

Discussion

To be usable in the modern Java for-each loop, your data class must implement Iterable, a simple interface with one method, Iterator<T> iterator(). Whether you use this interface, or want to use the older Iterator interface directly, the way to make data from one part of your program available in a storage-independent way to other parts of the code is to generate an Iterator. Here is a short program that constructs, upon request, an Iterator for some data that it is storing—in this case, in an array. The Iterator interface has only three methods—hasNext(), next(), and remove(), demonstrated in Example 7-4.

Example 7-4. main/src/main/java/structure//IterableDemo
public class IterableDemo {

    /** Demo implements Iterable, meaning it must provide an Iterator,
     * and, that it can be used in a foreach loop.
     */
    static class Demo implements Iterable<String> {

        // Simple demo: use array instead of inventing new data structure
        String[] data = { "One", "Two", "Three"};

        /** This is the Iterator that makes it all happen */
        class DemoIterator implements Iterator<String> {
            int i = 0;

            /**
             * Tell if there are any more elements.
             * @return true if next() will succeed, false otherwise
             */
            public boolean hasNext() {
                return i < data.length;
            }

            /** @return the next element from the data */
            public String next() {
                return data[i++];
            }

            /** Remove the object that next() just returned.
             * An Iterator is not required to support this interface, and we don't.
             * @throws UnsupportedOperationException unconditionally
             */
            public void remove() {
                throw new UnsupportedOperationException("remove");
            }
        }

        /** Method by which the Demo class makes its iterator available */
        public Iterator<String> iterator() {
            return new DemoIterator();
        }
    }

    public static void main(String[] args) {
        Demo demo = new Demo();
        for (String s : demo) {
            System.out.println(s);
        }
    }
}

The comments on the remove() method remind me of an interesting point. This interface introduces java.util’s attempt at something Java doesn’t really have, the “optional method.” Because there is no syntax for this, and they didn’t want to introduce any new syntax, the developers of the Collections Framework decided on an implementation using existing syntax. “Optional” methods they are not implemented are required to throw an UnsupportedOperationException if they ever get called. My remove( ) method does just that. Note that UnsupportedOperationException is subclassed from RuntimeException, so it is not required to be declared or caught.

This code is simplistic, but it does show the syntax and demonstrates how the Iterator interface works. In real code, the Iterator and the data are usually separate objects (the Iterator might be an inner class from the data store class). Also, you don’t even need to write this code for an array; you can just construct an ArrayList object, copy the array elements into it, and ask it to provide the Iterator. However, I believe it’s worth showing this simple example of the internals of an Iterator so that you can understand both how it works and how you could provide one for a more sophisticated data structure, should the need arise.

The Iterable interface has only one nondefault method, iterator(), which must provide an Iterator for objects of the given type. Because the ArrayIterator class implements this as well, we can use an object of type ArrayIterator in a “foreach” loop:

Example 7-5. main/src/main/java/structure/ArrayIteratorDemo.java
package structure;

import com.darwinsys.util.ArrayIterator;

public class ArrayIteratorDemo {

    private final static String[] names = {
        "rose", "petunia", "tulip"
    };

    public static void main(String[] args) {
        ArrayIterator<String> arrayIterator = new ArrayIterator<>(names);

        System.out.println("Java 5, 6 way");
        for (String s : arrayIterator) {
            System.out.println(s);
        }

        System.out.println("Java 5, 6 ways");
        arrayIterator.forEach(s->System.out.println(s));
        arrayIterator.forEach(System.out::println);
    }
}

Java 8 Iterable.foreach

Java 8 adds foreach to the Iterator interface, a default method (discussed in Recipe 9.1) that you don’t have to write. Thus, without changing the ArrayIterator, after moving to Java 8 we can use the newest-style loop, Iterator.foreach(Consumer), with a lambda expression (see Chapter 9) to print each element (see Example 7-5).

7.16 Using a Stack of Objects

Problem

You need to process data in “last-in, first-out” (LIFO) or “most recently added” order.

Solution

Write your own code for creating a stack; it’s easy. Or, use a java.util.Stack.

Discussion

You need to put things into a holding area quickly and retrieve them in last-in, first-out order. This is a common data structuring operation and is often used to reverse the order of objects. The basic operations of any stack are push() (add to stack), pop() (remove from stack), and peek() (examine top element without removing). ToyStack in Example 7-6 is a simple class for stacking values of the primitive type int. I’ll expand it in a page or two to allow to stack user-defined objects.

Example 7-6. main/src/main/java/structure/ToyStack.java
public class ToyStack {

    /** The maximum stack depth */
    protected int MAX_DEPTH = 10;
    /** The current stack depth */
    protected int depth = 0;
    /* The actual stack */
    protected int[] stack = new int[MAX_DEPTH];

    /** push - add an element onto the stack */
    protected void push(int n) {
        stack[depth++] = n;
    }
    /** pop - return and remove the top element */
    protected int pop() {
        return stack[--depth];
    }
    /** peek - return the top element but don't remove it */
    protected int peek() {
        return stack[depth-1];
    }
}

If you are not familiar with the basic idea of a stack, you should work through the code here; if you are familiar with it, you can skip ahead. While looking at it, of course, think about what happens if pop() or peek() is called when push() has never been called, or if push() is called to stack more data than will fit.

While working on ToyStack2 (not shown but in the online source), I extracted its interface into SimpleStack, which just lists the operations. At the same time I added the empty() method for some compatibility with the “standard” java.util.Stack class. And importantly, I made it a generic type, so it can be used with values of any type.

public interface SimpleStack<T> {

    /** empty - return true if the stack is empty */
    abstract boolean empty();

    /** push - add an element onto the stack */
    abstract void push(T n);

    /** pop - return and remove the top element */
    abstract T pop();

    /** peek - return the top element but don't remove it */
    abstract T peek();
}

I then made another demo stack class, MyStack, implement the new interface:

public class MyStack<T> implements SimpleStack<T> {

    private int depth = 0;
    public static final int DEFAULT_INITIAL = 10;
    private T[] stack;

    public MyStack() {
        this(DEFAULT_INITIAL);
    }

    public MyStack(int howBig) {
        if (howBig <= 0) {
            throw new IllegalArgumentException(
            howBig + " must be positive, but was " + howBig);
        }
        stack = (T[])new Object[howBig];
    }

    @Override
    public boolean empty() {
        return depth == 0;
    }

    /** push - add an element onto the stack */
    @Override
    public void push(T obj) {
        // Could check capacity and expand
        stack[depth++] = obj;
    }

    /* pop - return and remove the top element */
    @Override
    public T pop() {
        --depth;
        T tmp = stack[depth];
        stack[depth] = null;
        return tmp;
    }

    /** peek - return the top element but don't remove it */
    @Override
    public T peek() {
        if (depth == 0) {
            return null;
        }
        return stack[depth-1];
    }

    public boolean hasNext() {
        return depth > 0;
    }

    public boolean hasRoom() {
        return depth < stack.length;
    }

    public int getStackDepth() {
        return depth;
    }
}

This version has a lot more error checking (and a unit test, in the src/test/java/structure folder), as well as some additional methods not in the original (e.g., hasRoom()—unlike the full-blown java.util.Stack, this one does not expand beyond its original size, so we need a way to see if it is full without throwing an exception).

Now that you see how a stack works, I recommend to use the provided java.util.Stack instead of my demo versions; it is more fully fleshed out, more fully tested, and widely used. Unlike the “major” Collections API components List, Set and Map, java.util.Stack does not have an interface and implementation class(es); it is based on Vector which is a List implementation. The “real” java.util.Stack works in a similar manner to mine, but has more methods and more flexibility. To see that in operation, Recipe 5.12 provides a simple stack-based numeric calculator.

7.17 Multidimensional Structures

Problem

You need a two-, three-, or more dimensional array or ArrayList.

Solution

No problem. Java supports this.

Discussion

As mentioned back in Recipe 7.1, Java arrays can hold any reference type. Because an array is a reference type, it follows that you can have arrays of arrays or, in other terminology, multidimensional arrays. Further, because each array has its own length attribute, the columns of a two-dimensional array, for example, do not all have to be the same length (see Figure 7-3).

Here is code to allocate a couple of two-dimensional arrays, one using a loop and the other using an initializer. Both are selectively printed:

public class ArrayTwoDObjects {

    /** Return list of subscript names (unrealistic; just for demo). */
    public static String[][] getArrayInfo() {
        String info[][];
        info = new String[10][10];
        for (int i=0; i < info.length; i++) {
            for (int j = 0; j < info[i].length; j++) {
                info[i][j] = "String[" + i + "," + j + "]";
            }
        }
        return info;
    }

    /** Run the initialization method and print part of the results */
    public static void main(String[] args) {
        print("from getArrayInfo", getArrayInfo());
    }

    /** Print selected elements from the 2D array */
    public static void print(String tag, String[][] array) {
        System.out.println("Array " + tag + " is " + array.length + " x " +
            array[0].length);
        System.out.println("Array[0][0] = " + array[0][0]);
        System.out.println("Array[0][1] = " + array[0][1]);
        System.out.println("Array[1][0] = " + array[1][0]);
        System.out.println("Array[0][0] = " + array[0][0]);
        System.out.println("Array[1][1] = " + array[1][1]);
    }
}
jcb4 0703
Figure 7-3. Multidimensional arrays

Running it produces this output:

> java structure.ArrayTwoDObjects
Array from getArrayInfo is 10 x 10
Array[0][0] = String[0,0]
Array[0][1] = String[0,1]
Array[1][0] = String[1,0]
Array[0][0] = String[0,0]
Array[1][1] = String[1,1]
Array from getParameterInfo is 2 x 3
Array[0][0] = fontsize
Array[0][1] = 9-18
Array[1][0] = URL
Array[0][0] = fontsize
Array[1][1] = -
>

The same kind of logic can be applied to any of the Collections. You could have an ArrayList of ArrayLists, or a Vector of linked lists, or whatever your little heart desires.

As Figure 7-3 shows, it is not necessary for the array to be “regular” (i.e., it’s possible for each column of the 2D array to have a different height). That is why I used array[0].length for the length of the first column in the code example.

7.18 Simplifying Data Objects with Lombok or Record

Problem

You waste time writing POJO data classes with boilerplate code such as setters and getters, equals(), toString(), etc.

Solution

Use lombok to auto-generate boilerplate methods. In Java 14+, use the new record data type which generates the boilerplate methods for you.

Discussion

When Java was new, before there were good IDEs, developers had to write getters and setters by hand, or by copy-paste-change. Back then I did a study of one existing large code base and found about a 1/2% failure rate. The setter stored the value in the wrong place or the getter retrieved the wrong value. Assuming random distribution, this meant that one getter call in a hundred gave the wrong answer! The application still worked, so I must assume those wrong answers didn’t matter.

Now we have IDEs that can generate all the boilerplate methods such as setters/getters, equals, toString(), and so on. But you still have to remember to invoke these generators.

Lombok

Project Lombok provides one solution. It reads your .class files looking for its own annotations and, when it finds them, re-writes the class files to have the chosen methods.

To use Lombok, you need to add the dependency org.projectlombok:lombok:1.18.4 (or newer) to your build script. Or, if you are using an IDE, download the lombok jar file from https://projectlombok.org/ and install it as per the instructions there. Then you can annotate your class with annotations like these:

@Setters @Getters

Presto! No more forgetting to generate these methods; Lombok will do the work for you.

Other annotations include:

@ToString
@EqualsAndHashCode
@AllArgsConstructor

For data classes, there is even @Data which is a shortcut for @ToString, @EqualsAndHashCode, @Getter on all fields, and @Setter on all non-final fields, and @RequiredArgsConstructor!

Java 14 Record (Preview)

The new record type provides another solution. A record is a class-like construct for data classes, a restricted form of class as are enums and annotations. You need only write the name of a data object and its fields, and the compiler will provide a constructor, getters, hashCode() and equals(), and toString().

public record Person(String name, String emailAddress) { }

The provided constructor has the same signature as the record declaration. All fields are implicitly final, and the record provides getters but not setters. The getters have the name of the field; they do not follow the JavaBeans “getName()” pattern. Immutable objects are important for reliable code (see Recipe 9.1). You can provide other members such as extra constructors, static fields, static or instance methods, and so on. Records cannot be abstract and cannot declare additional instance fields. All in keeping with the fact that the state of the object is as declared in the record header.

$ jshell --enable-preview
|  Welcome to JShell -- Version 14-ea
|  For an introduction type: /help intro

jshell> record Person(String name, String email) {}

jshell> var p = new Person("Covington Roderick Smythe", "[email protected]")
p ==> Person[name=Covington Roderick Smythe, email=roddy@smythe.tld]

jshell> p.name()
$3 ==> "Covington Roderick Smythe"

jshell>

One-line record definitions typically don’t need to be in a source file all their own, so I baked it into the demo program PersonRecordDemo.java. We can save this file, compile it with javac and then use javap to view the class’ structure:

$ javac --enable-preview -source 14 PersonRecordDemo.java
Note: PersonRecordDemo.java uses preview language features.
Note: Recompile with -Xlint:preview for details.
$ javap PersonRecordDemo'$'Person
Compiled from "PersonRecordDemo.java"
public final class PersonRecordDemo$Person extends java.lang.Record {
  public PersonRecordDemo$Person(java.lang.String, java.lang.String);
  public java.lang.String toString();
  public final int hashCode();
  public final boolean equals(java.lang.Object);
  public java.lang.String name();
  public java.lang.String email();
}

The $ in the filename has to be escaped from the Unix shell. We see that the compiler has generated the constructor, toString(), hashCode() and equals(), and read-only accessors name() and email().

Warning

As of Java 14 the record mechanism is a “Preview” so it may change from what is described here or might even (however unlikely) not appear in the final Java 14 or in a future Java release (though we hope it will appear as-is, non-preview, in Java 15). If you are using Java 14 you need the --enable-preview option on commands like javac, javac, jshell, etc, as well as --source 14 on commands that read the source file.

See Also

The original description of and rationale for the record mechanism is in the Java Enhancement Proposal JEP-359 at OpenJDK.net.

7.19 Program: Timing Comparisons

New developers sometimes worry about the overhead of these collections and think they should use arrays instead of data structures. To investigate, I wrote a program that creates and accesses 250,000 objects, once through a Java array and again through an ArrayList. This is a lot more objects than most programs use. First the code for the Array version:

public class Array {
    public static final int MAX = 250000;
    public static void main(String[] args) {
        System.out.println(new Array().run());
    }
    public int run() {
        MutableInteger list[] = new MutableInteger[MAX];
        for (int i=0; i<list.length; i++) {
            list[i] = new MutableInteger(i);
        }
        int sum = 0;
        for (int i=0; i<list.length; i++) {
            sum += list[i].getValue();
        }
        return sum;
    }
}

And the ArrayList version:

public class ArrayLst {
    public static final int MAX = 250000;
    public static void main(String[] args) {
        System.out.println(new ArrayLst().run());
    }
    public int run() {
        ArrayList<MutableInteger> list = new ArrayList<>();
        for (int i=0; i<MAX; i++) {
            list.add(new MutableInteger(i));
        }
        int sum = 0;
        for (int i=0; i<MAX; i++) {
            sum += ((MutableInteger)list.get(i)).getValue();
        }
        return sum;
    }
}

The Vector-based version, ArrayVec, is sufficiently similar that I don’t feel the need to kill a tree reprinting its code—it’s online.

How can we time this? As covered in Recipe 17.7, you can either use the operating system’s time command, if available, or just use a bit of Java that times a run of your main program. To be portable, I chose to use the latter on an older, slower machine. Its exact speed doesn’t matter because the important thing is to compare only versions of this program running on the same machine.

Finally (drum roll, please), the results:

$ java performance.Time Array 
Starting class class Array
1185103928
runTime=4.310
$ java performance.Time ArrayLst
Starting class class ArrayLst
1185103928
runTime=5.626
$ java performance.Time ArrayVec
Starting class class ArrayVec
1185103928
runTime=6.699
$

Notice that I have ignored one oft-quoted bit of advice that recommends giving a good initial estimate on the size of the ArrayList. I did time it that way as well; in this example, it made a difference of less than 4% in the total runtime.

The bottom line is that the efficiency of ArrayList is not totally awful compared to arrays. Obviously there is more overhead in calling a “get” method than in retrieving an element from an array. The overhead of objects whose methods actually do some computation probably outweighs the overhead of fetching and storing objects in an ArrayList rather than in an Array. Unless you are dealing with large numbers of objects, you may not need to worry about it. Vector is slightly slower but still only about two-thirds the speed of the original array version. If you are concerned about the time, once the “finished” size of the ArrayList is known, you can convert the ArrayList to an array (see Recipe 7.14).

1 You could copy it yourself using a for loop if you wish, but System.arrayCopy( ) is likely to be faster because it’s implemented in native code.

2 Both List and Set extend Collection.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.172.38