56 CHAPTER 3FLAT FILES
name at subscript 8, which is “Madison” (we are indexing zero-up, just like Java,
and we arbitrarily choose always to round up to the first subscript in the upper
half of any list or sublist if we have to break ties). Tyler is larger than Madison, so
we know we have a record in the second half of the list (if it’s there at all). Since
we have retrieved the record at subscript 8, we test equality as well as greater-
equal/less-equal, and find that we don’t have equality. The second half of the list
has subscripts 8–15, and we know we didn’t get a match at subscript 8, so we are
looking at subscripts 9–15. The midpoint (and this time it’s actually the midpoint)
is 12, and the entry at subscript 12 is “Taylor.” Tyler is still larger than Taylor, so
we continue searching in the short list of subscripts 13–15, since we know it’s not
at subscript 12. Comparing against subscript 14, “Van Buren,” we know we are in
the first half of that sublist. The sublist has length 1, though, so when we compare
against subscript 13, which is “Tyler,” we get a match.
If we look for the name “Clay” (remembering that Henry Clay very much wanted
to become a U.S. president but never managed to do so), we compare against
subscripts 8 (Madison), then 4 (Jackson), then 2 (Fillmore), then 1 (Buchanan).
Since we don’t find a match, we know that Clay is not in the list.
As should be obvious, the difficult part about writing a method for this is getting
right the concept of “halfway.” With an odd number of items in the list, there is a
midpoint. With an even number of items in the list, one has to decide whether to
take the last entry in the first half or the first entry in the second half.
If the list is of N =2
n
entries, then it will take at most n =log
2
N probes to
reduce the problem to looking at a list of one entry, which is merely a single test.
To verify that Clay was not in the list of 2
4
= 16 items, for example, we checked
the four entries at subscripts 8, 4, 2, and 1.
Now we need to clarify a couple of points on which we have been a little sloppy.
First, when we execute a probe, we can test both
if(this.compareTo(that) < 0)
and
if(this.compareTo(that) == 0)
If we find the entry for which we are looking, then of course we have solved the
problem. If not, we continue. Technically this would split the list not into two
halves but into one sublist larger than the probed value, one sublist smaller than
the probed value, and the probed value itself. If we test in this way, we can shorten
our sublist by one by not including the probed value, since we have already tested it.
Similarly, if N is not an exact power of 2, the length in time of the binary search
is really no worse than it would be for the next larger power of 2. If we were to