Chapter 5.  Searching

The right of the people to be secure against unreasonable searches and seizures, shall not be violated....

Constitution of the United States, 1787

Computers—and people—are always trying to find things. Both of them often need to perform tasks like these:

  • Select files on a disk

  • Find memory locations

  • Identify processes to be killed

  • Choose the right item to work upon

  • Decide upon the best algorithm

  • Search for the right place to put a result

The efficiency of searching is invariably affected by the data structures storing the information. When speed is critical, you’ll want your data sorted beforehand. In this chapter, we’ll draw on what we’ve learned in the previous chapters to explore techniques for searching through large amounts of data, possibly sorted and possibly not. (Later, in Chapter 9, we’ll separately treat searching through text.)

As with any algorithm, the choice of search technique depends upon your criteria. Does it support all the operations you need to perform on your data? Does it run fast enough for frequently used operations? Is it the simplest adequate algorithm?

We present a large assortment of searching algorithms here. Each technique has its own advantages and disadvantages and particular data structures and sorting methods for which it works especially well. You have to know which operations your program performs frequently to choose the best algorithm; when in doubt, benchmark and profile your programs to find out.

There are two general categories of searching. The first, which we call lookup searches, involves preparing and searching a collection of existing data. The second category, generative searches, involves creating the data to be searched, often choosing dynamically the computation to be performed and almost always using the results of the search to control the generation process. An example might be looking for a job. While there is a great deal of preparation you can do beforehand, you may learn things at an actual interview that drastically change how you rate that company as a prospective employer—and what other employers you should be seeking out.

Most of this chapter is devoted to lookup searches because they’re the most general. They can be applied to most collections of data, regardless of the internal details of the particular data. Generative algorithms depend more upon the nature of the data and computations involved.

Consider the task of finding a phone number. You can search through a phone book fairly quickly—say, in less than a minute. This gives you a phone number for anyone in the city—a primitive lookup search. But you don’t usually call just anyone; most often you call an acquaintance, and for their phone number you might use a personal address book instead and find the number in a few seconds. That’s a speedier lookup search. And if it’s someone you call often and you have their number memorized, your brain can complete the search before your hand can even pick up the address book.

Hash Search and Other Non-Searches

The fastest search technique is not to have to search at all. If you choose your data structures in a way that best fits the problem at hand, most of your “searching” is simply the trivial task of accessing the data directly from that structure. For example, if your program determined mean monthly rainfall for later use, you would likely store it in a list or a hash indexed by the month. Later, when you wanted to use the value for March, you’d “search” for it with either $rainfall[3] or $rainfall{March}.
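
As a minimal sketch of that kind of non-search (the rainfall figures here are made-up placeholders), the lookup is just an array index or hash key:

# Hypothetical monthly rainfall figures, in inches.
my @rainfall;
@rainfall[1..12] = ( 2.1, 1.8, 2.7, 3.3, 3.9, 3.4,
                     3.6, 3.4, 3.1, 2.9, 2.6, 2.4 );   # element 0 unused

my %rainfall;
@rainfall{qw(January February March April May June July
             August September October November December)} = @rainfall[1..12];

# No searching required: index straight into the structure.
my $march         = $rainfall[3];
my $march_by_name = $rainfall{March};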

You don’t have to do a lot of work to look up a phone number that you have memorized. You just think of the person’s name and your mind immediately comes up with the number. This is very much like using a hash: it provides a direct association between the key value and its additional data. (The underlying implementation is rather different, though.)

Often you only need to search for specific elements in the collection. In those cases, a hash is generally the best choice. But if you need to answer more complicated questions like “What is the smallest element?” or “Are any elements within a particular range?” which depend upon the relative order of elements in the collection, a hash won’t do.

Both array and hash index operations are O(1)—taking a fixed amount of time regardless of the number of elements in the structure (with rare pathological exceptions for hashes).

Lookup Searches

A lookup search is what most programmers think of when they use the term “search”—they know what item they’re looking for but don’t know where it is in their collection of items. We return to a favorite strategy of problem solving in any discipline: break the problem into easy-to-solve pieces that can be dealt with separately. The typical components of a search are as follows:

  1. Collecting the data to be searched

  2. Structuring the data

  3. Selecting the data element(s) of interest

  4. Restructuring the selected element(s) for subsequent use

Collecting and structuring the data is often done in a separate, earlier phase, before the actual search. Sometimes it is done a long time before—a database built up over years is immediately available for searching. Many companies base their business upon having built such collections, such as companies that provide mailing lists for qualified targets, or encyclopedia publishers who have been collecting and updating their data for centuries.

Sometimes your program might need to perform different kinds of searches on your data, and in that case, there might be no data structure that performs impeccably for them all. Instead of choosing a simple data structure that handles one search situation well, it’s better to choose a more complicated data structure that handles all situations acceptably.

A well-suited data structure makes selection trivial. For example, if your data is organized in a heap (a structure where small items bubble up towards the top) searching for the smallest element is simply a matter of removing the top item. For more information on heaps, see Chapter 3.

Rather than searching for multiple elements one at a time, you might find it better to select and organize them once. This is why you sort a bridge hand—a little time spent sorting makes all of the subsequent analysis and play easier.

Sorting is often a critical technique—if a collection of items is sorted, then you can often find a specific item in O(log N) time, even if you have no prior knowledge of which item will be needed. If you do have some knowledge of which items might be needed, searches can often be performed faster, maybe even in constant—O(1)—time. A postman walks up one side of the street and back on the other, delivering all of the mail in a single linear operation—the top letter in the bag is always going to the current house. However, there is always some cost to sorting the collection beforehand. You want to pay that cost only if the improved speed of subsequent searches is worth it. (While you’re busy precisely ordering items 25 through 50 of your to-do list, item 1 is still waiting for you to perform it.)
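
As a rough rule of thumb (ignoring constant factors), k linear searches of N unsorted items cost about k * N/2 comparisons on average, while sorting first costs roughly N log2 N comparisons plus another k log2 N for the binary searches. For N = 1,000 and k = 100 that works out to about 50,000 comparisons against about 11,000, so the sort pays for itself quickly; for a single search, the linear scan wins.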

You can adapt the routines in this chapter to your own data in two ways, as was the case in Chapter 4. You could rewrite the code for each type of data and insert a comparison function for that data, or you could write a more general but slower searching function that accepts a comparison function as an argument.

Speaking of comparison testing, some of the following search methods don’t explicitly consider the possibility that there might be more than one element in the collection that matches the target value—they simply return the first match they find. Usually, that will be fine—if you consider two items different, your comparison routine should too. You can extend the part of the value used in comparisons to distinguish the different instances. A phone book does this: after you have found “J Macdonald,” you can use his address to distinguish between people with the same name. On the other hand, once you find a jar of cinnamon in the spice rack, you stop looking even if there might be others there, too—only the fussiest cook would care which bottle to use.

Let’s look at some searching techniques. This table gives the order of the speed of the methods we’ll be examining for some common operations:

Method                          Lookup                   Insert      Delete
ransack                         (unbounded)              O(1)        (unbounded)
list—linear                     O(N)                     O(1)        O(N)
list—binary                     O(log N)                 O(N)        O(N)
list—proportional               O(log log N) to O(N)     O(N)        O(N)
binary tree (balanced)          O(log N)                 O(log N)    O(log N)
binary tree (unbalanced)        O(N)                     O(N)        O(N)
bushier trees                   O(log N) (various)       (various)   (various)
list—using index                O(1)                     O(1)        O(1)
lists of lists                  O(number of lists)       O(1)        O(length of lists)
B-trees (k entries per node)    O(log N)                 O(log N)    O(log N)
hybrid searches                 (various)                (various)   (various)

Ransack Search

People, like computers, use searching algorithms. Here’s one familiar to any parent—the ransack search. As searching algorithms go, it’s atrocious, but that doesn’t stop three-year-olds. The particular variant described here can be attributed to Gwilym Hayward, who is much older than three years and should know better. The algorithm is as follows:

  1. Remove a handful of toys from the chest.

  2. Examine the newly exposed toy: if it is the desired object, exchange it with the handful and terminate.

  3. Otherwise, replace the removed toys into a random location in the chest and repeat.

This particular search can take infinitely long to terminate: it can never determine for certain that the element being searched for is not present. (Termination is an important consideration for any search.) Additionally, the random replacement destroys any cached location information that another person might have about the order of the collection. That does not stop children of all ages from using it.

The ransack search is not recommended. My mother said so.

Linear Search

How do you find a particular item in an unordered pile of papers? You look at each item until you find the one you want. This is a linear search. It is so simple that programmers do it all the time without thinking of it as a search.

Here’s a Perl subroutine that linear searches through an array for a string match:[22]

# $index = linear_string( \@array, $target )
#      @array is (unordered) strings
#      on return, $index is undef or else $array[$index] eq $target

sub linear_string {
    my ($array, $target) = @_;

    for ( my $i = @$array; $i--; ) {
        return $i if $array->[$i] eq $target;
    }
    return undef;
}

Often this search will be written inline. There are many variations depending upon whether you need to use the index or the value itself. Here are two variations of linear search; both find all matches rather than just the first:

# Get all the matches.
@matches = grep { $_ eq $target } @array;

# Generate payment overdue notices.
foreach $cust (@customers) {
    # Search for overdue accounts.
    next unless $cust->{status} eq "overdue";
    # Generate and print a mailing label.
    print $cust->address_label;
}

Linear search takes O(N) time—it’s proportional to the number of elements. Before it can fail, it has to search every element. If the target is present, on average half of the elements will be examined before it is found. If you are searching for all matches, all elements must be examined. If there are a large number of elements, this time can be expensive.

Nonetheless, you should use linear search unless you are dealing with very large arrays or very many searches; generally, the simplicity of the code is more important than the possible time savings.

Binary Search in a List

How do you look up a name in a phone book? A common method is to stick your finger into the book, look at the heading to determine whether the desired page is earlier or later. Repeat with another stab, moving in the right direction without going past any page examined earlier. When you’ve found the right page, you use the same technique to find the name on the page—find the right column, determine whether it is in the top or bottom half of the column, and so on.

That is the essence of the binary search: stab, refine, repeat.

The prerequisite for a binary search is that the collection must already be sorted. For the code that follows, we assume that ordering is alphabetical. You can modify the comparison operator if you want to use numerical or structured data.

A binary search “takes a stab” by dividing the remaining portion of the collection in half and determining which half contains the desired element.

Here’s a routine to find a string in a sorted array:

# $index = binary_string( \@array, $target )
#        @array is sorted strings
#    on return,
#        either (if the element was in the array):
#           # $index is the position of the element
#           $array[$index] eq $target
#        or (if the element was not in the array):
#           # $index is the position where the element should be inserted
#           $index == @array or $array[$index] gt $target
#           splice( @array, $index, 0, $target ) would insert it
#               into the right place in either case
#
sub binary_string {
    my ($array, $target) = @_;

    # $low is first element that is not too low;
    # $high is the first that is too high
    #
    my ( $low, $high ) = ( 0, scalar(@$array) );

    # Keep trying as long as there are elements that might work.
    #
    while ( $low < $high ) {
        # Try the middle element.
        use integer;
        my $cur = ($low+$high)/2;
        if ($array->[$cur] lt $target) {
            $low  = $cur + 1;                     # too small, try higher
        } else {
            $high = $cur;                         # not too small, try lower
        }
    }
    return $low;
}
# example use:
my $index = binary_string( \@keywords, $word );
if( $index < @keywords && $keywords[$index] eq $word ) {
    # found it: use $keywords[$index]
    ...
} else {
    # It's not there.
    # You might issue an error:
    warn "unknown keyword $word";
    ...
    # or you might insert it:
    splice( @keywords, $index, 0, $word );
    ...
}

This particular implementation of binary search has a property that is sometimes useful: if there are multiple elements that are all equal to the target, it will return the first.

A binary search takes O(log N) time—either to find a target or to determine that the target is not in the array. (If you have the extra cost of sorting the array first, however, that is an O(N log N) operation.) It is tricky to code binary search correctly—you could easily fail to check the first or last element, or conversely try to check an element past the end of the array, or end up in a loop that checks the same element each time. (Knuth, in The Art of Computer Programming: Sorting and Searching, section 6.2.1, points out that the binary search was first documented in 1946 but the first algorithm that worked for all sizes of array was not published until 1962.)

One useful feature of the binary search is that you can use it to find a range of elements with only two searches and without copying the array. For example, perhaps you want all of the transactions that happened in February. Searching for a range looks like this:

# ($index_low, $index_high) =
#   binary_range_string( \@array, $target_low, $target_high );
#      @array is sorted strings
#      On return:
#         @array[$index_low..$index_high] are all of the
#           values between $target_low and $target_high inclusive
#           (if there are no such values, then $index_low will
#           equal $index_high+1, and $index_low will indicate
#           the position in @array where such a value should
#           be inserted, i.e., any value in the range should be
#           inserted just before element $index_low)

sub binary_range_string {
    my ($array, $target_low, $target_high) = @_;
    my $index_low  = binary_string( $array, $target_low );
    my $index_high = binary_string( $array, $target_high );

    --$index_high
        if $index_high == @$array ||
                            $array->[$index_high] gt $target_high;

    return ($index_low,$index_high);
}
($Feb_start, $Feb_end) = binary_range_string( \@year, '0201', '0229' );

The binary search method suffers if elements must be added or removed after you have sorted the array. Inserting or deleting an element into or from an array without disrupting the sort generally requires copying many of the elements of the array. That makes the insert and delete operations O(N) instead of O(log N).

This algorithm is recommended as long as the following are true:

  • The array will be large enough.

  • The array will be searched often.[23]

  • Once the array has been built and sorted, it remains mostly unchanged (i.e., there will be far more searches than inserts and deletes).

It could also be used with a separate list of the inserts and deletions as part of a compound strategy if there are relatively few inserts and deletions. After binary searching and finding an entry in the main array, you would perform a linear search of the deletion list to verify that the entry is still valid. Alternatively, after binary searching and failing to find an element, you perform a linear search of the addition list to confirm that the element still does not exist. This compound approach is O((log N) + K) where K is the number of inserts and deletes. As long as K is much smaller than N (say, less than log N) this approach is workable.
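
Here is a sketch of that compound strategy, using the binary_string routine shown earlier; the compound_search name and the @added and @deleted side lists are hypothetical:

# $found = compound_search( \@sorted, \@added, \@deleted, $target )
#      @sorted  - large sorted array (rarely changed)
#      @added   - recent additions, unsorted
#      @deleted - recent deletions, unsorted
# Returns the matching value, or undef if it is not (or no longer) present.
sub compound_search {
    my ($sorted, $added, $deleted, $target) = @_;

    my $index = binary_string( $sorted, $target );
    if ( $index < @$sorted && $sorted->[$index] eq $target ) {
        # Found in the main array - but has it since been deleted?
        foreach (@$deleted) {
            return undef if $_ eq $target;
        }
        return $sorted->[$index];
    }

    # Not in the main array - check the recent additions.
    foreach (@$added) {
        return $_ if $_ eq $target;
    }
    return undef;
}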

Proportional Search

A significant speedup to binary search can be achieved. When you are looking in a phone book for a name like “Anderson”, you don’t take your first guess in the middle of the book. Instead, you begin a short way from the beginning. As long as the values are roughly evenly distributed throughout the range, you can help binary search along, making it a proportional search. Instead of computing the index to be halfway between the known upper and lower bounds, you compute the index that is the right proportion of the distance between them—conceptually, for your next guess you would use:

    low + (high - low) * (target - $array->[low]) / ($array->[high] - $array->[low])

To make proportional search work correctly requires care. You have to map the result to an integer—it’s hard to look up element 34.76 of an array. You also have to protect against the cases when the value of the high element equals the value of the low element so that you don’t divide by zero. (Note also that we are treating the values as numbers rather than strings. Computing proportions on strings is much messier, as you can see in the next code example.)

A proportional search can speed the search up considerably, but there are some problems:

  • It requires more computation at each stage.

  • It causes a divide by zero error if the range bounded by $low and $high is a group of elements with an identical key. (We’ll handle that issue in the following code by skipping the computation in such cases.)

  • It doesn’t work well for finding the first of a group of equal elements—the proportion always points to the same index, so you end up with a linear search for the beginning of the group of equal elements. This is only a problem if very large collections of equal-valued elements are allowed.

  • It degrades, sometimes very badly, if the keys aren’t evenly distributed.

To illustrate the last problem, suppose the array contains a million and one elements—all of the integers from 1 to 1,000,000, and then 1,000,000,000,000. Now, suppose that you search for 1,000,000. After determining that the values at the ends are 1 and 1,000,000,000,000, you compute that the desired position is about one millionth of the interval between them, so you check the element $array[1] since 1 is one millionth of the distance between indices 0 and 1,000,000. At each stage, your estimate of the element’s location is just as badly off, so by the time you’ve found the right element, you’ve tested every other element first. Some speedup! Add this danger to the extra cost of computing the new index at each stage, and even more lustre is lost. Use proportional search only if you know your data is well distributed. Later in this chapter, Section 5.2.10 shows how this example could be handled by making the proportional search part of a mixed strategy.
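
For numeric keys, the core of a proportional (interpolation) search fits in a few lines. This illustrative sketch returns the index of a matching element, or undef if the target is absent, rather than the insertion point returned by the earlier routines; note the fallback that avoids dividing by zero when the remaining keys are all equal:

# $index = proportional_numeric( \@array, $target )
#      @array is sorted numbers
#      Returns an index such that $array->[$index] == $target,
#      or undef if the target is not present.
sub proportional_numeric {
    my ($array, $target) = @_;
    my ($low, $high) = (0, $#$array);

    while ( $low <= $high
            && $target >= $array->[$low]
            && $target <= $array->[$high] ) {

        # Guess a position proportional to where the target lies
        # between the low and high values; fall back to $low when
        # every remaining key is identical (avoiding a zero divide).
        my $span = $array->[$high] - $array->[$low];
        my $cur  = $span
            ? $low + int( ($high - $low)
                          * ($target - $array->[$low]) / $span )
            : $low;

        if    ( $array->[$cur] < $target ) { $low  = $cur + 1 }
        elsif ( $array->[$cur] > $target ) { $high = $cur - 1 }
        else                               { return $cur      }
    }
    return undef;
}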

Computing proportional distances between strings is just the sort of “simple modification” (hiding a horrible mess) that authors like to leave as an exercise for the reader. However, with a valiant effort, we resisted that temptation:

sub proportional_binary_string_search {
    my ($array, $target) = @_;

    # $low is first element that is not too low;
    # $high is the first that is too high
    # $common is the index of the last character tested for
    #    equality in the elements at $low-1 and $high.
    #    Rather than compare the entire string value, we only
    #    use the "first different character".
    #    We start with character position -1 so that character
    #    0 is the one to be compared.
    #
    my ( $low, $high, $common ) = ( 0, scalar(@$array), -1 );

    return 0 if $high == 0 || $array->[0] ge $target;
    return $high if $array->[$high-1] lt $target;
    --$high;

    my ($low_ch,  $high_ch,  $targ_ch ) = (0, 0);
    my ($low_ord, $high_ord, $targ_ord);

    # Keep trying as long as there are elements that might work.
    #
    while( $low < $high ) {
        if ($low_ch eq $high_ch) {
            while ($low_ch eq $high_ch) {
                return $low if $common == length($array->[$high]);
                ++$common;
                $low_ch  = substr( $array->[$low],  $common, 1 );
                $high_ch = substr( $array->[$high], $common, 1 );
            }
            $targ_ch = substr( $target, $common, 1 );
            $low_ord  = ord( $low_ch  );
            $high_ord = ord( $high_ch );
            $targ_ord = ord( $targ_ch );
        }
        # Try the proportional element (the preceding code has
        # ensured that there is a nonzero range for the proportion
        # to be within).

        my $cur = $low;
        $cur += int( ($high - 1 - $low) * ($targ_ord - $low_ord)
                        / ($high_ord - $low_ord) );
        my $new_ch = substr( $array->[$cur], $common, 1 );
        my $new_ord = ord( $new_ch );

        if ($new_ord < $targ_ord
                || ($new_ord == $targ_ord
                    && $array->[$cur] lt $target) ) {
            $low  = $cur+1;       # too small, try higher
            $low_ch  = substr( $array->[$low], $common, 1 );
            $low_ord = ord( $low_ch );
        } else {
            $high = $cur;         # not too small, try lower
            $high_ch  = $new_ch;
            $high_ord = $new_ord;
        }
    }
    return $low;
}

Binary Search in a Tree

The binary tree data structure was introduced in Chapter 2. As long as the tree is kept balanced, finding an element in a tree takes O(log N) time, just like binary search in an array. Even better, it only takes O(log N) time to perform an insert or delete operation, which is a lot less than the O(N) required to insert or delete an element in an array.
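
As a reminder of what that lookup looks like, here is a minimal sketch; it assumes each node is a hash with key, value, left, and right fields, which may differ from the tree-building code used elsewhere in the book:

# $value = tree_find( $root, $target )
#      Each node is assumed to be a hash reference:
#      { key => ..., value => ..., left => ..., right => ... }
sub tree_find {
    my ($node, $target) = @_;

    while ( defined $node ) {
        if    ( $target lt $node->{key} ) { $node = $node->{left}  }
        elsif ( $target gt $node->{key} ) { $node = $node->{right} }
        else                              { return $node->{value}  }
    }
    return undef;    # not in the tree
}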

Should You Use a List or a Tree for Binary Searching?

Binary searching is O(log N) for both sorted lists and balanced binary trees, so as a first approximation they are equally usable. Here are some guidelines:

  • Use a list when you search the data many times without having to change it. That has a significant savings in space because there’s only data in the structure (no pointers)—and only one structure (little Perl space overhead).

  • Use a tree when addition and removal of elements is interleaved with search operations. In this case, the tree’s greater flexibility outweighs the extra space requirements.

Bushier Trees

Binary trees provide O(log2 N) performance, but it’s tempting to use wider trees—a tree with three branches at each node would have O(log3 N) performance, four branches O(log4 N) performance, and so on. This is analogous to changing a binary search to a proportional search—it changes from a division by two into a division by a larger factor. If the width of the tree is a constant, this does not reduce the order of the running time; it is still O(log N). What it does do is reduce by a constant factor the number of tree nodes that must be examined before finding a leaf. As long as the cost of each of those tree node examinations does not rise unduly, there can be an overall saving. If the tree width is proportional to the number of elements, rather than a constant width, there is an improvement, from O(log N) to O(1). We already discussed using lists and hashes in Section 5.1; they provide “trees” of one level that is as wide as the actual data. Next, though, we’ll discuss bushier structures that do have the multiple levels normally expected of trees.

Lists of Lists

If the key is sparse rather than dense, then sometimes a multilevel array can be effective. Break the key into chunks, and use an array lookup for each chunk. In the portions of the key range where the data is especially sparse, there is no need to provide an empty tree of subarrays—this will save some wasted space. For example, if you were keeping information for each day over a range of years, you might use arrays representing years, which are subdivided further into arrays representing months, and finally into elements for individual days:

# $value = datetab( $table, $date )
# datetab( $table, $date, $newvalue )
#
# Look up (and possibly change) a value indexed by a date.
# The date is of the form "yyyymmdd", year(1990-), month(1-12),
# day(1-31).

sub datetab {
    my ($tab, $date, $value) = @_;
    my ($year, $month, $day) = ($date =~ /^(\d\d\d\d)(\d\d)(\d\d)$/)
        or die "Bad date format $date";

    $year -= 1990;
    --$month; --$day;
    if (@_ < 3) {
        return $tab->[$year][$month][$day];
    } else {
        return $tab->[$year][$month][$day] = $value;
    }
}

You can use a variant on the same technique even if your data is a string rather than an integer. Such a breakdown is done on some Unix systems to store the terminfo database, a directory of information about how to control different kinds of terminals. This terminal information is stored under the directory /usr/lib/terminfo. Accessing files becomes slow if the directory contains a very large number of files. To avoid that slowdown, some systems keep this information under a two-level directory. Instead of the description for vt100 being in the file /usr/lib/terminfo/vt100, it is placed in /usr/lib/terminfo/v/vt100. There is a separate directory for each letter, and each terminal type with that initial is stored in that directory. CPAN uses up to two levels of the same method for storing user IDs—for example, the directory K/KS/KSTAR has the entry for Kurt D. Starsinic.
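
A hypothetical helper for computing that kind of spread-out path (this one splits on the first one and two characters, in the style of the CPAN authors directory; real systems each have their own rules) might look like this:

# $path = spread_path( $base, $name )
#      e.g., spread_path( "authors/id", "KSTAR" )
#      returns "authors/id/K/KS/KSTAR"
sub spread_path {
    my ($base, $name) = @_;
    return join "/", $base, substr($name, 0, 1), substr($name, 0, 2), $name;
}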

B-Trees

Another wide tree algorithm is the B-tree. It uses a multilevel tree structure. In each node, the B-tree keeps a list of pairs of values, one pair for each of its child branches. One value specifies the minimum key that can be found in that branch, the other points to the node for that branch. A binary search through this array can determine which one of the child branches can possibly contain the desired value. A node at the bottom level contains the actual value of the keyed item instead of a list. See Figure 5-1 for the structure of a B-tree.

Figure 5-1. Sample B-tree

B-trees are often used for very large structures such as filesystem directories—structures that must be stored on disk rather than in memory. Each node is constructed to be a convenient size in disk blocks. Constructing a wide tree this way satisfies the main requirement of data stored on file, which is to minimize the number of disk accesses. Because disk accesses are much slower than in-memory operations, we can afford to use more complicated data processing if it saves accesses. A B-tree node, read in one disk operation, might contain references to 64 subnodes. A binary tree structure would require six times as many disk accesses, and those extra disk accesses totally dwarf the cost of the B-tree’s in-memory binary search through the 64 elements.

If you’ve installed Berkeley DB (available at http://www.sleepycat.com/db) on your machine, using B-trees from Perl is easy:

use DB_File;

tie %hash, "DB_File", $filename, $flags, $mode, $DB_BTREE;

This binds %hash to the file $filename, which keeps its data in B-tree format. You add or change items in the file simply by performing normal hash operations. Examine perldoc DB_File for more details. Since the data is actually in a file, it can be shared with other programs (or used by the same program when run at different times). You must be careful to avoid concurrent reads and writes, either by never running multiple programs at once if one of them can change the file, or by using locks to coordinate concurrent programs. There is an added bonus: unlike a normal Perl hash, you can iterate through the elements of %hash (using each, keys, or values) in order, sorted by the string value of the key.

The DB_File module, by Paul Marquess, has another feature: if the value of $filename is undefined when you tie the hash to the DB_File module, it keeps the B-tree in memory instead of in a file.
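
For example, here is a short sketch of that sorted iteration. The file name and sample data are made up; the O_RDWR and O_CREAT flags come from the Fcntl module.

use DB_File;
use Fcntl;

my %by_name;
tie %by_name, "DB_File", "names.db", O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot open names.db: $!";

$by_name{$_} = length $_ for qw( Macdonald Orwant Hietaniemi );

# Unlike an ordinary hash, a DB_BTREE hash iterates in sorted key order.
foreach my $name ( keys %by_name ) {
    print "$name => $by_name{$name}\n";
}

untie %by_name;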

Alternatively, you can keep B-trees in memory using Mark-Jason Dominus’ BTree module, which is described in The Perl Journal, Issue #8. It is available at http://www.plover.com/~mjd/perl/BTree/BTree.pm.

Here’s an example showing typical hash operations with a B-tree:

use BTree;

my $tree = BTree->new( B => 20 );

# Insert a few items.
while ( my ( $key, $value ) = each %hash ) {
    $tree->B_search(
        Key    => $key,
        Data   => $value,
        Insert => 1 );
}

# Test whether some items are in the tree.
foreach ( @test ) {
    defined $tree->B_search( Key => $_ )
        ? process_yes($_)
        : process_no($_);
}

# Update an item only if it exists, do nothing if it doesn't.
$tree->B_search(
    Key     => 'some key',
    Data    => 'new value',
    Replace => 1 );

# Create or update an item whether it exists or not.
$tree->B_search(
    Key     => 'another key',
    Data    => 'a value',
    Insert  => 1,
    Replace => 1 );

Hybrid Searches

If your key values are not consistently distributed, you might find that a mixture of search techniques is advantageous. That familiar address book uses a sorted list (indexed by the initial letter) and then a linear, unsorted list within each page.

The example that ruined the proportional search (the array that included numbers from 1 through 1,000,000 as well as 1,000,000,000,000) would work really well if it used a three-level structure. A hybrid search would replace the binary search with a series of checks. The first check would determine whether the target was the Saganesque 1,000,000,000,000 (and return its index), and a second check would determine if the number was out of range for 1 .. 1,000,000 (saying “not found”). Otherwise, the third level would return the number (which is its own index in the array):

sub sagan_and_a_million {
    my $desired = shift;

    return 1_000_001 if $desired == 1_000_000_000_000;
    return undef if $desired < 0 || $desired > 1_000_000;
    return $desired;
}

This sort of search structure can be used in two situations. First, it is reasonable to spend a lot of effort to find the optimal structure for data that will be searched many times without modification. In that case, it might be worth writing a routine to discover the best multilevel organization. The routine would use lists for ranges in which the key space was completely filled, proportional search for areas where the variance of the keys was reasonably small, bushy trees or binary search lists for areas with large variance in the key distribution. Splitting the data into areas effectively would be a hard problem.

Second, the data might lend itself to a natural split. For example, there might be a top level indexed by company name (using a hash), a second level indexed by year (a list), and a third level indexed by company division (another hash), with gross annual profit as the target value:

$profit = $gross->{$company}[$year]{$division};

Perhaps you can imagine a tree structure in which each node is an object that has a method for testing a match. As the search progresses down the tree, entirely different match techniques might be used at each level.

Lookup Search Recommendations

Choosing a search algorithm is intimately tied to choosing the structure that will contain your data collection. Consider these factors as you make your choices:

  • What is the scale? How many items are involved? How many searches will you be making? A few? Thousands? Millions? 10^100?

    When the scale is large, you must base your choice on performance. When the scale is small, you can instead base your choice on ease of writing and maintaining the program.

  • What operations on the data collection will be interleaved with search operations?

    When a data collection will be unchanged over the course of many searches, you can organize the collection to speed the searches. Usually that means sorting it. Changing the collection, by adding new elements or deleting existing elements, makes maintaining an optimized organization harder. But there can be advantages to changing the collection. If an item has been searched for and found once, might it be requested again? If not, it could be removed from the collection; if you can remove many items from the structure in that way, subsequent searches will be faster. If the search can repeat, is it likely to do so? If it is especially likely to repeat, it is worth some effort to make the item easy to find again—this is called caching. You cache when you keep a recipe file of your favorite recipes. Perl caches object methods for inherited classes so that after it has found one, it remembers its location for subsequent invocations. (A small caching sketch follows this list.)

  • What form of search will you be using?

    Single key

    Find the element that matches a value.

    Key range

    Find all the elements that are within a range of values.

    Order

    Find the element with the smallest (or largest) value.

    Multiple keys

    Find the element that matches a value, but match against different parts of the element on different searches (e.g., search by name, postal code, or customer number). This can be a real problem, since having your data sorted by customer number doesn’t help at all when you are searching by name.
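
Here is the small caching sketch promised above. The expensive_lookup() routine is hypothetical; it stands in for whatever costly search you want to avoid repeating.

my %answer_cache;

# Same interface as expensive_lookup(), but remembers every answer so
# that a repeated request never repeats the work.
sub cached_lookup {
    my $key = shift;

    $answer_cache{$key} = expensive_lookup($key)
        unless exists $answer_cache{$key};

    return $answer_cache{$key};
}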

Table 5-1 lists a number of viable data structures and their fitness for searching.

Table 5-1. Best Data Structures and Algorithms for Searching

(Each operation line gives: Operation, Implementation, Cost.)

Data Structure: list (unsorted)
Recommended Use: small scale tasks (including rarely used alternate search keys)
    add                         push                            O(1)
    delete from end             pop, shift                      O(1)
    delete arbitrary element    splice                          O(N)
    all searches                linear search                   O(N)

Data Structure: list (indexed by key)
Recommended Use: when the key used for searching is a small unique positive
integer (or can easily be mapped to one)
    add/delete/key search       array element operations        O(1)
    range search                array slice                     O(size of range)
    smallest                    first defined element           O(1) (dense array), O(N) (sparse array)

Data Structure: list (sorted)
Recommended Use: when there are range searches (or many single key searches)
and few adds (or deletes)
    add/delete                  binary search; splice           O(N)
    key search                  binary search                   O(log N)
    range searches              binary range search             O(log N)
    smallest                    first element                   O(1)

Data Structure: list (binary heap)
Recommended Use: small to medium scale tasks, only search is for smallest,
no random deletes
    add                         push; heapup                    O(log N)
    delete smallest             exchange; heapdown              O(log N)
    delete known element        exchange; heapup or heapdown    O(log N)
    smallest                    first element                   O(1)

Data Structure: object (Fibonacci heap)
Recommended Use: large scale tasks, only search is for smallest
    add                         add method                      O(1)
    delete smallest             extract_minimum method          O(log N)
    delete known element        delete method                   O(log N)
    smallest                    minimum method                  O(1)

Data Structure: hash (indexed by key)
Recommended Use: single key and order-independent searches
    add/delete/key search       hash element operations         O(1)
    range search, smallest      linear search                   O(N)

Data Structure: hash and sorted list
Recommended Use: single key searches mixed with order dependent searches,
well handled by having both a hash and a sorted list
    add/delete                  hash, plus binary search and splice    O(N)
    key search                  hash element operations         O(1)
    range search, smallest      binary search                   O(log N)

Data Structure: balanced binary tree
Recommended Use: many elements (but still able to fit into memory), with
very large numbers of searches, adds, and deletes
    add                         bal_tree_add                    O(log N)
    delete                      bal_tree_del                    O(log N)
    key/range search            bal_tree_find                   O(log N)
    smallest                    follow left link to end         O(log N)

Data Structure: external files
Recommended Use: when the data is too large to fit in memory, or is large and
long-lived, keep it in a file. A sorted file allows binary search on the file.
A dbm or B-tree file allows hash access conveniently. A B-tree also allows
ordered access for range operations.
    various                                                     disk I/O

Table 5-1 gives no recommendations for searches made on multiple, different keys. Here are some general approaches to dealing with multiple search keys:

  • For small scale collections, using a linear search is easiest.

  • When one key is used heavily and the others are not, choose the best method for that heavily used key and fall back to linear search for the others.

  • When multiple keys are used heavily, or if the collection is so large that linear search is unacceptable when an alternate key is used, you should try to find a mapping scheme that converts your problem into separate single key searches. A common method is to use an effective method for one key and maintain hashes to map the other keys into that one primary key. When you have multiple data structures like this, there is a higher cost for changes (adds and deletes) since all of the data structures must be changed.
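
For example, here is a sketch of that mapping approach. The record layout and the %by_number and %number_by_name structures are hypothetical; customer number is the primary key, and a secondary hash maps names onto it.

# Primary structure: customer records keyed by customer number.
my %by_number;        # $by_number{$cust_no} = { name => ..., number => ..., ... }

# Secondary index: maps a name to the primary key.
my %number_by_name;   # $number_by_name{$name} = $cust_no

sub add_customer {
    my $record = shift;
    $by_number{ $record->{number} }    = $record;
    $number_by_name{ $record->{name} } = $record->{number};
}

sub find_by_name {
    my $name    = shift;
    my $cust_no = $number_by_name{$name};
    return defined $cust_no ? $by_number{$cust_no} : undef;
}

# Both structures must be updated together on a delete.
sub delete_customer {
    my $cust_no = shift;
    my $record  = delete $by_number{$cust_no};
    delete $number_by_name{ $record->{name} } if $record;
    return $record;
}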

Generative Searches

Until now, we’ve explored means of searching an existing collection of data. However, some problems don’t lend themselves to this model—they might have a large or infinite search space. Imagine trying to find where your phone number first occurs in the decimal expansion of π. The search space might be unknowable—you don’t know what’s around the corner of a maze until you move to a position where you can look; a doctor might be uncertain of a diagnosis until test results arrive. In these cases, it’s necessary to compute possible solutions during the course of the search, often adapting the search process itself as new information is learned.

We call these searches generative searches, and they’re useful for problems in which areas of the search space are unknown (for example, if they interact autonomously with the real world) or where the search space is so immense that it can never be fully investigated (such as a complicated game or all possible paths through a large graph).

In one way, analysis of games is more complicated than other searches. In a game, there is alternation of turns by the players. What you consider a “good” move depends upon whether it will happen on your turn or on your opponent’s turn, while nongame search operations tend to strive for the same goal each step of the way. Often, the alternation of goals, combined with being unable to control the opponent’s moves, makes the search space for game problems harder to organize.

In this chapter, we use games as examples because they require generative search and because they are familiar. This does not mean that generative search techniques are only useful for games—far from it. One example is finding a path. The list of routes tells you which locations are adjacent to your starting point, but then you have to examine those locations to discover which one might help you progress toward your eventual goal. There are many optimizing problems in this category: finding the best match for assigning production to factories might depend upon the specific manufacturing abilities of the factories, the abilities required by each product, the inventory at hand at each factory, and the importance of the products. Generative searching can be used for many specific answers to a generic question: “What should I do next?”

We will study the following techniques:

 

  • Exhaustive search

  • Minimax

  • Pruning

  • Alpha-beta pruning

  • Killer move

  • Transpose table

  • Greedy algorithms

  • Branch and bound

  • A*

  • Dynamic programming

Game Interface

Since we are using games for examples, we’ll assume a standard game interface for all game evaluations. We need two types of objects for the game interface—a position and a move.

A position object will contain data to define all necessary attributes of the game at one instant during a particular game (where pieces are located on the board, whose turn it is, etc.). It must have the following methods:

prepare_moves

Prepares to generate all possible moves from the position (returning undef if there are no legal moves from the position, i.e., it is a final position).

next_move

Returns a move object for the next of the possible moves (returning undef if all of the possible moves have already been returned since the last call to prepare_moves).

make_move(move)

Returns a new position object, the result of making that particular move from the current position.

evaluate

Returns a numerical rating for the position, giving the value for the player who most recently moved. Negating this value changes it to the viewpoint of the opponent.

best_rating

Returns a constant value that exceeds the highest result that could be returned by evaluate—the best possible win. The negation of this value is lower than the worst possible loss.

description

Returns a printable string describing the position.

A move object is much simpler. It must contain data sufficient to define all necessary attributes of a move, as determined by the needs of the position object’s make_move method, but the internal details of a move object are unimportant as far as the following algorithms are concerned (in fact, a move need not be represented as an object at all unless the make_move method expects it to be).

Here is a game interface definition for tic-tac-toe:

# tic-tac-toe game package
package tic_tac_toe;

    $empty = ' ';
    @move = ( 'X', 'O' );
    # Map X and O to 0 and 1.
    %move = ( 0=>0, 1=>1, 'X'=>0, 'O'=>1 );

    # new( turn, board )
    #
    # To create a new tic-tac-toe game:
    #    tic_tac_toe->new( )
    #
    # This routine is also used internally to create the position
    # that will occur after a move, switching whose turn it is and
    # adding a move to the board:
    #    $board = ... adjust current board for the selected move
    #    tic_tac_toe->new( 1 - $self->{turn}, $board )
    sub new {
        my ( $pkg, $turn, $board ) = @_;
        $turn = 0 unless defined $turn;
        $turn = $move{$turn};
        $board = [ ($empty) x 9 ] unless defined $board;
        my $self = { turn => $turn, board => $board };
        bless $self, $pkg;
        $self->evaluate_score;

        return $self;
    }

    # We cache the score for a position, calculating it once when
    # the position is first created.  Give the value from the
    # viewpoint of the player who just moved.
    #
    # scoring:
    #       100 win for current player (-100 for opponent)
    #        10 for each unblocked 2-in-a-row (-10 for opponent)
    #         1 for each unblocked 1-in-a-row (-1 for opponent)
    #         0 for each blocked row
    sub evaluate_score {
        my $self  = shift;
        my $me    = $move[1 - $self->{turn}];
        my $him   = $move[$self->{turn}];
        my $board = $self->{board};
        my $score = 0;

        # Scan all possible lines.
        foreach $line (
                [0,1,2], [3,4,5], [6,7,8],   # rows
                [0,3,6], [1,4,7], [2,5,8],   # columns
                [0,4,8], [2,4,6] )           # diagonals
        {
            my ( $my, $his );
            foreach (@$line) {
                my $owner = $board->[$_];

                ++$my if $owner eq $me;
                ++$his if $owner eq $him;
            }

            # No score if line is blocked.
            next if $my && $his;

            # Lost.
            return $self->{score} = -100 if $his == 3;

            # Win can't really happen, opponent just moved.
            return $self->{score} = 100 if $my == 3;

            # Count 10 for 2 in line, 1 for 1 in line.
            $score +=
                ( -10, -1, 0, 1, 10 )[ 2 + $my - $his ];
        }

        return $self->{score} = $score;
    }

    # Prepare to generate all possible moves from this position.
    sub prepare_moves {
        my $self = shift;

        # None possible if game is already won.
        return undef if abs($self->{score}) == 100;

        # Check whether there are any possible moves:
        $self->{next_move} = -1;
        return undef unless defined( $self->next_move );

        # There are.  Next time we'll return the first one.
        return $self->{next_move} = -1;
    }

    # Determine the next move possible from the current position.
    # Return undef when there are no more moves possible.
    sub next_move {
        my $self = shift;

        # Continue returning undef if we've already finished.
        return undef unless defined $self->{next_move};

        # Check each square from where we last left off, skipping
        # squares that are already occupied.
        do {
            ++$self->{next_move}
        } while $self->{next_move} <= 8
            && $self->{board}[$self->{next_move}] ne $empty;

        $self->{next_move} = undef if $self->{next_move} == 9;
        return $self->{next_move};
    }

    # Create the new position that results from making a move.
    sub make_move {
        my $self = shift;
        my $move = shift;

        # Copy the current board, changing only the square for the move.
        my $myturn = $self->{turn};
        my $newboard = [ @{$self->{board}} ];
        $newboard->[$move] = $move[$myturn];

        return tic_tac_toe->new(1 - $myturn, $newboard);
    }

    # Get the cached evaluation of this position.
    sub evaluate {
        my $self = shift;

        return $self->{score};
    }

    # Display the position.
    sub description {
        my $self = shift;
        my $board = $self->{board};
        my $desc = "@$board[0..2]\n@$board[3..5]\n@$board[6..8]\n";
        return $desc;
    }

    sub best_rating {
        return 101;
    }

Exhaustive Search

The technique of generating and analyzing all of the possible states of a situation is called exhaustive search. An exhaustive search is the generative analog of linear search—try everything until you succeed or run out of things to try. (Exhaustive search has also been called the British Museum Search, based on the light-hearted idea that the only way to find the most interesting object in the British Museum is to plod through the entire museum and examine everything. If your data structure, like the British Museum, does not order its elements according to how interesting they are, this technique may be your only hope.)

Consider a program that plays chess. If you were determined to use a lookup search, you might want to start by generating a data structure containing all possible chess positions. Positions could be linked wherever a legal move leads from one position to another. Then, identify all of the final positions as “win for white,” “win for black,” or “tie,” labeling them W, B, and T, respectively. In addition, when a link leads to a labeled position, label the link with the same letter as the position it leads to.

Next, you’d work backwards from identified positions. If a W move is available from a position where it is white’s turn to move, label that position W too (and remember the move that leads to the win). That determination can be made regardless of whether the other moves from that position have been identified yet—white can choose to win rather than move into unknown territory. (A similar check finds positions where it is black’s move and a B move is available.) If there is no winning move available, a position can only be identified if all of the possible moves have been labeled. In such a case, if any of the available moves is T, so is the position; but if all of the possible moves are losers, so is the position (i.e., B if it is white’s turn, or W if it is black’s turn). Repeat until all positions have been labeled.

Now you can write a program to play chess with a lookup search—simply look up the current position in this data structure and make the preferred move recorded there, an O(1) operation. Congratulations. You have just solved chess. White’s opening move will be labeled W, T, or B. Quick, publish your answer—no one has determined yet whether white has a guaranteed win (although it would come as quite a shock if you discovered that black does).

There are a number of problems, however. Obviously, we skipped a lot of detail—you’d need to use a number of algorithms from Chapter 8, to manage the board positions and the moves between them. We’ve glossed over the possibilities of draws that occur because of repeated positions—more graph algorithms to find loops so that we can check them to see whether either player would ever choose to leave the loop (because he or she would have a winning position).

But the worst problem is that there are a lot of positions. For white’s first move, there are 20 different possibilities. Similarly, for black’s first move. After that, the number of possible moves varies—as major pieces are exposed, more moves become available, but as pieces are captured, the number decreases.

A rough estimate says that there are about 20 choices for each possible turn, and a typical game lasts about 50 moves, which gives 20^50 positions (or about 10^65). Of course, there are lots of possible games that go much longer than the “typical” game, so this estimate is likely quite low.[24] If we guess that a single position can be represented in 32 bytes (8 bytes for a bitmap showing which squares are occupied, 4 bits for each occupied square to specify which piece is there, a few bits for whose turn it is, the number of times the position has been reached, and “win for white,” “win for black,” “tie,” or “not yet determined,” and a very optimistic assumption that the links to all of the possible successor positions can be squeezed into the remaining space), then all we need is about 10^56 32-gigabyte disk drives to store the data. With only an estimated 10^70 protons in the universe, that may be difficult.
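
(To check the arithmetic: log10 20 is about 1.3, so 20^50 is about 10^65 positions; at 32 bytes per position that is roughly 3 x 10^66 bytes, and a 32-gigabyte drive holds about 3 x 10^10 bytes, which leaves on the order of 10^56 drives.)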

It will take quite a few rotations of our galaxy to generate all of those positions, so you can take advantage of bigger disk drives as they become available. Of course, the step to analyze all of the positions will take a bit longer. In the meantime, you might want to use a less complete analysis for your chess program.

The exponential growth of the problem’s size makes that technique unworkable for chess, but it is tolerable for tic-tac-toe:

use tic_tac_toe;        # defined earlier in this chapter

# exhaustive analysis of tic-tac-toe
sub ttt_exhaustive {

    my $game = tic_tac_toe->new( );

    my $answer = ttt_analyze( $game );
    if ( $answer > 0 ) {
        print "Player 1 has a winning strategy\n";
    } elsif ( $answer < 0 ) {
        print "Player 2 has a winning strategy\n";
    } else {
        print "Draw\n";
    }
}

# $answer = ttt_analyze( $game )
#    Determine whether the other player has won.  If not,
#    try all possible moves (from $avail) for this player.
sub ttt_analyze {
    my $game = shift;

    unless ( defined $game->prepare_moves ) {
        # No moves possible.  Either the other player just won,
        # or else it is a draw.
        my $score = $game->evaluate;
        return -1 if $score < 0;
        return 0;
    }

    # Find result of all possible moves.
    my $best_score = -1;

    while ( defined( $move = $game->next_move ) ) {
        # Make the move negating the score
        #   - what's good for the opponent is bad for us.
        my $this_score = - ttt_analyze( $game->make_move( $move ) );

        # evaluate
        $best_score = $this_score if $this_score > $best_score;
    }

    return $best_score;
}

Running this:

ttt_exhaustive();

produces:

Draw

As a comment on just how exhausting such a search can be, the tic-tac-toe exhaustive search had to generate 549,946 different game positions. More than half, 294,778, were partial positions (the game was not yet complete). Less than half, 209,088, were wins for one player or the other. Only a relative few, 46,080, were draw positions—yet with good play by both players, the game is always a draw. This run took almost 15 minutes. A human can analyze the game in about the same time—but not if they do it by exhaustive search.

Exhaustive search can be used for nongame generative searches, too, of course. Nothing about it depends upon the alternating turns common to games. For that matter, the definition of exhaustive search is vague. The exact meaning of “try everything” depends upon the particular problem. Each problem has its own way of trying everything, and often many different ways.

For many problems, exhaustive search is the best known method. Sometimes, it is known to be the best possible method. For example, to find the largest element in an unsorted collection, it is clear that you have to examine every element at least once. When that happens for a problem that grows exponentially, the problem is called intractable. For an intractable problem, you cannot depend on being able to find the best solution. You might find the best solution for some special cases, but generally you have to lower your sights—either accept an imperfect solution or be prepared to have no solution at all when you run out of time. (We’ll describe one example, the Traveling Salesman problem, later in this chapter.)

There are a number of known classes of really hard problems. The worst are called “undecidable”—no correct solution can possibly exist. The best known is the Halting Problem.[25]

There are also problems that are intractable. They are solvable, but all known solutions take exponentially long—e.g., O(2^N). Some of them have been proven to require an exponentially long time to solve. Others are merely believed to require an exponentially long time.

We are not going to list all of the intractable problems—that subject could fill a whole book.[26]

One example of an intractable problem is the Traveling Salesman problem. Given a list of cities and the distances between them, find the shortest route that takes the salesman to each of the cities on the list and then back to his original starting point. An exhaustive search requires checking N! different routes to see which is the shortest. As it happens, exhaustive search is the only method known to solve this problem. You’ll see this problem discussed further in Chapter 8.
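
To make the N! concrete, here is a brute-force sketch. The distance matrix is hypothetical data; the routine fixes the starting city and tries every permutation of the rest.

# ($best_route, $best_length) = tsp_exhaustive( \@dist )
#      $dist->[$i][$j] is the distance between cities $i and $j.
#      Tries every tour that starts and ends at city 0.
sub tsp_exhaustive {
    my $dist = shift;
    my @rest = 1 .. $#$dist;
    my ($best_route, $best_length);

    my $try;                     # declared first so the closure can recurse
    $try = sub {
        my ($route, @remaining) = @_;
        unless (@remaining) {
            # Complete the loop back to the starting city and total it up.
            my @tour   = ( @$route, 0 );
            my $length = 0;
            $length += $dist->[ $tour[$_] ][ $tour[$_ + 1] ]
                for 0 .. $#tour - 1;
            if ( !defined $best_length || $length < $best_length ) {
                ($best_route, $best_length) = ( [@tour], $length );
            }
            return;
        }
        # Try each remaining city as the next stop.
        for my $i (0 .. $#remaining) {
            my @left   = @remaining;
            my ($city) = splice(@left, $i, 1);
            $try->( [ @$route, $city ], @left );
        }
    };
    $try->( [0], @rest );
    undef $try;                  # break the self-reference

    return ($best_route, $best_length);
}

# Example with four hypothetical cities:
my @dist = ( [  0, 10, 15, 20 ],
             [ 10,  0, 35, 25 ],
             [ 15, 35,  0, 30 ],
             [ 20, 25, 30,  0 ] );
my ($route, $length) = tsp_exhaustive( \@dist );
print "@$route: $length\n";      # prints: 0 1 3 2 0: 80

Even at a dozen cities this already means 11! (about 40 million) tours, which is why Chapter 8 returns to the problem with other approaches.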

When a problem is too large for exhaustive search, other approaches can be used. They tend to resemble bushy tree searches. A number of partial solutions are generated, and then one or some of them are selected as the basis for the next generative stage.

For some problems, such approaches can lead to a correct or best possible answer. For intractable problems, however, the only way to be certain of getting the best possible answer is exhaustive search. In these cases, the available alternative approaches only give approximations to the best answer—sometimes with a guarantee that the approximate answer is close to the best answer (for some specific definition of “close”). With other problems all you can do is to try a few different approximations and hope at least one provides a tolerable result. For example, for the Traveling Salesman problem, some solutions form a route by creating chains of nodes with relatively short connections and then choosing the minimum way of joining the endpoints of those chains into a loop. In some cases, Monte Carlo methods can be applied—generating some trial solutions in a random way and selecting the best.[27]

It is not always easy to know whether a particular problem is intractable. For example, it would appear that a close relative of the Traveling Salesman problem would be finding a minimum cost spanning tree—a set of edges that connects all of the vertices with no loops and with minimum total weight for the edges. But, this problem is not intractable; it can be solved rather easily, as you’ll see in Section 8.8.6.2.

Alternatives to Exhaustive Search in Games

Instead of an exhaustive search of the entire game, chess programs typically look exhaustively at only the next few moves and then perhaps look a bit deeper for some special cases. The variety of techniques used for chess can also be used in other programs—not only in other game programs but also in many graph problems.

Minimax

When you consider possible moves, you don’t get excited about what will happen if your opponent makes an obviously stupid move. Your opponent will choose the best move available—his “maximum” move. In turn, you should examine each of your available moves and for each one determine your opponent’s maximum response. Then, you select the least damaging of those maximum responses and select your move that leads to it. This minimum of the maximums strategy is called minimax. (“Let’s see, if I move here I get checkmated, if I move here I lose my queen, or if I move here the worst he can do is exchange knights—I’ll take that third choice.”)

Minimax is often used in game theory. We also used it implicitly earlier, in the exhaustive search when we assumed that black would always choose a “win for black” move if there was one available, and that white would similarly choose a “win for white” move, and that both would prefer a “tie” move to losing if no win were available. That was using minimax with exact values, but you can also use minimax with estimates. Chess programs search as far as time allows, rate the apparent value of the resulting position, and use that rating for the minimax computation. The rating might be wrong since additional moves might permit a significant change in the apparent status of the game.

The minimax algorithm is normally used in situations where response and counterresponse alternate. The following code for the minimax algorithm takes a starting position and a depth. It examines all possible moves from the starting position, but if it fails to find a terminating position after depth moves, it evaluates the position it has reached without examining further moves. It returns the minimax value and the sequence of moves determined to be the minimax.

# Usage:
#    To choose the next move:
#        ($moves,$score) = minimax($position,$depth)
#    You provide a game position object, and a maximum depth
#    (number of moves) to be expanded before cutting off the
#    move generation and evaluating the resulting position.
#    There are two return values:
#     1: a reference to a list of moves (the last element on the
#        list is the position at the end of the sequence - either
#        it didn't look beyond because $depth moves were found, or
#        else it is a terminating position with no moves possible.
#     2: the final score

sub minimax {
    my ( $position, $depth ) = @_;

    # Have we gone as far as permitted or as far as possible?
    if ( $depth-- and defined($position->prepare_moves) ) {
        # No - keep trying additional moves from $position.
        my $move;
        my $best_score = -$position->best_rating;
        my $best_move_seq;

        while ( defined( $move = $position->next_move ) ) {
            # Evaluate the next move.
            my ( $this_move_seq, $this_score ) =
                minimax(
                    $position->make_move($move),
                    $depth );
            # Opponent's score is opposite meaning of ours.
            $this_score = -$this_score;
            if ( $this_score > $best_score ) {
                $best_score = $this_score;
                $best_move_seq = $this_move_seq;
                unshift ( @$best_move_seq, $move );
            }
        }

        # Return the best one we found.
        return ( $best_move_seq, $best_score );

    } else {
        # Yes - evaluate current position, no move to be taken.
        return ( [ $position ], -$position->evaluate );
    }
}

As an example of using this routine, we’ll use the tic-tac-toe game description we defined earlier. We’ll limit the search depth to two half-turns; you’d probably use a higher number if you wanted the program to play well.

use tic_tac_toe;

my $game = tic_tac_toe->new( );

my ( $moves, $score ) = minimax( $game, 2 );
my $my_move = $moves->[0];
print "I move: $my_move
";

This produces:

I move: 4

which is a perfectly reasonable choice of taking the center square as the first move.

Pruning

With a game like chess, you need to continue this analysis for many plies because there can be long chains of moves that combine to produce a result. If you examine every possible move that each player could make in each turn, then you won’t be able to examine many levels of resulting moves. Instead, programs compromise—they examine all possible moves that might be made for the first few turns, but examine only the most promising and the most threatening positions deeply. This act—skipping the detailed analysis of (apparently) uninteresting positions—is called pruning. It requires careful judgment to label a move uninteresting; a simplistic analysis will overlook sacrifices—moves that trade an initial obvious loss for a positional advantage that can be used to recoup the loss later.

Alpha-beta pruning

One form of pruning is especially useful for any adversarial situation. It avoids evaluating many positions, but still returns the same result it would if it had evaluated them all. Suppose you’ve analyzed one of your possible moves and determined that your opponent’s best reply will lead to no change in relative advantage. Now you are about to examine another of your possible moves. If you find that one response your opponent might make leads to the loss of one of your pieces, you need not examine the rest of your opponent’s replies. You don’t care about finding out whether he may be able to checkmate you instead, because you already know that this move is not your best choice. So, you skip further analysis of this move and immediately go on to examine alternate moves that you actually might make.

Of course, the analysis of the opponent’s moves can use the same strategy. The algorithm that implements this is a slight variation of minimax called alpha-beta pruning. It uses two additional parameters, alpha and beta, to record the lower and upper cutoff bounds that are to be applied. The caller doesn’t have to provide these parameters; they are initialized internally. Like minimax, this routine is recursive. Note that on the recursive calls, the parameters $alpha and $beta are swapped and negated. That corresponds to the change of viewpoint as it becomes the other player’s turn to play.

# Usage:
#    To choose the next move:
#        ($moves,$score) = ab_minimax($position,$depth)
#    You provide a game position object, and a maximum depth
#    (number of moves) to be expanded before cutting off the
#    move generation and evaluating the resulting position.

sub ab_minimax {
    my ( $position, $depth, $alpha, $beta ) = @_;

    defined ($alpha) or $alpha = -$position->best_rating;
    defined ($beta)  or $beta  =  $position->best_rating;

    # Have we gone as far as permitted or as far as possible?
    if ( $depth-- and defined($position->prepare_moves) ) {
        # no - keep trying additional moves from $position
        my $move;
        my $best_score = -$position->best_rating;
        my $best_move_seq;
        my $alpha_cur = $alpha;

        while ( defined($move = $position->next_move) ) {
            # Evaluate the next move.
            my ( $this_move_seq, $this_score ) =
                ab_minimax( $position->make_move($move),
                                $depth, -$beta, -$alpha_cur );
            # Opponent's score is opposite meaning from ours.
            $this_score = -$this_score;
            if ( $this_score > $best_score ) {
                $best_score = $this_score;
                $alpha_cur = $best_score if $best_score > $alpha_cur;
                $best_move_seq = $this_move_seq;
                unshift ( @$best_move_seq, $move );

                # Here is the alpha-beta pruning.
                #    - quit when someone else is ahead!
                last if $best_score >= $beta;
            }
        }

        # Return the best one we found.
        return ( $best_move_seq, $best_score );

    } else {
        # Yes - evaluate current position, no move to be taken.
        return ( [ $position ], -$position->evaluate );
    }
}

As an example of using this routine, we’ll again use tic-tac-toe, limiting the search depth to two half-turns (one move by each player):

use tic_tac_toe;

my $game = tic_tac_toe->new( );

my ( $moves, $score ) = ab_minimax( $game, 2 );
my $my_move = $moves->[0];
print "I move: $my_move
";

This produces:

I move: 4

again taking the center square for the first move, but finding it in half the time.

Killer move

A useful search strategy is the killer move strategy. When a sequence of moves is found that produces an overwhelming decision (say, a checkmate) while analyzing one branch of possible moves, the same sequence of moves is checked first in the analysis of the other branches. It may lead to an overwhelming decision there too.

Killer move works especially well with alpha-beta pruning. The quicker your examination finds good bounds on the best and worst possibilities, the more frequently pruning occurs for the rest of the analysis. The time saved by this more frequent pruning can be used to allow deeper searching.

In fact, if the program is written to try shallow analyses first and progressively deeper analyses as time permits, then testing the best and worst moves found in the previous shallower analysis establishes the alpha and beta bounds immediately—unless the deeper analysis uncovers a previously unnoticed loophole.
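
As a sketch of how a killer move might be folded into the move loop, the helper below collects the legal moves through the same prepare_moves/next_move interface used above and simply tries the remembered killer move first whenever it is legal. The ordered_moves name and the $killer bookkeeping are inventions for this illustration.

# Return the legal moves from $position, with the killer move (if any,
# and if legal here) moved to the front of the list.
sub ordered_moves {
    my ( $position, $killer ) = @_;

    $position->prepare_moves;
    my ( $move, @moves );
    while ( defined( $move = $position->next_move ) ) {
        push @moves, $move;
    }

    if ( defined $killer ) {
        @moves = ( ( grep { $_ eq $killer } @moves ),
                   ( grep { $_ ne $killer } @moves ) );
    }
    return @moves;
}

# Inside a search such as ab_minimax, the inner while loop could become:
#     for my $move ( ordered_moves( $position, $killer ) ) { ... }
# remembering as the new killer any move whose score triggers a cutoff.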

Transpose tables

You may recall that the exhaustive search of tic-tac-toe examined 549,946 game positions. The tic-tac-toe board has 9 squares and each square can contain one of three different values—blank, X, or O. That means that there are a maximum of 3^9, or 19,683, possible board states. In fact, there are even fewer board states since the number of X squares must be either equal to or one greater than the number of O squares. That program examined most board positions repeatedly since it is possible to arrive at a particular position in many ways—by having the players occupy the same squares in a different order.

A common optimization uses a transpose table. When a move is being considered, the resulting position is checked against a cache of positions that have been considered previously. If it has already been examined, the cached result is returned without repeating the analysis. If we convert the exhaustive tic-tac-toe analysis to use a transpose table, we reduce the running time from 15 minutes to 12 seconds. The computer is now solving the game faster than a human could. The number of positions analyzed drops from 549,946 down to 16,168 (10,690 of them were found in the transpose table; only 5,478 actually had to be examined). Here’s the changed code:

use tic_tac_toe;        # defined earlier in this chapter

# exhaustive analysis of tic-tac-toe using a transpose table
sub ttt_exhaustive_table {

    my $game = tic_tac_toe->new( );

    my $answer = ttt_analyze_table( $game );
    if ( $answer > 0 ) {
        print "Player 1 has a winning strategy
";
    } elsif ( $answer < 0 ) {
        print "Player 2 has a winning strategy
";
    } else {
        print "Draw
";
    }
}

@cache = ( );

# $answer = ttt_analyze_table( $game )
#    Determine whether the other player has won.  If not,
#    try all possible moves for this player.
sub ttt_analyze_table {
    my $game = shift;
    my $move = shift;

    # Compute id - the index for the current position.
    #    Treat the board as a 9-digit base 3 number.  Each square
    #    contains 0 if it is unoccupied, 1 or 2 if it has been
    #    taken by one of the players.
    if( ! defined $move ) {
        # Empty board.
        $game->{id} = 0;
    } else {
        # A move is being tested, add its value to this id of
        # the starting position.
        my $id = $game->{id} + ($game->{turn}+1)*(3**$move);
        if( defined( my $score = $cache[$id] ) ) {
            # That resulting position was previously analyzed.
            return -1 if $score < 0;
            return 0;
        }
        my $prevgame = $game;
        # A new position - analyze it.
        $game = $game->make_move( $move );
        $game->{id} = $id;
    }

    unless ( defined $game->prepare_moves ) {
        # No moves possible.  Either the other player just won,
        # or else it is a draw.
        my $score = $game->evaluate;
        $cache[$game->{id}] = $score;
        return -1 if $score < 0;
        return 0;
    }

    # Find result of all possible moves.
    my $best_score = -1;

    while ( defined( $move = $game->next_move ) ) {
        # Make the move negating the score
        #   - what's good for the opponent is bad for us.
        my $this_score = - ttt_analyze_table( $game, $move );

        # evaluate
        $best_score = $this_score if $this_score > $best_score;
    }

    $cache[$game->{id}] = $best_score;
    return $best_score;
}

Of course, the revised program still determines that the game is a draw after best play.

A transpose table can be used with minimax or alpha-beta pruning, not just with exhaustive search. For a game like chess, where it is easy to arrive at the same position in different ways (like re-ordering the same sequence of moves), this strategy is very valuable.

Advanced pruning strategies

There are additional pruning strategies derived from alpha-beta pruning. If you invoke the alpha-beta search with a narrower set of bounds than the “infinite” bounds used earlier, it can prune much more frequently. The result from such a search, however, is no longer necessarily exact. With the bounds alpha and beta and the returned result, there are three possibilities:

If                       Then
alpha < result < beta    result is the exact minimax value
result <= alpha          result is an upper bound on the minimax value
beta <= result           result is a lower bound on the minimax value

When the result provides only a bound instead of an exact answer, it is necessary to carry out another search with different alpha and beta bounds. This sounds expensive, but it actually can be faster. Because alpha and beta start closer together, there is immediate opportunity for pruning. Using a transpose table, the second (and any subsequent) search will only have to search positions that weren’t searched in a previous attempt. See http://www.cs.vu.nl/~aske/mtdf.html for a more detailed description of this algorithm.
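
A sketch of this idea, commonly called an aspiration window, is shown below. It calls the ab_minimax routine defined earlier with a deliberately narrow window around a guessed value and falls back to a full-width search only when the result turns out to be just a bound; $guess and $window are hypothetical tuning values.

# Search first with a narrow window around $guess; if the result is
# only a bound, re-search with the full window.
sub aspiration_search {
    my ( $position, $depth, $guess, $window ) = @_;

    my ( $moves, $score ) =
        ab_minimax( $position, $depth, $guess - $window, $guess + $window );

    if ( $score <= $guess - $window or $score >= $guess + $window ) {
        # Only a bound - repeat with the "infinite" bounds.
        ( $moves, $score ) = ab_minimax( $position, $depth );
    }
    return ( $moves, $score );
}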

Other strategies

The transpose table described earlier can be used in further ways. The transpose table can’t provide an exact answer if the value in it was computed by traversing a shallower depth than is currently required. However, it can still be used to give an estimate of the answer. By first trying the move with the best estimate, there is a good chance of establishing strong pruning bounds quickly. This method is a way of remembering information about positions from one round to another, which is more valuable than remembering a single killer move.

While alpha-beta pruning and transpose tables are risk-free, there are other pruning strategies that are risky—they are specific to the particular game and are more like the rules of thumb that a human expert might use. One example is the opening book. Most chess programs use a library of opening moves and responses. As long as the game is still within the pre-analyzed boundaries of this book, only moves listed within the book are considered. Until a position that is not in the book is reached, the program does no searching at all. Other strategies involve searching to a deeper level for specialized cases like a series of checks.

Some games, like tic-tac-toe, are symmetrical, so there are many positions that are equivalent to each other, varying only by a reflection or a rotation of the board. (In chess, there is rarely any point in checking for positions that are symmetric copies of each other—the one-directional movement of pawns and the asymmetry of having a king and a queen instead of two identical pieces makes symmetrically equivalent positions quite rare.) For games with such symmetry, where symmetrical variations are likely to be analyzed, it may be helpful to map positions cached in the transpose table into a particular one of its symmetrical variants. Then, the transpose table can provide an immediate result for all of those symmetric variants too.
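
Here is a sketch of such a mapping for tic-tac-toe. It assumes the board is represented as a 9-character string with squares numbered 0 through 8, left to right and top to bottom—an illustrative representation, not the one used by the tic_tac_toe package—and picks the lexically smallest of the eight symmetric variants as the cache key.

# The eight symmetries of the 3x3 board, expressed as permutations of
# the square indices 0..8.
my @symmetries = (
    [ 0,1,2,3,4,5,6,7,8 ],    # identity
    [ 2,5,8,1,4,7,0,3,6 ],    # rotate 90 degrees
    [ 8,7,6,5,4,3,2,1,0 ],    # rotate 180 degrees
    [ 6,3,0,7,4,1,8,5,2 ],    # rotate 270 degrees
    [ 2,1,0,5,4,3,8,7,6 ],    # mirror left-right
    [ 6,7,8,3,4,5,0,1,2 ],    # mirror top-bottom
    [ 0,3,6,1,4,7,2,5,8 ],    # mirror main diagonal
    [ 8,5,2,7,4,1,6,3,0 ],    # mirror anti-diagonal
);

# Map a board string to a single representative of its symmetry class.
sub canonical {
    my $board = shift;
    my @variants;
    for my $map (@symmetries) {
        push @variants, join '', map { substr( $board, $_, 1 ) } @$map;
    }
    return ( sort @variants )[0];
}

# Use canonical($board) rather than the raw board as the transpose
# table key, and all eight symmetric variants share one cache entry.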

Nongame Dynamic Searches

Game situations differ from other generative search situations in that they have adversaries. This makes the analysis more complicated because the goal flips every half-turn. Some algorithms, like minimax, apply only to such game situations. Other algorithms, like exhaustive search, can be applied to any type of situation. Still others apply only when, unlike in games, there is a single fixed goal.

All kinds of dynamic searches have to concern themselves with the search order among multiple choices. There is actually a continuum of ordering techniques. At one extreme is depth-first search; at the other extreme is breadth-first search.

They differ in the order that possibilities are examined. A breadth-first search examines all of the possible first choices, then all of the possible second choices (from any of the first choices), and so on. This is much like the way that an incoming tide covers a beach, extending its coverage across the entire beach with each wave, and then a bit further with each subsequent wave. A depth-first search, on the other hand, examines the first possible first choice, the first possible second choice (resulting from that first choice), the first third choice (resulting from that second choice), and so on. This is more like an octopus examining all of the nooks and crannies in one coral opening before moving on to check whether the next might contain a tasty lunch. The two searches are shown in Figure 5-2.

Figure 5-2. Breadth-first versus depth-first

The minimax algorithm is necessarily depth-first to some extent—it examines a single sequence of moves all of the way down to a final position (or the maximum depth). Then, it evaluates that position and backs up to try the next choice for the final move. The choice of depth controls the extent to which it is depth first. We already saw how chess has an exponentially huge number of positions—a completely depth-first traversal would never accomplish anything useful in a reasonable amount of time. Using a depth of 1, then a depth of 2, and so on, actually turns it into a breadth-first series of searches.
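
A sketch of that breadth-first series of searches—usually called iterative deepening—appears below. It repeatedly calls the minimax routine defined earlier with an increasing depth; $time_limit is a hypothetical budget in seconds, and this naive version only checks the clock between iterations.

# Repeat the search one ply deeper each time until the time budget
# runs out, keeping the result of the deepest completed search.
sub iterative_deepening {
    my ( $position, $time_limit ) = @_;
    my $deadline = time() + $time_limit;

    my ( $moves, $score );
    for ( my $depth = 1; ; $depth++ ) {
        ( $moves, $score ) = minimax( $position, $depth );
        last if time() >= $deadline;
    }
    return ( $moves, $score );
}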

Whether depth-first or breadth-first is a better answer depends upon the particular problem. If most choices lead to an acceptable answer and at about the same depth, then a depth-first search will generally be much faster—it finds one answer quickly while a breadth-first search will have almost found many answers before it completely finds any one answer. On the other hand, if there are huge areas that do not contain an acceptable answer, then breadth-first is safer. Suppose that you wanted to determine whether, starting on your home web page, you could follow links and arrive at another page on your site. Going depth-first takes the chance that you may happen to reach the "my favorite links" page and never get back to your own site again. This would be like having that poor octopus try to completely examine a hole that led down to the bottom of the Marianas Trench and never finding the smorgasbord of tender morsels in the shallower hole a few meters away. You will normally prefer breadth-first—it is rare to use depth-first without a limit (such as the depth argument to our minimax implementation).

Here are two routines for depth-first and breadth-first searches. They use an interface similar to that of the minimax routines shown earlier. They require that a position object provide one additional method, is_answer, which returns true if the position is a final answer to the original problem.

# $final_position = depth_first( $position )
sub depth_first {
    my @positions = shift;

    while ( my $position = pop( @positions ) ) {
        return $position if $position->is_answer;

        # If this was not the final answer, try each position that
        # can be reached from this one.
        $position->prepare_moves;
        my $move;
        while ( $move = $position->next_move ) {
            push ( @positions, $position->make_move($move) );
        }
    }
    # No answer found.
    return undef;
}

# $final_position = breadth_first( $position )
sub breadth_first {
    my @positions = shift;

    while ( my $position = shift( @positions ) ) {
        return $position if $position->is_answer;

        # If this was not the final answer, try each position that
        # can be reached from this one.
        $position->prepare_moves;
        my $move;
        while ( $move = $position->next_move ) {
            push ( @positions, $position->make_move($move) );
        }
    }
    # No answer found.
    return undef;
}

The two routines look very similar. The only difference is whether positions to examine are extracted from @positions using a shift or a pop. Treating the array as a stack or a queue determines the choice between depth and breadth. Other algorithms use this same structure but with yet another ordering technique to provide an algorithm that is midway between these two. We will see a couple of them shortly.
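
To make the shared structure explicit, here is a sketch of a single routine in which the ordering policy is passed in as a code reference that removes the next position to examine; the generic_search name is our own for this illustration.

# $final_position = generic_search( $start_position, \&extract )
#    The $extract code ref removes and returns the next position to
#    examine from the array it is given, or undef when none are left.
sub generic_search {
    my ( $start, $extract ) = @_;
    my @positions = ( $start );

    while ( my $position = $extract->( \@positions ) ) {
        return $position if $position->is_answer;

        # Not the final answer - queue every position reachable from here.
        $position->prepare_moves;
        my $move;
        while ( $move = $position->next_move ) {
            push @positions, $position->make_move($move);
        }
    }
    # No answer found.
    return undef;
}

# depth-first:   generic_search( $start, sub { pop   @{ $_[0] } } );
# breadth-first: generic_search( $start, sub { shift @{ $_[0] } } );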

Greedy algorithms

A greedy algorithm works by taking the best immediately available action. If you are greedy, you always grab the biggest piece of cake you can, without worrying that you’ll take so long eating it that you’ll miss getting a second piece. A greedy algorithm does the same: it breaks the problem into pieces and chooses the best answer for each piece without considering whether another answer for that piece might work better in the long run. In chess, this logic would translate to always capturing the most valuable piece available—which is often a good move but sometimes a disaster: capturing a pawn is no good if you lose your queen as a result. In Section 8.8.5 in Chapter 8, we’ll find that for the problem of finding a minimal-weight-spanning tree in a graph, a greedy approach—specifically, always adding the lightest edge that doesn’t create a loop—leads to the optimal solution, so sometimes a greedy algorithm is not just an approximation but an exact solution.

For nongame searches, a greedy algorithm might choose whatever action will yield the best score thus far. That requires that you be able to determine some sort of metric to specify how well a partial solution satisfies your goal. For some problems, that is fairly easy; for others, it is hard. Finding a series of links to a particular web page is hard. Until you have examined all of the links from a page, you have no way of telling whether one of them leads to the target page. A similar problem with a better metric is finding a route from one city to another on a map. You know that all cities are reachable, barring washed out bridges and the like, and you can see a general direction that reasonable routes will have to follow, so you can downgrade the roads that lead in the opposite direction right away.
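
The following sketch shows the purely greedy extreme: commit to the locally best move at every step and never back up. It assumes the same position interface as before plus a hypothetical score_move method that rates how promising a single move looks.

# Follow the single most promising move at each step, with no backtracking.
sub greedy_walk {
    my $position = shift;

    until ( $position->is_answer ) {
        $position->prepare_moves;
        my ( $move, $best_move, $best_score );
        while ( defined( $move = $position->next_move ) ) {
            my $score = $position->score_move($move);
            if ( !defined $best_score or $score > $best_score ) {
                ( $best_move, $best_score ) = ( $move, $score );
            }
        }
        # Dead end: greed has painted us into a corner.
        return undef unless defined $best_move;
        $position = $position->make_move($best_move);
    }
    return $position;
}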

Branch and bound

As you consider partial solutions that may be part of the optimum answer, you will keep a “cost so far” value for them. You can then easily keep the cost of each solution updated by adding the cost of the next leg of the search.

Consider Figure 5-3, a map that shows the roads between the town of Urchin and the nearby town of Sula Center. The map shows the distance and the speed limit of each road. Naturally, you never exceed the speed limit on any road, and we’ll also assume that you don’t go any slower. What is the fastest route? From the values on the map, we can compute how long it takes to drive along each road:

Start Point         End Point           Distance    Speed Limit    Travel Time
Urchin              Wolfbane Corners    54 km       90 km/h        36 min.
Wolfbane Corners    Sula Center         30 km       90 km/h        20 min.
Urchin              Sula Junction       50 km       120 km/h       25 min.
Sula Junction       Sula Center         21 km       90 km/h        14 min.

Figure 5-3. Map of towns with distances and speeds

When solving such problems, you can always examine the position that has the lowest cost-so-far and generate the possible continuations from that position. This is a reasonable way of finding the cheapest route. When the position with the lowest cost-so-far is the final destination, then you have your answer. All positions considered previously were not yet at the destination, while all positions not yet considered have a cost that is the same or worse. You now know the best route. This method is called branch and bound.

This method lies in between breadth-first and depth-first: it’s a greedy algorithm, choosing the cheapest move so far discovered, regardless of whether it is deep or shallow. To implement this requires that a position object provide a method for cost-so-far. We’ll have it inherit from the Heap::Elem object interface. Keeping the known possible next positions on a heap, instead of a stack or queue, makes it easy to find the smallest:

# $final_position = branch_and_bound( $start_position )
sub branch_and_bound {
    my $position;

    use Heap::Fibonacci;

    my $positions = Heap::Fibonacci->new;

    $positions->add( shift );

    while ( $position = $positions->extract_minimum ) {
        return $position if $position->is_answer;

        # That wasn't the answer.
        # So, try each position that can be reached from here.
        $position->prepare_moves;
        my $move;
        while ( $move = $position->next_move ) {
            $positions->add( $position->make_move($move) );
        }
    }
    # No answer found.
    return undef;
}

Let’s define an appropriate object for a map route. We’ll only define here the facets of the object that deal with creating a route, using the same interface we used earlier for generating game moves. (In a real program, you’d add more methods to make use of the route once it’s been found.)

package map_route;

use Heap::Elem;

@ISA = qw(Heap::Elem);

# new - create a new map route object to try to create a
#     route from a starting node to a target node.
#
# $route = map_route->new( $start_town, $finish_town );
sub new {
    my $class = shift;
    $class    = ref($class) || $class;
    my $start = shift;
    my $end   = shift;

    return $class->SUPER::new(
        cur          => $start,
        end          => $end,
        cost_so_far  => 0,
        route_so_far => [$start],
    );
}

# cmp - compare two map routes.
#
# $cmp = $node1->cmp($node2);
sub cmp {
    my $self  = shift;
    my $other = shift;

    return $self->{cost_so_far} <=> $other->{cost_so_far};
}

# is_answer - does this route end at the destination (yet)
#
# $boolean = $route->is_answer;
sub is_answer {
    my $self = shift;
    return $self->{cur} eq $self->{end};
}

# prepare_moves - get ready to look at all valid roads.
#
# $route->prepare_moves;
sub prepare_moves {
    my $self = shift;
    $self->{edge} = -1;
}

# next_move - find next usable road.
#
# $move = $route->next_move;
sub next_move {
    my $self = shift;
    return $self->{cur}->edge( ++$self->{edge} );
}

# make_move - create a new route object that extends the
#     current route to travel the specified road.
#
# $route_new = $route->make_move( $move );
sub make_move {
    my $self = shift;
    my $edge = shift;
    my $next = $edge->dest;
    my $cost = $self->{cost_so_far} + $edge->cost;

    return $self->SUPER::new(
        cur          => $next,
        end          => $self->{end},
        cost_so_far  => $cost,
        route_so_far => [ @{$self->{route_so_far}}, $edge, $next ],
    );
}

This example needs more code, but it’s already getting too long. It needs a class for towns (nodes) and a class for roads (edges). The class for towns requires only one method to be used in this code: $town->edge($n) should return a reference to one of the roads leading from $town (or undef if $n is higher than the index of the last road). The class for roads has two methods: $road->dest returns the town at the end of that road, and $road->cost returns the time required to traverse that road. We omit the code to build town and road objects from the previous table. You can find relevant code in Chapter 8.
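
For concreteness, here is one minimal sketch of what those omitted classes might look like, wired up with the travel times from the table above. The my_town and my_road package names and the add_road helper are inventions for this sketch; only edge, dest, and cost are required by the route-finding code.

package my_town;

# $town = my_town->new( $name );
sub new {
    my ( $class, $name ) = @_;
    return bless { name => $name, roads => [] }, $class;
}

# $road = $town->edge( $n );  returns undef when $n is past the last road
sub edge     { my ( $self, $n ) = @_; return $self->{roads}[$n] }
sub add_road { my ( $self, $road ) = @_; push @{ $self->{roads} }, $road }

package my_road;

# $road = my_road->new( $destination_town, $travel_time );
sub new {
    my ( $class, $dest, $cost ) = @_;
    return bless { dest => $dest, cost => $cost }, $class;
}

sub dest { return $_[0]->{dest} }
sub cost { return $_[0]->{cost} }

package main;

# Build the map of Figure 5-3, using minutes of travel time as the cost.
my $urchin   = my_town->new( 'Urchin' );
my $wolfbane = my_town->new( 'Wolfbane Corners' );
my $junction = my_town->new( 'Sula Junction' );
my $sula     = my_town->new( 'Sula Center' );

$urchin->add_road(   my_road->new( $wolfbane, 36 ) );
$urchin->add_road(   my_road->new( $junction, 25 ) );
$wolfbane->add_road( my_road->new( $urchin,   36 ) );
$wolfbane->add_road( my_road->new( $sula,     20 ) );
$junction->add_road( my_road->new( $urchin,   25 ) );
$junction->add_road( my_road->new( $sula,     14 ) );
$sula->add_road(     my_road->new( $wolfbane, 20 ) );
$sula->add_road(     my_road->new( $junction, 14 ) );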

With those additional classes defined and initialized to contain the map in Figure 5-3, and references to the towns Urchin and Sula Center in the variables $urchin and $sula, respectively, you would find the fastest route from Urchin to Sula Center with this code:

$start_route = map_route->new( $urchin, $sula );
$best_route = branch_and_bound( $start_route );

As this code runs, the branch_and_bound function uses its heap to repeatedly process the cheapest route found so far. Initially, the only route is the route of length 0—we haven’t left Urchin. The following table shows how entries get added to the heap and when they get examined. In each iteration of the outer while loop, one entry gets removed from the heap, and a number of entries get added:

Iteration Added    Iteration Removed    Cost So Far    Route So Far
0                  1                    0              Urchin
1                  2                    25             Urchin → Sula Junction
1                  3                    36             Urchin → Wolfbane Corners
2                  4 (success)          39             Urchin → Sula Junction → Sula Center
2                  never                50             Urchin → Sula Junction → Urchin
3                  never                44             Urchin → Wolfbane Corners → Sula Center
3                  never                72             Urchin → Wolfbane Corners → Urchin

So, the best route from Urchin to Sula Center is to go through Sula Junction.

The A* algorithm

The branch and bound algorithm can be improved in many cases if at each stopping point you can compute a minimum distance remaining to the final goal. For instance, on a road map the shortest route between two points will never be shorter than the straight line connecting those points (but it will be longer if there is no road that follows that straight line).

Instead of ordering by cost-so-far, the A* algorithm orders by the total of cost-so-far and the minimum remaining distance. As before, it doesn’t stop when the first road that leads to the target is seen, but rather when the first route that has reached the target is the next one to consider. When the next path to consider is already at the target, it must have a minimum remaining distance of 0 (and this “minimum” is actually exact). Because we require that minima never be higher than the correct value, no other positions need be examined—there might be unexamined answers that are, at best, equally good, but none of them can be better. This algorithm provides savings over branch and bound whenever there are unexamined positions whose cost so far is less than the final cost but whose minimum remainder is high enough that they need not be considered.

In Figure 5-4, the straight-line distances provide part of a lower bound on the shortest possible time. The other limit to use is the maximum speed limit found anywhere on the map—120 km/h. Using these values gives a minimum cost: the time from any point to Sula Center must be at least as much as this “crow’s flight” distance driven at that maximum speed:

Location            Straight Line Distance    Minimum Cost
Urchin              50 km                     25 min.
Sula Junction       4 km                      2 min.
Wolfbane Corners    8 km                      4 min.
Sula Center         0 km                      0 min.

Figure 5-4. Minimum time determined by route “as the crow flies”

The code for A* is almost identical to branch and bound—the only difference is that the cmp metric adds the minimum remaining cost to cost_so_far. This requires that map objects provide a min_cost method—straight-line distance to the target divided by the maximum speed limit. So, the cmp function is changed to the following:

package map_route_min_possible;

@ISA = qw(map_route);

# cmp - compare two map routes.
#
# $cmp = $node1->cmp($node2);
# Compare two heaped positions.
sub cmp {
    my $self   = shift->[0];
    my $other  = shift->[0];
    my $target = $self->{end};
    return  ($self->{cost_so_far} + $self->{cur}->min_cost($target) )
        <=> ($other->{cost_so_far} + $other->{cur}->min_cost($target) );
}

# To use A* searching:
$start_route = map_route_min_possible->new( $urchin, $sula );
$best_route = branch_and_bound( $start_route );
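
The min_cost method assumed by this cmp is not shown either. Under the same assumptions as the earlier my_town sketch, it might look like the following, hard-wiring the straight-line distances of Figure 5-4 and the 120 km/h ceiling; a real implementation would compute the distance from coordinates and handle any target.

package my_town;

# Straight-line distances to Sula Center, in km (from Figure 5-4).
my %crow_km = (
    'Urchin'           => 50,
    'Sula Junction'    => 4,
    'Wolfbane Corners' => 8,
    'Sula Center'      => 0,
);

# $minutes = $town->min_cost( $target_town );
# Illustrative only: ignores $target because Sula Center is the sole
# destination in this example.
sub min_cost {
    my ( $self, $target ) = @_;
    return $crow_km{ $self->{name} } / 120 * 60;    # minutes at 120 km/h
}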

Because the code is nearly identical, you can see that branch and bound is just a special case of A*. It always uses a minimum remaining cost of 0. That’s the most conservative way of meeting the requirement that the minimum mustn’t exceed the true remaining cost; as we see in the following table, the more aggressive minimum speeds up the search process:

Iteration Added    Iteration Removed    Cost So Far    Minimum Remaining    Comparison Cost    Route So Far
0                  1                    0              25                   25                 Urchin
1                  2                    25             2                    27                 Urchin → Sula Junction
1                  never                36             4                    40                 Urchin → Wolfbane Corners
2                  never                50             25                   75                 Urchin → Sula Junction → Urchin
2                  3 (success)          39             0                    39                 Urchin → Sula Junction → Sula Center

Notice that this time only three routes are examined. Routes from Wolfbane Corners are never examined because even if there were a perfectly straight maximum-speed highway from there to Sula Center, the total trip would still take longer than the route through Sula Junction. While the A* algorithm only saves one route generation on this tiny map, it can save far more on a larger graph. You will see additional algorithms for this type of problem in Chapter 8.

Dynamic programming

Dynamic programming was mentioned in the introduction. Like the greedy approach, dynamic programming breaks the problem into pieces, but it does not determine the solution to each piece in isolation. The information about possible solutions is made available for the analysis of the other pieces of the problem, to assist in making a final selection. The killer move strategy discussed earlier is an example. If the killer move still applies, it doesn’t have to be rediscovered. The positions that permit the killer move to be used may never arise in the game—the other player will certainly choose a position that prevents that sequence from having the devastating effect (if there is any safer alternative). Both branch and bound and A* are dynamic programming techniques.
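
The core device of dynamic programming—solving each subproblem once and reusing the stored answer, just as the transpose table did—can be sketched in a few lines. The grid_paths example (counting monotone paths across a grid) is purely illustrative:

my %seen;

# Number of ways to walk from corner to corner of an $n x $m grid
# moving only right or down.  Each subproblem is solved once and cached.
sub grid_paths {
    my ( $n, $m ) = @_;
    return 1 if $n == 0 or $m == 0;
    return $seen{"$n,$m"} ||= grid_paths( $n - 1, $m ) + grid_paths( $n, $m - 1 );
}

print grid_paths( 10, 10 ), "\n";    # 184756, from only 100 cached subproblems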



[22] The peculiar-looking for loop in the linear_string() function is an efficiency measure. By counting down to 0, the loop end conditional is faster to execute. It is even faster than a foreach loop that iterates over the array and separately increments a counter. (However, it is slower than a foreach loop that need not increment a counter, so don’t use it unless you really need to track the index as well as the value within your loop.)

[23] “Large enough” and “often” are somewhat vague, especially because they affect each other. Multiplying the number of elements by the number of searches is your best indicator—if that product is in the thousands or less, you could tolerate a linear search instead.

[24] Patrick Henry Winston, in his book Artificial Intelligence (Addison-Wesley, 1992), provides a casual estimate of 10^120.

[25] The Halting Problem asks for a program (HP) that accepts two inputs: a program and a description of an input. HP must analyze the program and determine whether, invoked with that input, the program would run forever or halt. A “program” must include any descriptive information required for HP to understand it, as well as the code required for a computer to execute it. If you assume that HP could exist, then it is easy to write another program that we can call Contrary. Contrary runs HP, giving it Contrary’s own description as the program to be analyzed and HP’s description as the input. HP determines whether Contrary will halt. But now, Contrary uses the answer returned by HP to take an opposite choice of whether to halt or to run forever. Because of that contrary choice, HP will have been wrong in its answer. So HP is not a correct solution to the halting problem and since this argument can be applied to any solution, no correct solution can exist.

[26] In fact, it has filled at least one book. See Computers and Intractability: A Guide to the Theory of NP-Completeness, by Michael R. Garey and David S. Johnson (W. H. Freeman and Co., 1979).

[27] A way of carrying out non-deterministic computations in a practical amount of time has been shown recently in Science. A Hamiltonian path (a variant of the Traveling Salesman problem) can be solved by creating a tailored DNA structure and then growing enough of them to try out all of the possible routes at once.
