Appendix 1: A Catalog of Algorithms

This book covers much of the material in a college algorithms course, but from a different perspective — the emphasis is more on applications and coding than on mathematical analysis. This appendix relates the material to a more typical outline.

Sorting

Problem Definition. The output sequence is an ordered permutation of the input sequence. When the input is a file, the output is usually a distinct file; when the input is an array, the output is usually the same array.

Applications. This list only hints at the diversity of sorting applications.

• Output Requirements. Some users desire sorted output; see Section 1.1 and consider your telephone directory and monthly checking account statement. Functions such as binary search require sorted inputs.

• Collect Equal Items. Programmers use sorting to collect together the equal items in a sequence: the anagram program in Sections 2.4 and 2.8 collects words in the same anagram class. The suffix arrays in Sections 15.2 and 15.3 collect equal text phrases. See also Problems 2.6, 8.10 and 15.8.

• Other Applications. The anagram program in Sections 2.4 and 2.8 uses sorting as a canonical order for the letters in a word, and thereby as a signature of an anagram class. Problem 2.7 sorts to rearrange data on a tape.

General-Purpose Functions. The following algorithms sort an arbitrary n-element sequence.

• Insertion Sort. The program in Section 11.1 has O(n2) run time in the worst case and for random inputs. The run times of several variants are described in a table in that section. In Section 11.3 Insertion Sort is used to sort an almost-sorted array in O(n) time. It is the only stable sort in this book: output elements with equal keys are in the same relative order as in the input.

• Quicksort. The simple Quicksort in Section 11.2 runs in O(n log n) expected time on an array of n distinct elements. It is recursive and uses logarithmic stack space on the average. In the worst case, it requires O(n2) time and O(n) stack space. It runs in O(n2) time on an array of equal elements; the improved version in Section 11.3 has O(n log n) average run time for any array. The table in Section 11.3 presents empirical data on the run time of several implementations of Quicksort. The C Standard Library qsort is usually implemented with this algorithm; it is used in Sections 2.8, 15.2 and 15.3 and Solution 1.1. The C++ Standard Library sort often uses the algorithm; it is timed in Section 11.3.

• Heapsort. The Heapsort in Section 14.4 runs in O(n log ni) time on any n-element array; it is not recursive and uses only constant extra space. Solutions 14.1 and 14.2 describe faster Heapsorts.

• Other Sorting Algorithms. The Merge Sort algorithm sketched in Section 1.3 is effective for sorting files; a merging algorithm is sketched in Problem 14.4.d. Solution 11.6 gives pseudocode for Selection Sort and Shell Sort.

The run times of several sorting algorithms are described in Solution 1.3.

Special-Purpose Functions. These functions lead to short and efficient programs for certain inputs.

• Radix Sort. McIlroy’s bit-string sort in Problem 11.5 can be generalized to sort strings over larger alphabets (bytes, for instance).

• Bitmap Sort. The bitmap sort in Section 1.4 uses the fact that the integers to be sorted are from a small range, contain no duplicates, and have no additional data. Implementation details and extensions are described in Solutions 1.2, 1.3, 1.5 and 1.6.

• Other Sorts. The multiple-pass sort in Section 1.3 reads the input many times to trade time for space. Columns 12 and 13 produce sorted sets of random integers.

Searching

Problem Definition. A search function determines whether its input is a member of a given set, and possibly retrieves associated information.

Applications. Lesk’s telephone directory in Problem 2.6 searches to convert an (encoded) name to a telephone number. Thompson’s endgame program in Section 10.8 searches a set of chess boards to compute an optimal move. McIlroy’s spelling checker in Section 13.8 searches a dictionary to determine whether a word is spelled correctly. Additional applications are described along with the functions.

General-Purpose Functions. The following algorithms search an arbitrary n-element set.

• Sequential Search. Simple and tuned versions of sequential search in an array are given in Section 9.2. Sequential searches in an array and linked list are given in Section 13.2. The algorithm is used in hyphenating words (Problem 3.5), smoothing geographic data (Section 9.2), representing a sparse matrix (Section 10.2), generating random sets (Section 13.2), storing a compressed dictionary (Section 13.8), bin packing (Problem 14.5), and finding all equal text phrases (Section 15.3). The introduction to Column 3 and Problem 3.1 describe two foolish implementations of sequential search.

• Binary Search. The algorithm to search a sorted array in approximately log2 n comparisons is described in Section 2.2, and code is developed in Section 4.2. Section 9.3 extends the code to find the first occurrence of many equal items and tunes its performance. Applications include searching for records in a reservation system (Section 2.2), erroneous input lines (Section 2.2), anagrams of an input word (Problem 2.1), telephone numbers (Problem 2.6), the position of a point among line segments (Problem 4.7), the index of an entry in a sparse array (Solution 10.2), a random integer (Problem 13.3), and phrases of words (Sections 15.2 and 15.3). Problems 2.9 and 9.9 discuss the tradeoffs between binary and sequential search.

• Hashing. Problem 1.10 hashes telephone numbers, Problem 9.10 hashes a set of integers, Section 13.4 uses bins to hash a set of integers, and Section 13.8 hashes the words in a dictionary. Section 15.1 hashes to count the words in a document.

• Binary Search Trees. Section 13.3 uses (nonbalanced) binary search trees to represent a set of random integers. Balanced trees typically implement the set template in the C++ Standard Template Library, which we used in Sections 13.1 and 15.1 and Solution 1.1.

Special-Purpose Functions. These functions lead to short and efficient programs for certain inputs.

• Key Indexing. Some keys can be used as an index into an array of values. The bins and bit vectors in Section 13.4 both use integer keys as indices. Keys used as indices include telephone numbers (Section 1.4), characters (Solution 9.6), arguments to trigonometric functions (Problem 9.11), indices of sparse arrays (Section 10.2), program counter values (Problem 10.7), chess boards (Section 10.8), random integers (Section 13.4), hash values of a string (Section 13.8), and integer values in priority queues (Problem 14.8). Problem 10.5 reduces space with key indexing and a numerical function.

• Other Methods. Section 8.1 describes how search time was reduced by keeping common elements in a cache. Section 10.1 describes how searching a tax table became simple once the context was understood.

Other Set Algorithms

These problems deal with a collection of n elements that may possibly contain duplicates.

Priority Queues. A priority queue maintains a set of elements under the operations of inserting arbitrary elements and deleting the minimum element. Section 14.3 describes two sequential structues for the task, and gives a C++ class that efficiently implements priority queues using heaps. Applications are described in Problems 14.4, 14.5 and 14.8.

Selection. Problem 2.8 describes a problem in which we must select the kth-smallest element in the set. Solution 11.9 describes an efficient algorithm for the task; alternative algorithms are mentioned in Problems 2.8, 11.1 and 14.4.c.

Algorithms on Strings

Sections 2.4 and 2.8 compute the anagram sets in a dictionary. Solution 9.6 describes several ways to classify characters. Section 15.1 lists the distinct words in a file and also counts the words in a file, first using C++ Standard Template Library components and then with a custom-built hash table. Section 15.2 uses suffix arrays to find the longest repeated substring in a text file, and Section 15.3 uses a variant of suffix arrays to generate random text from a Markov model.

Vector and Matrix Algorithms

Algorithms for swapping subsequences of a vector are discussed in Section 2.3 and Problems 2.3 and 2.4; Solution 2.3 contains code for the algorithms. Problem 2.5 describes an algorithm for swapping nonadjacent subsequences in a vector. Problem 2.7 uses sorting to transpose a matrix represented on tape. Programs for computing the maximum of a vector are described in Problems 4.9, 9.4 and 9.8. Vector and matrix algorithms that share space are described in Sections 10.3 and 14.4. Sparse vectors and matrices are discussed in Sections 3.1, 10.2 and 13.8; Problem 1.9 describes a scheme for initializing sparse vectors that is used in Section 11.3. Column 8 describes five algorithms for computing the maximum-sum subsequence in a vector, and several of the problems in Column 8 deal with vectors and matrices.

Random Objects

Functions for generating pseudorandom integers are used throughout the book; they are implemented in Solution 12.1. Section 12.3 describes an algorithm for “shuffling” the elements of an array. Sections 12.1 through 12.3 describe several algorithms for selecting random subsets of a set (see also Problems 12.7 and 12.9); Problem 1.4 gives an application of the algorithm. Solution 12.10 gives an algorithm for randomly selecting one of an unknown number of objects.

Numerical Algorithms

Solution 2.3 presents the additive Euclidean algorithm for computing the greatest common divisor of two integers. Problem 3.7 sketches an algorithm for evaluating linear recurrences with constant coefficients. Problem 4.9 gives code for an efficient algorithm to raise a number to a positive integer power. Problem 9.11 computes trigonometric functions by table lookup. Solution 9.12 describes Homer’s method of evaluating a polynomial. Summing a large set of floating point numbers is described in Problems 11.1 and 14.4.b.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.222.93.132