Chapter 20. Searching and Sorting

 

With sobs and tears he sorted out Those of the largest size ...

 
 --Lewis Carroll
 

Attempt the end, and never stand to doubt; Nothing’s so hard, but search will find it out.

 
 --Robert Herrick
 

It is an immutable law in business that words are words, explanations are explanations, promises are promises — but only performance is reality.

 
 --Harold S. Green
<feature> <supertitle>Objectives</supertitle>

In this chapter you’ll learn:

<objective>

To search for a given value in an array using the linear search and binary search algorithm.

</objective>
<objective>

To sort arrays using the iterative selection and insertion sort algorithms.

</objective>
<objective>

To sort arrays using the recursive merge sort algorithm.

</objective>
<objective>

To determine the efficiency of searching and sorting algorithms.

</objective>
</feature>
<feature> <supertitle>Outline</supertitle> </feature>

Introduction

Searching data involves determining whether a value (referred to as the search key) is present in the data and, if so, finding the value’s location. Two popular search algorithms are the simple linear search and the faster, but more complex, binary search. Sorting places data in order, based on one or more sort keys. A list of names could be sorted alphabetically, bank accounts could be sorted by account number, employee payroll records could be sorted by social security number and so on. This chapter introduces two simple sorting algorithms, the selection sort and the insertion sort, along with the more efficient, but more complex, merge sort. Figure 20.1 summarizes the searching and sorting algorithms discussed in this book.

Table 20.1. Searching and sorting capabilities in this text.

Chapter

Algorithm

Location

Searching Algorithms:

20

Linear Search

Section 20.2.1

 

Binary Search

Section 20.2.2

 

Recursive Linear Search

Exercise 20.8

 

Recursive Binary Search

Exercise 20.9

23

BinarySearch method of class Array

Fig. 23.3

 

Contains method of classes List<T> and Stack<T>

Fig. 23.4

 

ContainsKey method of class Dictionary<K, T>

Fig. 23.7

Sorting Algorithms:

20

Selection Sort

Section 20.3.1

 

Insertion Sort

Section 20.3.2

 

Recursive Merge Sort

Section 20.3.3

 

Bubble Sort

Exercises 20.520.6

 

Bucket Sort

Exercise 20.7

 

Recursive Quicksort

Exercise 20.10

20, 23

Sort method of classes Array and List<T>

Figs. 20.4, 23.323.4

Searching Algorithms

Looking up a phone number, accessing a website and checking the definition of a word in a dictionary all involve searching large amounts of data. The next two sections discuss two common search algorithms—one that is easy to program yet relatively inefficient and one that is relatively efficient but more complex to program.

Linear Search

The linear search algorithm searches each element in an array sequentially. If the search key does not match an element in the array, the algorithm tests each element and, when the end of the array is reached, informs the user that the search key is not present. If the search key is in the array, the algorithm tests each element until it finds one that matches the search key and returns the index of that element.

As an example, consider an array containing the following values

34 56 2 10 77 51 93 30 5 52

and a method that is searching for 51. Using the linear search algorithm, the method first checks whether 34 matches the search key. It does not, so the algorithm checks whether 56 matches the search key. The method continues moving through the array sequentially, testing 2, then 10, then 77. When the method tests 51, which matches the search key, the method returns the index 5, which is the location of 51 in the array. If, after checking every array element, the method determines that the search key does not match any element in the array, the method returns a sentinel value (e.g., -1). If there are duplicate values in the array, linear search returns the index of the first element in the array that matches the search key.

Figure 20.2 declares class LinearArray. This class has a private instance variable data (an array of ints), and a static Random object named generator to fill the array with randomly generated ints. When an object of class LinearArray is instantiated, the constructor (lines 12–19) creates and initializes the array data with random ints in the range 1099.

Example 20.2. Class that contains an array of random integers and a method that searches that array sequentially.

 1   // Fig. 20.2: LinearArray.cs
 2   // Class that contains an array of random integers and a method
 3   // that searches that array sequentially.
 4   using System;
 5
 6   public class LinearArray
 7   {
 8      private int[] data; // array of values
 9      private static Random generator = new Random();
10
11      // create array of given size and fill with random integers
12      public LinearArray( int size )
13      {
14         data = new int[ size ]; // create space for array
15
16         // fill array with random ints in range 10-99
17         for ( int i = 0; i < size; i++ )
18            data[ i ] = generator.Next( 10, 100 );
19      } // end LinearArray constructor
20
21      // perform a linear search on the data                
22      public int LinearSearch( int searchKey )              
23      {                                                     
24         // loop through array sequentially                 
25         for ( int index = 0; index < data.Length; index++ )
26            if ( data[ index ] == searchKey )               
27               return index; // return index of integer     
28                                                            
29         return -1; // integer was not found                
30      } // end method LinearSearch                          
31
32      // method to output values in array
33      public override string ToString()
34      {
35         string temporary = string.Empty;
36
37         // iterate through array
38         foreach ( int element in data )
39            temporary += element + " ";
40
41         temporary += "
"; // add newline character
42         return temporary;
43      } // end method ToString
44   } // end class LinearArray

Lines 22–30 perform the linear search. The search key is passed to parameter searchKey. Lines 25–27 loop through the elements in the array. Line 26 compares each element in the array with searchKey. If the values are equal, line 27 returns the index of the element. If the loop ends without finding the value, line 29 returns -1. Lines 33–43 declare method ToString, which returns a string representation of the array for printing.

Figure 20.3 creates LinearArray object searchArray containing an array of 10 ints (line 13) and allows the user to search the array for specific elements. Lines 17–18 prompt the user for the search key and store it in searchInt. Lines 21–37 loop until the user enters the sentinel value -1. The array holds ints from 1099 (line 18 of Fig. 20.2). Line 24 calls the LinearSearch method to determine whether searchInt is in the array. If searchInt is found, LinearSearch returns the position of the element, which the method outputs in lines 27–29. If searchInt is not in the array, LinearSearch returns -1, and the method notifies the user (lines 31–32). Lines 35–36 retrieve the next integer from the user.

Example 20.3. Sequentially search an array for an item.

 1   // Fig. 20.3: LinearSearchTest.cs
 2   // Sequentially search an array for an item.
 3   using System;
 4
 5   public class LinearSearchTest
 6   {
 7      public static void Main( string[] args )
 8      {
 9         int searchInt; // search key
10         int position; // location of search key in array
11
12         // create array and output it
13         LinearArray searchArray = new LinearArray( 10 );
14         Console.WriteLine( searchArray ); // print array
15
16         // input first int from user
17         Console.Write( "Please enter an integer value (-1 to quit): " );
18         searchInt = Convert.ToInt32( Console.ReadLine() );
19
20         // repeatedly input an integer; -1 terminates the application
21         while ( searchInt != -1 )
22         {
23            // perform linear search
24            position = searchArray.LinearSearch( searchInt );
25
26            if ( position != -1 ) // integer was not found
27               Console.WriteLine(
28                  "The integer {0} was found in position {1}.
",
29                  searchInt, position );
30            else // integer was found
31               Console.WriteLine( "The integer {0} was not found.
",
32                  searchInt );
33
34            // input next int from user
35            Console.Write( "Please enter an integer value (-1 to quit): " );
36            searchInt = Convert.ToInt32( Console.ReadLine() );
37         } // end while
38      } // end Main
39   } // end class LinearSearchTest
64 90 84 62 28 68 55 27 78 73

Please enter an integer value (-1 to quit): 78
The integer 78 was found in position 8.

Please enter an integer value (-1 to quit): 64
The integer 64 was found in position 0.

Please enter an integer value (-1 to quit): 65
The integer 65 was not found.

Please enter an integer value (-1 to quit): -1

Efficiency of Linear Search

Searching algorithms all accomplish the same goal—finding an element that matches a given search key, if such an element exists. Many things, however, differentiate search algorithms from one another. The major difference is the amount of effort required to complete the search. One way to describe this effort is with Big O notation, which is a measure of the worst-case runtime for an algorithm—that is, how hard an algorithm may have to work to solve a problem. For searching and sorting algorithms, this is particularly dependent on how many elements there are in the data set and the algorithm used.

Suppose an algorithm is designed to test whether the first element of an array is equal to the second element. If the array has 10 elements, this algorithm requires one comparison. If the array has 1,000 elements, this algorithm still requires one comparison. In fact, this algorithm is completely independent of the number of elements in the array, and is thus said to have a constant runtime, which is represented in Big O notation as O(1). An algorithm that is O(1) does not necessarily require only one comparison. O(1) just means that the number of comparisons is constant—it does not grow as the size of the array increases. An algorithm that tests whether the first element of an array is equal to any of the next three elements is still O(1), even though it requires three comparisons.

An algorithm that tests whether the first element of an array is equal to any of the other elements of the array will require at most n − 1 comparisons, where n is the number of elements in the array. If the array has 10 elements, this algorithm requires up to nine comparisons. If the array has 1,000 elements, this algorithm requires up to 999 comparisons. As n grows larger, the n part of the expression “dominates,” and subtracting one becomes inconsequential. Big O is designed to highlight these dominant terms and ignore terms that become unimportant as n grows. For this reason, an algorithm that requires a total of n − 1 comparisons (such as the one we described earlier) is said to be O(n). An O(n) algorithm is referred to as having a linear runtime. O(n) is often pronounced “on the order of n” or more simply “order n.”

Now suppose you have an algorithm that tests whether any element of an array is duplicated elsewhere in the array. The first element must be compared with every other element in the array. The second element must be compared with every other element except the first (it was already compared to the first). The third element must be compared with every other element except the first two. In the end, this algorithm will end up making (n − 1) + (n − 2) +...+ 2 + 1 or n2/2 − n/2 comparisons. As n increases, the n2 term dominates and the n term becomes inconsequential. Again, Big O notation highlights the n2 term, leaving n2/2. But as we’ll soon see, constant factors are omitted in Big O notation.

Big O is concerned with how an algorithm’s runtime grows in relation to the number of items processed. Suppose an algorithm requires n2 comparisons. With four elements, the algorithm will require 16 comparisons; with eight elements, the algorithm will require 64 comparisons. With this algorithm, doubling the number of elements quadruples the number of comparisons. Consider a similar algorithm requiring n2/2 comparisons. With four elements, the algorithm will require eight comparisons; with eight elements, 32 comparisons. Again, doubling the number of elements quadruples the number of comparisons. Both of these algorithms grow as the square of n, so Big O ignores the constant, and both algorithms are considered to be O(n2), referred to as quadratic runtime and pronounced “on the order of n-squared” or more simply “order n-squared.”

When n is small, O(n2) algorithms (running on today’s billions-of-operations-persecond personal computers) will not noticeably affect performance. But as n grows, you’ll start to notice the performance degradation. An O(n2) algorithm running on a million-element array would require a trillion “operations” (where each could actually require several machine instructions to execute). This could require many minutes to execute. A billion-element array would require a quintillion operations, a number so large that the algorithm could take decades! O(n2) algorithms are easy to write, as you’ll see shortly. You’ll also see algorithms with more favorable Big O measures. These efficient algorithms often take more cleverness and effort to create, but their superior performance can be well worth the extra effort, especially as n gets large and algorithms are compounded into larger applications.

The linear search algorithm runs in O(n) time. The worst case in this algorithm is that every element must be checked to determine whether the search item exists in the array. If the size of the array is doubled, the number of comparisons that the algorithm must perform is also doubled. Linear search can provide outstanding performance if the element matching the search key happens to be at or near the front of the array. But we seek algorithms that perform well, on average, across all searches, including those where the element matching the search key is near the end of the array.

Linear search is the easiest search algorithm to program, but it can be slow compared to other search algorithms. If an application needs to perform many searches on large arrays, it may be better to implement a different, more efficient algorithm, such as the binary search, which we present in the next section.

Performance Tip 20.1

Performance Tip 20.1

Sometimes the simplest algorithms perform poorly. Their virtue is that they’re easy to program, test and debug. Sometimes more complex algorithms are required to realize maximum performance.

Binary Search

The binary search algorithm is more efficient than the linear search algorithm, but it requires that the array be sorted. The first iteration of this algorithm tests the middle element in the array. If this matches the search key, the algorithm ends. Assuming the array is sorted in ascending order, if the search key is less than the middle element, the search key cannot match any element in the second half of the array and the algorithm continues with only the first half of the array (i.e., the first element up to, but not including, the middle element). If the search key is greater than the middle element, the search key cannot match any element in the first half of the array, and the algorithm continues with only the second half of the array (i.e., the element after the middle element through the last element). Each iteration tests the middle value of the remaining portion of the array, called a subarray. A subarray can have no elements, or it can encompass the entire array. If the search key does not match the element, the algorithm eliminates half of the remaining elements. The algorithm ends by either finding an element that matches the search key or reducing the subarray to zero size.

As an example, consider the sorted 15-element array

2  3  5  10  27  30  34  51  56  65  77  81  82  93  99

and a search key of 65. An application implementing the binary search algorithm would first check whether 51 is the search key (because 51 is the middle element of the array). The search key (65) is larger than 51, so 51 is “discarded” (i.e., eliminated from consideration) along with the first half of the array (all elements smaller than 51.) Next, the algorithm checks whether 81 (the middle element of the remainder of the array) matches the search key. The search key (65) is smaller than 81, so 81 is discarded along with the elements larger than 81. After just two tests, the algorithm has narrowed the number of values to check to three (56, 65 and 77). The algorithm then checks 65 (which indeed matches the search key) and returns the index of the array element containing 65. This algorithm required just three comparisons to determine whether the search key matched an element of the array. Using a linear search algorithm would have required 10 comparisons. [Note: In this example, we have chosen to use an array with 15 elements so that there will always be an obvious middle element in the array. With an even number of elements, the middle of the array lies between two elements. We implement the algorithm to choose the higher of the two elements.]

Figure 20.4 declares class BinaryArray. This class is similar to LinearArray—it has a private instance variable data (an array of ints), a static Random object named generator to fill the array with randomly generated ints, a constructor, a search method (BinarySearch), a RemainingElements method (which creates a string containing the elements not yet searched) and a ToString method. Lines 12–21 declare the constructor. After initializing the array with random ints from 1099 (lines 17–18), line 20 calls method Array.Sort on the array data. Method Sort is a static method of class Array that sorts the elements in an array in ascending order. Recall that the binary search algorithm works only on sorted arrays.

Example 20.4. Class that contains an array of random integers and a method that uses binary search to find an integer.

 1   // Fig. 20.4: BinaryArray.cs
 2   // Class that contains an array of random integers and a method
 3   // that uses binary search to find an integer.
 4   using System;
 5
 6   public class BinaryArray
 7   {
 8      private int[] data; // array of values
 9      private static Random generator = new Random();
10
11      // create array of given size and fill with random integers
12      public BinaryArray( int size )
13      {
14         data = new int[ size ]; // create space for array
15
16         // fill array with random ints in range 10-99
17         for ( int i = 0; i < size; i++ )
18            data[ i ] = generator.Next( 10, 100 );
19
20         Array.Sort( data );
21      } // end BinaryArray constructor
22
23      // perform a binary search on the data                          
24      public int BinarySearch( int searchElement )                    
25      {                                                               
26         int low = 0; // low end of the search area                   
27         int high = data.Length - 1; // high end of the search area   
28         int middle = ( low + high + 1 ) / 2; // middle element       
29         int location = -1; // return value; -1 if not found          
30                                                                      
31         do // loop to search for element                             
32         {                                                            
33            // print remaining elements of array                      
34            Console.Write( RemainingElements( low, high ) );          
35                                                                      
36            // output spaces for alignment                            
37            for ( int i = 0; i < middle; i++ )                        
38               Console.Write( " " );                                  
39                                                                      
40            Console.WriteLine( " * " ); // indicate current middle    
41                                                                      
42            // if the element is found at the middle                  
43            if ( searchElement == data[ middle ] )                    
44               location = middle; // location is the current middle   
45                                                                      
46            // middle element is too high                             
47            else if ( searchElement < data[ middle ] )                
48               high = middle - 1; // eliminate the higher half        
49            else // middle element is too low                         
50               low = middle + 1; // eliminate the lower half          
51                                                                      
52            middle = ( low + high + 1 ) / 2; // recalculate the middle
53         } while ( ( low <= high ) && ( location == -1 ) );           
54                                                                      
55         return location; // return location of search key            
56      } // end method BinarySearch                                    
57
58      // method to output certain values in array
59      public string RemainingElements( int low, int high )
60      {
61         string temporary = string.Empty;
62
63         // output spaces for alignment
64         for ( int i = 0; i < low; i++ )
65            temporary += " ";
66
67         // output elements left in array
68         for ( int i = low; i <= high; i++ )
69            temporary += data[ i ] + " ";
70
71         temporary += "
";
72         return temporary;
73      } // end method RemainingElements
74
75      // method to output values in array
76      public override string ToString()
77      {
78         return RemainingElements( 0, data.Length - 1 );
79      } // end method ToString
80   } // end class BinaryArray

Lines 24–56 declare method BinarySearch. The search key is passed into parameter searchElement (line 24). Lines 26–28 calculate the low end index, high end index and middle index of the portion of the array that the application is currently searching. At the beginning of the method, the low end is 0, the high end is the length of the array minus 1 and the middle is the average of these two values. Line 29 initializes the location of the element to -1—the value that will be returned if the element is not found. Lines 31–53 loop until low is greater than high (this occurs when the element is not found) or location does not equal -1 (indicating that the search key was found). Line 43 tests whether the value in the middle element is equal to searchElement. If this is true, line 44 assigns middle to location. Then the loop terminates, and location is returned to the caller. Each iteration of the loop tests a single value (line 43) and eliminates half of the remaining values in the array (line 48 or 50).

Lines 22–40 of Fig. 20.5 loop until the user enters -1. For each other number the user enters, the application performs a binary search to determine whether the number matches an element in the array. The first line of output from this application is the array of ints, in increasing order. When the user instructs the application to search for 72, the application first tests the middle element (indicated by * in the sample output of Fig. 20.5), which is 52. The search key is greater than 52, so the application eliminates from consideration the first half of the array and tests the middle element from the second half. The search key is smaller than 82, so the application eliminates from consideration the second half of the subarray, leaving only three elements. Finally, the application checks 72 (which matches the search key) and returns the index 9.

Example 20.5. Using binary search to locate an item in an array.

 1   // Fig. 20.5: BinarySearchTest.cs
 2   // Using binary search to locate an item in an array.
 3   using System;
 4
 5   public class BinarySearchTest
 6   {
 7      public static void Main( string[] args )
 8      {
 9         int searchInt; // search key
10         int position; // location of search key in array
11
12         // create array and output it
13         BinaryArray searchArray = new BinaryArray( 15 );
14         Console.WriteLine( searchArray );
15
16         // prompt and input first int from user
17         Console.Write( "Please enter an integer value (-1 to quit): " );
18         searchInt = Convert.ToInt32( Console.ReadLine() );
19         Console.WriteLine();
20
21         // repeatedly input an integer; -1 terminates the application
22         while ( searchInt != -1 )
23         {
24            // use binary search to try to find integer
25            position = searchArray.BinarySearch( searchInt );
26
27            // return value of -1 indicates integer was not found
28            if ( position == -1 )
29               Console.WriteLine( "The integer {0} was not found.
",
30                  searchInt );
31            else
32               Console.WriteLine(
33                  "The integer {0} was found in position {1}.
",
34                  searchInt, position);
35
36            // prompt and input next int from user
37            Console.Write( "Please enter an integer value (-1 to quit): " );
38            searchInt = Convert.ToInt32( Console.ReadLine() );
39            Console.WriteLine();
40         } // end while
41      } // end Main
42   } // end class BinarySearchTest

12 17 22 25 30 39 40 52 56 72 76 82 84 91 93

Please enter an integer value (-1 to quit): 72
12 17 22 25 30 39 40 52 56 72 76 82 84 91 93
                      *
                        56 72 76 82 84 91 93
                                  *
                        56 72 76
                            *
The integer 72 was found in position 9.
Please enter an integer value (-1 to quit): 13
12 17 22 25 30 39 40 52 56 72 76 82 84 91 93
                      *
12 17 22 25 30 39 40
          *
12 17 22
    *
12
 *
The integer 13 was not found.
Please enter an integer value (-1 to quit): -1

Efficiency of Binary Search

In the worst-case scenario, searching a sorted array of 1,023 elements will take only 10 comparisons when using a binary search. Repeatedly dividing 1,023 by 2 (because after each comparison, we are able to eliminate half of the array) and rounding down (because we also remove the middle element) yields the values 511, 255, 127, 63, 31, 15, 7, 3, 1 and 0. The number 1023 (210 − 1) is divided by 2 only 10 times to get the value 0, which indicates that there are no more elements to test. Dividing by 2 is equivalent to one comparison in the binary search algorithm. Thus, an array of 1,048,575 (220 − 1) elements takes a maximum of 20 comparisons to find the key, and an array of one billion elements (which is less than 230 − 1) takes a maximum of 30 comparisons to find the key. This is a tremendous improvement in performance over the linear search. For a one-billion-element array, this is a difference between an average of 500 million comparisons for the linear search and a maximum of only 30 comparisons for the binary search! The maximum number of comparisons needed for the binary search of any sorted array is the exponent of the first power of 2 greater than the number of elements in the array, which is represented as log2 n. All logarithms grow at roughly the same rate, so in Big O notation the base can be omitted. This results in a big O of O(log n) for a binary search, which is also known as logarithmic runtime.

Sorting Algorithms

Sorting data (i.e., placing the data in some particular order, such as ascending or descending) is one of the most important computing applications. A bank sorts all checks by account number so that it can prepare individual bank statements at the end of each month. Telephone companies sort their lists of accounts by last name and, further, by first name to make it easy to find phone numbers. Virtually every organization must sort some data—often, massive amounts of it. Sorting data is an intriguing, compute-intensive problem that has attracted substantial research efforts.

It’s important to understand about sorting that the end result—the sorted array—will be the same no matter which (correct) algorithm you use to sort the array. The choice of algorithm affects only the runtime and memory use of the application. The rest of the chapter introduces three common sorting algorithms. The first two—selection sort and insertion sort—are simple to program, but inefficient. The last—merge sort—is much faster than selection sort and insertion sort but more difficult to program. We focus on sorting arrays of simple-type data, namely ints. It’s possible to sort arrays of objects as well—we discuss this in Chapter 23, Collections.

Selection Sort

Selection sort is a simple, but inefficient, sorting algorithm. The first iteration of the algorithm selects the smallest element in the array and swaps it with the first element. The second iteration selects the second-smallest element (which is the smallest of the remaining elements) and swaps it with the second element. The algorithm continues until the last iteration selects the second-largest element and, if necessary, swaps it with the second-to-last element, leaving the largest element in the last position. After the ith iteration, the smallest i elements of the array will be sorted in increasing order in the first i positions of the array.

As an example, consider the array

34  56  4  10  77  51  93  30  5  52

An application that implements selection sort first determines the smallest element (4) of this array, which is contained in index 2 (i.e., position 3). The application swaps 4 with 34, resulting in

4  56  34  10  77  51  93  30  5  52

The application then determines the smallest value of the remaining elements (all elements except 4), which is 5, contained in index 8. The application swaps 5 with 56, resulting in

4  5  34  10  77  51  93  30  56  52

On the third iteration, the application determines the next smallest value (10) and swaps it with 34.

4  5  10  34  77  51  93  30  56  52

The process continues until the array is fully sorted.

4  5  10  30  34  51  52  56  77  93

After the first iteration, the smallest element is in the first position. After the second iteration, the two smallest elements are in order in the first two positions. After the third iteration, the three smallest elements are in order in the first three positions.

Figure 20.6 declares class SelectionSort, which has an instance variable data (an array of ints) and a static Random object generator to generate random integers to fill the array. When an object of class SelectionSort is instantiated, the constructor (lines 12–19) creates and initializes array data with random ints in the range 1099.

Example 20.6. Class that creates an array filled with random integers. Provides a method to sort the array with selection sort.

 1   // Fig. 20.6: SelectionSort.cs
 2   // Class that creates an array filled with random integers.
 3   // Provides a method to sort the array with selection sort.
 4   using System;
 5
 6   public class SelectionSort
 7   {
 8      private int[] data; // array of values
 9      private static Random generator = new Random();
10
11      // create array of given size and fill with random integers
12      public SelectionSort( int size )
13      {
14         data = new int[ size ]; // create space for array
15
16         // fill array with random ints in range 10-99
17         for ( int i = 0; i < size; i++ )
18            data[ i ] = generator.Next( 10, 100 );
19      } // end SelectionSort constructor
20
21      // sort array using selection sort                               
22      public void Sort()                                               
23      {                                                                
24         int smallest; // index of smallest element                    
25                                                                       
26         // loop over data.Length - 1 elements                         
27         for ( int i = 0; i < data.Length - 1; i++ )                   
28         {                                                             
29            smallest = i; // first index of remaining array            
30                                                                       
31            // loop to find index of smallest element                  
32            for ( int index = i + 1; index < data.Length; index++ )    
33               if ( data[ index ] < data[ smallest ] )                 
34                  smallest = index;                                    
35                                                                       
36            Swap( i, smallest ); // swap smallest element into position
37            PrintPass( i + 1, smallest ); // output pass of algorithm  
38         } // end outer for                                            
39      } // end method Sort                                             
40
41      // helper method to swap values in two elements                  
42      public void Swap( int first, int second )                        
43      {                                                                
44         int temporary = data[ first ]; // store first in temporary    
45         data[ first ] = data[ second ]; // replace first with second  
46         data[ second ] = temporary; // put temporary in second        
47      } // end method Swap                                             
48
49      // print a pass of the algorithm
50      public void PrintPass( int pass, int index )
51      {
52         Console.Write( "after pass {0}: ", pass );
53
54         // output elements through the selected item
55         for ( int i = 0; i < index; i++ )
56            Console.Write( data[ i ] + " " );
57
58         Console.Write( data[ index ] + "* " ); // indicate swap
59
60         // finish outputting array
61         for ( int i = index + 1; i < data.Length; i++ )
62            Console.Write( data[ i ] + " " );
63
64         Console.Write( "
             " ); // for alignment
65
66         // indicate amount of array that is sorted
67         for( int j = 0; j < pass; j++ )
68            Console.Write( "-- " );
69         Console.WriteLine( "
" ); // skip a line in output
70      } // end method PrintPass
71
72      // method to output values in array
73      public override string ToString()
74      {
75         string temporary = string.Empty;
76
77         // iterate through array
78         foreach ( int element in data )
79            temporary += element + " ";
80
81         temporary += "
"; // add newline character
82         return temporary;
83      } // end method ToString
84   } // end class SelectionSort

Lines 22–39 declare the Sort method. Line 24 declares variable smallest, which will store the index of the smallest element in the remaining array. Lines 27–38 loop data.Length - 1 times. Line 29 initializes the index of the smallest element to the current item. Lines 32–34 loop over the remaining elements in the array. For each of these elements, line 33 compares its value to the value of the smallest element. If the current element is smaller than the smallest element, line 34 assigns the current element’s index to smallest. When this loop finishes, smallest will contain the index of the smallest element in the remaining array. Line 36 calls method Swap (lines 42–47) to place the smallest remaining element in the next spot in the array.

Line 10 of Fig. 20.7 creates a SelectionSort object with 10 elements. Line 13 implicitly calls method ToString to output the unsorted object. Line 15 calls method Sort (lines 22–39 of Fig. 20.6), which sorts the elements using selection sort. Then lines 17–18 output the sorted object. The output uses dashes to indicate the portion of the array that is sorted after each pass (lines 67–68). An asterisk is placed next to the position of the element that was swapped with the smallest element on that pass. On each pass, the element next to the asterisk and the element above the rightmost set of dashes were the two values that were swapped.

Example 20.7. Testing the selection sort class.

 1   // Fig. 20.7: SelectionSortTest.cs
 2   // Testing the selection sort class.
 3   using System;
 4
 5   public class SelectionSortTest
 6   {
 7      public static void Main( string[] args )
 8      {
 9         // create object to perform selection sort
10         SelectionSort sortArray = new SelectionSort( 10 );
11
12         Console.WriteLine( "Unsorted array:" );
13         Console.WriteLine( sortArray ); // print unsorted array
14
15         sortArray.Sort(); // sort array
16
17         Console.WriteLine( "Sorted array:" );
18         Console.WriteLine( sortArray ); // print sorted array
19      } // end Main
20   } // end class SelectionSortTest
Unsorted array:
86  97  83  45  19 31  86  13  57  61

after pass 1: 13  97  83  45  19  31  86  86*  57  61
              --

after pass 2: 13  19  83  45  97*  31  86  86  57  61
              --  --

after pass 3: 13  19  31  45  97  83*  86  86  57  61
              --  --  --
after pass 4: 13  19  31  45*  97  83  86  86  57  61
              --  --  --  --

after pass 5: 13  19  31  45  57  83  86  86  97*  61
              --  --  --  --  --

after pass 6: 13  19  31  45  57  61  86  86  97  83*
              --  --  --  --  --  --

after pass 7: 13  19  31  45  57  61  83  86  97  86*
              --  --  --  --  --  --  --

after pass 8: 13  19  31  45  57  61  83  86*  97  86
              --  --  --  --  --  --  --  --

after pass 9: 13  19  31  45  57  61  83  86  86  97*
              --  --  --  --  --  --  --  --  --

Sorted array:
13  19  31  45  57  61  83  86  86  97

Efficiency of Selection Sort

The selection sort algorithm runs in O(n2) time. Method Sort in lines 22–39 of Fig. 20.6, which implements the selection sort algorithm, contains nested for loops. The outer for loop (lines 27–38) iterates over the first n − 1 elements in the array, swapping the smallest remaining element to its sorted position. The inner for loop (lines 32–34) iterates over each element in the remaining array, searching for the smallest. This loop executes n − 1 times during the first iteration of the outer loop, n − 2 times during the second iteration, then n − 3, ..., 3, 2, 1. This inner loop will iterate a total of n(n − 1) / 2 or (n2n)/2. In Big O notation, smaller terms drop out and constants are ignored, leaving a final Big O of O(n2).

Insertion Sort

Insertion sort is another simple, but inefficient, sorting algorithm. Its first iteration takes the second element in the array and, if it’s less than the first, swaps them. The second iteration looks at the third element and inserts it in the correct position with respect to the first two elements, so all three elements are in order. At the ith iteration of this algorithm, the first i elements in the original array will be sorted.

Consider as an example the following array, which is identical to the array used in the discussions of selection sort and merge sort.

34  56  4  10  77  51  93  30  5  52

An application that implements the insertion sort algorithm first looks at the first two elements of the array, 34 and 56. These are already in order, so the application continues (if they were out of order, it would swap them).

In the next iteration, the application looks at the third value, 4. This value is less than 56, so the application stores 4 in a temporary variable and moves 56 one element to the right. The application then checks and determines that 4 is less than 34, so it moves 34 one element to the right. The application has now reached the beginning of the array, so it places 4 in the zeroth position. The array now is

4  34  56  10  77  51  93  30  5  52

In the next iteration, the application stores the value 10 in a temporary variable. Then the application compares 10 to 56 and moves 56 one element to the right because it’s larger than 10. The application then compares 10 to 34, moving 34 one element to the right. When the application compares 10 to 4, it observes that 10 is larger than 4 and places 10 in element 1. The array now is

4  10  34  56  77  51  93  30  5  52

Using this algorithm, at the ith iteration, the original array’s first i elements are sorted, but they may not be in their final locations—smaller values may be located later in the array.

Figure 20.8 declares the InsertionSort class. Lines 22–46 declare the Sort method. Line 24 declares variable insert, which holds the element to be inserted while the other elements are moved. Lines 27–45 loop through data.Length - 1 items in the array. In each iteration, line 30 stores in variable insert the value of the element that will be inserted in the sorted portion of the array. Line 33 declares and initializes variable moveItem, which keeps track of where to insert the element. Lines 36–41 loop to locate the correct position to insert the element. The loop will terminate either when the application reaches the front of the array or when it reaches an element that is less than the value to be inserted. Line 39 moves an element to the right, and line 40 decrements the position at which to insert the next element. After the loop ends, line 43 inserts the element in place. Figure 20.9 is the same as Fig. 20.7 except that it creates and uses an InsertionSort object. The output of this application uses dashes to indicate the portion of the array that is sorted after each pass (lines 66–67 of Fig. 20.8). An asterisk is placed next to the element that was inserted in place on that pass.

Example 20.8. Class that creates an array filled with random integers. Provides a method to sort the array with insertion sort.

 1   // Fig. 20.8: InsertionSort.cs
 2   // Class that creates an array filled with random integers.
 3   // Provides a method to sort the array with insertion sort.
 4   using System;
 5
 6   public class InsertionSort
 7   {
 8      private int[] data; // array of values
 9      private static Random generator = new Random();
10
11      // create array of given size and fill with random integers
12      public InsertionSort( int size )
13      {
14         data = new int[ size ]; // create space for array
15
16         // fill array with random ints in range 10-99
17         for ( int i = 0; i < size; i++ )
18            data[ i ] = generator.Next( 10, 100 );
19      } // end InsertionSort constructor
20
21      // sort array using insertion sort                            
22      public void Sort()                                            
23      {                                                             
24         int insert; // temporary variable to hold element to insert
25                                                                    
26         // loop over data.Length - 1 elements                      
27         for ( int next = 1; next < data.Length; next++ )           
28         {                                                          
29            // store value in current element                       
30            insert = data[ next ];                                  
31                                                                    
32            // initialize location to place element                 
33            int moveItem = next;                                    
34                                                                    
35            // search for place to put current element              
36            while ( moveItem > 0 && data[ moveItem - 1 ] > insert ) 
37            {                                                       
38               // shift element right one slot                      
39               data[ moveItem ] = data[ moveItem - 1 ];             
40               moveItem--;                                          
41            } // end while                                          
42                                                                    
43            data[ moveItem ] = insert; // place inserted element    
44            PrintPass( next, moveItem ); // output pass of algorithm
45         } // end for                                               
46      } // end method Sort                                          

47
48      // print a pass of the algorithm
49      public void PrintPass( int pass, int index )
50      {
51         Console.Write( "after pass {0}: ", pass );
52
53         // output elements till swapped item
54         for ( int i = 0; i < index; i++ )
55            Console.Write( data[ i ] + "  " );
56
57         Console.Write( data[ index ] + "* " ); // indicate swap
58
59         // finish outputting array
60         for ( int i = index + 1; i < data.Length; i++ )
61            Console.Write( data[ i ] + "  " );
62
63         Console.Write( "
              " ); // for alignment
64
65         // indicate amount of array that is sorted
66         for( int i = 0; i <= pass; i++ )
67            Console.Write( "--  " );
68         Console.WriteLine( "
" ); // skip a line in output
69      } // end method PrintPass
70
71      // method to output values in array
72      public override string ToString()
73      {
74         string temporary = string.Empty;
75
76         // iterate through array
77         foreach ( int element in data )
78            temporary += element + "  ";
79
80         temporary += "
"; // add newline character
81         return temporary;
82      } // end method ToString
83   } // end class InsertionSort

Example 20.9. Testing the insertion sort class.

 1   // Fig. 20.9: InsertionSortTest.cs
 2   // Testing the insertion sort class.
 3   using System;
 4
 5   public class InsertionSortTest
 6   {
 7      public static void Main( string[] args )
 8      {
 9         // create object to perform insertion sort
10         InsertionSort sortArray = new InsertionSort( 10 );
11
12         Console.WriteLine( "Unsorted array:" );
13         Console.WriteLine( sortArray ); // print unsorted array
14
15         sortArray.Sort(); // sort array
16
17         Console.WriteLine( "Sorted array:" );
18         Console.WriteLine( sortArray ); // print sorted array
19      } // end Main
20   } // end class InsertionSortTest

Unsorted array:
12  27  36  28  33  92  11  93  59  62

after pass 1: 12  27*  36  28  33  92  11  93  59  62
              --  --

after pass 2: 12  27  36*  28  33  92  11  93  59  62
              --  --  --

after pass 3: 12  27  28*  36  33  92  11  93  59  62
              --  --  --   --

after pass 4: 12  27  28  33*  36  92  11  93  59  62
              --  --  --  --   --

after pass 5: 12  27  28  33  36  92*  11  93  59  62
              --  --  --  --  --  --
after pass 6: 11*  12  27  28  33  36  92  93  59  62
              --   --  --  --  --  --  --

after pass 7: 11  12  27  28  33  36  92  93*  59  62
              --  --  --  --  --  --  --  --

after pass 8: 11  12  27  28  33  36  59*  92  93  62
              --  --  --  --  --  --  --   --  --

after pass 9: 11  12  27  28  33  36  59  62*  92  93
              --  --  --  --  --  --  --  --   --  --

Sorted array:
11  12  27  28  33  36  59  62  92  93

Efficiency of Insertion Sort

The insertion sort algorithm also runs in O(n2) time. Like selection sort, the implementation of insertion sort (lines 22–46 of Fig. 20.8) contains nested loops. The for loop (lines 27–45) iterates data.Length - 1 times, inserting an element in the appropriate position in the elements sorted so far. For the purposes of this application, data.Length - 1 is equivalent to n − 1 (as data.Length is the size of the array). The while loop (lines 36–41) iterates over the preceding elements in the array. In the worst case, this while loop will require n − 1 comparisons. Each individual loop runs in O(n) time. In Big O notation, nested loops mean that you must multiply the number of iterations of each loop. For each iteration of an outer loop, there will be a certain number of iterations of the inner loop. In this algorithm, for each O(n) iterations of the outer loop, there will be O(n) iterations of the inner loop. Multiplying these values results in a Big O of O(n2).

Merge Sort

Merge sort is an efficient sorting algorithm but is conceptually more complex than selection sort and insertion sort. The merge sort algorithm sorts an array by splitting it into two equal-sized subarrays, sorting each subarray and merging them in one larger array. With an odd number of elements, the algorithm creates the two subarrays such that one has one more element than the other.

The implementation of merge sort in this example is recursive. The base case is an array with one element. A one-element array is, of course, sorted, so merge sort immediately returns when it’s called with a one-element array. The recursion step splits an array in two approximately equal-length pieces, recursively sorts them and merges the two sorted arrays in one larger, sorted array.

Suppose the algorithm has already merged smaller arrays to create sorted arrays A:

4  10  34  56  77

and B:

5  30  51  52  93

Merge sort combines these two arrays in one larger, sorted array. The smallest element in A is 4 (located in the zeroth element of A). The smallest element in B is 5 (located in the zeroth element of B). In order to determine the smallest element in the larger array, the algorithm compares 4 and 5. The value from A is smaller, so 4 becomes the first element in the merged array. The algorithm continues by comparing 10 (the second element in A) to 5 (the first element in B). The value from B is smaller, so 5 becomes the second element in the larger array. The algorithm continues by comparing 10 to 30, with 10 becoming the third element in the array, and so on.

Lines 22–25 of Fig. 20.10 declare the Sort method. Line 24 calls method SortArray with 0 and data.Length - 1 as the arguments—these are the beginning and ending indices of the array to be sorted. These values tell method SortArray to operate on the entire array.

Example 20.10. Class that creates an array filled with random integers. Provides a method to sort the array with merge sort.

 1   // Fig. 20.10: MergeSort.cs
 2   // Class that creates an array filled with random integers.
 3   // Provides a method to sort the array with merge sort.
 4   using System;
 5
 6   public class MergeSort
 7   {
 8      private int[] data; // array of values
 9      private static Random generator = new Random();
10
11      // create array of given size and fill with random integers
12      public MergeSort( int size )
13      {
14         data = new int[ size ]; // create space for array
15
16         // fill array with random ints in range 10-99
17         for ( int i = 0; i < size; i++ )
18            data[ i ] = generator.Next( 10, 100 );
19      } // end MergeSort constructor
20
21      // calls recursive SortArray method to begin merge sorting
22      public void Sort()                                      
23      {                                                       
24         SortArray( 0, data.Length - 1 ); // sort entire array
25      } // end method Sort                                    
26
27      // splits array, sorts subarrays and merges subarrays into sorted array
28      private void SortArray( int low, int high )                         
29      {                                                                   
30         // test base case; size of array equals 1                        
31         if ( ( high - low ) >= 1 ) // if not base case                   
32         {                                                                
33            int middle1 = ( low + high ) / 2; // calculate middle of array
34            int middle2 = middle1 + 1; // calculate next element over     
35                                                                          
36            // output split step                                          
37            Console.WriteLine( "split:   " + Subarray( low, high ) );     
38            Console.WriteLine( "         " + Subarray( low, middle1 ) );  
39            Console.WriteLine( "         " + Subarray( middle2, high ) ); 
40            Console.WriteLine();                                          
41                                                                          
42            // split array in half; sort each half (recursive calls)      
43            SortArray( low, middle1 ); // first half of array             
44            SortArray( middle2, high ); // second half of array           
45                                                                          
46            // merge two sorted arrays after split calls return           
47            Merge( low, middle1, middle2, high );                         
48         } // end if                                                      
49      } // end method SortArray                                           
50
51      // merge two sorted subarrays into one sorted subarray             
52      private void Merge( int left, int middle1, int middle2, int right )
53      {                                                                  
54         int leftIndex = left; // index into left subarray               
55         int rightIndex = middle2; // index into right subarray          
56         int combinedIndex = left; // index into temporary working array 
57         int[] combined = new int[ data.Length ]; // working array       
58                                                                         
59         // output two subarrays before merging                          
60         Console.WriteLine( "merge:   " + Subarray( left, middle1 ) );   
61         Console.WriteLine( "         " + Subarray( middle2, right ) );  
62                                                                         
63         // merge arrays until reaching end of either                    
64         while ( leftIndex <= middle1 && rightIndex <= right )           
65         {                                                               
66            // place smaller of two current elements into result         
67            // and move to next space in arrays                          
68            if ( data[ leftIndex ] <= data[ rightIndex ] )               
69               combined[ combinedIndex++ ] = data[ leftIndex++ ];        
70            else                                                         
71               combined[ combinedIndex++ ] = data[ rightIndex++ ];       
72         } // end while                                                  
73                                                                         
74         // if left array is empty                                       
75         if ( leftIndex == middle2 )                                     
76            // copy in rest of right array                               
77            while ( rightIndex <= right )                                
78               combined[ combinedIndex++ ] = data[ rightIndex++ ];       
79         else // right array is empty                                    
80            // copy in rest of left array                                
81            while ( leftIndex <= middle1 )                               
82               combined[ combinedIndex++ ] = data[ leftIndex++ ];        
83                                                                         
84         // copy values back into original array                         
85         for ( int i = left; i <= right; i++ )                           
86            data[ i ] = combined[ i ];                                   
87                                                                         
88         // output merged array                                          
89         Console.WriteLine( "         " + Subarray( left, right ) );     
90         Console.WriteLine();                                            
91      } // end method Merge                                              
92
93      // method to output certain values in array
94      public string Subarray( int low, int high )
95      {
96         string temporary = string.Empty;
97
98         // output spaces for alignment
99         for ( int i = 0; i < low; i++ )
100           temporary += " ";
101
102        // output elements left in array
103        for ( int i = low; i <= high; i++ )
104           temporary += " " + data[ i ];
105
106        return temporary;
107     } // end method Subarray
108
109     // method to output values in array
110     public override string ToString()
111     {
112        return Subarray( 0, data.Length - 1 );
113     } // end method ToString
114  } // end class MergeSort

Method SortArray is declared in lines 28–49. Line 31 tests the base case. If the size of the array is 1, the array is already sorted, so the method simply returns immediately. If the size of the array is greater than 1, the method splits the array in two, recursively calls method SortArray to sort the two subarrays and merges them. Line 43 recursively calls method SortArray on the first half of the array, and line 44 recursively calls method SortArray on the second half of the array. When these two method calls return, each half of the array has been sorted. Line 47 calls method Merge (lines 52–91) on the two halves of the array to combine the two sorted arrays in one larger sorted array.

Lines 64–72 in method Merge loop until the application reaches the end of either subarray. Line 68 tests which element at the beginning of the arrays is smaller. If the element in the left array is smaller or equal, line 69 places it in position in the combined array. If the element in the right array is smaller, line 71 places it in position in the combined array. When the while loop has completed (line 72), one entire subarray is placed in the combined array, but the other subarray still contains data. Line 75 tests whether the left array has reached the end. If so, lines 77–78 fill the combined array with the remaining elements of the right array. If the left array has not reached the end, then the right array has, and lines 81–82 fill the combined array with the remaining elements of the left array. Finally, lines 85–86 copy the combined array into the original array. Figure 20.11 creates and uses a MergeSort object. The output from this application displays the splits and merges performed by merge sort, showing the progress of the sort at each step of the algorithm.

Example 20.11. Testing the merge sort class.

 1   // Fig. 20.11: MergeSortTest.cs
 2   // Testing the merge sort class.
 3   using System;
 4
 5   public class MergeSortTest
 6   {
 7      public static void Main( string[] args )
 8      {
 9         // create object to perform merge sort
10         MergeSort sortArray = new MergeSort( 10 );
11
12         // print unsorted array
13         Console.WriteLine( "Unsorted: {0}
", sortArray );
14
15         sortArray.Sort(); // sort array
16
17         // print sorted array
18         Console.WriteLine( "Sorted: {0}", sortArray );
19      } // end Main
20   } // end class MergeSortTest

Unsorted: 36 38 81 93 85 72 31 11 33 74

split:   36 38 81 93 85 72 31 11 33 74
         36 38 81 93 85
                        72 31 11 33 74

split:   36 38 81 93 85
         36 38 81
                  93 85
split:   36 38 81
         36 38
               81
split:   36 38
         36
            38

merge:   36
            38
         36 38

merge:   36 38
               81
         36 38 81
split:            93 85
                  93
                     85
merge:            93
                     85
                  85 93
merge:   36 38 81
                  85 93
         36 38 81 85 93
split:                    72 31 11 33 74
                          72 31 11
                                   33 74
split:                    72 31 11
                          72 31
                                11
split:                    72 31
                          72
                             31
merge:                    72
                             31
                          31 72
merge:                    31 72
                                11
                          11 31 72
split:                             33 74
                                   33
                                      74
merge:                             33
                                      74
                                   33 74
merge:                    11 31 72
                                   33 74
                          11 31 33 72 74
merge:     36 38 81 85 93
                          11 31 33 72 74
           11 31 33 36 38 72 74 81 85 93
Sorted:   11 31 33 36 38 72 74 81 85 93

Efficiency of Merge Sort

Merge sort is a far more efficient algorithm than either insertion sort or selection sort when sorting large sets of data. Consider the first (nonrecursive) call to method SortArray. This results in two recursive calls to method SortArray with subarrays each approximately half the size of the original array, and a single call to method Merge. This call to method Merge requires, at worst, n − 1 comparisons to fill the original array, which is O(n). (Recall that each element in the array can be chosen by comparing one element from each of the subarrays.) The two calls to method SortArray result in four more recursive calls to SortArray, each with a subarray approximately a quarter the size of the original array, along with two calls to method Merge. These two calls to method Merge each require, at worst, n/2 − 1 comparisons for a total number of comparisons of (n/2 − 1) + (n/2 − 1) = n − 2, which is O(n). This process continues, each call to SortArray generating two additional calls to method SortArray and a call to Merge, until the algorithm has split the array into one-element subarrays. At each level, O(n) comparisons are required to merge the subarrays. Each level splits the size of the arrays in half, so doubling the size of the array requires only one more level. Quadrupling the size of the array requires only two more levels. This pattern is logarithmic and results in log2 n levels. This results in a total efficiency of O(n log n).

Summary of the Efficiency of Searching and Sorting Algorithms

Figure 20.12 summarizes many of the searching and sorting algorithms covered in this book and lists the Big O of each. Figure 20.13 lists the Big O values covered in this chapter, along with a number of values for n to highlight the differences in the growth rates.

Table 20.12. Searching and sorting algorithms with Big O values.

Algorithm

Location

Big O

Searching Algorithms:

  

Linear Search

Section 20.2.1

O(n)

Binary Search

Section 20.2.2

O(log n)

Recursive Linear Search

Exercise 20.8

O(n)

Recursive Binary Search

Exercise 20.9

O(log n)

Sorting Algorithms:

  

Selection Sort

Section 20.3.1

O(n2)

Insertion Sort

Section 20.3.2

O(n2)

Merge Sort

Section 20.3.3

O(n log n)

Bubble Sort

Exercises 20.520.6

O(n2)

Table 20.13. Number of comparisons for common Big O notations.

n =

O(log n)

O(n)

O(n log n)

O(n2)

1

0

1

0

1

2

1

2

2

4

3

1

3

3

9

4

1

4

4

16

5

1

5

5

25

10

1

10

10

100

100

2

100

200

10000

1,000

3

1000

3000

106

1,000,000

6

1000000

6000000

1012

1,000,000,000

9

1000000000

9000000000

1018

Wrap-Up

In this chapter, you learned how to search for items in arrays and how to sort arrays so that their elements are arranged in order. We discussed linear search and binary search, and selection sort, insertion sort and merge sort. You learned that linear search can operate on any set of data, but that binary search requires the data to be sorted first. You also learned that the simplest searching and sorting algorithms can exhibit poor performance. We introduced Big O notation—a measure of the efficiency of algorithms—and used it to compare the efficiency of the algorithms we discussed. In the next chapter, you’ll learn about dynamic data structures that can grow or shrink at execution time.

Summary

Section 20.1 Introduction

  • Searching involves determining if a search key is present in the data and, if so, finding its location.

  • Sorting involves arranging data in order.

Section 20.2.1 Linear Search

  • The linear search algorithm searches each element in an array sequentially until it finds the element that matches the search key. If the search key is not in the array, the algorithm tests each element in the array, and when the end of the array is reached, informs the user that the search key is not present, usually by means of a sentinel value.

  • One way to describe the efficiency of an algorithm is with Big O notation (O), which indicates how hard an algorithm may have to work to solve a problem.

  • In searching and sorting algorithms, Big O is dependent on how many elements are in the data.

  • An O(n) algorithm is referred to as having a linear runtime.

  • Big O is designed to highlight dominant factors and ignore terms that become unimportant with high n values. Big O notation is concerned with the growth rate of algorithm runtimes, so constants are ignored.

  • The linear search algorithm runs in O(n) time.

  • The worst case in linear search is that every element must be checked to determine whether the search item exists. This occurs if the search key is the last element in the array or is not present.

Section 20.2.2 Binary Search

  • The binary search algorithm is more efficient than the linear search algorithm, but it requires that the array be sorted.

  • The first iteration of binary search tests the middle array element. If this is the search key, the algorithm returns its location. If the search key is less than the middle element, the search continues with the first half of the array. If the search key is greater than the middle element, the search continues with the second half of the array. Each iteration of binary search tests the middle value of the remaining array and, if the element is not found, eliminates half of the remaining elements.

  • Binary search is a more efficient searching algorithm than linear search because with each comparison it eliminates from consideration half of the elements in the array.

  • Binary search runs in O(log n) time, because each step removes half of the remaining elements.

Section 20.3.1 Selection Sort

  • The selection sort is a simple, but inefficient, sorting algorithm.

  • The first iteration of the selection sort selects the smallest element in the array and swaps it with the first element. The second iteration selects the second-smallest element (which is the smallest remaining element) and swaps it with the second element. Selection sort continues until the largest element is in the last position. After the ith iteration of selection sort, the smallest i elements of the whole array are sorted into the first i positions.

  • The selection sort algorithm runs in O(n2) time.

Section 20.3.2 Insertion Sort

  • The first iteration of insertion sort takes the second element in the array and, if it’s less than the first, swaps them. The second iteration looks at the third element and inserts it in the correct position with respect to the first two. After the ith iteration of insertion sort, the first i elements in the original array are sorted.

  • The insertion sort algorithm runs in O(n2) time.

Section 20.3.3 Merge Sort

  • Merge sort is a sorting algorithm that is faster, but more complex to implement, than selection sort and insertion sort.

  • The merge sort algorithm sorts an array by splitting it into two equal-sized subarrays, sorting each recursively and merging them into one larger array.

  • Merge sort’s base case is an array with one element. A one-element array is already sorted.

  • Merge sort performs the merge by looking at the first element in each array, which is also the smallest. Merge sort takes the smallest of these and places it in the first element of the larger array. If there are still elements in the subarray, merge sort looks at the second element in that subarray (which is now the smallest element remaining) and compares it to the first element in the other subarray. Merge sort continues this process until the larger array is filled.

  • In the worst case, the first call to merge sort has to make O(n) comparisons to fill the n slots in the final array.

  • The merging portion of the merge sort algorithm is performed on two subarrays, each of approximately size n/2. Creating each of these subarrays requires n/2 − 1 comparisons for each subarray, or O(n) comparisons total. This pattern continues as each level works on twice as many arrays, but each is half the size of the previous array. Similar to binary search, this halving results in log n levels for a total efficiency of O(n log n).

Self-Review Exercises

20.1

Fill in the blanks in each of the following statements:

  1. A selection sort application would take approximately ____________ times as long to run on a 128-element array as on a 32-element array.

  2. The efficiency of merge sort is ____________.

20.1

  1. 16, because an O(n2) algorithm takes 16 times as long to sort four times as much information.

  2. O(n log n).

20.2

What key aspect of both the binary search and the merge sort accounts for the logarithmic portion of their respective Big Os?

20.2

Both of these algorithms incorporate “halving”—somehow reducing something by half. The binary search eliminates from consideration one half of the array after each comparison. The merge sort splits the array in half each time it’s called.

20.3

In what sense is the insertion sort superior to the merge sort? In what sense is the merge sort superior to the insertion sort?

20.3

The insertion sort is easier to understand and to program than the merge sort. The merge sort is far more efficient (O(n log n)) than the insertion sort (O(n2)).

20.4

In the text, we say that after the merge sort splits the array into two subarrays, it then sorts these two subarrays and merges them. Why might someone be puzzled by our statement that “it then sorts these two subarrays”?

20.4

In a sense, it does not really sort these two subarrays. It simply keeps splitting the original array in half until it provides a one-element subarray, which is, of course, sorted. It then builds up the original two subarrays by merging these one-element arrays to form larger subarrays, which are then merged until the whole array has been sorted.

Answers to Self-Review Exercises

Exercises

20.5

(Bubble Sort) Implement the bubble sort—another simple, yet inefficient, sorting technique. It’s called bubble sort or sinking sort because smaller values gradually “bubble” their way to the top of the array (i.e., toward the first element) like air bubbles rising in water, while the larger values sink to the bottom (end) of the array. The technique uses nested loops to make several passes through the array. Each pass compares successive overlapping pairs of elements (i.e., elements 0 and 1, 1 and 2, 2 and 3, etc.). If a pair is in increasing order (or the values are equal), the bubble sort leaves the values as they are. If a pair is in decreasing order, the bubble sort swaps their values in the array.

The first pass compares the first two elements of the array and swaps them if necessary. It then compares the second and third elements. The end of this pass compares the last two elements in the array and swaps them if necessary. After one pass, the largest element will be in the last position. After two passes, the largest two elements will be in the last two positions. Explain why bubble sort is an O(n2) algorithm.

20.6

(Enhanced Bubble Sort) Make the following simple modifications to improve the performance of the bubble sort you developed in Exercise 20.5:

  1. After the first pass, the largest number is guaranteed to be in the highest-numbered element of the array; after the second pass, the two highest numbers are “in place”; and so on. Instead of making nine comparisons on every pass, modify the bubble sort to make eight comparisons on the second pass, seven on the third pass and so on.

  2. The data in the array may already be in the proper order or near-proper order, so why make nine passes if fewer will suffice? Modify the sort to check at the end of each pass whether any swaps have been made. If none have been made, the data must already be in the proper order, so the application should terminate. If swaps have been made, at least one more pass is needed.

20.7

(Bucket Sort) A bucket sort begins with a one-dimensional array of positive integers to be sorted and a two-dimensional array of integers with rows indexed from 0 to 9 and columns indexed from 0 to n − 1, where n is the number of values to be sorted. Each row of the two-dimensional array is referred to as a bucket. Write a class named BucketSort containing a method called Sort that operates as follows:

  1. Place each value of the one-dimensional array into a row of the bucket array, based on the value’s “ones” (rightmost) digit. For example, 97 is placed in row 7, 3 is placed in row 3 and 100 is placed in row 0. This procedure is called a distribution pass.

  2. Loop through the bucket array row by row, and copy the values back to the original array. This procedure is called a gathering pass. The new order of the preceding values in the one-dimensional array is 100, 3 and 97.

  3. Repeat this process for each subsequent digit position (tens, hundreds, thousands, etc.).

On the second (tens digit) pass, 100 is placed in row 0, 3 is placed in row 0 (because 3 has no tens digit) and 97 is placed in row 9. After the gathering pass, the order of the values in the one-dimensional array is 100, 3 and 97. On the third (hundreds digit) pass, 100 is placed in row 1, 3 is placed in row 0 and 97 is placed in row 0 (after the 3). After the last gathering pass, the original array is in sorted order.

The two-dimensional array of buckets is 10 times the length of the integer array being sorted. This sorting technique provides better performance than a bubble sort, but requires much more memory—the bubble sort requires space for only one additional element of data. This comparison is an example of the space/time trade-off: The bucket sort uses more memory than the bubble sort, but performs better. This version of the bucket sort requires copying all the data back to the original array on each pass. Another possibility is to create a second two-dimensional bucket array and repeatedly swap the data between the two bucket arrays.

20.8

(Recursive Linear Search) Modify Fig. 20.2 to use recursive method RecursiveLinearSearch to perform a linear search of the array. The method should receive the search key and starting index as arguments. If the search key is found, return its index in the array; otherwise, return –1. Each call to the recursive method should check one index in the array.

20.9

(Recursive Binary Search) Modify Fig. 20.4 to use recursive method RecursiveBinary-Search to perform a binary search of the array. The method should receive the search key, starting index and ending index as arguments. If the search key is found, return its index in the array. If the search key is not found, return –1.

20.10

(Quicksort) The recursive sorting technique called quicksort uses the following basic algorithm for a one-dimensional array of values:

  1. Partitioning Step: Take the first element of the unsorted array and determine its final location in the sorted array (i.e., all values to the left of the element in the array are less than the element, and all values to the right of the element in the array are greater than the element—we show how to do this below). We now have one element in its proper location and two unsorted subarrays.

  2. Recursive Step: Perform Step a on each unsorted subarray.

Each time Step a is performed on a subarray, another element is placed in its final location in the sorted array, and two unsorted subarrays are created. When a subarray consists of one element, that element is in its final location (because a one-element array is already sorted).

The basic algorithm seems simple enough, but how do we determine the final position of the first element of each subarray? As an example, consider the following set of values (the element in bold is the partitioning element—it will be placed in its final location in the sorted array):

  • 37 2 6 4 89 8 10 12 68 45

  1. Starting from the rightmost element of the array, compare each element with 37 until an element less than 37 is found, then swap 37 and that element. The first element less than 37 is 12, so 37 and 12 are swapped. The new array is

    • 12 2 6 4 89 8 10 37 68 45

    Element 12 is in italics to indicate that it was just swapped with 37.

  2. Starting from the left of the array, but beginning with the element after 12, compare each element with 37 until an element greater than 37 is found—then swap 37 and that element. The first element greater than 37 is 89, so 37 and 89 are swapped. The new array is

    • 12 2 6 4 37 8 10 89 68 45

  3. Starting from the right, but beginning with the element before 89, compare each element with 37 until an element less than 37 is found—then swap 37 and that element. The first element less than 37 is 10, so 37 and 10 are swapped. The new array is

    • 12 2 6 4 10 8 37 89 68 45

  4. Starting from the left, but beginning with the element after 10, compare each element with 37 until an element greater than 37 is found—then swap 37 and that element. There are no more elements greater than 37, so when we compare 37 with itself, we know that 37 has been placed in its final location of the sorted array. Every value to the left of 37 is smaller than it, and every value to the right of 37 is larger than it.

Once the partition has been applied on the previous array, there are two unsorted subarrays. The subarray with values less than 37 contains 12, 2, 6, 4, 10 and 8. The subarray with values greater than 37 contains 89, 68 and 45. The sort continues recursively, with both subarrays being partitioned in the same manner as the original array.

Based on the preceding discussion, write recursive method QuickSortHelper to sort a one-dimensional integer array. The method should receive as arguments a starting index and an ending index in the original array being sorted.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.138.122.11