Chapter 23. Collections

 

The shapes a bright container can contain!

 
 --Theodore Roethke
 

I think this is the most extraordinary collection of talent, of human knowledge, that has ever been gathered together at the White House—with the possible exception of when Thomas Jefferson dined alone.

 
 --John F. Kennedy
<feature> <supertitle>Objectives</supertitle>

In this chapter you’ll learn:

<objective>

The nongeneric and generic collections that are provided by the .NET Framework.

</objective>
<objective>

To use class Array’s static methods to manipulate arrays.

</objective>
<objective>

To use enumerators to “walk through” a collection.

</objective>
<objective>

To use the foreach statement with the .NET collections.

</objective>
<objective>

To use nongeneric collection classes ArrayList, Stack, and Hashtable.

</objective>
<objective>

To use generic collection classes SortedDictionary and LinkedList.

</objective>
</feature>
<feature> <supertitle>Outline</supertitle> </feature>

Introduction

Chapter 21 discussed how to create and manipulate data structures. The discussion was “low level,” in the sense that we painstakingly created each element of each data structure dynamically with new and modified the data structures by directly manipulating their elements and references to their elements. For the vast majority of applications, there’s no need to build custom data structures. Instead, you can use the prepackaged data-structure classes provided by the .NET Framework. These classes are known as collection classes—they store collections of data. Each instance of one of these classes is a collection of items. Some examples of collections are the cards you hold in a card game, the songs stored in your computer, the real-estate records in your local registry of deeds (which map book numbers and page numbers to property owners), and the players on your favorite sports team.

Collection classes enable programmers to store sets of items by using existing data structures, without concern for how they’re implemented. This is a nice example of code reuse. Programmers can code faster and expect excellent performance, maximizing execution speed and minimizing memory consumption. In this chapter, we discuss the collection interfaces that list the capabilities of each collection type, the implementation classes and the enumerators that “walk through” collections.

The .NET Framework provides three namespaces dedicated to collections. Namespace System.Collections contains collections that store references to objects. We included these because there’s a large amount of legacy code in industry that uses these collections. Most new applications should use the collections in the System.Collections.Generic namespace, which contains generic classes—such as the List<T> and Dictionary<K, V> classes you learned previously—to store collections of specific types. The System.Collections.Specialized namespace contains several collections that support specific types, such as strings and bits. You can learn more about this namespace at msdn.microsoft.com/en-us/library/system.collections.specialized.aspx. The collections in these namespaces provide standardized, reusable components; you do not need to write your own collection classes. These collections are written for broad reuse. They’re tuned for rapid execution and for efficient use of memory. As new data structures and algorithms are developed that fit this framework, a large base of programmers already will be familiar with the interfaces and algorithms implemented by those data structures.

Collections Overview

All collection classes in the .NET Framework implement some combination of the collection interfaces. These interfaces declare the operations to be performed generically on various types of collections. Figure 23.1 lists some of the interfaces of the .NET Framework collections. All the interfaces in Fig. 23.1 are declared in namespace System.Collections and have generic analogs in namespace System.Collections.Generic. Implementations of these interfaces are provided within the framework. Programmers may also provide implementations specific to their own requirements.

Table 23.1. Some common collection interfaces.

Interface

Description

ICollection

The interface from which interfaces IList and IDictionary inherit. Contains a Count property to determine the size of a collection and a CopyTo method for copying a collection’s contents into a traditional array.

IList

An ordered collection that can be manipulated like an array. Provides an indexer for accessing elements with an int index. Also has methods for searching and modifying a collection, including Add, Remove, Contains and IndexOf.

IDictionary

A collection of values, indexed by an arbitrary “key” object. Provides an indexer for accessing elements with an object index and methods for modifying the collection (e.g., Add, Remove). IDictionary property Keys contains the objects used as indices, and property Values contains all the stored objects.

IEnumerable

An object that can be enumerated. This interface contains exactly one method, GetEnumerator, which returns an IEnumerator object (discussed in Section 23.3). ICollection extends IEnumerable, so all collection classes implement IEnumerable directly or indirectly.

In earlier versions of C#, the .NET Framework primarily provided the collection classes in the System.Collections and System.Collections.Specialized namespaces. These classes stored and manipulated object references. You could store any object in a collection. One inconvenient aspect of storing object references occurs when retrieving them from a collection. An application normally needs to process specific types of objects. As a result, the object references obtained from a collection typically need to be downcast to an appropriate type to allow the application to process the objects correctly.

The .NET Framework also includes the System.Collections.Generic namespace, which uses the generics capabilities we introduced in Chapter 22. Many of these classes are simply generic counterparts of the classes in namespace System.Collections. This means that you can specify the exact type that will be stored in a collection. You also receive the benefits of compile-time type checking—the compiler ensures that you’re using appropriate types with your collection and, if not, issues compile-time error messages. Also, once you specify the type stored in a collection, any item you retrieve from the collection will have the correct type. This eliminates the need for explicit type casts that can throw InvalidCastExceptions at execution time if the referenced object is not of the appropriate type. This also eliminates the overhead of explicit casting, improving efficiency and type safety. Generic collections are especially useful for storing structs, since they eliminate the overhead of boxing and unboxing.

This chapter demonstrates collection classes Array, ArrayList, Stack, Hashtable, generic SortedDictionary, and generic LinkedList—plus built-in array capabilities. Namespace System.Collections provides several other data structures, including BitArray (a collection of true/false values), Queue and SortedList (a collection of key/value pairs that are sorted by key and can be accessed either by key or by index). Figure 23.2 summarizes many of the collection classes. We also discuss the IEnumerator interface. Collection classes can create enumerators that allow programmers to walk through the collections. Although these enumerators have different implementations, they all implement the IEnumerator interface so that they can be processed polymorphically. As we’ll soon see, the foreach statement is simply a convenient notation for using an enumerator. In the next section, we begin our discussion by examining enumerators and the capabilities for array manipulation. [Note: Collection classes directly or indirectly implement ICollection and IEnumerable (or their generic equivalents ICollection<T> and IEnumerable<T> for generic collections).]

Table 23.2. Some collection classes of the .NET Framework.

Class

Implements

Description

System namespace:

Array

IList

The base class of all conventional arrays. See Section 23.3.

System.Collections namespace:

ArrayList

IList

Mimics conventional arrays, but will grow or shrink as needed to accommodate the number of elements. See Section 23.4.1.

BitArray

ICollection

A memory-efficient array of bools.

Hashtable

IDictionary

An unordered collection of key/value pairs that can be accessed by key. See Section 23.4.3.

Queue

ICollection

A first-in, first-out collection. See Section 21.6.

SortedList

IDictionary

A collection of key/value pairs that are sorted by key and can be accessed either by key or by index.

Stack

ICollection

A last-in, first-out collection. See Section 23.4.2.

System.Collections.Generic namespace:

Dictionary< K, V >

IDictionary< K, V >

A generic, unordered collection of key/value pairs that can be accessed by key. See Section 17.4.

LinkedList< T >

ICollection< T >

A doubly linked list. See Section 23.5.2.

List< T >

IList< T >

A generic ArrayList. Section 9.4.

Queue< T >

ICollection< T >

A generic Queue.

SortedDictionary< K, V >

IDictionary< K, V >

A Dictionary that sorts the data by the keys in a binary tree. See Section 23.5.1.

SortedList< K, V >

IDictionary< K, V >

A generic SortedList.

Stack< T >

ICollection< T >

A generic Stack. See.

Class Array and Enumerators

Chapter 8 presented basic array-processing capabilities. All arrays implicitly inherit from abstract base class Array (namespace System); this class defines property Length, which specifies the number of elements in the array. In addition, class Array provides static methods that provide algorithms for processing arrays. Typically, class Array overloads these methods—for example, Array method Reverse can reverse the order of the elements in an entire array or can reverse the elements in a specified range of elements in an array. For a complete list of class Array’s static methods visit:

msdn.microsoft.com/en-us/library/system.array.aspx

Figure 23.3 demonstrates several static methods of class Array.

Example 23.3. Array class used to perform common array manipulations.

 1   // Fig. 23.3: UsingArray.cs
 2   // Array class static methods for common array manipulations.
 3   using System;
 4   using System.Collections;
 5
 6   // demonstrate algorithms of class Array
 7   public class UsingArray
 8   {
 9      private static int[] intValues = { 1, 2, 3, 4, 5, 6 };
10      private static double[] doubleValues = { 8.4, 9.3, 0.2, 7.9, 3.4 };
11      private static int[] intValuesCopy;
12
13      // method Main demonstrates class Array's methods
14      public static void Main( string[] args )
15      {
16         intValuesCopy = new int[ intValues.Length ]; // defaults to zeroes
17
18         Console.WriteLine( "Initial array values:
" );
19         PrintArrays(); // output initial array contents
20
21         // sort doubleValues
22         Array.Sort( doubleValues );
23
24         // copy intValues into intValuesCopy
25         Array.Copy( intValues, intValuesCopy, intValues.Length );
26
27         Console.WriteLine( "
Array values after Sort and Copy:
" );
28         PrintArrays(); // output array contents
29         Console.WriteLine();
30
31         // search for 5 in intValues
32         int result = Array.BinarySearch( intValues, 5 );
33         if ( result >= 0 )
34            Console.WriteLine( "5 found at element {0} in intValues",
35               result );
36         else
37            Console.WriteLine( "5 not found in intValues" );
38
39         // search for 8763 in intValues
40         result = Array.BinarySearch( intValues, 8763 );
41         if ( result >= 0 )
42            Console.WriteLine( "8763 found at element {0} in intValues",
43               result );
44         else
45            Console.WriteLine( "8763 not found in intValues" );
46      } // end Main
47
48      // output array content with enumerators
49      private static void PrintArrays()
50      {
51         Console.Write( "doubleValues: " );
52
53         // iterate through the double array with an enumerator
54         IEnumerator enumerator = doubleValues.GetEnumerator();
55                                                               
56         while ( enumerator.MoveNext() )                       
57            Console.Write( enumerator.Current + " " );         
58
59         Console.Write( "
intValues: " );
60
61         // iterate through the int array with an enumerator
62         enumerator = intValues.GetEnumerator();      
63                                                      
64         while ( enumerator.MoveNext() )              
65            Console.Write( enumerator.Current + " " );
66
67         Console.Write( "
intValuesCopy: " );
68
69         // iterate through the second int array with a foreach statement
70         foreach ( var element in intValuesCopy )
71            Console.Write( element + " " );      
72
73         Console.WriteLine();
74      } // end method PrintArrays
75   } // end class UsingArray
Initial array values:

doubleValues: 8.4 9.3 0.2 7.9 3.4
intValues: 1 2 3 4 5 6
intValuesCopy: 0 0 0 0 0 0

Array values after Sort and Copy:

doubleValues: 0.2 3.4 7.9 8.4 9.3
intValues: 1 2 3 4 5 6
intValuesCopy: 1 2 3 4 5 6

5 found at element 4 in intValues
8763 not found in intValues

The using directives in lines 3–4 include the namespaces System (for classes Array and Console) and System.Collections (for interface IEnumerator, which we discuss shortly). References to the assemblies for these namespaces are implicitly included in every application, so we do not need to add any new references to the project file.

Our test class declares three static array variables (lines 9–11). The first two lines initialize intValues and doubleValues to an int and double array, respectively. Static variable intValuesCopy is intended to demonstrate the Array’s Copy method, so it’s left with the default value null—it does not yet refer to an array.

Line 16 initializes intValuesCopy to an int array with the same length as array intValues. Line 19 calls the PrintArrays method (lines 49–74) to output the initial contents of all three arrays. We discuss the PrintArrays method shortly. We can see from the output of Fig. 23.3 that each element of array intValuesCopy is initialized to the default value 0.

Line 22 uses static Array method Sort to sort array doubleValues. When this method returns, the array contains its original elements sorted in ascending order. The elements in the array must implement the IComparable interface.

Line 25 uses static Array method Copy to copy elements from array intValues to array intValuesCopy. The first argument is the array to copy (intValues), the second argument is the destination array (intValuesCopy) and the third argument is an int representing the number of elements to copy (in this case, intValues.Length specifies all elements).

Lines 32 and 40 invoke static Array method BinarySearch to perform binary searches on array intValues. Method BinarySearch receives the sorted array in which to search and the key for which to search. The method returns the index in the array at which it finds the key (or a negative number if the key was not found). BinarySearch assumes that it receives a sorted array. Its behavior on an unsorted array is unpredictable. Chapter 20 discussed binary searching in detail.

Method PrintArrays (lines 49–74) uses class Array’s methods to loop though each array. The GetEnumerator method (line 54) obtains an enumerator for array doubleValues. Recall that Array implements the IEnumerable interface. All arrays inherit implicitly from Array, so both the int[] and double[] array types implement IEnumerable interface method GetEnumerator, which returns an enumerator that can iterate over the collection. Interface IEnumerator (which all enumerators implement) defines methods MoveNext and Reset and property Current. MoveNext moves the enumerator to the next element in the collection. The first call to MoveNext positions the enumerator at the first element of the collection. MoveNext returns true if there’s at least one more element in the collection; otherwise, the method returns false. Method Reset positions the enumerator before the first element of the collection. Methods MoveNext and Reset throw an InvalidOperationException if the contents of the collection are modified in any way after the enumerator is created. Property Current returns the object at the current location in the collection.

Common Programming Error 23.1

Common Programming Error 23.1

If a collection is modified after an enumerator is created for that collection, the enumerator immediately becomes invalid—any methods called on the enumerator after this point throw InvalidOperationExceptions. For this reason, enumerators are said to be “fail fast.”

When an enumerator is returned by the GetEnumerator method in line 54, it’s initially positioned before the first element in Array doubleValues. Then when line 56 calls MoveNext in the first iteration of the while loop, the enumerator advances to the first element in doubleValues. The while statement in lines 56–57 loops over each element until the enumerator passes the end of doubleValues and MoveNext returns false. In each iteration, we use the enumerator’s Current property to obtain and output the current array element. Lines 62–65 iterate over array intValues.

Notice that PrintArrays is called twice (lines 19 and 28), so GetEnumerator is called twice on doubleValues. The GetEnumerator method (lines 54 and 62) always returns an enumerator positioned before the first element. Also notice that the IEnumerator property Current is read-only. Enumerators cannot be used to modify the contents of collections, only to obtain the contents.

Lines 70–71 use a foreach statement to iterate over the collection elements like an enumerator. In fact, the foreach statement behaves exactly like an enumerator. Both loop over the elements of an array one by one in consecutive order. Neither allows you to modify the elements during the iteration. This is not a coincidence. The foreach statement implicitly obtains an enumerator via the GetEnumerator method and uses the enumerator’s MoveNext method and Current property to traverse the collection, just as we did explicitly in lines 54–57. For this reason, we can use the foreach statement to iterate over any collection that implements the IEnumerable interface—not just arrays. We demonstrate this functionality in the next section when we discuss class ArrayList.

Other static Array methods include Clear (to set a range of elements to 0, false or null, as appropriate), CreateInstance (to create a new array of a specified type), IndexOf (to locate the first occurrence of an object in an array or portion of an array), LastIndexOf (to locate the last occurrence of an object in an array or portion of an array) and Reverse (to reverse the contents of an array or portion of an array).

Nongeneric Collections

The System.Collections namespace in the .NET Framework Class Library is the primary source for nongeneric collections. These classes provide standard implementations of many of the data structures discussed in Chapter 21 with collections that store references of type object. In this section, we demonstrate classes ArrayList, Stack and Hashtable.

Class ArrayList

In most programming languages, conventional arrays have a fixed size—they cannot be changed dynamically to conform to an application’s execution-time memory requirements. In some applications, this fixed-size limitation presents a problem for programmers. They must choose between using fixed-size arrays that are large enough to store the maximum number of elements the application may require and using dynamic data structures that can grow and shrink the amount of memory required to store data in response to the changing requirements of an application at execution time.

The .NET Framework’s ArrayList collection class mimics the functionality of conventional arrays and provides dynamic resizing of the collection through the class’s methods. At any time, an ArrayList contains a certain number of elements less than or equal to its capacity—the number of elements currently reserved for the ArrayList. An application can manipulate the capacity with ArrayList property Capacity. [Note: New applications should use the generic List<T> class introduced in Chapter 9.]

Performance Tip 23.1

Performance Tip 23.1

As with linked lists, inserting additional elements into an ArrayList whose current size is less than its capacity is a fast operation.

Performance Tip 23.2

Performance Tip 23.2

It’s a slow operation to insert an element into an ArrayList that needs to grow larger to accommodate a new element. An ArrayList that’s at its capacity must have its memory reallocated and the existing values copied into it.

Performance Tip 23.3

Performance Tip 23.3

If storage is at a premium, use method TrimToSize of class ArrayList to trim an ArrayList to its exact size. This will optimize an ArrayList’s memory use. Be careful—if the application needs to insert additional elements, the process will be slower, because the ArrayList must grow dynamically (trimming leaves no room for growth).

ArrayLists store references to objects. All classes derive from class object, so an ArrayList can contain objects of any type. Figure 23.4 lists some useful methods and properties of class ArrayList.

Table 23.4. Some methods and properties of class ArrayList.

Method or property

Description

Add

Adds an object to the ArrayList and returns an int specifying the index at which the object was added.

Capacity

Property that gets and sets the number of elements for which space is currently reserved in the ArrayList.

Clear

Removes all the elements from the ArrayList.

Contains

Returns true if the specified object is in the ArrayList; otherwise, returns false.

Count

Read-only property that gets the number of elements stored in the ArrayList.

IndexOf

Returns the index of the first occurrence of the specified object in the ArrayList.

Insert

Inserts an object at the specified index.

Remove

Removes the first occurrence of the specified object.

RemoveAt

Removes an object at the specified index.

RemoveRange

Removes a specified number of elements starting at a specified index in the ArrayList.

Sort

Sorts the ArrayList.

TrimToSize

Sets the Capacity of the ArrayList to the number of elements the ArrayList currently contains (Count).

Figure 23.5 demonstrates class ArrayList and several of its methods. Class ArrayList belongs to the System.Collections namespace (line 4). Lines 8–11 declare two arrays of strings (colors and removeColors) that we’ll use to fill two ArrayList objects. Recall from Section 10.10 that constants must be initialized at compile time, but readonly variables can be initialized at execution time. Arrays are objects created at execution time, so we declare colors and removeColors with readonly—not const—to make them unmodifiable. When the application begins execution, we create an ArrayList with an initial capacity of one element and store it in variable list (line 16). The foreach statement in lines 19–20 adds the five elements of array colors to list via ArrayList’s Add method, so list grows to accommodate these new elements. Line 24 uses ArrayList’s overloaded constructor to create a new ArrayList initialized with the contents of array removeColors, then assigns it to variable removeList. This constructor can initialize the contents of an ArrayList with the elements of any ICollection passed to it. Many of the collection classes have such a constructor. Notice that the constructor call in line 24 performs the task of lines 19–20.

Example 23.5. Using class ArrayList.

 1   // Fig. 23.5: ArrayListTest.cs
 2   // Using class ArrayList.
 3   using System;
 4   using System.Collections;
 5
 6   public class ArrayListTest
 7   {
 8      private static readonly string[] colors =
 9         { "MAGENTA", "RED", "WHITE", "BLUE", "CYAN" };
10      private static readonly string[] removeColors =
11         { "RED", "WHITE", "BLUE" };
12
13      // create ArrayList, add colors to it and manipulate it
14      public static void Main( string[] args )
15      {
16         ArrayList list = new ArrayList( 1 ); // initial capacity of 1
17
18         // add the elements of the colors array to the ArrayList list
19         foreach ( var color in colors )
20            list.Add( color ); // add color to the ArrayList list
21
22         // add elements in the removeColors array to
23         // the ArrayList removeList with the ArrayList constructor
24         ArrayList removeList = new ArrayList( removeColors );
25
26         Console.WriteLine( "ArrayList: " );
27         DisplayInformation( list ); // output the list
28
29         // remove from ArrayList list the colors in removeList
30         RemoveColors( list, removeList );
31
32         Console.WriteLine( "
ArrayList after calling RemoveColors: " );
33         DisplayInformation( list ); // output list contents
34      } // end Main
35
36      // displays information on the contents of an array list
37      private static void DisplayInformation( ArrayList arrayList )
38      {
39         // iterate through array list with a foreach statement
40         foreach ( var element in arrayList )                    
41            Console.Write( "{0} ", element ); // invokes ToString
42
43         // display the size and capacity
44         Console.WriteLine( "
Size = {0}; Capacity = {1}",
45            arrayList.Count, arrayList.Capacity );
46
47         int index = arrayList.IndexOf( "BLUE" );
48
49         if ( index != -1 )
50            Console.WriteLine( "The array list contains BLUE at index {0}.",
51               index );
52         else
53            Console.WriteLine( "The array list does not contain BLUE." );
54      } // end method DisplayInformation
55
56      // remove colors specified in secondList from firstList    
57      private static void RemoveColors( ArrayList firstList,     
58         ArrayList secondList )                                  
59      {                                                          
60         // iterate through second ArrayList like an array       
61         for ( int count = 0; count < secondList.Count; count++ )
62            firstList.Remove( secondList[ count ] );             
63      } // end method RemoveColors                               
64   } // end class ArrayListTest

ArrayList:
MAGENTA RED WHITE BLUE CYAN
Size = 5; Capacity = 8
The array list contains BLUE at index 3.

ArrayList after calling RemoveColors:
MAGENTA CYAN
Size = 2; Capacity = 8
The array list does not contain BLUE.

Line 27 calls method DisplayInformation (lines 37–54) to output the contents of the list. This method uses a foreach statement to traverse the elements of an ArrayList. As we discussed in Section 23.3, the foreach statement is a convenient shorthand for calling ArrayList’s GetEnumerator method and using an enumerator to traverse the elements of the collection. Also, line 40 infers that the iteration variable’s type is object because class ArrayList is nongeneric and stores references to objects.

We use properties Count and Capacity (line 45) to display the current number and the maximum number of elements that can be stored without allocating more memory to the ArrayList. The output of Fig. 23.5 indicates that the ArrayList has capacity 8.

In line 47, we invoke method IndexOf to determine the position of the string "BLUE" in arrayList and store the result in local variable index. IndexOf returns -1 if the element is not found. The if statement in lines 49–53 checks if index is -1 to determine whether arrayList contains "BLUE". If it does, we output its index. ArrayList also provides method Contains, which simply returns true if an object is in the ArrayList, and false otherwise. Method Contains is preferred if we do not need the index of the element.

Performance Tip 23.4

Performance Tip 23.4

ArrayList methods IndexOf and Contains each perform a linear search, which is a costly operation for large ArrayLists. If the ArrayList is sorted, use ArrayList method BinarySearch to perform a more efficient search. Method BinarySearch returns the index of the element, or a negative number if the element is not found.

After method DisplayInformation returns, we call method RemoveColors (lines 57–63) with the two ArrayLists. The for statement in lines 61–62 iterates over ArrayList secondList. Line 62 uses an indexer to access an ArrayList element—by following the ArrayList reference name with square brackets ([]) containing the desired index of the element. An ArgumentOutOfRangeException occurs if the specified index is not both greater than 0 and less than the number of elements currently stored in the ArrayList (specified by the ArrayList’s Count property).

We use the indexer to obtain each of secondList’s elements, then remove each one from firstList with the Remove method. This method deletes a specified item from an ArrayList by performing a linear search and removing (only) the first occurrence of the specified object. All subsequent elements shift toward the beginning of the ArrayList to fill the emptied position.

After the call to RemoveColors, line 33 again outputs the contents of list, confirming that the elements of removeList were, indeed, removed.

Class Stack

The Stack class implements a stack data structure and provides much of the functionality that we defined in our own implementation in Section 21.5. Refer to that section for a discussion of stack data-structure concepts. We created a test application in Fig. 21.14 to demonstrate the StackInheritance data structure that we developed. We adapt Fig. 21.14 in Fig. 23.6 to demonstrate the .NET Framework collection class Stack. [Note: New applications requiring a stack class should use the generic Stack<T> class.]

Example 23.6. Demonstrating class Stack.

 1   // Fig. 23.6: StackTest.cs
 2   // Demonstrating class Stack.
 3   using System;
 4   using System.Collections;
 5
 6   public class StackTest
 7   {
 8      public static void Main( string[] args )
 9      {
10         Stack stack = new Stack(); // create an empty Stack
11
12         // create objects to store in the stack
13         bool aBoolean = true;
14         char aCharacter = '$';
15         int anInteger = 34567;
16         string aString = "hello";
17
18         // use method Push to add items to (the top of) the stack
19         stack.Push( aBoolean );
20         PrintStack( stack );
21         stack.Push( aCharacter );
22         PrintStack( stack );
23         stack.Push( anInteger );
24         PrintStack( stack );
25         stack.Push( aString );
26         PrintStack( stack );
27
28         // check the top element of the stack
29         Console.WriteLine( "The top element of the stack is {0}
",
30            stack.Peek() );
31
32         // remove items from stack
33         try
34         {
35            while ( true )                                    
36            {                                                 
37               object removedObject = stack.Pop();            
38               Console.WriteLine( removedObject + " popped" );
39               PrintStack( stack );                           
40            } // end while                                    
41         } // end try
42         catch ( InvalidOperationException exception )
43         {
44            // if exception occurs, output stack trace
45            Console.Error.WriteLine( exception );
46         } // end catch
47      } // end Main
48
49      // display the contents of a stack
50      private static void PrintStack( Stack stack )
51      {
52         if ( stack.Count == 0 )
53            Console.WriteLine( "stack is empty
" ); // the stack is empty
54         else
55         {
56            Console.Write( "The stack is: " );
57
58            // iterate through the stack with a foreach statement
59            foreach ( var element in stack )                        
60               Console.Write( "{0} ", element ); // invokes ToString
61
62            Console.WriteLine( "
" );
63         } // end else
64      } // end method PrintStack
65   } // end class StackTest
The stack is: True

The stack is: $ True

The stack is: 34567 $ True

The stack is: hello 34567 $ True

The top element of the stack is hello

hello popped
The stack is: 34567 $ True

34567 popped
The stack is: $ True

$ popped
The stack is: True

True popped
stack is empty

System.InvalidOperationException: Stack empty.
   at System.Collections.Stack.Pop()
   at StackTest.Main(String[] args) in C:examplesch23
      fig23_06StackTestStackTest.cs:line 37

The using directive in line 4 allows us to use the Stack class with its unqualified name from the System.Collections namespace. Line 10 creates a Stack. As one might expect, class Stack has methods Push and Pop to perform the basic stack operations.

Method Push takes an object as an argument and inserts it at the top of the Stack. If the number of items on the Stack (the Count property) is equal to the capacity at the time of the Push operation, the Stack grows to accommodate more objects. Lines 19–26 use method Push to add four elements (a bool, a char, an int and a string) to the stack and invoke method PrintStack (lines 50–64) after each Push to output the contents of the stack. Notice that this nongeneric Stack class can store only references to objects, so each of the value-type items—the bool, the char and the int—is implicitly boxed before it’s added to the Stack. (Namespace System.Collections.Generic provides a generic Stack class that has many of the same methods and properties used in Fig. 23.6. This version eliminates the overhead of boxing and unboxing simple types.)

Method PrintStack (lines 50–64) uses Stack property Count (implemented to fulfill the contract of interface ICollection) to obtain the number of elements in stack. If the stack is not empty (i.e., Count is not equal to 0), we use a foreach statement to iterate over the stack and output its contents by implicitly invoking the ToString method of each element. The foreach statement implicitly invokes Stack’s GetEnumerator method, which we could have called explicitly to traverse the stack via an enumerator.

Method Peek returns the value of the top stack element but does not remove the element from the Stack. We use Peek at line 30 to obtain the top object of the Stack, then output that object, implicitly invoking the object’s ToString method. An InvalidOperationException occurs if the Stack is empty when the application calls Peek. (We do not need an exception-handling block because we know the stack is not empty here.)

Method Pop takes no arguments—it removes and returns the object currently on top of the Stack. An infinite loop (lines 35–40) pops objects off the stack and outputs them until the stack is empty. When the application calls Pop on the empty stack, an InvalidOperationException is thrown. The catch block (lines 42–46) outputs the exception, implicitly invoking the InvalidOperationException’s ToString method to obtain its error message and stack trace.

Common Programming Error 23.2

Common Programming Error 23.2

Attempting to Peek or Pop an empty Stack (a Stack whose Count property is 0) causes an InvalidOperationException.

Although Fig. 23.6 does not demonstrate it, class Stack also has method Contains, which returns true if the Stack contains the specified object, and returns false otherwise.

Class Hashtable

When an application creates objects of new or existing types, it needs to manage those objects efficiently. This includes sorting and retrieving objects. Sorting and retrieving information with arrays is efficient if some aspect of your data directly matches the key value and if those keys are unique and tightly packed. If you have 100 employees with nine-digit social security numbers and you want to store and retrieve employee data by using the social security number as a key, it would nominally require an array with 1,000,000,000 elements, because there are 1,000,000,000 unique nine-digit numbers. If you have an array that large, you could get high performance storing and retrieving employee records by simply using the social security number as the array index, but it would be a large waste of memory.

Many applications have this problem—either the keys are of the wrong type (i.e., not nonnegative integers), or they’re of the right type but are sparsely spread over a large range.

What is needed is a high-speed scheme for converting keys such as social security numbers and inventory part numbers to unique array indices. Then, when an application needs to store something, the scheme could convert the application key rapidly to an index and the record of information could be stored at that location in the array. Retrieval occurs the same way—once the application has a key for which it wants to retrieve the data record, the application simply applies the conversion to the key, which produces the array index where the data resides in the array and retrieves the data.

The scheme we describe here is the basis of a technique called hashing, in which we store data in a data structure called a hash table. Why the name? Because, when we convert a key into an array index, we literally scramble the bits, making a “hash” of the number. The number actually has no real significance beyond its usefulness in storing and retrieving this particular data record.

A glitch in the scheme occurs when there are collisions (i.e., two different keys “hash into” the same cell, or element, in the array). Since we cannot sort two different data records to the same space, we need to find an alternative home for all records beyond the first that hash to a particular array index. One scheme for doing this is to “hash again” (i.e., to reapply the hashing transformation to the key to provide a next candidate cell in the array). The hashing process is designed so that with just a few hashes, an available cell will be found.

Another scheme uses one hash to locate the first candidate cell. If the cell is occupied, successive cells are searched linearly until an available cell is found. Retrieval works the same way—the key is hashed once, the resulting cell is checked to determine whether it contains the desired data. If it does, the search is complete. If it does not, successive cells are searched linearly until the desired data is found.

The most popular solution to hash-table collisions is to have each cell of the table be a hash “bucket”—typically, a linked list of all the key/value pairs that hash to that cell. This is the solution that the .NET Framework’s Hashtable class implements.

The load factor affects the performance of hashing schemes. The load factor is the ratio of the number of objects stored in the hash table to the total number of cells of the hash table. As this ratio gets higher, the chance of collisions tends to increase.

Performance Tip 23.5

Performance Tip 23.5

The load factor in a hash table is a classic example of a space/time trade-off: By increasing the load factor, we get better memory utilization, but the application runs slower due to increased hashing collisions. By decreasing the load factor, we get better application speed because of reduced hashing collisions, but we get poorer memory utilization because a larger portion of the hash table remains empty.

Computer-science students study hashing schemes in courses called “Data Structures” and “Algorithms.” Recognizing the value of hashing, the .NET Framework provides class Hashtable to enable programmers to easily employ hashing in applications.

This concept is profoundly important in our study of object-oriented programming. Classes encapsulate and hide complexity (i.e., implementation details) and offer user-friendly interfaces. Crafting classes to do this properly is one of the most valued skills in the field of object-oriented programming.

A hash function performs a calculation that determines where to place data in the hash table. The hash function is applied to the key in a key/value pair of objects. Class Hashtable can accept any object as a key. For this reason, class object defines method GetHashCode, which all objects inherit. Most classes that are candidates to be used as keys in a hash table override this method to provide one that performs efficient hash-code calculations for a specific type. For example, a string has a hash-code calculation that’s based on the contents of the string. Figure 23.7 uses a Hashtable to count the number of occurrences of each word in a string. [Note: New applications should use generic class Dictionary<K, V> (introduced in Section 17.4) rather than Hashtable.]

Example 23.7. Application counts the number of occurrences of each word in a string and stores them in a hash table.

 1   // Fig. 23.7: HashtableTest.cs
 2   // Application counts the number of occurrences of each word in a string
 3   // and stores them in a hash table.
 4   using System;
 5   using System.Text.RegularExpressions;
 6   using System.Collections;
 7
 8   public class HashtableTest
 9   {
10      public static void Main( string[] args )
11      {
12         // create hash table based on user input
13         Hashtable table = CollectWords();
14
15         // display hash-table content
16         DisplayHashtable( table );
17      } // end Main
18
19      // create hash table from user input
20      private static Hashtable CollectWords()
21      {
22         Hashtable table = new Hashtable(); // create a new hash table
23
24         Console.WriteLine( "Enter a string: " ); // prompt for user input
25         string input = Console.ReadLine(); // get input
26
27         // split input text into tokens
28         string[] words = Regex.Split( input, @"s+" );
29
30         // processing input words
31         foreach ( var word in words )                               
32         {                                                           
33            string wordKey = word.ToLower(); // get word in lowercase
34                                                                     
35            // if the hash table contains the word                   
36            if ( table.ContainsKey( wordKey ) )                      
37            {                                                        
38               table[ wordKey ] = ( ( int ) table[ wordKey ] ) + 1;  
39            } // end if                                              
40            else                                                     
41               // add new word with a count of 1 to hash table       
42               table.Add( wordKey, 1 );                              
43         } // end foreach                                            
44
45         return table;
46      } // end method CollectWords
47
48      // display hash-table content
49      private static void DisplayHashtable( Hashtable table )
50      {
51         Console.WriteLine( "
Hashtable contains:
{0,-12}{1,-12}",
52            "Key:", "Value:" );
53
54         // generate output for each key in hash table
55         // by iterating through the Keys property with a foreach statement
56         foreach ( var key in table.Keys )                           
57            Console.WriteLine( "{0,-12}{1,-12}", key, table[ key ] );
58
59         Console.WriteLine( "
size: {0}", table.Count );
60      } // end method DisplayHashtable
61   } // end class HashtableTest
Enter a string:
As idle as a painted ship upon a painted ocean

Hashtable contains:
Key:        Value:
ocean       1
a           2
as          2
ship        1
upon        1
painted     2
idle        1

size: 7

Lines 4–6 contain using directives for namespaces System (for class Console), System.Text.RegularExpressions (for class Regex) and System.Collections (for class Hashtable). Class HashtableTest declares three static methods. Method CollectWords (lines 20–46) inputs a string and returns a Hashtable in which each value stores the number of times that word appears in the string and the word is used for the key. Method DisplayHashtable (lines 49–60) displays the Hashtable passed to it in column format. The Main method (lines 10–17) simply invokes CollectWords (line 13), then passes the Hashtable returned by CollectWords to DisplayHashtable in line 16.

Method CollectWords (lines 20–46) begins by initializing local variable table with a new Hashtable (line 22) that has a default maximum load factor of 1.0. When the Hashtable reaches the specified load factor, the capacity is increased automatically. (This implementation detail is invisible to clients of the class.) Lines 24–25 prompt the user and input a string. We use static method Split of class Regex in line 28 to divide the string by its whitespace characters. This creates an array of “words,” which we then store in local variable words.

Lines 31–43 loop over every element of array words. Each word is converted to lowercase with string method ToLower, then stored in variable wordKey (line 33). Then line 36 calls Hashtable method ContainsKey to determine whether the word is in the hash table (and thus has occurred previously in the string). If the Hashtable does not contain an entry for the word, line 42 uses Hashtable method Add to create a new entry in the hash table, with the lowercase word as the key and an object containing 1 as the value. Autoboxing occurs when the application passes integer 1 to method Add, because the hash table stores both the key and value in references of type object.

Common Programming Error 23.3

Common Programming Error 23.3

Using the Add method to add a key that already exists in the hash table causes an ArgumentException.

If the word is already a key in the hash table, line 38 uses the Hashtable’s indexer to obtain and set the key’s associated value (the word count) in the hash table. We first down-cast the value obtained by the get accessor from an object to an int. This unboxes the value so that we can increment it by 1. Then, when we use the indexer’s set accessor to assign the key’s associated value, the incremented value is implicitly reboxed so that it can be stored in the hash table.

Notice that invoking the get accessor of a Hashtable indexer with a key that does not exist in the hash table obtains a null reference. Using the set accessor with a key that does not exist in the hash table creates a new entry, as if you had used the Add method.

Line 45 returns the hash table to the Main method, which then passes it to method DisplayHashtable (lines 49–60), which displays all the entries. This method uses readonly property Keys (line 56) to get an ICollection that contains all the keys. Because ICollection extends IEnumerable, we can use this collection in the foreach statement in lines 56–57 to iterate over the keys of the hash table. This loop accesses and outputs each key and its value in the hash table using the iteration variable and table’s get accessor. Each key and its value is displayed in a field width of -12. The negative field width indicates that the output is left justified. A hash table is not sorted, so the key/value pairs are not displayed in any particular order. Line 59 uses Hashtable property Count to get the number of key/value pairs in the Hashtable.

Lines 56–57 could have also used the foreach statement with the Hashtable object itself, instead of using the Keys property. If you use a foreach statement with a Hashtable object, the iteration variable will be of type DictionaryEntry. The enumerator of a Hashtable (or any other class that implements IDictionary) uses the DictionaryEntry structure to store key/value pairs. This structure provides properties Key and Value for retrieving the key and value of the current element. If you do not need the key, class Hashtable also provides a read-only Values property that gets an ICollection of all the values stored in the Hashtable. We can use this property to iterate through the values stored in the Hashtable without regard for where they’re stored.

Problems with Nongeneric Collections

In the word-counting application of Fig. 23.7, our Hashtable stores its keys and data as object references, even though we store only string keys and int values by convention. This results in some awkward code. For example, line 38 was forced to unbox and box the int data stored in the Hashtable every time it incremented the count for a particular key. This is inefficient. A similar problem occurs in line 56—the iteration variable of the foreach statement is an object reference. If we need to use any of its string-specific methods, we need an explicit downcast.

This can cause subtle bugs. Suppose we decide to improve the readability of Fig. 23.7 by using the indexer’s set accessor instead of the Add method to add a key/value pair in line 42, but accidentally type:

table[ wordKey ] = wordKey; // initialize to 1

This statement will create a new entry with a string key and string value instead of an int value of 1. Although the application will compile correctly, this is clearly incorrect. If a word appears twice, line 38 will try to downcast this string to an int, causing an InvalidCastException at execution time. The error that appears at execution time will indicate that the problem is at line 38, where the exception occurred, not at line 42. This makes the error more difficult to find and debug, especially in large software applications where the exception may occur in a different file—and even in a different assembly.

Generic Collections

The System.Collections.Generic namespace contains generic classes that allow us to create collections of specific types. As you saw in Fig. 23.2, many of the classes are simply generic versions of nongeneric collections. A couple of classes implement new data structures. Here, we demonstrate generic collections SortedDictionary and LinkedList.

Generic Class SortedDictionary

A dictionary is the general term for a collection of key/value pairs. A hash table is one way to implement a dictionary. The .NET Framework provides several implementations of dictionaries, both generic and nongeneric, all of which implement the IDictionary interface (described in Fig. 23.1). The application in Fig. 23.8 is a modification of Fig. 23.7 that uses the generic class SortedDictionary. Generic class SortedDictionary does not use a hash table, but instead stores its key/value pairs in a binary search tree. (We discuss binary trees in depth in Section 21.7.) As the class name suggests, the entries in SortedDictionary are sorted in the tree by key. When the key implements generic interface IComparable<T>, the SortedDictionary uses the results of IComparable<T> method CompareTo to sort the keys. Notice that despite these implementation details, we use the same public methods, properties and indexers with classes Hashtable and SortedDictionary in the same ways. In fact, except for the generic-specific syntax, Fig. 23.8 looks remarkably similar to Fig. 23.7. This is the beauty of object-oriented programming.

Example 23.8. Application counts the number of occurrences of each word in a string and stores them in a generic sorted dictionary.

 1   // Fig. 23.12: SortedDictionaryTest.cs
 2   // Application counts the number of occurrences of each word in a string
 3   // and stores them in a generic sorted dictionary.
 4   using System;
 5   using System.Text.RegularExpressions;
 6   using System.Collections.Generic;
 7
 8   public class SortedDictionaryTest
 9   {
10      public static void Main( string[] args )
11      {
12         // create sorted dictionary based on user input
13         SortedDictionary< string, int > dictionary = CollectWords();
14
15         // display sorted dictionary content
16         DisplayDictionary( dictionary );
17      } // end Main
18
19      // create sorted dictionary from user input
20      private static SortedDictionary< string, int > CollectWords()
21      {
22         // create a new sorted dictionary
23         SortedDictionary< string, int > dictionary =
24            new SortedDictionary< string, int >();   
25
26         Console.WriteLine( "Enter a string: " ); // prompt for user input
27         string input = Console.ReadLine(); // get input
28
29         // split input text into tokens
30         string[] words = Regex.Split( input, @"s+" );
31
32         // processing input words
33         foreach ( var word in words )
34         {
35            string wordKey = word.ToLower(); // get word in lowercase
36
37            // if the dictionary contains the word
38            if ( dictionary.ContainsKey( wordKey ) )
39            {
40               ++dictionary[ wordKey ];
41            } // end if
42            else
43               // add new word with a count of 1 to the dictionary
44               dictionary.Add( wordKey, 1 );
45         } // end foreach
46
47         return dictionary;
48      } // end method CollectWords
49
50      // display dictionary content
51      private static void DisplayDictionary< K, V >(
52         SortedDictionary< K, V > dictionary )      
53      {
54         Console.WriteLine( "
Sorted dictionary contains:
{0,-12}{1,-12}",
55            "Key:", "Value:" );
56
57         // generate output for each key in the sorted dictionary
58         // by iterating through the Keys property with a foreach statement
59         foreach ( K key in dictionary.Keys )
60            Console.WriteLine( "{0,-12}{1,-12}", key, dictionary[ key ] );
61
62         Console.WriteLine( "
size: {0}", dictionary.Count );
63      } // end method DisplayDictionary
64   } // end class SortedDictionaryTest
Enter a string:
We few, we happy few, we band of brothers

Sorted dictionary contains:
Key:        Value:
band        1
brothers    1
few,        2
happy       1
of          1
we          3

size: 6

Line 6 contains a using directive for the System.Collections.Generic namespace, which contains class SortedDictionary. The generic class SortedDictionary takes two type arguments—the first specifies the type of key (i.e., string) and the second the type of value (i.e., int). We have simply replaced the word Hashtable in line 13 and lines 23–24 with SortedDictionary<string, int> to create a dictionary of int values keyed with strings. Now, the compiler can check and notify us if we attempt to store an object of the wrong type in the dictionary. Also, because the compiler now knows that the data structure contains int values, there’s no longer any need for the downcast in line 40. This allows line 40 to use the much more concise prefix increment (++) notation. These changes result in code that can be checked for type safety at compile time.

Static method DisplayDictionary (lines 51–63) has been modified to be completely generic. It takes type parameters K and V. These parameters are used in line 52 to indicate that DisplayDictionary takes a SortedDictionary with keys of type K and values of type V. We use type parameter K again in line 59 as the type of the iteration key. This use of generics is a marvelous example of code reuse. If we decide to change the application to count the number of times each character appears in a string, method DisplayDictionary could receive an argument of type SortedDictionary<char, int> without modification. This is precisely what you’ll do in Exercise 23.12. The key-value pairs displayed are now ordered by key, as shown in Fig. 23.8.

Performance Tip 23.6

Performance Tip 23.6

Because class SortedDictionary keeps its elements sorted in a binary tree, obtaining or inserting a key/value pair takes O(log n) time, which is fast compared to linear searching, then inserting.

Common Programming Error 23.4

Common Programming Error 23.4

Invoking the get accessor of a SortedDictionary indexer with a key that does not exist in the collection causes a KeyNotFoundException. This behavior is different from that of the Hashtable indexer’s get accessor, which would return null.

Generic Class LinkedList

The generic LinkedList class is a doubly linked list—we can navigate the list both backward and forward with nodes of generic class LinkedListNode. Each node contains property Value and read-only properties Previous and Next. The Value property’s type matches LinkedList’s single type parameter because it contains the data stored in the node. The Previous property gets a reference to the preceding node in the linked list (or null if the node is the first of the list). Similarly, the Next property gets a reference to the subsequent reference in the linked list (or null if the node is the last of the list). We demonstrate a few linked-list manipulations in Fig. 23.9.

Example 23.9. Using LinkedLists.

 1   // Fig. 23.9: LinkedListTest.cs
 2   // Using LinkedLists.
 3   using System;
 4   using System.Collections.Generic;
 5
 6   public class LinkedListTest
 7   {
 8      private static readonly string[] colors = { "black", "yellow",
 9         "green", "blue", "violet", "silver" };
10      private static readonly string[] colors2 = { "gold", "white",
11         "brown", "blue", "gray" };
12
13      // set up and manipulate LinkedList objects
14      public static void Main( string[] args )
15      {
16         LinkedList< string > list1 = new LinkedList< string >();
17
18         // add elements to first linked list
19         foreach ( var color in colors )
20            list1.AddLast( color );
21
22         // add elements to second linked list via constructor
23         LinkedList< string > list2 = new LinkedList< string >( colors2 );
24
25         Concatenate( list1, list2 ); // concatenate list2 onto list1
26         PrintList( list1 ); // display list1 elements
27
28         Console.WriteLine( "
Converting strings in list1 to uppercase
" );
29         ToUppercaseStrings( list1 ); // convert to uppercase string
30         PrintList( list1 ); // display list1 elements
31
32         Console.WriteLine( "
Deleting strings between BLACK and BROWN
" );
33         RemoveItemsBetween( list1, "BLACK", "BROWN" );
34
35         PrintList( list1 ); // display list1 elements
36         PrintReversedList( list1 ); // display list in reverse order
37      } // end Main
38
39      // display list contents
40      private static void PrintList< T >( LinkedList< T > list )
41      {
42         Console.WriteLine( "Linked list: " );
43
44         foreach ( T value in list )       
45            Console.Write( "{0} ", value );
46
47         Console.WriteLine();
48      } // end method PrintList
49
50      // concatenate the second list on the end of the first list
51      private static void Concatenate< T >( LinkedList< T > list1,
52         LinkedList< T > list2 )
53      {
54         // concatenate lists by copying element values    
55         // in order from the second list to the first list
56         foreach ( T value in list2 )                      
57            list1.AddLast( value ); // add new node        
58      } // end method Concatenate
59
60      // locate string objects and convert to uppercase
61      private static void ToUppercaseStrings( LinkedList< string > list )
62      {                                                                  
63         // iterate over the list by using the nodes                     
64         LinkedListNode< string > currentNode = list.First;              
65                                                                         
66         while ( currentNode != null )                                   
67         {                                                               
68            string color = currentNode.Value; // get value in node       
69            currentNode.Value = color.ToUpper(); // convert to uppercase 
70                                                                         
71            currentNode = currentNode.Next; // get next node             
72         } // end while                                                  
73      } // end method ToUppercaseStrings                                 
74
75      // delete list items between two given items
76      private static void RemoveItemsBetween< T >( LinkedList< T > list,
77         T startItem, T endItem )
78      {
79         // get the nodes corresponding to the start and end item
80         LinkedListNode< T > currentNode = list.Find( startItem );
81         LinkedListNode< T > endNode = list.Find( endItem );      
82
83         // remove items after the start item
84         // until we find the last item or the end of the linked list
85         while ( ( currentNode.Next != null ) &&
86            ( currentNode.Next != endNode ) )
87         {
88            list.Remove( currentNode.Next ); // remove next node
89         } // end while
90      } // end method RemoveItemsBetween
91
92      // display reversed list
93      private static void PrintReversedList< T >( LinkedList< T > list )
94      {
95         Console.WriteLine( "Reversed List:" );
96
97         // iterate over the list by using the nodes
98         LinkedListNode< T > currentNode = list.Last;               
99                                                                    
100        while ( currentNode != null )                              
101        {                                                          
102           Console.Write( "{0} ", currentNode.Value );             
103           currentNode = currentNode.Previous; // get previous node
104        } // end while                                             
105
106        Console.WriteLine();
107     } // end method PrintReversedList
108  } // end class LinkedListTest
Linked list:
black yellow green blue violet silver gold white brown blue gray

Converting strings in list1 to uppercase

Linked list:
BLACK YELLOW GREEN BLUE VIOLET SILVER GOLD WHITE BROWN BLUE GRAY

Deleting strings between BLACK and BROWN

Linked list:
BLACK BROWN BLUE GRAY
Reversed List:
GRAY BLUE BROWN BLACK

The using directive in line 4 allows us to use the LinkedList class by its unqualified name. Lines 16–23 create LinkedLists list1 and list2 of strings and fill them with the contents of arrays colors and colors2, respectively. LinkedList is a generic class that has one type parameter for which we specify the type argument string in this example (lines 16 and 23). We demonstrate two ways to fill the lists. In lines 19–20, we use the foreach statement and method AddLast to fill list1. The AddLast method creates a new LinkedListNode (with the given value available via the Value property) and appends this node to the end of the list. There’s also an AddFirst method that inserts a node at the beginning of the list. Line 23 invokes the constructor that takes an IEnumerable<string> parameter. All arrays implicitly inherit from the generic interfaces IList and IEnumerable with the type of the array as the type argument, so the string array colors2 implements IEnumerable<string>. The type parameter of this generic IEnumerable matches the type parameter of the generic LinkedList object. This constructor call copies the contents of the array colors2 to list2.

Line 25 calls generic method Concatenate (lines 51–58) to append all elements of list2 to the end of list1. Line 26 calls method PrintList (lines 40–48) to output list1’s contents. Line 29 calls method ToUppercaseStrings (lines 61–73) to convert each string element to uppercase, then line 30 calls PrintList again to display the modified strings. Line 33 calls method RemoveItemsBetween (lines 76–90) to remove the elements between "BLACK" and "BROWN"—not including either. Line 35 outputs the list again, then line 36 invokes method PrintReversedList (lines 93–107) to display the list in reverse order.

Generic method Concatenate (lines 51–58) iterates over list2 with a foreach statement and calls method AddLast to append each value to the end of list1. The LinkedList class’s enumerator loops over the values of the nodes, not the nodes themselves, so the iteration variable has type T. Notice that this creates a new node in list1 for each node in list2. One LinkedListNode cannot be a member of more than one LinkedList. If you want the same data to belong to more than one LinkedList, you must make a copy of the node for each list to avoid InvalidOperationExceptions.

Generic method PrintList (lines 40–48) similarly uses a foreach statement to iterate over the values in a LinkedList, and outputs them. Method ToUppercaseStrings (lines 61–73) takes a linked list of strings and converts each string value to uppercase. This method replaces the strings stored in the list, so we cannot use an enumerator (via a foreach statement) as in the previous two methods. Instead, we obtain the first LinkedListNode via the First property (line 64), and use a while statement to loop through the list (lines 66–72). Each iteration of the while statement obtains and updates the contents of currentNode via property Value, using string method ToUpper to create an uppercase version of string color. At the end of each iteration, we move the current node to the next node in the list by assigning currentNode to the node obtained by its own Next property (line 71). The Next property of the last node of the list gets null, so when the while statement iterates past the end of the list, the loop exits.

Notice that it does not make sense to declare ToUppercaseStrings as a generic method, because it uses the string-specific methods of the values in the nodes. Methods PrintList (lines 40–48) and Concatenate (lines 51–58) do not need to use any string-specific methods, so they can be declared with generic type parameters to promote maximal code reuse.

Generic method RemoveItemsBetween (lines 76–90) removes a range of items between two nodes. Lines 80–81 obtain the two “boundary” nodes of the range by using method Find. This method performs a linear search on the list and returns the first node that contains a value equal to the passed argument. Method Find returns null if the value is not found. We store the node preceding the range in local variable currentNode and the node following the range in endNode.

The while statement in lines 85–89 removes all the elements between currentNode and endNode. On each iteration of the loop, we remove the node following currentNode by invoking method Remove (line 88). Method Remove takes a LinkedListNode, splices that node out of the LinkedList, and fixes the references of the surrounding nodes. After the Remove call, currentNode’s Next property now gets the node following the node just removed, and that node’s Previous property now gets currentNode. The while statement continues to loop until there are no nodes left between currentNode and endNode, or until currentNode is the last node in the list. (There’s also an overloaded version of method Remove that performs a linear search for the specified value and removes the first node in the list that contains it.)

Method PrintReversedList (lines 93–107) displays the list backward by navigating the nodes manually. Line 98 obtains the last element of the list via the Last property and stores it in currentNode. The while statement in lines 100–104 iterates through the list backward by moving the currentNode reference to the previous node at the end of each iteration, then exiting when we move past the beginning of the list. Note how similar this code is to lines 64–72, which iterated through the list from the beginning to the end.

Covariance and Contravariance for Generic Types

A new feature in Visual C# 2010 is covariance and contravariance of generic interface and delegate types. To understand these concepts, we’ll consider them in the context of arrays, which have always been covariant and contravariant in C#.

Covariance in Arrays

Recall our Employee class hierarchy from Section 12.5, which consisted of the base class Employee and the derived classes SalariedEmployee, CommissionEmployee and BasePlusCommissionEmployee. Assuming the declarations

SalariedEmployee[] salariedEmployees = {
   new SalariedEmployee( "Bob", "Blue", "111-11-1111", 800M ),
   new SalariedEmployee( "Rachel", "Red", "222-22-2222", 1234M ) };
Employee[] employees;

we can write the following statement:

employees = salariedEmployees;

Even though the array type SalariedEmployee[] does not derive from the array type Employee[], the preceding assignment is allowed because class SalariedEmployee is a derived class of Employee.

Similarly, suppose we have the following method, which displays the string representation of each Employee in its employees array parameter:

void PrintEmployees( Employee[] employees )

We can call this method with the array of SalariedEmployees, as in:

PrintEmployees( salariedEmployees );

and the method will correctly display the string representation of each SalariedEmployee object in the argument array. Assigning an array of a derived-class type to an array variable of a base-class type is an example of covariance.

Covariance in Generic Types

Covariance now also works with several generic interface and delegate types, including IEnumerable<T>. Arrays and generic collections implement the IEnumerable<T> interface. Using the salariedEmployees array declared previously, consider the following statement:

IEnumerable< Employee > employees = salariedEmployees;

Prior to Visual C# 2010, this generated a compilation error. Interface IEnumerable<T> is now covariant, so the preceding statement is allowed. If we modify method PrintEmployees as in:

void PrintEmployees( IEnumerable< Employee > employees )

we can call PrintEmployees with the array of SalariedEmployee objects, because that array implements the interface IEnumerable<SalariedEmployee> and because a SalariedEmployee is an Employee and because IEnumerable<T> is covariant. Covariance like this works only with reference types that are related by a class hierarchy.

Contravariance in Arrays

Previously, we showed that an array of a derived-class type (salariedEmployees) can be assigned to an array variable of a base-class type (employees). Now, consider the following statement, which has always worked in C#:

SalariedEmployee[] salariedEmployees2 =
   ( SalariedEmployee[] ) employees;

Based on the previous statements, we know that the Employee array variable employees currently refers to an array of SalariedEmployees. Using a cast operator to assign employees—an array of base-class-type elements—to salariedEmployees2—an array of derived-class-type elements—is an example of contravariance. The preceding cast will fail at runtime if employees is not an array of SalariedEmployees.

Contravariance in Generic Types

To understand contravariance in generic types, consider a SortedSet of SalariedEmployees. Class SortedSet<T> maintains a set of objects in sorted order—no duplicates are allowed. The objects placed in a SortedSet must implement the IComparable<T> interface. For classes that do not implement this interface, you can still compare their objects using an object that implements the IComparer<T> interface. This interface’s Compare method compares its two arguments and returns 0 if they’re equal, a negative integer if the first object is less than the second, or a positive integer if the first object is greater than the second.

Our Employee hierarchy classes do not implement IComparable<T>. Let’s assume we wish to sort Employees by social security number. We can implement the following class to compare any two Employees:

class EmployeeComparer : IComparer< Employee >
{
   int IComparer< Employee >.Compare( Employee a, Employee b)
   {
      return a.SocialSecurityNumber.CompareTo(
         b.SocialSecurityNumber );
   } // end method Compare
} // end class EmployeeComparer

Method Compare returns the result of comparing the two Employees social security numbers using string method CompareTo.

Now consider the following statement, which creates a SortedSet:

SortedSet< SalariedEmployee > set =
   new SortedSet< SalariedEmployee >( new EmployeeComparer() );

When the type argument does not implement IComparable<T>, you must supply an appropriate IComparer<T> object to compare the objects that will be placed in the SortedSet. Since, we’re creating a SortedSet of SalariedEmployees, the compiler expects the IComparer<T> object to implement the IComparer<SalariedEmployee>. Instead, we provided an object that implements IComparer<Employee>. The compiler allows us to provide an IComparer for a base-class type where an IComparer for a derived-class type is expected because interface IComparer<T> supports contravariance.

Web Resources

For a list of covariant and contravariant interface types in .NET 4, visit

msdn.microsoft.com/en-us/library/dd799517.aspx#VariantList

It’s also possible to create your own variant types. For information on this, visit

msdn.microsoft.com/en-us/library/dd997386.aspx

Wrap-Up

This chapter introduced the .NET Framework collection classes. You learned about the hierarchy of interfaces that many of the collection classes implement. You saw how to use class Array to perform array manipulations. You learned that the System.Collections and System.Collections.Generic namespaces contain many nongeneric and generic collection classes, respectively. We presented the nongeneric classes ArrayList, Stack and Hashtable as well as generic classes SortedDictionary and LinkedList. In doing so, we discussed data structures in greater depth. We discussed dynamically expanding collections, hashing schemes, and two implementations of a dictionary. You saw the advantages of generic collections over their nongeneric counterparts.

You also learned how to use enumerators to traverse these data structures and obtain their contents. We demonstrated the foreach statement with many of the classes of the Framework Class Library, and explained that this works by using enumerators “behind-the-scenes” to traverse the collections.

Summary

Section 23.1 Introduction

  • The prepackaged data-structure classes provided by the .NET Framework are known as collection classes—they store collections of data.

  • With collection classes, instead of creating data structures to store these sets of items, the programmer simply uses existing data structures, without concern for how they’re implemented.

Section 23.2 Collections Overview

  • The .NET Framework collections provide high-performance, high-quality implementations of common data structures and enable effective software reuse.

  • In earlier versions of C#, the .NET Framework primarily provided the collection classes in the System.Collections namespace to store and manipulate object references.

  • The .NET Framework’s System.Collections.Generic namespace contains collection classes that take advantage of .NET’s generics capabilities.

Section 23.3 Class Array and Enumerators

  • All arrays implicitly inherit from abstract base class Array (namespace System).

  • The static Array method Sort sorts an array.

  • The static Array method Copy copies elements from one array to another.

  • The static Array method BinarySearch performs binary searches on an array. This method assumes that it receives a sorted array.

  • A collection’s GetEnumerator method returns an enumerator that can iterate over the collection.

  • All enumerators have methods MoveNext and Reset and property Current.

  • MoveNext moves the enumerator to the next element in the collection. MoveNext returns true if there’s at least one more element in the collection; otherwise, the method returns false.

  • Read-only property Current returns the object at the current location in the collection.

  • If a collection is modified after an enumerator is created for that collection, the enumerator immediately becomes invalid.

  • The foreach statement implicitly obtains an enumerator via the GetEnumerator method and uses the enumerator’s MoveNext method and Current property to traverse the collection. This can be done with any collection that implements the IEnumerable interface—not just arrays.

Section 23.4.1 Class ArrayList

  • In most programming languages, conventional arrays have a fixed size.

  • The .NET Framework’s ArrayList collection class enhances the functionality of conventional arrays and provides dynamic resizing of the collection.

  • ArrayLists store references to objects.

  • ArrayList has a constructor that can initialize the contents of an ArrayList with the elements of any ICollection passed to it. Many of the collection classes have such a constructor.

  • The Count and Capacity properties correspond, respectively, to the current number of elements in the ArrayList and the maximum number of elements that can be stored without allocating more memory to the ArrayList.

  • Method IndexOf returns the position of a value in an ArrayList, or -1 if the element isn’t found.

  • We can access an element of an ArrayList by following the ArrayList variable name with square brackets ([]) containing the desired index of the element.

  • The Remove method removes the first occurrence of the specified object. All subsequent elements shift toward the beginning of the ArrayList to fill the emptied position.

Section 23.4.2 Class Stack

  • Class Stack has methods Push and Pop to perform the basic stack operations.

  • The non-generic Stack class can store only references to objects, so value-type items are implicitly boxed before they’re added to the Stack.

  • Method Peek returns the value of the top stack element but does not remove the element.

  • Attempting to Peek or Pop an empty Stack causes an InvalidOperationException.

Section 23.4.3 Class Hashtable

  • Hashing is a high-speed scheme for converting keys to unique array indices. The .NET Framework provides class Hashtable to enable programmers to employ hashing.

  • Class Hashtable can accept any object as a key.

  • Method ContainsKey determines whether a key is in the hash table.

  • Hashtable method Add creates a new entry in the hash table, with the first argument as the key and the second as the value.

  • We can use the Hashtable’s indexer to obtain and set the key’s associated value in the hash table.

  • Hashtable property Keys gets an ICollection that contains all the keys.

  • If you use a foreach statement with a Hashtable, the iteration variable is of type DictionaryEntry, which has properties Key and Value for retrieving the key and value of the current element.

Section 23.5.1 Generic Class SortedDictionary

  • A dictionary is a collection of key/value pairs. A hash table is one way to implement a dictionary.

  • Generic class SortedDictionary does not use a hash table, but instead stores its key/value pairs in a binary search tree.

  • Generic class SortedDictionary takes two type arguments—the first specifies the type of key and the second the type of value.

  • When the compiler knows the type that the data structure contains, there’s no need to downcast when we need to use the type-specific methods.

  • Invoking the get accessor of a SortedDictionary indexer with a key that does not exist in the collection causes a KeyNotFoundException. This behavior is different from that of the Hashtable indexer’s get accessor, which would return null.

Section 23.5.2 Generic Class LinkedList

  • The LinkedList class is a doubly linked list—we can navigate the list both backward and forward with nodes of generic class LinkedListNode.

  • Each node contains property Value and read-only properties Previous and Next.

  • The LinkedList class’s enumerator loops over the values of the nodes, not the nodes themselves.

  • One LinkedListNode cannot be a member of more than one LinkedList. Any attempt to add a node from one LinkedList to another generates an InvalidOperationException.

  • Method Find performs a linear search on the list and returns the first node that contains a value equal to the passed argument.

  • Method Remove deletes a node from a LinkedList.

Section 23.6 Covariance and Contravariance for Generic Types

  • Visual C# 2010 now supports covariance and contravariance of generic interface and delegate types.

  • Assigning an array of a derived-class type to an array variable of a base-class type is an example of covariance.

  • Covariance now works with several generic interface types, including IEnumerable<T>.

  • Covariance in generic collections works only with reference types in the same class hierarchy.

  • Using a cast operator to assign an array variable of a base-class type to an array variable of a derived-class type is an example of contravariance.

  • Class SortedSet maintains a set of objects in sorted order—no duplicates are allowed.

  • The objects placed in a SortedSet must be comparable to determine their sorting order. Objects are comparable if their classes implement the IComparable<T> interface.

  • For classes that do not implement IComparable<T>, you can compare the objects using an object that implements the IComparer<T> interface. This interface’s Compare method compares its two arguments and returns 0 if they are equal, a negative integer if the first object is less than the second, or a positive integer if the first object is greater than the second.

  • Providing an IComparer for a base-class type where an IComparer for a derived-class type is expected is allowed because interface IComparer<T> supports contravariance.

Terminology

Self-Review Exercises

23.1

Fill in the blanks in each of the following statements:

  1. A(n) __________ is used to walk through a collection but cannot remove elements from the collection during the iteration.

  2. Class __________ provides the capabilities of an arraylike data structure that can resize itself dynamically.

  3. An element in an ArrayList can be accessed by using the ArrayList’s __________.

  4. IEnumerator method __________ advances the enumerator to the next item.

  5. If the collection it references has been altered since the enumerator’s creation, calling method Reset will cause a(n) __________.

23.1

  1. enumerator (or foreach statement).

  2. ArrayList.

  3. indexer.

  4. MoveNext.

  5. InvalidOperationException.

23.2

State whether each of the following is true or false. If false, explain why.

  1. Class Stack is in the System.Collections namespace.

  2. A Hashtable stores key/value pairs.

  3. A class implementing interface IEnumerator must define only methods MoveNext and Reset, and no properties.

  4. Values of simple types may be stored directly in an ArrayList.

  5. An ArrayList can contain duplicate values.

  6. A Hashtable can contain duplicate keys.

  7. A LinkedList can contain duplicate values.

  8. Dictionary is an interface.

  9. Enumerators can change the values of elements but cannot remove them.

  10. With hashing, as the load factor increases, the chance of collisions decreases.

23.2

  1. True.

  2. True.

  3. False. The class must also implement property Current.

  4. False. An ArrayList stores only objects. Autoboxing occurs when adding a value type to the ArrayList. You can prevent boxing by instead using generic class List with a value type.

  5. True.

  6. False. A Hashtable cannot contain duplicate keys.

  7. True.

  8. False. Dictionary is a class; IDictionary is an interface.

  9. False. An enumerator cannot be used to change the values of elements.

  10. False. With hashing, as the load factor increases, there are fewer available slots relative to the total number of slots, so the chance of selecting an occupied slot (a collision) with a hashing operation increases.

Answers to Self-Review Exercises

Exercises

23.3

(Collections Terminology) Define each of the following terms:

  1. ICollection

  2. Array

  3. IList

  4. load factor

  5. Hashtable collision

  6. space/time trade-off in hashing

  7. Hashtable

23.4

(ArrayList Methods) Explain briefly the operation of each of the following methods of class ArrayList:

  1. Add

  2. Insert

  3. Remove

  4. Clear

  5. RemoveAt

  6. Contains

  7. IndexOf

  8. Count

  9. Capacity

23.5

(ArrayList Insertion Performance) Explain why inserting additional elements into an ArrayList object whose current size is less than its capacity is a relatively fast operation and why inserting additional elements into an ArrayList object whose current size is at capacity is relatively slow.

23.6

(Inheritance Negatives) In our implementation of a stack in Fig. 21.13, we were able to quickly extend a linked list to create class StackInheritance. The .NET Framework designers chose not to use inheritance to create their Stack class. What are the negative aspects of inheritance, particularly for class Stack?

23.7

(Collections Short Questions) Briefly answer the following questions:

  1. What happens when you add a simple type value to a nongeneric collection?

  2. Can you display all the elements in an IEnumerable object without explicitly using an enumerator? If yes, how?

23.8

(Enumerator Members) Explain briefly the operation of each of the following enumerator-related methods:

  1. GetEnumerator

  2. Current

  3. MoveNext

23.9

(HashTable Methods an Properties) Explain briefly the operation of each of the following methods and properties of class Hashtable:

  1. Add

  2. Keys

  3. Values

  4. ContainsKey

23.10

(True/False) Determine whether each of the following statements is true or false. If false, explain why.

  1. Elements in an array must be sorted in ascending order before a BinarySearch may be performed.

  2. Method First gets the first node in a LinkedList.

  3. Class Array provides static method Sort for sorting array elements.

23.11

(LinkedList without Duplicates) Write an application that reads in a series of first names and stores them in a LinkedList. Do not store duplicate names. Allow the user to search for a first name.

23.12

(Generic SortedDictionary) Modify the application in Fig. 23.8 to count the number of occurrences of each letter rather than of each word. For example, the string "HELLO THERE" contains two Hs, three Es, two Ls, one O, one T and one R. Display the results.

23.13

(SortedDictionary of Colors) Use a SortedDictionary to create a reusable class for choosing from some of the predefined colors in class Color (in the System.Drawing namespace). The names of the colors should be used as keys, and the predefined Color objects should be used as values. Place this class in a class library that can be referenced from any C# application. Use your new class in a Windows application that allows the user to select a color, then changes the background color of the Form.

23.14

(Duplicate Words in a Sentence) Write an application that determines and displays the number of duplicate words in a sentence. Treat uppercase and lowercase letters the same. Ignore punctuation.

23.15

(Using a Generic List) Recall from Fig. 23.2 that class List is the generic equivalent of class ArrayList. Write an application that inserts 25 random integers from 0 to 100 in order into an object of class List. The application should calculate the sum of the elements and the floating-point average of the elements.

23.16

(Reversing a LinkedList) Write an application that creates a LinkedList object of 10 characters, then creates a second list object containing a copy of the first list, but in reverse order.

23.17

(Prime Numbers and Prime Factorization) Write an application that takes a whole-number input from a user and determines whether it’s prime. If the number is not prime, display the unique prime factors of the number. Remember that a prime number’s factors are only 1 and the prime number itself. Every number that’s not prime has a unique prime factorization. For example, consider the number 54. The prime factors of 54 are 2, 3, 3 and 3. When the values are multiplied together, the result is 54. For the number 54, the prime factors output should be 2 and 3.

23.18

(Bucket Sort with LinkedList<int>) In Exercise 20.7, you performed a bucket sort of ints by using a two-dimensional array, where each row of the array represented a bucket. If you use a dynamically expanding data structure to represent each bucket, you do not have to write code that keeps track of the number of ints in each bucket. Rewrite your solution to use a one-dimensional array of LinkedList< int > buckets.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.143.205.27