Chapter 28. Collection Classes

Visual Basic .NET includes a large assortment of pre-built classes that store and manage groups of objects. These collection classes provide a wide variety of different features, so the right class for a particular purpose depends on your application.

For example, an array is good for storing objects in a particular fixed order. An ArrayList enables you to add, remove, and rearrange its objects much more easily than an array does. A Queue lets a program easily add items and remove them in first in, first out order. In contrast, a Stack lets the program remove items in last in, first out order.

This chapter describes these different kinds of collection classes and provides tips for selecting the right one for various purposes.

WHAT IS A COLLECTION?

The word collection means a group of objects that should be kept together. For example, a coin collection is a group of coins that you keep together because they are rare, valuable, or otherwise interesting.

Unfortunately, the idea of a collection is such a useful concept that Visual Basic adopted the word and made a specific class named Collection. The Collection class does keep a group of objects together, but it reserves for its own use the perfect word for other similar kinds of groups of objects.

That leads to some semantic ambiguity when you talk about collection classes. Do you mean the Collection class? Or do you mean some other class that groups objects? Even the Visual Basic documentation has this problem and sometimes uses collection classes to mean classes that group things together.

This chapter describes the Collection class as well as other collection classes.

One of the most basic Visual Basic entities that groups objects is an array. An array stores data values or references to objects in a simple block of memory with one entry directly following another. The Array class provides some special methods for manipulating arrays (such as reversing, sorting, or searching an array).

The Collection class provides a few specific features for working with its group of objects. It enables you to add an item to the Collection, optionally specifying a key for the item. You can then search for the item or remove the item using its key or its index in the Collection.

One of the most useful features of the Collection class is that it supports enumerators and For Each loops. That lets you easily loop over the objects in a Collection without worrying about the number of objects it contains.

Other classes derived from the Collection class provide additional features. For example, the Hashtable class can store a large number of objects with associated keys very efficiently. Dictionary is essentially a strongly-typed generic Hashtable so, although the Hashtable uses the Object data type for its key/value pairs, the Dictionary uses more specific data types that you specify such as Strings or Employees. The section "Dictionaries" later in this chapter describes the Dictionary class in more detail.

The Queue class makes it easy to work with objects on a first in, first out (FIFO) basis, whereas the Stack class helps you work with objects in a last in, first out order (LIFO).

The remainder of this chapter describes these classes in detail.

ARRAYS

Visual Bassic .NET provides two basic kinds of arrays. First, it provides the normal arrays that you get when you declare a variable by using parentheses. For example, the following code declares an array of Integers named "squares." The array contains 11 items with indexes ranging from 0 to 10. The code loops over the items, setting each one's value. Next, it loops over the values again, adding them to a string. When it has finished building the string, the program displays the result.

Private Sub ShowSquaresNormalArray()
    Dim squares(10) As Integer

    For i As Integer = 0 To 10
        squares(i) = i * i
    Next i

    Dim txt As String = ""
    For i As Integer = 0 To 10
        txt &= squares(i).ToString & vbCrLf
    Next i
    MessageBox.Show(txt)
End Sub
                                                  
ARRAYS

The Visual Basic Array class provides another kind of array. This kind of array is actually an object that provides methods for managing the items stored in the array.

The following code shows the previous version of the code rewritten to use an Array object. This version creates the array by using the Array class's shared CreateInstance method, passing it the data type that the array should contain and the number of items that it should hold. The code then loops over the items using the array's SetValue method to set the items' values. If you have Option Strict turned off, the code can set the items' values exactly as before by using the statement squares(i) = i * i. If Option Strict is on, you need to use SetValue. Next, the program loops over the items again, using the array's GetValue method to add the item values to a string. If Option Strict is off, you can use the same syntax as before: txt &= squares(i).ToString & vbCrLf. If Option Strict is on, you need to use the array's GetValue method. After building the string, the program displays it in a message box as before.

Private Sub ShowSquaresArrayObject()
    Dim squares As Array =
        Array.CreateInstance(GetType(Integer), 11)

    For i As Integer = 0 To 10
        squares.SetValue(i * i, i)
    Next i

    Dim txt As String = ""
    For i As Integer = 0 To 10
        txt &= squares.GetValue(i).ToString & vbCrLf
    Next i
    MessageBox.Show(txt)
End Sub
                                                  
INITIALIZING ARRAYS

Example program ShowSquares uses similar code to build a list of squares by using a normal array and by using an Array object.

The following sections describe the similarities and differences between normal arrays and Array objects.

Array Dimensions

Both normal variable arrays and Array objects can support multiple dimensions. The following statement declares a three-dimensional array with 11 items in the first dimension, 11 in the second, and 21 in the third. It then sets the value for the item in position (1, 2, 3).

Dim values(10, 10, 20) As Integer
values(1, 2, 3) = 100

The following code does the same thing with an Array object:

Dim values As Array =
    Array.CreateInstance(GetType(Integer), 11, 21, 31)
values.SetValue(100, 1, 2, 3)

If Option Strict is off, the code can use the same syntax for getting and setting the Array item's value. The following code sets the (1, 2, 3) item's value to 100 and then displays its value (100):

Option Strict Off
. . .
values(1, 2, 3) = 100
Debug.WriteLine(values(1, 2, 3))

Lower Bounds

A normal array of variables always has lower bound 0 in every dimension. The following code declares an array with indexes ranging from 0 to 10:

Dim values(10) As Integer

You can fake a variable array that has nonzero lower bounds, but it requires extra work on your part. You must add or subtract an appropriate amount from each index to map the indexes you want to use to the underlying zero-based indexes.

Array objects can handle nonzero lower bounds for you. The following code creates a two-dimensional array with indexes ranging from 1 to 10 in the first dimension, and 101 to 120 in the second dimension:

Dim dimension_lengths() As Integer = {10, 20}
Dim lower_bounds() As Integer = {1, 101}
Dim values As Array =
    Array.CreateInstance(GetType(Integer), dimension_lengths, lower_bounds)

The code first defines an array containing the number of elements it wants for each dimension (10 in the first dimension and 20 in the second dimension). It then creates an array containing the lower bounds it wants to use for each dimension (the first dimension starts with index 1 and the second dimension starts with index 101).

The code then calls the Array class's shared CreateInstance method, passing it the data type of the array's objects, the array of dimension lengths, and the array of lower bounds. The CreateInstance method uses the arrays of lower bounds and dimensions to create an Array object with the appropriate bounds.

The following code sets the values in this array. The last two parameters in the call to SetValue give the items' positions in the Array.

For i As Integer = 1 To 10
    For j As Integer = 101 To 120
        values.SetValue(i * 100 + j, i, j)
    Next j
Next i

If Option Strict is off, the program can use the following simpler syntax:

For i As Integer = 1 To 10
    For j As Integer = 101 To 120
        values(i, j) = i * 100 + j
    Next j
Next i

Resizing

You can use the ReDim statement to change a normal array's dimensions. Add the Preserve keyword if you want the array to keep its existing values, as shown here:

Dim values(100) As Integer
. . .
ReDim Preserve values(200)

An Array object cannot resize itself, but it is relatively easy to copy an Array object's items to another Array object. The following code creates a values array containing 101 items with indexes ranging from 0 to 100. Later, it creates a new Array object containing 201 items and uses the values array's CopyTo method to copy its values into the new array. The second parameter to CopyTo gives the index in the destination array where the copy should start placing values.

Dim values As Array =
    Array.CreateInstance(GetType(Integer), 101)
. . .
Dim new_array As Array =
    Array.CreateInstance(GetType(Integer), 201)
values.CopyTo(new_array, 0)
values = new_array

The Array class's shared Copy method allows you greater control. It lets you specify the index in the source array where the copy should start, the index in the destination array where the items should be copied, and the number of items to be copied.

Although building a new Array object and copying items into it is more cumbersome than using ReDim to resize a variable array, the process is surprisingly fast.

Speed

There's no doubt that arrays of variables are much faster than Array objects. In one test, setting and getting values in an Array object took more than 100 times as long as performing the same operations in a variable array.

If your application performs only a few hundred or a thousand array operations, the difference is unimportant. If your application must access array values many millions of times, you may need to consider using an array of variables even if the Array class would be more convenient for other reasons (such as nonzero lower bounds).

Microsoft has also optimized one-dimensional variable arrays, so they are faster than multidimensional arrays. The difference is much less dramatic than the difference between variable arrays and Array classes, however.

Example program ArraySpeeds, which is available for download on the book's web site, compares the speeds of variable arrays and Array objects. Enter the number of items that you want to use in the arrays and click Go. The program builds one- and two-dimensional arrays and Array objects holding integers. It then fills the arrays and displays the elapsed time.

Figure 28-1 shows the results. Variable arrays are much faster than array classes. One-dimensional variable arrays generally seem to be slightly faster than two-dimensional arrays.

Variable arrays are faster than array classes.

Figure 28.1. Variable arrays are faster than array classes.

Other Array Class Features

The Array class provides several other useful shared methods. For example, the IndexOf and LastIndexOf methods return the position of a particular item in an Array.

Methods such as IndexOf and LastIndexOf would be a strong argument supporting Array objects over normal arrays of variables if it weren't for one somewhat surprising fact: Those same methods work with regular arrays of variables, too! The following code fills an array of integers and then uses Array methods to display the indexes of the first item with value 6 and the last item with value 3:

Dim values(10) As Integer
For i As Integer = 0 To 10
    values(i) = i
Next i

MessageBox.Show(Array.IndexOf(values, 6).ToString)
MessageBox.Show(Array.LastIndexOf(values, 3).ToString)

The following sections describe some of the Array class's other useful shared methods. All of these work both for arrays of variables and Array objects.

Example program ArrayTests, which is available for download on the book's web site, demonstrates the Array class's IndexOf, LastIndexOf, Reverse, and BinarySearch methods. It also demonstrates the Sort method for arrays containing integers, objects that implement the IComparable interface, and objects that can be sorted with IComparer objects.

Array.Reverse

The Array.Reverse method reverses the order of the items in an array. There's nothing particularly confusing about this method. It can easily reverse its items even if the items are not things that you can reasonably compare. For example, it can reverse an array of integers, strings, StockOption objects, or TreeView controls.

Array.Sort

The Array.Sort method sorts the items in the array. To sort the items, this method must compare them to each other. That means the items must be things that can be reasonably compared, such as integers, strings, or dates. More precisely, the method can sort the items if they implement the IComparable interface, meaning they contain the means to compare themselves to each other.

The following code shows a Person class that implements the IComparable interface. The class defines two public strings, FirstName and LastName. For convenience, it also defines a constructor and a ToString function. The code then defines the CompareTo function that is required by the IComparable interface. This function should compare the value of the current object to the value of the object passed as a parameter. It should return −1 if the current object should come before the parameter, 0 if neither object must come before the other, and 1 if the parameter should come before the current object. The String.Compare function makes exactly that calculation for two strings, so the CompareTo function uses it to compare the two Person objects' names. You could use a more complicated CompareTo function to order just about anything.

Public Class Person
    Implements IComparable

    Public FirstName As String
    Public LastName As String

    Public Sub New(ByVal first_name As String, ByVal last_name As String)
        FirstName = first_name
        LastName = last_name
    End Sub

    Public Overrides Function ToString() As String
        Return LastName & ", " & FirstName
    End Function

    Public Function CompareTo(ByVal obj As Object) As Integer _
     Implements System.IComparable.CompareTo
        Dim other_Person As Person = DirectCast(obj, Person)
        Return String.Compare(Me.ToString, other_Person.ToString)
    End Function
End Class
                                                  
Array.Sort

If a program has an array of Person objects, the Array.Sort method will sort the objects by last name followed by first name.

You can sort objects that do not implement the IComparable interface if you pass the Sort method an object that can sort them. For example, suppose you define a Manager class that does not implement the IComparable interface. The following code shows a ManagerComparer class that implements the IComparer interface. Its Compare method compares two Manager objects and returns −1, 0, or 1 to indicate the one that should come first. The call to DirectCast converts the Compare method's parameters from Objects (specified by the IComparer interface) to Managers.

Public Class ManagerComparer
    Implements IComparer

    Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer _
     Implements System.Collections.IComparer.Compare
        Dim mgr1 As Manager = DirectCast(x, Manager)
        Dim mgr2 As Manager = DirectCast(y, Manager)

        Return String.Compare(mgr1.ToString, mgr2.ToString)
    End Function
End Class
                                                  
Array.Sort

The following code uses a ManagerComparer object to sort an array of Manager objects named dept_managers:

' Sort.
Array.Sort(dept_managers, New ManagerComparer)

' Display the results.
Dim txt As String = ""
For i As Integer = 0 To dept_managers.GetUpperBound(0)
    txt &= dept_managers(i).ToString() & vbCrLf
Next i
MessageBox.Show(txt)
                                                  
Array.Sort

Other overloaded versions of the Sort method let you sort two arrays (one containing keys and the other values) in tandem or sort only parts of the array.

Array.BinarySearch

If the array contains items that are sorted, and the items implement the IComparable interface, then the Array.BinarySearch method uses a binary search algorithm to locate a specific item within the array.

For example, suppose that the array contains Person objects as defined in the previous section. Then you could use the following code to find the object representing Rod Stephens in the people array. The code starts by using Array.Sort to sort the array. Next the program makes a new Person object that has the name Rod Stephens to represent the target value. The program calls the Array.BinarySearch method to find the index of the object with the target name.

' Sort the array.
Array.Sort(people)

' Find Rod Stephens.
Dim target_person As New Person("Rod", "Stephens")
Dim target_index As Integer = Array.BinarySearch(people, target_person)
MessageBox.Show(people(target_index).ToString)
                                                  
Array.BinarySearch

Binary search is extremely fast, so it works well if you need to find an item within a sorted array. If the items are not sorted, you should consider using a database or Hashtable object to store and find items. See the section "Hashtable" later in this chapter for more information.

COLLECTIONS

The Visual Basic collection classes basically hold items and don't provide a lot of extra functionality. Other classes described later in this chapter provide more features.

The following sections describe the simple collection classes in Visual Basic: ArrayList, StringCollection, and NameValueCollection. They also describe strongly typed collections that you can build to make code that uses these classes a bit easier to debug and maintain.

ArrayList

The ArrayList class is a resizable array. You can add and remove items from any position in the list and it resizes itself accordingly. The following table describes some of the class's more useful properties and methods.

PROPERTY/METHOD

PURPOSE

Add

Adds an item at the end of the list.

AddRange

Adds the items in an object implementing the ICollection interface to the end of the list.

BinarySearch

Returns the index of an item in the list. The items must implement the IComparable interface, or you must provide the Sort method with an IComparer object.

Capacity

Gets or sets the number of items that the list can hold. For example, if you know that you will need to add 1000 items to the list, you may get better performance by setting Capacity to 1000 before starting, rather than letting the object grow incrementally as you add the items.

Clear

Removes all of the items from the list. The Capacity property remains unchanged, so the ArrayList keeps any space it has previously allocated to improve performance.

Contains

Returns True if a specified item is in the list.

CopyTo

Copies some or the entire list into a one-dimensional Array object.

Count

Returns the number of items currently in the list. This is always less than or equal to Capacity.

GetRange

Returns an ArrayList containing the items in part of the list.

IndexOf

Returns the zero-based index of the first occurrence of a specified item in the list.

Insert

Adds an item at a particular position in the list.

InsertRange

Adds the items in an object implementing the ICollection interface to a particular position in the list.

Item

Returns the item at a particular position in the list.

LastIndexOf

Returns the zero-based index of the last occurrence of a specified item in the list.

Remove

Removes the first occurrence of a specified item from the list.

RemoveAt

Removes the item at the specified position in the list.

RemoveRange

Removes the items in the specified positions from the list.

Reverse

Reverses the order of the items in the list.

SetRange

Replaces the items in part of the list with new items taken from an ICollection object.

Sort

Sorts the items in the list. The items must implement the IComparable interface, or you must provide the Sort method with an IComparer object.

ToArray

Copies the list's items into a one-dimensional array. The array can be an array of objects, an array of a specific type, or an Array object (holding objects).

TrimToSize

Reduces the list's allocated space so that it is just big enough to hold its items. This sets Capacity = Count.

A single ArrayList object can hold objects of many different kinds. The following code creates an ArrayList and adds a string, Form object, integer, and Bitmap to it. It then loops through the items in the list and displays their types.

Dim array_list As New ArrayList
array_list.Add("What?")
array_list.Add(Me)
array_list.Add(1001)
array_list.Add(New Bitmap(10, 10))
For Each obj As Object In array_list
    Debug.WriteLine(obj.GetType.ToString)
Next obj
                                                  
ArrayList

The following text shows the results:

System.String
UseArrayList.Form1
System.Int32
System.Drawing.Bitmap

The value displayed for the second item depends on the name of the project (in this case, UseArrayList).

Example program UseArrayList demonstrates several ArrayList methods.

StringCollection

A StringCollection is similar to an ArrayList, except that it can hold only strings. Because it works only with strings, this class provides some extra type checking that the ArrayList does not. If your program tries to add an Employee object to a StringCollection, the collection raises an error.

To take advantage of this extra error checking, you should always use a StringCollection instead of an ArrayList if you are working with strings. Of course, if you need other features (such as the fast lookups provided by a Hashtable), you should use one of the classes described in the following sections.

Strongly Typed Collections

A strongly typed collection is a collection class built to work with a particular data type. An ArrayList can store objects of any data type. A StringCollection is strongly typed to work only with strings. That gives you extra error checking that makes finding and fixing programming mistakes easier. If the program tries to insert an IncidentReport object into a StringCollection, the collection immediately raises an error and the problem is relatively easy to find.

Similarly, you can define your own collection classes that are strongly typed. For example, you could make an OrderCollection class that holds Order items. If the program tries to add a Manager or Secretary object to it, the collection raises an error.

To build a strongly typed collection from scratch, create a new class that inherits from System.Collections.CollectionBase. Inheriting from this class automatically gives your class an ArrayList object named List. It also gives your class some inherited routines that do not depend on the type of object you want the collection to hold. For example, the RemoveAt method removes the object at a specific index in the list.

Your class can implement other methods for adding and retrieving items in the collection. For example, it can implement the Add, Remove, and Item methods.

Fortunately, you don't need to build these methods from scratch. You can simply delegate them to the inherited List object. For example, the Add method can simply call List.Add, as shown in the following code:

' Add an Employee.
Public Sub Add(ByVal value As Employee)
    List.Add(value)
End Sub

This code does nothing other than call the List object's methods. The only magic here is that the EmployeeCollection class's Add method takes a parameter of a particular type (Employee), whereas the List object's Add method takes a generic Object as a parameter. It is the EmployeeCollection class's insistence on Employee objects that makes the collection strongly typed.

The Add and Item methods are about the minimum useful feature set you can provide for a strongly typed collection class.

The following table lists the standard methods provided by a strongly typed collection class. The third column indicates whether the CollectionBase parent class automatically provides the method, or whether you must delegate the method to the List object.

METHOD

PURPOSE

PROVIDED BY

Add

Adds an item to the collection

List

Capacity

Returns the amount of space in the collection

CollectionBase

Clear

Removes all items from the collection

CollectionBase

Contains

Returns True if the collection contains a particular item

List

CopyTo

Copies items from the collection into an array

List

Count

Returns the number of items in the collection

CollectionBase

IndexOf

Returns the index of an item

List

InnerList

Returns an ArrayList holding the collection's objects

CollectionBase

Insert

Inserts an item at a specific position

List

Item

Returns the item at a specific position

List

List

Returns an IList holding the collection's objects

CollectionBase

Remove

Removes an item

List

RemoveAt

Removes the item at a specific position

CollectionBase

You can also add other more specialized methods if they would be useful in your application. For example, you could add methods for working with object field values rather than with the objects themselves. You might make an overloaded version of the Item method that takes as parameters a first and last name and returns the corresponding Employee object if it is in the list. You could also modify the simple Add method shown previously so that it doesn't allow duplicates. And you could make an Add function that takes first and last names as parameters, creates a new Employee object using those names, and returns the new object.

Example program EmployeeCollection, which is available for download on the book's web site, builds a strongly typed collection of Employee objects that inherits from the CollectionBase class.

An additional benefit that comes with inheriting from the CollectionBase class is For Each support. The following code shows how a program might use an EmployeeCollection class named emp_list. It creates the collection and adds some Employee objects to the list. It then uses a For Each loop to display the Employees.

Dim emp_list As New EmployeeCollection
emp_list.Add(New Employee("Ann", "Anderson"))
emp_list.Add(New Employee("Bart", "Baskerville"))
. . .

For Each emp As Employee In emp_list
    Debug.WriteLine(emp.ToString)
Next emp
                                                  
Strongly Typed Collections

Generics provide another method for building strongly typed collections. Refer to the section "Generics" later in this chapter for more information on generic collections. For more general information on generics, see Chapter 29, "Generics."

Read-Only Strongly Typed Collections

The CollectionBase class enables you to build a strongly typed collection class that allows a program to store and retrieve values. In some cases, you might want a function to return a collection of objects that the calling program cannot modify. For example, suppose that your function returns a list of your company's production locations. You don't want the program to modify the list because it cannot change the locations. In this case, you can build a read-only strongly typed collection.

You can do this much as you build a strongly typed collection. Instead of deriving the new collection class from CollectionBase, however, derive it from the ReadOnlyCollectionBase class. Provide read-only Item methods, but do not provide any Add or Remove methods. The class itself can access its inherited InnerList object to add and remove items, but it must not give the program using your class access to that object.

Your program still needs a way to get objects into the collection, however. One method is to build the collection class in a separate library project and give it initialization methods declared with the Friend keyword. Other code in the library project could use those methods while the main program could not.

Another technique is to pass initialization data to the class's constructor. Your code creates the collection and returns it to the main program. The main program cannot change the collection's contents. It can create an instance of the collection of its own, but it cannot modify the one you built.

NameValueCollection

The NameValueCollection class is a collection that can hold more than one string value for a particular key (name). For example, you might use employee names as keys. The string values associated with a particular key could include extension, job title, employee ID, and so forth. Of course, you could also store the same information by putting extension, job title, employee ID, and the other fields in an object or structure, and then storing the objects or structures in some sort of collection class such as an ArrayList. A NameValueCollection, however, is very useful if you don't know ahead of time how many strings will be associated with each key.

The following table describes some of the NameValueCollection's most useful properties and methods.

PROPERTY/METHOD

DESCRIPTION

Add

Adds a new name/value pair to the collection. If the collection already holds an entry for the name, it adds the new value to that name's values.

AllKeys

Returns a string array holding all of the key values.

Clear

Removes all names and values from the collection.

CopyTo

Copies items starting at a particular index into a one-dimensional Array object. This copies only the items (see the Item property), not the keys.

Count

Returns the number of key/value pairs in the collection.

Get

Gets the item for a particular index or name as a comma-separated list of values.

GetKey

Returns the key for a specific index.

GetValues

Returns a string array containing the values for a specific name or index.

HasKeys

Returns True if the collection contains any non-null keys.

Item

Gets or sets the item for a particular index or name as a comma-separated list of values.

Keys

Returns a collection containing the keys.

Remove

Removes a particular name and all of its values.

Set

Sets the item for a particular index or name as a comma-separated list of values.

Note that there is no easy way to remove a particular value from a name. For example, if a person's name is associated with extension, job title, and employee ID, it is not easy to remove only the job title.

The following statement shows one approach to removing a person's JobTitle entry, although you would need to modify it slightly if you didn't know whether the JobTitle entry was last in the list (so it might or might not be followed by a comma):

nvc.Item("RodStephens") = nvc.Item("RodStephens").Replace("JobTitle,", "")

Example program UseNameValueCollection, which is available for download on the book's web site, demonstrates NameValueCollection class features.

DICTIONARIES

A dictionary is a collection that associates keys with values. You look up a key, and the dictionary provides you with the corresponding value. This is similar to the way a NameValueCollection works, except that a dictionary's keys and values need not be strings, and a dictionary associates each key with a single value object.

Visual Studio provides several different kinds of dictionary classes that are optimized for different uses. Their differences come largely from the ways in which they store data internally. Though you don't need to understand the details of how the dictionaries work internally, you do need to know how they behave so that you can pick the best one for a particular purpose.

Because all of the dictionary classes provide the same service (associating keys with values), they have roughly the same properties and methods. The following table describes some of the most useful of these.

PROPERTY/METHOD

DESCRIPTION

Add

Adds a key/value pair to the dictionary.

Clear

Removes all key/value pairs from the dictionary.

Contains

Returns True if the dictionary contains a specific key.

CopyTo

Copies the dictionary's data starting at a particular position into a one-dimensional array of DictionaryEntry objects. The DictionaryEntry class has Key and Value properties.

Count

Returns the number of key/value pairs in the dictionary.

Item

Gets or sets the value associated with a key.

Keys

Returns a collection containing all of the dictionary's keys.

Remove

Removes the key/value pair with a specific key.

Values

Returns a collection containing all of the dictionary's values.

The following sections describe different Visual Studio dictionary classes in more detail.

ListDictionary

A ListDictionary stores its data in a linked list. In a linked list, each item is held in an object that contains its data plus a reference or link to the next item in the list. Figure 28-2 illustrates a linked list. This list contains the key/value pairs Appetizer/Salad, Entrée/Sandwich, Drink/Water, and Dessert/Cupcake. The link out of the Dessert/Cupcake item is set to Nothing, so the program can tell when it has reached the end of the list. A reference variable inside the ListDictionary class, labeled Top in Figure 28-2, points to the first item in the list.

Each item in a linked list keeps a reference to the next item in the list.

Figure 28.2. Each item in a linked list keeps a reference to the next item in the list.

The links in a linked list make adding and removing items relatively easy. The ListDictionary simply moves the links to point to new objects, add objects, remove objects, or insert objects between two others. For example, to add a new item at the top of the list, you create the new item, set its link to point to the item that is currently at the top, and then make the list's Top variable point to the new item. Other rearrangements are almost as easy. For more information on how linked lists work, see a book on algorithms and data structures such as Data Structures and Algorithms Using Visual Basic.NET (McMillan, Cambridge University Press, 2005).

Unfortunately, if the list grows long, finding items in it can take a long time. To find an item in the list, the program starts at the top and works its way down, following the links between items, until it finds the one it wants. If the list is short, that doesn't take very long. If the list holds 100,000 items, this means potentially a 100,000-item crawl from top to bottom. That means a ListDictionary object's performance degrades if it contains too many items.

If you only need to store a few hundred items in the dictionary and you don't need to access them frequently, a ListDictionary is fine. If you need to store 100,000 entries, or if you need to access the dictionary's entries a huge number of times, you may get better performance using a "heavier" object such as a Hashtable. A Hashtable has more overhead than a ListDictionary but is faster at accessing its entries.

Hashtable

A Hashtable looks a lot like a ListDictionary on the outside, but internally it stores its data in a very different way. Rather than using a linked list, this class uses a hash table to hold data.

A hash table is a data structure that allows extremely fast access to items using their keys. It works by mapping an object's key into a bucket by calculating the key's hash value.

For example, suppose that you want to make a dictionary associating Employee objects with their Social Security numbers. You could make an array holding 10 buckets and then map employees to bucket numbers using the last digit of their Social Security numbers. Employee 123-45-6789 goes in bucket 9, employee 111-22-3333 goes in bucket 3, and employee 865-29-8361 goes in bucket 1.

This kind of hash table is easy to enlarge. If you add a few dozen employees, you might use 100 buckets and map Social Security numbers using their last two digits. Then employee 123-45-6789 goes in bucket 89, employee 111-22-3333 goes in bucket 33, and employee 865-29-8361 goes in bucket 61.

More employees? Make more buckets! You can use 1000, 10,000, or more buckets if necessary.

To find an employee's data in the dictionary, you only need to calculate the hash value (last digits of Social Security number) and then look in that bucket. For a large dictionary, this is much faster than digging through a linked list. If you have 10,000 employees perfectly divided one to a bucket in a table of 10,000 buckets, you only need to look in one bucket to find the data you want. That's a lot faster than slogging through a 10,000-item linked list.

Of course, there is a catch. Actually, there are two catches.

First, if you have too many employees, you are eventually going to find two that map to the same bucket. Suppose that you have 10 buckets and two employees with Social Security numbers 732-45-7653 and 145-76-4583. These both map to bucket 3. This is called a key collision, and the hash table must use some sort of collision resolution policy to figure out what to do when two keys map to the same bucket. This policy is generally some simple method such as making a linked list in the bucket or letting the second item spill over into the following bucket. Whatever collision resolution scheme you use, it takes a little extra time to search through full buckets.

You can make the buckets less full if you use more of them. In this example, if you switched to 100 buckets, the two employees would map to buckets 53 and 83, so no collision occurs. That makes looking up these items faster.

The second catch is that you can make a hash table faster by adding more buckets, but using more buckets means using more space. When a hash table starts to fill up, collisions are common, so performance suffers. Add more space and performance improves, but there may be a lot of wasted space not occupied by any data. If the table grows too big, it may start to use up all of the system's memory and starve the application for memory. The application must then page memory to disk when it needs more room, a process that can make the program extremely slow.

Those are the advantages and disadvantages of the Hashtable class. A Hashtable gives better performance than a ListDictionary, but takes more space.

In addition to the usual dictionary properties and methods, the Hashtable has a few that help manage the internal hash table that it uses to store its items.

One overloaded version of the Hashtable class's constructor takes a parameter that tells how many items the table should initially be able to hold. If you know you are going to load 1000 items, you might make the table initially hold room for 1500. Then the program could add all the items without filling the table too much, so it would still give good performance. If you don't set an initial size, the hash table might start out too small and need to resize itself many times before it could hold 1000 items, and that will slow it down.

Another version of the constructor lets you specify the hash table's load factor. The load factor is a number between 0.1 and 1.0 that gives the largest ratio of elements to buckets that the Hashtable will allow before it enlarges its internal table. For example, suppose that the hash table's capacity is 100 and its load factor is 0.8. Then when it holds 80 elements, the Hashtable will enlarge its internal table.

For high-performance lookups, the Hashtable class is a great solution.

HybridDictionary

A HybridDictionary is a cross between a ListDictionary and a Hashtable. If the dictionary is small, the HybridDictionary stores its data in a ListDictionary. If the dictionary grows too large, HybridDictionary switches to a Hashtable.

If you know that you will only need a few items, you can use a ListDictionary. If you know you will need to use a very large number of items, you can use a Hashtable. If you are unsure whether you will have few or many items, you can hedge your bet with a HybridDictionary. It'll take a bit of extra time to switch from a list to a Hashtable if you add a lot of items, but you'll save time in the long run if the list does turn out to be enormous.

Strongly Typed Dictionaries

Just as you can make strongly typed collections, you can also make strongly typed dictionaries. The idea is exactly the same. You derive your strongly typed class from a class that supports basic dictionary functionality. In this case, that parent class is DictionaryBase, and it provides a Dictionary object that your class can use to implement its dictionary features.

Next, you implement dictionary methods such as Add, Item, Keys, and so forth. Your code requires specific data types and uses the parent class's Dictionary variable to do all the hard work through delegation.

The following table lists the standard methods provided by a strongly typed dictionary class. The third column indicates whether the DictionaryBase parent class automatically provides the method or whether you must delegate the method to the Dictionary object.

METHOD

PURPOSE

PROVIDED BY

Add

Adds a key/value pair to the dictionary

Dictionary

Clear

Removes all key/value pairs from the dictionary

DictionaryBase

Contains

Returns True if the dictionary contains a specific key

Dictionary

CopyTo

Copies elements from the dictionary into an array of DictionaryEntry objects

DictionaryBase

Count

Returns the number of key/value pairs in the dictionary

DictionaryBase

Dictionary

Returns a Dictionary holding the dictionary's key/value pairs

DictionaryBase

InnerHashtable

Returns a Hashtable holding the dictionary's key/value pairs

DictionaryBase

Item

Returns the value corresponding to a specific key

Dictionary

Keys

Returns an ICollection containing the dictionary's keys

Dictionary

Remove

Removes the key/value pair for a specific key

Dictionary

Values

Returns an ICollection containing the dictionary's values

Dictionary

As is the case with strongly typed collections, you can add other, more specialized methods if they would be useful in your application.

Example program MakeEmployeeDictionary, which is available for download on the book's web site, shows how to build a strongly typed dictionary of Employee objects that inherits from the DictionaryBase class.

Other Strongly Typed Derived Classes

How a class stores its data internally is generally not a developer's concern. As long as it does its job, you shouldn't care whether the DictionaryBase class stores its data in a linked list, hash table, or some other data structure. (Although the DictionaryBase class has an InnerHashtable method that returns its data in a Hashtable form, so perhaps that's a hint.)

However, if you really want a strongly typed class that you know uses a ListDictionary instead of a hash table (or whatever CollectionBase uses), you could derive a strongly typed class from the ListDictionary class.

The following code shows the EmployeeListDictionary class, which is derived from the ListDictionary class. It uses the Shadows keyword to replace the ListDictionary class's Add, Item, Contains, and Remove methods with new strongly typed versions. Those methods simply pass their requests along to the base class ListDictionary.

Public Class EmployeeListDictionary
    Inherits ListDictionary

    ' Add a Dictionary entry.
    Public Shadows Sub Add(ByVal new_key As String,
     ByVal new_employee As Employee)
        MyBase.Add(new_key, new_employee)
    End Sub

    ' Return an object with the given key.
    Default Public Shadows Property Item(ByVal key As String) As Employee
        Get
            Return DirectCast(MyBase.Item(key), Employee)
        End Get
        Set(ByVal Value As Employee)
            MyBase.Item(key) = Value
        End Set
    End Property

    ' Return True if the Dictionary contains this Employee.
    Public Shadows Function Contains(ByVal key As String) As Boolean
        Return MyBase.Contains(key)
    End Function

    ' Remove this entry.
    Public Shadows Sub Remove(ByVal key As String)
        MyBase.Remove(key)
    End Sub
End Class

                                                  
Other Strongly Typed Derived Classes

Example program MakeEmployeeListDictionary, which is available for download on the book's web site, builds a strongly typed ListDictionary that holds Employee objects.

StringDictionary

The StringDictionary class uses a hash table to manage keys and values that are all strings. Because it uses a hash table, it can handle very large data sets quickly.

Its methods are strongly typed to require strings, so they provide extra type checking that can make finding potential bugs easier. For that reason, you should use a StringDictionary instead of a generic ListDictionary or Hashtable if you want to work exclusively with strings.

SortedList

The SortedList class acts as a Hashtable/Array hybrid. When you access a value by key, it acts as a hash table. When you access a value by index, it acts as an array containing items sorted by key value. For example, suppose that you add a number of Job objects to a SortedList named jobs using their priorities as keys. Then jobs(0) always returns the job with the smallest priority value.

Example program UseSortedList, which is available for download on the book's web site, demonstrates the SortedList class.

A SortedList is more complicated than a Hashtable or an array, so you should only use it if you need its special properties.

COLLECTIONSUTIL

Normally Hashtables and SortedLists are case-sensitive. The CollectionsUtil class provides two shared methods, CreateCaseInsensitiveHashtable and CreateCaseInsensitiveSortedList, that create Hashtables and SortedLists objects that are case-insensitive.

Example program UseCaseInsensitiveSortedList, which is available for download on the book's web site, uses code similar to the following to create a normal case-sensitive SortedList. It then adds two items with keys that differ only in their capitalization. This works because a case-sensitive SortedList treats the two keys as different values. The code then creates a case-insensitive SortedList. When it tries to add the same two items, the list raises an exception, complaining that it already has an object with key value Sport.

Dim sorted_list As SortedList

' Use a normal, case-sensitive SortedList.
sorted_list = New SortedList
sorted_list.Add("Sport", "Volleyball")
sorted_list.Add("sport", "Golf") ' Okay because Sport <> sport.

' Use a case-insensitive SortedList.
sorted_list = CollectionsUtil.CreateCaseInsensitiveSortedList()
sorted_list.Add("Sport", "Volleyball")
sorted_list.Add("sport", "Golf") ' Error because Sport = sport.

                                                  
COLLECTIONSUTIL

If you can use case-insensitive Hashtables and SortedLists, you should generally do so. This prevents the program from adding two entries that are supposed to be the same but have different capitalization. For example, if one routine spells a key value "Law Suit" and another spells it "law suit," the case-insensitive Hashtable or SortedList will quickly catch the error. Neither will notice an error if part of your program spells this "LawSuit." (You could also add extra logic to remove spaces and special symbols to increase the chances of finding similar terms that should be the same, but a discussion of these sorts of methods is beyond the scope of this book.)

STACKS AND QUEUES

Stacks and queues are specialized data structures that are useful in many programming applications that need to add and remove items in a particular order. The Visual Basic Stack and Queue classes implement stacks and queues.

The difference between a stack and a queue is the order in which they return the items stored in them. The following two sections describe stacks and queues and explain the ways in which they return items.

Stack

A stack returns items in last-in, first-out (LIFO, pronounced life-o) order. Because of the LIFO behavior, a stack is sometimes called a LIFO list or simply a LIFO.

Adding an item to the stack is called pushing the item onto the stack and removing an item is called popping the item off of the stack. These operations have the names push and pop because a stack is like a spring-loaded stack of plates in a cafeteria or buffet. You push new plates down onto the top of the stack and the plates sink into the counter. You pop the top plate off and the stack rises to give you the next plate.

Figure 28-3 illustrates this kind of stack. If you haven't seen this sort of thing before, don't worry about it. Just remember that push adds an item and pop removes the top item.

A Stack lets you remove items in last-in, first-out (LIFO) order.

Figure 28.3. A Stack lets you remove items in last-in, first-out (LIFO) order.

Normally, you use a Stack object's Push and Pop methods to add and remove items, but the Stack class also provides some cheating methods that let you peek at the Stack's top object or convert the Stack into an array. The following table describes the Stack class's most useful properties and methods.

PROPERTY/METHOD

PURPOSE

Clear

Removes all items from the Stack.

Contains

Returns True if the Stack contains a particular object.

CopyTo

Copies some or all of the Stack class's objects into a one-dimensional array.

Count

Returns the number of items in the Stack.

Peek

Returns a reference to the Stack class's top item without removing it from the Stack.

Pop

Returns the Stack class's top item and removes it from the Stack.

Push

Adds an item to the top of the Stack.

ToArray

Returns a one-dimensional array containing references to the objects in the Stack. The Stack class's top item is placed first in the array.

A Stack allocates memory to store its items. If you Push an object onto a Stack that is completely full, the Stack must resize itself to make more room and that slows down the operation.

To make memory management more efficient, the Stack class provides three overloaded constructors. The first takes no parameters and allocates a default initial capacity. The second takes as a parameter the number of items it should initially be able to hold. If you know that you will add 10,000 items to the Stack, you can avoid a lot of resizing by initially allocating room for 10,000 items.

The third version of the constructor takes as a parameter an object that implements the ICollection interface. The constructor allocates enough room to hold the items in the collection and copies them into the Stack.

Example program UseStack, which is available for download on the book's web site, uses a Stack to reverse the characters in a string.

Queue

A queue returns items in first-in, first-out (FIFO, pronounced fife-o) order. Because of the FIFO behavior, a queue is sometimes called a FIFO list or simply a FIFO.

A queue is similar to a line at a customer service desk. The first person in line is the first person to leave it when the service desk is free. Figure 28-4 shows the idea graphically.

Customers leave a queue in first-in, first-out (FIFO) order.

Figure 28.4. Customers leave a queue in first-in, first-out (FIFO) order.

Queues are particularly useful for processing items in the order in which they were created. For example, an order-processing application might keep orders in a queue so that customers who place orders first are satisfied first (or at least their order is shipped first, whether they are satisfied or not).

Historically, the routines that add and remove items from a queue are called Enqueue and Dequeue. The following table describes these methods and the Queue class's other most useful properties and methods.

PROPERTY/METHOD

PURPOSE

Clear

Removes all items from the Queue.

Contains

Returns True if the Queue contains a particular object.

CopyTo

Copies some or all of the Queue class's objects into a one-dimensional array.

Count

Returns the number of items in the Queue.

Dequeue

Returns the item that has been in the Queue the longest and removes it from the Queue.

Enqueue

Adds an item to the back of the Queue.

Peek

Returns a reference to the Queue class's oldest item without removing it from the Queue.

ToArray

Returns a one-dimensional array containing references to the objects in the Queue. The Queue class's oldest item is placed first in the array.

TrimToSize

Frees empty space in the Queue to set its capacity equal to the number of items it actually contains.

A Queue allocates memory to store its items. If you Enqueue an object while the queue's memory is full, the Queue must resize itself to make more room, and that slows down the operation.

To make memory management more efficient, the Queue class provides four overloaded constructors. The first takes no parameters and allocates a default initial capacity. If the Queue is full, it enlarges itself by a default growth factor.

The second constructor takes as a parameter its initial capacity. If you know that you will add 600 items to the Queue, you can save some time by initially allocating room for 600 items. With this constructor, the Queue also uses a default growth factor.

The third constructor takes as a parameter an object that implements the ICollection interface. The constructor allocates enough room to hold the items in the collection and copies them into the Queue. It also uses a default growth factor.

The final version of the constructor takes as parameters an initial capacity and a growth factor between 1.0 and 10.0. A larger growth factor will mean that the Queue resizes itself less often, but it may contain a lot of unused space.

Queues are useful for scheduling items in a FIFO order. For example, a shared network computer uses a queue. Users on different computers send jobs to the printer, and they are printed in FIFO order.

Example program UseQueue, which is available for download on the book's web site, demonstrates a Queue.

GENERICS

Chapter 29 explains how you can build and use generic classes to perform similar actions for objects of various types. For example, you could build a Tree class that can build a tree of any specific kind of object. Your program could then make a tree of Employees, a tree of Customers, a tree of Punchlines, or even a tree of trees. Visual Basic comes with a useful assortment of pre-built generic collection classes.

The System.Collections.Generic namespace provides several generic collection classes that you can use to build strongly typed collections. These collections work with a specific data type that you supply in a variable's declaration. For example, the following code makes a List that holds strings:

Imports System.Collections.Generic

. . .
Dim places As New List(Of String)
places.Add("Chicago")

The places object's methods are strongly typed and work only with strings, so they provide extra error protection that a less specialized collection doesn't provide. To take advantage of this extra protection, you should use generic collections or strongly typed collections derived from the CollectionBase class whenever possible.

When you derive a strongly typed collection from the CollectionBase class, you can add extra convenience functions to it. For example, an EmployeeCollection class could include an overloaded version of the Add method that accepts first and last names as parameters, makes a new Employee object, and adds it to the collection.

You cannot directly modify a generic collection class, but you can add extension methods to it. For example, the following code adds an AddPerson method to the generic List(Of Person) class. This method takes as parameters a first and last name, uses those values to make a Person object, and adds it to the list.

Module PersonListExtensions
    <Extension()>
    Public Sub AddPerson(ByVal person_list As List(Of Person),
     ByVal first_name As String, ByVal last_name As String)
        Dim per As New Person() With _
            {.FirstName = first_name, .LastName = last_name}
        person_list.Add(per)
    End Sub
End Module

For more information on extension methods, see the section "Extension Methods" in Chapter 17, "Subroutines and Functions."

In addition to adding extension methods to a generic class, you can also derive an enhanced collection from a generic class. For example, the following code defines an EmployeeCollection class that inherits from the generic Collection(Of Employee). It then adds an overloaded version of the Add method that takes first and last names as parameters.

Imports System.Collections.Generic

Public Class EmployeeList
    Inherits List(Of Employee)

    Public Overloads Sub Add(
     ByVal first_name As String, ByVal last_name As String)
        Dim emp As New Employee(first_name, last_name)
        MyBase.Add(emp)
    End Sub
End Class

                                                  
GENERICS

The following table lists the some of the most useful collection classes defined by the System.Collections.Generic namespace.

COLLECTION

PURPOSE

Comparer

Compares two objects of the specific type and returns −1, 0, or 1 to indicate whether the first is less than, equal to, or greater than the second

Dictionary

A strongly typed dictionary

LinkedList

A strongly typed linked list

LinkedListNode

A strongly typed node in a linked list

List

A strongly typed list

Queue

A strongly typed queue

SortedDictionary

A strongly typed sorted dictionary

SortedList

A strongly typed sorted list

Stack

A strongly typed stack

Example program GenericStringList, which is available for download on the book's web site, demonstrates a generic List(Of String). Example program GenericEmployeeList, which is also available for download, derives a strongly typed EmployeeList class from a generic List(Of Employee).

For more information on generics (including instructions for writing your own generic classes), see Chapter 29.

COLLECTION INITIALIZERS

A new feature in Visual Basic 2010 allows you to easily initialize collection classes that have an Add method. To initialize a collection, follow the variable's instantiation with the keyword From and then a series of comma-separated values inside braces.

For example, the following code snippet initializes an ArrayList, StringCollection, and generic List(Of Person). Notice how the generic List's initializer includes a series of new Person objects that are initialized with the With keyword.

Dim numbers As New ArrayList() From {1, 2, 3}
Dim names As New StringCollection() From {"Alice", "Bob", "Cynthia"}
Dim authors As New List(Of Person) From {
    New Person() With {.FirstName = "Simon", .LastName = "Green"},
    New Person() With {.FirstName = "Christopher", .LastName = "Moore"},
    New Person() With {.FirstName = "Terry", .LastName = "Pratchett"}
}

If a collection's Add method takes more than one parameter, simply include the appropriate values for each item inside their own sets of braces. The following code uses this method to initialize a NameValueCollection and a Dictionary with Integer keys and String values:

Dim phone_numbers As New NameValueCollection() From {
    {"Ashley", "502-253-3748"},
    {"Jane", "505-847-2984"},
    {"Mike", "505-847-3984"},
    {"Shebly", "502-487-4939"}
}
Dim greetings As New Dictionary(Of Integer, String) From {
    {1, "Hi"},
    {2, "Hello"},
    {3, "Holla"}
}

The same technique works for other collections that need two values such as ListDictionary, Hashtable, HybridDiction, StringDictionary, and SortedList.

Unfortunately, you cannot use this method to initialize the Stack and Queue classes. For historical reasons, the methods in those classes that add new items are called Push and Enqueue rather than Add, and this method requires the class to have an Add method.

Fortunately, you can write extension methods to give those classes Add methods. The following code creates Add methods for the Stack and Queue classes:

Module Extensions
    <Extension()>
    Public Sub Add(ByVal the_stack As Stack, ByVal value As Object)
        the_stack.Push(value)
    End Sub

    <Extension()>
    Public Sub Add(ByVal the_queue As Queue, ByVal value As Object)
        the_queue.Enqueue(value)
    End Sub
End Module

After you create these extension methods, you can initialize Stacks and Queues as in the following code:

Dim people_stack As New Stack() From {"Electra", "Storm", "Rogue"}
Dim people_queue As New Queue() From {"Xavier", "Anakin", "Zaphod"}

SUMMARY

This chapter describes five types of objects: arrays, collections, dictionaries, stacks, and queues. It also explains how to convert any of these into strongly typed classes and how to use generic collections.

Arrays store objects sequentially. They allow fast access at any point in the array. The Array class lets you make arrays indexed with nonzero lower bounds, although they provide slower performance than arrays of variables, which require lower bounds of zero. The Array class provides several useful methods for working with Array objects and normal variable arrays, including Sort, Reverse, IndexOf, LastIndexOf, and BinarySearch.

Collections store data in ways that are different from those used by arrays. An ArrayList stores items in a linked list. That works well for short lists, but slows down when the list grows large. A StringCollection holds a collection of strings. StringCollection is an example of a strongly typed collection (it holds only strings). The NameValueCollection class is a specialized collection that can hold more than one string value for a given key value.

Dictionaries associate key values with corresponding data values. You look up the key to find the data much as you might look up a word in the dictionary to find its definition. The ListDictionary class stores its data in a linked list. It is fast for small data sets but slow when it contains too much data. A Hashtable, on the other hand, has substantial overhead, but is extremely fast for large dictionaries. A HybridDictionary acts as a ListDictionary if it doesn't contain too much data, and switches to a Hashtable when it gets too big. The StringDictionary class is basically a Hashtable that is strongly typed to work with strings. The SortedList class is a Hashtable/Array hybrid that lets you access values by key or in sorted order.

Stack classes provide access to items in last-in, first-out (LIFO) order, whereas Queue classes give access to their items in first-in, first-out (FIFO) order.

The generic Dictionary, LinkedList, List, Queue, SortedDictionary, SortedList, and Stack classes enable you to use strongly typed data structures without going to the trouble of building your own strongly typed classes.

Although these classes have very different features for adding, removing, finding, and ordering objects, they share some common traits. For example, those that provide an Add method support collection initialization, a feature new in Visual Basic 2010.

All of these classes, plus other strongly typed collections that you can derive from the CollectionBase, DictionaryBase, and other classes, provide significant flexibility and options, so you can pick the class that best satisfies your needs. Deciding which class is best can be tricky, but making the right choice can mean the difference between a program that processes a large data set in seconds, hours, or not at all. Spend some time reviewing the different characteristics of the class so that you can make the best choice possible.

This chapter explains how you can use the generic collection classes provided by the System.Collections.Generic namespace to build strongly typed collection classes of several useful types. Chapter 29, "Generics," explains how you can build generic classes of your own. Using generics, you can build strongly typed classes that manipulate all sorts of objects in any way you can imagine.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.226.82.253