Chapter 6. Lining Up Your Ducks with Collections

In This Chapter

  • Creating variables that contain multiple items of data: Arrays

  • Going arrays one better with flexible "collections"

  • New features: Array and collection initializers and set-type collections

Simple one-value variables of the sort you may encounter in this book fall a bit short in dealing with lots of items of the same kind: ten ducks instead of just one, for example. C# fills the gap with two kinds of variables that store multiple items, generally called collections. The two species of collection are the array and the more general purpose collection class. Usually, if I mean array, I say so, and if I mean collection class, I just call it that. If I refer to a collection or a list, I usually mean that it can be either one.

An array is a data type that holds a list of items, all of which must be of the same type: all int or all double, for example.

C# gives you quite a collection of collection classes, and they come in various shapes, such as flexible lists (like strings of beads), queues (like the line to buy your Spider-Man XII tickets), stacks (like the semistack of junk on someone's desk), and more. Most collection classes are like arrays in that they can hold just apples or just oranges. But C# also gives you a few collection classes that can hold both apples and oranges at a time — which is useful only rarely. (And you have much better ways to manage the feat than using these elderly collections.)

For now, if you can master the array and the List collection (although this chapter introduces two other kinds of collections), you'll do fine throughout most of this book. But circle back here later if you want to pump up your collection repertoire.

The C# Array

Variables that contain single values are plenty useful. Even class structures that can describe compound objects made up of parts (such as a vehicle with its engine and transmission) are critical. But you also need a construct for holding a bunch of objects, such as Bill Gates' extensive collection of vintage cars or a certain author's vintage sock collection. The built-in class Array is a structure that can contain a series of elements of the same type (all int values or all double values, for example, or all Vehicle objects or all Motor objects; you meet these latter sorts of objects in Chapter 7 of this minibook).

The argument for the array

Consider the problem of averaging a set of six floating-point numbers. Each of the six numbers requires its own double storage:

double d0 = 5;
double d1 = 2;
double d2 = 7;
double d3 = 3.5;
double d4 = 6.5;
double d5 = 8;

(Averaging int variables can result in rounding errors, as described in Chapter 2 of this minibook.)

Computing the average of those variables might look like this:

double sum = d0 + d1 + d2 + d3 + d4 + d5;
double average = sum / 6;

Listing each element by name is tedious. Okay, maybe it's not so tedious when you have only 6 numbers to average, but imagine averaging 600 (or even 6 million) floating-point values.

The fixed-value array

Fortunately, you don't need to name each element separately. C# provides the array structure that can store a sequence of values. Using an array, you can put all your doubles into one variable, like this:

double[] doublesArray = {5, 2, 7, 3.5, 6.5, 8, 1, 9, 1, 3};

You can also declare an empty array without initializing it:

double[] doublesArray = new double[6];

This line allocates space for six doubles but doesn't give them explicit values; each element starts out at the default value, 0.
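C# fills a freshly allocated numeric array with zeros, and you can check that for yourself. The following minimal sketch (the class and method names are made up for illustration and aren't part of the book's samples) allocates an array and prints what's in it:

```csharp
using System;

public class DefaultValuesDemo
{
  // Allocates an array of the requested size without an initializer.
  public static double[] Allocate(int size)
  {
    return new double[size];
  }

  public static void Main(string[] args)
  {
    double[] doublesArray = Allocate(6);
    foreach (double d in doublesArray)
    {
      Console.Write(d + " ");   // Each element is 0 until you assign it.
    }
    Console.WriteLine();
  }
}
```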

Note

The Array class, on which all C# arrays are based, provides a special syntax that makes it more convenient to use. The paired brackets [] refer to the way you access individual elements in the array:

doublesArray[0] // Corresponds to d0 (that is, 5)
doublesArray[1] // Corresponds to d1 (that is, 2)
. . .

The 0th element of the array corresponds to d0, the element at index 1 to d1, the element at index 2 to d2, and so on. Programmers commonly refer to the 0th element as "doublesArray sub-0," to the element at index 1 as "doublesArray sub-1," and so on.

Note

The array's element numbers — 0, 1, 2, . . . — are known as the index.

In C#, the array index starts at 0 and not at 1. Therefore, you typically don't refer to the element at index 1 as the first element but, rather, as the "oneth element" or the "element at index 1." The first element is the zeroth element. If you insist on using normal speech, just be aware that the first element is always at index 0 and the second element is at index 1.
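To see zero-based indexing in action, here's a small sketch (the IndexDemo class and its helper methods are hypothetical names chosen for this example). It reads the first and last elements and then shows what happens if you step one past the end:

```csharp
using System;

public class IndexDemo
{
  // The first element always lives at index 0.
  public static double First(double[] values)
  {
    return values[0];
  }

  // The last element lives at index Length - 1, not at index Length.
  public static double Last(double[] values)
  {
    return values[values.Length - 1];
  }

  public static void Main(string[] args)
  {
    double[] doublesArray = {5, 2, 7, 3.5, 6.5, 8};
    Console.WriteLine("First: " + First(doublesArray));
    Console.WriteLine("Last: " + Last(doublesArray));
    try
    {
      double oops = doublesArray[6];  // One past the end of six elements
      Console.WriteLine(oops);
    }
    catch (IndexOutOfRangeException)
    {
      Console.WriteLine("Index 6 is out of range for six elements.");
    }
  }
}
```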

The doublesArray variable wouldn't be much of an improvement, were it not for the possibility that the index of the array is a variable. Using a for loop is easier than writing out each element by hand, as this program demonstrates:

Note

// FixedArrayAverage -- Average a fixed array of numbers using a loop.
namespace FixedArrayAverage
{
  using System;
  public class Program
  {
    public static void Main(string[] args)
    {
      double[] doublesArray = {5, 2, 7, 3.5, 6.5, 8, 1, 9, 1, 3};
      // Accumulate the values in the array into the variable sum.
      double sum = 0;
      for (int i = 0; i < 10; i++)
      {
        sum = sum + doublesArray[i];
      }
      // Now calculate the average.
      double average = sum / 10;
      Console.WriteLine(average);
      Console.WriteLine("Press Enter to terminate...");
      Console.Read();
    }
  }
}

The program begins by initializing a variable sum to 0. Then it loops through the values stored in doublesArray, adding each one to sum. By the end of the loop, sum has accumulated the sum of all values in the array. The resulting sum is divided by the number of elements to create the average. The output from executing this program is the expected 4.6. (You can check it with your calculator.)

The variable-length array

The array used in the FixedArrayAverage program example suffers from these two serious problems:

  • The size of the array is fixed at ten elements.

  • Worse, the elements' values are specified directly in the program.

A program that could read in a variable number of values, perhaps determined by the user during execution, would be much more flexible. It would work not only for the ten values specified in FixedArrayAverage but also for any other set of values, regardless of their number.

The format for declaring a variable-size array differs slightly from that of a fixed-size, fixed-value array:

double[] doublesArrayVariable = new double[N];  // Variable, versus ...
double[] doublesArrayFixed = new double[10];    // Fixed

Here, N represents the number of elements to allocate.

The updated program VariableArrayAverage enables the user to specify the number of values to enter. (N has to come from somewhere.) Because the program retains the values entered, not only does it calculate the average, but it also displays the results in a pleasant format, as shown here:

Note

// VariableArrayAverage -- Average an array whose size is
//    determined by the user at runtime, accumulating the values
//    in an array. Allows them to be referenced as often as
//    desired. In this case, the array creates an attractive output.
namespace VariableArrayAverage
{
  using System;
  public class Program
  {
    public static void Main(string[] args)
    {
      // First read in the number of doubles the user intends to enter.
      Console.Write("Enter the number of values to average: ");
      string numElementsInput = Console.ReadLine();
      int numElements = Convert.ToInt32(numElementsInput);
      Console.WriteLine();
      // Now declare an array of that size.
      double[] doublesArray = new double[numElements]; // Here's the 'N'.
      // Accumulate the values into an array.
      for (int i = 0; i < numElements; i++)
      {
        // Prompt the user for another double.
        Console.Write("enter double #" + (i + 1) + ": ");
        string val = Console.ReadLine();
        double value = Convert.ToDouble(val);
        // Add this to the array using bracket notation.
        doublesArray[i] = value;
      }
      // Accumulate 'numElements' values from
      // the array in the variable sum.
      double sum = 0;
      for (int i = 0; i < numElements; i++)
      {
        sum = sum + doublesArray[i];
      }

      // Now calculate the average.
      double average = sum / numElements;
      // Output the results in an attractive format.
      Console.WriteLine();
      Console.Write(average + " is the average of (" + doublesArray[0]);
      for (int i = 1; i < numElements; i++)
      {
        Console.Write(" + " + doublesArray[i]);
      }
      Console.WriteLine(") / " + numElements);
      // Wait for user to acknowledge the results.
      Console.WriteLine("Press Enter to terminate...");
      Console.Read();
    }
  }
}

Look at the following output of a sample run in which you enter five sequential values, 1 through 5, and the program calculates the average to be 3:

Enter the number of values to average: 5

enter double #1: 1
enter double #2: 2
enter double #3: 3
enter double #4: 4
enter double #5: 5

3 is the average of (1 + 2 + 3 + 4 + 5) / 5
Press Enter to terminate...

The VariableArrayAverage program begins by prompting the user for the number of values she intends to average. (That's the N we mention a little earlier.) The result is stored in the int variable numElements. In the example, the number entered is 5.

The program continues by allocating an array doublesArray with the specified number of elements. In this case, the program allocates an array with five elements. The program loops the number of times specified by numElements, reading a new value from the user each time. After the last value, the program calculates the average.

Tip

Getting console output just right, as in this example, is a little tricky. Follow each statement in VariableArrayAverage carefully as the program outputs the open parenthesis, the plus signs, each of the numbers in the sequence, and the closing parenthesis and divisor, and compare it with the output.

The VariableArrayAverage program probably doesn't completely satisfy your thirst for flexibility. You don't want to have to tell the program how many numbers you want to average. What you really want is to enter numbers to average as long as you want — and then tell the program to average what you entered. That's where the C# collections come in. They give you a powerful, flexible alternative to arrays. Getting input directly from the user isn't the only way to fill up your array or another collection, either.

The Length property

The for loop that's used to populate the array in the VariableArrayAverage program begins this way:

// Now declare an array of that size.
double[] doublesArray = new double[numElements];
// Accumulate the values into an array.
for (int i = 0; i < numElements; i++)
{
  . . .
}

The doublesArray is declared to be numElements items long. Thus the clever programmer used a for loop to iterate through numElements items of the array. (Iterate means to loop through the array one element at a time, as with a for loop.)

It would be a shame and a crime to have to schlep around the variable numElements with doublesArray everywhere it goes just so that you know how long it is. Fortunately, that isn't necessary. An array has a property named Length that contains its length. doublesArray.Length has the same value as numElements.

The following for loop is preferable:

// Accumulate the values into an array.
for (int i = 0; i < doublesArray.Length; i++) ...
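Because the array carries its own length, you can write an averaging method that works for an array of any size. Here's one way it might look (the LengthDemo class and Average method are illustrative names, not part of the book's sample programs):

```csharp
using System;

public class LengthDemo
{
  // Averages any array of doubles; the array itself reports how long it is.
  public static double Average(double[] values)
  {
    double sum = 0;
    for (int i = 0; i < values.Length; i++)
    {
      sum = sum + values[i];
    }
    return sum / values.Length;
  }

  public static void Main(string[] args)
  {
    double[] six = {5, 2, 7, 3.5, 6.5, 8};
    double[] ten = {5, 2, 7, 3.5, 6.5, 8, 1, 9, 1, 3};
    Console.WriteLine(Average(six));
    Console.WriteLine(Average(ten));
  }
}
```

Notice that no count gets passed along with the array. The same method handles both arrays.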

Initializing an array

The following lines show an array with its initializer and then one that allocates space but leaves every element at its default value (0 for double):

double[] fixedLengthArray = {5, 2, 7, 3.5, 6.5, 8, 1, 9, 1, 3};
double[] variableLengthArray = new double[10];

You can do it all yourself using the following code:

double[] fixedLengthArray = new double[10] {5, 2, 7, 3.5, 6.5, 8, 1, 9, 1, 3};

Here, you have specifically allocated the memory using new and then followed that declaration with the initial values for the members of the array. I think I can predict which form you prefer. (Hint: Line 1?)

A Loop Made foreach Array

Given an array of Student objects, the following loop averages their grade point averages:

public class Student  // Read about classes in Book II.
{
  public string name;
  public double gpa;         // Grade point average
}
public class Program
{
  public static void Main(string[] args)
  {
    //  . . .create the array somehow . . .
    // Now average the students you have.
    double sum = 0.0;
    for (int i = 0; i < students.Length; i++)
    {
      sum += students[i].gpa;
    }
    double avg = sum / students.Length;
    //  . . .do something with the average . . .
  }
}

The for loop iterates through the members of the array. (Yes, you can have arrays of any sort of object, not just of simple types such as double and string. You most likely haven't been formally introduced to classes yet, so bear with me a bit longer. I get into them in the next book.)

students.Length contains the number of elements in the array.

Note

C# provides another loop, named foreach, designed specifically for iterating through collections such as the array. It works this way:

// Now average the students that you have.
double sum = 0.0;
foreach (Student student in students)
{
  sum += student.gpa;  // This extracts the current student's GPA.
}
double avg = sum / students.Length;

The first time through the loop, foreach fetches the first Student object in the array and stores it in the variable student. On each subsequent pass, foreach retrieves the next element. Control passes out of the foreach loop when all elements in the array have been processed.

Notice that no index appears in the foreach statement. The lack of an index greatly reduces the chance of error and is simpler to write than the for statement, although sometimes that index is handy and you prefer a for loop.
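If you want to try the foreach version yourself, you need an actual array of students. The following sketch fills in that gap with two made-up students (their names and GPAs are invented for the example, and the ForeachDemo class is hypothetical):

```csharp
using System;

public class Student
{
  public string name;
  public double gpa;         // Grade point average
}

public class ForeachDemo
{
  // Averages the GPAs without ever touching an index.
  public static double AverageGpa(Student[] students)
  {
    double sum = 0.0;
    foreach (Student student in students)
    {
      sum += student.gpa;
    }
    return sum / students.Length;
  }

  public static void Main(string[] args)
  {
    // Two invented students, just to have something to average.
    Student[] students =
    {
      new Student { name = "Ann", gpa = 3.5 },
      new Student { name = "Bob", gpa = 2.5 }
    };
    Console.WriteLine("Average GPA: " + AverageGpa(students));
  }
}
```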

Note

The foreach loop is even more powerful than it would seem from the example. This statement works on other collection types in addition to arrays. In addition, foreach handles multidimensional arrays (arrays with more than one index, such as rows and columns), a topic I don't describe in this book. To find out all about them, look up multidimensional arrays in the C# Help system.
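Just to give you a taste, here's a tiny sketch of foreach walking a two-dimensional array. The grid values and the GridDemo class are invented for this example. The loop visits every element, row by row, with no index in sight:

```csharp
using System;

public class GridDemo
{
  // Adds up every value in a two-dimensional array.
  public static int Sum(int[,] grid)
  {
    int sum = 0;
    foreach (int value in grid)
    {
      sum += value;
    }
    return sum;
  }

  public static void Main(string[] args)
  {
    int[,] grid = { { 1, 2, 3 }, { 4, 5, 6 } };  // Two rows, three columns
    Console.WriteLine("Total: " + Sum(grid));
  }
}
```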

Sorting Arrays of Data

A common programming challenge is the need to sort the elements within an array. Just because an array cannot grow or shrink doesn't mean that the elements within it cannot be moved around or replaced. For example, the following code snippet swaps the location of two string elements within the array strings:

string temp = strings[i]; // Save the i'th string.
strings[i] = strings[k];  // Replace it with the kth.
strings[k] = temp;        // Replace kth with temp.

In this example, the object reference in the ith location in the strings array is saved so that it isn't lost when the second statement replaces it with another element. Finally, the temp variable is saved back into the kth location. Pictorially, this process looks like Figure 6-1.
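If you swap elements often, you might wrap those three statements in a method of your own. This is only a sketch: the SwapDemo class and its Swap method are hypothetical helpers, not something the Array class provides.

```csharp
using System;

public class SwapDemo
{
  // Exchanges the references stored at positions i and k.
  public static void Swap(string[] strings, int i, int k)
  {
    string temp = strings[i];  // Save the ith string.
    strings[i] = strings[k];   // Replace it with the kth.
    strings[k] = temp;         // Replace kth with temp.
  }

  public static void Main(string[] args)
  {
    string[] strings = { "first", "second", "third" };
    Swap(strings, 0, 2);
    foreach (string s in strings)
    {
      Console.WriteLine(s);
    }
  }
}
```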

Tip

The data collections discussed in the rest of this chapter are more versatile than the array for adding and removing elements.

Figure 6-1. The term swapping two objects means swapping references to two objects.

The following program demonstrates how to use the ability to manipulate elements within an array as part of a sort. This particular sorting algorithm is the bubble sort. Though it's not so great on large arrays with thousands of elements, it's simple and effective on small arrays:

Note

// BubbleSortArray -- Given a list of planets, sort their
//    names: first, in alphabetical order.
//    Second, by the length of their names, shortest to longest.
//    Third, from longest to shortest.
//    This demonstrates using and sorting arrays, working with
//    them by array index. Two sort algorithms are used:
//    1. The Sort algorithm used by class Array's Sort() method.
//    2. The classic Bubble Sort algorithm.
using System;

namespace BubbleSortArray
{
  class Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("The 5 planets closest to the sun, in order: ");
      string[] planets =
        new string[] { "Mercury", "Venus", "Earth", "Mars", "Jupiter" };
      foreach (string planet in planets)
      {
        // Use the special char \t to insert a tab in the printed line.
        Console.WriteLine("\t" +  planet);
      }
      Console.WriteLine("\nNow listed alphabetically: ");
      // Array.Sort() is a method on the Array class.
      // Array.Sort() does its work in-place in the planets array,
      // which leaves you without a copy of the original array. The
      // solution is to copy the old array to a new one and sort it.
      // (Plain assignment would copy only the reference, not the array.)
      string[] sortedNames = (string[])planets.Clone();
      Array.Sort(sortedNames);
      // This demonstrates that (a) sortedNames contains the same
      // strings as planets and (b) that they're now sorted.
      foreach (string planet in sortedNames)
      {
        Console.WriteLine("\t" + planet);
      }

      Console.WriteLine("\nList by name length - shortest first: ");
      // This algorithm is called "Bubble Sort": It's the simplest
      // but worst-performing sort. The Array.Sort() method is much
      // more efficient, but I couldn't use it directly to sort the
      // planets in order of name length because it sorts strings,
      // not their lengths.
      int outer;  // Index of the outer loop
      int inner;  // Index of the inner loop
      // Loop DOWN from last index to first: planets[4] to planets[0].
      for (outer = planets.Length - 1; outer >= 0; outer--)
      {
        // On each outer loop, loop through all elements UP TO the
        // current outer element. This loop goes up, from planets[1]
        // to planets[outer]. Using the for loop, you can traverse the
        // array in either direction.
        for (inner = 1; inner <= outer; inner++)
        {
          // Compare adjacent elements. If the earlier one is longer
          // than the later one, swap them. This shows how you can
          // swap one array element with another when they're out of order.
          if (planets[inner - 1].Length > planets[inner].Length)
          {
            // Temporarily store one planet.
            string temp = planets[inner - 1];
            // Now overwrite that planet with the other one.
            planets[inner - 1] = planets[inner];
            // Finally, reclaim the planet stored in temp and put
            // it in place of the other.
            planets[inner] = temp;
          }
        }
      }
      foreach (string planet in planets)
      {
        Console.WriteLine("\t" + planet);
      }

      Console.WriteLine("\nNow listed longest first: ");
      // That is, just loop down through the sorted planets.
      for(int i = planets.Length - 1; i >= 0; i--)
      {
        Console.WriteLine("\t" + planets[i]);
      }

      Console.WriteLine("\nPress Enter to terminate...");
      Console.Read();
    }
  }
}

The program begins with an array containing the names of the five planets closest to the sun. (To keep the figures small, I didn't include the outer planets, so I didn't have to decide about poor Pluto, which is, what now, a planetoid or something?)

The program then invokes the Array class's built-in Sort() method to list the names alphabetically. After that, the program sorts the planets by the length of their names using a custom sort just to amaze you.

Tip

The built-in Sort() method for arrays (and other collections) is, without a doubt, more efficient than the custom bubble sort. Don't roll your own unless you have good reason to.
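In that spirit, the built-in sort can order strings by length if you hand it a comparison to use. The following sketch relies on the Array.Sort() overload that accepts a comparison delegate, written here as a lambda expression. The SortByLengthDemo class and ByLength method are names invented for the example. Note that Array.Sort() makes no promise about the relative order of names that have the same length.

```csharp
using System;

public class SortByLengthDemo
{
  // Returns a copy of names ordered by length, shortest first.
  // The original array is left untouched.
  public static string[] ByLength(string[] names)
  {
    string[] copy = (string[])names.Clone();
    Array.Sort(copy, (a, b) => a.Length.CompareTo(b.Length));
    return copy;
  }

  public static void Main(string[] args)
  {
    string[] planets = { "Mercury", "Venus", "Earth", "Mars", "Jupiter" };
    foreach (string planet in ByLength(planets))
    {
      Console.WriteLine("\t" + planet);
    }
  }
}
```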

The algorithm for the second sort works by repeatedly looping through the list of strings until the list is sorted. On each pass through the planets array, the program compares the length of each string to that of its neighbor. If the two are found to be out of order, the program swaps them. (Many bubble sorts also flag whether a pass made any swaps and stop as soon as a pass makes none; this listing simply runs a fixed number of passes.) Figures 6-2 through 6-5 show the planets list after each pass. In Figure 6-5, note that the next-to-last pass results in a sorted list and that the final pass changes nothing.

Figure 6-2. Before starting the bubble sort.

Figure 6-3. After Pass 1 of the bubble sort.

Figure 6-4. After Pass 2 of the bubble sort.

Figure 6-5. The final pass terminates the sort because nothing changes.

Eventually, longer planet names "bubble" their way to the top of the list; hence the name bubble sort.
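A popular variation on the bubble sort keeps a flag that records whether a pass swapped anything and quits as soon as a pass comes up empty. The sketch below shows that variation. It isn't the BubbleSortArray listing itself, and the FlaggedBubbleSort class and SortByLength method are names made up for the example.

```csharp
using System;

public class FlaggedBubbleSort
{
  // Sorts names in place, shortest first, and returns the number of passes made.
  public static int SortByLength(string[] names)
  {
    int passes = 0;
    int last = names.Length - 1;   // Highest index still worth checking
    bool swapped = true;
    while (swapped)
    {
      swapped = false;
      passes++;
      for (int inner = 1; inner <= last; inner++)
      {
        if (names[inner - 1].Length > names[inner].Length)
        {
          string temp = names[inner - 1];
          names[inner - 1] = names[inner];
          names[inner] = temp;
          swapped = true;          // Not sorted yet; go around again.
        }
      }
      last--;                      // The longest remaining name is now in place.
    }
    return passes;
  }

  public static void Main(string[] args)
  {
    string[] planets = { "Mercury", "Venus", "Earth", "Mars", "Jupiter" };
    int passes = SortByLength(planets);
    Console.WriteLine(string.Join(", ", planets) + " (" + passes + " passes)");
  }
}
```

Because the comparison swaps only when the earlier name is strictly longer, names of equal length keep their original relative order.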

Tip

Give single-item variables singular names, as in planet or student. The name of the variable should somehow include the name of the class, as in badStudent or goodStudent or sexyCoedStudent. Give arrays (or other collections) plural names, as in students or phoneNumbers or phoneNumbersInMyPalmPilot. As always, this tip reflects the opinion of the authors and not of this book's publisher nor any of its shareholders — C# doesn't care how you name your variables.

New Feature: Using var for Arrays

Traditionally, you used one of the following forms (which are as old as C# — almost six years old at the time this book was written) to initialize an array:

int[] numbers = new int[3];            // Size but no initializer, or ...
int[] numbers = new int[] { 1, 2, 3 }; // Initializer but no size, or ...
int[] numbers = new int[3] { 1, 2, 3 };// Size and initializer, or ...
int[] numbers = { 1, 2, 3 };           // No 'new' keyword -- extreme short form.

Chapter 2 of this minibook introduces the new var keyword, which tells the C# compiler, "You figure out the variable type from the initializer expression I'm providing."

Happily, var works with arrays, too:

// myArray is an int[] with 6 elements.
var myArray = new [] { 2, 3, 5, 7, 11, 13 };  // Initializer required!

The new syntax has only two changes:

  • var is used instead of the explicit type information for the numbers array on the left side of the assignment.

  • The int keyword is omitted before the brackets on the right side of the assignment. It's the part that the compiler can infer.

Note

In the var version, the initializer is required. The compiler uses it to infer the type of the array elements without the int keyword.

Here are a few more examples:

var names = new [] { "John", "Paul", "George", "Ringo" };        // Strings
var averages = new [] { 3.0, 3.34, 4.0, 2.0, 1.8 };              // Doubles
var prez = new []{new President("FDR"), new President("JFK")};   // Presidents

Note

You can't use the extreme short form for initializing an array when you use var. The following line doesn't compile:

var names = { "John", "Paul", "George", "Ringo" };  // Needs 'new []'

The var way is less concise, but when used in some other situations not involving arrays, it truly shines and in some cases is mandatory. (You can see examples in Chapter 7.)
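If you're curious about what the compiler actually infers, you can ask each array for its type at run time. The following sketch does that (the VarDemo class and its helper method are invented for the example). The mixed array shows one wrinkle: when the initializer mixes int and double values, the int values convert to double, so the compiler settles on double[].

```csharp
using System;

public class VarDemo
{
  // Returns the run-time type the compiler chose for a mixed initializer.
  public static Type TypeOfMixed()
  {
    var mixed = new [] { 1, 2.5 };   // int converts to double, so this is double[].
    return mixed.GetType();
  }

  public static void Main(string[] args)
  {
    var primes = new [] { 2, 3, 5, 7, 11, 13 };
    var names = new [] { "John", "Paul", "George", "Ringo" };
    Console.WriteLine(primes.GetType());   // System.Int32[]
    Console.WriteLine(names.GetType());    // System.String[]
    Console.WriteLine(TypeOfMixed());      // System.Double[]
  }
}
```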

Note

The UsingVarWithArraysAndCollections sample program on this book's Web site demonstrates var with array initializers. Note that var is a contextual keyword: older code that uses var as a variable name still compiles, but avoid it in new code. It's a crummy variable name anyway.

Loosening Up with C# Collections

Often an array is the simplest, most straightforward way to deal with a list of Students or a list of doubles. You also encounter many places in the .NET Framework class library that require the use of arrays.

But arrays have a couple of fairly serious limitations that sometimes get in your way. At such times, you'll appreciate the extensive C# repertoire of more flexible collection classes.

Although arrays have the advantage of simplicity and can have multiple dimensions, they suffer from two important limitations:

  • A program must declare the size of an array when it's created. Unlike Visual Basic, C# doesn't let you change the size of an array after it's defined. That's a problem when you don't know up front how big the array needs to be.

  • Inserting or removing an element in the middle of an array is wildly inefficient. You have to move around all the elements to make room. In a big array, that can be a huge, time-consuming job.

Most collections, on the other hand, make it much easier to add, insert, or remove elements, and you can resize them as needed, right in midstream. In fact, most collections usually take care of resizing automatically.
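To make the contrast concrete, here's a sketch that inserts a value into the middle of an array the hard way and then does the same thing with a list. The InsertDemo class and InsertIntoArray method are hypothetical names. The array version has to allocate a new, larger array and copy everything across, while the list version is a single call.

```csharp
using System;
using System.Collections.Generic;

public class InsertDemo
{
  // Builds a new array one element longer, with value placed at index.
  public static int[] InsertIntoArray(int[] source, int index, int value)
  {
    int[] result = new int[source.Length + 1];
    Array.Copy(source, 0, result, 0, index);   // Elements before the gap
    result[index] = value;
    Array.Copy(source, index, result, index + 1, source.Length - index);
    return result;
  }

  public static void Main(string[] args)
  {
    int[] numbers = { 1, 2, 3, 4 };
    numbers = InsertIntoArray(numbers, 2, 7);

    List<int> numberList = new List<int> { 1, 2, 3, 4 };
    numberList.Insert(2, 7);                   // The list makes room by itself.

    Console.WriteLine("Array length: " + numbers.Length);
    Console.WriteLine("List count: " + numberList.Count);
  }
}
```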

Tip

If you need a multidimensional data structure, use an array. No collection allows multiple dimensions (although you can create some elaborate data structures, such as collections of arrays or collections of collections).

Arrays and collections have some characteristics in common:

  • Each can contain elements of only one type. You must specify that type in your code, at compile time, and after you declare the type, it can't change.

  • As with arrays, you can access most collections with array-like syntax using square brackets to specify an index: myList[3] = "Joe".

  • Both collections and arrays have methods and properties. Thus, to find the number of elements in the following smallPrimeNumbers array, you call its Length property:

    var smallPrimeNumbers = new [] { 2, 3, 5, 7, 11, 13 };
    int numElements = smallPrimeNumbers.Length;  // Result is 6.
  • With a collection, you call its Count property:

    List<int> smallPrimes = new List<int> { 2, 3, 5, 7, 11, 13 };
    int numElements = smallPrimes.Count; // Collections have a Count property.
  • Check out class Array in Help to see what other methods and properties it has (7 public properties and 36 public methods).

Understanding Collection Syntax

In this section, I'll get you up and running with collection syntax and introduce the most important and most frequently used collection classes.

Table 6-1 lists the main collection classes in C#. I find it useful to think of collections as having various "shapes" — the list shape or dictionary shape, for example.

Table 6-1. The Most Common Collection "Shapes"

Class                     Description
List<T>                   This dynamic array contains objects of type T.
LinkedList<T>             This is a linked list of objects of type T.
Queue<T>                  Start at the back end of the line and end up at the front.
Stack<T>                  Always add or delete items at the "top" of the list, like a stack of cafeteria trays.
Dictionary<TKey, TValue>  This structure works like a dictionary. Look up a key (a word, for example) and retrieve its corresponding value (for example, definition).
HashSet<T>                This structure resembles a mathematical set, with no duplicate items. It works much like a list but provides mathematical set operations, such as union and intersection.
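A few lines of code can give you a feel for several of these shapes. The following is a rough sketch only: the ShapesDemo class, its method names, and the sample data are all invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public class ShapesDemo
{
  // Queue: the first one into the line is the first one out.
  public static string NextInLine()
  {
    Queue<string> line = new Queue<string>();
    line.Enqueue("Ann");
    line.Enqueue("Bob");
    return line.Dequeue();
  }

  // Stack: the last tray put on top is the first one taken off.
  public static int TopTray()
  {
    Stack<int> trays = new Stack<int>();
    trays.Push(1);
    trays.Push(2);
    trays.Push(3);
    return trays.Pop();
  }

  // Dictionary: look up a key and get back its value.
  public static string LookUp(string word)
  {
    Dictionary<string, string> glossary = new Dictionary<string, string>();
    glossary["array"] = "a fixed-size list of items";
    string definition;
    return glossary.TryGetValue(word, out definition) ? definition : "(not found)";
  }

  // HashSet: set operations such as intersection.
  public static int CountShared()
  {
    HashSet<int> a = new HashSet<int> { 1, 2, 3 };
    HashSet<int> b = new HashSet<int> { 2, 3, 4 };
    a.IntersectWith(b);        // a now holds only the items found in both sets.
    return a.Count;
  }

  public static void Main(string[] args)
  {
    Console.WriteLine("Next in line: " + NextInLine());
    Console.WriteLine("Top tray: " + TopTray());
    Console.WriteLine("array means " + LookUp("array"));
    Console.WriteLine("Shared items: " + CountShared());
  }
}
```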

Figuring out <T>

In the mysterious-looking <T> notation you see in Table 6-1, <T> is a placeholder for a particular data type. To bring this symbolic object to life, instantiate it by inserting a real type, like this:

List<int> intList = new List<int>(); // Instantiating for int

Instantiate is geekspeak for "Create an object (instance) of this type."

For example, you might create different List<T> instantiations for types int, string, and Student. By the way, T isn't a sacred name. You can use anything you like, such as <dummy> or <aType>. It's common to use T, U, V, and so on.

Notice how I express the Dictionary<TKey, TValue> collection in Table 6-1. Here, two types are needed: one for the dictionary's keys and one for the values associated with the keys. I cover dictionaries later, in the section "Using Dictionaries."

Tip

If this notation seems a bit forbidding, don't worry. You get used to it.

Going generic

These modern collections are known as generic collections, in the sense that you can fill in a blank template, of sorts, with a type (or types) in order to create a custom collection. If the generic List<T> seems puzzling, check out Chapter 8 in this minibook. That chapter discusses the generic C# facilities in more detail. In particular, the chapter shows you how to roll your own generic collections, classes, methods, and other types.

Using Lists

Suppose you need to store a list of MP3 objects, each of which represents one item in your MP3 music collection. As an array, it might look like this:

MP3[] myMP3s = new MP3[50];         // Start with an empty array.
myMP3s[0] = new MP3("Norah Jones"); // Create an MP3 and add it to the array.
// ... and so on.

With a list collection, it looks like this:

List<MP3> myMP3s = new List<MP3>();   // An empty list
myMP3s.Add(new MP3("Avril Lavigne")); // Call the list's Add() method to add.
// ... and so on.

So what, you say? These examples look similar, and the list doesn't appear to provide any advantage over the array. But what happens when you add the 50th MP3 to the array and then want to add a 51st? You're out of room. Your only course is to declare a new, larger array and then copy all MP3s from the old array into the new one. Also, if you remove an MP3 from the array, your array is left with a gaping hole. What do you put into that empty slot to take the place of the MP3 you ditched? The value null, maybe?

The list collection sails happily on, in the face of those same obstacles. Want to add MP3 number 51? No problem. Want to junk your old Pat Boone MP3s? (Are there any?) No problem. The list takes care of healing itself after you delete old Pat.
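Here's roughly what that extra work looks like next to the list's way of doing things. This sketch uses plain strings as stand-ins for MP3 objects, and the GrowDemo class and Grow method are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;

public class GrowDemo
{
  // Makes a larger array and copies the old contents into it.
  public static string[] Grow(string[] old, int newSize)
  {
    string[] bigger = new string[newSize];
    Array.Copy(old, bigger, old.Length);
    return bigger;
  }

  public static void Main(string[] args)
  {
    // The array way: you manage the room yourself.
    string[] titles = new string[50];
    titles[0] = "Track 1";
    titles = Grow(titles, 51);       // Out of room, so build a bigger array.
    titles[50] = "Track 51";

    // The list way: just keep adding.
    List<string> titleList = new List<string>();
    for (int i = 1; i <= 51; i++)
    {
      titleList.Add("Track " + i);
    }
    titleList.Remove("Track 7");     // Later items shift down; no hole is left.

    Console.WriteLine(titles.Length + " array slots, " + titleList.Count + " list items");
  }
}
```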

Warning

If your list (or array, for that matter) can contain null items, be sure to check for null when you're looping through with for or foreach. You don't want to call the Play() method on a null MP3 item. It results in a NullReferenceException at runtime.
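A null check inside the loop might look like the following sketch. The MP3 class here is a bare-bones stand-in that I made up for the example (the real one could look quite different), and the NullCheckDemo class and PlayAll method are hypothetical too.

```csharp
using System;
using System.Collections.Generic;

// A minimal, invented MP3 class, just enough to have something to play.
public class MP3
{
  public string Artist;
  public MP3(string artist) { Artist = artist; }
  public string Play() { return "Playing " + Artist; }
}

public class NullCheckDemo
{
  // Plays every non-null item and returns how many were played.
  public static int PlayAll(List<MP3> myMP3s)
  {
    int played = 0;
    foreach (MP3 mp3 in myMP3s)
    {
      if (mp3 == null)
      {
        continue;                    // Skip the empty slot.
      }
      Console.WriteLine(mp3.Play());
      played++;
    }
    return played;
  }

  public static void Main(string[] args)
  {
    List<MP3> myMP3s = new List<MP3>();
    myMP3s.Add(new MP3("Norah Jones"));
    myMP3s.Add(null);                // A list of objects can hold null.
    myMP3s.Add(new MP3("Avril Lavigne"));
    Console.WriteLine("Played " + PlayAll(myMP3s) + " of " + myMP3s.Count);
  }
}
```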

Note

The ListCollection example on this book's Web site shows some of the things you can do with List<T>. In the following code listing, I'll intersperse explanations with bits of code.

The following code (excerpted from the example) shows how to instantiate a new, empty list for the string type. In other words, this list can hold only strings:

// List<T>: note angle brackets plus parentheses in
// List<T> declaration; T is a "type parameter",
// List<T> is a "parameterized type."
// Instantiate for string type.
List<string> nameList = new List<string>();
nameList.Add("one");
nameList.Add(3);                         // Compiler error here!
nameList.Add(new Student("du Bois"));    // Compiler error here!

You add items to a List<T> by using its Add() method. The preceding code snippet successfully adds one string to the list, but then it runs into trouble trying to add first an integer and then a Student. The list was instantiated for strings, so the compiler rejects both attempts.

The next code fragment instantiates a completely new list for type int and then adds two int values to the list. Afterward, the foreach loop iterates the int list, printing out the ints:

// Instantiate for int.
List<int> intList = new List<int>();
intList.Add(3);                          // Fine.
intList.Add(4);
Console.WriteLine("Printing intList:");
foreach(int i in intList)  // foreach just works for all collections.
{
  Console.WriteLine("int i = " + i);
}

The following bit of code instantiates a new list to hold Students and adds two students with its Add() method. But then notice the array of Students, which I add to the student list using its AddRange() method. AddRange() lets you add a whole array or (almost) any other collection to the list, all at once:

// Instantiate for Student.
List<Student> studentList = new List<Student>();
Student student1 = new Student("Vigil");
Student student2 = new Student("Finch");
studentList.Add(student1);
studentList.Add(student2);
Student[] students = { new Student("Mox"), new Student("Fox") };
studentList.AddRange(students); // Add whole array to List.
Console.WriteLine("Num students in studentList = " + studentList.Count);

(Don't worry about the "new Student" stuff. I get to that topic in Book II.)

Tip

You can easily convert lists to arrays and vice versa. To put an array into a list, use the list's AddRange() method as just described. To convert a list to an array, call the list's ToArray() method:

Student[] students = studentList.ToArray();  // studentList is a List<Student>.

List<T> also has a number of other methods for adding items, including methods to insert one or more items anywhere in the list and methods to remove items or clear the list. Note that List<T> also has a Count property. (This single nit can trip you up if you're used to the Length property on arrays and strings. For collections, it's Count.)
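As a quick illustration of those other methods, here's a minimal sketch that isn't part of the book's example program. It assumes a small list of strings and uses Insert(), InsertRange(), Remove(), and Clear(), checking Count along the way. The class name is made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

public class ListEditingSketch
{
  public static void Main(string[] args)
  {
    List<string> names = new List<string> { "one", "four" };
    names.Insert(1, "two");                            // one, two, four
    names.InsertRange(2, new string[] { "three" });    // one, two, three, four
    Console.WriteLine("Count after inserts: " + names.Count);   // 4

    bool removed = names.Remove("four");               // Removes by value; true if found.
    Console.WriteLine("Removed four? " + removed);
    names.Clear();                                     // Empties the list.
    Console.WriteLine("Count after Clear(): " + names.Count);   // 0
  }
}
```

Remove() takes the item itself, whereas RemoveAt() takes an index, so pick whichever matches what you know about the item.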

The next snippet demonstrates several ways to search a list: IndexOf() returns the array-style index of an item within the list, if found, or −1 if not found. The code also demonstrates accessing an item with array-style indexing and testing whether the list holds an item with the Contains() method. Other searching methods include BinarySearch(), which the example doesn't show:

// Search with IndexOf().
Console.WriteLine("Student2 at " + studentList.IndexOf(student2));
string name = studentList[3].Name;  // Access list by index.
if(studentList.Contains(student1))  // student1 is a Student object.
{
  Console.WriteLine(student1.Name + " contained in list");
}
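BinarySearch() is mentioned above but doesn't appear in the book's example. The following is a minimal sketch of how it might be used, assuming a plain List<int> rather than the Student list. BinarySearch() gives meaningful results only on a list that's already sorted, so the sketch calls Sort() first. The class and the SortAndFind() helper are invented for the illustration.

```csharp
using System;
using System.Collections.Generic;

public class BinarySearchSketch
{
  // Hypothetical helper: sorts the list, then returns the index of target,
  // or a negative number if target isn't in the list.
  public static int SortAndFind(List<int> list, int target)
  {
    list.Sort();                       // BinarySearch() requires a sorted list.
    return list.BinarySearch(target);
  }

  public static void Main(string[] args)
  {
    List<int> numbers = new List<int> { 42, 7, 19, 3 };
    int index = SortAndFind(numbers, 19);   // Sorted order: 3, 7, 19, 42.
    Console.WriteLine("19 found at index " + index);
    int missing = SortAndFind(numbers, 5);
    Console.WriteLine("5 found? " + (missing >= 0));
  }
}
```

Unlike IndexOf(), which returns exactly −1 for a missing item, BinarySearch() returns some negative number, so test for less than zero rather than for −1.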

The final code segment demonstrates several more List<T> operations, including sorting, inserting, and removing items:

studentList.Sort(); // Assumes Student implements IComparable interface (Ch 14).
studentList.Insert(3, new Student("Ross"));   // Ross is now at index 3.
string removedName = studentList[3].Name;     // Note the name before removing.
studentList.RemoveAt(3);  // Deletes the element at index 3 (the fourth element).
Console.WriteLine("removed " + removedName);
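The Sort() call above works only if Student objects know how to compare themselves with one another. As a rough sketch of what that might look like, assume a stripped-down Student class with nothing but a Name property (the book's real Student class may well differ). Such a class can implement the generic IComparable<Student> interface like this:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, stripped-down Student; not the book's actual class.
public class Student : IComparable<Student>
{
  public string Name { get; private set; }
  public Student(string name) { Name = name; }

  // Sort() calls this method to decide which of two students comes first.
  public int CompareTo(Student other)
  {
    if (other == null) return 1;   // By convention, any object sorts after null.
    return string.CompareOrdinal(Name, other.Name);
  }
}

public class SortSketch
{
  public static void Main(string[] args)
  {
    List<Student> studentList = new List<Student>
      { new Student("Vigil"), new Student("Finch"), new Student("Mox"), new Student("Fox") };
    studentList.Sort();            // Uses Student.CompareTo().
    foreach (Student s in studentList)
    {
      Console.WriteLine(s.Name);   // Finch, Fox, Mox, Vigil
    }
  }
}
```

I chose an ordinal (character-by-character) comparison here only to keep the result predictable; a real program might prefer a culture-aware comparison for names.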

That's only a sampling of the List<T> methods. You can look up the full list in Help.

Tip

To look up generic collections you have to look in the Help index for the term List<T>. If you try searching for just List, you'll be lost in a list of lists of lists. If you want to see information about the whole set of collection classes (well, the generic ones), search the index for generic collections.

Using Dictionaries

You've no doubt used Webster's or another dictionary. It's organized as a bunch of words in alphabetical order. Associated with each word is a body of information including pronunciations, definitions, and other information. To use a dictionary, you look up a word and retrieve its information.

In C#, the dictionary "shape" differs from the list shape. Dictionaries are represented by the Dictionary<TKey, TValue> class. TKey represents the data type used for the dictionary's keys (similar to the words in a standard dictionary or the terms you look up). TValue represents the data type used to store the information or data associated with a key (similar to the word's definitions in Webster's).

Note

.NET dictionaries are based on the idea of a hash table. Imagine a group of buckets spread around the floor. When you compute a hash, using a hash function, you get a value that specifies only one of the buckets. That same hash always points to the same bucket. If the hash is computed properly, you should see a good, fairly even distribution of items spread among the buckets. Thus the hash is a key to one of the buckets. Provide the key to retrieve the bucket's contents — its value.
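To make the bucket picture concrete, here's a toy sketch. It is not how Dictionary<TKey, TValue> is actually implemented; it only illustrates the idea that a hash picks one bucket and that the same key always picks the same bucket. The bucket count of 8 and the BucketFor() helper are invented for the illustration.

```csharp
using System;

public class BucketSketch
{
  // Hypothetical helper: maps a key's hash code onto one of bucketCount buckets.
  public static int BucketFor(object key, int bucketCount)
  {
    int hash = key.GetHashCode() & 0x7FFFFFFF;  // Clear the sign bit so the result isn't negative.
    return hash % bucketCount;
  }

  public static void Main(string[] args)
  {
    int[] keys = { 3, 11, 42, 3 };
    foreach (int key in keys)
    {
      // The same key (3) lands in the same bucket both times.
      Console.WriteLine("Key " + key + " -> bucket " + BucketFor(key, 8));
    }
  }
}
```

Two different keys can land in the same bucket, which is why a good hash function tries to spread keys evenly.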

Note

Using dictionaries is no harder in C# than in high school. The following DictionaryExample program (excerpts) shows a few things you can do with dictionaries. To save a little space, I show just parts of the Main() method.

Note

If you find the going a bit rough here, you may want to circle back later.

The first piece of the code just creates a new Dictionary object that has string keys and string values. You aren't limited to strings, though. Either the key or the value, or both, can be any type. Note that the Add() method requires both a key and a value.

Dictionary<string, string> dict = new Dictionary<string, string>();
//  Add(key, value).
dict.Add("C#", "cool");
dict.Add("C++", "like writing Sanskrit poetry in Morse code");
dict.Add("VB", "a simple but wordy language");
dict.Add("Java", "good, but not C#");
dict.Add("Fortran", "ANCNT");  // 6-letters-max variable name for "ancient."
dict.Add("Cobol", "even more wordy, or is it wordier, and verbose than VB");

The ContainsKey() method tells you whether the dictionary contains a particular key. There's a corresponding ContainsValue() method too:

// See if the dictionary contains a particular key.
Console.WriteLine("Contains key C# " + dict.ContainsKey("C#"));     // True
Console.WriteLine("Contains key Ruby " + dict.ContainsKey("Ruby")); // False
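Knowing that a key exists is only half the job; usually you want the value that goes with it. The following sketch, which isn't part of the DictionaryExample excerpts, shows two standard ways to retrieve a value: the indexer, which throws a KeyNotFoundException for a missing key, and TryGetValue(), which reports failure through its return value instead. The Describe() helper is made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

public class LookupSketch
{
  // Hypothetical helper: returns the value for key, or a fallback if the key is missing.
  public static string Describe(Dictionary<string, string> dict, string key)
  {
    string value;
    if (dict.TryGetValue(key, out value))   // True if the key was found.
    {
      return value;
    }
    return "(no entry)";
  }

  public static void Main(string[] args)
  {
    Dictionary<string, string> dict = new Dictionary<string, string>();
    dict.Add("C#", "cool");
    dict.Add("Java", "good, but not C#");

    Console.WriteLine(dict["C#"]);             // Indexer: fine, the key exists.
    Console.WriteLine(Describe(dict, "Java"));
    Console.WriteLine(Describe(dict, "Ruby")); // Missing key, so the fallback comes back.
  }
}
```

TryGetValue() is usually the safer choice when you aren't sure the key is present, because it does the check and the lookup in one step.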

You can, of course, iterate the dictionary in a loop just as you can in any collection. But keep in mind that the dictionary is like a list of pairs of items — think of each pair as an object that contains both the key and the value. So to iterate the whole dictionary with foreach, you need to retrieve one of the pairs each time through the loop. The pairs are objects of type KeyValuePair<TKey, TValue>. In the WriteLine() call, I use the pair's Key and Value properties to extract the items. Here's what it looks like:

// Iterate the dictionary's contents with foreach.
// Note that you're iterating pairs of keys and values.
Console.WriteLine("\nContents of the dictionary:");
foreach (KeyValuePair<string, string> pair in dict)
{
// Because the key happens to be a string, we can call string methods on it.
  Console.WriteLine("Key: " + pair.Key.PadRight(8) + "Value: " + pair.Value);
}

In the final segment of the example program, you can see how to iterate just the keys or just the values. The dictionary's Keys property returns another collection: a list-shaped collection of type Dictionary<TKey, TValue>.KeyCollection. Because the keys happen to be strings, you can iterate the keys as strings and call string methods on them. The Values property is similar. The final bit of code uses the dictionary's Count property to see how many key/value pairs it contains.

// List the keys, which are in no particular order.
Console.WriteLine("\nJust the keys:");
// Dictionary<TKey, TValue>.KeyCollection is a collection of just the keys,
// in this case strings. So here's how to retrieve the keys:
Dictionary<string, string>.KeyCollection keys = dict.Keys;
foreach(string key in keys)
{
  Console.WriteLine("Key: " + key);
}

// List the values, which are in same order as key collection above.
Console.WriteLine("\nJust the values:");
Dictionary<string, string>.ValueCollection values = dict.Values;
foreach (string value in values)
{
  Console.WriteLine("Value: " + value);
}
Console.Write("\nNumber of items in the dictionary: " + dict.Count);

Of course, that doesn't exhaust the possibilities for working with dictionaries. Look up generic dictionary in the Help index for all the details.

Note

Dictionary pairs are in no particular order, and you can't sort a dictionary. It really is just like a bunch of buckets spread around the floor.
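If you need the entries in a predictable order anyway, one workaround is to copy the keys into a List<string>, sort that list, and then look up each value. This is only a sketch of that idea, not the sole approach, and the SortedKeys() helper is invented for the illustration.

```csharp
using System;
using System.Collections.Generic;

public class SortedKeysSketch
{
  // Hypothetical helper: returns the dictionary's keys as a sorted list.
  public static List<string> SortedKeys(Dictionary<string, string> dict)
  {
    List<string> keys = new List<string>(dict.Keys);  // Copy the keys.
    keys.Sort(StringComparer.Ordinal);                // Sort the copy, not the dictionary.
    return keys;
  }

  public static void Main(string[] args)
  {
    Dictionary<string, string> dict = new Dictionary<string, string>();
    dict.Add("VB", "a simple but wordy language");
    dict.Add("Cobol", "wordier still");
    dict.Add("Java", "good, but not C#");
    foreach (string key in SortedKeys(dict))
    {
      Console.WriteLine(key + ": " + dict[key]);   // Cobol, Java, VB
    }
  }
}
```

The dictionary itself stays unsorted; only the copied list of keys is put in order.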

Array and Collection Initializers

In this section, I summarize initialization techniques for both arrays and collections — both old-style and new. You may want to bend the page corner.

Initializing arrays

Note

As a reminder, given the new var syntax covered earlier in this chapter, an array declaration can look like either of these examples:

int[] numbers = { 1, 2, 3 };           // Shorter form -- can't use var.
var numbers = new [] { 1, 2, 3 };      // Full initializer mandatory with var.

Initializing collections

Meanwhile, the traditional way to initialize a collection, such as a List<T> — or a Queue<T> or Stack<T> — back in the C# 2.0 days (a number of years ago), was this:

List<int> numList = new List<int>();        // New empty list.
numList.Add(1);                             // Add elements one at a time.
numList.Add(2);
numList.Add(3);                             // ...tedious!

Or, if you had the numbers in an array or another collection already, it went like this:

List<int> numList = new List<int>(numbers); // Initializing from an array or...
List<int> numList2 = new List<int>(numList);// from another collection or...
numList.AddRange(numbers);                  // using AddRange

Note

When initializing lists, queues, or stacks as shown here, you can pass in any array or list-like collection, including lists, queues, stacks, and the new sets, which I cover in the next section (but not dictionaries — their shape is wrong). The MoreCollections example on the Web site illustrates several cases of initializing one collection from another.
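Here's a minimal sketch of that kind of chaining. It isn't taken from the MoreCollections example; it simply assumes a small int array and passes each collection to the next one's constructor. The class name is made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

public class InitFromCollectionSketch
{
  public static void Main(string[] args)
  {
    int[] numbers = { 1, 2, 3 };
    List<int> numList = new List<int>(numbers);     // List from an array.
    Queue<int> queue = new Queue<int>(numList);     // Queue from a list.
    Stack<int> stack = new Stack<int>(queue);       // Stack from a queue.

    Console.WriteLine("Front of queue: " + queue.Peek());  // 1, the first item added.
    Console.WriteLine("Top of stack: " + stack.Peek());    // 3, the last item pushed.
    Console.WriteLine("Count of each: " + numList.Count);  // 3
  }
}
```

Notice that the stack's top item is the last one in the source collection, because the constructor pushes the items in order.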

Note

Since C# 3.0, collection initializers resemble the new array initializers and are much easier to use than most of the earlier forms. The new initializers look like this:

List<int> numList = new List<int> { 1, 2, 3 };  // List
int[] intArray = { 1, 2, 3 };                   // Array

The key difference between the new array and collection initializers is that you still must spell out the type for collections, which means giving List<int> after the new keyword, as the first line of the preceding example does.

Note

Of course, you can also use the var keyword with collections:

var list = new List<string> { "Head", "Heart", "Hands", "Health" };

You can also use the new dynamic keyword:

dynamic list = new List<string> { "Head", "Heart", "Hands", "Health" };

Initializing dictionaries with the new syntax looks like this:

Dictionary<int, string> dict =
  new Dictionary<int, string> { { 1, "Sam" }, { 2, "Joe" } };

Outwardly, this example looks the same as for List<T>, but inside the outer curly braces, you see a second level of curly-brace-enclosed items, one per entry in the dictionary. Because this dictionary dict has integer keys and string values, each inner pair of curly braces contains one of each, separated by a comma. The key/value pairs are separated by commas as well.

Initializing sets (see the next section) is much like initializing lists:

HashSet<int> biggerPrimes = new HashSet<int> { 19, 23, 29, 31, 37, 41 };

Note

The UsingVarWithArraysAndCollections example on this book's Web site demonstrates the var keyword used with arrays and collections.

Using Sets

C# 3.0 added the new collection type HashSet<T>. A set is an unordered collection with no duplicate items.

The set concept comes from mathematics. Think of the set of genders (female and male), the set of days in a week, or the set of variations on the triangle (isosceles, equilateral, scalene, right, obtuse). Unlike math sets, C# sets can't be infinite, though they can be as large as available memory.

You can do things to a set in common with other collections, such as add, delete, and find items. But you can also perform several specifically set-like operations, such as union and intersection. Union joins the members of two sets into one. Intersection finds the overlap between two sets and results in a set containing only the overlapping members. So sets are good for combining and eliminating items.

Note

Like dictionaries, sets are implemented using hash tables. Sets resemble dictionaries with keys but no values, making them list-like in shape. See the earlier section "Using Dictionaries" for details.

Note

To create a HashSet<T>, you can do this:

HashSet<int> smallPrimeNumbers = new HashSet<int>();
smallPrimeNumbers.Add(2);
smallPrimeNumbers.Add(3);

Or, more conveniently, you can use a collection initializer:

HashSet<int> smallPrimeNumbers = new HashSet<int> { 2, 3, 5, 7, 11, 13 };

Or create the set from an existing collection of any list-like kind, including arrays:

List<int> intList = new List<int> { 0, 1, 2, 3, 4, 5, 6, 7 };
HashSet<int> numbers = new HashSet<int>(intList);

If you attempt to add to a hash set an item that the set already contains, as in this example:

smallPrimeNumbers.Add(2);

neither the compiler nor the running program treats the duplication as an error (and the hash set, which can't have duplicates, stays unchanged). Actually, Add() returns true if the addition occurred and false if it didn't. You don't have to use that fact, but it can be useful if you want to do something when an attempt is made to add a duplicate:

bool successful = smallPrimeNumbers.Add(2);
if(successful)
{
  // 2 was added, now do something useful.
}
// If successful is false, not added because it was already there
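One practical use of that return value is spotting duplicates in some other collection. The following is a minimal sketch under that assumption; the FindDuplicates() helper and the sample numbers are invented for the illustration.

```csharp
using System;
using System.Collections.Generic;

public class DuplicateFinderSketch
{
  // Hypothetical helper: returns the items that occur more than once in the input.
  public static List<int> FindDuplicates(int[] items)
  {
    HashSet<int> seen = new HashSet<int>();
    List<int> duplicates = new List<int>();
    foreach (int item in items)
    {
      // Add() returns false when the item is already in the set.
      if (!seen.Add(item) && !duplicates.Contains(item))
      {
        duplicates.Add(item);
      }
    }
    return duplicates;
  }

  public static void Main(string[] args)
  {
    int[] numbers = { 2, 3, 5, 3, 7, 2, 2 };
    foreach (int n in FindDuplicates(numbers))
    {
      Console.WriteLine("Duplicate: " + n);   // 3, then 2
    }
  }
}
```

The set does the remembering for you, so the loop never has to search the original array a second time.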

Note

The following example (the HashSetExample on the Web site) shows off several HashSet<T> methods but, more important, demonstrates using a HashSet<T> as a tool for working with other collections. You can do strictly mathematical operations with HashSet<T>, but I find its ability to combine collections in various ways quite handy.

The first segment of this code starts with a List<string> and an array. Each contains color names. Though you could combine the two by simply calling the list's AddRange() method:

colors.AddRange(moreColors);

the resulting list contains some duplicates (yellow, orange). Using a HashSet<T> and the UnionWith() method, on the other hand, you can combine two collections and eliminate any duplicates in one shot, as the following example shows.

Note

Here's the beginning of the HashSetExample on this book's Web site:

Console.WriteLine("Combining two collections with no duplicates:");
List<string> colors = new List<string> { "red", "orange", "yellow" };
string[] moreColors = { "orange", "yellow", "green", "blue", "violet" };
// Want to combine but without any duplicates.
// Following is just the first stage ...
HashSet<string> combined = new HashSet<string>(colors);
// ... now for the second stage.
// UnionWith() collects items in both lists that aren't duplicated,
// resulting in a combined collection whose members are all unique.
combined.UnionWith(moreColors);
foreach (string color in combined)
{
  Console.WriteLine(color);
}

The result given here contains "red", "orange", "yellow", "green", "blue", and "violet". The first stage uses the colors list to initialize a new HashSet<T>. The second stage then calls the set's UnionWith() method to add in the moreColors array, but adding only the ones not already in the set. The set ends up containing every color that appears in either original list, each one exactly once. Green, blue, and violet come from the second list; red, orange, and yellow come from the first. The moreColors array's orange and yellow would duplicate the ones already in the set, so they're screened out.

But suppose that you want to end up with a List<T> containing those colors, not a HashSet<T>. The next segment shows how to create a new List<T> initialized with the combined set:

Console.WriteLine("\nConverting the combined set to a list:");
// Initialize a new List from the combined set above.
List<string> spectrum = new List<string>(combined);
foreach(string color in spectrum)
{
  Console.WriteLine(color);
}

Back when these examples were written, the 2008 U.S. presidential campaign was in full swing, with about ten early candidates in each major party. A good many of those candidates were also members of the U.S. Senate. How can you produce a list of just the candidates who are also in the Senate? The HashSet<T> IntersectWith() method gives you the overlapping items between the candidate list and the Senate list — items in both lists, but only those items:

Console.WriteLine("\nFinding the overlap in two lists:");
List<string> presidentialCandidates =
  new List<string> { "Clinton", "Edwards", "Giuliani", "McCain", "Obama", "Romney" };
List<string> senators = new List<string> { "Alexander", "Boxer", "Clinton", "McCain", "Obama", "Snowe" };
HashSet<string> senatorsRunning = new HashSet<string>(presidentialCandidates);
// IntersectWith() collects items that appear in both lists, eliminates others.
senatorsRunning.IntersectWith(senators);
foreach (string senator in senatorsRunning)
{
  Console.WriteLine(senator);
}

The result is "Clinton", "McCain", "Obama" because those are the only ones in both lists. The opposite trick is to remove any items that appear in both of two lists so that you end up with just the items in your target list that aren't duplicated in the other list. This calls for the HashSet<T> method ExceptWith():

Console.WriteLine("\nExcluding items from a list:");
Queue<int> queue =
  new Queue<int>(new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 17 });
HashSet<int> unique = new HashSet<int> { 1, 3, 5, 7, 9, 11, 13, 15 };
// ExceptWith() removes items in unique that are also in queue: 1, 3, 5, 7, 9.
unique.ExceptWith(queue);
foreach (int n in unique)
{
  Console.WriteLine(n.ToString());
}

After this code, unique no longer contains its own items that duplicate items in queue (1, 3, 5, 7, and 9). The items in queue that aren't in unique (0, 2, 4, 6, 8, and 17) have no effect, because ExceptWith() only removes items and never adds any. You end up with 11, 13, and 15 in unique.

Meanwhile, the next code segment uses the SymmetricExceptWith() method to create the opposite result from IntersectWith(). Whereas intersection gives you the overlapping items, SymmetricExceptWith() gives you the items that appear in one list or the other but not in both. The nonoverlapping set ends up containing just 5, 3, 1, 12, and 10:

Console.WriteLine("\nFinding just the non-overlapping items in two lists:");
  Stack<int> stackOne = new Stack<int>(new int[] { 1, 2, 3, 4, 5, 6, 7, 8 });
  Stack<int> stackTwo = new Stack<int>(new int[] { 2, 4, 6, 7, 8, 10, 12 });
  HashSet<int> nonoverlapping = new HashSet<int>(stackOne);
  // SymmetricExceptWith() collects items that are in one collection but not
  // the other: the items that don't overlap.
  nonoverlapping.SymmetricExceptWith(stackTwo);
  foreach(int n in nonoverlapping)
  {
    Console.WriteLine(n.ToString());
  }
  Console.WriteLine("Press Enter to terminate...");
  Console.Read();
}

My use of stacks here is a bit unorthodox because I add all members at one time rather than push each one, and I remove a bunch at a time rather than pop each one. Those operations — pushing and popping — are the correct ways to interact with a stack.

Notice that the HashSet<T> set operations I demonstrate (UnionWith(), IntersectWith(), ExceptWith(), and SymmetricExceptWith()) are void methods: They don't return a value. Thus the results are reflected directly in the hash set on which you call these methods, which is nonoverlapping in the preceding code example.

Note

I found the behavior of UnionWith() and IntersectWith() a bit awkward at first because I wanted a new resulting set, with the original (input) sets remaining the same when I applied these methods. But in Book II you meet (I'm happy to report) the new LINQ query operators, which add versions of these methods that return a whole new set object. Combining what you see here with what you see there, you get the best of both worlds. More than that I'd better not say now.
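Until you reach LINQ, one way to keep the input sets untouched is to copy the first set and apply the method to the copy. Here's a minimal sketch of that approach, with a made-up Union() helper and sample sets:

```csharp
using System;
using System.Collections.Generic;

public class CopyThenCombineSketch
{
  // Hypothetical helper: returns the union of two sets without changing either one.
  public static HashSet<int> Union(HashSet<int> first, HashSet<int> second)
  {
    HashSet<int> result = new HashSet<int>(first);  // Work on a copy of first.
    result.UnionWith(second);                       // Modifies only the copy.
    return result;
  }

  public static void Main(string[] args)
  {
    HashSet<int> odds = new HashSet<int> { 1, 3, 5 };
    HashSet<int> primes = new HashSet<int> { 2, 3, 5 };
    HashSet<int> both = Union(odds, primes);
    Console.WriteLine("odds still has " + odds.Count + " items");   // 3
    Console.WriteLine("union has " + both.Count + " items");        // 4
  }
}
```

The same copy-first trick works with IntersectWith(), ExceptWith(), and SymmetricExceptWith().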

When would you use HashSet<T>? Any time you're working with two or more collections and you want to find such items as the overlap — or create a collection that contains two other collections or exclude a group of items from a collection — sets can be useful. Many of the HashSet<T> methods can relate sets and other collection classes. You can do more with sets, of course, so look up the term HashSet<T> in Help and play with HashSetExample.
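As one small example of relating a set to other collection classes, the following sketch asks two yes-or-no questions with IsSubsetOf() and Overlaps(), passing a list and an array as the arguments. It isn't part of HashSetExample, and the color collections are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

public class SetRelationsSketch
{
  public static void Main(string[] args)
  {
    HashSet<string> warmColors = new HashSet<string> { "red", "orange", "yellow" };
    List<string> rainbow = new List<string>
      { "red", "orange", "yellow", "green", "blue", "violet" };
    string[] coolColors = { "green", "blue", "violet" };

    // True: every warm color is also in the rainbow list.
    Console.WriteLine("Subset of rainbow? " + warmColors.IsSubsetOf(rainbow));
    // False: the set and the array share no items.
    Console.WriteLine("Overlaps cool colors? " + warmColors.Overlaps(coolColors));
  }
}
```

Unlike UnionWith() and its cousins, these two methods return a bool and leave the set unchanged.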

On Not Using Old-Fashioned Collections

At the dawn of time, before C# 2.0, when Zarathustra spake, all collection classes were implemented as collections of type Object. You couldn't create a collection just for strings or just for ints. Such a collection lets you store any type of data, because all objects in C# are derived from class Object. Thus you can add both ints and strings to the same collection without seeing error messages (because of the inheritance and polymorphism of C#, which I discuss in Book II ).

But a serious drawback occurs in the Object-based arrangement: To extract the int that you know you put into a collection, you must cast the Object you get back to an int:

ArrayList ints = new ArrayList();  // An old-fashioned list of Objects
ints.Add(3);                       // The int goes in as an Object.
int myInt = (int)ints[0];          // Extract the first int in the list.

It's as though your ints were hidden inside Easter eggs. If you don't cast, you create errors because, for instance, Object doesn't support the + operation or other methods, properties, and operators that you expect on ints. You can work with these limitations, but this kind of code is error-prone, and it's just plain tedious to do all that casting. (Besides, as I discuss in Book II, working with Easter eggs adds some processing overhead because of the "boxing" phenomenon. Too much boxing slows your program.)

And, if the collection happens to contain objects of more than one type — pomegranates and basketballs, say — the problem becomes tougher. Somehow, you have to detect that the object you fish out is a pomegranate or a basketball so that you can cast it correctly.
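One common way to do that detecting is the is operator, which checks an object's type before you cast. Here's a minimal sketch, assuming a mixed ArrayList that holds just an int and a string; the Describe() helper and the contents are invented for the illustration.

```csharp
using System;
using System.Collections;

public class MixedBagSketch
{
  // Hypothetical helper: describes an item fished out of a nongeneric collection.
  public static string Describe(object item)
  {
    if (item is int)
    {
      int number = (int)item;          // Safe to cast: the check succeeded.
      return "int " + (number + 1) + " (after adding 1)";
    }
    if (item is string)
    {
      string text = (string)item;
      return "string of length " + text.Length;
    }
    return "something else";
  }

  public static void Main(string[] args)
  {
    ArrayList mixedBag = new ArrayList();
    mixedBag.Add(41);                  // An int, boxed as an Object.
    mixedBag.Add("pomegranate");       // A string.
    foreach (object item in mixedBag)
    {
      Console.WriteLine(Describe(item));
    }
  }
}
```

Every new type you toss into the bag means another branch in Describe(), which is exactly the tedium the generic collections spare you.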

With those limitations on the older, nongeneric collections, the newer generic ones are a gale of fresh air. You never have to cast, and you always know what you're getting because you can put only one type into any given collection. But you still see the older collections occasionally, in code that other people write — and sometimes you may even have a legitimate reason to stick apples and oranges in the same collection.

Note

The nongeneric collections are found in the System.Collections and System.Collections.Specialized namespaces. The Specialized collections are interesting, sometimes useful, oddball collections, and mainly nongeneric. The modern, generic ones are found in System.Collections.Generic. (I explain namespaces and generics in Book II, in my discussion of object-oriented programming.)
