Chapter 20. Enumerators and Iterators

Enumerators and Enumerable Types

In Chapter 14, you saw that you can use a foreach statement to cycle through the elements of an array. In this chapter, you'll take a closer look at arrays and see why they can be processed by foreach statements. You'll also look at how you can add this capability to your own user-defined classes. Later in the chapter, I'll discuss the use of iterators.

Using the foreach Statement

When you use the foreach statement with an array, the statement presents you with each element in the array, one by one, allowing you to read its value.

For example, the following code declares an array with four elements, and then uses a foreach loop to print out the values of the items:

   int[] arr1 = { 10, 11, 12, 13 };                // Define the array.

   foreach (int item in arr1)                      // Enumerate the elements.
      Console.WriteLine("Item value:  {0}", item);

This code produces the following output:

Item value:  10
Item value:  11
Item value:  12
Item value:  13

Why does this work, apparently magically, with arrays? The reason is that an array can produce, upon request, an object called an enumerator. The enumerator can return the elements of the array, one by one, in order, as they are requested. The enumerator "knows" the order of the items, and keeps track of where it is in the sequence. It then returns the current item when it is requested.

For types that have enumerators, there must be a way of retrieving them. The standard way of retrieving an object's enumerator in .NET is to call the object's GetEnumerator method. Types that implement a GetEnumerator method are called enumerable types, or just enumerables. Arrays are enumerables.

Figure 20-1 illustrates the relationship between enumerables and enumerators.


Figure 20-1. Overview of enumerators and enumerables

The foreach construct is designed to work with enumerables. As long as the object it is given to iterate over is an enumerable type, such as an array, it performs the following actions:

  • It gets the object's enumerator by calling the GetEnumerator method.

  • It requests each item from the enumerator and makes it available to your code as the iteration variable, which your code can read but not change.


Types of Enumerators

There are three variations on enumerators. They all work essentially the same way, with only slight differences. I will discuss all three types. You can implement enumerators using

  • The IEnumerator/IEnumerable interfaces—called the non-generic interface form

  • The IEnumerator<T>/IEnumerable<T> interfaces—called the generic interface form

  • The form that uses no interfaces

Using the IEnumerator Interface

This section will start by looking at the first in the preceding list: the non-generic interface form. This form of enumerator is a class that implements the IEnumerator interface. It is called non-generic because it does not use C# generics.

The IEnumerator interface contains three function members: Current, MoveNext, and Reset.

  • Current is a property that returns the item at the current position in the sequence.

    • It is a read-only property.

    • It returns a reference of type object, so an object of any type can be returned.

  • MoveNext is a method that advances the enumerator's position to the next item in the collection. It also returns a Boolean value, indicating whether the new position is a valid position or is beyond the end of the sequence.

    • If the new position is valid, the method returns true.

    • If the new position is not valid (i.e., it's at the end), the method returns false.

    • The initial position of the enumerator is before the first item in the sequence, so MoveNext must be called before the first access of Current. Accessing Current earlier is an error; the array enumerator, for example, throws an InvalidOperationException.

  • Reset is a method that resets the position to the initial state.

Figure 20-2 illustrates a collection of three items, which is shown on the left of the figure, and its enumerator, which is shown on the right. In the figure, the enumerator is an instance of a class called ArrEnumerator.


Figure 20-2. The enumerator for a small collection

The enumerator class is usually declared as a nested class of the class for which it is an enumerator. A nested class is one that is declared inside the declaration of another class. Nested classes are described in detail in Chapter 25.

The way the enumerator keeps track of the current item in the sequence is entirely implementation-dependent. It might be implemented as a reference to an object, an index value, or something else entirely. In the case of an array, it is simply the index of the item.

Figure 20-3 illustrates the states of an enumerator for a collection of three items. The states are labeled 1 through 5.

  • Notice that in state 1, the initial position of the enumerator is −1 (i.e., before the first element of the collection).

  • Each transition between states is caused by a call to MoveNext, which advances the position in the sequence. Each call to MoveNext between states 1 and 4 returns true. In the transition between states 4 and 5, however, the position ends up beyond the last item in the collection, so the method returns false.

  • In the final state, any further calls to MoveNext return false.


Figure 20-3. The states of an enumerator
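The transitions can be observed directly on an array's enumerator. The following is a minimal sketch, assuming a three-item array like the one in the figure; the class name StateTrace and the variable names are chosen here for illustration only.

```csharp
using System;
using System.Collections;

class StateTrace
{
   static void Main()
   {
      int[] items = { 10, 11, 12 };                 // Three items, as in the figure.
      IEnumerator ie = items.GetEnumerator();       // State 1: before the first item.

      for (int call = 1; call <= 5; call++)
      {
         bool valid = ie.MoveNext();                // Each call tries to advance the position.
         Console.WriteLine("Call {0}: MoveNext returned {1}", call, valid);
      }
   }
}
```

The first three calls report True, and the fourth and fifth report False, since the position is by then beyond the last item.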

Given a collection's enumerator, you should be able to simulate a foreach loop by cycling through the items in the collection using the MoveNext and Current members. For example, you know that arrays are enumerable, so the following code does manually what the foreach statement does automatically. The output is the same as if it were in a foreach loop.

   static void Main()
   {
      int[] MyArray = { 10, 11, 12, 13 };           // Create an array.

      IEnumerator ie = MyArray.GetEnumerator();     // Get its enumerator.
      while ( ie.MoveNext() )                       // Move to the next item.
      {
         int i = (int) ie.Current;                  // Get the current item.
         Console.WriteLine("{0}", i);               // Write it out.
      }
   }

This code produces the following output:

10
11
12
13

Declaring an IEnumerator Enumerator

To create a non-generic interface enumerator class, you must declare a class that implements the IEnumerator interface. The IEnumerator interface has the following characteristics:

  • It is a member of the System.Collections namespace.

  • It contains the three members Current, MoveNext, and Reset.

The following code shows the outline of a non-generic enumerator class. It does not show how the position is maintained. Notice that Current returns a reference to an object.
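As a rough sketch of such an outline, the class below declares the three members with placeholder bodies. The class name MyEnumerator and the NotImplementedException placeholders are assumptions made for illustration; a real enumerator would replace them with its own position-tracking logic.

```csharp
using System;
using System.Collections;                  // IEnumerator is declared here.

class MyEnumerator: IEnumerator
{
   public object Current                   // Read-only; returns a reference of type object.
   {
      get { throw new NotImplementedException(); }   // Placeholder body
   }

   public bool MoveNext()                  // Advances; reports whether the new position is valid.
   {
      throw new NotImplementedException();           // Placeholder body
   }

   public void Reset()                     // Returns to the initial position.
   {
      throw new NotImplementedException();           // Placeholder body
   }
}
```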


For example, the ColorEnumerator class in the complete listing under "Example Using IEnumerable and IEnumerator," later in this section, implements an enumerator that lists an array of color names.

The IEnumerable Interface

The IEnumerable interface has only a single member, method GetEnumerator, which returns an enumerator for the object.

Figure 20-4 shows class MyClass, which has three items to enumerate, and implements the IEnumerable interface by implementing the GetEnumerator method.


Figure 20-4. The GetEnumerator method returns an enumerator object for the class.

The following code shows the form for the declaration of an enumerable class:
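A minimal sketch of that form might look like the following. The three-item array and the choice to hand back the array's own enumerator are stand-ins chosen here so that the sketch compiles on its own; an enumerable class would normally return an instance of its own enumerator class.

```csharp
using System.Collections;

class MyClass: IEnumerable                 // Implements the IEnumerable interface
{
   int[] items = { 1, 2, 3 };              // Hypothetical data to enumerate

   public IEnumerator GetEnumerator()      // The interface's only member
   {
      return items.GetEnumerator();        // Stand-in: return the array's enumerator.
   }
}
```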


The MyColors class in the complete listing under "Example Using IEnumerable and IEnumerator" is an example of an enumerable class. Its GetEnumerator method returns an instance of the ColorEnumerator class described earlier. Remember that ColorEnumerator implements IEnumerator.

Example Using IEnumerable and IEnumerator

Putting the MyColors and ColorEnumerator examples together, you can add a class called Program with a Main method that creates an instance of MyColors and uses it in a foreach loop.

   using System;
   using System.Collections;

   namespace ColorCollectionEnumerator
   {
      class ColorEnumerator: IEnumerator
      {
         string[] Colors;
         int Position = -1;

         public ColorEnumerator(string[] theColors)             // Constructor
         {
            Colors = new string[theColors.Length];
            for (int i = 0; i < theColors.Length; i++)
               Colors[i] = theColors[i];
         }

         public object Current                                  // Current
         {
            get { return Colors[Position]; }
         }

         public bool MoveNext()                                 // MoveNext
         {
            if (Position < Colors.Length - 1)
            {
               Position++;
               return true;
            }
            else
               return false;
         }

         public void Reset()                                    // Reset
         { Position = -1; }
      }

      class MyColors: IEnumerable
      {
         string[] Colors = { "Red", "Yellow", "Blue" };

         public IEnumerator GetEnumerator()
         {
            return new ColorEnumerator(Colors);
         }
      }

      class Program
      {
         static void Main()
         {
            MyColors mc = new MyColors();
            foreach (string color in mc)
               Console.WriteLine("{0}", color);
         }
      }

   }

This code produces the following output:

Red
Yellow
Blue

The Non-Interface Enumerator

You've just seen how to use the IEnumerable and IEnumerator interfaces to create useful enumerables and enumerators. But there are several drawbacks to this method.

First, remember that the object returned by Current is of type object. For value types, this means that before they are returned by Current, they must be boxed to turn them into objects. They must then be unboxed again after they have been received from Current. This can exact a substantial performance penalty if it needs to be done on large amounts of data.

Another drawback of the non-generic interface method is that you've lost type safety. The values being enumerated are being handled as objects, and so can be of any type. This eliminates the safety of compile-time type checking.

You can solve these problems by making the following changes to the enumerator/enumerable class declarations.

  • For the enumerator class

    • Do not derive the class from IEnumerator.

    • Implement MoveNext just as before.

    • Implement Current just as before, but have as its return type the type of the items being enumerated.

    • You do not have to implement Reset.

  • For the enumerable class

    • Do not derive the class from IEnumerable.

    • Implement GetEnumerator as before, but have its return type be the type of the enumerator class.

Figure 20-5 shows the differences. The non-generic interface code is on the left, and the non-interface code is on the right. With these changes, the foreach statement will be perfectly happy to process your collection, but without the drawbacks just listed.


Figure 20-5. Comparing interface-based and non-interface-based enumerators
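The changes listed above can be sketched as follows. This is an illustrative adaptation of the earlier color example, not the code from the figure: neither class names an interface, Current is typed as string, and Reset is omitted.

```csharp
using System;

class ColorEnumerator                           // No IEnumerator in the base list
{
   string[] Colors;
   int Position = -1;

   public ColorEnumerator(string[] theColors) { Colors = theColors; }

   public string Current                        // Typed as string rather than object
   {
      get { return Colors[Position]; }
   }

   public bool MoveNext()                       // Same logic as the interface version
   {
      if (Position < Colors.Length - 1)
      {
         Position++;
         return true;
      }
      return false;
   }
}

class MyColors                                  // No IEnumerable in the base list
{
   string[] Colors = { "Red", "Yellow", "Blue" };

   public ColorEnumerator GetEnumerator()       // Returns the enumerator class type
   {
      return new ColorEnumerator(Colors);
   }
}

class Program
{
   static void Main()
   {
      MyColors mc = new MyColors();
      foreach (string color in mc)              // foreach uses the public members directly.
         Console.WriteLine(color);
   }
}
```

Because Current already returns a string, no cast and no boxing are involved in the loop.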

One possible problem with the non-interface enumerator implementation is that types from other assemblies might expect enumeration to be implemented using the interface method. If these objects attempt to get an enumeration of your class objects using the interface conventions, they will not be able to find them.

To solve this problem, you can implement both forms in the same classes. That is, you can create implementations for Current, MoveNext, Reset, and GetEnumerator at the class level, and also create explicit interface implementations for them. With both sets of implementations, the type-safe, more efficient implementation will be called by foreach and other constructs that can use the non-interface implementations, while the other constructs will call the explicit interface implementations.

The Generic Enumeration Interfaces

The remaining form of enumerator uses the generic interfaces IEnumerable<T> and IEnumerator<T>. They are called generic because they use C# generics. Using them is very similar to using the non-generic forms. Essentially, the differences between the two are the following:

  • With the non-generic interface form

    • The GetEnumerator method of interface IEnumerable returns an enumerator class instance that implements IEnumerator.

    • The class implementing IEnumerator implements property Current, which returns a reference of type object, which you must then cast to the actual type of the object.

  • With the generic interface form

    • The GetEnumerator method of interface IEnumerable<T> returns an enumerator class instance that implements IEnumerator<T>.

    • The class implementing IEnumerator<T> implements property Current, which returns an object of the actual type, rather than a reference to the base class object.

An important point to notice is that the non-generic interface implementations are not type-safe. They return references to type object, which must then be cast to the actual types. With the generic interfaces, however, the enumerator is type-safe, returning references to the actual types.

The IEnumerator<T> Interface

The IEnumerator<T> interface uses generics to return an actual derived type, rather than an object of type object.

The IEnumerator<T> interface derives from two other interfaces: the non-generic IEnumerator interface and the IDisposable interface. A class that implements IEnumerator<T> must therefore implement their members as well.

  • You have already seen the non-generic IEnumerator interface and its three members.

  • The IDisposable interface has a single, void, parameterless method called Dispose, which can be used to free unmanaged resources being held by the class. (The Dispose method was described in Chapter 6.)

  • The IEnumerator<T> interface itself has a single member, the property Current, which returns an item of type T rather than an item of type object.

  • Since both IEnumerator<T> and IEnumerator have a member named Current, you should explicitly implement the IEnumerator version, and implement the generic version in the class itself, as shown in Figure 20-6.

Figure 20-6 illustrates the implementation of the interface.


Figure 20-6. Implementing the IEnumerator<T> interface

The declaration of the class implementing the interface should look something like the pattern in the following code, where T is the type returned by the enumerator:
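A minimal sketch of that pattern follows. The class name MyGenEnumerator and the placeholder bodies are assumptions for illustration; the point is the shape of the members, with the non-generic Current implemented explicitly.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class MyGenEnumerator<T>: IEnumerator<T>
{
   public T Current                              // IEnumerator<T>.Current, typed as T
   {
      get { throw new NotImplementedException(); }   // Placeholder body
   }

   object IEnumerator.Current                    // Explicit non-generic Current
   {
      get { return Current; }
   }

   public bool MoveNext() { throw new NotImplementedException(); }   // From IEnumerator
   public void Reset()    { throw new NotImplementedException(); }   // From IEnumerator
   public void Dispose()  { }                                        // From IDisposable
}
```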


For example, the following code implements the ColorEnumerator example using the generic enumerator interface:
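One way the generic ColorEnumerator could be written is sketched below. It is a reconstruction based on the non-generic version shown earlier, not the original listing; the Program class is added here only so that the sketch can be run on its own.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class ColorEnumerator: IEnumerator<string>
{
   string[] Colors;
   int Position = -1;

   public ColorEnumerator(string[] theColors)
   {
      Colors = (string[]) theColors.Clone();      // Copy the array, as the earlier version does.
   }

   public string Current                          // Generic Current, typed as string
   {
      get { return Colors[Position]; }
   }

   object IEnumerator.Current                     // Explicit non-generic Current
   {
      get { return Current; }
   }

   public bool MoveNext()
   {
      if (Position < Colors.Length - 1)
      {
         Position++;
         return true;
      }
      return false;
   }

   public void Reset()   { Position = -1; }
   public void Dispose() { }                      // No unmanaged resources to release
}

class Program
{
   static void Main()
   {
      using (var e = new ColorEnumerator(new[] { "Red", "Yellow", "Blue" }))
         while (e.MoveNext())
            Console.WriteLine(e.Current);         // No cast is needed.
   }
}
```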


The IEnumerable<T> Interface

The generic IEnumerable<T> interface is very similar to the non-generic version, IEnumerable. The generic version derives from IEnumerable, so a class that implements it must also implement the IEnumerable interface.

  • Like IEnumerable, the generic version also contains a single member, a method called GetEnumerator. This version of GetEnumerator, however, returns a class object implementing the generic IEnumerator<T> interface.

  • Since the class must implement two GetEnumerator methods, you should explicitly implement the non-generic version, and implement the generic version in the class itself, as shown in Figure 20-7.

Figure 20-7 illustrates the implementation of the interface.


Figure 20-7. Implementing the IEnumerable<T> interface

The following code shows a pattern for implementing the generic interface. T is the type returned by the enumerator.
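The pattern might be sketched as follows. The class name MyGenEnumerable and the List<T> backing store are assumptions chosen so that the sketch compiles on its own.

```csharp
using System.Collections;
using System.Collections.Generic;

class MyGenEnumerable<T>: IEnumerable<T>
{
   List<T> items = new List<T>();                 // Hypothetical backing store

   public IEnumerator<T> GetEnumerator()          // Generic version, at the class level
   {
      return items.GetEnumerator();
   }

   IEnumerator IEnumerable.GetEnumerator()        // Non-generic version, explicit
   {
      return GetEnumerator();                     // Delegate to the generic version.
   }
}
```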


For example, the following code shows the use of the generic enumerable interface:
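A short sketch of a generic enumerable in use is shown below. To keep it self-contained, this version of MyColors returns the array's own generic enumerator rather than a hand-written enumerator class; that substitution is a simplification made here, not necessarily what the original listing did.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class MyColors: IEnumerable<string>
{
   string[] Colors = { "Red", "Yellow", "Blue" };

   public IEnumerator<string> GetEnumerator()     // Generic version, at the class level
   {
      // Arrays implement IEnumerable<T>, so the cast exposes a typed enumerator.
      return ((IEnumerable<string>) Colors).GetEnumerator();
   }

   IEnumerator IEnumerable.GetEnumerator()        // Non-generic version, explicit
   {
      return GetEnumerator();
   }
}

class Program
{
   static void Main()
   {
      MyColors mc = new MyColors();
      foreach (string color in mc)
         Console.WriteLine(color);
   }
}
```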


Iterators

Enumerable classes and enumerators are used extensively in the .NET collection classes, so it's important that you know how they work. But now that you know how to create your own enumerable classes and enumerators, you might be pleased to learn that, starting with C# 2.0, the language got a much simpler way of creating enumerators and enumerables. In fact, the compiler will create them for you. The construct that produces them is called an iterator. You can use the enumerators and enumerables generated by iterators wherever you would use manually coded enumerators or enumerables.

Before I explain the details, let's take a look at two examples. The following method declaration implements an iterator that produces and returns an enumerator.

  • The iterator returns a generic enumerator that returns three items of type string.

  • The yield return statements declare that this is the next item in the enumeration.
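A sketch of such an iterator method follows. The method name BlackAndWhite and its three values are taken from the examples later in the chapter; the surrounding classes are added here only so that the sketch can be run.

```csharp
using System;
using System.Collections.Generic;

class MyClass
{
   public IEnumerator<string> BlackAndWhite()     // Iterator method
   {
      yield return "black";                       // First item
      yield return "gray";                        // Second item
      yield return "white";                       // Third item
   }
}

class Program
{
   static void Main()
   {
      IEnumerator<string> e = new MyClass().BlackAndWhite();
      while (e.MoveNext())
         Console.WriteLine(e.Current);
   }
}
```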


The following method declaration is another version that produces the same result:
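A loop-based version might look like this sketch, in which the local array name theColors is an assumption:

```csharp
using System.Collections.Generic;

class MyClass
{
   public IEnumerator<string> BlackAndWhite()     // Iterator method
   {
      string[] theColors = { "black", "gray", "white" };
      for (int i = 0; i < theColors.Length; i++)
         yield return theColors[i];               // One item per iteration of the loop
   }
}
```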


I haven't explained the yield return statement yet, but on inspecting these code segments, you might have the feeling that something is different about this code. It doesn't seem quite right. What exactly does the yield return statement do?

For example, in the first version, if the method returns on the first yield return statement, then the last two statements can never be reached. If it doesn't return on the first statement, but continues through to the end of the method, then what happens to the values? And in the second version, if the yield return statement in the body of the loop returns on the first iteration, then the loop will never get to any subsequent iterations.

And besides all that, an enumerator doesn't just return all the elements in one shot; it returns a new value with each access of the Current property. So how does this give you an enumerator? Clearly this code is different from anything shown before.

Iterator Blocks

An iterator block is a code block with one or more yield statements. Any of the following three types of code blocks can be iterator blocks:

  • A method body

  • An accessor body

  • An operator body

Iterator blocks are treated differently than other blocks. Other blocks contain sequences of statements that are treated imperatively. That is, the first statement in the block is executed, followed by the subsequent statements, and eventually control leaves the block.

An iterator block, on the other hand, is not a sequence of imperative commands to be executed at one time. Instead, it describes the behavior of an enumerator class that you want the compiler to build for you. The code in the iterator block describes how to enumerate the elements.

Iterator blocks have two special statements:

  • The yield return statement specifies the next item in the sequence to return.

  • The yield break statement specifies that there are no more items in the sequence.
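The two statements can be seen together in a small sketch. The method name UntilNegative and its input values are hypothetical, chosen only to show yield break ending a sequence early.

```csharp
using System;
using System.Collections.Generic;

class Program
{
   // Yields values from the array until the first negative one is found.
   static IEnumerable<int> UntilNegative(int[] values)
   {
      foreach (int v in values)
      {
         if (v < 0)
            yield break;          // No more items; the sequence ends here.
         yield return v;          // The next item in the sequence
      }
   }

   static void Main()
   {
      foreach (int v in UntilNegative(new[] { 3, 5, -1, 7 }))
         Console.WriteLine(v);    // Prints 3, then 5.
   }
}
```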

The compiler takes this description of how to enumerate the items and uses it to build the enumerator class, including all the required method and property implementations. The resulting class is nested inside the class where the iterator is declared. Figure 20-8 shows the code on the left and the resulting objects on the right. Notice how much is built for you automatically by the compiler.


Figure 20-8. An iterator that produces an enumerator

Using an Iterator to Create an Enumerator

The following code illustrates how to use an iterator to create an enumerable class.

  • MyClass, illustrated in Figure 20-8, uses iterator method BlackAndWhite to produce an enumerator for the class.

  • MyClass also implements method GetEnumerator, which in turn calls BlackAndWhite, and returns the enumerator that BlackAndWhite returns to it.

  • Notice that in Main, you can use an instance of the class directly in the foreach statement since the class is enumerable.
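The listing can be sketched along these lines. It is a reconstruction from the description above and the output shown, so details such as the loop variable name shade are assumptions.

```csharp
using System;
using System.Collections.Generic;

class MyClass
{
   public IEnumerator<string> GetEnumerator()
   {
      return BlackAndWhite();                     // Returns the enumerator.
   }

   public IEnumerator<string> BlackAndWhite()     // Iterator
   {
      yield return "black";
      yield return "gray";
      yield return "white";
   }
}

class Program
{
   static void Main()
   {
      MyClass mc = new MyClass();

      foreach (string shade in mc)                // Use the instance of MyClass.
         Console.WriteLine(shade);
   }
}
```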


This code produces the following output:

black
gray
white

Using an Iterator to Create an Enumerable

The previous example created a class comprising two parts: the iterator that produced the enumerator and the GetEnumerator method that returned that enumerator. In this example, the iterator is used to create an enumerable rather than an enumerator. There are some important differences between this example and the last:

  • In the previous example, iterator method BlackAndWhite returned an IEnumerator<string> and MyClass implemented method GetEnumerator by returning the object returned by BlackAndWhite.

  • In this example, the iterator method BlackAndWhite returns an IEnumerable<string> rather than an IEnumerator<string>. MyClass, therefore, implements its GetEnumerator method by first calling method BlackAndWhite to get the enumerable object, and then calling that object's GetEnumerator method and returning its results.

  • Notice that in the foreach statement in Main, you can either use an instance of the class or call BlackAndWhite directly, since it returns an enumerable. Both ways are shown.
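A sketch of this version follows. As before, it is reconstructed from the description and the output rather than copied from the original listing; the local name myEnumerable is an assumption.

```csharp
using System;
using System.Collections.Generic;

class MyClass
{
   public IEnumerator<string> GetEnumerator()
   {
      IEnumerable<string> myEnumerable = BlackAndWhite();   // Get the enumerable.
      return myEnumerable.GetEnumerator();                  // Get its enumerator.
   }

   public IEnumerable<string> BlackAndWhite()     // Iterator returning an enumerable
   {
      yield return "black";
      yield return "gray";
      yield return "white";
   }
}

class Program
{
   static void Main()
   {
      MyClass mc = new MyClass();

      foreach (string shade in mc)                   // Use the class object.
         Console.Write("{0}  ", shade);

      foreach (string shade in mc.BlackAndWhite())   // Use the iterator method.
         Console.Write("{0}  ", shade);
   }
}
```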


This code produces the following output:

black  gray  white  black  gray  white

Common Iterator Patterns

The previous two sections showed that you can create an iterator to return either an enumerable or an enumerator. Figure 20-9 summarizes how to use the common iterator patterns.

  • When you implement an iterator that returns an enumerator, you must make the class enumerable by implementing GetEnumerator, so that it returns the enumerator returned by the iterator. This is shown on the left of the figure.

  • In a class, when you implement an iterator that returns an enumerable, you can choose whether to make the class itself enumerable by implementing GetEnumerator or leaving it out.

    • If you implement GetEnumerator, make it call the iterator method to get an instance of the automatically generated class that implements IEnumerable. Next, return the enumerator built by GetEnumerator from this IEnumerable object, as shown on the right of the figure.

    • If you leave out GetEnumerator, so that the class itself is not enumerable, you can still use the enumerable returned by the iterator by calling the iterator method directly, as shown in the second foreach statement on the right.


Figure 20-9. The common iterator patterns

Producing Enumerables and Enumerators

The previous examples used iterators that returned either an IEnumerator<T> or an IEnumerable<T>. You can also create iterators that return the non-generic versions. The return types you can specify are the following:

  • IEnumerator<T> (generic—substitute an actual type for T)

  • IEnumerable<T> (generic—substitute an actual type for T)

  • IEnumerator (non-generic)

  • IEnumerable (non-generic)

For the two enumerator types, the compiler generates a nested class that contains the implementation of either the non-generic or the generic enumerator, with the behavior specified by the iterator block.

For the two enumerable types, it does even more. It produces a nested class that is both enumerable and the enumerator. The class, therefore, implements both the enumerator and the GetEnumerator method. Notice that GetEnumerator is implemented as part of the nested class—not as part of the enclosing class.

Figure 20-10 illustrates the generic enumerable produced by the enumerable iterator in the last example.

  • The iterator's code is shown on the left side of the figure, and shows that its return type is IEnumerable<string>.

  • On the right side of the figure, the diagram shows that the nested class implements both IEnumerator<string> and IEnumerable<string>.


Figure 20-10. The compiler produces a class that is both an enumerable and an enumerator. It also produces the method that returns the class object.

Producing Multiple Enumerables

In the following example, class ColorCollection has two enumerable iterators—one enumerating the items in forward order and the other enumerating them in reverse order. Notice that although it has two methods that return enumerables, the class itself is not enumerable since it doesn't implement GetEnumerator.

   using System;
   using System.Collections.Generic;                       // You need this namespace.

   namespace ColorCollectionIterator
   {
      class ColorCollection
      {
         string[] Colors={"Red", "Orange", "Yellow", "Green", "Blue", "Purple"};

         public IEnumerable<string> Forward() {      // Enumerable iterator
            for (int i = 0; i < Colors.Length; i++)
               yield return Colors[i];
         }

         public IEnumerable<string> Reverse() {      // Enumerable iterator
            for (int i = Colors.Length - 1; i >= 0; i--)
               yield return Colors[i];
         }
      }
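
      class Program       // Driver reconstructed from the output below (an assumption)
      {
         static void Main()
         {
            ColorCollection cc = new ColorCollection();
            foreach (string color in cc.Forward())          // Forward order
               Console.Write("{0} ", color);
            Console.WriteLine("");
            foreach (string color in cc.Reverse())          // Reverse order
               Console.Write("{0} ", color);
            Console.WriteLine("");
            IEnumerator<string> ie = cc.Reverse().GetEnumerator();
            while (ie.MoveNext())                           // Manual enumeration
               Console.Write("{0} ", ie.Current);
            Console.WriteLine("");
         }
      }
   }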

This code produces the following output:

Red Orange Yellow Green Blue Purple
Purple Blue Green Yellow Orange Red
Purple Blue Green Yellow Orange Red

Producing Multiple Enumerators

The previous example used iterators to produce a class with two enumerables. This example shows two things. First, it uses iterators to produce a class with two enumerators. Second, it shows how iterators can be implemented as properties rather than methods.

The code declares two properties that define two different enumerators. The GetEnumerator method returns one or the other of the two enumerators, depending on the value of the Boolean variable ColorFlag. If ColorFlag is true, the Colors enumerator is returned. Otherwise, the BlackAndWhite enumerator is returned.

   using System;
   using System.Collections.Generic;

   class MyClass: IEnumerable<string>
   {
      bool ColorFlag = true;

      public MyClass(bool flag)                    // Constructor
      {
         ColorFlag = flag;
      }

      IEnumerator<string> BlackAndWhite      // Property--enumerator iterator
      {
         get {
            yield return "black";
            yield return "gray";
            yield return "white";
         }
      }

      IEnumerator<string> Colors             // Property--enumerator iterator
      {
         get {
            string[] TheColors = { "blue", "red", "yellow" };
            for (int i = 0; i < TheColors.Length; i++)
               yield return TheColors[i];
         }
      }

      public IEnumerator<string> GetEnumerator()  // GetEnumerator
      {
         return ColorFlag
                  ? Colors                     // Return Colors enumerator
                  : BlackAndWhite;             // Return BlackAndWhite enumerator
      }

      System.Collections.IEnumerator
      System.Collections.IEnumerable.GetEnumerator()
      {
         return ColorFlag
                  ? Colors                     // Return Colors enumerator
                  : BlackAndWhite;             // Return BlackAndWhite enumerator
      }
   }

   class Program
   {
      static void Main()
      {
         MyClass mc1 = new MyClass( true );    // Call constructor with true
         foreach (string s in mc1)
            Console.Write("{0}  ", s);
         Console.WriteLine("");

         MyClass mc2 = new MyClass( false );   // Call constructor with false
         foreach (string s in mc2)
            Console.Write("{0}  ", s);
         Console.WriteLine("");
      }

   }

This code produces the following output:

blue  red  yellow
black  gray  white

Behind the Scenes with Iterators

The following are some other important things to know about iterators:

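  • Iterators that return the generic types require the System.Collections.Generic namespace, so you should include it with a using directive. The non-generic IEnumerator and IEnumerable types are in the System.Collections namespace.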

  • In the compiler-generated enumerators, the Reset method is not supported. It is implemented, since it is required by the interface, but the implementation throws a System.NotSupportedException exception if it is called. Notice that the Reset method is shown grayed out in Figure 20-8.

Behind the scenes, the enumerator class generated by the compiler is a state machine with four states:

Before: The initial state before the first call to MoveNext.
Running: The state entered when MoveNext is called. While in this state, the enumerator determines and sets the position for the next item. It exits the state when it encounters a yield return, a yield break, or the end of the iterator body.
Suspended: The state where the state machine is waiting for the next call to MoveNext.
After: The state where there are no more items to enumerate.

If the state machine is in either the before or suspended states, and there is a call to the MoveNext method, it goes into the running state. In the running state, it determines the next item in the collection, and sets the position.

If there are more items, the state machine goes into the suspended state. If there are no more items, it goes into the after state, where it remains. Figure 20-11 shows the state machine.

An iterator state machine

Figure 20-11. An iterator state machine
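The state changes can be made visible by putting output statements between the yield statements. The following is a small sketch; the method name Numbers and the messages are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

class Program
{
   static IEnumerator<int> Numbers()              // Iterator
   {
      Console.WriteLine("   running: before first yield");
      yield return 1;
      Console.WriteLine("   running: between yields");
      yield return 2;
      Console.WriteLine("   running: after last yield");
   }

   static void Main()
   {
      IEnumerator<int> e = Numbers();             // Before: no iterator code has run yet.
      Console.WriteLine("enumerator created");

      while (e.MoveNext())                        // Before or suspended, then running
         Console.WriteLine("Current = {0}", e.Current);   // Suspended at a yield return

      Console.WriteLine(e.MoveNext());            // After: further calls return False.
   }
}
```

Note that "enumerator created" is printed before any of the messages inside the iterator body, because the body runs only in response to calls to MoveNext.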
