© Dmitri Nesteruk 2020
D. NesterukDesign Patterns in .NET Core 3https://doi.org/10.1007/978-1-4842-6180-4_17

17. Iterator

Dmitri Nesteruk1  
(1)
St. Petersburg, c.St-Petersburg, Russia
 

An iterator, put simply, is an object that is used to traverse some structure or other. Typically, the iterator references the currently accessed element and has a method to move forward. A bidirectional iterator also lets you walk backward, and a random-access iterator allows you to access an element at an arbitrary position.

In .NET, that thing which enables iterator typically implements the IEnumerator<T> interface . It has the following members:
  • Current refers to the element at the current position.

  • MoveNext() lets you move on to the next element of the collection, and returns true if we succeeded and false otherwise.

  • Reset() sets the enumerator to the initial position.

The enumerator is also disposable, but we don’t care about that too much. The point is, any time you write
foreach (x in y)
  Console.WriteLine(x);
what you’re really doing is the equivalent of
var enumerator = ((IEnumerable<Foo>)y).GetEnumerator();
while (enumerator.MoveNext())
{
  temp = enumerator.Current;
  Console.WriteLine(temp);
}

In other words, a class that implements IEnumerable<T> is required to have a method called GetEnumerator() that returns an IEnumerator<T>. And you use that enumerator to traverse the object.

Needless to say, it is very rare for you to have to make your own IEnumerator. Typically, you can write code such as…
IEnumerable<int> GetSomeNumbers()
{
  yield return 1;
  yield return 2;
  yield return 3;
}

…and the rest of the operations will be taken care of by the compiler. Alternatively, you can just use an existing collection class (array, List<T>, etc.) which already has all the plumbing you need.

Array-Backed Properties

Not all things are easy to iterate . For example, you cannot iterate all fields in a class unless you’re using reflection. But sometimes you need to. Here, let me show you a scenario.

Suppose you’re making a game with creatures in it. These creatures have various attributes, such as strength, agility, and intelligence. You could implement them as
public class Creature
{
  public int Strength { get; set; }
  public int Agility { get; set; }
  public int Intelligence { get; set; }
}
But now you also want to output some aggregate statistics about the creature. For example, you decide to calculate the sum of all its abilities:
public double SumOfStats => Strength + Agility + Intelligence;
This code is impossible to automatically refactor if you add an additional Wisdom property (ooh, is that too much D&D nerdiness for you?), but let me show you something even worse. If you want the average of all the abilities, you would write:
public double AverageStat => SumOfStats / 3.0;
Whoa there! That 3.0 is a bona fide magic number , completely unsafe if the structure of the code changes. Let me show you yet another example of ugliness. Suppose you decide to calculate the maximum ability value of a creature. You’d need to write something like:
public double MaxStat => Math.Max(
  Math.Max(Strength, Agility), Intelligence);

Well, you get the idea. This code is not robust and will break on any small change, so we’re going to fix it, and the implementation is going to make use of array-backed properties.

The idea of array-backed properties is simple: all the backing fields of related properties exist in one array:
private int [] stats = new int[3];
Each of the properties then projects its getter and setter into the array. To avoid using integral indices, you can introduce private constants:
private const int strength = 0;
public int Strength
{
  get => stats[strength];
  set => stats[strength] = value;
}
// same for other properties
And now, of course, calculating the sum/average/maximum statistics is really easy because the underlying field is an array, and arrays are supported in LINQ:
public double AverageStat => stats.Average();
public double SumOfStats => stats.Sum();
public double MaxStat => stats.Max();
If you want to add an extra property, all you need to do is
  • Extend the array by one element

  • Create a property with getter and setter

And that’s it! The stats will still be calculated correctly. Furthermore, if you want, you can eschew all of those methods we made in favor of
public IEnumerable<int> Stats => stats;

and just let the client perform his own LINQ queries directly, for example, creature.Stats.Average().

Finally, if you want stats to be the enumerable collection, that is, letting people write foreach (var stat in creature), you can simply implement IEnumerable (and perhaps an indexer too):
public class Creature : IEnumerable<int>
{
  // as before
  public IEnumerator<int> GetEnumerator()
    => stats.AsEnumerable().GetEnumerator();
  IEnumerator IEnumerable.GetEnumerator()
    => GetEnumerator();
  public int this[int index]
  {
    get => stats[index];
    set => stats[index] = value;
  }
}

This approach is functional, but there are plenty of downsides. One of those downsides has to do with change notifications. For example, suppose that your UI application binds a UI element to the SumOfStats property . You change Strength, but how would SumOfStats let you know that it did, in fact, change too? If SumOfStats was defined as a basic sum of different properties, we could have treated that summation as an expression tree, parsed it, and extracted the dependencies. But because we’re using LINQ, this is now impossible or, at the very least, very difficult. We could attempt to supply some special metadata to indicate that some properties are array-backed and then read this metadata when determining dependencies, but as you can guess, this has both computational and cognitive costs .

Let’s Make an Iterator

In order to appreciate just how ugly iterators can get if you do decide to make them directly, we are going to implement a classic Comp Sci example: tree traversal . Let’s begin by defining a single node of a binary tree:
public class Node<T>
{
  public T Value;
  public Node<T> Left, Right;
  public Node<T> Parent;
  public Node(T value)
  {
   Value = value;
  }
  public Node(T value, Node<T> left, Node<T> right)
  {
    Value = value;
    Left = left;
    Right = right;
    left.Parent = right.Parent = this;
  }
}
I’ve thrown in an additional constructor that initializes its node with both left and right child nodes. This allows us to define chained constructor trees such as
//   1
//  /
// 2   3
var root = new Node<int>(1,
  new Node<int>(2), new Node<int>(3));
Okay, so now we want to traverse the tree. If you remember your Data Structures and Algorithms course, you’ll know that there are three ways: in-order, preorder, and postorder. Suppose we decide to define an InOrderIterator . Here’s what it would look like:
public class InOrderIterator<T>
{
  public Node<T> Current { get; set; }
  private readonly Node<T> root;
  private bool yieldedStart;
  public InOrderIterator(Node<T> root)
  {
    this.root = Current = root;
    while (Current.Left != null)
      Current = Current.Left;
  }
  public bool MoveNext()
  {
    // todo
  }
}
Not bad so far: just as if we were implementing IEnumerator<T> , we have a property called Current and a MoveNext() method. But here’s the thing: since the iterator is stateful, every invocation of MoveNext() has to take us to the next element in our current traversal scheme. This isn’t as easy as it sounds:
public bool MoveNext()
{
  if (!yieldedStart)
  {
    yieldedStart = true;
    return true;
  }
  if (Current.Right != null)
  {
    Current = Current.Right;
    while (Current.Left != null)
      Current = Current.Left;
    return true;
  }
  else
  {
    var p = Current.Parent;
    while (p != null && Current == p.Right)
    {
      Current = p;
      p = p.Parent;
    }
    Current = p;
    return Current != null;
  }
}
Whoa there! Bet you didn’t expect this! Well this is exactly what you get if you implement your own iterators directly: an unreadable mess. But it works! We can use the iterator directly, C++ style:
var it = new InOrderIterator<int>(root);
while (it.MoveNext())
{
  Write(it.Current.Value);
  Write(',');
}
WriteLine();
// prints 213
Or, if we want, we can construct a dedicated BinaryTree class that exposes this in-order iterator as a default one:
public class BinaryTree<T>
{
  private Node<T> root;
  public BinaryTree(Node<T> root)
  {
    this.root = root;
  }
  public InOrderIterator<T> GetEnumerator()
  {
    return new InOrderIterator<T>(root);
  }
}
Notice we don’t even have to implement IEnumerable (thanks to duck typing1). We can now write
var root = new Node<int>(1,
  new Node<int>(2), new Node<int>(3));
var tree = new BinaryTree<int>(root);
foreach (var node in tree)
  WriteLine(node.Value); // 2 1 3

Improved Iteration

Our implementation of in-order iteration is virtually unreadable and is nothing like what you read in textbooks. Why? Lack of recursion. After all, MoveNext() cannot preserve its state, so every time it gets invoked, it starts from scratch without remembering its context: it only remembers the previous element, which needs to be found before we find the next one in the iteration scheme we’re using.

And this is why yield return exists: you can construct a state machine behind the scenes. This means that if I wanted to create a more natural in-order implementation, I could simply write it as
public IEnumerable<Node<T>> NaturalInOrder
{
  get
  {
    IEnumerable<Node<T>> TraverseInOrder(Node<T> current)
    {
      if (current.Left != null)
      {
        foreach (var left in TraverseInOrder(current.Left))
          yield return left;
      }
      yield return current;
      if (current.Right != null)
      {
        foreach (var right in TraverseInOrder(current.Right))
          yield return right;
      }
    }
    foreach (var node in TraverseInOrder(root))
      yield return node;
  }
}
Notice that all the calls here are recursive. Now what we can do is use this directly, for example:
var root = new Node<int>(1,
  new Node<int>(2), new Node<int>(3));
var tree = new BinaryTree<int>(root);
WriteLine(string.Join(",", tree.NaturalInOrder.Select(x => x.Value)));
// 2,1,3

Woo-hoo! This is way better. The algorithm itself is readable, and once again, we can take the property and just do LINQ on it, no problem.

Iterator Adapter

Quite often you want an object to be iterable in some special way. For example, suppose you want to calculate the sum of all elements in a matrix – LINQ doesn’t provide a Sum() method over a rectangular array, so what you can do is build an adapter such as
public class OneDAdapter<T> : IEnumerable<T>
{
  private readonly T[,] arr;
  private int w, h;
  public OneDAdapter(T[,] arr)
  {
    this.arr = arr;
    w = arr.GetLength(0);
    h = arr.GetLength(1);
  }
  public IEnumerator<T> GetEnumerator()
  {
    for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x)
      yield return arr[x, y];
  }
  IEnumerator IEnumerable.GetEnumerator()
  {
    return GetEnumerator();
  }
}
This adapter could then be used whenever you want to iterate a 2D array in 1D manner . For example, calculation of the sum is now as simple as
var data = new [,] { { 1, 2 }, { 3, 4 } };
var sum = new OneDAdapter<int>(data).Sum();

Of course, we’re still stuck with C#’s inability to derive type arguments in constructors, so perhaps a factory method could be useful here.

Here’s another example – this one supports reverse iteration of a 1D array:
public class ReverseIterable<T> : IEnumerable<T>
{
  private readonly T[] arr;
  public ReverseIterable(T[] arr) => this.arr = arr;
  public IEnumerator<T> GetEnumerator()
  {
    for (int i = arr.Length - 1; i >= 0; --i)
      yield return arr[i];
  }
  IEnumerator IEnumerable.GetEnumerator()
  {
    return GetEnumerator();
  }
}
Again, if you don’t want to specify the type parameter explicitly, you’d have to create another, nongeneric ReverseIterable class and provide a factory method :
public static class ReverseIterable
{
  public static ReverseIterable<T> From<T>(T[] arr)
  {
    return new ReverseIterable<T>(arr);
  }
}

Of course, as we’ve discussed countless times before, this implies that the constructor is made public, and the only way to make it private is to make the factory a nested class of the iterator adapter.

Summary

The Iterator design pattern has been deliberately hidden in C# in favor of the simple IEnumerator/IEnumerable duopoly upon which everything is built. Notice that these interfaces only support forward iteration – there is no MoveBack() in IEnumerator. The existence of yield allows you to very quickly return elements as a collection that can be consumed by someone else while being blissfully unaware of the state machine that gets built behind the scenes.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
54.152.77.92