Building Our Skeleton

First, you have to decide where the implementation should live. To do that, you have to find out where the Enumerator::Lazy class lives. If you head over to the official documentation, you’ll find a clue, shown in the figure.

images/cheap_counterfeiting/rubydoclazyenum.png

So, the Enumerator class is the parent of the Lazy class. This is easy enough to translate to code:

 class​ Lazy < Enumerator
 end

For our exercise, we’ll use another name instead of reopening the existing Ruby class. A quick trip to the thesaurus yields a synonym for Lazy. Introducing Lax:

 class​ Lax < Enumerator
 end

Notice that we’re inheriting from Enumerator. Let’s look at why.

External vs. Internal Iteration

What does inheriting from Enumerator buy you? In order to answer that question, let’s review the Enumerator class. According to the documentation, Enumerator is:

A class which allows both internal and external iteration.

What is the difference between the two flavors of iteration? The key lies in who controls the iteration: the enumerable or the enumerator.

For internal iteration, it is the Array object (or any Enumerable) that controls the iteration. In fact, that’s how you normally interact with Enumerables.

External iteration, on the other hand, is controlled by some other object wrapped around an Enumerable.

Why would you want external iterators in the first place? Sometimes, you do not want to iterate through all of the elements in one pass. You might want to say, “Give me exactly one now, and when I need the next one, I will ask again.” In other words, external iterators let you control the state of the enumeration. That lets you pause and rewind the enumeration as you see fit.

Internal iterators do not give you that ability. Once you kick-start an enumeration, there’s no turning back.
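To make the contrast concrete, here is a small sketch that iterates over the same array both ways. It uses only Object#to_enum, Enumerator#next, and Enumerator#rewind; the variable names are purely illustrative.

```ruby
numbers = [1, 2, 3]

# Internal iteration: the array drives the loop and calls the block
# once for every element, with no way to stop halfway and come back.
seen = []
numbers.each { |n| seen << n }

# External iteration: the caller asks for one element at a time.
e = numbers.to_enum   # Object#to_enum wraps the array in an Enumerator

first  = e.next       # 1
second = e.next       # 2

e.rewind              # move back to the beginning of the enumeration
again  = e.next       # 1
```

With the external form, the code between the calls to next can do anything it likes, including never asking for the remaining elements.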

Creating an Enumerator from an Enumerable

Remember, an Enumerator wraps an Enumerable. You can see this in action in an irb session:

 >>​ e = Enumerator.new([1,2,3])
 warning: Enumerator.new without a block is deprecated; use Object#to_enum
 
 >>​ e.next
 =>​ 1
 
 >>​ e.next
 =>​ 2
 
 >>​ e.next
 =>​ 3
 
 >>​ e.next
 StopIteration: iteration reached an end
  from (irb):7:in `next'

When you wrap an array with an enumerator, you can then call the Enumerator#next method multiple times to retrieve the next value. When there are no more values left, the StopIteration exception is raised.

Notice that when you entered the first snippet, Ruby warned that calling Enumerator.new without a block is deprecated. There are two alternatives: use Object#to_enum, or create the enumerator with a block. Let’s pick the second option and use a block:

 >> e = Enumerator.new do |yielder|
 >>   [1,2,3].each do |val|
 >>     yielder << val
 >>   end
 >> end
 =>​ ​#<Enumerator: #<Enumerator::Generator:0x007fb9798e0668>:each>

And as usual, we can call Enumerator#next:

 >>​ e.next
 =>​ 1
 
 >>​ e.next
 =>​ 2
 
 >>​ e.next
 =>​ 3
 
 >>​ e.next
 StopIteration: iteration reached an end
  from (irb):16:in `next'
  from (irb):16
  from /Users/benjamintan/.rbenv/versions/​2.2.0/bin/irb:11:in `<main>​​'

Let’s look at the code again, because there’s more than meets the eye. There are a few questions that come up:

 e = Enumerator.new do |yielder|
   [1,2,3].each do |val|
     yielder << val
   end
 end

  1. What is this yielder object that is passed into the block?

  2. What does yielder << val do?

  3. Most importantly, how is it possible that simply wrapping an Enumerable enables the ability to retrieve each element of the enumerable one by one?

Let’s tackle the first two questions. The yielder object is passed into the block when an Enumerator object is created with a block. The purpose of the yielder object is to store the instructions for the next yield. Note that this is not the value; it is the instructions. That is what yielder << val specifies.

Here’s an interesting and potentially confusing side note: << does the same job as the yielder’s yield method (the difference is that << returns the yielder itself, so calls can be chained), but it has nothing to do with the yield keyword.
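The following sketch may help make the yielder less mysterious. It uses both << and the yield method on the yielder, and it records in an array (the name log is just for illustration) when the block actually runs, to show that the block is paused between calls to Enumerator#next.

```ruby
log = []

e = Enumerator.new do |yielder|
  log << :started
  yielder << 1 << 2   # << returns the yielder, so calls can be chained
  yielder.yield 3     # the yield method hands a value over the same way
  log << :finished
end

log_before = log.dup  # still empty: creating the enumerator runs nothing

a = e.next            # runs the block up to the first value
log_after = log.dup   # [:started], the block is now paused

b = e.next
c = e.next            # the block is paused again, just before :finished
```

Only a further call to next would let the block run to its end, at which point StopIteration is raised.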

When the Enumerator#next method is called, it emits the next value and returns. This suggests that the yielder object must keep some form of state. How is this achieved? The return value from the previous code listing gives us a clue:

 #<Enumerator: #<Enumerator::Generator:0x007fb9798e0668>:each>

This tells us that an Enumerator object contains another object called Enumerator::Generator.

Generators and Fibers, Oh My!

Let’s take a detour and explore the Enumerator::Generator class. Generators can convert an internal iterator, such as [1,2,3], into an external one. Generators are the secret sauce that allows the one-by-one retrieval of the elements of an enumerable.

Here’s how a generator works:

  1. First, it computes some result.

  2. This result is handed back to the caller.

  3. In addition, it also saves the state of the computation so that the caller can resume that computation to generate the next result.

One way to do this is to use a little-known Ruby construct called a fiber. The Fiber class is perfect for converting an internal iterator to an external one.

You probably won’t use fibers often, but they’re pretty fun to explore. Let’s do a quick run through of the basics.

You create a fiber with Fiber.new, with a block that represents the computation:

 f = Fiber.new do
   x = 0
   loop do
     Fiber.yield x
     x += 1
   end
 end

This block contains an infinite loop. Note the Fiber.yield method in the loop body. That, dear reader, is the secret sauce! Before we get into more details, try running the example in irb:

 >> f = Fiber.new do
 >>   x = 0
 >>   loop do
 >>     Fiber.yield x
 >>     x += 1
 >>   end
 >> end
 =>​ ​#<Fiber:0x007fb979023e58>

When you create a fiber like this, the block isn’t executed immediately. So how then is the block executed? This is done with the Fiber#resume method. Observe:

 >>​ f.resume
 =>​ 0
 
 >>​ f.resume
 =>​ 1
 
 >>​ f.resume
 =>​ 2

Now back to the secret sauce. What you have just created here is an infinite number generator. The loop doesn’t run indefinitely because of the behavior of the Fiber.yield method, not to be confused with the yield keyword.

When the code executes Fiber.yield x, the result is returned to the caller, and control is given back to the caller. When Fiber#resume is called again, the variable x is incremented. The loop goes for another round, executing Fiber.yield x again, and once again gives control back to the caller.
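To connect this back to external iteration, here is a minimal sketch of how a fiber could turn an each call into a one-at-a-time iterator. The class TinyExternalIterator and its DONE marker are hypothetical names invented for this illustration; this is not how Ruby’s own Enumerator is written, only the same idea in miniature.

```ruby
# Hypothetical helper, not part of Ruby: wraps any object that responds to each.
class TinyExternalIterator
  DONE = Object.new # unique marker returned when the each call has finished

  def initialize(enumerable)
    @done = false
    @fiber = Fiber.new do
      # Pause after every element and hand it back to whoever called resume.
      enumerable.each { |val| Fiber.yield val }
      DONE
    end
  end

  def next
    raise StopIteration, "iteration reached an end" if @done

    value = @fiber.resume # run the block until the next Fiber.yield
    if value.equal?(DONE)
      @done = true
      raise StopIteration, "iteration reached an end"
    end
    value
  end
end

iter = TinyExternalIterator.new([1, 2, 3])
one   = iter.next # 1
two   = iter.next # 2
three = iter.next # 3
```

Each call to next resumes the fiber, which runs each until the following Fiber.yield and then pauses again. The @done flag is there so that calling next after the end raises StopIteration instead of trying to resume a finished fiber.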

Keep this ability of fibers to start, pause, and resume execution in mind as we move into the next section, as it’ll help you understand what’s happening as we build our implementation.
