4.3. Looping

The Ruby language offers two categories of looping statements: "built-in loops" and iterators. Ruby has three types of built-in loops: for, while, and until. A few differences aside, these are constructs that you should already be accustomed to from other languages.

Ruby also offers an infinite loop construct, through the loop method.

Although the built-in loops are employed at times, idiomatic Ruby code tends to favor iterators, which can also be customized to reflect the needs of your code.
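As a minimal sketch of the loop method mentioned above (the count variable is made up for this illustration), the block below would repeat forever if break were never reached:

```ruby
# Kernel#loop runs its block over and over until something breaks out of it.
count = 0
loop do
  count += 1
  break if count == 3   # without this break the loop would never end
end
puts count              # Prints 3
```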

4.3.1. The for/in Loop

The for statement in Ruby iterates over a collection of elements. To be more exact, it enables you to iterate through enumerable objects such as arrays, hashes, and ranges. This is its basic syntax:

for element in collection
  # Do something with element
  code
end

Notice that the statement requires both the for and in keywords. As such, it behaves like For Each in Visual Basic and foreach in C#, not like their plain For and for counterparts.

These are a few examples that use for to loop through collections:

# Prints the integers between 0 and 10
for i in 0..10
  puts i
end

# Prints each element of the array
for el in [2, 4, 6, 8, 10]
  puts el
end

# Prints each key-value pair in the hash
h = { :x => 24, :y => 25, :z => 26 }
for key, value in h
  puts "#{key} => #{value}"
end

Like if, unless, and case, for is terminated by end and accepts an optional separator keyword. Instead of then, however, the keyword is do, as shown here:

for c in 'a'..'z' do
  puts c
end

The do keyword is useful when a newline or a semicolon (in Ruby 1.8's case) is missing between the loop header and its body; otherwise it is usually omitted.

4.3.2. The while and until Loops

The while loop in Ruby acts just as you'd expect. As long as the tested condition evaluates to anything other than false or nil, the body of the loop is executed. Here is the basic syntax:

while expression
  code
end

The until statement is the opposite. It keeps looping as long as the condition evaluates to nil or false, and stops as soon as it evaluates to true (remember that anything but false and nil evaluates to true):

until expression
  code
end

The following example would probably never be written by a savvy rubyist in a real program, but it illustrates the difference between the two:

# Prints integers from 0 to 10
i = 0
while (i <= 10)
  puts i
  i += 1
end

# Prints integers from 0 to 10
i = 0
until (i > 10)
  puts i
  i += 1
end

Like the for/in loop, while and until accept an optional do keyword.

There is nothing particular about while and until in Ruby, except that they too can be used as expression modifiers. Here again this increases readability for trivial one-liners, as shown by these two equivalent, hypothetical lines of code:

battery.charge! while !battery.full?
battery.charge! until battery.full?
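To try the two modifier forms yourself, here is a runnable sketch. The Battery class, its level attribute, and the 25-point charge step are all invented for this illustration; only the full? and charge! method names come from the lines above.

```ruby
# Hypothetical Battery class, invented only to make the one-liners runnable.
class Battery
  attr_reader :level

  def initialize(level = 0)
    @level = level
  end

  def full?
    @level >= 100
  end

  def charge!
    @level += 25   # arbitrary charge step chosen for the example
  end
end

a = Battery.new
a.charge! while !a.full?   # keeps charging while the battery is not full

b = Battery.new
b.charge! until b.full?    # same behavior, expressed with until

puts a.level               # Prints 100
puts b.level               # Prints 100
```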

4.3.3. Blocks and Iterators

Ruby methods can have regular parameters, and can accept a block as well. A block is a group of one or more statements that acts as a nameless (or anonymous) function. Blocks don't exist on their own; they need to be associated with a method. Iterators are methods that accept an associated code block of this type. They usually iterate over a collection of elements, but the definition is broad enough to include methods that don't do that. For this reason, some people prefer not to refer to methods that don't iterate as iterators, but simply as methods that accept an associated block.

If you've ever worked with the IEnumerable and IEnumerator interfaces (and their generic forms), the iterator pattern shouldn't feel new to you. In the .NET world, LINQ to Objects introduced a few generators to simplify the process further. In Ruby, things are much simpler and more concise. Iterators and blocks were undisputedly not invented by Ruby, but they are a distinctive feature of the language.

4.3.3.1. Numeric Iterators

Take a look at one of the simplest iterators, the method times of the Integer class:

5.times { puts "Ruby" }

This iterator accepts a block of code that prints the "Ruby" string literal and executes it a number of times, as specified by the receiver object (5 in this case). In short, the code between the curly brackets is executed five times, so Ruby is printed five times.

Blocks can be defined between curly brackets or through the do/end notation, for example:

5.times do
  puts "Ruby"
end

The convention is to use curly brackets for one-line blocks and a do/end pair for blocks that contain multiple lines of code.

When the iterator takes a regular argument, surround that argument with parentheses if you use the curly brackets style for the block. Curly brackets have high precedence, so without the parentheses the block would be associated with the argument (a generally meaningless operation) instead of with the method.
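The following sketch illustrates the parentheses rule using upto (the collected array is just a scratch variable for this example):

```ruby
collected = []

# Curly brackets: the argument must be wrapped in parentheses.
1.upto(3) { |n| collected << n }

# do/end binds less tightly, so the parentheses may be dropped.
1.upto 3 do |n|
  collected << n * 10
end

p collected   # Prints [1, 2, 3, 10, 20, 30]

# Without parentheses the curly brackets would attach to the argument 3
# rather than to upto, and Ruby rejects the line with a syntax error:
#   1.upto 3 { |n| collected << n }
```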

At each iteration, times passes a value to the associated block. That value is accessible from within the block as an iterator parameter/variable that you define by specifying an arbitrary identifier between pipes (for example, |identifier|):

5.times {|x| print x }

At each iteration the value of x is set as an incremented number (in the case of times), starting from zero. This line of code therefore prints 01234.

The class Integer offers another two common iterators, upto and downto:

1.upto(10) {|n| puts n }     # Returns 1 and prints integers from 1 up to 10
10.downto(1) {|n| puts n }   # Returns 10 and prints integers from 10 down to 1

For the sake of simplicity, this example uses the puts method, but blocks can contain any arbitrarily complex code as shown here:

fact = 1
2.upto(10) {|n| fact *= n }
puts fact                    # Prints 3628800

The Numeric class offers a step iterator as well. It invokes the block with a sequence of numbers that begins with the number the method is invoked on and is incremented by the specified step on each call, until the limit is exceeded:

# Prints a table of squares for numbers
# between 0 and 10, in increments of 0.1
0.step(10, 0.1) do |n|
  puts "#{n}\t#{n**2}"
end
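With integer arguments the sequence is easier to follow. This short sketch (the up and down arrays are illustrative names) also shows that a negative step counts downward:

```ruby
up = []
0.step(10, 5) { |n| up << n }      # starts at 0, adds 5 until 10 is passed

down = []
10.step(0, -5) { |n| down << n }   # a negative step walks toward a lower limit

p up     # Prints [0, 5, 10]
p down   # Prints [10, 5, 0]
```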

4.3.3.2. The each Method

Perhaps the most common iterator that's available for many objects is the each method, as you briefly saw in the previous chapter when you used it as a more popular alternative to the for loop.

The following snippet shows its usage with arrays:

sites = ["reddit.com", "dzone.com", "digg.com"]
sites.each {|site| puts "<a href=\"http://#{site}\">#{site.capitalize}</a>" }

Blocks can perform any action on the data iterated over, but the return value of each is generally the receiver itself (the array sites in the preceding example).

This outputs:

<a href="http://reddit.com">Reddit.com</a>
<a href="http://dzone.com">Dzone.com</a>
<a href="http://digg.com">Digg.com</a>

It is common for classes that implement the each method to include the Enumerable module as well, which provides a series of useful traversal and search methods. For example, Array also has the method each_with_index, which passes the actual element and its index to the two block parameters:

["a","b","c"].each_with_index {|elem, index| puts "#{elem}: #{index}" }

Again, the names of the block parameters (elem and index in this example) are entirely up to you. It is often useful to use short, meaningful ones though.

This prints:

a: 0
b: 1
c: 2

The each iterator can be used with ranges as well:

('abc'..'xyz').each {|s| puts s }

This prints all the strings between 'abc' and 'xyz' in alphabetical order:

abc
abd
abe
abf
abg
abh
...
...
xyv
xyw
xyx
xyy
xyz

And hashes:

author = { :name => "Kurt Vonnegut", :site => "vonnegut.com", :books => 14 }
author.each {|k, v| puts "#{k} => #{v}" }

This prints each key-value pair. In Ruby 1.8, hashes are not ordered and, as such, the output may appear in a different order:

site => vonnegut.com
books => 14
name => Kurt Vonnegut

In Ruby 1.8, the String class has an each method as well (also available as each_line, which is the name to use in Ruby 1.9 and later, where String#each no longer exists). It accepts an optional argument (the string separator) that defaults to the newline:

"this is
a string
on multiple
lines".each {|line| puts line }

The output of this one-liner is:

this is
a string
on multiple
lines

This is perhaps not exactly what you expected from the each method when it's applied to strings. Perhaps you were expecting to be able to iterate over every single character. Doing so is possible by employing the each_byte iterator:

"just a string".each_byte {|c| print c, " " }

which prints:

106 117 115 116 32 97 32 115 116 114 105 110 103

This is probably still not what you want. So you need to convert the numbers to their ASCII character representation through the chr method:

"just a string".each_byte {|c| print c.chr, " " }

And obtain:

j u s t   a   s t r i n g

Yes, the preceding example adds a final space at the end of the output.

Alternatively, you could have used the printf method, passing it the "%c" format string along with each byte.
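A possible version of the printf approach is sketched below. The second half uses format (which returns the string instead of printing it) together with join, as one way to avoid the trailing space; the chars array is an illustrative name.

```ruby
# printf converts each byte to its character through the %c directive.
"just a string".each_byte { |c| printf("%c ", c) }
puts   # finishes the line started by printf

# format builds the same characters as strings, and join places
# a space only between them, so no trailing space is produced.
chars = []
"just a string".each_byte { |c| chars << format("%c", c) }
puts chars.join(" ")   # Prints j u s t   a   s t r i n g
```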

The each method is particularly useful when dealing with files:

# A very simple quine

File.open("quine.rb") do |f|
  f.each {|line| puts line }
end

If you save the code of that snippet in a file called quine.rb and run it, the program prints its own source code, which makes it a very straightforward form of quine.

You might also notice that blocks are often employed when working with files by passing them to the class method File.open. In a similar way, you could write to a file:

File.open("myfile.txt", "w") do |f|
  5.times { f.puts "Let's add a string" }
end

The "w" specifies that the file is accessible for writing and it should be created if it doesn't exist. If the file already exists, it is overwritten. If you'd like to append instead, use the "a" argument.
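The difference between the "w" and "a" modes can be checked with a small experiment. This sketch writes to a temporary directory (through the standard tmpdir library) so that it doesn't touch your own files; the appended and overwritten variable names are made up for the example.

```ruby
require "tmpdir"   # standard library, used only to get a scratch directory

appended = nil
overwritten = nil

Dir.mktmpdir do |dir|
  path = File.join(dir, "myfile.txt")

  File.open(path, "w") { |f| f.puts "first" }    # creates the file
  File.open(path, "a") { |f| f.puts "second" }   # appends to what is there
  appended = File.readlines(path)

  File.open(path, "w") { |f| f.puts "third" }    # discards the old contents
  overwritten = File.readlines(path)
end

p appended      # Prints ["first\n", "second\n"]
p overwritten   # Prints ["third\n"]
```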

The each_with_index iterator exists for files as well:

File.open("myfile.txt") do |f|
  f.each_with_index do |line, index|
    puts "#{index}: #{line}"
  end
end

This prints each line contained in the myfile.txt file with its index. For example, if you ran the previous "write on file" snippet from the same folder, you'd obtain:

0: Let's add a string
1: Let's add a string
2: Let's add a string
3: Let's add a string
4: Let's add a string

ri File will tell you a whole lot more about the File class.

When working in Ruby on Rails you will often deal with arrays. So the next section takes a closer look at some other common iterators that are available for instances of Array (and in most cases, Hash).

4.3.3.3. Common Iterators

Array objects have a map method (alias for collect) that creates a new array containing the values returned by the associated block:

[1,2,3,4].map {|n| n**2 }        # [1, 4, 9, 16]
[1,2,3,4].collect {|n| n**2 }    # [1, 4, 9, 16]

It is the Ruby equivalent of Enumerable.Select in .NET. This is particularly useful when you want to obtain a new array by uniformly altering each element of another array as shown here:

def capitalize_names(list)
  list.map {|name| name.capitalize }
end

names = ["matz", "david", "antonio"]
cap_names = capitalize_names(names)   # ["Matz", "David", "Antonio"]

Notice how map doesn't alter the original names array because, as is common in Ruby, it returns a new array rather than modifying the receiver. The equivalent methods map! and collect! actually modify the receiver:

names.map! {|name| name.capitalize }
p names                               # Prints ["Matz", "David", "Antonio"]
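If you want to convince yourself of the difference, the equal? method (which compares object identity) makes it visible. This is only a sketch; the langs and shouted variables are illustrative.

```ruby
langs = ["ruby", "python"]

shouted = langs.map { |name| name.upcase }
p shouted                 # Prints ["RUBY", "PYTHON"]
p langs                   # Prints ["ruby", "python"]
p shouted.equal?(langs)   # Prints false: map built a new array

result = langs.map! { |name| name.upcase }
p langs                   # Prints ["RUBY", "PYTHON"]
p result.equal?(langs)    # Prints true: map! changed and returned the receiver
```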

Using the select iterator, you can create a new array by selecting elements based on the given criteria:

numbers = [*1..10]                    # [1,2,3,4,5,6,7,8,9,10]
evens = numbers.select {|x| x % 2 == 0 }
p evens                               # Prints [2, 4, 6, 8, 10]

If you are an efficiency geek, in Ruby you can use x & 1 == 0 when testing for evenness, too.

The select method can be employed to implement the classical Quicksort algorithm, as shown in Listing 4-2 (quicksort.rb).

Example 4.2. Quicksort Using Array#select
def qsort(array)
  return [] if array.empty?
  pivot, *tail = array
  less = tail.select {|el| el < pivot }
  greater = tail.select {|el| el >= pivot }
  qsort(less) + [pivot] + qsort(greater)
end

a = [2, 7, 9, 1, 3, 5, 2, 10]
p qsort(a)                            # Prints [1, 2, 2, 3, 5, 7, 9, 10]
puts qsort(a) == a.sort               # Prints true

The third line assigns the first element of the array to pivot, and the rest of the array to the variable tail.

If you are not familiar with the Quicksort algorithm, feel free to skip this example.

The opposite of select is reject, which returns only elements for which the block is not true. Somewhat similarly, arrays have the delete_if method that removes elements for which the block evaluates to true (from the receiver array, not a copy):

numbers = [*1..10]
numbers.delete_if {|x| (x&1).zero? }  # Returns [1, 3, 5, 7, 9]
p numbers                             # Prints [1, 3, 5, 7, 9]
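For comparison, this sketch shows reject applied to the same range of numbers. Unlike delete_if, it leaves the receiver untouched (the odds variable is an illustrative name):

```ruby
numbers = [*1..10]
odds = numbers.reject { |x| x % 2 == 0 }   # keeps elements for which the block is false

p odds           # Prints [1, 3, 5, 7, 9]
p numbers.size   # Prints 10, because reject returned a new array
```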

The Enumerable module also provides the Array class with the partition method, which returns two arrays: the first contains the elements for which the block evaluates to true, and the second contains the elements for which it evaluates to false:

numbers = [*1..10]
p numbers.partition {|x| (x&1).zero? } # Prints [[2, 4, 6, 8, 10], [1, 3, 5, 7, 9]]

You can use this method to make Listing 4-2 even more concise, as shown in Listing 4-3 (quicksort2.rb).

Example 4.3. Quicksort Using Enumerable#partition
def qsort(array)
  return [] if array.empty?
  pivot, *tail = array
  less, greater = tail.partition {|el| el < pivot }
  qsort(less) + [pivot] + qsort(greater)
end

a = [2, 7, 9, 1, 3, 5, 2, 10]
p qsort(a)                            # Prints [1, 2, 2, 3, 5, 7, 9, 10]
puts qsort(a) == a.sort               # Prints true

tail.partition creates an array containing two arrays: the first with elements less than the pivot, and the second with elements greater than or equal to the pivot. The parallel assignment assigns the first element (an array) to the less variable, and the second element (an array as well) to the greater variable.

Another iterator worth mentioning is Enumerable#inject, which is sometimes known as reduce, fold, or aggregate in other languages. As a matter of fact, it's the Ruby equivalent of Enumerable.Aggregate in .NET 3.5.

This is the description of the method taken from the output of ri Enumerable#inject:

------------------------------------------------------ Enumerable#inject
     enum.inject(initial) {| memo, obj | block }  => obj
     enum.inject          {| memo, obj | block }  => obj
------------------------------------------------------------------------
     Combines the elements of _enum_ by applying the block to an
     accumulator value (_memo_) and each element in turn. At each step,
     _memo_ is set to the value returned by the block. The first form
     lets you supply an initial value for _memo_. The second form uses
     the first element of the collection as a the initial value (and
     skips that element while iterating).

The description is exact, but may still appear somewhat confusing unless you're well-versed in functional programming languages. A few examples should help illustrate its usage.

Take this line into consideration (it uses a range, but works equally well with arrays):

puts (0..100).inject {|sum, n| sum + n }   # Prints 5050

The great mathematician Gauss didn't need inject to calculate this. When he was a schoolboy he came up with a formula that easily added up arithmetical series.

The sum parameter is first set to the first element of the receiver (0 in this case). At each iteration n is set to the current element, and the return value of the block is stored in sum. This means that in the first iteration sum is set to 0+1, then to 1+2, then 3+3, then 6+4, and so on, until the last element (100) has been added.

You could very easily rewrite the factorial calculation seen before by using the inject method. The factorial of both 0 and 1 is 1, so you can pass 1 as the argument to inject, where it acts as the initial value for the calculation (n is assumed to hold a non-negative integer):

(2..n).inject(1) {|fact, x| fact * x }
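Wrapped in a method, the same expression can be tried directly. The factorial method name is an assumption made for this sketch:

```ruby
# For n smaller than 2 the range is empty, so inject simply
# returns the initial value 1.
def factorial(n)
  (2..n).inject(1) { |fact, x| fact * x }
end

puts factorial(0)    # Prints 1
puts factorial(1)    # Prints 1
puts factorial(10)   # Prints 3628800
```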

Running ri Hash will tell you which iterators are available for Hash objects. collect, select, reject, delete_if, and partition, to name but a few, are all available. They usually accept two block parameters/variables instead of one: one for the key and the other for the value. The following is an example of Hash#select and Hash#delete_if usage (in Ruby 1.8, Hash#select returns an array of key-value pairs, as the comments show, whereas from Ruby 1.9 onward it returns a Hash):

hash = { "a" => 1, "b" => 2, "c" => 3, "d" => 4 }
p hash.select {|k,v| k > "b" }    # Prints [["c", 3], ["d", 4]]
p hash.select {|k,v| v < 3 }      # Prints [["a", 1], ["b", 2]]
p hash.delete_if {|k,v| v < 3 }   # Prints {"c"=>3, "d"=>4}

Array and Hash both include the Enumerable module, or in Ruby speak, they "mix in" its methods. Many instance methods are made available by Enumerable, but the following is a list (almost complete) of common ones:

  • Plain iterators: each_with_index, each_cons, and each_slice.

  • Methods that return true or false: include? (and member?), any?, and all?.

  • Filter methods: detect (and find), select (and find_all), reject, and grep.

  • Methods that transform a collection by either directly modifying it or by returning an altered copy of the receiver: map (and collect), partition, sort, sort_by, zip, and to_a (and entries).

  • Aggregators: inject and sum.

  • Summarizers: max and min.

Use the ri tool to look up those that haven't been illustrated in this and the previous chapter (for example, ri Enumerable#find).
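As a quick tour, the sketch below exercises several of the methods listed above on small, made-up arrays. It assumes Ruby 1.8.7 or later, where each_slice and each_cons return an enumerator when called without a block.

```ruby
nums = [3, 8, 1, 6]

# Plain iterators
slices = nums.each_slice(2).to_a
pairs  = nums.each_cons(2).to_a
p slices   # Prints [[3, 8], [1, 6]]
p pairs    # Prints [[3, 8], [8, 1], [1, 6]]

# Methods that return true or false
p nums.include?(8)          # Prints true
p nums.any? { |n| n > 7 }   # Prints true
p nums.all? { |n| n > 1 }   # Prints false

# Filter methods
first_even = nums.detect { |n| n % 2 == 0 }
a_words    = %w[apple bob avocado].grep(/^a/)
p first_even   # Prints 8
p a_words      # Prints ["apple", "avocado"]

# Methods that return a transformed copy
by_length = %w[pear fig banana].sort_by { |w| w.length }
zipped    = nums.zip(%w[a b c d])
p by_length   # Prints ["fig", "pear", "banana"]
p zipped      # Prints [[3, "a"], [8, "b"], [1, "c"], [6, "d"]]

# Summarizers
p nums.max   # Prints 8
p nums.min   # Prints 1
```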

Before moving on to the creation of your own iterator methods, think about how you'd implement all these little snippets of code in C#, Visual Basic, or in any other language you're accustomed to. Chances are that you'll find Ruby far more direct, concise, and easy to use.

4.3.3.4. Defining Your Own Iterators

A common characteristic of all the iterator methods seen so far is that they invoke the associated block of code for each element of a given sequence. As mentioned before, there can also be methods that expect and invoke the execution of a block of code, but don't actually loop (and they too are sometimes broadly and improperly called iterators).

At the heart of both, there is the yield statement, which enables the invoked method to temporarily pass the control to the block for execution. For example, the following method executes the associated block three times:

def three_times
  yield
  yield
  yield
end

three_times { puts "hello" }

It is important to understand that yield temporarily passes the control to the block of code, but when the last line of code in the block gets executed, the control is passed back to the method. The following modified version of the preceding example illustrates this:

def three_times
  puts "In the method"
  yield
  yield
  yield
  puts "In the method again"
end

three_times { puts "In the block" }

which prints:

In the method
In the block
In the block
In the block
In the method again

Remember the previous snippets in which most iterators allowed you to use one argument in the block (or two in the case of hashes)? You can pass argument values to the associated block by following your yield statements with a list of values (or expressions, to be more exact):

def three_times
  yield 1, 2
  yield 3, 4
  yield 5, 6
end

three_times {|a, b| puts a + b }

And this prints to the standard output:

3
7
11

This example illustrates the point, but it's rather silly, given that your real code will most likely need to perform something much more useful than that. So assume for a moment that you'd like to have an iterator for ranges that passes only even values to the associated block. You can easily implement it as follows:

def each_even(range)
  range.each do |n|
    yield n if (n&1).zero?
  end
end

each_even(1..10) {|x| print x, " " }   # Prints 2 4 6 8 10

It is worth noting that the class Range has an instance method called step, which could be used instead.
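A sketch of the Range#step alternative follows. Note that it only yields even numbers if the range itself starts on one, which is why the example begins at 2 rather than 1 (the evens and odds arrays are illustrative names):

```ruby
evens = []
(2..10).step(2) { |x| evens << x }   # every second element, starting from 2
p evens   # Prints [2, 4, 6, 8, 10]

odds = []
(1..10).step(2) { |x| odds << x }    # the same step from 1 yields odd numbers
p odds    # Prints [1, 3, 5, 7, 9]
```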

You can verify if a block was passed to the method through the block_given? method (globally accessible, because it's defined in Kernel) as shown here:

def n_times(n)
  if block_given?
    n.times { yield }
  else
    puts "I'm blockless"
  end
end

n_times(5) { puts "oh hi" }            # Prints 5 times oh hi
n_times(5)                             # Prints "I'm blockless"

Here n.times passes an argument whose value goes from zero to n-1 during the execution, as seen before, but in this specific case it is ignored. Changing the line n.times { yield } to n.times {|val| yield val } makes that value available to the block passed to your custom method (for example, n_times(5) {|x| puts x }).
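With that one-line change applied, the method looks like the following sketch (the seen array is only there to collect the yielded values):

```ruby
def n_times(n)
  if block_given?
    n.times { |val| yield val }   # forward the counter to the caller's block
  else
    puts "I'm blockless"
  end
end

seen = []
n_times(3) { |x| seen << x }
p seen   # Prints [0, 1, 2]
```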

yield also has a rough equivalent: declare an explicit block parameter as the last one in the method's parameter list, prefixed with an ampersand character (&), and then invoke it through the call method. The previous range example can therefore be written as follows:

def each_even(range, &block)
  range.each do |n|
    block.call n if (n&1).zero?
  end
end

each_even(1..10) {|x| print x, " " }   # Prints 2 4 6 8 10

It is usually fine to opt for yield instead, but the "&block and block.call" approach has an advantage in situations where you need more control, given that you have an actual object (a Proc one) to use as a point of reference, rather than just relying on a keyword (yield) to handle control from the method.
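One thing an explicit block parameter allows, and yield alone does not, is keeping the block around for later use. The remember method and greeter variable in this sketch are hypothetical names chosen for the illustration:

```ruby
# Returns the block it receives as a Proc object, so that the
# caller can store it and invoke it whenever needed.
def remember(&block)
  block
end

greeter = remember { |name| "Hello, #{name}!" }

puts greeter.class           # Prints Proc
puts greeter.call("Matz")    # Prints Hello, Matz!
puts greeter.call("David")   # Prints Hello, David!
```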

4.3.3.5. Proc.new and lambda

As mentioned before, blocks are essentially subroutines that are associated with a method and as such, cannot exist on their own:

my_block = { puts "don't do this" }   # syntax error

Thankfully, Ruby offers a way to explicitly convert a "standalone block" into an actual Proc object through the Proc.new and Kernel#lambda methods.

When a block is passed to a method, this gets instantiated as a Proc object as well.

Take a look at the following example:

add = lambda {|x, y| x + y }           # Returns a Proc object
sum = Proc.new {|x, y| x + y }         # Returns a Proc object
puts add.call(3, 5)                    # Prints 8
puts sum.call(3, 5)                    # Prints 8

Both methods are used to create anonymous methods (or functions, if you prefer) that can be invoked (with parameters in this case) and reused in your programs. When in the previous section you gave a name to the block, specifying it as an argument prefixed by an ampersand character, and then you called it with block.call, you were working with a Proc object.

You can use Proc objects as arguments for iterators that expect a block, by prefixing them with an ampersand:

addition = Proc.new {|sum, x| sum + x }
puts [1,2,3,4,5].inject(&addition)    # Prints 15

Blocks and procs act as closures because they can access variables that have been defined outside of their scope (or to clarify this further, outside of the curly brackets or the do/end pair). This means that they're able to access and modify objects that were defined in the context that invoked them (their binding).

There is actually a method called binding that returns a Binding object. This describes the variables and methods' context when called.
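As a small sketch of what a Binding object captures, the hypothetical capture_context method below returns its own binding, which is then handed to eval in order to read a local variable that would otherwise be out of reach:

```ruby
def capture_context
  secret = 42   # local variable, normally invisible outside the method
  binding       # returns a Binding describing this method's context
end

context = capture_context
value = eval("secret", context)   # evaluates the code within that context

puts context.class   # Prints Binding
puts value           # Prints 42
```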

Behind the scenes, Ruby associates a binding with any block or proc that it creates. This implies that you can have the following:

sum = 0
1.upto(100) {|n| sum += n }
puts sum                              # Prints 5050

Notice how the sum variable can be accessed and modified from within the block. Variables defined inside the block are local to the block and not accessible outside of it:

1.upto(100) {|n| var = n }
puts var                              # Raises a NameError

Please bear in mind that block scoping will be substantially revisited in the next version of Ruby (Ruby 1.9).

In the following example, the method n_power returns a Proc object created through lambda. The proc retains the value (3) of the argument n originally passed to n_power. When the proc is invoked, that value is used along with the val parameter (whose value is 5) passed to the call method in order to execute the calculation in the body of the proc:

def n_power(n)
  lambda {|val| val ** n }
end

cubed = n_power(3)
puts cubed.call(5)                    # Prints 125

The following shows how a Proc "remembers" and can modify its binding, even when invoked several times:

def make_counter(n = 0)
  lambda { n += 1 }
end

c1 = make_counter
c1.call                               # 1
c1.call                               # 2
c1.call                               # 3

Note how the counter gets incremented because lambda created a closure that's able to keep the state of the argument n (assigned to 0 by default when executing c1 = make_counter) and increment its value at each call. If you were to create a second closure, its local variables would be independent from c1, which has a different binding:

c2 = make_counter
c2.call                               # 1
c2.call                               # 2

Both Proc.new and Kernel#lambda return a Proc instance and can usually be used almost interchangeably. It's important to be aware of two fundamental differences between these two methods though.

The first difference concerns the behavior of return. Using return within the block of a proc created with lambda returns control to the calling method, which then carries on. Using return within a proc created with Proc.new tries instead to return from the method in which the proc was defined (in the following example, the same method that calls it). The following example shows these different behaviors:

def process_lambda
  puts "In the method"
  p = lambda { return "In the block" }
  puts p.call
  puts "Back in the method"
end

def process_procnew
  puts "In the method"
  p = Proc.new { return "In the block" }
  puts p.call
  puts "Back in the method"
end

Executing process_lambda produces:

In the method
In the block
Back in the method

which is what you would expect in most cases. Executing process_procnew prints the following:

In the method

This surprising result is due to the fact that return within the Proc.new block makes "In the block" the return value of the calling method, which de facto exits that method early. Neither of the remaining puts calls is executed, so nothing else is printed.
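Printing the return value of each method makes the difference easier to observe. The following is a trimmed-down sketch; the return_from_procnew and return_from_lambda method names are made up for this illustration:

```ruby
def return_from_procnew
  p = Proc.new { return "In the block" }
  p.call             # return exits return_from_procnew right here
  "End of method"    # never reached
end

def return_from_lambda
  p = lambda { return "In the block" }
  p.call             # return only exits the lambda itself
  "End of method"    # reached, and used as the method's return value
end

puts return_from_procnew   # Prints In the block
puts return_from_lambda    # Prints End of method
```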

Not only that, but if the proc were to be called outside of a method, it would raise a LocalJumpError:

p = Proc.new { return "In a block" }
p.call                                 # Raises unexpected return (LocalJumpError)

The second difference, perhaps with far fewer implications, is that Proc.new tends to be more liberal in terms of argument passing, whereas lambda acts like a regular method that expects an exact number of arguments (unless the splat operator is used to pack a variable number of arguments into an array). You can see the different behavior in the following example:

p1 = Proc.new {|x, y| x + y }
p2 = lambda {|x, y| x + y }

# 2 arguments as expected
puts p1.call(1,2)     # 3
puts p2.call(1,2)     # 3

# A third unexpected argument
puts p1.call(1,2,3)   # 3
puts p2.call(1,2,3)   # Raises wrong number of arguments (3 for 2) (ArgumentError)
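The same leniency applies when too few arguments are passed. In this sketch the procs return their parameters as an array so that you can see what each one received, and the ArgumentError is rescued so that the script runs to completion (the loose, strict, and outcome names are illustrative):

```ruby
loose  = Proc.new { |x, y| [x, y] }
strict = lambda   { |x, y| [x, y] }

p loose.call(1)         # Prints [1, nil]: the missing argument becomes nil
p loose.call(1, 2, 3)   # Prints [1, 2]: the extra argument is discarded

outcome = begin
  strict.call(1)        # a lambda insists on exactly two arguments
rescue ArgumentError
  "ArgumentError raised"
end
puts outcome            # Prints ArgumentError raised
```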

At this point you may wonder why so many pages have been devoted to covering concepts like blocks, iterators, procs, and closures. It's because these are so fundamental to Ruby (and Rails) programming that, if the chapter were to abruptly end here, understanding these concepts would still place you ahead of many Rails beginners.

It was important to spell out a few more advanced details. That said, this section is admittedly quite heavy in terms of details and you should be able to get by even if you don't remember all of them, as long as you get the general idea. You can breathe a sigh of relief as the chapter progresses toward other topics.
