Blocks

The use of blocks is fundamental to the use of iterators. In the previous section, we focused on iterators as a kind of looping construct. Blocks were implicit to our discussion but were not the subject of it. Now we turn our attention to the blocks themselves. The subsections that follow explain:

  • The syntax for associating a block with a method invocation

  • The “return value” of a block

  • The scope of variables in blocks

  • The difference between block parameters and method parameters

Block Syntax

Blocks may not stand alone; they are only legal following a method invocation. You can, however, place a block after any method invocation; if the method is not an iterator and never invokes the block with yield, the block will be silently ignored. Blocks are delimited with curly braces or with do and end keywords. The opening curly brace or the do keyword must be on the same line as the method invocation, or else Ruby interprets the line terminator as a statement terminator and invokes the method without the block:

# Print the numbers 1 to 10
1.upto(10) {|x| puts x }   # Invocation and block on one line with braces
1.upto(10) do |x|          # Block delimited with do/end
  puts x
end
1.upto(10)                 # No block specified
 {|x| puts x }             # Syntax error: block not after an invocation
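
The claim that a block is silently ignored by a method that never yields can be checked with a small sketch. The method name no_yield is hypothetical and exists only for this illustration:

```ruby
# Hypothetical method that never calls yield
def no_yield
  :done
end

# The block is legal here, but its body is never executed
result = no_yield { raise "this block is never run" }
p result   # the method's own return value, :done
```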

One common convention is to use curly braces when a block fits on a single line, and to use do and end when the block extends over multiple lines. This is not completely a matter of convention, however; the Ruby parser binds { tightly to the token that precedes it. If you omit the parentheses around method arguments and use curly brace delimiters for a block, then the block will be associated with the last method argument rather than the method itself, which is probably not what you want. To avoid this case, put parentheses around the arguments or delimit the block with do and end:

1.upto(3) {|x| puts x }    # Parens and curly braces work
1.upto 3 do |x| puts x end # No parens, block delimited with do/end
1.upto 3 {|x| puts x }     # Syntax Error: trying to pass a block to 3!
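
The same binding difference appears when one method call is the unparenthesized argument of another. The following is a minimal sketch, assuming a modern Ruby (1.9 or later); the helper show is hypothetical and simply reports whether it received the block itself:

```ruby
# Hypothetical helper: reports whether it, rather than map, received the block
def show(value)
  block_given? ? "show got the block" : "show got #{value.inspect}"
end

with_braces = show [1, 2].map { |x| x * 2 }      # braces bind to map
with_do_end = show [1, 2].map do |x| x * 2 end   # do/end binds to show

puts with_braces   # show got [2, 4]
puts with_do_end   # show got the block
```

With parentheses around the argument, as in show([1, 2].map { |x| x * 2 }), the association is unambiguous.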

Blocks can be parameterized, just as methods can. Block parameters are separated with commas and delimited with a pair of vertical bar (|) characters, but they are otherwise much like method parameters (see Block Parameters in Ruby 1.9 for details):

# The Hash.each iterator passes two arguments to its block
hash.each do |key, value|   # For each (key,value) pair in the hash
  puts "#{key}: #{value}"   # Print the key and the value
end                         # End of the block

It is a common convention to write the block parameters on the same line as the method invocation and the opening brace or do keyword, but this is not required by the syntax.

The Value of a Block

In the iterator examples shown so far in this chapter, the iterator method has yielded values to its associated block but has ignored the value returned by the block. This is not always the case, however. Consider the Array.sort method. If you associate a block with an invocation of this method, it will yield pairs of elements to the block, and it is the block’s job to compare them. The block’s return value (–1, 0, or 1) indicates the ordering of the two arguments. The “return value” of the block is available to the iterator method as the value of the yield statement.

The “return value” of a block is simply the value of the last expression evaluated in the block. So, to sort an array of words from longest to shortest, we could write:

# The block takes two words and "returns" their relative order
words.sort! {|x,y| y.length <=> x.length}
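
Seen from the iterator's side, the block's value is simply what yield evaluates to. Here is a minimal sketch; larger_of is a hypothetical method written for this illustration, not part of Ruby:

```ruby
# Hypothetical iterator: returns the larger of two items, as judged by the block
def larger_of(a, b)
  order = yield(a, b)      # the value of the block's last expression
  order >= 0 ? a : b
end

longest = larger_of("kiwi", "banana") { |x, y| x.length <=> y.length }
puts longest               # banana
```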

We’ve been placing the phrase “return value” in quotes for a very important reason: you should not normally use the return keyword to return from a block. A return inside a block causes the containing method (not the iterator method that yields to the block, but the method that the block is part of) to return. There are, of course, times when this is exactly what you want to do. But don’t use return if you just want to return from a block to the method that called yield. If you need to force a block to return to the invoking method before it reaches the last expression, or if you want to return more than one value, you can use next instead of return. (return, next, and the related statement break are explained in detail in Altering Control Flow.) Here is an example that uses next to return from the block:

array.collect do |x|
  next 0 if x == nil  # Return prematurely if x is nil
  next x, x*x         # Return two values
end

Note that it is not particularly common to use next in this way, and the code above is easily rewritten without it:

array.collect do |x|
  if x == nil
    0
  else
    [x, x*x]
  end
end
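
The difference between return and next inside a block can be seen side by side in the following sketch. Both method names are hypothetical:

```ruby
# return inside the block returns from first_negative itself,
# ending the iteration early
def first_negative(numbers)
  numbers.each do |n|
    return n if n < 0
  end
  nil                      # reached only if no negative number was found
end

# next only ends the current invocation of the block;
# collect keeps iterating and uses the value given to next
def squares_or_zero(numbers)
  numbers.collect do |n|
    next 0 if n.nil?
    n * n
  end
end

p first_negative([3, -1, -5])     # -1
p squares_or_zero([2, nil, 3])    # [4, 0, 9]
```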

Blocks and Variable Scope

Blocks define a new variable scope: variables created within a block exist only within that block and are undefined outside of the block. Be cautious, however; the local variables in a method are available to any blocks within that method. So if a block assigns a value to a variable that is already defined outside of the block, this does not create a new block-local variable but instead assigns a new value to the already-existing variable. Sometimes, this is exactly the behavior we want:

total = 0   
data.each {|x| total += x }  # Sum the elements of the data array
puts total                   # Print out that sum
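
Both halves of this rule can be observed in one place. In this sketch (the method name is hypothetical), total is shared with the block because it already exists, while square is created inside the block and is undefined once the block finishes:

```ruby
def sum_of_squares(data)
  total = 0
  data.each do |x|
    square = x * x         # created in the block: local to the block
    total += square        # already defined outside: shared with the method
  end
  [total, defined?(square)]   # defined? gives nil for an unknown name
end

p sum_of_squares([1, 2, 3])   # [14, nil]
```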

Sometimes, however, we do not want to alter variables in the enclosing scope, but we do so inadvertently. This problem is a particular concern for block parameters in Ruby 1.8. In Ruby 1.8, if a block parameter shares the name of an existing variable, then invocations of the block simply assign a value to that existing variable rather than creating a new block-local variable. The following code, for example, is problematic because it uses the same identifier i as the block parameter for two nested blocks:

1.upto(10) do |i|         # 10 rows
  1.upto(10) do |i|       # Each has 10 columns
    print "#{i} "         # Print column number
  end
  print " ==> Row #{i}\n" # Try to print row number, but get column number
end

Ruby 1.9 is different: block parameters are always local to their block, and invocations of the block never assign values to existing variables. If Ruby 1.9 is invoked with the -w flag, it will warn you if a block parameter has the same name as an existing variable. This helps you avoid writing code that runs differently in 1.8 and 1.9.
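
Under the Ruby 1.9 rules, the nested-block example behaves as intended, because each block gets its own i. A sketch, assuming Ruby 1.9 or later (the method name row_labels is hypothetical, and output is collected into strings so it is easy to inspect):

```ruby
def row_labels(rows, cols)
  labels = []
  1.upto(rows) do |i|          # outer i: the row number
    columns = []
    1.upto(cols) do |i|        # inner i: a separate variable that shadows the outer one
      columns << i
    end
    labels << "#{columns.join(' ')} ==> Row #{i}"   # outer i is unchanged
  end
  labels
end

puts row_labels(2, 3)
# 1 2 3 ==> Row 1
# 1 2 3 ==> Row 2
```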

Ruby 1.9 is different in another important way, too. Block syntax has been extended to allow you to declare block-local variables that are guaranteed to be local, even if a variable by the same name already exists in the enclosing scope. To do this, follow the list of block parameters with a semicolon and a comma-separated list of block local variables. Here is an example:

x = y = 0            # local variables
1.upto(4) do |x;y|   # x and y are local to block
                     # x and y "shadow" the outer variables
  y = x + 1          # Use y as a scratch variable
  puts y*y           # Prints 4, 9, 16, 25
end
[x,y]                # => [0,0]: block does not alter these

In this code, x is a block parameter: it gets a value when the block is invoked with yield. y is a block-local variable. It does not receive any value from a yield invocation, but it has the value nil until the block actually assigns some other value to it. The point of declaring these block-local variables is to guarantee that you will not inadvertently clobber the value of some existing variable. (This might happen if a block is cut-and-pasted from one method to another, for example.) If you invoke Ruby 1.9 with the -w option, it will warn you if a block-local variable shadows an existing variable.

Blocks can have more than one parameter and more than one local variable, of course. Here is a block with two parameters and three local variables:

hash.each {|key,value; i,j,k| ... }
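
A short sketch, assuming Ruby 1.9 or later, shows that a block-local variable starts out as nil on every invocation of the block and never touches an outer variable of the same name. The method and variable names are hypothetical:

```ruby
def scratch_values(items)
  scratch = :outer
  seen = []
  items.each do |n; scratch|   # scratch is declared block-local
    seen << scratch            # nil at the start of each invocation
    scratch = n * 10           # assigns the block-local, not the outer variable
  end
  [seen, scratch]
end

p scratch_values([1, 2])       # [[nil, nil], :outer]
```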

Passing Arguments to a Block

We’ve said previously that the parameters to a block are much like the parameters to a method. They are not strictly the same, however. The argument values that follow a yield keyword are assigned to block parameters following rules that are closer to the rules for variable assignment than to the rules for method invocation. Thus, when an iterator executes yield k,v to invoke a block declared with parameters |key, value|, it is equivalent to this assignment statement:

key,value = k,v

The Hash.each_pair iterator yields a key/value pair like this:[*]

{:one=>1}.each_pair {|key,value| ... } # key=:one, value=1

In Ruby 1.8, it is even more clear that block invocation uses variable assignment. Recall that in Ruby 1.8 parameters are only local to the block if they are not already in use as local variables of the containing method. If they are already local variables, then they are simply assigned to. In fact, Ruby 1.8 allows any kind of variable to be used as a block parameter, including global variables and instance variables:

{:one=>1}.each_pair {|$key, @value| ... } # No longer works in Ruby 1.9

This iterator sets the global variable $key to :one and sets the instance variable @value to 1. As already noted, Ruby 1.9 makes block parameters local to the block. This also means that block parameters can no longer be global or instance variables.

The Hash.each iterator yields key/value pairs as two elements of a single array. It is very common, however, to see code like this:

hash.each {|k,v| ... }  # key and value assigned to params k and v

This also works by parallel assignment. The yielded value, a two-element array, is assigned to the variables k and v:

k,v = [key, value]

By the rules of parallel assignment (see Parallel Assignment), a single array on the right is expanded and its elements are assigned to the multiple variables on the left.
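
The correspondence can be tried with a hypothetical iterator that, like Hash.each, yields a single two-element array. This is only a sketch; yield_pair is not a real library method:

```ruby
# Hypothetical iterator that yields one two-element array
def yield_pair
  yield [:one, 1]
end

split = yield_pair { |k, v| "#{k.inspect} / #{v.inspect}" }  # elements spread over k and v
whole = yield_pair { |pair| pair.inspect }                   # one parameter gets the whole array

k, v = [:one, 1]     # the equivalent parallel assignment

puts split           # :one / 1
puts whole           # [:one, 1]
```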

Block invocation does not work exactly like parallel assignment. Imagine an iterator that passes two values to its block. By the rules of parallel assignment, we might expect to be able to declare a block with a single parameter and have the two values automatically filled into an array for us. But it does not work that way:

def two; yield 1,2; end  # An iterator that yields two values
two {|x| p x }     # Ruby 1.8: warns and prints [1,2],
two {|x| p x }     # Ruby 1.9: prints 1, no warning
two {|*x| p x }    # Either version: prints [1,2]; no warning
two {|x,| p x }    # Either version: prints 1; no warning

In Ruby 1.8, multiple arguments are packed into an array when there is a single block parameter, but this is deprecated and generates a warning message. In Ruby 1.9, the first value yielded is assigned to the block parameter and the second value is silently discarded. If we want multiple yielded values to be packed into an array and assigned to a single block parameter, we must explicitly indicate this by prefixing the parameter with an *, exactly as we’d do in a method declaration. (See Chapter 6 for a thorough discussion of method parameters and method declaration.) Also note that we can explicitly discard the second yielded value by declaring a block parameter list that ends with a comma, as if to say: “There is another parameter, but it is unused and I can’t be bothered to pick a name for it.”

Although block invocation does not behave like parallel assignment in this case, it does not behave like method invocation, either. If we declare a method with one argument and then pass two arguments to it, Ruby doesn’t just print a warning, it raises an error.
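
The contrast between the two behaviors can be sketched as follows, assuming Ruby 1.9 or later. The method one_arg is hypothetical:

```ruby
# An iterator that yields two values
def two
  yield 1, 2
end

# Hypothetical method that accepts exactly one argument
def one_arg(x)
  x
end

first  = two { |x| x }     # block: the surplus value is dropped
packed = two { |*x| x }    # block: the splat collects both values

# A method is stricter: a surplus argument raises ArgumentError
error = begin
  one_arg(1, 2)
rescue ArgumentError => e
  e.class
end

p first, packed, error     # 1, [1, 2], ArgumentError
```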

The yield statement allows bare hashes as the last argument value, just as method invocations (see Hashes for Named Arguments) do. That is, if the last argument to yield is a hash literal, you may omit the curly braces. Because it is not common for iterators to yield hashes, we have to contrive an example to demonstrate this:

def hashiter; yield :a=>1, :b=>2; end  # Note no curly braces
hashiter {|hash| puts hash[:a] }       # Prints 1

Block Parameters in Ruby 1.9

In Ruby 1.8, only the last block parameter may have an * prefix. Ruby 1.9 lifts this restriction and allows any one block parameter, regardless of its position in the list, to have an * prefix:

def five; yield 1,2,3,4,5; end     # Yield 5 values
five do |head, *body, tail|        # Extra values go into body array
  print head, body, tail           # Prints "1[2, 3, 4]5"
end
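
When fewer values are yielded, the named parameters are filled first and the * parameter receives whatever is left, possibly nothing. A sketch, assuming Ruby 1.9 or later; yield_all is a hypothetical helper that yields exactly the values it is given:

```ruby
# Hypothetical helper: yields whatever values it is given
def yield_all(*values)
  yield(*values)
end

five = yield_all(1, 2, 3, 4, 5) { |head, *body, tail| [head, body, tail] }
two  = yield_all(1, 2)          { |head, *body, tail| [head, body, tail] }
one  = yield_all(1)             { |head, *body, tail| [head, body, tail] }

p five   # [1, [2, 3, 4], 5]
p two    # [1, [], 2]
p one    # [1, [], nil]
```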

In Ruby 1.9 block parameters can have default values just like method parameters can. Suppose, for example, that you want to iterate the values of an object o but you don’t know if o is an array or a hash. You could use a block like this:

o.each {|key=nil,value| puts value}

If the each iterator yields a single value, it is assigned to the second block parameter. If each yields a pair of values, they are assigned to both parameters.
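
Wrapped in a hypothetical method so the result can be inspected, the same block handles both kinds of collection. This sketch assumes Ruby 1.9 or later:

```ruby
# Hypothetical helper: collects the values of an array or a hash
def values_of(o)
  values = []
  o.each { |key = nil, value| values << value }
  values
end

p values_of([10, 20])            # each yields one value: key stays nil
p values_of({:a => 1, :b => 2})  # each yields a pair: key and value both assigned
```

The first call gives [10, 20] and the second gives [1, 2].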

In Ruby 1.9, the final block parameter may be prefixed with & to indicate that it is to receive any block associated with the invocation of the block. Recall, however, that a yield invocation may not have a block associated with it. We’ll learn in Chapter 6 that a block can be converted into a Proc, and blocks can be associated with Proc invocations. The following code example should make sense once you have read Chapter 6:

# This Proc expects a block 
printer = lambda {|&b| puts b.call } # Print value returned by b
printer.call { "hi" }                # Pass a block to the block!
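
A & parameter can also be combined with ordinary block parameters. In this sketch (assuming Ruby 1.9 or later; the name shout is hypothetical), the Proc transforms its argument and hands the result to the block it was called with:

```ruby
# This Proc takes one ordinary argument and a block
shout = lambda { |word, &b| b.call(word.upcase) }

result = shout.call("hi") { |s| s + "!" }
puts result    # HI!
```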


[*] The Ruby 1.8 each_pair yields two separate values to the block. In Ruby 1.9, the each_pair iterator is a synonym for each and passes a single array argument, as will be explained shortly. The code shown here, however, works correctly in both versions.
