The use of blocks is fundamental to the use of iterators. In the previous section, we focused on iterators as a kind of looping construct. Blocks were implicit in our discussion but were not the subject of it. Now we turn our attention to the blocks themselves. The subsections that follow explain:
The syntax for associating a block with a method invocation
The “return value” of a block
The scope of variables in blocks
The difference between block parameters and method parameters
Blocks may not stand alone; they are only legal following a method invocation. You can, however, place a block after any method invocation; if the method is not an iterator and never invokes the block with yield, the block will be silently ignored. Blocks are delimited with curly braces or with do and end keywords. The opening curly brace or the do keyword must be on the same line as the method invocation, or else Ruby interprets the line terminator as a statement terminator and invokes the method without the block:
# Print the numbers 1 to 10
1.upto(10) {|x| puts x }   # Invocation and block on one line with braces
1.upto(10) do |x|          # Block delimited with do/end
  puts x
end
1.upto(10)                 # No block specified
 {|x| puts x }             # Syntax error: block not after an invocation
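The claim that a block attached to a non-iterator is silently ignored can be checked with a small sketch. The two methods below, ignores_block and uses_block, are hypothetical helpers invented for this illustration; they are not part of Ruby.

```ruby
# Hypothetical helper methods, defined only for this illustration.
def ignores_block
  :done                        # Never calls yield, so any block is ignored
end

def uses_block
  yield                        # Invokes the associated block once
end

calls = 0
ignores_block { calls += 1 }   # Legal, but the block never runs
uses_block    { calls += 1 }   # The block runs once
puts calls                     # Prints 1
```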
One common convention is to use curly braces when a block fits on a single line, and to use do and end when the block extends over multiple lines. This is not completely a matter of convention, however; the Ruby parser binds { tightly to the token that precedes it. If you omit the parentheses around method arguments and use curly brace delimiters for a block, then the block will be associated with the last method argument rather than the method itself, which is probably not what you want. To avoid this case, put parentheses around the arguments or delimit the block with do and end:
1.upto(3) {|x| puts x }    # Parens and curly braces work
1.upto 3 do |x| puts x end # No parens, block delimited with do/end
1.upto 3 {|x| puts x }     # Syntax Error: trying to pass a block to 3!
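When the last argument is itself a method invocation rather than a literal, the same binding rule produces no syntax error; the block simply goes to a different method. The following is a minimal sketch of that difference. The method block_receiver is a hypothetical helper introduced here only to report whether it was handed a block.

```ruby
# Hypothetical method that reports whether it received a block.
def block_receiver(value)
  block_given? ? :outer_method : :argument_expression
end

list = [1, 2, 3]

# Braces bind to the nearest invocation, list.map, so block_receiver gets no block
with_braces = block_receiver list.map {|x| x * 2 }

# do/end binds to the leftmost invocation, block_receiver
with_do_end = block_receiver list.map do |x| x * 2 end

p with_braces   # Prints :argument_expression
p with_do_end   # Prints :outer_method
```

Writing the parentheses explicitly, as in block_receiver(list.map {|x| x * 2 }), removes any doubt about which method receives the block.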
Blocks can be parameterized, just as methods can. Block parameters are separated with commas and delimited with a pair of vertical bar (|) characters, but they are otherwise much like method parameters (see Block Parameters in Ruby 1.9 for details):
# The Hash.each iterator passes two arguments to its block
hash.each do |key, value|   # For each (key,value) pair in the hash
  puts "#{key}: #{value}"   # Print the key and the value
end                         # End of the block
It is a common convention to write the block parameters on the same line as the method invocation and the opening brace or do keyword, but this is not required by the syntax.
In the iterator examples shown so far in this chapter, the iterator method has yielded values to its associated block but has ignored the value returned by the block. This is not always the case, however. Consider the Array.sort method. If you associate a block with an invocation of this method, it will yield pairs of elements to the block, and it is the block’s job to sort them. The block’s return value (-1, 0, or 1) indicates the ordering of the two arguments. The “return value” of the block is available to the iterator method as the value of the yield statement.
The “return value” of a block is simply the value of the last expression evaluated in the block. So, to sort an array of words from longest to shortest, we could write:
# The block takes two words and "returns" their relative order
words.sort! {|x,y| y.length <=> x.length}
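To see the other side of this exchange, here is a minimal sketch of an iterator that consumes the value of its block. The method keep_if_block is a hypothetical name chosen for this illustration; it is not a method of Array.

```ruby
# Hypothetical iterator: keeps the items for which the block "returns" a true value.
def keep_if_block(items)
  kept = []
  items.each do |item|
    verdict = yield(item)      # The value of yield is the block's last expression
    kept << item if verdict
  end
  kept
end

short = keep_if_block(%w[pear fig banana]) {|w| w.length < 5 }
p short                        # Prints ["pear", "fig"]
```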
We’ve been placing the phrase “return value” in quotes for a very important reason: you should not normally use the return keyword to return from a block. A return inside a block causes the containing method (not the iterator method that yields to the block, but the method that the block is part of) to return. There are, of course, times when this is exactly what you want to do. But don’t use return if you just want to return from a block to the method that called yield. If you need to force a block to return to the invoking method before it reaches the last expression, or if you want to return more than one value, you can use next instead of return. (return, next, and the related statement break are explained in detail in Altering Control Flow.) Here is an example that uses next to return from the block:
array.collect do |x|
  next 0 if x == nil  # Return prematurely if x is nil
  next x, x*x         # Return two values
end
Note that it is not particularly common to use next in this way, and the code above is easily rewritten without it:
array.collect do |x|
  if x == nil
    0
  else
    [x, x*x]
  end
end
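The difference between return and next inside a block is easiest to see side by side. The two methods in this sketch, first_negative and clamp_negatives, are hypothetical examples written for this comparison.

```ruby
# return inside the block leaves the containing method, first_negative.
def first_negative(numbers)
  numbers.each do |n|
    return n if n < 0     # Returns from first_negative, not just from the block
  end
  nil                     # Reached only if no element was negative
end

# next inside the block returns a value to the iterator, collect.
def clamp_negatives(numbers)
  numbers.collect do |n|
    next 0 if n < 0       # This block invocation "returns" 0
    n                     # Otherwise the block's value is n itself
  end
end

found   = first_negative([3, -1, -7])
clamped = clamp_negatives([3, -1, -7])
p found                   # Prints -1
p clamped                 # Prints [3, 0, 0]
```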
Blocks define a new variable scope: variables created within a block exist only within that block and are undefined outside of the block. Be cautious, however; the local variables in a method are available to any blocks within that method. So if a block assigns a value to a variable that is already defined outside of the block, this does not create a new block-local variable but instead assigns a new value to the already-existing variable. Sometimes, this is exactly the behavior we want:
total = 0
data.each {|x| total += x }  # Sum the elements of the data array
puts total                   # Print out that sum
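Both halves of the scoping rule can be observed in one short sketch. The variable names here are arbitrary, and the literal array stands in for the data array used above.

```ruby
total = 0
[1, 2, 3].each do |x|
  squared = x * x             # squared is created inside the block
  total += squared            # total already exists, so the outer variable changes
end

visible = defined?(squared)   # nil: squared does not exist outside the block
puts total                    # Prints 14
p visible                     # Prints nil
```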
Sometimes, however, we do not want to alter variables in the enclosing scope, but we do so inadvertently. This problem is a particular concern for block parameters in Ruby 1.8. In Ruby 1.8, if a block parameter shares the name of an existing variable, then invocations of the block simply assign a value to that existing variable rather than creating a new block-local variable. The following code, for example, is problematic because it uses the same identifier i as the block parameter for two nested blocks:
1.upto(10) do |i|         # 10 rows
  1.upto(10) do |i|       # Each has 10 columns
    print "#{i} "         # Print column number
  end
  print " ==> Row #{i} "  # Try to print row number, but get column number
end
Ruby 1.9 is different: block parameters are always local to their block, and invocations of the block never assign values to existing variables. If Ruby 1.9 is invoked with the -w flag, it will warn you if a block parameter has the same name as an existing variable. This helps you avoid writing code that runs differently in 1.8 and 1.9.
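A reduced version of the nested-loop example shows the Ruby 1.9 behavior. This sketch assumes Ruby 1.9 or later; the rows array is introduced here only to record what the outer block sees.

```ruby
rows = []
1.upto(2) do |i|        # Outer parameter i: the row number
  1.upto(3) do |i|      # Inner parameter i is a separate variable
  end
  rows << i             # Still the row number; the inner i did not overwrite it
end
p rows                  # Prints [1, 2]
```

Under the Ruby 1.8 rules described above, both parameters would name the same variable, and the outer block would record the last column number instead.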
Ruby 1.9 is different in another important way, too. Block syntax has been extended to allow you to declare block-local variables that are guaranteed to be local, even if a variable by the same name already exists in the enclosing scope. To do this, follow the list of block parameters with a semicolon and a comma-separated list of block local variables. Here is an example:
x = y = 0            # local variables
1.upto(4) do |x;y|   # x and y are local to block
                     # x and y "shadow" the outer variables
  y = x + 1          # Use y as a scratch variable
  puts y*y           # Prints 4, 9, 16, 25
end
[x,y]                # => [0,0]: block does not alter these
In this code, x is a block parameter: it gets a value when the block is invoked with yield. y is a block-local variable. It does not receive any value from a yield invocation, but it has the value nil until the block actually assigns some other value to it. The point of declaring these block-local variables is to guarantee that you will not inadvertently clobber the value of some existing variable. (This might happen if a block is cut-and-pasted from one method to another, for example.) If you invoke Ruby 1.9 with the -w option, it will warn you if a block-local variable shadows an existing variable.
Blocks can have more than one parameter and more than one local variable, of course. Here is a block with two parameters and three local variables:
hash.each {|key,value; i,j,k| ... }
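The protective effect of a block-local declaration is clearest when the same block is written with and without it. This sketch assumes Ruby 1.9 or later; the variable names shared and guarded are arbitrary.

```ruby
shared  = 0
guarded = 0

1.upto(3) {|x| shared = x }            # No declaration: assigns to the outer variable
1.upto(3) {|x; guarded| guarded = x }  # guarded is declared block-local

p shared                               # Prints 3
p guarded                              # Prints 0
```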
We’ve said previously that the parameters to a block are much like the parameters to a method. They are not strictly the same, however. The argument values that follow a yield keyword are assigned to block parameters following rules that are closer to the rules for variable assignment than to the rules for method invocation. Thus, when an iterator executes yield k,v to invoke a block declared with parameters |key, value|, it is equivalent to this assignment statement:
key,value = k,v
The Hash.each_pair iterator yields a key/value pair like this:[*]
{:one=>1}.each_pair {|key,value| ... } # key=:one, value=1
In Ruby 1.8, it is even clearer that block invocation uses variable assignment. Recall that in Ruby 1.8 parameters are only local to the block if they are not already in use as local variables of the containing method. If they are already local variables, then they are simply assigned to. In fact, Ruby 1.8 allows any kind of variable to be used as a block parameter, including global variables and instance variables:
{:one=>1}.each_pair {|$key, @value| ... } # No longer works in Ruby 1.9
This iterator sets the global variable $key to :one and sets the instance variable @value to 1. As already noted, Ruby 1.9 makes block parameters local to the block. This also means that block parameters can no longer be global or instance variables.
The Hash.each iterator yields key/value pairs as two elements of a single array. It is very common, however, to see code like this:
hash.each {|k,v| ... } # key and value assigned to params k and v
This also works by parallel assignment. The yielded value, a two-element array, is assigned to the variables k and v:
k,v = [key, value]
By the rules of parallel assignment (see Parallel Assignment), a single array on the right is expanded and its elements are assigned to the multiple variables on the left.
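A hand-written iterator makes this expansion visible. In this sketch, yield_pair is a hypothetical method that yields a single two-element array, as Hash.each does, and returns whatever its block returns.

```ruby
# Hypothetical iterator: yields ONE value, a two-element array.
def yield_pair
  yield [:one, 1]
end

# Two parameters: the array is expanded, as in k,v = [:one, 1]
expanded = yield_pair {|k, v| "#{k}=#{v}" }

# One parameter: the array arrives intact
intact = yield_pair {|pair| pair.length }

p expanded   # Prints "one=1"
p intact     # Prints 2
```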
Block invocation does not work exactly like parallel assignment. Imagine an iterator that passes two values to its block. By the rules of parallel assignment, we might expect to be able to declare a block with a single parameter and have the two values automatically filled into an array for us. But it does not work that way:
def two; yield 1,2; end  # An iterator that yields two values
two {|x| p x }     # Ruby 1.8: warns and prints [1,2],
two {|x| p x }     # Ruby 1.9: prints 1, no warning
two {|*x| p x }    # Either version: prints [1,2]; no warning
two {|x,| p x }    # Either version: prints 1; no warning
In Ruby 1.8, multiple arguments are packed into an array when there is a single block parameter, but this is deprecated and generates a warning message. In Ruby 1.9, the first value yielded is assigned to the block parameter and the second value is silently discarded. If we want multiple yielded values to be packed into an array and assigned to a single block parameter, we must explicitly indicate this by prefixing the parameter with an *, exactly as we’d do in a method declaration. (See Chapter 6 for a thorough discussion of method parameters and method declaration.) Also note that we can explicitly discard the second yielded value by declaring a block parameter list that ends with a comma, as if to say: “There is another parameter, but it is unused and I can’t be bothered to pick a name for it.”
Although block invocation does not behave like parallel assignment in this case, it does not behave like method invocation, either. If we declare a method with one argument and then pass two arguments to it, Ruby doesn’t just print a warning, it raises an error.
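The contrast with method invocation can be sketched as follows, assuming Ruby 1.9 or later. The method one_argument is a hypothetical single-parameter method defined for this comparison, and two is the iterator from the example above, repeated here so that the sketch is self-contained.

```ruby
def two; yield 1, 2; end            # An iterator that yields two values
def one_argument(x); x; end         # Hypothetical method with one parameter

from_block = two {|x| x }           # The extra value is silently discarded

begin
  one_argument(1, 2)                # One argument too many for a method
  outcome = :no_error
rescue ArgumentError
  outcome = :argument_error
end

p from_block                        # Prints 1
p outcome                           # Prints :argument_error
```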
The yield statement allows bare hashes as the last argument value, just as method invocations (see Hashes for Named Arguments) do. That is, if the last argument to yield is a hash literal, you may omit the curly braces. Because it is not common for iterators to yield hashes, we have to contrive an example to demonstrate this:
def hashiter; yield :a=>1, :b=>2; end  # Note no curly braces
hashiter {|hash| puts hash[:a] }       # Prints 1
In Ruby 1.8, only the last block parameter may have an * prefix. Ruby 1.9 lifts this restriction and allows any one block parameter, regardless of its position in the list, to have an * prefix:
def five; yield 1,2,3,4,5; end  # Yield 5 values
five do |head, *body, tail|     # Extra values go into body array
  print head, body, tail        # Prints "1[2, 3, 4]5"
end
In Ruby 1.9 block parameters can have default values just like method parameters can. Suppose, for example, that you want to iterate the values of an object o but you don’t know if o is an array or a hash. You could use a block like this:
o.each {|key=nil,value| puts value}
If the each iterator yields a single value, it is assigned to the second block parameter. If each yields a pair of values, they are assigned to both parameters.
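Running the same block against an array and a hash shows both cases. This sketch assumes Ruby 1.9 or later; the seen array is introduced here only to record the parameter values.

```ruby
seen = []

# Array#each yields one value per element: it goes to value, and key keeps its default
[10, 20].each {|key=nil, value| seen << [key, value] }

# Hash#each yields a key/value pair: both parameters are filled
{:a => 1}.each {|key=nil, value| seen << [key, value] }

p seen    # Prints [[nil, 10], [nil, 20], [:a, 1]]
```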
In Ruby 1.9, the final block parameter may be prefixed with & to indicate that it is to receive any block associated with the invocation of the block. Recall, however, that a yield invocation may not have a block associated with it. We’ll learn in Chapter 6 that a block can be converted into a Proc, and blocks can be associated with Proc invocations.
The following code example should make sense once you have read Chapter 6:
# This Proc expects a block
printer = lambda {|&b| puts b.call } # Print value returned by b
printer.call { "hi" }                # Pass a block to the block!
[*] The Ruby 1.8 each_pair yields two separate values to the block. In Ruby 1.9, the each_pair iterator is a synonym for each and passes a single array argument, as will be explained shortly. The code shown here, however, works correctly in both versions.