Functional Programming

Ruby is not a functional programming language in the way that languages like Lisp and Haskell are, but Ruby’s blocks, procs, and lambdas lend themselves nicely to a functional programming style. Any time you use a block with an Enumerable iterator like map or inject, you’re programming in a functional style. Here are examples using the map and inject iterators:

# Compute the average and standard deviation of an array of numbers
mean = a.inject {|x,y| x+y } / a.size
sumOfSquares = a.map{|x| (x-mean)**2 }.inject{|x,y| x+y }
standardDeviation = Math.sqrt(sumOfSquares/(a.size-1))

If the functional programming style is attractive to you, it is easy to add features to Ruby’s built-in classes to facilitate functional programming. The rest of this chapter explores some possibilities for working with functions. The code in this section is dense and is presented as a mind-expanding exploration, not as a prescription for good programming style. In particular, redefining operators as heavily as the code in the next section does is likely to result in programs that are difficult for others to read and maintain!

This is advanced material and the code that follows assumes familiarity with Chapter 7. You may, therefore, want to skip the rest of this chapter the first time through the book.

Applying a Function to an Enumerable

mapand inject are two of the most important iterators defined by Enumerable. Each expects a block. If we are to write programs in a function-centric way, we might like methods on our functions that allow us to apply those functions to a specified Enumerable object:

# This module defines methods and operators for functional programming.
module Functional

  # Apply this function to each element of the specified Enumerable,
  # returning an array of results. This is the reverse of Enumerable.map.
  # Use | as an operator alias. Read "|" as "over" or "applied over".
  # 
  # Example:
  #   a = [[1,2],[3,4]]
  #   sum = lambda {|x,y| x+y}
  #   sums = sum|a   # => [3,7]
  def apply(enum)
    enum.map &self
  end
  alias | apply

  # Use this function to "reduce" an enumerable to a single quantity.
  # This is the inverse of Enumerable.inject.
  # Use <= as an operator alias.
  # Mnemonic: <= looks like a needle for injections
  # Example:
  #   data = [1,2,3,4]
  #   sum = lambda {|x,y| x+y}
  #   total = sum<=data   # => 10
  def reduce(enum)
    enum.inject &self
  end
  alias <= reduce
end

# Add these functional programming methods to Proc and Method classes.
class Proc; include Functional; end
class Method; include Functional; end

Notice that we define methods in a module named Functional, and then we include this module into both the Proc and Method classes. In this way, apply and reduce work for both proc and method objects. Most of the methods that follow also define methods in this Functional module, so that they work for both Proc and Method.

With apply and reduce defined as above, we could refactor our statistical computations as follows:

sum = lambda {|x,y| x+y }        # A function to add two numbers
mean = (sum<=a)/a.size           # Or sum.reduce(a) or a.inject(&sum)
deviation = lambda {|x| x-mean } # Function to compute difference from mean
square = lambda {|x| x*x }       # Function to square a number
standardDeviation = Math.sqrt((sum<=square|(deviation|a))/(a.size-1))

Notice that the last line is succinct but that all the nonstandard operators make it hard to read. Also notice that the | operator is left-associative, even when we define it ourselves. The syntax, therefore, for applying multiple functions to an Enumerable requires parentheses. That is, we must write square|(deviation|a) instead of square|deviation|a.

Composing Functions

If we have two functions f and g, we sometimes want to define a new function h which is f(g()), or f composed with g. We can write a method that performs function composition automatically, as follows:

module Functional
  # Return a new lambda that computes self[f[args]].
  # Use * as an operator alias for compose.
  # Examples, using the * alias for this method.
  # 
  # f = lambda {|x| x*x }
  # g = lambda {|x| x+1 }
  # (f*g)[2]   # => 9
  # (g*f)[2]   # => 5
  # 
  # def polar(x,y)
  #   [Math.hypot(y,x), Math.atan2(y,x)]
  # end
  # def cartesian(magnitude, angle)
  #   [magnitude*Math.cos(angle), magnitude*Math.sin(angle)]
  # end
  # p,c = method :polar, method :cartesian
  # (c*p)[3,4]  # => [3,4]
  # 
  def compose(f)
    if self.respond_to?(:arity) && self.arity == 1
      lambda {|*args| self[f[*args]] }
    else
      lambda {|*args| self[*f[*args]] }
    end
  end

  # * is the natural operator for function composition.
  alias * compose
end

The example code in the comment demonstrates the use of compose with Method objects as well as lambdas. We can use this new * function composition operator to slightly simplify our computation of standard deviation. Using the same definitions of the lambdas sum, square, and deviation, the computation becomes:

standardDeviation = Math.sqrt((sum<=square*deviation|a)/(a.size-1))

The difference is that we compose square and deviation into a single function before applying it to the array a.

Partially Applying Functions

In functional programming, partial application is the process of taking a function and a partial set of argument values and producing a new function that is equivalent to the original function with the specified arguments fixed. This is similar to, but not quite the same as currying with the Proc.curry method. For example:

product = lambda {|x, y| x*y }       # A function of two arguments
double = lambda {|x| product(2,x) }  # Apply one argument

Partial application can be simplified with appropriate methods (and operators) in our Functional module:

module Functional
  #
  # Return a lambda equivalent to this one with one or more initial 
  # arguments applied. When only a single argument
  # is being specified, the >> alias may be simpler to use.
  # Example:
  #   product = lambda {|x,y| x*y}
  #   doubler = product >> 2
  #
  def apply_head(*first)
    lambda {|*rest| self[*first.concat(rest)]}
  end

  #
  # Return a lambda equivalent to this one with one or more final arguments
  # applied. When only a single argument is being specified,
  # the << alias may be simpler.
  # Example:
  #  difference = lambda {|x,y| x-y }
  #  decrement = difference << 1
  #
  def apply_tail(*last)
    lambda {|*rest| self[*rest.concat(last)]}
  end

  # Here are operator alternatives for these methods. The angle brackets
  # point to the side on which the argument is shifted in.
  alias >> apply_head    # g = f >> 2 -- set first arg to 2
  alias << apply_tail    # g = f << 2 -- set last arg to 2
end

Using these methods and operators, we can define our double function simply as product>>2. We can use partial application to make our standard deviation computation somewhat more abstract, by building our deviation function from a more general-purpose difference function:

difference = lambda {|x,y| x-y }  # Compute difference of two numbers
deviation = difference<<mean      # Apply second argument

Memoizing Functions

Memoization is a functional programming term for caching the results of a function invocation. If a function always returns the same value when passed the same arguments, if there is reason to believe that the same arguments will be used repeatedly, and if the computation it performs is somewhat expensive, then memoization may be a useful optimization. We can automate memoization for Proc and Method objects with the following method:

module Functional
  #
  # Return a new lambda that caches the results of this function and 
  # only calls the function when new arguments are supplied.
  #
  def memoize
    cache = {}  # An empty cache. The lambda captures this in its closure.
    lambda {|*args|
      # notice that the hash key is the entire array of arguments!
      unless cache.has_key?(args)  # If no cached result for these args
        cache[args] = self[*args]  # Compute and cache the result
      end
      cache[args]                  # Return result from cache
    }
  end
  # A (probably unnecessary) unary + operator for memoization
  # Mnemonic: the + operator means "improved"
  alias +@ memoize        # cached_f = +f
end

Here’s how we might use the memoize method or the unary + operator:

# A memoized recursive factorial function
factorial = lambda {|x| return 1 if x==0; x*factorial[x-1]; }.memoize
# Or, using the unary operator syntax
factorial = +lambda {|x| return 1 if x==0; x*factorial[x-1]; }

Note that the factorial function here is a recursive function. It calls the memoized version of itself, which produces optimal caching. It would not work as well if you defined a recursive nonmemoized version of the function and then defined a distinct memoized version of that:

factorial = lambda {|x| return 1 if x==0; x*factorial[x-1]; }
cached_factorial = +factorial # Recursive calls aren't cached!

Symbols, Methods, and Procs

There is a close relationship between the Symbol, Method, and Proc classes. We’ve already seen the method method, which takes a Symbol argument and returns a Method object.

Ruby 1.9 adds a useful to_proc method to the Symbol class. This method allows a symbol to be prefixed with & and passed as a block to an iterator. The symbol is assumed to name a method. When the Proc created with this to_proc method is invoked, it calls the named method of its first argument, passing any remaining arguments to that named method. Here’s how you might use it:

# Increment an array of integers with the Fixnum.succ method
[1,2,3].map(&:succ)  # => [2,3,4]

Without Symbol.to_proc, we’d have to be slightly more verbose:

[1,2,3].map {|n| n.succ }

Symbol.to_proc was originally devised as an extension for Ruby 1.8, and it is typically implemented like this:

class Symbol
  def to_proc
    lambda {|receiver, *args| receiver.send(self, *args)}
  end
end

This implementation uses the send method (see Invoking Methods) to invoke a method named by a symbol. We could also do it like this:

class Symbol
  def to_proc
    lambda {|receiver, *args| receiver.method(self)[*args]}
  end
end

In addition to to_proc, we can define some related and possibly useful utilities. Let’s start with the Module class:

class Module
  # Access instance methods with array notation. Returns UnboundMethod,
  alias [] instance_method
end

Here, we’re simply defining a shorthand for the instance_method method of the Module class. Recall that that method returns an UnboundMethod object, that cannot be invoked until bound to a particular instance of its class. Here’s an example using this new notation (notice the appeal of indexing a class with the names of its methods!):

String[:reverse].bind("hello").call   # => "olleh"

Binding an unbound method can also be made simpler with a bit of the same syntactic sugar:

class UnboundMethod
  # Allow [] as an alternative to bind.  
  alias [] bind
end

With this alias in place, and using the existing [] alias for calling a method, this code becomes:

String[:reverse]["hello"][]   # => "olleh"

The first pair of brackets indexes the method, the second pair binds it, and the third pair calls it.

Next, if we’re going to use the [] operator for looking up the instance methods of a class, how about using []= for defining instance methods:

class Module
  # Define a instance method with name sym and body f.
  # Example: String[:backwards] = lambda { reverse }
  def []=(sym, f)
    self.instance_eval { define_method(sym, f) }
  end
end

The definition of this []= operator may be confusing—this is advanced Ruby. define_method is a private method of Module. We use instance_eval (a public method of Object) to run a block (including the invocation of a private method) as if it were inside the module on which the method is being defined. We’ll see instance_eval and define_method again in Chapter 8.

Let’s use this new []= operator to define a new Enumerable.average method:

Enumerable[:average] = lambda do
  sum, n = 0.0, 0
  self.each {|x| sum += x; n += 1 }
  if n == 0
    nil
  else
    sum/n
  end
end

We’ve used the [] and []= operators here to get and set instance methods of a class or module. We can do something similar for the singleton methods of an object (which include the class methods of a class or module). Any object can have a singleton method, but it doesn’t make sense to define an [] operator on the Object class, as so many subclasses define that operator. For singleton methods, therefore, we could take the opposite approach and define operators on the Symbol class:

#
# Add [] and []= operators to the Symbol class for accessing and setting
# singleton methods of objects. Read : as "method" and [] as "of".
# So :m[o] reads "method m of o".
#
class Symbol
  # Return the Method of obj named by this symbol. This may be a singleton
  # method of obj (such as a class method) or an instance method defined
  # by obj.class or inherited from a superclass.
  # Examples:
  #   creator = :new[Object]  # Class method Object.new
  #   doubler = :*[2]         # * method of Fixnum 2
  #
  def [](obj)
    obj.method(self)
  end
  
  # Define a singleton method on object o, using Proc or Method f as its body.
  # This symbol is used as the name of the method.
  # Examples:
  #
  #  :singleton[o] = lambda { puts "this is a singleton method of o" }
  #  :class_method[String] = lambda { puts "this is a class method" }
  # 
  # Note that you can't create instance methods this way. See Module.[]=
  #
  def []=(o,f)
    # We can't use self in the block below, as it is evaluated in the 
    # context of a different object. So we have to assign self to a variable.
    sym = self
    # This is the object we define singleton methods on.
    eigenclass = (class << o; self end)
    # define_method is private, so we have to use instance_eval to execute it.
    eigenclass.instance_eval { define_method(sym, f) }
  end
end

With this Symbol.[] method defined, along with the Functional module described previously, we can write clever (and unreadable) code like this:

dashes = :*['-']       # Method * of '-'
puts dashes[10]        # Prints "----------"

y = (:+[1]*:*[2])[x]   # Another way to write y = 2*x + 1

The definition of []= for Symbol is like that of []= for Module, in that it uses instance_eval to invoke the define_method method. The difference is that singleton methods are not defined within a class, as instance methods are, but in the eigenclass of the object. We’ll encounter the eigenclass again in Chapter 7.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.138.123.106