Chapter 16. Ruby and functional programming

This chapter covers

  • A description of the functional style of programming
  • Pure functions
  • Method chaining and Kernel#itself
  • Higher-order functions
  • Method#curry, Proc#curry, and partial function application
  • Recursion

By now you know that Ruby is a powerful and expressive language that offers solutions to myriad software-programming challenges. Ruby encourages an object-oriented programming approach to language design, and thus far we’ve mostly restricted our study of programming principles to this discipline. But Ruby has long supported many elements of a functional programming style. In fact, in recent years Ruby has added even more language features to support writing code in a functional style.

The word “style” is important here. Ruby doesn’t meet the definition of a purely functional language because it doesn’t guarantee referential transparency or immutability (both defined in this chapter). Nevertheless, tenets of functional programming are available to the Rubyist. Software engineers can evaluate the pros and cons of each approach when choosing between object-oriented and functional programming styles. In a large enough program, mixing the two styles of programming is possible. As an engineer, you’ll find yourself frequently evaluating the trade-offs of one style over another.

You’ve already seen examples of methods that encapsulate functional behavior. Enumerable#map, which we’ll revisit, is one such example. In this chapter, we’ll dive into what makes code functional, where advantages can be reaped from a functional style of programming, and how Ruby allows you to harness this power. Table 16.1 lists some properties of functional and object-oriented programming styles.

Table 16.1. Properties of functional and object-oriented programming styles

Property

Object-oriented

Functional

Promotes code reuse
Models behavior with objects  
Separates data and behavior  
Promotes encapsulation
Terse code  
Generic routines  
Modularity  
Flexibility with polymorphism  
Note

We use the terms “method” and “function” interchangeably in this chapter. Ruby makes no distinction between the two. Generally speaking, functional programmers speak in terms of functions. Terms such as “pure functions,” defined later in the chapter, lend themselves to the function nomenclature.

16.1. Understanding pure functions

Pure functions lie at the core of functional programming and have their roots in mathematical principles. Pure functions exhibit specific behavior such as referential transparency. A function that, when given the same arguments, always returns the same result with no side effects is said to possess referential transparency—much more on this property later in the chapter. Pure functions lay the foundation for functional programs.

Because functional programming is mathematics based, learning the underlying mathematical principles behind a functional technique can be very helpful in the early going. We won’t dive into mathematics too deeply in this chapter, but a simple example may be helpful.

Let’s say we set a variable called num to the integer value 100:

>> num = 100

Now we want to increment the value. We know that Ruby allows us to do this by redefining num:

>> num = num + 1
=> 101

num now equals 101. But this isn’t a functional programming approach. Why? Think about the mathematical implications of the statement, specifically using the transitive property:

num = num + 1
1 = num - num
1 = 0

Whoops! Our way of incrementing variables is valid in the Ruby world but not in the math world. In functional programming, we wouldn’t change the value of num. We’d simply create a new variable:

>> new_num = num + 1
=> 101

Using two variables is equally valid Ruby syntax via the transitive property:

new_num = num + 1
1 = new_num - num
1 = 101 - 100
1 = 1

Changing the value of a variable once it has been set is (usually) considered a side effect. We’ll learn more about side effects next.

16.1.1. Methods with side effects

A side effect is virtually anything a method does other than returning a value. Modifying an instance variable, raising an exception, and outputting text to the console are all examples of side effects. Side effects can be intentional or unintentional. At times they’re useful, and at other times they should be avoided.

Let’s say we want to write a method that implements the behavior of Array#sum: sum returns the result of adding each item within a given array. We could implement it the following way:

def sum_of_parts(arr)
  sum = 0
  arr.size.times { sum = sum + arr.shift }
  sum
end

In this example, we first initialize a sum variable to 0. Then we loop over each element in the array, using Array#shift to remove each element and add it to sum.

Type the following into your console, and you’ll see that it works as expected the first time:

>> my_array = [1,3,5]
>> sum_of_parts my_array
=> 9

But the second time we run sum_of_parts, we get a different value:

>> sum_of_parts my_array
=> 0

What happened? By implementing Array#shift, we removed every item from my_array until it was empty:

>> my_array
=> []

Array#shift has the side effect of mutating the array that’s passed in. But Enumerable #reduce (which is an alias for #inject) also allows us to sum elements in an array. We can reduce sum_of_parts to a one-line method with no side effects thusly:

def sum_of_parts(arr)
  arr.reduce(:+)
end

By not modifying arr, this version of sum_of_parts can be called any number of times and return the same value.

Side effects abound in everyday software development, and some are less obvious than you may think. The following behaviors are inherently side effects:

  • Outputting data to the terminal
  • Changing a record in a database
  • Updating a value on a web page

Methods with no side effects are called pure functions. The evaluation of a pure function results in a value determined solely by its arguments and has no observable side effects. The first time we implemented sum_of_parts, the result was determined by the input (the array we gave it). However, a side effect kept it from being a pure function. Our new sum_of_parts method is pure, because the lack of side effects keeps it returning the same result, method call after method call.

As you’ll see, many of Ruby’s built-in methods are pure.

16.1.2. Pure functions and referential transparency in Ruby

A pure function is one whose result is solely dependent on its parameters. It may not surprise you to learn that pure functions abound in some of Ruby’s most basic math operators:

>> 3 + 4
=> 7

Remember from chapter 1 that +, -, *, /, and others are simply methods with character syntax instead of letters. The preceding math operation can be rewritten as follows:

>> 3.+(4)

+ is a pure function. It operates on a receiver (3), takes an argument (4), and returns the result. Every time + is called with identical receivers and arguments, it will return identical results. Importantly, it never modifies the receiver or the argument in producing a result.

Pure functions are said to be referentially transparent. Referential transparency is achieved if the expression can be replaced by the value of the expression without changing the program’s behavior. In all cases, 3 + 4 can be replaced by 7 without altering the state of a Ruby program. The + method makes for referentially transparent functions.

Let’s look at a less obvious example: Enumerable#map. map operates on a receiver and takes a block as an argument. Here’s an implementation of map on an array:

>> [3,5,7,9].map { |i| i * 5 }
=> [15, 25, 35, 45]

This map function takes an array as a receiver, iterates over the array, and yields each item in turn to the block. The block takes the value yielded to it, multiplies it by 5, and returns the resulting value. map then stores each resulting value in a new array. map never modifies the receiver; it always creates a new array. In this way, it maintains its status as a pure function.

In fact, map is one of the most common built-in methods in functional languages. reduce, more commonly found in Ruby by its alias inject, is another. As you’ll see next, Ruby also contains many built-in methods with side effects.

16.1.3. Side effects in Ruby’s built-in methods

As discussed elsewhere in the book, bang methods often contain side effects. Whereas upcase returns an all-caps version of the string upon which it is called, upcase! both returns the all-caps value and mutates the string:

>> str = "joe"
>> str.upcase
=> "JOE"
>> str
=> "joe"
>> str.upcase!
=> "JOE"
>> str
=> "JOE"

str is permanently changed here. upcase! returns the desired result and has the side effect of changing the value of str. chomp!, gsub!, reverse!, and slice! are all examples of built-in Ruby methods with side effects. These are (for the most part) methods that operate on String. But Array and Hash objects have their own share of methods with side effects, not all of which end in an exclamation mark.

You’ll quickly intuit that compact!, select!, and sort!, among others, will permanently alter the state of an array or hash. Meanwhile, << is sometimes overlooked because it is syntactic sugar, is used commonly, and lacks an exclamation mark. But use the array-append operator on your array, and you change its state:

>> arr = [1, 2, 3]
>> arr << 4
>> arr
=> [1, 2, 3, 4]

delete, also lacking an exclamation mark in its name, does the opposite for hashes:

>> hash = { a: "foo", b: "bar" }
>> hash.delete(:a)
>> hash
=> { :b => "bar" }

Remember that modifying state is not the only kind of side effect. A side effect can also be an exception, and many Ruby built-in methods raise an exception if particular conditions aren’t met. In this way, your code may have unintentional side effects, which you’ll want to avoid no matter what style of code you choose to write:

>> arr = [1, 2, 3]
>> arr.drop(-1)
=> ArgumentError (attempt to drop negative size)

Passing a negative value to Array#drop will raise an ArgumentError. In the proper context, this exception can be useful. As long as the ArgumentError is handled with a proper try, catch, or ensure clause, arr.drop(-1) can execute without unexpected interruption. But an unhandled exception is a side effect and should be avoided regardless of whether you write in an object-oriented or functional style.

Table 16.2 lists some other commonly used Ruby built-in objects whose methods have documented exception cases.

Table 16.2. Methods with documented exception cases

Class

Method

String []=
  encode
  unicode_normalize
  unicode_normalized?
Array fetch
  drop
  take
  transpose
Hash fetch
  fetch_values
  rehash
Integer coerce
  sqrt
  +, -, /, *, and most other operator methods

At some point, changing state becomes important. Ruby programs are often intentionally built with side effects, because this is how we manage an object’s state. Functional programming offers consistency through stateless behavior, pure functions, and referential transparency. But to create programs that do anything of value, you’ll eventually need to alter some state—persist records to a database, write output to the console, or update the view of a web page. Both object-oriented and functional styles are important to learn because understanding how and why to use both will lead to better-designed programs.

Before we create some of our own functions, let’s look a little closer at simple state management.

16.1.4. Modifying an object’s state

An object’s state is the value of its attributes at a given point in time. Most object-oriented programming is built on the idea of setting, modifying, and retrieving the state of objects. In traditional OOP, we construct objects with state and behavior in mind.

Consider the following Grade object. It maintains several variables as well as a method that calculates a letter grade based on the average of numerical test scores.

class Grade
  attr_reader :letter

  def calculate_grade(scores)
    case scores.sum / scores.size
    when 90..100
      @letter_grade = "A"
    when 80...90
      @letter_grade = "B"
    when 70...80
      @letter_grade = "C"
    when 60...70
      @letter_grade = "D"
    else
      @letter_grade = "F"
    end
  end
end

Next, we’ll write a ReportCard object that uses the Grade object to issue a report card:

class ReportCard

  def initialize(name, physics_grade, chemistry_grade, biology_grade)
    @name = name
    @physics_grade = physics_grade
    @chemistry_grade = chemistry_grade
    @biology_grade = biology_grade
  end

  def issue
    puts "Report Card for #{@name}"
    puts
    puts "Physics: #{@physics_grade.letter}"
    puts "Chemistry: #{@chemistry_grade.letter}"
    puts "Biology: #{@biology_grade.letter}"
  end
end

Finally, let’s put these objects to work and build a report card:

>> physics_grade = Grade.new
>> physics_grade.calculate_grade([78,92,90])
>> chemistry_grade = Grade.new
>> chemistry_grade.calculate_grade([90,80,88])
>> biology_grade = Grade.new
>> biology_grade.calculate_grade([99,90,98])

>> rc = ReportCard.new(physics_grade, chemistry_grade, biology_grade)
>> rc.issue
Report Card for Joe

Physics: B
Chemistry: B
Biology: A

The Grade object maintains state—the letter grade for a set of numerical scores. ReportCard depends on this state both to maintain its own state (@physics_grade, @chemistry_grade, and @biology_grade) and to issue its output (#issue).

As we construct systems of growing complexity, modifying the state of our objects increases the risk of errors. In this case we might mitigate some of the risk by removing some state from our objects. Let’s implement Grade and ReportCard without any instance variables:

class Grade
  def self.calculate_grade(scores)
    case scores.sum / scores.size
    when 90..100
      "A"
    when 80...90
      "B"
    when 70...80
      "C"
    when 60...70
      "D"
    else
      "F"
    end
  end
end

In this example, we do away with the letter instance variables. The result is an object with just one singleton method, calculate_grade.

We can treat ReportCard the same way, removing state and reducing the object to a single class method, issue:

Class ReportCard
  def self.issue(name, physics_grade, chemistry_grade, biology_grade)
    puts "Report Card for #{name}"
    puts
    puts "Physics: #{physics_grade}"
    puts "Chemistry: #{chemistry_grade}"
    puts "Biology: #{biology_grade}"
  end
end

Working with these new objects is more succinct simply because we needn’t initialize them:

>> physics_grade = Grade.calculate([78, 92, 90])
>> chemistry_grade = Grade.calculate([90, 80, 88])
>> biology_grade = Grade.calculate([99,90,98])
>> ReportCard.issue("Joe", physics_grade, chemistry_grade, biology_grade)

The output is the same as in our original implementation. All we’ve really done in this example is remove instance variables and custom initialize methods. But the effect of this simple change is worth considering. If we can eliminate the management of state, we’re a step closer to working with pure functions. Rather than creating singleton methods as we’ve done here, we can examine the underlying expressions and whether those can be recreated as functions. The rest of this chapter will consider techniques for doing just that.

16.2. Immutability

Immutable objects don’t change once they’ve been created. In this way they differ from most of the objects you’ve seen in the book.

When we make an object’s attributes immutable, we’re assured that no other program or function will change its value at a later time:

class Record
  attr_accessor :artist, :title, :year, :rating

  def initialize(artist, title, year, rating)
    @artist = artist
    @title = title
    @year = year
    @rating = rating
  end
end

The preceding Record class behaves like any other object until we call freeze on it. From that point on, the attributes are immutable:

>> the_unseen = Record.new("Quasimoto", "The Unseen", 2000, 3.5)
>> the_unseen.rating = 4.5
=> 4.5
>> the_unseen.freeze
>> the_unseen.rating.frozen?
=> true
>> the_unseen.artist = "Madlib"
=> FrozenError: (can't modify frozen Record)

Let’s now look at freeze and frozen a bit more closely.

16.2.1. Object#freeze and Object#frozen?

In functional languages, if you try to change an immutable object, you’ll get an exception of some kind. In most languages, constants provide an example of immutability. As we know, this isn’t so in Ruby:

>> CONSTANT = "can't change me!"
=> "can't change me!"
>> CONSTANT.gsub!(/can't/, 'can')
=> "can change me!"

In fact, nearly everything in Ruby can be mutated. Ruby’s Object class provides a freeze method to make objects (nearly) immutable:

>> CONSTANT.freeze
=> "can change me"
>> CONSTANT.gsub!(/can/, 'can't')
=> FrozenError (can't modify frozen String)

You can check whether an object is frozen by using the frozen? method:

>> CONSTANT.frozen?
=> true

Because freeze is defined on Object, nearly any object or attribute on an object can be frozen. But be careful! As you saw in previous sections of the book, freezing an object doesn’t guarantee immutability for all of its attributes.

Frozen objects generally make for safer code because we can be confident that, once frozen, they won’t be modified by other methods or objects later in the execution cycle. It’s for this reason that immutable objects are a popular choice when multiple threads are involved.

16.2.2. Frozen string literals

You can make your Ruby strings default to frozen (immutable) rather than calling freeze on each one. This is achieved in one of the following ways:

  • Run your programs on the command line with a specific instruction. This will make every string in your program frozen by default:
    ruby --enable-frozen-string-literal my_program.rb
  • Add the following line to the top of any individual file:
    # frozen_string_literal: true

This will make all strings in that file frozen, but not strings in other files throughout your program.

You can make frozen strings default in an irb session by setting the RUBYOPT environment variable at the same time as you start the session:

RUBYOPT=--enable-frozen-string-literal irb

Why the two approaches to strings, one frozen and one not? Versions of Ruby 3.0 and later will all default to frozen strings. The options listed above are a way for Rubyists to prepare themselves and their code for this major change to the language.

When strings are frozen, they occupy one and only one place in memory. A look at a string’s object_id tells the tale. Try running the following examples using RUBYOPT= --enable-frozen-string-literal irb (and compare to the results you get with plain irb):

>> str = "a frozen string"
>> new_str = "a frozen string"
>> new_str.object_id == str.object_id
=> true
>> str << ", brr!"
=> FrozenError (can't modify frozen String)

With frozen string literals turned on, two strings that look identical will occupy the very same place in memory and thus have the same object_id. (Note: strings surrounded by single quotes display the same behavior.)

What if you really need a string to change? One way is to dup the string and make the change, thereby creating an entirely new and unfrozen string:

>> str = "a frozen string"
>> new_str = str.dup
>> new_str.object_id == str.object_id
=> false
>> new_str << ", brr!"
=> "a frozen string, brr!"

Another alternative is String.new, which creates unfrozen strings by default:

>> str = String.new("an unfrozen string")
>> str.frozen?
=> false
>> str << ", heating up!"
=> "an unfrozen string, heating up!"

A third option is to use the unary plus operator:

>> str = "frozen!"
>> str.frozen?
=> true
>> unfrozen_str = +str
>> unfrozen_str.frozen?
=> false

String immutability is still under heavy discussion and development, so expect things to shift and change somewhat as we approach the release of Ruby 3.0.

Let’s now turn our attention to higher-order functions and find out what else functional programming has to offer Rubyists.

True immutability not included

Frozen though they may seem, Ruby objects can always be modified. When a Ruby object is frozen, a flag at the C level of Ruby is set that determines that object’s behavior. When the flag is set, we get frozen behavior. But the flag can be unset, thereby “unfreezing” or “thawing” the object. (No standard method exists to unfreeze an object; it requires altering the object’s C-level implementation and may change depending on your Ruby interpreter. Ruby does not make it easy, but it is possible.)

Because nothing can be made truly immutable, Ruby isn’t considered a pure functional language. Immutability is a requirement for such languages. This shouldn’t deter you from using functional programming paradigms in the language, as long as you’re aware of the limitations.

16.3. Higher-order functions

A method that takes a function as an argument or returns a function as a result is called a higher-order function. You’ve already seen that some methods can take a block as an argument. In Ruby, a method can also return a proc. In a functional programming style, procs are functions that are either used as arguments or returned as values. Both types of functions are considered higher-order functions.

Ruby has many higher-order functions built into the language. You’re getting a heavy dose of map and inject/reduce in this chapter, all of which are higher-order functions because they accept a block as an argument:

>> [1,3,5].map { |x| x * 5 }
=> [5, 15, 25]

Here map takes { |x| x * 5 } as a function and transforms the array, importantly returning a brand-new array as a result. Examples of methods that take functions as arguments abound:

>> Array.new(4) { |i| "#{i + 1}A" }
=> ["1A", "2A", "3A", "4A"]

>> { a: 80, b: 90, c: 100 }.select { |k, v| v > 90 }
=> {:c=>100}

>> "I love Ruby!".each_byte { |b| print b, ' ' }
=> 73 32 108 111 118 101 32 82 117 98 121 33

Later, we’ll use techniques to create methods that return functions. Let’s look next at method chaining, a feature of Ruby frequently enhanced by higher-order functions.

16.3.1. Method chaining

You’ve seen examples of method chaining throughout the book. Here’s a simple example:

>> "joe".upcase.reverse
=> "EOJ"

"joe" is the receiver for upcase. The result of "joe".upcase, "JOE", is then the receiver for reverse. "JOE" is never printed to the screen because upcase and reverse are chained together.

Earlier we preached caution about chaining too many methods together. This is particularly true when the methods being chained together lack referential transparency. On the other hand, the preceding example works, because all three of these conditions are met:

  • Both methods are defined on the String object
  • Both methods return an instance of String
  • Neither method contains any side effects

We can find the same easy-to-execute method chaining in Ruby’s Integer objects:

>> 10 / 5 + 2
=> 4

Chaining methods is often made easier by applying the principles just listed, in particular ensuring that there are no side effects. Let’s look at two methods built into Ruby’s Kernel that aid in method chaining.

16.3.2. Kernel#itself and Kernel#yield_self

Method chaining is a popular practice in Ruby, regardless of whether you’re writing functional or object-oriented code. itself and yield_self are recent additions to the Ruby programming language, and both are used to promote easier chaining of methods.

Kernel#itself was added to Ruby to aid in the construction of more-fluid method chains. itself seems insignificant on the surface. It simply returns the object on which it’s called:

>> "Ruby".itself
=> "Ruby"
>> [1, 1, 3, 4, 5, 5, 5, 6, 7].itself
=> [1, 1, 3, 4, 5, 5, 5, 6, 7]

When we dig a little deeper, though, we can see a way for itself to add some syntactic sugar to our operations. A common example is with the group_by method, which operates on an enumerable and returns a hash. Suppose we want to group identical strings in an array. We can do so with a clunky block:

>> %w(joe, joe, david, matz, david, matz, joe)group_by { |name | name }
=> {"joe"=>["joe", "joe", "joe"], "david"=>["david", "david"], "matz"=>["matz", "matz"]}

Or we can substitute with itself and produce an identical result:

>> %w(joe, joe, david, matz, david, matz, joe).group_by(&:itself)
=> {"joe"=>["joe", "joe", "joe"], "david"=>["david", "david"], "matz"=>["matz", "matz"]}

Replacing the block with itself is not earth shattering, and that’s the point. By simply returning the object, it becomes a useful tool for clarity and conciseness as we chain more methods together. itself is also useful as a default argument:

def filter_arr(arr, method=:itself)
  arr.public_send(method)
end
>> a = [1,1,2,2,3,5,6]
>> filter_arr(a, :uniq)
=> [1, 2, 3, 5, 6]
>> filter_arr(a)
=> [1, 1, 2, 2, 3, 5, 6]

Kernel#yield_self functions much like Object#tap. Both take a block and yield the receiver to that block. The difference is that yield_self returns the result of the block rather than the receiver itself:

>> "Ruby".yield_self { |str| str + " Roundtable" }
=> "Ruby Roundtable"

yield_self can be thought of as a cousin of tap that discourages side effects, just as map performs functions similar to each but without side effects. Because tap only returns the receiver, we’d need to add a puts statement to see the results of combining strings:

>> "Ruby".tap { |str| puts str + " Roundtable" }
Ruby Roundtable
=> "Ruby"

Because yield_self returns the result of its block, successive calls can be chained together. This has an elegance to it when used with lambdas in a functional programming style:

>> add_newline = -> (str) { str + "
" }
>> welcome = -> (str) { "Welcome, " + str.upcase + "!" }
>> "joe".yield_self(&welcome).yield_self(&add_newline) + "We're glad you're
     here!"
=> "Welcome, JOE!
We're glad you';re here!"

When yield_self is called without a block, it returns an enumerator:

>> 3.yield_self.class
=> Enumerator

This gives us more flexibility to call yield_self in other situations:

>> (1..10).yield_self { |r| r.member?(rand(15)) } # returns true or false
>> (rand(10) + 1).yield_self { |x| x.odd? ? x + 1 : x } # returns an even
     number between 2 and 10.
Note

In an upcoming release of Ruby, yield_self will be aliased to then. yield_self isn’t particularly descriptive or succinct, and the Ruby core team has decided that the method then achieves both.

16.3.3. Functions that return functions

Ruby’s map method is a higher-order function because it takes a block as an argument. This block gives transformation instructions to the receiver, an array. The result of calling map is a brand-new array containing whatever transformation was applied in the given block. This array is a value returned by calling map.

A popular technique in functional programming is creating higher-order functions that return other functions. In Ruby, that means writing methods that return a proc containing transformation instructions rather than a value. You saw an example of this in chapter 14:

def multiply_by(m)
  Proc.new {|x| puts x * m }
end
mult = multiply_by(10)
mult.call(12)

The result of the multiply_by method is not a straightforward value but a proc that performs yet another transformation. The method returns a function, which makes the method itself a higher-order function. The function that’s returned can be called later (for example, when more information has been gathered) with an argument that can be evaluated to return a value.

With a technique called currying, we can even evaluate parts of a function rather than the entire function at once.

16.3.4. Currying and partial function application

In chapter 14 you saw that Proc objects—excluding lambdas—are less strict about what arguments are passed into them. Whereas a method defines a strict arity, or argument count, most procs allow us to pass in any number of arguments. Ruby’s curry method deals directly with arity, specifically by passing in fewer arguments to either methods or procs.

Partial function application is a functional programming technique whereby a function is passed any number of arguments less than its arity. The function is evaluated with the given arguments, and a new function is returned that takes the rest of the arguments. Let’s look at a simple example of a proc that adds two arguments:

>> add = -> (a, b) { a + b }

When we use partial function application for add, passing in the value 1 for a, the following expression is returned:

>> add = -> (1, b) { 1 + b }

Currying is subtly different. Instead of returning one function, currying returns a series of functions that each take one argument. Let’s look again at our add proc:

>> add = -> (a, b) { a + b }

A curried version of add chains two functions together, one for each argument:

>> add = -> (a) { -> (b) { a + b } }

Both of these statements are valid Ruby syntax. The curry method obviates the need to write the second version. A partially applied function, by contrast, applies whatever arguments are passed to it and returns only one function with which we can evaluate the result.

In Ruby, curry accomplishes both partial application and currying. We won’t see functions chained together or partially applied the way you see them above. Just take note of what’s happening under the hood.

One use case for currying and partial function application is to create generic functions that can be reused. Let’s say we want to create a series of functions that take an array and return the elements that are multiples of an integer:

def find_multiples_of_3(arr)
  arr.select { |el| el % 3 == 0 }
end

>> find_multiples_of_3([-3,3,4,5,6,8,9,10,12])
=> [-3, 3, 6, 9, 12]

def find_multiples_of_5(arr)
  arr.select { |el| el % 5 == 0 }
end

>> find_multiples_of_5([-3,3,4,5,6,8,9,10,12])
=> [5, 10]

As we add more methods similar to the two here, things can quickly get cumbersome. The methods are very similar, except for one argument. By abstracting that argument, we can create a generic, or multiuse function:

>> find_multiples = -> (x, arr) { arr.select { |el| el % x == 0 } }

This generic function takes the form of a lambda. The curry method allows us to create a function that returns another function:

>> find_multiples_of = find_multiples.curry
>> find_multiples_of_3 = find_multiples_of.(3)
>> find_multiples_of_5 = find_multiples_of.(5)

find_multiples_of is the curried form of find_multiples. Now we can pass in one argument and return a function that takes the remaining argument, an array. From here, finding multiples in an array is performed similarly to the method-based approach:

>> find_multiples_of_3.([-3,3,4,5,6,8,9,10,12])
=> [-3, 3, 6, 9, 12]
>> find_multiples_of_5.([-3,3,4,5,6,8,9,10,12])
=> [5, 10]

This approach makes use of higher-order functions: it creates a function that returns another function. It’s shorter than the method-based approach and doesn’t duplicate the effort of determining multiples.

Several ways exist to call curried or partially applied functions. Let’s turn again to our simple addition example, this time adding another argument:

>> add = -> (a, b, c) { a + b + c }
>> fun = add.curry
=> #<Proc:0x000055f71c9718f0 (lambda)>

Calling curry with no arguments returns a lambda. We can evaluate this lambda all at once,

>> fun.(1,2,3)
=> 6

or chain several calls together,

>> fun.(1).(2).(3)

or pass in just one argument, which returns a function that takes two arguments:

>> fun2 = fun.(1)
=> #<Proc:0x000055f71ca50d20 (lambda)>
>> fun2.(2,3)
=> 6

Finally, we can pass in two arguments with predictable results:

>> fun3 = fun.(1,2)
=> #<Proc:0x000055f71ca50d20 (lambda)>
>> fun3.(3)
=> 6

In all these examples, the function waits until it has the necessary arguments, returning functions along the way. When all arguments have been supplied, the function is evaluated and returns a value.

curry also takes an optional arity argument. When called with this argument, the curried object will only be evaluated when the given number of arguments has been supplied:

>> sum_all = -> (*nums) { nums.reduce(:+) }
>> sum_all.curry.(1,2,3)
=> 6
>> sum_at_least_four = sum_all.curry(4)
>> sum1 = sum_at_least_four.(3,4)
=> #<Proc:0x000055d2f90867a0 (lambda)>
>> sum2 = sum1.(5)
=> #<Proc:0x000055d2f90b4948 (lambda)>
>> sum3 = sum2.(7)
=> 19

Passing in 4 to the curry method means that the function will wait until at least four arguments have been passed in before evaluating itself. Until then, the function is happy to return a partially applied function.

When it comes to calling curried or partially applied functions, we have several types of notation at our disposal:

>> fun.(1,2,3)
=> 6

When using parentheses, we must separate them with a dot operator. The dot operator is actually shorthand for call:

>> fun.call(1,2,3)
=> 6

Square brackets can replace arguments in parentheses. When using square brackets, the dot operator isn’t necessary:

>> fun[1,2,3]
=> 6
>> fun[1,2][3]
=> 6

Methods can also be curried:

>> def add(a, b, c) ; a + b + c ; end
=> :add
>> fun = method(:add).curry
=> #<Proc:0x000055f71cb45758 (lambda)>
>> fun.(1,2,3)
=> 6

As you can see, the curry method turns our add method into a lambda. Once that transition is complete, the lambda behaves exactly as all of our previous examples.

Next we’ll look at recursion and see what tools Ruby uses to support this technique.

16.4. Recursion

Simply put, recursion is when a function calls itself one or more times before returning the desired result. Because it provides a mechanism for iterating that produces no side effects, it’s an oft-used tool of functional programmers.

One classic example used to demonstrate recursion is to evaluate the Fibonacci sequence. In this example, we want to find the integer in the Fibonacci sequence at a given point x. Here’s an iterative approach:

def fibonacci(x)
  y = 0
  z = 1

  x.times do
    temp = y
    y = z
    z =  temp + y
  end

  y
end

>> fibonacci(7)
=> 13

A recursive solution will call itself successively, decrementing the arguments it passes, until it arrives at the correct answer. In this case, we can write a much more terse function:

def fibonacci(x)
  x <= 1 ? x : fibonacci(x - 1) + fibonacci(x - 2)
end

>> fibonacci(7)
=> 13

A recursive function isn’t complete without a terminal clause. A terminal clause lets the function know that evaluation is complete, and it should return a value rather than continuing to call itself. In the preceding fibonacci example, x <= 1 is the terminal clause. As soon as this statement returns true, we stop recursing and return x. If we didn’t have this clause, the method would call itself forever (or until we stopped it ourselves).

Let’s look at another common example, finding the sum of the squares for the first x positive integers (not to be confused with the statistical formula, sum of squares):

def sum_squares(x)
  if x == 0
    0
  else
    x**2 + sum_squares(x - 1)
  end
end

>> sum_squares(3)
=> 14

In this case, x == 0 is the terminal clause. Our function continues to decrement x and recursively call sum_squares until the terminal-clause condition is met. At that point, the function returns its value.

Recursion is a difficult programming concept to master, and a full treatment of the topic falls outside the scope of this book. It’s important to mention here the properties of Ruby that make it adequate for recursive functions: lazy evaluation and tail-call optimization.

16.4.1. Lazy evaluation

In chapter 10 we covered the lazy method and how it impacts enumerators. Functional languages make use of lazy evaluation in order to work with very large, or even infinite, sets of data. These sets of data are often expressed as sequences, and working with sequences forms a cornerstone of functional programming.

Let’s turn again to finding multiples. We can find multiples of a given number against infinite positive numbers:

def find_multiples(num, mult)
  (1..Float::INFINITY).lazy.select { |x| x % mult == 0}.first(num)
end

>> find_multiples(3, 50)
=> [50, 100, 150]

Using our knowledge of partial function application, we can use the curry method to give us a generic function for reuse:

>> first_3_multiples = self.method(:find_multiples).curry.(3)
=> #<Proc:0x000055c3b7d76088 (lambda)>
>> first_5_multiples = self.method(:find_multiples).curry.(5)
=> #<Proc:0x000055c3b8243368 (lambda)>
>> first_3_multiples.(256)
=> [256, 512, 768]

As recursive functions call themselves indefinitely or until a terminal clause is reached, lazy evaluation comes in handy. Let’s use lazy evaluation to create a sequence of squares for each positive integer:

>> squares = (1..Float::INFINITY).lazy.map { |x| x * x }
=> #<Enumerator::Lazy: #<Enumerator::Lazy: 1..Infinity>:map>

We’ve stored an infinite sequence into a lazy enumerator called squares. We can now use Enumerator methods to filter our sequence:

>> squares.first(10)
=> [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
>> squares.first(10)[3]
=> 16

By creating an infinite sequence, summing consecutive squares becomes a trivial matter:

def sum_squares(y)
  squares = (1..Float::INFINITY).lazy.map { |x| x * x }
  squares.first(y).sum
end

Generic functions can be used as building blocks to meet more-complex requirements. Suppose we want to create a function that takes two integers that we’ll call a “powers factorial” and a “set size.” We take the first consecutive numbers raised to the given powers factorial for the given set size and then recurse on the powers factorial. Finally, we’ll add the numbers in each set together. For example, with a powers factorial of 4 and a set size of 3, we’ll add the first 3 integers to the 4th power to the first 3 integers cubed, and so on, until we reach a power of 1. The result is 154 and is broken down as follows:

  • [1, 16, 81] + [1, 8, 27] + [1, 4, 9] + [1, 2, 3] = [1, 16, 81, 1, 8, 27, 1, 4, 9, 1, 2, 3] = 154

We start by turning our squares enumerator into a generic function:

powers = -> (power) { (1..Float::INFINITY).lazy.map { |x| x**power } }

We need a terminal clause. Any integer to the power of 1 is the integer itself. So when power == 1, we’ll stop recursing and return an array with consecutive positive integers. Finally, we’ll use flatten and sum to add the elements of the array. Here’s the recursive method in its entirety:

def sum_powers_factorial(pfact, size)
  powers = -> (power) { (1..Float::INFINITY).lazy.map { |x| x**power } }        1
if pfact == 1
    Array.new(size) { |x| x + 1 }                                               2
  else
    [powers.call(pfact).first(size),
      sum_powers_factorial(pfact - 1, size)].
      flatten.sum                                                               3
  end
end

>> sum_factorial_powers(4, 3)
=> 154

We begin by defining powers 1 so that we can use it later in the function. We then define the terminal clause 2. When pfact == 1, we return an array of length size with consecutive positive integers beginning with 1. Next, we use our powers function to return an array of size consecutive integers to the power of pfact 3. We combine this into an array with a recursive call to sum_powers_factorial until the terminal clause evaluates to true. At that point, we flatten the array and sum the elements to give us our final value.

By creating generic functions and combining them into higher-order functions, we have much more succinct code. In this case, we solved a complex mathematical problem with just a few lines of code—an object-oriented solution would involve potentially many more lines of code. Which is more expressive? Which would more aptly fit your needs? That’s a question only you, the Rubyist, can answer.

Let’s look next at an optimization technique that can positively impact your recursive functions.

16.4.2. Tail-call optimization

Recursive functions can take a long time to evaluate. In our previous Fibonacci example, calculating the 7th or 8th integer in the sequence is quick. Depending on your hardware, you may start to see a lag in computation around the 30th integer. Try calculating the 100th integer, and you’ll be waiting awhile! With even larger numbers, the computation may take up all available resources and bring your machine to a halt. Ruby will helpfully raise a SystemStackError before this happens, but you’re still left with a function that doesn’t, well, function.

When evaluating a recursive function, the Ruby interpreter typically needs to add an additional stack frame to its call stack each time it evaluates the function. A System-StackError indicates that the call stack—the number of subroutines that the Ruby interpreter will evaluate—has grown too large for the machine to handle. These types of errors—common in many programming languages—are often called “stack overflow” errors.

Tail-call optimization unburdens the Ruby interpreter by allowing tail-recursive functions to be evaluated without adding successive stack frames to the call stack. When the call stack remains constant, we avoid stack overflow errors and can recurse many thousands of times through a function without exhausting system resources.

For tail-call optimization to work, the function itself must be tail recursive. This means that the last instruction in a function is a call to the function itself. Our previous recursive fibonacci example wasn’t tail recursive because the last instruction was an operation, fibonacci(x - 1) + fibonacci(x - 2), and not a simple call to itself. To make it tail recursive, we’ll split it into two functions.

Write the following code in a file called fib_tail_recursive.rb.

def fibonacci_helper(x, y, num)
  num < 1 ? x : fibonacci_helper(y, x + y, num - 1)
end

def fibonacci(x)
  fibonacci_helper(0, 1, x)
end

fibonacci_helper is tail recursive because its final instruction is to call itself. With a tail-recursive function in place, Ruby can resolve fibonacci much faster and for much larger numbers. You can now safely evaluate fibonacci(100) or even fibonacci (10000) in no time!

Tail-call optimization isn’t turned on by default in Ruby. Without it enabled, you’ll eventually encounter a SystemStackError if your call requires hundreds of thousands or millions of recursions to evaluate. You can enable tail-call optimization by setting compile options on the RubyVM::InstructionSequence class. To try this, create a new file called fib_implementer.rb. Set your compile options, require the fib_tail_recursive file, and call the fibonacci function.

RubyVM::InstructionSequence.compile_option =  {
  tailcall_optimization: true,
  trace_instruction: false
}

require_relative 'fib_tail_recursive'

puts fibonacci(1000000)

With tail-call optimization turned on, fibonacci can evaluate much larger numbers and will not encounter a SystemStackError.

Summary

In this chapter, you’ve seen

  • The differences between functional and object-oriented programming styles
  • Pure functions and referential transparency
  • freeze and frozen? methods in the context of immutable code
  • How Ruby will handle strings in version 3.0
  • Currying and partial function application techniques with curry
  • The lazy method and tail-call optimization

Ruby isn’t a purely functional language, but it provides tools that make this style of programming possible. It gives you a new set of tools and a different perspective on the code you write. Plus, it’s fun!

Immutability and referential integrity are tenets of functional programming with broad applicability in other languages. Recursion and tail-call optimization, while not specific to a functional programming style, tend to be used more frequently in such paradigms. This chapter provided the principles that can help you learn and grow as a Rubyist and as a programmer.

And that’s that! Enjoy your status as a well-grounded Rubyist and the many structures you’ll build on top of the foundation you’ve acquired through this book.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.218.163.28