Pattern 17: Memoization

Intent

To cache the results of a pure function call to avoid performing the same computation more than once

Overview

Since pure functions always return the same value for given arguments, it’s possible to replace a pure function call with cached results.

We can do this manually by writing a function that keeps track of its previous arguments. When it’s called, it first checks its cache to see if it has already been called with the passed-in arguments. If it has, it returns the cached value. Otherwise, it performs the computation.

Some languages provide first-class support for Memoization using higher-order functions. Clojure, for instance, has a function called memoize that takes a function and returns a new one that will cache results. Scala doesn’t have a built-in memoization function, so we’ll use a simple manual implementation.

Sample Code: Simple Caching

One use for Memoization is as a simple cache for expensive or time-consuming functions, especially when the function is called multiple times with the same argument. In this example, we’ll simulate the time-consuming operation by having it sleep the thread.

In Scala

Let’s get started with a look at our simulated expensive function call. As an example, we’re using a lookup by ID from some (presumably slow) datastore. To fake it out here, we sleep the thread for a second before returning a value from a static map. We also print the ID we’re looking up to the console to demonstrate when the function is being executed:

ScalaExamples/src/main/scala/com/mblinn/mbfpp/functional/memoization/Examples.scala

def expensiveLookup(id: Int) = {
  Thread.sleep(1000)
  println(s"Doing expensive lookup for $id")
  Map(42 -> "foo", 12 -> "bar", 1 -> "baz").get(id)
}

Just as we’d expect, the lengthy function is executed each time we call it, as we can see from the console output:

 
scala> expensiveLookup(42)
Doing expensive lookup for 42
res0: Option[String] = Some(foo)

scala> expensiveLookup(42)
Doing expensive lookup for 42
res1: Option[String] = Some(foo)

Now let’s take a look at a simple memoized version of expensiveLookup. To create it we’ll use memoizeExpensiveLookup, which initializes a cache and returns a new function that wraps calls to expensiveLookup.

The new function first checks its cache to see if it has results from a previous function call. If it does, it returns the cached results. Otherwise it calls the expensive lookup and caches the results before returning them.

Finally, we call memoizeExpensiveLookup and store a reference to the function it returns in a new val. The full solution is in the following code:

ScalaExamples/src/main/scala/com/mblinn/mbfpp/functional/memoization/Examples.scala

def memoizeExpensiveLookup() = {
  var cache = Map[Int, Option[String]]()
  (id: Int) =>
    cache.get(id) match {
      case Some(result: Option[String]) => result
      case None => {
        val result = expensiveLookup(id)
        cache += id -> result
        result
      }
    }
}

val memoizedExpensiveLookup = memoizeExpensiveLookup

As we can see from the following REPL output, the expensive function is only called the first time for a given argument. After that, it returns the cached copy:

 
scala> memoizedExpensiveLookup(42)
Doing expensive lookup for 42
res2: Option[String] = Some(foo)

scala> memoizedExpensiveLookup(42)
res3: Option[String] = Some(foo)

One quirk with this example is in the last line:

 
val memoizedExpensiveLookup = memoizeExpensiveLookup

Here, we’re having memoizeExpensiveLookup return a new function, and we’re storing a reference to it. This allows us to wrap the cache up in a closure so that only the function has a reference to it. If we needed another cache, we could create it like so:

 
scala> val memoizedExpensiveLookup2 = memoizeExpensiveLookup
memoizedExpensiveLookup2: Int => Option[String] = <function1>

scala> memoizedExpensiveLookup2(42)
Doing expensive lookup for 42
res4: Option[String] = Some(foo)

Our Scala solution is a bit clumsy since we’ve done it manually for a single, specific case, but it serves as a good model for how memoization works behind the scenes. Let’s take a look at how we can use Clojure’s memoize function to solve the same problem.

In Clojure

In Clojure, we’ll start with a similar simulated expensive function. However, we won’t manually memoize it. Instead, we’ll use Clojure’s memoize function to automatically return a memoized version of the function, as this code shows:

ClojureExamples/src/mbfpp/functional/memoization/examples.clj

(defn expensive-lookup [id]
  (Thread/sleep 1000)
  (println (str "Lookup for " id))
  ({42 "foo" 12 "bar" 1 "baz"} id))

(def memoized-expensive-lookup
  (memoize expensive-lookup))

As we can see from the following REPL output, it behaves similarly to the Scala version and only performs the expensive operation once:

 
=> (memoized-expensive-lookup 42)
Lookup for 42
"foo"

=> (memoized-expensive-lookup 42)
"foo"

Behind the scenes, the memoize function creates a new function much like our manual Scala example, using a map as a cache.
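To make that connection concrete, here’s a rough sketch of what a reusable memoize helper might look like in Scala. This is an illustration of the idea rather than part of the chapter’s example code; the name memoize, its generic signature, and memoizedLookup are our own, and the sketch assumes a single-threaded caller:

// Illustrative sketch: a generic memoizing wrapper in the style of the
// manual example above. Not thread-safe; a concurrent map or
// synchronization would be needed if the wrapped function were shared
// across threads.
def memoize[A, B](f: A => B): A => B = {
  var cache = Map[A, B]()
  (a: A) =>
    cache.get(a) match {
      case Some(result) => result
      case None =>
        val result = f(a)
        cache += a -> result
        result
    }
}

// Wrapping the chapter's expensiveLookup gives the same behavior as
// memoizedExpensiveLookup; the trailing underscore turns the method
// into a function value.
val memoizedLookup = memoize(expensiveLookup _)

With a helper like this, each new memoized function gets its own cache closed over in the returned function, just as in the hand-written version.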

Discussion

One use of Memoization we didn’t cover here is in solving dynamic programming problems, which is one of its original uses. Dynamic programming problems are problems that can be broken down into simpler subproblems recursively. A classic, easy-to-understand example is computing a Fibonacci number.

The nth Fibonacci number is the sum of the previous two numbers in the sequence: F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1. A simple Clojure function that calculates a Fibonacci number directly from this definition follows:

ClojureExamples/src/mbfpp/functional/memoization/examples.clj

(def slow-fib
  (fn [n]
    (cond
      (<= n 0) 0
      (< n 2) 1
      :else (+ (slow-fib (- n 1)) (slow-fib (- n 2))))))

The nice thing about this function is that it mirrors the mathematical definition. However, it recomputes the same subproblems over and over, so its performance is terrible for even moderately large inputs. If we memoize the function, as we do in the following code, then the subproblems are cached and the function performs reasonably well:

ClojureExamples/src/mbfpp/functional/memoization/examples.clj

(def mem-fib
  (memoize
    (fn [n]
      (cond
        (<= n 0) 0
        (< n 2) 1
        :else (+ (mem-fib (- n 1)) (mem-fib (- n 2)))))))

Running the two functions shows the drastic difference in performance:

 
=> (time (slow-fib 40))
"Elapsed time: 6689.204 msecs"
102334155

=> (time (mem-fib 40))
"Elapsed time: 0.402 msecs"
102334155
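The same trick carries over to Scala using the manual caching pattern from earlier in the chapter. The following is an illustrative sketch rather than part of the chapter’s example code; the names memFib and fibCache are our own:

import scala.collection.mutable

// Illustrative sketch: a memoized Fibonacci in Scala, mirroring mem-fib.
// The cache lives outside the function so results persist across calls.
val fibCache = mutable.Map[Int, BigInt]()

def memFib(n: Int): BigInt = fibCache.get(n) match {
  case Some(result) => result
  case None =>
    val result =
      if (n <= 0) BigInt(0)
      else if (n < 2) BigInt(1)
      else memFib(n - 1) + memFib(n - 2)
    fibCache += n -> result
    result
}

As with the Clojure version, memFib(40) computes each subproblem only once and reuses the cached value afterward.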

Dynamic programming problems are rich and fascinating; however, they only pop up in a limited number of domains. I’ve generally seen memoization used as a simple, convenient cache for expensive or long-running operations rather than as a dynamic programming tool.
