Combining function calls with reducers

Clojure 1.5 introduced the clojure.core.reducers library. It allows you to compose multiple calls to map and other sequence-processing, higher-order functions without building an intermediate sequence between each call. This also makes it possible to abstract map and other functions for different types of collections while maintaining the collection type.

As the following chart shows, initial operations on individual data items, such as map and filter, operate on items of the original dataset. Then, the outputs of those operations are combined using a reduce function. Finally, the outputs of the reduction step are progressively combined until the final result is produced. This might involve a reduce-type operation (such as addition) or an accumulation (such as the into function).
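As a rough sketch of these stages, the following example filters and maps over a small vector and then finishes the pipeline in both ways: once with a reduce-type operation and once with an accumulation. The names numbers, pipeline, total, and collected are hypothetical and chosen only for this illustration.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical sample data; a vector, so that r/fold is able to partition it.
(def numbers (vec (range 10)))

;; Per-item operations: keep the even numbers, then increment each one.
;; Nothing is computed at this point.
(def pipeline (r/map inc (r/filter even? numbers)))

;; Reduce-type combination: + serves as both the reducing and combining function.
(def total (r/fold + pipeline))      ;=> 25, that is, 1 + 3 + 5 + 7 + 9

;; Accumulation: pour the same pipeline into a vector.
(def collected (into [] pipeline))   ;=> [1 3 5 7 9]
```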

[Figure: Combining function calls with reducers]

In this recipe, we'll take a look at how we can use reducers to minimize the number of sequences that Clojure creates and immediately throws away.

Getting ready

The primary dependency that we'll need for this is the reducers library, but we'll also use the clojure.string library:

(require '[clojure.string :as str]
         '[clojure.core.reducers :as r])

How to do it…

To illustrate this feature of reducers, we'll take a sequence of words and run them through a series of transformations. We'll take a look at how Clojure handles these both with and without the reducers library.

The data that we'll work on will be a sequence of strings that contain words and numbers. We'll convert all of the letters to lowercase and all of the numbers to integers. Based on this specification, the first step of the processing pipeline will be str/lower-case. The second step will be the ->int function:

(defn ->int [x]
  (try
    (Long/parseLong x)
    (catch Exception e
      x)))

The input will be this vector of strings, produced by splitting a sentence on whitespace:

(def data
  (str/split (str "This is a small list. It contains 42 "
                  "items. Or less.")
             #"\s+"))

If you run this using clojure.core/map, you get the expected results:

user=> (map ->int
            (map str/lower-case
                 data))
("this" "is" "a" "small" "list." "it" "contains" 42 "items." "or" "less.")

The problem with this approach isn't the results; it's what Clojure is doing between the two calls to map. In this case, the first map creates an entirely new lazy sequence. The second map walks over it again before throwing it and its contents away. Repeatedly allocating sequences and immediately throwing them away is wasteful. It takes more time, and it can consume more memory than necessary. In this case, this isn't really a problem, but for longer pipelines of map calls (potentially processing long sequences), it can become a performance problem.
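To make the intermediate sequence visible, the sketch below contrasts the stacked calls with a version in which the two functions are composed by hand using comp, so that a single map walks the input once. This hand-written version only illustrates the idea and is not how the reducers library is implemented. The sample vector and the names two-pass and one-pass are hypothetical, and ->int is repeated here so that the example stands alone.

```clojure
(require '[clojure.string :as str])

;; Local copy of the ->int helper from this recipe.
(defn ->int [x]
  (try
    (Long/parseLong x)
    (catch Exception e
      x)))

;; Hypothetical, shortened sample input.
(def sample ["This" "contains" "42" "Items."])

;; Two passes: the inner map builds a lazy sequence that the outer map consumes.
(def two-pass (map ->int (map str/lower-case sample)))

;; One pass: the functions are composed first, so no intermediate sequence exists.
(def one-pass (map (comp ->int str/lower-case) sample))

;; Both produce ("this" "contains" 42 "items.")
```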

This is the problem that reducers address. Let's change our calls to map into calls to clojure.core.reducers/map and see what happens:

user=> (r/map ->int
              (r/map str/lower-case
                     data))
#<reducers$folder$reify__1529 clojure.core.reducers$folder$reify__1529@37577fd6>

What happened here?

Actually, this is exactly what the reducers library is supposed to do. Instead of processing the input right away, the stacked calls to r/map compose the two functions into one. When results are needed, the reducers library processes the input sequence through the combined functions. This accomplishes the processing without creating an intermediate, throwaway sequence.
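One way to observe this deferred behavior is to count how often a mapping function is called. In the sketch below, the calls atom and the counting-inc function are hypothetical helpers introduced only for this demonstration.

```clojure
(require '[clojure.core.reducers :as r])

;; Hypothetical call counter for the demonstration.
(def calls (atom 0))

(defn counting-inc [x]
  (swap! calls inc)   ; record that the function ran
  (inc x))

;; Stacking two r/map calls only describes the work.
(def pipeline (r/map counting-inc (r/map counting-inc [1 2 3])))

(def calls-before @calls)         ;=> 0, nothing has run yet
(def total (reduce + pipeline))   ;=> 12, that is, 3 + 4 + 5
(def calls-after @calls)          ;=> 6, both functions ran once per item
```

The work happens only when reduce asks for a result, and each item then passes through both functions in a single traversal.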

So, how do we get the output? We simply tell Clojure to feed it into a vector or other data structure:

user=> (into []
             (r/map ->int
                    (r/map str/lower-case
                           data)))
["this" "is" "a" "small" "list." "it" "contains" 42 "items." "or" "less."]
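A vector is not the only possible destination. As a minimal sketch, the same kind of pipeline can be poured into a set or consumed directly by reduce without building a collection at all. The sample vector and the names pipeline, distinct-items, and number-count are hypothetical, and ->int is repeated so that the example stands alone.

```clojure
(require '[clojure.string :as str]
         '[clojure.core.reducers :as r])

;; Local copy of the ->int helper from this recipe.
(defn ->int [x]
  (try
    (Long/parseLong x)
    (catch Exception e
      x)))

;; Hypothetical, shortened sample input with a repeated number.
(def sample ["It" "contains" "42" "items" "or" "42"])

(def pipeline (r/map ->int (r/map str/lower-case sample)))

;; Accumulate into a set; the duplicate 42 collapses.
(def distinct-items (into #{} pipeline))
;=> #{"it" "contains" 42 "items" "or"}

;; Or reduce directly, here counting how many items parsed as numbers.
(def number-count
  (reduce (fn [n x] (if (number? x) (inc n) n))
          0
          pipeline))
;=> 2
```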

There's more...

For more information on reducers, see Rich Hickey's blog posts at http://clojure.com/blog/2012/05/08/reducers-a-library-and-model-for-collection-processing.html and http://clojure.com/blog/2012/05/15/anatomy-of-reducer.html. Also, his presentation on reducers for EuroClojure 2012 (http://vimeo.com/45561411) has a lot of good information.

See also

We'll take a look at another feature of reducers in the next recipe, Parallelizing with reducers.
