Hiding Performance Optimizations

Now that we’ve proven a particular bit of code is really too slow, what can we do to make it faster? Shantanu Kumar’s Clojure High Performance Programming [Kum13] is full of techniques to improve the performance of Clojure programs and we touch upon two tricks in this chapter. Often these techniques mean embracing the wonderful Java interop of Clojure, using primitives, arrays, and type hinting to match Java performance. However, a codebase packed with features like these can make Clojure feel like Java, which can be shocking for a functional programming devotee. Let’s look at how we can speed things up without winding up with ugly, unreadable code.

Imagine you’ve been working on a feature for one of your back-office applications that computes a few statistics on the latest sales numbers. You’ve tracked the biggest problem to a tight loop that adds a vector of numbers together, using a function called sum that’s written in a pretty typical Clojure style. Here’s how you might speed it up:

performance/sum.clj

(defn sum [xs]
  (reduce + xs))

(def collection (range 100))

(bench (sum collection))
; Execution time mean : 2.078925 µs
; Execution time std-deviation : 378.988150 ns


(defn array-sum [^ints xs]
  (loop [index 0
         acc 0]
    (if (< index (alength xs))
      (recur (unchecked-inc index) (+ acc (aget xs index)))
      acc)))

(def array (into-array Integer/TYPE (range 100)))

(bench (array-sum array))
; Execution time mean : 161.939607 ns
; Execution time std-deviation : 5.566530 ns

In array-sum, we’ve got a Java array, a type hint, and a low-level loop/recur instead of the much more elegant reduce version, but in exchange for that noise, we get roughly a 13x speedup. What do you think—is the speed worth all this code complexity? If you answered “It depends on the context,” well done!

We won’t try to cover every low-level speedup technique; instead you’ll see how you can get the speed you want and the clarity you want. In this case, you can use a macro from clojure.core, areduce, to clean things up a bit:

performance/sum_with_areduce.clj

(defn array-sum [^ints xs]
  (areduce xs index ret 0 (+ ret (aget xs index))))

(bench (array-sum array))
; Execution time mean : 170.214852 ns
; Execution time std-deviation : 18.504698 ns

The clojure.core/areduce macro is implemented in the core language similarly to the previous version of array-sum, so it’s no surprise that its performance is just as good.

performance/areduce_implementation.clj

(defmacro areduce
  "Reduces an expression across an array a, using an index named idx,
  and return value named ret, initialized to init, setting ret to the
  evaluation of expr at each step, returning ret."
  {:added "1.0"}
  [a idx ret init expr]
  `(let [a# ~a]
     (loop [~idx 0 ~ret ~init]
       (if (< ~idx (alength a#))
         (recur (unchecked-inc ~idx) ~expr)
         ~ret))))

There’s a library from Prismatic called hiphip[19] that goes a step further and eliminates the need for type hints.

After adding the Leiningen coordinates ([prismatic/hiphip "0.1.0"]) to your project.clj’s :dependencies vector, you can try this out in your REPL as well (make sure to restart it after adding the dependencies):
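If you don’t already have a project set up for these experiments, the dependency entry just goes alongside whatever else you’re pulling in. Here’s a rough sketch of what that might look like—the project name and the Clojure and criterium versions are purely illustrative, and it assumes the bench calls in this chapter come from the criterium library:

(defproject perf-playground "0.1.0-SNAPSHOT"      ; hypothetical project name
  :dependencies [[org.clojure/clojure "1.6.0"]    ; any recent Clojure will do
                 [criterium "0.4.3"]              ; provides the bench macro
                 [prismatic/hiphip "0.1.0"]])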

performance/sum_with_hiphip.clj

(require 'hiphip.int)

(bench (hiphip.int/asum array))
; Execution time mean : 144.535507 ns
; Execution time std-deviation : 1.587751 ns

In hiphip’s case, the big win in using a macro is that it can share nearly all of the implementation code across multiple data types (doubles, floats, ints, and longs). It even gets a little fancier, too. Because asum is a macro, it’s also able to provide variants that allow some elegant bindings similar to the ones in for and doseq (see the hiphip.array docstring for details):

performance/sum_with_hiphip_fanciness.clj

(defn array-sum-of-squares [^ints xs]
  (areduce xs index ret 0 (+ ret (let [x (aget xs index)] (* x x)))))

(bench (array-sum-of-squares array))
; Execution time mean : 1.419661 µs
; Execution time std-deviation : 256.799353 ns


(bench (hiphip.int/asum [n array] (* n n)))
; Execution time mean : 1.591465 µs
; Execution time std-deviation : 232.503393 ns

The hiphip.int/asum version has pretty much identical performance to array-sum-of-squares. Meanwhile, it’s a heck of a lot nicer to read and understand, right?
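Because the implementation is shared across types, switching to a different numeric type mostly means requiring a different namespace. Here’s a small sketch of the same sum-of-squares on a double array via hiphip.double—the array and result here are our own illustration, not part of the benchmarks above:

(require 'hiphip.double)

(def dbl-array (double-array (range 100)))

;; Same binding-style asum, but specialized for doubles at compile time
(hiphip.double/asum [x dbl-array] (* x x))
;; => 328350.0 (the sum of the squares of 0.0 through 99.0)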

We don’t have the space to go into all the details of hiphip’s asum implementation, but here’s the asum definition that’s shared across the various data types:

performance/hiphip_asum.clj

;; from hiphip/type_impl.clj
(defmacro asum
  ([array]
   `(asum [a# ~array] a#))
  ([bindings form]
   `(areduce ~bindings sum# ~(impl/value-cast +type+ 0) (+ sum# ~form))))

The areduce here is hiphip’s version that provides fancy bindings, not the areduce from clojure.core. impl/value-cast is what gives asum its ability to act on different data types based on the value of +type+ at compile time.

hiphip/type_impl.clj, where this macro is defined, is loaded directly from namespaces like hiphip.int via (load "type_impl"), instead of relying on the usual Clojure :require mechanism. Furthermore, hiphip/type_impl.clj doesn’t actually define a namespace, so when it’s loaded in hiphip.int, for example, it defines the asum macro in that namespace. This explains why +type+ can vary for the different types instead of just being a single unchanging value. This loading mechanism is a bit of a hack to avoid a macro-defining macro, which often means multiple levels of syntax-quoting. I can’t fault anyone for wanting to avoid macro-defining macros—they’re hard to write and even harder to understand later.
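To make that loading trick a little more concrete, here’s a minimal sketch of the same pattern using hypothetical my-lib namespaces rather than hiphip’s actual source. The shared file has no ns form, so each namespace that loads it gets its own copy of the macro, compiled against that namespace’s +type+:

;; my_lib/int.clj
(ns my-lib.int)
(def +type+ 'int)          ; this namespace's element type
(load "type_impl")         ; defines asum *in* my-lib.int

;; my_lib/double.clj
(ns my-lib.double)
(def +type+ 'double)
(load "type_impl")         ; defines asum again, this time in my-lib.double

;; my_lib/type_impl.clj -- note: no ns form
(defmacro asum
  "Sums the elements of the array expression xs, casting the initial
  accumulator with this namespace's +type+ at macroexpansion time."
  [xs]
  `(let [a# ~xs]
     (areduce a# i# ret# (~+type+ 0) (+ ret# (aget a# i#)))))

Each namespace ends up with a type-specialized asum, and no macro-defining macros are needed. The real hiphip does quite a bit more—type hints, casts on every operation, and the fancy bindings—but the namespace-per-type structure is the same.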

In Clojure’s built-in areduce and hiphip’s asum, we’ve seen macros enable great performance, comparable to loop/recur, while keeping the code concise and easy to understand. Next we’ll look at a different approach to performance optimization that can allow us to sidestep problems entirely rather than improving them incrementally.
