Moving Execution to Compile Time

If you were playing golf, where the lowest number of strokes wins, would you rather get a hole-in-one or skip the hole entirely? Assuming that’s a legal thing to do (it’s not) and that you don’t enjoy playing the game in the first place (your mileage may vary), zero is clearly a better score than one.

When it comes to maintenance, the best code is no code at all.[20] The fastest code is code that doesn’t need to execute at runtime. In our case, the lowest number of instructions wins! Remember that we can consider macros as not even existing at runtime; this is a clear opportunity for us to speed things up.

The caveat is that in order to move expensive computations to compile time, we need access to all of the computation’s inputs at compile time. So a computation that requires user input, even indirectly, would be a poor candidate for this type of optimization. The same goes for any computation whose inputs vary from run to run: those values simply don’t exist yet at macroexpansion time.

For instance, a function that talks to a web service can’t really be macro-ized, as the inputs aren’t available until runtime, when the function is called:

performance/non_macroizable.clj

(defn calculate-estimate [project-id]
  (let [{:keys [optimistic realistic pessimistic]}
        (fetch-estimates-from-web-service project-id)
        weighted-mean (/ (+ optimistic (* realistic 4) pessimistic) 6)
        std-dev (/ (- pessimistic optimistic) 6)]
    (double (+ weighted-mean (* 2 std-dev)))))

But if we could reduce this function’s responsibilities to have it handle only the calculation and not the web service call (leaving that job to some new function), that would give us both a better design and an idea about how we might benefit from macro-izing this call:

performance/macroizable_1.clj

(defn calculate-estimate [{:keys [optimistic realistic pessimistic]}]
  (let [weighted-mean (/ (+ optimistic (* realistic 4) pessimistic) 6)
        std-dev (/ (- pessimistic optimistic) 6)]
    (double (+ weighted-mean (* 2 std-dev)))))

user=> (calculate-estimate {:optimistic 3 :realistic 5 :pessimistic 8})
;=> 6.833333333333333

user=> (bench (calculate-estimate {:optimistic 3 :realistic 5 :pessimistic 8}))
; Execution time mean : 1.974506 µs
; Execution time std-deviation : 22.817749 ns

(defmacro calculate-estimate [estimates]
  `(let [estimates# ~estimates
         optimistic# (:optimistic estimates#)
         realistic# (:realistic estimates#)
         pessimistic# (:pessimistic estimates#)
         weighted-mean# (/ (+ optimistic# (* realistic# 4) pessimistic#) 6)
         std-dev# (/ (- pessimistic# optimistic#) 6)]
     (double (+ weighted-mean# (* 2 std-dev#)))))

user=> (calculate-estimate {:optimistic 3 :realistic 5 :pessimistic 8})
;=> 6.833333333333333

user=> (bench (calculate-estimate {:optimistic 3 :realistic 5 :pessimistic 8}))
; Execution time mean : 2.208451 µs
; Execution time std-deviation : 700.062170 ns

So on my machine, these are pretty comparable. This makes sense, because the only savings here is in avoiding a function call, and the JIT may even be inlining that.
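
We can see why by expanding the call ourselves. The expansion below is abridged (the real output uses generated names like estimates__1234__auto__ and fully qualifies every symbol), but the shape is the same: every map lookup and every bit of arithmetic is still in the emitted code, waiting to run.

user=> (macroexpand-1
         '(calculate-estimate {:optimistic 3 :realistic 5 :pessimistic 8}))
;=> (clojure.core/let [estimates# {:optimistic 3 :realistic 5 :pessimistic 8}
;                      optimistic# (:optimistic estimates#)
;                      realistic# (:realistic estimates#)
;                      pessimistic# (:pessimistic estimates#)
;                      ...]
;     ...)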

If we were able to constrain the problem a bit and always call the macro with a literal map (this means it’s available at compile time, not a value we get from a web service, database, or user input), we could do quite a bit better. Can you see how?

performance/macroizable_2.clj

(defmacro calculate-estimate [{:keys [optimistic realistic pessimistic]}]
  (let [weighted-mean (/ (+ optimistic (* realistic 4) pessimistic) 6)
        std-dev (/ (- pessimistic optimistic) 6)]
    (double (+ weighted-mean (* 2 std-dev)))))

user=> (calculate-estimate {:optimistic 3 :realistic 5 :pessimistic 8})
;=> 6.833333333333333

user=> (bench (calculate-estimate {:optimistic 3 :realistic 5 :pessimistic 8}))
; Execution time mean : 4.814157 ns
; Execution time std-deviation : 0.761096 ns

We’ve just moved all the work we were doing to macroexpansion time instead of runtime. Notice that we aren’t quoting anything: this macro expands into a plain number. And it pays off. Units matter here: we’ve gone from roughly 2 µs to under 5 ns. That isn’t a few times faster, it’s orders of magnitude faster, and it’s because the arithmetic runs exactly once, at macroexpansion time; at runtime, nothing is left but the literal result.
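
A quick macroexpansion at the REPL confirms there’s no computation left to perform:

user=> (macroexpand-1
         '(calculate-estimate {:optimistic 3 :realistic 5 :pessimistic 8}))
;=> 6.833333333333333

At runtime, the compiled code does nothing but load that constant.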

Our example is a little unrealistic, since estimates come from a web service and normally won’t be known at compile time. But there are certainly cases where we have all the information we need at compile time—the key criterion is having the input directly in the source code, like when you extract a function for clarity or to avoid duplication.

For example, the Hiccup[21] HTML templating library does as much work as it can in the hiccup.core/html macro at compile time. It converts its input data structures to HTML strings where possible, and when it’s unable to do so, it defers the conversion of only those parts until runtime. Since completely eliminating runtime code costs is such a big win, it’s good to keep track of what data we know and can take advantage of at compile time in performance-critical scenarios.
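
To make the pattern concrete, here’s a minimal sketch in the same spirit (this is illustrative, not Hiccup’s actual implementation, and render-dynamic is a hypothetical helper): when the macro’s argument is a literal string, it emits the finished HTML at compile time; otherwise it emits a call to a runtime function.

(defn render-dynamic
  "Runtime fallback for content we can't know until runtime."
  [x]
  (str "<li>" x "</li>"))

(defmacro li [x]
  (if (string? x)              ; literal string available at compile time?
    (str "<li>" x "</li>")     ; yes: emit the finished HTML string
    `(render-dynamic ~x)))     ; no: emit code that renders at runtime

user=> (macroexpand-1 '(li "Apples"))
;=> "<li>Apples</li>"
user=> (macroexpand-1 '(li (current-user-name)))
;=> (user/render-dynamic (current-user-name))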

Using the JVM in ClojureScript Macros

The idea of shifting evaluation to compile time from runtime is particularly interesting when we apply it to ClojureScript, where macros are written and expanded in (JVM) Clojure but runtime evaluation happens on the JavaScript virtual machine. By moving expensive operations to compile time on the JVM, we can significantly reduce the amount of work we have to do at runtime.

On a project at work last year, our goal was to read in a data structure from a file to use in building up some HTML. There were a couple of barriers to achieving this goal:

  • Reading from the filesystem is kind of an expensive operation, and our files were never going to change while the program was running.

  • Our ClojureScript runs in the browser (as opposed to Node.js or something like that), so it doesn’t even have access to the filesystem to be able to read the files.

Our initial solution to this problem was to use the ClojureScript macro system, which worked since the JVM-land macros have access to the filesystem. Because this felt a bit too clever and magical, we ended up moving away from that solution toward one that put the data structures into normal ClojureScript functions instead of their own files. But it’s interesting that we were able to move a very expensive operation (and in fact one that seemed impossible to do at runtime!) forward to compile time to solve our problem. So, long story short, if you’re working in ClojureScript and you need to do things on the JVM that your JavaScript VM can’t do, macros are a way forward.
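
If you’d like to see the shape of that technique, here’s a rough sketch with hypothetical names. The macro lives in a .clj file, so its body runs on the JVM during ClojureScript compilation, where slurp and the filesystem are available; the data structure it returns is compiled into the JavaScript output as a literal.

;; src/myapp/embed.clj: compiled and expanded on the JVM
(ns myapp.embed
  (:require [clojure.edn :as edn]))

(defmacro inline-edn
  "Reads and parses an EDN file at compile time, embedding the
  resulting data structure as a literal in the compiled JavaScript."
  [path]
  (edn/read-string (slurp path)))

;; src/myapp/core.cljs: runs in the browser, where no filesystem exists
(ns myapp.core
  (:require-macros [myapp.embed :refer [inline-edn]]))

(def page-data (inline-edn "resources/page_data.edn"))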

The Need for Speed

In this chapter you’ve seen a few ways that macros can help you write fast systems while keeping your code concise, including hiding unsightly performance optimization hacks and shifting execution to compile time. You’ve seen how tools like Criterium can tell you what’s slow and whether your subsequent changes have improved performance. Speeding up software can be really fun, but if it doesn’t need to be fast, you don’t need to waste your time working on it.

Now that you’ve seen how to speed up your code’s runtime performance, we’ll look at how you can speed up a more important bottleneck: your understanding of the code.
