Code-Walking Macros

Let’s say you’re a former Ruby developer, and you really miss the feature of Ruby that allows any method, class, or module definition to act as an implicit begin (Ruby’s version of try). In other words, you’d like to be able to write this code in Clojure:

language_features/implicit_try_1.clj

(defn delete-file [path]
  (clojure.java.io/delete-file path)
  (catch java.io.IOException e false))

This code doesn’t work in Clojure, of course—you’ll get a compiler exception if you try to type that in at your REPL. You’d need to wrap a try expression around everything following the argument list to get it to work as you want. Take a moment and think about what it would take to have this code work. One option would be to file an enhancement request for the language to allow this, but major language changes like this are unlikely to be accepted without lots of thought, and certainly not anytime soon. But if we wrap our code in a macro call, we can definitely write the macro that allows this code to work.
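For comparison, here is a sketch of the explicit version that already works today, with the try written by hand. The `io` alias is only a convenience introduced for this example.

```clojure
(require '[clojure.java.io :as io])

;; The hand-written equivalent: an explicit try wraps everything after the
;; argument vector. io/delete-file throws java.io.IOException when the
;; delete fails and no `silently` argument is supplied.
(defn delete-file [path]
  (try
    (io/delete-file path)
    (catch java.io.IOException e false)))
```

The goal of the macro is to let us leave out that one `try` line and nothing more.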

As we have only a single example, we might first try a naïve approach:

language_features/implicit_try_1a.clj

(defmacro with-implicit-try [& defns]
  `(do
     ~@(map
        (fn [defn-expression]
          (let [initial-args (take 3 defn-expression)
                body (drop 3 defn-expression)]
            `(~@initial-args (try ~@body))))
        defns)))

(with-implicit-try
  (defn delete-file [path]
    (clojure.java.io/delete-file path)
    (catch java.io.IOException e false)))

And in fact this version will work great for this kind of input defn. It’s fragile, though: if we do something as simple as adding a docstring, suddenly our macro is broken because the number of initial-args needs to change. We could use clojure.tools.macro/name-with-attributes to solve the docstring issue, but as we saw in the last section, there are plenty of other variations in the input that defn allows. Besides, we’d really like to be able to have our with-implicit-try work for other things besides defn as well: do, fn, let, loop, when, and letfn all feel like places it’d make sense to have our implicit try in place.
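To see the docstring problem concretely, we can expand the naive macro by hand. The snippet below repeats the macro so that it stands alone; the docstring text is made up for illustration.

```clojure
;; Copy of the naive macro, repeated so this snippet is self-contained.
(defmacro with-implicit-try [& defns]
  `(do
     ~@(map
        (fn [defn-expression]
          (let [initial-args (take 3 defn-expression)
                body (drop 3 defn-expression)]
            `(~@initial-args (try ~@body))))
        defns)))

;; With a docstring present, the first three elements are `defn`, the name,
;; and the docstring, so the argument vector ends up inside the try.
(macroexpand-1
 '(with-implicit-try
    (defn delete-file
      "Deletes the file at path, returning false on failure."
      [path]
      (clojure.java.io/delete-file path)
      (catch java.io.IOException e false))))
;=> (do (defn delete-file "Deletes the file at path, returning false on failure."
;=>       (try [path]
;=>            (clojure.java.io/delete-file path)
;=>            (catch java.io.IOException e false))))
```

The expansion is no longer a valid defn, because the argument vector has been swallowed into the try body.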

In order to cover all these cases correctly, we need to look not only at each top-level expression, but also at all of the internal expressions. And what about macros like when that end up expanding to other expressions on our hit list? In order to handle those kinds of things, we’d need a macro that actually walks through the code given to it, macroexpanding at each step and generating the code we want to end up with.

There is a wide range of code-walking tools out there. For the very simplest of tasks, we could use the built-in clojure.walk, but its macroexpansion facilities aren’t aware of local bindings, so it expands a symbol as a macro call even when a local has shadowed it. Riddley[42] overcomes that shortcoming:

language_features/clojure_walk.clj

(require '[clojure.walk :as cw])
(cw/macroexpand-all '(let [when :now] (when {:now "Go!"})))
;=> (let* [when :now] (if {:now "Go!"} (do)))

;; lein try org.clojars.trptcolin/riddley "0.1.7.1"

(require '[riddley.walk :as rw])
(rw/macroexpand-all '(let [when :now] (when {:now "Go!"})))
;=> (let* [when :now] (when {:now "Go!"}))

Riddley also provides access to &env and takes care of expanding any :inline function definitions. Besides macroexpand-all, and getting back to our purposes, Riddley exposes a handy function, riddley.walk/walk-exprs, that allows us to replace input expressions with expressions of our choosing.

language_features/riddley_basics.clj

(require '[riddley.walk :as walk])

(defn malkovichify [expression]
  (walk/walk-exprs
   symbol? ;; predicate: should we run the handler on this?
   (constantly 'malkovich) ;; handler: does any desired replacements
   expression))

(malkovichify '(println a b))
;=> (malkovich malkovich malkovich)

This is exactly the sort of thing we need for our implicit try. Handling all the cases correctly in a code-walking macro can often be similar to thinking recursively, because of macroexpansion. What are the base cases? Well, the base cases of macroexpansion are precisely the special forms, so they’re a good place to start:

language_features/special_forms.clj

(source special-symbol?)
; (defn special-symbol?
;   "Returns true if s names a special form"
;   {:added "1.0"
;    :static true}
;   [s]
;   (contains? (. clojure.lang.Compiler specials) s))

user=> (sort (keys clojure.lang.Compiler/specials))
;=> (& . case* catch def deftype* do finally fn* if let* letfn* loop*
;    monitor-enter monitor-exit new quote recur reify* set! throw try var
;    clojure.core/import*)

Continuing along our recursive-programming train of mind, if we correctly handle the base cases (special forms like let*, fn*, etc.), and we correctly handle macroexpansions, that means that by induction we’ll be able to correctly handle code that contains macros.
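A quick REPL check, using only clojure.core, illustrates the split between macros and base cases. The symbols `ready?` and `launch` are placeholders that are never evaluated.

```clojure
;; `when` is a macro, so it is not a base case; it expands into `if` and `do`.
(macroexpand-1 '(when ready? (launch)))
;=> (if ready? (do (launch)))

(special-symbol? 'when) ;=> false
(special-symbol? 'if)   ;=> true

;; `let` is also a macro; what the walker must handle is let*, the special
;; form it expands to.
(special-symbol? 'let)  ;=> false
(special-symbol? 'let*) ;=> true
(first (macroexpand-1 '(let [x 1] x))) ;=> let*
```

This is why the transformation only needs cases for starred forms such as let* and fn*: by the time the walker sees them, the friendlier macros have already been expanded away.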

A first cut at with-implicit-try takes only around 60 lines of code, but it did take a lot of trial and error, with plenty of test cases:

language_features/trymplicit_1.clj

(ns trptcolin.trymplicit
  (:require [riddley.walk :as walk]))

(declare add-try)

(defn should-transform? [x]
  (and (seq? x)
       (#{'fn* 'do 'loop* 'let* 'letfn* 'reify*} (first x))))

(defn- wrap-fn-body [[bindings & body]]
  (list bindings (cons 'try body)))

(defn- wrap-bindings [bindings]
  (->> bindings
       (partition-all 2)
       (mapcat
        (fn [[k v]]
          (let [[k v] [k (add-try v)]]
            [k v])))
       vec))

(defn- wrap-fn-decl [clauses]
  (let [[name? args? fn-bodies]
        (if (symbol? (first clauses))
          (if (vector? (second clauses))
            [(first clauses) (second clauses) (drop 2 clauses)]
            [(first clauses) nil (doall (map wrap-fn-body (rest clauses)))])
          [nil nil (doall (map wrap-fn-body clauses))])]
    (cond->> fn-bodies
      (and name? args?) (#(list (cons 'try %)))
      (not (and name? args?)) (map add-try)
      args? (cons args?)
      name? (cons name?))))

(defn- wrap-let-like [expression]
  (let [[verb bindings & body] expression]
    `(~verb ~(wrap-bindings bindings) (try ~@(add-try body)))))

(defn transform [x]
  (condp = (first x)
    'do (let [[_ & body] x]
          (cons 'try (add-try body)))

    'loop* (wrap-let-like x)

    'let* (wrap-let-like x)

    'letfn* (wrap-let-like x)

    'fn* (let [[verb & fn-decl] x]
           `(fn* ~@(wrap-fn-decl fn-decl)))

    'reify* (let [[verb options & fn-decls] x]
              `(~verb ~options ~@(map wrap-fn-decl fn-decls)))
    x))

(defn add-try [expression]
  (walk/walk-exprs should-transform? transform expression))

(defmacro with-implicit-try [& body]
  (cons 'try (map add-try body)))

There’s clearly some complexity here, particularly around pulling out the various styles of fn* expressions. It would have been nice if there were only one base-level form, but because of backward compatibility, we’ll probably be stuck with this sort of thing for a while.

Much more significantly, however, there’s one critical flaw in this approach to wrapping try inside each of these special forms: it’s not possible to recur across a try. This means that any special form that can act as a recur target can’t be changed in such a cavalier way. So we need a way to avoid our wrapping of fn*, loop*, and reify*, at least where they’re being used for recur. Would it be better to simply always skip those expressions? Perhaps, but it’s not nearly as interesting, and besides, wrapping a function definition was really our main use case anyway!
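The restriction is easy to confirm at the REPL. In the sketch below, `compiles?` is a hypothetical helper written for this example, not part of any library; it simply reports whether eval accepts a form.

```clojure
;; Hypothetical helper for this sketch: true if the compiler accepts the form.
(defn compiles? [form]
  (try
    (eval form)
    true
    (catch Throwable _ false)))

;; A recur in the tail position of a fn is fine.
(compiles? '(fn [n] (if (pos? n) (recur (dec n)) :done)))
;=> true

;; Wrap the same body in a try and the recur would have to jump out of the
;; try to reach the fn, which the compiler rejects.
(compiles? '(fn [n]
              (try
                (if (pos? n) (recur (dec n)) :done)
                (catch Exception e nil))))
;=> false

;; A loop that lives entirely inside the try is not a problem, because its
;; recur never crosses the try boundary.
(compiles? '(fn [n]
              (try
                (loop [i n] (if (pos? i) (recur (dec i)) :done))
                (catch Exception e nil))))
;=> true
```

The second form has exactly the shape our first-cut macro produces whenever the function body contains a recur aimed at the function itself.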

To make things concrete, how can we avoid inserting a try for fn* when it has a recur? Since we’re code-walking anyway, one interesting strategy is to extend the existing code walker with a case for recur, and have fn* keep track of whether a recur was found while its contents were being walked:

language_features/trymplicit_finding_recur.clj

(def recur-found (atom false))

(defn should-transform? [x]
  (and (seq? x)
       ;; NOTE: we've added 'recur - easy to forget
       (#{'recur 'fn* 'do 'loop* 'let* 'letfn* 'reify*} (first x))))

(defn transform [x]
  (condp = (first x)
    'recur (let [[verb & args] x]
             (reset! recur-found true)
             x)
    ;; ...
    'fn* (let [[verb & fn-decl] x
               _ (reset! recur-found false)
               result `(fn* ~@(doall (wrap-fn-decl fn-decl)))]
           (if @recur-found
             x
             result))
    ;; ...
    ))

There is a major problem here: we’re looking for any recur expression inside the function, not just the ones that could be affected. For instance, the recur in this useless expression, (fn* [] (loop* [] (recur))), affects only the loop* expression, not the outer fn*. It’s as though we need to push a new context onto a stack for every potential recur target we walk and pop it off when we’re done walking it. Good news—Clojure’s dynamic bindings work perfectly for that, and are the key to solving this dilemma in a reasonable way.
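The push-and-pop behavior of binding can be sketched in isolation, without any code walking. The names `*tracker*`, `note-recur!`, and `walk-target` are hypothetical stand-ins invented for this illustration.

```clojure
;; Hypothetical names for this sketch only.
(def ^:dynamic *tracker* (atom false))

(defn note-recur!
  "Stands in for the walker encountering a recur form."
  []
  (reset! *tracker* true))

(defn walk-target
  "Runs walk-contents with a fresh tracker pushed for this recur target and
  reports whether a recur was noted at this level."
  [walk-contents]
  (binding [*tracker* (atom false)]
    (walk-contents)
    @*tracker*))

;; Models (fn* [] (loop* [] (recur))): the inner target sees the recur,
;; the outer one does not.
(let [inner-result (atom nil)
      outer-result (walk-target
                    (fn []
                      (reset! inner-result (walk-target note-recur!))))]
  [@inner-result outer-result])
;=> [true false]
```

Each binding form pushes a new atom for the duration of its body, and the previous atom becomes visible again as soon as the body returns, which gives us the stack discipline we were after.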

There are lots of other details here: for instance, we’ve added doall in a number of places to realize any lazy seqs, since we’re relying on side effects while code-walking. And there’s also still room for improvement, both in terms of code quality and functionality. How would we handle each of the remaining special forms, for instance?
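The need for doall is easiest to see in a small experiment. This is only a sketch; `recur-seen` and `walked` are names invented for the example.

```clojure
(def recur-seen (atom false))

;; map is lazy: defining the seq runs none of the function calls, so the
;; side effect has not happened yet.
(def walked
  (map (fn [form]
         (when (= form 'recur)
           (reset! recur-seen true))
         form)
       '[a recur b]))

@recur-seen ;=> false

;; Forcing the seq runs the side effects.
(doall walked)

@recur-seen ;=> true
```

The same laziness interacts with binding: a lazy seq that is realized only after its binding form has exited would update whichever tracker is current at that later point, so the walker has to force its results while the right tracker is still in place.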

language_features/trymplicit/src/trptcolin/trymplicit.clj

(ns trptcolin.trymplicit
  (:require [riddley.walk :as walk]))

(def ^:dynamic *recur-search-tracker*
  (atom false))

(declare add-try)

(defn should-transform? [x]
  (and (seq? x)
       (#{'fn* 'do 'loop* 'let* 'letfn* 'reify* 'recur} (first x))))

(defn- wrap-fn-body [wrapper-fn [bindings & body]]
  (if (nil? wrapper-fn)
    (cons bindings body)
    (list bindings (wrapper-fn body))))

(defn- wrap-bindings [bindings]
  (->> bindings
       (partition-all 2)
       (mapcat
        (fn [[k v]]
          (let [[k v] [k (add-try v)]]
            [k v])))
       vec))

(defn- wrap-fn-decl [wrapper-fn clauses]
  (let [[name? args? fn-bodies]
        (cond (symbol? (first clauses))
              (if (vector? (second clauses))
                [(first clauses) (second clauses) (drop 2 clauses)]
                [(first clauses) nil
                 (doall (map (partial wrap-fn-body wrapper-fn)
                             (rest clauses)))])
              (vector? (first clauses))
              [nil (first clauses) (rest clauses)]
              :else
              [nil nil (doall (map (partial wrap-fn-body wrapper-fn)
                                   clauses))])]
    (cond->> fn-bodies
      (and name? args?) (#(if (nil? wrapper-fn)
                            (list `(do ~@(doall (map add-try %))))
                            (list (wrapper-fn (doall (map add-try %))))))
      (not (and name? args?)) (#(let [not-both-result (map add-try %)]
                                  not-both-result))
      args? (cons args?)
      name? (cons name?))))

(defn- wrap-let-like [expression]
  (let [[verb bindings & body] expression
        result `(~verb ~(wrap-bindings bindings) (try ~@(doall (add-try body))))]
    (if @*recur-search-tracker*
      `(~verb ~(wrap-bindings bindings) ~@(add-try body))
      result)))

(defn transform [x]
  (condp = (first x)

    'recur (let [[verb & args] x]
             (reset! *recur-search-tracker* true)
             x)

    'do (let [[_ & body] x
              result (cons 'try (add-try body))]
          (if @*recur-search-tracker*
            (cons 'do (add-try body))
            (cons 'try (add-try body))))

    'loop* (binding [*recur-search-tracker* (atom false)]
             (wrap-let-like x))

    'let* (wrap-let-like x)

    'letfn* (wrap-let-like x)

    'fn* (binding [*recur-search-tracker* (atom false)]
           (let [[verb & fn-decl] x
                 result `(fn* ~@(doall (wrap-fn-decl #(cons 'try %) fn-decl)))]
             (if @*recur-search-tracker*
               `(fn* ~@(doall (wrap-fn-decl nil fn-decl)))
               result)))

    'reify* (let [[verb options & fn-decls] x
                  wrap-reify-fn (fn [expression]
                                  (binding [*recur-search-tracker* (atom false)]
                                    (let [result (doall (wrap-fn-decl #(cons 'try %)
                                                                      expression))]
                                      (if @*recur-search-tracker*
                                        (wrap-fn-decl nil expression)
                                        result))))]
              `(~verb ~options ~@(doall (map wrap-reify-fn fn-decls))))

    x))

(defn add-try [expression]
  (walk/walk-exprs should-transform? transform expression))

(defmacro with-implicit-try [& body]
  (cons 'try (map #(binding [*recur-search-tracker* (atom false)] (add-try %))
                  body)))

I’m trying to make the case here that code-walking macros are not easy: they take a lot of effort to do well. There aren’t very many examples of robust code-walking macros out there, but those that do exist are quite interesting:

  • Proteus[43] creates local mutable variables... OK, I realize that reads like a joke in a Clojure book, but it’s probably fine, right? Proteus, like our trymplicit, uses Riddley to do its code-walking.

  • Clojure-TCO[44] rewrites Clojure expressions to provide tail-call optimization, a feature for which the JVM lacks full support but that we can get via macros. Clojure-TCO does its own custom code-walking and, like delimc, doesn’t handle all of Clojure.

  • core.async[45] generates a state machine from imperative-looking code that allows the library to schedule the code’s execution asynchronously, whether on the JVM using threads or on the JavaScript VM, and using Go-style channels (Communicating Sequential Processes). It uses tools.analyzer.jvm[46] for the Clojure implementation’s code-walking, and the built-in ClojureScript analyzer for the ClojureScript implementation.

Every nontrivial code-walking macro I’ve written or studied has some caveats where you need to know a bit about how the underlying machinery works. But this is also true of every interesting language feature I’ve studied. Abstractions, at some point, break down. These libraries may all seem to do magical-looking things, but as a user of a code-walking macro it’s great for you to be aware of where the abstractions can leak. Typically these kinds of macros support a subset of Clojure, and it’s more important than ever for library authors to document that subset, along with any changes in language semantics.

As Douglas Hoyte points out in Let Over Lambda [Hoy08], writing a robust code walker is tough. If we can avoid it, we’re usually better off not trying to do it ourselves, which is what makes libraries like Riddley and tools.analyzer.jvm so useful. Even writing a code-walking macro like the one we just wrote, which simply uses an existing code walker, is itself a serious undertaking. It’s quite easy to continually find yourself 90% of the way done, with 90% still left to go. But if you want to do the kinds of deep transformations that core.async, Clojure-TCO, Proteus, and our own Trymplicit library are able to do, it’s worth rolling up your sleeves and doing the hard work.
