Implementing Pattern Matching

If you’ve ever programmed in a functional style outside of Clojure, perhaps using Haskell, Erlang, or Scala, you’ve almost certainly seen pattern matching. You can think of pattern matching as similar to Clojure’s destructuring feature. It allows you to pull apart input expressions and bind values to names wherever you want. But pattern matching actually goes a bit further than destructuring as it lets you provide several potential patterns to match, along with clauses that will be executed if there’s a match.

So if you’re a pattern matching fan coming to Clojure, what do you do? If you immediately thought, “I could write a macro!,” you’re more optimistic about your macro skills than I was when I learned about pattern matching—great! And you’re absolutely right, too! Whether or not you’re a little hazy on what pattern matching looks like, let’s decide what we might like to see in our pattern matcher. We’ll start with something simple just to make sure that we can control the flow of execution, similar to what we did in the last chapter:

language_features/pattern_matching_1.clj
 
;; desired API (using vectors so we can match multiple things later on):
 
(match [x]
 
[0] :zero
 
[1] :one
 
:else :foo)
 
 
;; desired macroexpansion:
 
(​cond​ (​=​ [x] [0]) :zero
 
(​=​ [x] [1]) :one
 
:else :foo)
 
 
;; expected outcomes
 
(describe ​"pattern matching"
 
(it ​"matches an int"
 
;; will only compile once we've written `match`
 
(​let​ [match-simple-int-input (​fn​ [n]
 
(match [n]
 
[0] :zero
 
[1] :one
 
[2] :two
 
:else :other))]
 
(should​=​ :zero (match-simple-int-input 0))
 
(should​=​ :one (match-simple-int-input 1))
 
(should​=​ :two (match-simple-int-input 2))
 
(should​=​ :other (match-simple-int-input 3)))))

The expected outcomes here are encoded using a unit testing library called Speclj,[33] whose Leiningen coordinates are [speclj 3.0.2].

How would you write match as a macro? One way is to build on top of cond:

language_features/pattern_matching_1a.clj
 
(​defmacro​ match [input & more]
 
(​let​ [clauses (​partition​ 2 more)]
 
`(​cond
 
~@(​mapcat​ (​fn​ [[match-expression result]]
 
(​if​ (​=​ :else match-expression)
 
[:else result]
 
[`(​=​ ~input ~match-expression)
 
result]))
 
clauses))))

This version, small as it is, is enough to pass our first set of tests. Take some time to make sure you see how it expands into the cond expression, because things are going to get trickier. We’ve said before that where possible, macros should stay small and delegate most of their work to helper functions, so let’s take our own advice and extract this function we passed to mapcat:

language_features/pattern_matching_1b.clj
 
(​defn​ match-clause [input [match-expression result]]
 
(​if​ (​=​ :else match-expression)
 
[:else result]
 
[`(​=​ ~input ~match-expression)
 
result]))
 
 
(​defmacro​ match [input & more]
 
(​let​ [clauses (​partition​ 2 more)]
 
`(​cond​ ~@(​mapcat​ (​partial​ match-clause input)
 
clauses))))

Ah, much clearer. What’s next? Well, one of the main features of pattern matching is that it allows you to bind values to names that we can use in the result expressions, so let’s make sure we can do that. Here’s what we expect to be able to do:

language_features/pattern_matching_2.clj
 
(it ​"matches and binds"
 
(​let​ [match-and-bind (​fn​ [[a b]]
 
(match [a b]
 
[0 y] {:axis ​"Y"​ :y y}
 
[x 0] {:axis ​"X"​ :x x}
 
[x y] {:x x :y y}))]
 
(should​=​ {:axis ​"Y"​ :y 5} (match-and-bind [0 5]))
 
(should​=​ {:axis ​"Y"​ :y 3} (match-and-bind [0 3]))
 
(should​=​ {:axis ​"X"​ :x 1} (match-and-bind [1 0]))
 
(should​=​ {:axis ​"X"​ :x 2} (match-and-bind [2 0]))
 
(should​=​ {:x 1 :y 2} (match-and-bind [1 2]))
 
(should​=​ {:x 2 :y 1} (match-and-bind [2 1]))))

Notice that in some cases, like the [0 y] clause, we want to check that the first element, a, is equal to 0, and if so, bind y to whatever b is. This isn’t such a straightforward problem, because we have to use some combination of checking and binding: there’s no straightforward function or macro in Clojure for us to use as a target for our macroexpansion.

Nevertheless, it’s not an insurmountable task: I spent an hour or two working on a solution to pass this test case, and here’s what came out of it:

language_features/pattern_matching_2a.clj
 
;; runtime helper
 
(​defn​ match-item? [matchable-item input]
 
(​if​ (​symbol?​ matchable-item)
 
true
 
(​=​ input matchable-item)))
 
 
;; macroexpansion helpers
 
(​defn​ create-test-expression [input match-expression]
 
`(​and​ (​=​ (​count​ ~input) ~(​count​ match-expression))
 
(​every?​ ​identity
 
(​map​ match-item? '~match-expression ~input))))
 
 
(​defn​ create-bindings-map [input match-expression]
 
(​let​ [pairs (​map​ ​vector​ match-expression input)]
 
(​into​ {} (​filter​ (​comp​ ​symbol?​ ​first​) pairs))))
 
 
(​defn​ create-result-with-bindings [input match-expression result]
 
(​let​ [bindings-map (create-bindings-map input match-expression)]
 
`(​let​ [~@(​mapcat​ ​identity​ bindings-map)]
 
~result)))
 
 
(​defn​ match-clause [input [match-expression result]]
 
(​if​ (​=​ :else match-expression)
 
[:else result]
 
[(create-test-expression input match-expression)
 
(create-result-with-bindings input match-expression result)]))
 
 
(​defmacro​ match [input & more]
 
(​let​ [clauses (​partition​ 2 more)]
 
`(​cond​ ~@(​mapcat​ (​partial​ match-clause input)
 
clauses))))

Now, we’ll go into some of the details, but it’s important to note here that this solution didn’t spring from my fingers fully formed. When I write macros, I tend to hit compile errors from time to time, at which point I take a deep breath, realize the compile errors are trying their best to help me, and move forward. Usually it’s helpful to check out what code is being generated, either directly from macroexpand-1 or by calling the helper functions with the intermediate values I expect to be passing around. We’re doing some pretty complex metaprogramming here—the fact that there’s not much code doesn’t mean it doesn’t take any time to get it all loaded up into your head.

The match macro stayed the same for this iteration, and the match-clause helper function kept its basic structure, but the matching logic needed to get much smarter. Since now the test expression needs to accommodate either a symbol binding or some literal value, create-test-expression makes sure there are the same number of elements in the input and match expressions. Did you notice that we’re being a little sloppy with how we dump the input expression into the macroexpansion? Anytime you unquote the same value multiple times, be sure there’s no potential for multiple evaluation like we saw in Macros Can Be Tricky to Get Right—or at least be sure you’re aware of that potential and willing to accept it. For the purpose of this example, I am going to accept it.

There’s a bit of an annoying issue to deal with, though. We’re macroexpanding into let expressions, but we’re not taking advantage of destructuring to allow pattern matching in nested structures. Let’s capture that in a test:

language_features/pattern_matching_3.clj
 
(it ​"handles vector destructuring"
 
(​let​ [match-and-bind (​fn​ [[a b]]
 
(match [a b]
 
[0 [y & more]] {:axis ​"Y"​ :y y :more more}
 
[[x & more] 0] {:axis ​"X"​ :x x :more more}
 
[x y] {:x x :y y}))]
 
(should​=​ {:axis ​"Y"​ :y 5 :more [6 7]} (match-and-bind [0 [5 6 7]]))
 
(should​=​ {:axis ​"X"​ :x 1 :more [2 3]} (match-and-bind [[1 2 3] 0]))
 
(should​=​ {:x 1 :y 2} (match-and-bind [1 2]))))

This issue just needs a few lines tweaked:

language_features/pattern_matching_3a.clj
 
(​defn​ match-item? [matchable-item input]
 
(​cond​ (​symbol?​ matchable-item) true
 
(​vector?​ matchable-item)
 
(​and​ (​sequential?​ input)
 
(​every?​ ​identity​ (​map​ match-item? matchable-item input)))
 
:else (​=​ input matchable-item)))
 
 
(​defn​ create-bindings-map [input match-expression]
 
(​let​ [pairs (​map​ ​vector​ match-expression (​concat​ input (​repeat​ nil)))]
 
(​into​ {} (​filter​ (​fn​ [[k v]]
 
(​not​ (​or​ (​keyword?​ k)
 
(​number?​ k)
 
(​nil?​ k))))
 
pairs))))

There’s some duplication that could be cleaned up, but this gives us a much better pattern matcher. We can use it to implement a concise and fast-enough merge function, to be used as part of merge sort, where we have two sorted halves of a collection from recursive calls, and need to merge them together:

language_features/pattern_matching_4.clj
 
(​defn​ ​merge​ [xs ys]
 
(​loop​ [acc [] xs xs ys ys]
 
(match [(​seq​ xs) (​seq​ ys)]
 
[nil b] (​concat​ acc b)
 
[a nil] (​concat​ acc a)
 
[[x & x-rest] [y & y-rest]]
 
(​if​ (​<​ x y)
 
(​recur​ (​conj​ acc x) x-rest ys)
 
(​recur​ (​conj​ acc y) xs y-rest)))))
 
 
 
(it ​"implements merge (from merge-sort)"
 
(should​=​ [1 2 3] (​merge​ [1 2 3] []))
 
(should​=​ [1 2 3] (​merge​ [1 2 3] nil))
 
(should​=​ [1 2 3 4] (​merge​ [1 2 3] [4]))
 
(should​=​ [1 2 3] (​merge​ [] [1 2 3]))
 
(should​=​ [1 2 3] (​merge​ nil [1 2 3]))
 
(should​=​ [1 2 3 4 5 6 7] (​merge​ [1 3 4 7] [2 5 6])))

We’ll give this pattern matcher the tongue-in-cheek name snore.match—as a community, can do a lot better than this. Consider core.match,[34] a pattern matching library written by David Nolen, which is also built using macros, but with much more thought and care put into it. It’s built for speed, based on an algorithm by Luc Maranget to eliminate branches of the search tree more quickly (but keeping the sequential cond-like ordering of the match clauses). In addition to providing many more types of pattern matches than the simple ones we’ve done here, it even allows extending the pattern matching language itself! Take a look at the core.match wiki[35] for details.

Keep in mind that pattern matching is considered to be a crucial differentiator for a number of programming languages, and a mark of pride. And despite the fact that Clojure itself doesn’t ship with a pattern matcher, we’re able to create that feature ourselves, as a library, so that others can use it without having to change the core language.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.225.255.187