Chapter 15. More macros and DSLs

 

This chapter covers

  • A review of Clojure macros
  • Anaphoric macros
  • Shifting computation to compile time
  • Macro generating macros
  • Domain-specific languages in Clojure

 

This final chapter is about what many consider the most powerful feature of Clojure. John McCarthy, the inventor of the Lisp programming language, once said that Lisp is a local maximum in the space of programming languages. Clojure macros make Clojure a programmable programming language because they can perform arbitrary transformations of Clojure code, using Clojure itself. No programming language outside the Lisp family can do this in such a simple way. To reiterate the obvious, this is possible because code is data.

You’ve seen a lot of macros through the course of this book, including in chapter 7, which served as an introduction to the topic. In this chapter, you’re going to see a lot more but with two new points of focus. The first will be advanced uses of macros, and the second will be the conscious design of a simple domain-specific language.

15.1. Macros

This section is about the things you can do with macros. You’ve already used macros quite a bit, so you should be familiar with the basics. As a refresher, we’ll write a little macro in order to remind you what macros make possible. You’ve used Clojure’s let macro several times so far. Although let itself is a macro, it’s implemented in terms of the let* special form, which sets up lexical scope for the symbols named in the binding form. We’ll now implement a subset of the functionality of let via a macro that generates function calls. This is what we’d like to do:

(my-let [x 10
         y x
         z (+ x y)]
  (* x y z))

This should return 2000, because x is 10, y is also 10, and z is 20. Here’s the implementation:

(defmacro single-arg-fn [binding-form & body]
  `((fn [~(first binding-form)] ~@body) ~(second binding-form)))

(defmacro my-let [lettings & body]
  (if (empty? lettings)
    `(do ~@body)
    `(single-arg-fn ~(take 2 lettings)
       (my-let ~(drop 2 lettings) ~@body))))

Although this is a limited implementation, you still get all the advantages that arise from using functions underneath the covers. For instance, you can do the following:

user> (my-let [[a b] [2 5]
               {:keys [x y]} {:x (* a b) :y 20}
               z (+ x y)]
        (println "a,b,x,y,z:" a b x y z)
        (* x y z))
a,b,x,y,z: 2 5 10 20 30
6000

We’re not doing any error checking, but hopefully this example has reminded you how macros work, as well as shown you how to seemingly add features to the Clojure language itself. Use macroexpand-1 and macroexpand to get a hint as to how my-let does its thing. We’re now ready to look beyond the basics.
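To make the recursion concrete, it can help to write out by hand what a small my-let call turns into. The following sketch repeats the two macros from above and compares a two-binding call with nested single-argument function calls. The hand-written form and the names via-macro and by-hand are our own illustration, not output copied from macroexpand.

```clojure
(defmacro single-arg-fn [binding-form & body]
  `((fn [~(first binding-form)] ~@body) ~(second binding-form)))

(defmacro my-let [lettings & body]
  (if (empty? lettings)
    `(do ~@body)
    `(single-arg-fn ~(take 2 lettings)
       (my-let ~(drop 2 lettings) ~@body))))

;; What we write:
(def via-macro
  (my-let [x 10
           y x]
    (* x y)))

;; Roughly what it becomes once every macro call is expanded:
;; each binding turns into a one-argument function applied to its value.
(def by-hand
  ((fn [x]
     ((fn [y]
        (do (* x y)))
      x))
   10))
```

Both forms evaluate to 100, which is why later bindings can see earlier ones: each one lives inside the function body created for the binding before it.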

Broadly speaking, in this section, we’re going to explore three new concepts. The first is that of anaphora, an approach of writing macros that utilize intentional variable capture to their advantage. You’ll see why they’re called anaphoric and what they might be used for.

The second concept we’ll explore is the idea of moving some of the computation from a program’s runtime into its compile time. Some computation that would otherwise be done when the program is already running will now be done while the code is being compiled. You’ll see not only where this might be useful but also an example.

Finally, we’ll look at writing macros that generate other macros. This can be tricky, and we’ll look at a simple example of such a macro. Understanding macro-generating macros is a sign of being on the path to macro zen.

Without further ado, our first stop is Clojure anaphora.

15.1.1. Anaphoric macros

In the chapter on the basics of macros, we talked about the issue of variable capture. You saw that Clojure solves this issue in an elegant manner through two processes: the first is that names inside a macro template get namespace qualified to the namespace that the macro is defined in, and the second is by providing a convenient auto-gensym reader macro.
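As a brief refresher on those two mechanisms, here is a minimal sketch. The macro twice and the var names are made up for illustration; the namespace shown in the comments assumes the code is evaluated in the user namespace.

```clojure
;; Inside a syntax-quoted template, a plain symbol is namespace-qualified...
(def qualified `it)        ; a symbol such as user/it
;; ...unless we unquote a quoted symbol, which leaves it bare.
(def bare `~'it)           ; the plain symbol it

;; The auto-gensym suffix (#) gives the template a fresh local name
;; that cannot collide with names in the caller's code.
(defmacro twice [expr]
  `(let [result# ~expr]
     (+ result# result#)))

(def result 5)                        ; a caller-side name that stays untouched
(def doubled (twice (inc result)))    ; 12
```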

Macros that do their work based on intentional variable capture are called anaphoric macros. The chapter on web services introduced the use of anaphoric macros in the section on Compojure. In this section, we’ll do more variable capture but in a slightly more complex manner. To get things started, we’ll visit a commonly cited example that illustrates this concept. We’ll then build on it to write a useful utility macro.

15.1.2. The anaphoric if

Writing the anaphoric version of the if construct is the “Hello, world!” of anaphora. The anaphoric if is probably one of the simplest of its ilk, but it illustrates the point well, while also being a useful utility macro.

Consider the following example, where we first do a computation, check if it is truthy, and then proceed to use it in another computation. Imagine that we had the following function:

(defn some-computation [x]
  (if (even? x) false (inc x)))

It’s a placeholder to illustrate the point we’re about to make. Now consider a use case as follows:

(if (some-computation 11)
 (* 2 (some-computation 11)))

Naturally, you wouldn’t stand for such duplication, and you’d use the let form to remove it:

(let [computation (some-computation 11)]
  (if computation
    (* 2 computation)))

You also know that you don’t need to stop here, because you can use the handy if-let macro:

(if-let [computation (some-computation 11)]
  (* 2 computation))

Although this is clear enough, it would be nice if you could write something like the following, for it to read more clearly:

(anaphoric-if (some-computation 11)
  (* 2 it))

Here, it is a symbol that represents the value of the condition clause. Most anaphoric macros use pronouns such as it to refer to some value that was computed. The word anaphor means a word or phrase that refers to an earlier word or phrase.

Implementing Anaphoric-if

Now that you’ve seen what you’d like to express in the code, let’s set about implementing it. You could imagine writing it as follows:

(defmacro anaphoric-if [test-form then-form]
  `(if-let [~'it ~test-form]
     ~then-form))

Here’s the macro expansion of the example from earlier:

user> (macroexpand-1 '(anaphoric-if (some-computation 11)
        (* 2 it)))
(clojure.core/if-let [it (some-computation 11)] (* 2 it))

That expansion looks exactly like what you need because it creates a local name it and binds the value of the test-form to it. It then evaluates the then-form inside the let block created by the if-let form, which ensures that it happens only if the value of it is truthy. Here it is in action:

user> (anaphoric-if (some-computation 12)
           (* 2 it))
nil

user> (anaphoric-if (some-computation 11)
           (* 2 it))
24

Notice how we had to force Clojure to not namespace qualify the name it. We did this by unquoting a quoted symbol (that’s what the strange notation ~'it is). This forces the variable capture. We’ll use this technique (and the unquote splice version of it) again in the following sections.

There you have it, a simple macro that adds some convenience. It’s important to remember that when using anaphora, you’re using variable capture. So although it may be OK that the symbol it is captured in this case, it may not be in other cases. You have to be watchful for situations where intentional variable capture can cause subtle bugs.
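The following sketch shows the kind of surprise this can cause. It reuses some-computation and anaphoric-if from above; the names surprising and nested are ours, chosen only for the example.

```clojure
(defn some-computation [x]
  (if (even? x) false (inc x)))

(defmacro anaphoric-if [test-form then-form]
  `(if-let [~'it ~test-form]
     ~then-form))

;; The caller already has a local named it...
(def surprising
  (let [it :callers-value]
    (anaphoric-if (some-computation 11)
      it)))                               ; ...but here it is 12

;; Nesting has the same effect: the inner it hides the outer one.
(def nested
  (anaphoric-if (some-computation 11)     ; outer it = 12
    (anaphoric-if (some-computation 1)    ; inner it = 2
      [:inner it])))                      ; the outer 12 is out of reach
```

If the body needs the outer value as well, it has to be given another name with an ordinary let before the inner form.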

Now that we have an anaphoric version of if, we’re ready to move on to a more complex example. Before we do, let’s write a macro that generalizes our anaphoric if a little.

Generalizing the anaphoric if

Let’s recall our implementation of the anaphoric if macro:

(defmacro anaphoric-if [test-form then-form]
  `(if-let [~'it ~test-form]
     ~then-form))

Note that we built this on the if-let macro, which in turn is built on the if special form. If you were to remove the hard dependency on the if special form and instead specify it at call time, you could have a more general version of this code on your hands. Let’s take a look:

(defmacro with-it [operator test-form & exprs]
  `(let [~'it ~test-form]
     (~operator ~'it ~@exprs)))

So, we take the idea from anaphoric-if and create a new version of it where we need to pass in the thing we’re trying to accomplish. For instance, the example from before would now read like this:

user> (with-it if (some-computation 12)
         (* 2 it))
nil

user> (with-it if (some-computation 11)
         (* 2 it))
24

Why would you want to do this? Because now you can have an anaphoric version of more than the if form. For example, you could do the following:

user> (with-it and (some-computation 11) (> it 10) (* 2 it))
24

Or you could do this:

user> (with-it when (some-computation 11)
        (println "Got it:" it)
        (* 2 it))
Got it: 12
24

Try these out at the REPL, and also try versions that use if-not, or, when-not, and so on. You could even go back and define macros like anaphoric-if in terms of with-it, for instance:

(defmacro anaphoric-if [test-form then-form]
  `(with-it if ~test-form ~then-form))

You could define all such variants (using if, and, or, and so on) in one swoop. This wraps up our introduction to anaphoric macros. As we mentioned at the start of this section, these examples are quite simple. The next one will be slightly more involved.
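One way to define several variants at once is sketched below. The helper def-anaphoric-variants and the anaphoric- naming scheme are hypothetical, and the sketch assumes with-it is defined in the same namespace. It builds each forwarding macro with list* rather than a nested back quote, a technique we come to later in this chapter.

```clojure
(defmacro with-it [operator test-form & exprs]
  `(let [~'it ~test-form]
     (~operator ~'it ~@exprs)))

;; Hypothetical helper: for each operator, define a macro named
;; anaphoric-<operator> that simply forwards to with-it.
(defmacro def-anaphoric-variants [& operators]
  `(do
     ~@(for [op operators]
         `(defmacro ~(symbol (str "anaphoric-" op)) [test-form# & exprs#]
            (list* 'with-it '~op test-form# exprs#)))))

(def-anaphoric-variants if and when)

(def from-if   (anaphoric-if 12 (* 2 it)))               ; 24
(def from-and  (anaphoric-and 12 (> it 10) (* 2 it)))    ; 24
(def from-when (anaphoric-when false (* 2 it)))          ; nil
```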

15.1.3. The thread-it macro

A couple of the most useful macros in Clojure’s core namespace are the threading macros. This refers to the thread-first and the thread-last macros, which we covered in chapter 2. As a refresher, we’ll write a function to calculate the surface area of a cylinder with a radius r and height h. The formula is

2 * PI * r * (r + h)

Using the thread-first macro, you can write this as

(defn surface-area-cylinder [r h]
  (-> r
      (+ h)
      (* 2 Math/PI r)))

You saw a similar example when we first encountered this macro. Instead of writing something like a let form with intermediate results of a larger computation, the result of the first form is fed into the next form as the first argument, the result of that is then fed into the next form as its first argument in turn, and so on. It’s a significant improvement in code readability.

The thread-last macro is the same, but instead of placing consecutive results in the first argument position of the following form, it places them in the position of the last argument. It’s useful in code that’s similar to the following hypothetical example:

(defn some-calculation [a-collection]
  (->> (seq a-collection)
       (filter some-pred?)
       (map a-transform)
       (reduce another-function)))

Now, although both the thread-first and thread-last macros are extremely useful, they do have a possible shortcoming: they both fix the position of where each step of the computation is placed into the next form. The thread-first places it as the first argument of the next call, whereas the thread-last macro places it in the position of the last argument.
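A tiny sketch makes the difference in position visible. Subtraction is used because argument order changes the result; the var names are ours.

```clojure
;; Thread-first: the running value becomes the FIRST argument.
(def first-result (-> 10 (- 3)))     ; (- 10 3) => 7

;; Thread-last: the running value becomes the LAST argument.
(def last-result (->> 10 (- 3)))     ; (- 3 10) => -7

;; A step that needs the value in the middle fits neither macro directly.
;; assoc takes (collection key value), so we wrap it in an anonymous function.
(def middle (-> 1 (#(assoc [:a :b :c] % :x))))   ; [:a :x :c]
```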

Occasionally, this can be limiting. Consider our previous snippet of code. Imagine if you wanted to use a function written by someone else called compute-averages-from that accepts two arguments: a sequence of data and a predicate, in that order. As it stands, you couldn’t plug that function into the threaded code shown previously, because thread-last would place the sequence in the last argument position, whereas the function expects it first. You’d have to adapt the function, perhaps as follows:

(defn another-calculation [a-collection]
  (->> (seq a-collection)
       (filter some-pred?)
       (map a-transform)
       (#(compute-averages-from % another-pred?))))

You’ve seen the use of anonymous functions to create adapter functions such as this before, but it isn’t pretty. It spoils the overall elegance by adding some noise to the code. What if, instead of being limited to threading forms as the first and last arguments of subsequent forms, you could choose where to thread them?

Implementing thread-it

As you can guess from the fact that we’re in the middle of a section on anaphoric macros, we’re going to choose a symbol, which will be the placeholder for where we’d like our new threading macro to thread forms into. We’ll use the it symbol and call the macro thread-it. With our new macro, we’d be able to do something like this:

(defn yet-another-calculation [a-collection]
  (thread-it (seq a-collection)
             (filter some-pred? it)
             (map a-transform it)
             (compute-averages-from it another-pred?)))

Before we jump into the implementation, let’s make one more change relative to Clojure’s built-in threading macros, which expect at least one argument. We’d like to be able to call our thread-it macro without any arguments. This may be useful when you’re using it inside another macro. Although the following doesn’t work

user> (->)
Wrong number of args (2) passed to: core$--GT
  [Thrown class java.lang.IllegalArgumentException]

we’d like our macro to do this:

user> (thread-it)
nil

Now we’re ready to look at the implementation. Consider the following:

(defmacro thread-it [& [first-expr & rest-expr]]
  (if (empty? rest-expr)
    first-expr
    `(let [~'it ~first-expr]
       (thread-it ~@rest-expr))))

As you can see, the macro accepts any number of arguments. The list of arguments is destructured into the first (named first-expr) and the rest (named rest-expr). The first task is to check to see if rest-expr is empty (which happens when either no arguments were passed in or a single argument was passed in). If this is so, the macro will return first-expr, which will be nil if there were no arguments passed into thread-it or the single argument if only one was passed in.

If there are arguments remaining inside rest-expr, the macro expands to another call to itself, with the symbol it bound to the value of first-expr, nestled inside a let block. This recursive macro definition expands until it has consumed all the forms it was passed in. Here’s an example of it in action:

user> (thread-it (* 10 20) (inc it) (- it 8) (* 10 it) (/ it 5))
386
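To see why this works, it may help to write out the nested let forms that the recursive expansion unfolds into. The sketch below repeats the macro and compares it with a hand-written equivalent; the names via-macro and by-hand are ours.

```clojure
(defmacro thread-it [& [first-expr & rest-expr]]
  (if (empty? rest-expr)
    first-expr
    `(let [~'it ~first-expr]
       (thread-it ~@rest-expr))))

(def via-macro
  (thread-it (* 10 20) (inc it) (- it 8) (* 10 it) (/ it 5)))

;; The same computation as the nested lets the macro unfolds into;
;; each let rebinds it to the result of the previous step.
(def by-hand
  (let [it (* 10 20)]          ; 200
    (let [it (inc it)]         ; 201
      (let [it (- it 8)]       ; 193
        (let [it (* 10 it)]    ; 1930
          (/ it 5))))))        ; 386
```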

Also, the way we’ve implemented it, the following behavior is expected:

user> (thread-it it)
; Evaluation aborted.
Unable to resolve symbol: it in this context
  [Thrown class java.lang.Exception]

This happens because we don’t start by binding anything to it. You could change this behavior by initially binding it to a default value of some kind. That’s all there is to the implementation. It can be a useful macro in situations where the functions (or macros) in a threading form take arguments in an irregular order. Further, as a refinement, or perhaps as another version of this macro, you could replace the let with an if-let. This will short-circuit the computation if any step results in a logically false value. The implementation of that is straightforward and is left as an exercise to the reader.
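For readers who want to check their answer, here is one possible sketch of the short-circuiting variant. The name thread-it-if is our own choice, and this is only one of several reasonable ways to write it.

```clojure
(defmacro thread-it-if [& [first-expr & rest-expr]]
  (if (empty? rest-expr)
    first-expr
    ;; if-let binds it only when the step is truthy; otherwise the whole
    ;; form evaluates to nil and the remaining steps never run.
    `(if-let [~'it ~first-expr]
       (thread-it-if ~@rest-expr))))

(def found   (thread-it-if {:a 1} (:a it) (inc it)))   ; 2
(def missing (thread-it-if {:a 1} (:b it) (inc it)))   ; nil, inc is never called
```

With the plain let version, the second call would instead fail when inc is handed nil.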

This leads us to the end of the discussion on anaphora. It’s a useful technique at times, even though it breaks hygiene because it involves variable capture. As we mentioned, you have to be careful while using it, but when you do, it can result in code that’s more readable than it would be otherwise.

Our next stop is to examine another use case of macros. We’re going to make the Clojure compiler work harder by doing some work that would otherwise have to be done by our program at runtime.

15.1.4. Shifting computation to compile time

So far in this book, you’ve seen several uses of macros and have written several macros yourself. In this section, you’re going to see another use of macros, and it has to do with performance. In order to illustrate the concept, we’ll examine a simple code cipher called ROT13. It stands for “rotate by 13 places” and is a simple cipher that can be broken quite easily. But its purpose is to hide text in a way that isn’t immediately obvious, not to communicate spy secrets. It’s commonly used as the online equivalent of text printed upside down (for example, in magazines and newspapers), to give out puzzle solutions, answers to riddles, and the like.

About the ROT13 cipher

Table 15.1 shows what each letter of the alphabet corresponds to.

Table 15.1. The alphabet rotated by 13 places

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
 a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z
 n  o  p  q  r  s  t  u  v  w  x  y  z  a  b  c  d  e  f  g  h  i  j  k  l  m

The first row is the index for each letter of the alphabet, starting at 1. The second row is the alphabet itself. The last row is the alphabet shifted by 13. Each letter on this last row corresponds to the letter that will be used in place of the letter above it in a message encrypted using this cipher system. For example, the word abracadabra becomes noenpnqnoen.

Decrypting a rotation cipher is usually done by rotating each letter back the same number of times. ROT13 has the additional property of being a reciprocal cipher. A message encrypted using a reciprocal cipher can be decrypted by running it through the cipher system itself. The encryption process also works to decrypt encrypted messages. In this section, we’ll implement a generalized rotation cipher by allowing the rotation length to be passed in as a parameter.

Generalized rotation ciphers

Let’s begin our implementation with the letters of the alphabet. Recall that Clojure has a convenient reader macro to represent literal characters:

(def ALPHABETS [\a \b \c \d \e \f \g \h \i \j \k \l \m \n \o \p \q \r \s \t
     \u \v \w \x \y \z])

Let’s also define a few convenience values based on the alphabet shown:

(def NUM-ALPHABETS (count ALPHABETS))

(def INDICES (range 1 (inc NUM-ALPHABETS)))

(def lookup (zipmap INDICES ALPHABETS))

Now, let’s talk about our approach. Because we want to implement a generic rotation mechanism, we’ll need to know at which numbered slot a letter falls when it’s rotated a specific number of times. We’d like to take a slot number such as 14, rotate it by a configurable number, and see where it ends up. For example, in the case of ROT13, the letter in slot number 10 (which is the letter j) ends up in slot 23. We’ll write a function called shift, which will compute this new slot number. We can’t simply add the shift-by number to the slot number, because we have to take care of overflow. Here’s the implementation of shift:

(defn shift [shift-by index]
  (let [shifted (+ (mod shift-by NUM-ALPHABETS) index)]
    (cond
      (<= shifted 0) (+ shifted NUM-ALPHABETS)
      (> shifted NUM-ALPHABETS) (- shifted NUM-ALPHABETS)
      :default shifted)))

The point to note here is that we calculate shifted by adding (mod shift-by NUM-ALPHABETS) to the given index, rather than shift-by itself, so that we can handle cases where shift-by is negative or larger than NUM-ALPHABETS. Because mod with a positive divisor always yields a value from 0 up to one less than the divisor, the sum can overflow at most once, and we handle that overflow by wrapping around to the beginning. For example:

user> (shift 10 13)
23

user> (shift 20 13)
7
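The edge cases are worth trying as well. In the sketch below, shift is repeated from above, and NUM-ALPHABETS is written as the literal 26 only to keep the example self-contained; the var names are ours.

```clojure
(def NUM-ALPHABETS 26)   ; assumption: literal value instead of (count ALPHABETS)

(defn shift [shift-by index]
  (let [shifted (+ (mod shift-by NUM-ALPHABETS) index)]
    (cond
      (<= shifted 0) (+ shifted NUM-ALPHABETS)
      (> shifted NUM-ALPHABETS) (- shifted NUM-ALPHABETS)
      :default shifted)))

;; A shift larger than the alphabet behaves like its remainder: 39 mod 26 = 13.
(def big-shift (shift 39 10))          ; same as (shift 13 10) => 23

;; (mod -13 26) is 13, so shifting back by 13 equals shifting forward by 13.
(def negative-shift (shift -13 23))    ; 13 + 23 = 36, which wraps to 10

;; Shifting by a full alphabet leaves the slot unchanged.
(def full-turn (shift 26 5))           ; 0 + 5 => 5
```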

Now that you have this function, you can use it to create a simple cryptographic tableau, a table of rows and columns with which you can decrypt or encrypt information. In our case, for ROT13, the tableau would be the second and third rows from table 15.1. Here’s a function that computes this:

(defn shifted-tableau [shift-by]
  (->> (map #(shift shift-by %) INDICES)
       (map lookup)
       (zipmap ALPHABETS)))

This creates a map where the keys are the letters that need to be encrypted, and the values are the cipher versions of the same. Here’s an example:

user> (shifted-tableau 13)
{\a \n, \b \o, \c \p, \d \q, \e \r, \f \s, \g \t, \h \u, \i \v, \j \w, \k \x,
     \l \y, \m \z, \n \a, \o \b, \p \c, \q \d, \r \e, \s \f, \t \g, \u \h, \v
     \i, \w \j, \x \k, \y \l, \z \m}

Because our cipher is quite simple, a simple map such as this suffices. Now that you have our tableau, encrypting messages is as simple as looking up each letter. Here’s the encrypt function:

(defn encrypt [shift-by plaintext]
  (let [shifted (shifted-tableau shift-by)]
    (apply str (map shifted plaintext))))

Try it at the REPL:

user> (encrypt 13 "abracadabra")
"noenpnqnoen"

That works as expected. Recall that ROT13 is a reciprocal cipher. Let’s see if it works:

user> (encrypt 13 "noenpnqnoen")
"abracadabra"

It does! If you rotate by anything other than 13, you’ll need a real decrypt function. All you need to do to decrypt a message is to reverse the process. Let’s express that as follows:

(defn decrypt [shift-by encrypted]
  (encrypt (- shift-by) encrypted))

decrypt works by rotating an encrypted message the other way by the same rotation. Let’s see it work at the REPL:

user> (decrypt 13 "noenpnqnoen")
"abracadabra"

Great, so we have all the bare necessities in place. In order to implement a particular cipher, such as ROT13, you can define a pair of functions as follows:

(def encrypt-with-rot13 (partial encrypt 13))

(def decrypt-with-rot13 (partial decrypt 13))

Now try it at the REPL:

user> (decrypt-with-rot13 (encrypt-with-rot13 "abracadabra"))
"abracadabra"

So there you have it; we’ve implemented the simple cipher system. The complete code is shown in the following listing.

Listing 15.1. A general rotation cipher system to implement things like ROT13
(ns chapter-macros.shifting)

(def ALPHABETS [\a \b \c \d \e \f \g \h \i \j \k \l \m \n \o \p \q \r \s \t
     \u \v \w \x \y \z])

(def NUM-ALPHABETS (count ALPHABETS))

(def INDICES (range 1 (inc NUM-ALPHABETS)))

(def lookup (zipmap INDICES ALPHABETS))

(defn shift [shift-by index]
  (let [shifted (+ (mod shift-by NUM-ALPHABETS) index)]
    (cond
      (<= shifted 0) (+ shifted NUM-ALPHABETS)
      (> shifted NUM-ALPHABETS) (- shifted NUM-ALPHABETS)
      :default shifted)))

(defn shifted-tableau [shift-by]
  (->> (map #(shift shift-by %) INDICES)
       (map lookup)
       (zipmap ALPHABETS)))

(defn encrypt [shift-by plaintext]
  (let [shifted (shifted-tableau shift-by)]
    (apply str (map shifted plaintext))))

(defn decrypt [shift-by encrypted]
  (encrypt (- shift-by) encrypted))

(def encrypt-with-rot13 (partial encrypt 13))

(def decrypt-with-rot13 (partial decrypt 13))

The issue with this implementation is that you compute the tableau each time you encrypt or decrypt a message. This is easily fixed by memoizing the shifted-tableau function. This will take care of this problem, but in the next section, we’ll go one step further.
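A minimal sketch of the memoized approach follows. To stay short it uses a compact, zero-based stand-in for shifted-tableau rather than the version in listing 15.1, and the counter tableau-calls and the name memoized-tableau are our additions, there only to observe how often the tableau is built.

```clojure
;; Compact stand-in for the listing's tableau (zero-based indices).
(def ALPHABETS (vec "abcdefghijklmnopqrstuvwxyz"))
(def tableau-calls (atom 0))

(defn shifted-tableau [shift-by]
  (swap! tableau-calls inc)            ; count real computations
  (zipmap ALPHABETS
          (map #(nth ALPHABETS (mod (+ % shift-by) 26)) (range 26))))

;; memoize caches results per argument, so each shift-by is computed once.
(def memoized-tableau (memoize shifted-tableau))

(defn encrypt [shift-by plaintext]
  (apply str (map (memoized-tableau shift-by) plaintext)))

(def first-call  (encrypt 13 "abracadabra"))   ; builds the tableau
(def second-call (encrypt 13 "noenpnqnoen"))   ; reuses the cached tableau
```

After both calls, the counter stands at 1: the second call found the tableau for 13 in the cache.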

Making the compiler work harder

So far, we’ve implemented functions to encrypt and decrypt messages for any rotation cipher. Our basic approach has been to create a map that can help us code (or decode) each letter in a message to its cipher version. As discussed at the end of the previous section, we can speed up our implementation by memoizing the tableau calculation.

Even with memoize, the computation still happens at least once (the first time the function is called). Imagine, instead, if you created an inline literal map containing the appropriate tableau data. You could then look it up in the map each time, without having to compute it. Such a definition of encrypt-with-rot13 might look like this:

(defn encrypt-with-rot13 [plaintext]
  (apply str (map {\a \n \b \o \c \p} plaintext)))

In a real implementation, the tableau would be complete for all the letters of the alphabet, not only for a, b, and c. In any case, if you did have such a literal map in the code itself, it would obviate the need to compute it at runtime. Luckily, we’re coding in Clojure, and you can bend it to your will. Consider the following:

(defmacro def-rot-encrypter [name shift-by]
  (let [tableau (shifted-tableau shift-by)]
    `(defn ~name [~'message]
       (apply str (map ~tableau ~'message)))))

This macro first computes the tableau for shift-by as needed and then defines a function by the specified name. The function body includes the computed table, in the right place, as we illustrated in the code sample a moment ago. Look at its expansion:

user> (macroexpand-1 '(def-rot-encrypter encrypt13 13))
(clojure.core/defn encrypt13 [message] (clojure.core/apply clojure.core/str
     (clojure.core/map {\a \n, \b \o, \c \p, \d \q, \e \r, \f \s, \g \t, \h
     \u, \i \v, \j \w, \k \x, \l \y, \m \z, \n \a, \o \b, \p \c, \q \d, \r
     \e, \s \f, \t \g, \u \h, \v \i, \w \j, \x \k, \y \l, \z \m} message)))

This looks almost exactly like our desired function, with an inline literal tableau map. Figure 15.1 shows the flow of the code.

Figure 15.1. As usual, the Clojure reader first converts the text of our programs into data structures. Macros are then expanded, including our def-rot-encrypter macro, which generates a tableau. This tableau is a Clojure map and is included in the final form of the source code as an inline lookup table.

Let’s check to see if it works:

user> (def-rot-encrypter encrypt13 13)
#'user/encrypt13

user> (encrypt13 "abracadabra")
"noenpnqnoen"

And there you have it. Our new encrypt13 function at runtime doesn’t do any tableau computation at all. If you were to, for instance, ship this code off to someone as a Java library, they wouldn’t even know that shifted-tableau was ever called.

As a final item, we’ll create a convenient way to define a pair of functions, which can be used to encrypt or decrypt messages in a rotation cipher:

(defmacro define-rot-encryption [shift-by]
  `(do
     (def-rot-encrypter ~(symbol (str "encrypt" shift-by)) ~shift-by)
     (def-rot-encrypter ~(symbol (str "decrypt" shift-by)) ~(- shift-by))))

And finally, here it is in action:

user> (define-rot-encryption 15)
#'user/decrypt15

Here, it prints the decrypt function var, because it was the last thing the macro expansion did. Let’s use our new pair of functions:

user> (encrypt15 "abracadabra")
"pqgprpspqgp"

user> (decrypt15 "pqgprpspqgp")
"abracadabra"

Shifting computation to the compile cycle can be a useful trick when parts of the computation needed are known in advance. Clojure macros make it easy to run arbitrary code during the expansion phase and to give the programmer the power of the full Clojure language itself. In this example, for instance, we wrote the shifted-tableau function with no prior intention of using it in this manner. Moving computation into macros this way can be quite handy at times, despite how simple it is to do.
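The same idea can be seen in an even smaller setting. The macro compile-time-sum below is a made-up example, not part of the cipher code; it only illustrates that a macro body runs during expansion and that its result is what gets compiled.

```clojure
;; The body of a macro runs during expansion, so this sum is computed
;; while the code is compiled; only the resulting number reaches the program.
(defmacro compile-time-sum [n]
  (reduce + (range (inc n))))

(defn total []
  (compile-time-sum 100))   ; compiles as if we had written (defn total [] 5050)

;; The argument has to be a literal: the macro sees the form, not its value.
;; (let [n 100] (compile-time-sum n)) would fail during expansion, because
;; inc would be handed the symbol n rather than a number.
```

The same restriction applies to def-rot-encrypter and define-rot-encryption: the rotation length must be written as a literal number, because shifted-tableau is called with it while the macro is being expanded.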

15.1.5. Macro-generating macros

Now that you understand what it is to move computation to the compile phase of program execution, you’re ready for a new adventure. We’ll expand your mind a little as we try to write code that writes code that writes code—we’re going to write a macro that writes a macro.

We’ll take an example that’s most often used to illustrate this, and it’s probably the simplest example of such a macro. But it will serve well to illustrate this topic. The macro will create a synonym for an existing function or macro. Imagine you have two vars as follows (from Clojure 1.3 onward, a var must be marked ^:dynamic before binding can rebind it):

user> (declare ^:dynamic x ^:dynamic y)
#'user/y

And if you use our new macro make-synonym

user> (make-synonym b binding)
#'user/b

then the following should work:

user> (b [x 10 y 20] (println "X,Y:" x y))
X,Y: 10 20

We’ll implement the make-synonym macro in this section.

An example template

When writing a macro, it’s usually easier to start with an example of the desired expansion. In this case, we can use the previous example:

(b [x 10 y 20] (println "X,Y:" x y))

For this to work, b should be replaced with binding, resulting in the expansion:

(binding [x 10 y 20] (println "X,Y:" x y))

You could easily solve this if you wrote a custom macro defining b in terms of binding, as follows:

(defmacro b [& stuff]
  `(binding ~@stuff))

This replaces the symbol b with the symbol binding, keeping everything else the same. We aren’t interested in the vars being bound, or the body itself, which is why we lump everything into stuff.

Now that we have a version of b that works as expected, we need to generalize it into make-synonym. The previous code is an example of what our make-synonym macro ought to produce.

Implementing make-synonym

You know make-synonym is a macro and that it accepts two parameters. The first parameter is a new symbol that will be the synonym of the existing macro or function, whereas the second parameter is the name of the existing macro or function. We can begin implementing our new macro by starting with an empty definition:

(defmacro make-synonym [new-name old-name])

The next question is, what should go in the body? We can start by putting in the sample expansion from the previous section. Here’s what it looks like:

(defmacro make-synonym [new-name old-name]
  (defmacro b [& stuff]
    `(binding ~@stuff)))

Obviously, this won’t work as desired, because no matter what’s passed in as arguments to this version of make-synonym, it will always create a macro named b (that expands to binding).

What we want, instead, is for make-synonym to produce the inner form containing the call to defmacro, instead of calling it. We know we can do this using the back quote. In this case, we’ll have two back quotes. While we’re at it, instead of the hard-coded symbols b and binding, we’ll use the names passed in as parameters. Consider the following increment of our make-synonym macro:

(defmacro make-synonym [new-name old-name]
  `(defmacro ~new-name [& stuff]
     `(~old-name ~@stuff)))

This is a little confusing, because we have two back quotes in play here, one nested inside the other. The easiest way to understand what’s happening is to look at an expansion. We’ll try it at the REPL:

user> (macroexpand-1 '(make-synonym b binding))
(clojure.core/defmacro b [& user/stuff]
  (clojure.core/seq (clojure.core/concat (clojure.core/list user/old-name)
                                                      user/stuff)))

In order to understand this expansion, let’s first look at what happens to a back quote when it’s expanded:

user> (defmacro back-quote-test []
        `(something))
#'user/back-quote-test

user> (macroexpand '(back-quote-test))
(user/something)

This isn’t surprising, because Clojure namespace-qualifies any names in a back-quoted form unless explicitly asked not to. Now, let’s add a back quote:

user> (defmacro back-quote-test []
        ``(something))
#'user/back-quote-test

We’ve added another back quote to the one already present. What we’re saying is instead of expanding the back-quoted form and using its return value as the expansion of the back-quote-test macro, we want the back-quoting mechanism itself. Here it is at the REPL:

user> (macroexpand '(back-quote-test))
(clojure.core/seq (clojure.core/concat (clojure.core/list
                                          (quote user/something))))

Because we’re using the symbol something as is, Clojure namespace-qualifies it, as you’d expect. Now that you know what the back-quote mechanism itself is, we can return to the expansion of make-synonym:

user> (macroexpand-1 '(make-synonym b binding))
(clojure.core/defmacro b [& user/stuff]
  (clojure.core/seq (clojure.core/concat (clojure.core/list user/old-name)
                                                      user/stuff)))

Here, the symbol b gets substituted as part of the expansion of the outer back-quote expansion. Because we don’t explicitly quote the symbol stuff, it gets namespace qualified (we’ll need to fix that soon). To understand what’s happening to old-name inside the nested back quote, let’s look at the following:

user> (defmacro back-quote-test []
        ``(~something))
#'user/back-quote-test

user> (macroexpand '(back-quote-test))
(clojure.core/seq (clojure.core/concat (clojure.core/list user/something)))

If you compare this to the previous version of back-quote-test and the expansion it generated, you’ll notice that user/something is no longer wrapped in a quote form. This is again as expected, because we’re unquoting it using the ~ reader macro. This explains why the nested back-quote form of the make-synonym macro expands with user/old-name as it does. Again, we’ll need to fix this problem because we don’t want the symbol old-name but the argument passed in.

Finally, in order to see what’s going on with the unquote splicing and the stuff symbol, let’s look at the following simpler example:

user> (defmacro back-quote-test []
        ``(~@something))
#'user/back-quote-test

user> (macroexpand '(back-quote-test))
(clojure.core/seq (clojure.core/concat user/something))

If you now compare this version of the expansion with the previous one, you’ll note that user/something is no longer wrapped in a call to list. This is in line with the expected behavior of unquote splicing, in that it doesn’t add an extra set of parentheses.
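The same distinction is easier to see with a single back quote. Here’s a minimal sketch; the names xs, unquoted, and spliced are made up for this illustration:

```clojure
;; xs, unquoted, and spliced are illustrative names, not part of make-synonym.
(def xs [1 2 3])

;; Unquote inserts the value of xs as a single element.
(def unquoted `(+ ~xs))

;; Unquote splicing inserts the elements of xs without wrapping them.
(def spliced `(+ ~@xs))

(println unquoted) ; (clojure.core/+ [1 2 3])
(println spliced)  ; (clojure.core/+ 1 2 3)
```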

At this point, we’ve walked through the complete expansion of our make-synonym macro. The only problem is that it still doesn’t do what we intended it to do. The two problems we identified were that both stuff and old-name weren’t being expanded correctly. Let’s fix stuff first. Consider the following change to make-synonym:

(defmacro make-synonym [new-name old-name]
  `(defmacro ~new-name [& ~'stuff]
     `(~old-name ~@~'stuff)))

Here’s the expansion:

user> (macroexpand-1 '(make-synonym b binding))
(clojure.core/defmacro b [& stuff]
  (clojure.core/seq (clojure.core/concat
                          (clojure.core/list user/old-name) stuff)))
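The ~' idiom can be tried on its own at the REPL. The following sketch, with illustrative var names, compares a symbol that the back quote qualifies with one that is quoted and then unquoted:

```clojure
;; qualified and unqualified are illustrative names for this sketch.

;; A bare symbol inside a back quote is qualified with the current namespace.
(def qualified `stuff)

;; Unquoting a quoted symbol inserts it exactly as written.
(def unqualified `~'stuff)

(println (namespace qualified))   ; the current namespace, such as user
(println (namespace unqualified)) ; nil
```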

Finally, we’ll fix the issue with user/old-name:

(defmacro make-synonym [new-name old-name]
  `(defmacro ~new-name [& ~'stuff]
     `(~'~old-name ~@~'stuff)))

And here’s the expansion:

user> (macroexpand-1 '(make-synonym b binding))
(clojure.core/defmacro b [& stuff]
  (clojure.core/seq (clojure.core/concat
    (clojure.core/list (quote binding)) stuff)))

To check to see if this is what we expect, let’s compare it with our original template:

(defmacro b [& stuff]
  `(binding ~@stuff))

This is indeed what we set out to do, and you can test it as follows:

user> (declare x y)
#'user/y

user> (make-synonym b binding)
#'user/b

user> (b [x 10 y 20] (println "X,Y:" x y))
X,Y: 10 20
nil

Phew, we’re finished. That was a lot of calisthenics for three lines of code. We’ll wrap up this section with why we even bothered with this somewhat esoteric code.
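As a further check, the finished macro should work for other targets as well. Here’s a small sketch, assuming the final make-synonym definition from above and a made-up synonym name, provided, for Clojure’s when macro:

```clojure
;; Final version of make-synonym, repeated so the sketch is self-contained.
(defmacro make-synonym [new-name old-name]
  `(defmacro ~new-name [& ~'stuff]
     `(~'~old-name ~@~'stuff)))

;; provided is a hypothetical name chosen for this example.
(make-synonym provided when)

(println (provided (pos? 1) :yes))  ; :yes
(println (provided (neg? 1) :yes))  ; nil
```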

Why macro-generating macros

There are at least two reasons why it’s useful to know how to write macros that generate macros. The first is the same reason you’d write any other kind of macro: to create abstractions that remove the duplication that arises from patterns in the code. This is important when these duplications are structural and are difficult to eliminate without some form of code generation. Clojure macros are an excellent tool to do this job, because they give the programmer the full power of Clojure to do it. The fact that code generation is a language-level feature does pull its weight.

Having said this, although writing macros is common in Clojure programs, it’s rare for a macro to generate another macro. You’ll probably do it only a handful of times in your career. But on the few occasions when you need to abstract patterns out of macros themselves, writing macros that generate macros, combined with the other techniques you’ve seen, such as moving computation to compile time and intentional symbol capture, can lead to a solution that would be difficult to reach otherwise.

The second reason, and the more commonly useful one, for knowing this concept is to drive home the process of macro expansion, quoting, and unquoting. If you can understand and write macros that generate macros, then you’ll have no trouble writing simpler ones.

With these topics about macro writing out of the way, we’re ready to move on to a couple of examples. In the next section, we’ll look at using macros to create domain-specific languages (DSLs).

15.2. Domain-specific languages

We’re now going to look at explicitly doing something we’ve been doing implicitly so far. In several chapters, we’ve written macros that appear to add features to the Clojure language itself. An example is def-worker, which allowed us to create functions that can run on multiple worker machines in a cluster. We also created a simple object system with most of the semantics of regular object-oriented languages. We created def-modus-operandi, which allowed multimethods to be used in a manner similar to Clojure protocols. We won’t list all the other examples here, because it should be clear that macros have helped us in presenting our abstractions as a convenient feature of the language.

In this section, we’re going to further explore the idea of wrapping our abstractions in a layer of language. Taking this idea to its logical end brings us to the concept of metalinguistic abstraction—the approach of creating a domain-specific language that’s then used to solve the problem at hand. It allows us to solve not only the problem we started out with but a whole class of problems in that domain. It leaves us with a system that’s highly flexible and maintainable, while staying small and easier to understand and debug. Let’s begin by examining the design philosophy that leads to such systems.

15.2.1. DSL-driven design

When given the requirements of a software program, the first step usually involves thinking about what approach to take. This might end with a big design session that produces a detailed breakdown of the various components and pieces that will compose the final solution. This often goes hand in hand with the traditional top-down decomposition technique of taking something large and complex and breaking it into pieces that are smaller, independent, and easier to understand.

By itself, this approach often doesn’t work particularly well. This is because the requirements for most systems are never specified perfectly, which causes the system to be redesigned in ways big and small. Many times, the requirements explicitly change over time as the reality of the business itself changes. This is why most agile teams prefer an evolutionary design, one that arises from incrementally building the system to satisfy more and more of the requirements over time.

When such an approach is desirable (and few systems can do without it these days), it makes sense to think not only in a top-down manner but also in a bottom-up way. Decomposing a problem in a bottom-up manner is different from the top-down version. With the bottom-up approach, you create small abstractions on top of the core programming language to handle tiny elements of the problem domain. These domain-specific primitives are created without explicit thought to exactly how they’ll eventually be used to solve the original problem. Indeed, at this stage, the idea is to create primitives that model all the low-level details of the problem domain.

The other area of focus is combinability. The various domain primitives should be combinable into more-complex entities as desired. This can be done using either the combinability features of the programming language itself (for instance, Clojure’s functions) or by creating new domain-specific constructs on top of existing ones. Macros can help with such extensions, because they can manipulate code forms with ease.

Functional programming aids in the pursuit of such a design. In addition to recursive and conditional constructs, being able to treat functions as first-class objects allows higher levels of complexity and abstraction to be managed in a more natural manner. Being able to create lexical closures adds another powerful piece to our toolset. When higher-order functions, closures, and macros are used together, the domain primitives can be combined to solve more than the original problem specified in the requirements document. They can solve a whole class of problems in that domain, because what gets created at the end of such a bottom-up process is a rich set of primitives, operators, and forms for combination that closely models the business domain itself.
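Here’s a small sketch of this idea, using a made-up pricing domain. The names price-above, in-category, and premium-book? are hypothetical; each primitive is a closure, and every-pred, a higher-order function from clojure.core, combines them:

```clojure
;; Domain primitives: each returns a closure over its argument.
(defn price-above [limit]
  (fn [item] (> (:price item) limit)))

(defn in-category [category]
  (fn [item] (= category (:category item))))

;; A more complex domain concept built by combining primitives.
(def premium-book?
  (every-pred (in-category :book) (price-above 50)))

(println (premium-book? {:category :book :price 80}))  ; true
(println (premium-book? {:category :book :price 20}))  ; false
```

Neither primitive knows how it will be combined, which is what lets new domain concepts be added later without touching the existing ones.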

The final layers of such a system consist of two pieces. The topmost is literally the respecification of the requirements in an executable, domain-specific language. This is metalinguistic abstraction, manifested in the fact that the final piece of the system that seems to solve the problem is written not in a general-purpose programming language but in a language that has been grown organically from a lower-level programming language. It’s often understandable by nonprogrammers and indeed is sometimes suitable for them to use directly. The next piece is a sort of runtime adapter, which either executes the domain-specific code by interpreting it or by compiling it down to the language’s own primitives. An example may be a set of macros that translate the syntactically friendly code into other forms, and code that sets up the right evaluation context for it. Figure 15.2 shows a block diagram of the various layers described.

Figure 15.2. The typical layers in a DSL-driven system are shown here. Such systems benefit from a bottom-up design where the lowest levels are the primitive concepts of the domain modeled on top of the basic Clojure language. Higher layers are compositions of these primitives into more complex domain concepts. Finally, a runtime layer sits on top of these, which can execute code specified in a domain-specific language. This final layer often represents the core solution of the problem that the software was meant to solve.

It’s useful to point out that a domain-specific language isn’t about using macros, even though they’re often a big part of the final linguistics. Macros help with fluency of the language, especially as used by the end users but also at lower levels to help create the abstractions themselves. In this way, they’re no different from other available features of the language such as higher-order functions and conditionals. The point to remember is that the core of the DSL approach is the resulting bottom-up design and the set of easily combinable domain primitives.

In the next section, we’ll explore the creation of a simple domain-specific language.

15.2.2. User classification

Most websites today personalize the experience for individual users. Many go beyond simple preferences and use the users’ own usage statistics to improve their experience. Amazon, for example, does a great job of this by showing users things they might like to buy based on their own purchase history and browsing patterns. Other web services use similarly collected usage statistics to show more relevant ads to users as they browse. In this section, we’ll explore this business domain.

The goal here is to use data about the user to do something special for them. It could be showing ads or making the site more specific to their tastes. The first step in any such task is to know what kind of user it is. Usually, the system can recognize several classes of users and is able to personalize the experience for each class in some way. The user has to be classified into the known segments before anything can be done. The business folks would like to be able to change the specification of the various segments as they’re discovered, so the system shouldn’t hardcode this aspect. Further, they’d like to make such changes quickly, potentially without requiring development effort and without requiring a restart of the system after making such changes. In an ideal world, they’d even like to specify the segment descriptions in a nice little GUI application.

This example is well suited to our earlier discussion, but aspects of this apply to most nontrivial systems being built today. For this example in particular, we’ll build a domain-specific language to specify the rules that classify users into various segments. To get started, we’ll describe the lay of the land, which in our case will be a small part of the overall system design, as well as a few functions available to find information about our users.

The data element

We’ll model a few primitive domain-specific data elements. We’ll focus our example on things that can be gleaned from the data that users’ browsers send to the server along with every request. There’s nothing to stop you from extending this approach to things that are looked up from elsewhere, such as a database of the users’ past behavior, or indeed anything else, such as stock quotes or the weather in Hawaii. We’ll model the session data as a simple Clojure map containing the data elements we care about, and we’ll store it in Redis. We’ll not focus on how we create the session map, because this example isn’t about parsing strings or loading data from various data stores.

Here’s an example of a user session:

{:consumer-id "abc"
 :url-referrer "http://www.google.com/search?q=clojure+programmers"
 :search-terms ["clojure" "programmers"]
 :ip-address "192.168.0.10"
 :tz-offset 420
 :user-agent :safari}

Again, sessions can contain a lot more than what comes in via the web request. You can imagine loads of precomputed information being stored in such a session to enable more useful targeting as well as a caching technique so that things don’t have to be loaded or computed more than once in a user’s session.

User session persistence

We’ll need a key to store such sessions in Redis, and for this example :consumer-id will serve us well. We’ll add a level of indirection so the code will read better as well as let us change this decision later if we desire:

(def redis-key-for :consumer-id)

Let’s now define a way to save sessions into Redis and also to load them back out. Here’s a pair of functions that do that:

(defn save-session [session]
  (redis/set (redis-key-for session) (pr-str session)))

(defn find-session [consumer-id]
  (read-string (redis/get consumer-id)))
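The round trip relies on pr-str producing text that the reader can turn back into the same map. Here’s a sketch of that idea without Redis, assuming an in-memory atom as a stand-in store; the names store, save-session-in-memory, and find-session-in-memory are hypothetical. It reads with clojure.edn/read-string, which, unlike clojure.core/read-string, doesn’t evaluate reader-eval forms found in the stored text:

```clojure
(require '[clojure.edn :as edn])

;; An atom stands in for Redis in this sketch.
(def store (atom {}))

(defn save-session-in-memory [session]
  (swap! store assoc (:consumer-id session) (pr-str session))
  "OK")

(defn find-session-in-memory [consumer-id]
  ;; some-> returns nil for an unknown id instead of trying to parse nil.
  (some-> (get @store consumer-id) edn/read-string))

(def example-session
  {:consumer-id "abc"
   :search-terms ["clojure" "programmers"]
   :user-agent :safari})

(save-session-in-memory example-session)
(println (= example-session (find-session-in-memory "abc")))  ; true
```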

Now that we have the essential capability of storing and loading sessions, we have a design decision to make. If we consider the user session to be the central concept in our behavioral targeting domain, then we can write it such that the DSL always executes in context of a session. We could define a var called *session* that we’ll then bind to the specific one during a computation:

(declare *session*)

And we could define a convenience macro that sets up the binding:

(defmacro in-session [consumer-id & body]
  `(binding [*session* (find-session ~consumer-id)]
     (do ~@body)))
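Here’s a self-contained sketch of how in-session behaves, with a stub in place of the Redis-backed find-session. Note one assumption: on Clojure 1.3 and later, a var must be marked ^:dynamic before binding can rebind it, so the sketch defines *session* that way rather than with declare:

```clojure
;; On Clojure 1.3+, binding only works on vars marked ^:dynamic.
(def ^:dynamic *session*)

;; Stub standing in for the Redis lookup (an assumption for this sketch).
(defn find-session [consumer-id]
  {:consumer-id consumer-id :user-agent :safari})

;; binding evaluates its body in order, so no explicit do is used here.
(defmacro in-session [consumer-id & body]
  `(binding [*session* (find-session ~consumer-id)]
     ~@body))

(println (in-session "abc" (:user-agent *session*)))  ; :safari
```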

The following listing shows the complete session namespace that we’ve defined so far.

Listing 15.2. Basic functions to handle session persistence in Redis
(ns chapter-macros.session
  (:require redis))

(def redis-key-for :consumer-id)

(declare *session*)

(defn save-session [session]
  (redis/set (redis-key-for session) (pr-str session)))

(defn find-session [consumer-id]
  (read-string (redis/get consumer-id)))

(defmacro in-session [consumer-id & body]
  `(binding [*session* (find-session ~consumer-id)]
     (do ~@body)))

Now that we’ve dealt with persisting user sessions, we’ll focus on the segmentation itself.

Segmenting users

We’re going to now talk about the process of describing user segmentation. In our application, we’d like to satisfy two qualitative requirements of this segmentation process. The first is that these rules shouldn’t be hardcoded into our application and that it should be possible to dynamically update the rules. The second is that these rules should be expressed in a format that’s somewhat analyst friendly. It should be in a domain-specific language that’s somewhat simpler for nonprogrammers to express ideas in. Here’s an example of something we might allow:

(defsegment googling-clojurians
     (and
       (> (count $search-terms) 0)
       (matches? $url-referrer "google")))

Here’s another example of the desired language:

(defsegment loyal-safari
     (and
       (empty? $url-referrer)
       (= :safari $user-agent)))

Notice the symbols prefixed with $. These are meant to have special significance in our DSL, because they’re the elements that will be looked up and substituted from the user’s session. Our job now is to implement defsegment so that the previous definition is compiled into something meaningful.

 

Syntax of Clojure DSLs

In many programming languages, especially dynamic ones such as Ruby and Python, domain-specific languages have become all the rage. There are two kinds of DSLs: internal and external. Internal DSLs are hosted on top of a language such as Ruby and use the underlying language to execute the DSL code. External DSLs are limited forms of regular programming languages in the sense that they have a lexer and parser that convert DSL code that conforms to a grammar into executable code. Internal DSLs are often simpler and serve most requirements that a DSL might need to satisfy.

Such DSLs are often focused on providing English-like readability, and a lot of text-parsing code is dedicated to converting the easy-to-read text into constructs of the underlying language. Clojure, on the other hand, has its magical reader. It can read an entire character stream and convert it into a form that can be executed. The programmer doesn’t have to do anything to support the lexical analysis, tokenizing, and parsing. Clojure even provides a macro system to further enhance the capabilities of textual expression.

This is the reason why many Clojure DSLs look much like Clojure. Clojure DSLs are often based on s-expressions because using the reader to do the heavy lifting of creating a little language is the most straightforward thing to do. The book DSLs in Action by Debasish Ghosh (Manning Publications) is a great resource if you’re interested in DSLs in a variety of languages.

 

We can start with a macro skeleton that looks like this:

(defmacro defsegment [segment-name & body])

Let’s begin by handling the $ prefixes. We’ll transform the body expressions such that every symbol prefixed with $ becomes a session lookup of the attribute with the same name. Something like $user-agent will become (*session* :user-agent), which works because a Clojure map can be called as a function of its key. To perform this transformation, we’ll need to recursively walk the body expression to find all the symbols that need this substitution and then rebuild a new expression with the substitutions made. Luckily, we don’t have to write this code because it exists in the clojure.walk namespace. The postwalk function fits the bill:

user> (doc postwalk)
-------------------------
clojure.walk/postwalk
([f form])
  Performs a depth-first, post-order traversal of form.  Calls f on
  each sub-form, uses f's return value in place of the original.
  Recognizes all Clojure data structures except sorted-map-by.
  Consumes seqs as with doall.
nil
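As a quick illustration of how postwalk rebuilds a form, here’s a sketch that increments every number in a nested expression and leaves everything else alone. The function name bump-numbers is made up for this example:

```clojure
(require '[clojure.walk :refer [postwalk]])

;; Called on every sub-form; returns the replacement for that sub-form.
(defn bump-numbers [form]
  (if (number? form) (inc form) form))

(println (postwalk bump-numbers '(+ 1 (* 2 3))))  ; (+ 2 (* 3 4))
```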

This is what we need, so we can transform our DSL code using the following function:

(defn transform-lookups [dollar-attribute]
  (let [prefixed-string (str dollar-attribute)]
    (if-not (.startsWith prefixed-string "$")
      dollar-attribute
      (session-lookup prefixed-string))))

We’ll need a couple of support functions, namely, session-lookup and drop-first-char, which can be implemented as follows:

(defn drop-first-char [name]
  (apply str (rest name)))

(defn session-lookup [dollar-name]
  (->> (drop-first-char dollar-name)
       (keyword)
       (list '*session*)))

Let’s test that the code we wrote does what’s expected:

user> (transform-lookups '$user-agent)
(*session* :user-agent)

This is a simple test, but note that the resulting form can be used to look up attributes of a user session if the *session* special var is bound appropriately.

Now, let’s use postwalk to test our replacement logic on a slightly more complex form:

user> (postwalk transform-lookups '(> (count $search-terms) 0))
(> (count (*session* :search-terms)) 0)

That works as expected. We now have a tool to transform the DSL body expressed using our $-prefixed symbols into usable Clojure code. As an aside, we also have a place where we can make more complex replacements if we need to.
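One thing to be aware of is that transform-lookups converts every sub-form to a string, so a string literal that begins with $ would also be rewritten. A possible variant, sketched here under the hypothetical name transform-symbol-lookups, checks for symbols first:

```clojure
(require '[clojure.walk :refer [postwalk]])

;; Hypothetical variant: only symbols whose name starts with $ are rewritten.
(defn transform-symbol-lookups [form]
  (if (and (symbol? form) (.startsWith (name form) "$"))
    (list '*session* (keyword (subs (name form) 1)))
    form))

(prn (postwalk transform-symbol-lookups
               '(matches? $url-referrer "$5 off")))
;; (matches? (*session* :url-referrer) "$5 off")
```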

We can now use this in our definition of defsegment as follows:

(defmacro defsegment [segment-name & body]
  (let [transformed (postwalk transform-lookups body)]))

We’ve now transformed the body as specified by the user of our DSL, and we now need to convert it into something we can execute later. Let’s look at what we’re working with:

user> (postwalk transform-lookups '(and
                                     (> (count $search-terms) 0)
                                     (= :safari $user-agent)))
(and
  (> (count (*session* :search-terms)) 0)
  (= :safari (*session* :user-agent)))

The simplest way to execute this later is to convert it into a function. You can then call the function whenever you need to run this rule. We used a similar approach when we defined our remote worker framework, where we stored computations as anonymous functions that were executed on remote servers. If we’re going to do this, we’ll need a place to put the functions. We’ll create a new namespace to keep all code related to this storing of functions for later use. It’s shown in the following listing.

Listing 15.3. The dsl-store namespace for storing the rules as anonymous functions
(ns chapter-macros.dsl-store)

(def RULES (ref {}))

(defn register-segment [segment-name segment-fn]
  (dosync
   (alter RULES assoc-in [:segments segment-name] segment-fn)))

(defn segment-named [segment-name]
  (get-in @RULES [:segments segment-name]))

(defn all-segments []
  (:segments @RULES))

Now that you know you can put functions where you can find them again later, we’re ready to improve our definition of defsegment:

(defmacro defsegment [segment-name & body]
  (let [transformed (postwalk transform-lookups body)]
    `(let [segment-fn#  (fn [] ~@transformed)]
       (register-segment ~(keyword segment-name) segment-fn#))))

We now have all the pieces together for our DSL to compile. The next listing shows the complete segment namespace.

Listing 15.4. The segmentation DSL defined using a simple macro
(ns chapter-macros.segment
  (:use chapter-macros.dsl-store
        clojure.walk))

(defn drop-first-char [name]
  (apply str (rest name)))

(defn session-lookup [dollar-name]
  (->> (drop-first-char dollar-name)
       (keyword)
       (list '*session*)))

(defn transform-lookups [dollar-attribute]
  (let [prefixed-string (str dollar-attribute)]
    (if-not (.startsWith prefixed-string "$")
      dollar-attribute
      (session-lookup prefixed-string))))

(defmacro defsegment [segment-name & body]
  (let [transformed (postwalk transform-lookups body)]
    `(let [segment-fn#  (fn [] ~@transformed)]
       (register-segment ~(keyword segment-name) segment-fn#))))

Here it is in action, at the REPL:

user> (defsegment loyal-safari
        (and
          (empty? $url-referrer)
          (= :safari $user-agent)))
{:segments
  {:loyal-safari
    #<user$eval3457$segment_fn__3232__auto____3458
     user$eval3457$segment_fn__3232__auto____3458@5054c2b8>}}
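To see a stored segment function evaluated against a session, here’s a condensed, self-contained sketch of the pieces so far. It makes a few assumptions: everything lives in one namespace, *session* is marked ^:dynamic (required for binding on Clojure 1.3 and later), the two helper functions are folded into transform-lookups, and the session is a literal map rather than one loaded from Redis:

```clojure
(require '[clojure.walk :refer [postwalk]])

(def ^:dynamic *session* nil)
(def RULES (ref {}))

(defn register-segment [segment-name segment-fn]
  (dosync
   (alter RULES assoc-in [:segments segment-name] segment-fn)))

;; Condensed form of transform-lookups and its helpers.
(defn transform-lookups [form]
  (let [s (str form)]
    (if (.startsWith s "$")
      (list '*session* (keyword (subs s 1)))
      form)))

(defmacro defsegment [segment-name & body]
  (let [transformed (postwalk transform-lookups body)]
    `(let [segment-fn# (fn [] ~@transformed)]
       (register-segment ~(keyword segment-name) segment-fn#))))

(defsegment loyal-safari
  (and
   (empty? $url-referrer)
   (= :safari $user-agent)))

;; Look the function up by name and call it with a session bound.
(defn in-segment? [segment-name session]
  (binding [*session* session]
    ((get-in @RULES [:segments segment-name]))))

(println (in-segment? :loyal-safari {:user-agent :safari}))  ; true
(println (in-segment? :loyal-safari {:user-agent :chrome}))  ; false
```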

Our definition of googling-clojurians still won’t work, because it will complain about an unknown matches? function. We’re going to solve this and add more functionality in the next couple of sections.

The power of the DSL

So far, we’ve put together the plumbing of the DSL. You can define some DSL code and expect it to compile and some functions to be created and stored as a result. At least three things influence how powerful our DSL can be.

The first, obviously, is the data inside a user’s session. Entities such as $url-referrer and $search-terms are examples of this. These data elements are obtained either directly from the web session of the user, from historical data about the user, or from any other source that has been used to load information into the user’s session.

The second factor is the number of primitives that can be used to manipulate the data elements. Examples of such primitives are empty? and count. We’ve leveraged Clojure’s own functions here, but there’s nothing to stop you from adding more. The function matches? that we’ll add shortly is an example of such an addition.

The final factor is combinability, which is to say how the data elements and the language primitives can be combined to create more complex forms. Here again you can use all of Clojure’s built-in facilities. For example, in our previous examples, we used and and >.

In the next section, we’ll focus on creating new primitives, and then we’ll write code to execute the DSL.

Adding primitives to the execution engine

As you can imagine, matches? is a function. For the purposes of our example here, it can be as simple as this:

(defn matches? [^String superset ^String subset]
  (and
   (not (empty? superset))
   ;; indexOf returns -1 when there is no match; 0 is a match at the start
   (>= (.indexOf superset subset) 0)))

You can add more functions such as this one, and they can be as complex as needed. The user of the DSL doesn’t need to know how they’re implemented, because they’ll be described as the primitives of the domain-specific language.
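On Clojure 1.8 and later, a similar check could also be sketched with clojure.string/includes?, which reports a match at any position, including the very start of the string. The name contains-text? is hypothetical:

```clojure
(require '[clojure.string :as str])

;; Hypothetical primitive: true when subset occurs anywhere in superset.
(defn contains-text? [superset subset]
  (boolean (and (seq superset)
                (str/includes? superset subset))))

(println (contains-text? "http://www.google.com/search" "google"))  ; true
(println (contains-text? "google.com" "google"))                    ; true
(println (contains-text? nil "google"))                             ; false
```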

Now, let’s go ahead and define the remainder of the execution engine. The first piece is a function to load the DSL program. Typically, this will be some text either written by a user or generated by another program such as a graphical rules editor. Given that ultimately the DSL is Clojure code, you can use load-string to load it. Consider the following code:

(ns chapter-macros.engine
  (:use chapter-macros.segment
        chapter-macros.session
        chapter-macros.dsl-store))

(defn load-code [code-string]
  (binding [*ns* (:ns (meta #'load-code))]
    (load-string code-string)))

Note that the load-code function first switches the namespace to its own, because all supporting functions are available in it. This way, load-code can be called from anywhere, and all supporting functions can be found. It then calls load-string.
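Here’s a sketch of why the namespace switch matters, using two made-up namespaces, sketch.engine and sketch.caller. It assumes the namespace is read from the var’s metadata with #'load-code, which is where defn records :ns:

```clojure
(ns sketch.engine)

;; A supporting function that only this namespace can see unqualified.
(defn greeting [] "hello from the engine namespace")

(defn load-code [code-string]
  ;; Evaluate the string as if it had been typed in sketch.engine.
  (binding [*ns* (:ns (meta #'load-code))]
    (load-string code-string)))

;; Switch to a different namespace, where greeting isn't visible.
(ns sketch.caller)

(println (sketch.engine/load-code "(greeting)"))
;; hello from the engine namespace
```

Without the binding, load-string would try to resolve greeting in sketch.caller and fail.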

Our next step is to execute a segment function and to see if it returns true or false. A true value means that the user belongs to that segment. The following function checks this:

(defn segment-satisfied? [[segment-name segment-fn]]
  (if (segment-fn)
    segment-name))

You now have all the pieces to take a bunch of segment definitions and classify a user into one or more of them (or none of them). Consider the classify function:

(defn classify []
  (->> (all-segments)
       (map segment-satisfied?)
       (remove nil?)))
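The map followed by remove nil? pipeline can also be written with keep, which drops nil results in one step. Here’s a sketch of this alternative, with the segments passed in as an argument rather than read from the store (an assumption made to keep the example self-contained):

```clojure
(defn segment-satisfied? [[segment-name segment-fn]]
  (when (segment-fn)
    segment-name))

;; Hypothetical variant of classify that takes the segment map explicitly.
(defn classify-with [segments]
  (keep segment-satisfied? segments))

(def example-segments
  {:always-in (fn [] true)
   :never-in  (fn [] false)})

(println (classify-with example-segments))  ; (:always-in)
```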

The complete source of our engine namespace is shown in the following listing.

Listing 15.5. The simple DSL execution engine to classify users into segments
(ns chapter-macros.engine
  (:use chapter-macros.segment
        chapter-macros.session
        chapter-macros.dsl-store))

(defn load-code [code-string]
  (binding [*ns* (:ns (meta #'load-code))]
    (load-string code-string)))

(defn matches? [^String superset ^String subset]
  (and
   (not (empty? superset))
   ;; indexOf returns -1 when there is no match; 0 is a match at the start
   (>= (.indexOf superset subset) 0)))

(defn segment-satisfied? [[segment-name segment-fn]]
  (if (segment-fn)
    segment-name))

(defn classify []
  (->> (all-segments)
       (map segment-satisfied?)
       (remove nil?)))

Let’s test it at the REPL. We’ll begin by creating a string that contains our definitions of the two segments in our new DSL:

user> (def dsl-code (str
  '(defsegment googling-clojurians
      (and
       (> (count $search-terms) 0)
       (matches? $url-referrer "google")))

  '(defsegment loyal-safari
      (and
       (empty? $url-referrer)
       (= :safari $user-agent)))))
#'user/dsl-code

Next, we’ll bring in our little DSL engine:

user> (use 'chapter-macros.engine)
nil

It’s now easy to load up the segment definitions:

user> (load-code dsl-code)
{:segments {:loyal-safari #<engine$eval3399$segment_fn__2833_
TRUNCATED OUTPUT

In order to test classification, we’re going to need a user session and Redis running. We can set up a session for testing purposes by defining one at the REPL as follows:

user> (def abc-session {
    :consumer-id "abc"
    :url-referrer "http://www.google.com/search?q=clojure+programmers"
    :search-terms ["clojure" "programmers"]
    :ip-address "192.168.0.10"
    :tz-offset 480
    :user-agent :safari})
#'user/abc-session

And let’s put it into Redis:

user> (require 'redis) (use 'chapter-macros.session)
nil

user> (redis/with-server {:host "localhost"}
        (save-session abc-session))
"OK"

Everything is set up now, and we can test segmentation:

user> (redis/with-server {:host "localhost"}
        (in-session "abc"
          (println "The current user is in:" (classify))))
The current user is in: (:googling-clojurians)
nil

It works as expected. Note that the classify function returns a lazy sequence that’s realized by the call to println. If you were to omit that, you’d need a doall to see it at the REPL; otherwise, it will complain about the *session* var not being bound.
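The interaction between lazy sequences and binding can be sketched in isolation. Two assumptions: *session* is declared ^:dynamic with a root value of nil, so that a late lookup yields nil instead of an unbound-var error, and lookup-agents is a made-up function:

```clojure
(def ^:dynamic *session* nil)

;; Returns a lazy sequence; *session* is read only when it's realized.
(defn lookup-agents []
  (map (fn [k] (get *session* k)) [:user-agent]))

;; Realized after the binding has ended, so it sees the root value.
(def lazy-result
  (binding [*session* {:user-agent :safari}]
    (lookup-agents)))

;; doall forces realization while the binding is still in effect.
(def eager-result
  (binding [*session* {:user-agent :safari}]
    (doall (lookup-agents))))

(prn lazy-result)   ; (nil)
(prn eager-result)  ; (:safari)
```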

With this, we have the basics working end to end. Expanding the DSL is as easy as adding new data elements and new primitives such as the matches? function. We can also expand the $attribute syntax by doing more in the postwalk transformation. Before addressing updating rules, we’ll add a way to name the abstractions we’re defining and allow for segments to be reused.

Increasing combinability

Imagine that you’d like to narrow the scope of the googling-clojurians crowd. You’d like to know which of these folks are also using the Chrome browser. You could create a segment as follows:

(defsegment googling-clojurians-chrome
     (and
      (> (count $search-terms) 0)
      (matches? $url-referrer "google")
      (= :chrome $user-agent)))

This will work fine, but it has the obvious problem that two out of the three conditions are duplicated in the googling-clojurians segment. In a normal programming language, creating a named entity and replacing the duplicate code in both places with that entity can remove such duplication. For example, you could create a Clojure function and call it from both places.

If you do that, you’ll expose the lower-level details of the implementation of our DSL to the eventual users of the DSL. It would be ideal if you could hide that detail while letting them use named entities. Consider this revised implementation of defsegment:

(defmacro defsegment [segment-name & body]
  (let [transformed (postwalk transform-lookups body)]
    `(let [segment-fn#  (fn [] ~@transformed)]
       (register-segment ~(keyword segment-name) segment-fn#)
       (def ~segment-name segment-fn#))))

The change we made does what we talked about doing by hand. The definition of a segment now also creates a var by the same name. It can be used as follows:

(defsegment googling-clojurians-chrome
     (and
      (googling-clojurians)
      (= :chrome $user-agent)))

This is equivalent in functionality to the previous definition of this segment, with the duplication removed. This is an example of increasing the combinability of domain-specific entities, where segment definitions are built on top of the lower-level session-lookup primitives, combined with built-in logical operators. Note that because all our DSL code is evaluated in a single namespace, every segment name shares that namespace with the engine’s own functions. This could cause name conflicts, which may need to be addressed, depending on the requirements.

Another example of a language-level construct is in-session, which, given a consumer ID, sets up the execution context for classification. It abstracts away the details of where the session is stored and how to access and load it.

Although this is a small example, we’ve explored several of the concepts we talked about in the opening discussion. The last step will be to look at how the DSL can be updated dynamically.

Dynamic updates

With the DSL, we’ve exposed a linguistic layer to the code that follows. We also said we would like to add dynamic updates to the rules. You’ve already seen that, but we didn’t focus on it. Consider again a definition such as this:

(defsegment googling-clojurians
    (and
     (> (count $search-terms) 0)
     (matches? $url-referrer "yahoo")))

You know that evaluating this code will change the definition of the segment known as googling-clojurians (not to mention that it’s named incorrectly, because Yahoo search is being used). But the following code has the same effect:

(load-code (str '(defsegment googling-clojurians
                   (and
                    (> (count $search-terms) 0)
                    (matches? $url-referrer "yahoo")))))

The point to note, if not already obvious, is that load-code accepts a string. This DSL code snippet can be created anywhere, even from outside our execution engine. It could be created, say, from a text editor and loaded in via a web service.
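As a rough illustration of why a string is enough, the following sketch evaluates a code string inside a dedicated namespace. It is not the chapter's load-code: the names dsl-ns and load-dsl-string are hypothetical, and a plain def stands in for a real segment definition so that the example is self-contained.

```clojure
;; Hypothetical namespace that receives everything the loaded code defines.
(def dsl-ns (create-ns 'dsl.sandbox))

;; A namespace made with create-ns has no clojure.core refers yet,
;; so add them before evaluating anything in it.
(binding [*ns* dsl-ns]
  (refer-clojure))

(defn load-dsl-string
  "Reads and evaluates every form in code-string inside dsl-ns."
  [code-string]
  (binding [*ns* dsl-ns]
    (load-string code-string)))

;; str turns the quoted form into its printed text, which is what
;; an editor or a web service would send over the wire.
(load-dsl-string (str '(def greeting (str "hello, " "dsl"))))

;; The definition landed in the sandbox namespace.
@(ns-resolve dsl-ns 'greeting)
```

Evaluating loaded code in its own namespace is also one way to limit the name conflicts mentioned earlier, because definitions arriving as strings can't overwrite vars in the namespace that hosts the engine.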

Let’s take another example by imagining you had a set of remote worker processes that implemented our rule engine to classify users into segments. You can imagine classify being implemented using def-worker. When sent a request, it will access a commonly available Redis server, find the specified user session, and classify the user into segments. This is no different from what you’ve seen earlier, except for the fact that this code would run on multiple remote servers.

Now, let’s imagine load-code also being implemented as a def-worker. In this scenario, not only could you remotely load DSL code, but you could also use run-worker-everywhere to broadcast DSL updates across all remote workers. You’d get the ability to update the segmentation cluster in real time, with no code to deploy. This change requires very little code, thanks to our remote workers framework from the previous chapters, and the implementation is left as an exercise for you.

We’ll end this section with one last point. We haven’t addressed error checking the DSL code so far, and in a production system you’d definitely need to do that. We’ve also built quite a minimal domain-specific language, and you could certainly make it arbitrarily powerful. Being able to use the full Clojure language inside it is a powerful feature that can be used by power users if so desired. As the capability of the DSL itself is expanded to do more than segmentation, the ability to update running code in such a simple way as described previously could prove to be useful.
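As one possible starting point for such error checking, the sketch below inspects a code string before anything is evaluated. The function name check-dsl-string and the specific rules are assumptions made for illustration; a production system would need considerably more than this, and note that read-string looks only at the first form in the string.

```clojure
(defn check-dsl-string
  "Returns {:ok form} when the first form in code-string looks like
  (defsegment name body...), otherwise {:error message}.
  Nothing is evaluated here; the form is only read and inspected."
  [code-string]
  (try
    ;; Disable #= reader evaluation while reading untrusted text.
    (let [form (binding [*read-eval* false]
                 (read-string code-string))]
      (cond
        (not (seq? form))               {:error "expected a list form"}
        (not= 'defsegment (first form)) {:error "only defsegment is allowed"}
        (not (symbol? (second form)))   {:error "segment name must be a symbol"}
        (< (count form) 3)              {:error "segment body is missing"}
        :else                           {:ok form}))
    (catch Exception e
      {:error (str "could not read DSL code: " (.getMessage e))})))

;; A well-formed definition passes the check.
(check-dsl-string "(defsegment loyal-safari (and (> (count $search-terms) 0)))")

;; Arbitrary code and unbalanced input are rejected before evaluation.
(check-dsl-string "(println \"not a segment\")")
(check-dsl-string "(defsegment broken")
```

A check like this restricts only the outermost form, so the body can still use the full Clojure language; whether that is acceptable depends on how much the authors of the DSL code are trusted.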

15.3. Summary

We’re at the end of this chapter and of this book. We left macros for the end because they’re special. When most people start out with the Lisp family of programming languages, they first ask about the odd syntax. The answer to that question is the macro system. In that sense, we’ve come full circle. Macros are special because they make Clojure a programmable programming language. They allow the programmer to mold the core language into one that suits the problem at hand. In this way, Clojure blurs the line between the designers of the language itself and the programmer.

This chapter started with a few advanced uses of Clojure macros. Anaphoric macros aren’t used a lot, and they certainly come with their gotchas, but when applied carefully, they can result in truly elegant solutions. Similarly, moving computation into the compile phase of your program seems like something that isn’t done often. Certainly, the example we looked at gives only a glimpse into what’s possible. It’s an important technique, though, that can be effective when needed. Finally, macros that define other macros threaten to send us down the rabbit hole. Understanding such use of the macro system is the only way to true Lisp mastery.

Lisp encourages a certain style of programming. Everyone seems to be talking about domain-specific languages these days, but in Clojure, it’s the normal way to build programs, and there’s nothing advanced about it at all. It’s my hope that our behavioral targeting DSL example didn’t seem particularly complicated or new. We’ve written similar code throughout this book, be it our mocking framework to help us write tests, our object system utility, our library for distributed stream processing, or our faux protocols library we called modus operandi. Some people express concern about the misuse of macros, but I believe that the real concern should be an incomplete understanding of the Lisp way.

What you’ve seen in this book is only the tip of the iceberg. Lisp, and thus Clojure, makes it possible to build systems that can withstand today’s demanding requirements. It isn’t far-fetched to think that the revival of Lisp will prompt systems that can someday do what we mean. In order to do that, we’ll need more than a few language features or a macro system. We’ll need more than DSLs.

We’ll need a system that can adapt itself to new and changing requirements. Programmers will need to recognize that evaluators are themselves programs, and they can be built like everything else, allowing new kinds of evaluation rules and paradigms. We’ll need programs that watch themselves as they run and modify themselves to improve their output. All this might seem like fantasy, but it’s possible. In the words of Alan Kay, the computer revolution hasn’t even started yet. And paraphrasing him some more, the way to build systems that can do all this is to play it grand. We have to build our systems grander than we think they can be. A language like Clojure gives us the tools to make this happen.
