More Complex Macros

Let’s suppose you need a custom my-length command. This is a classic list-eating function that will count the length of a list. We’ll write it in the proper “tail call optimized” style (discussed in Chapter 14), where the recursive function call is in the tail position. Here’s the code:

(defun my-length (lst)
   (labels ((f (lst acc)
              (if lst
                (f (cdr lst) (1+ acc))
                 acc)))
      (f lst 0)))

As you can see, this function has tons of repetitive stuff, once again giving us that dreaded feeling of déjà vu. There are two repetitive patterns in this function:

  • As in other list-eater functions, we have the annoying check to see if the list is empty and the associated use of cdr .

  • We did all this verbose work to create a local function f .

Let’s write some macros that make this function (and other functions with the same repetition) more pithy.

A Macro for Splitting Lists

First, let’s create a split macro. It will let us write cleaner list-eater functions, such as our my-length function.

List-eaters always check if the list is empty. If it isn’t, they take apart the list using car and/or cdr, and then perform operations on the head and/or tail of the list. The split macro does this for us. Here’s what it looks like when we use the finished split macro:

 > (split '(2 3)
     (format t "This can be split into ˜a and ˜a." head tail)
      (format t "This cannot be split."))
  This can be split into 2 and (3).
 > (split '()
      (format t "This can be split into ˜a and ˜a." head tail)
     (format t "This cannot be split."))
  This cannot be split.

The first argument of the split macro is a list you want to split into a head and a tail . If this is possible, the next expression in the split macro will be called . As a bonus, our split macro automatically creates two variables for us, named head and tail. This way, we don’t always need to call car and cdr inside list-eating functions. If the list is empty , we call the expression at the end .

Let’s look at the code for the split macro. Note that this initial version of the macro contains some bugs we’ll discuss shortly:

;Warning! Contains Bugs!
 (defmacro split (val yes no)
   `(if ,val
      (let ((head (car ,val))
            (tail (cdr ,val)))
        ,yes)
        ,no))

Our split macro requires three (and only three) expressions as arguments . This means when we use this macro, we’ll always need exactly three items.

The code that needs to be generated by split is pretty straightforward. First, we have an if that checks if the list is empty . If it is, we break apart the list and stick it into our two local variables, head and tail . Then we put in the code that handles the “yes, we can split the list” case . If we can’t split the list, we call the no case . Note that in the no case, we don’t have access to the head/tail variables, since they aren’t created if the list can’t be split.

With this new split macro, we can clean up our my-length macro a bit:

(defun my-length (lst)
    (labels ((f (lst acc)
               (split lst
                (f tail (1+ acc))
                 acc)))
      (f lst 0)))

Notice how we now make use of the tail variable created by split, simplifying our code . Macros that automatically generate variables like this are called anaphoric macros.

However, we are not yet finished with our split macro. Although it basically works, it contains some subtle bugs that we need to address.

Avoiding Repeated Execution in Macros

One common bug that can happen in a macro is incorrect repeated execution of code. In fact, our current version of the split macro contains this flaw. Here is an example that clearly shows the problem:

> (split (progn (princ "Lisp rocks!")
                  '(2 3))
      (format t "This can be split into ˜a and ˜a." head tail)
      (format t "This cannot be split."))
  Lisp rocks!Lisp rocks!Lisp rocks!This can be split into 2 and (3).

In this use of split, the statement “Lisp rocks!” was printed three times, even though it appears only once in the original code. How is this possible?

Remember that the arguments passed into a macro consist of raw source code. This means the val argument passed into split contains the raw code of the progn statement , including the raw code for the princ statement within it. Since we reference val three times inside the split macro, it causes the princ statement to be executed three times.

We can verify this by running this example through macroexpand:

> (macroexpand '(split (progn (princ "Lisp rocks!")
                                '(2 3))
                       (format t "This can be split into ˜a and ˜a." head tail)
                       (format t "This cannot be split.")))
 (IF (PROGN (PRINC "Lisp rocks!") '(2 3))
   (LET
   ((HEAD (CAR (PROGN (PRINC "Lisp rocks!") '(2 3))))
    (TAIL (CDR (PROGN (PRINC "Lisp rocks!") '(2 3)))))
    (FORMAT T "This can be split into ˜a and ˜a." HEAD TAIL))
   (FORMAT T "This cannot be split.")) ;
  T

As you can see, the princ statement appears three times . This causes unexpected behavior and is inefficient, since we’re repeatedly running the same code unnecessarily.

If you give this problem some thought, the solution isn’t too hard to figure out. We simply need to create a local variable inside the split macro, like this:

;Warning! Still contains a bug!
(defmacro split (val yes no)
  `(let1 x ,val
     (if x
       (let ((head (car x))
             (tail (cdr x)))
         ,yes)
         ,no)))

Note that we made use of let1 in this new version of split. As this shows, it is perfectly okay to use macros inside other macros.

Now if we rerun our previous example, we can see that split behaves correctly, princing the statement only once:

> (split (progn (princ "Lisp rocks!")
                '(2 3))
    (format t "This can be split into ˜a and ˜a." head tail)
    (format t "This cannot be split."))
Lisp rocks!This can be split into 2 and (3).

Unfortunately, this new version of the split macro introduces yet another bug. Let’s tackle this new bug next.

Avoiding Variable Capture

To see the bug in our newest version of split, try running the following:

> (let1 × 100
    (split '(2 3)
      (+ x head)
      nil))
*** - +: (2 3) is not a number

Can you tell what happened? We just created a variable x inside the new version of our split macro! Here’s what the call to split looks like if we macroexpand it:

> (macroexpand '(split '(2 3)
                    (+ x head)
                    nil))
 (LET ((X '(2 3)))
     (IF X (LET ((HEAD (CAR X)) (TAIL (CDR X))) (+ X HEAD)) NIL)) ;
  T

Notice how the expanded version of split contains a definition of x . This blocks the competing definition in our troublesome example . In this scenario, the split macro accidentally captured the variable x and overwrote it in an unexpected way. How can we avoid this problem?

One simple solution would be to not create a variable x in the macro, but to instead use a variable with some insane long name like xqweopfjsadlkjgh. Then we could feel pretty confident the variable used inside the macro will never clash with a variable inside the code that uses it. If fact, there is a Common Lisp function called gensym whose job it is to generate crazy variable names exactly for this purpose:

> (gensym)
#:G8695

The gensym function will create a unique variable name for you that is guaranteed never to clash with any other variable name in your code. You may notice that it also has a special prefix (#:) that differentiates it from other names. Common Lisp handles these gensym-based names as a special case and will stop you from using the name of a gensym variable directly.

Now let’s use the gensym function inside our split macro to protect the macro from causing variable capture:

;This function is finally safe to use
  (defmacro split (val yes no)
   (let1 g (gensym)
   `(let1 ,g ,val
       (if ,g
         (let ((head (car ,g))
               (tail (cdr ,g)))
           ,yes)
           ,no))))

In the first line of our revised macro, we define a variable g that contains the gensym name . It’s very important to notice that there is not a backquote at the front of this line. This means that this line of code is run at macro expand time, not runtime, and it is perfectly fine to define the variable g at this point. The let1 on the next line, however, has a backquote in front of it . This line will be run at runtime, so we don’t want to use a hardcoded variable in this spot. In this new version, we instead use the unique gensym name stored in g.

Now every time the split macro is used, a unique name is generated to hold the internal value. We can test this by running some examples through macroexpand:

> (macroexpand '(split '(2 3)
                  (+ x head)
                  nil))
(LET ((#:G8627 '(2 3))) (IF #:G8627 (LET ((HEAD (CAR #:G8627))
 (TAIL (CDR #:G8627))) (+ X HEAD)) NIL)) ;
T
> (macroexpand '(split '(2 3)
                  (+ x head)
                  nil))
(LET ((#:G8628 '(2 3))) (IF #:G8628 (LET ((HEAD (CAR #:G8628))
 (TAIL (CDR #:G8628))) (+ X HEAD)) NIL)) ;
T

Notice how a differently named local variable was created in both instances . This guarantees that the variable name will not only be unique within your code, but will also be unique if the split macro is ever used multiple times in a nested fashion. We have now created a fully debugged version of our split macro.

Just because it is now bug-free does not mean that it is free of variable capture. Note that the macro still defines the variables head and tail. If you used this function in other code in which head or tail had an alternate meaning, your code would fail! However, in the case of head and tail, the capture is on purpose. In this situation, the variable capture is a feature, not a bug—it is an anaphoric macro. As we’ve discussed, this means that it makes named variables or functions available that we can use in the body of the macro.

A Recursion Macro

Let’s take another look at our improved my-length macro:

(defun my-length (lst)
  (labels ((f (lst acc)
             (split lst
               (f tail (1+ acc))
               acc)))
    (f lst 0)))

As we discussed, there is an additional repetitive pattern in this code: The creation of a local function f. Let’s write another macro that gets rid of this additional visual noise: recurse. Here’s an example of the recurse macro in use:

 > (recurse (n 9)
     (fresh-line)
     (if (zerop n)
       (princ "lift-off!")
       (progn (princ n)
              (self (1- n)))))
  9
  8
  7
  6
  5
  4
  3
  2
  1
  lift-off!

The first parameter into the recurse macro is a list of variables and their starting values . In this case, we’re declaring only one variable (n) and setting its starting value to 9. The rest of the lines in the macro make up the body of the recursive function.

The first thing we do in the body is start a fresh line . Then we check if n has reached zero yet . If it has, we print “lift-off!” . Otherwise, we print the current number and call the function again, recursively. Like our split macro, the recurse macro is anaphoric. In the case of recurse, it makes a function named self available, which we call when we’re ready to perform a recursion . We also subtract one from n at this point to lower the countdown number.

Now that we’ve seen how recurse should work, let’s write this recurse macro. In order to process the list of arguments and starting values, it’s useful for us to have a function that can group items into a list of pairs. Here is a function, pairs, that accomplishes this:

> (defun pairs (lst)
     (labels ((f (lst acc)
                 (split lst
                   (if tail
                       (f (cdr tail) (cons (cons head (car tail)) acc))
                       (reverse acc))
                   (reverse acc))))
        (f lst nil)))
  PAIRS
  > (pairs '(a b c d e f))
  ((A . B) (C . D) (E . F))

The pairs function is a tail-call-optimized list-eater, which, ironically, has its own local function f . (Shortly, we won’t need to declare such a function anymore.) It uses split to break an item off the list . However, since it needs to process two items (a pair) from the list at once, we need to run an additional check to see if the tail is empty . If there are no items in the list (or only one item left ), we return our accumulated values. Otherwise, we recursively process the rest of the list, with a new pair of items placed into the accumulator .

Now we’re finally ready to write the recurse macro:

(defmacro recurse (vars &body body)
   (let1 p (pairs vars)
     `(labels ((self ,(mapcar #'car p)
                  ,@body))
        (self ,@(mapcar #'cdr p)))))

As you can see, it simply transforms the recursion into a traditional local function. First, it uses our new pairs function to take apart the variable names and starting values, and puts the result into p . Then it defines a local function simply named self. The variable names for self are the odd-numbered items from p . Since we want self to be accessible, anaphorically, from inside the macro, we use a plain name instead of a gensym name for this function. At the bottom of the macro, we then simply call self, passing in all the starting values .

Now that we’ve created the recurse macro, let’s once again clean up our my-length function using this new language construct:

(defun my-length (lst)
  (recurse (lst lst
            acc 0)
           (split lst
             (f tail (1+ acc))
             acc)))

As you can see, there is very little repetition or visual noise in this version of our my-length function.

Now you can appreciate how helpful macros can be when trying to write clean, succinct code. However, a liberal use of macros will also require you to bear some costs that you need to be aware of. We’ll look at the potential downsides to macros next.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.91.47