Handling Data in a Generic Way

Common Lisp has many different datatypes available for writing elegant and efficient programs. But without some care, having so many datatypes can lead to ugly and repetitive code.

For example, suppose we want to add several groups of numbers, which are stored as both lists and arrays. Since lists and arrays behave differently, will we need to write two different addition functions—one for lists and the other for arrays? It would be great if we could write a single chunk of code to handle both cases without caring about how the numbers are stored.

Common Lisp has all the features we need to write such generic code, including generic library functions, type predicates, defmethod, and generic accessors. We can use these features to write code that works with many types of data—including built-in as well as custom types that we might create with defstruct—without superfluous repetition in our code.

Working with Sequences

The easiest way to write code that works with any type of argument is to hand the type-checking work to someone else. The Common Lisp libraries are packed with functions that can generically handle data of varying types in their arguments, the most commonly used of which are the sequence functions. The sequence functions work generically across the three main ways of sequencing objects in Lisp: lists, arrays, and strings.

You’ve already seen one of these sequence functions without even realizing it: the length function. You can use the length function to check for the length of all three sequence types:

> (length '(a b c))
3
> (length "blub")
4
> (length (make-array 5))
5

Without the generic length function, you would need to use three separate functions to determine the length of strings, arrays, and lists.

Note

Common Lisp has a specific function for checking the length of lists, called list-length. Because generic functions tend to require extra type-checking to determine the correct behavior, they can be slower to execute. The list-length function is useful for performance-sensitive code, but most Lispers prefer using the generic length function in regular code.

Sequence Functions for Searching

Some sequence functions let you search sequences:

  • find-if finds the first value that satisfies a predicate.

  • count finds out how often a certain object appears in sequence.

  • position tells you where an item is located.

  • some and every tell you if some or every value in a sequence obeys a specific predicate.

Here are some examples:

 > (find-if #'numberp '(a b 5 d))
  5
 > (count #s "mississippi")
  4
 > (position #4 "2kewl4skewl")
  5
 > (some #'numberp '(a b 5 d))
  T
 > (every #'numberp '(a b 5 d))
  NIL

In these examples, we use find-if to find the first number in a sequence, which is the number 5 . We use count to find out how many times the character s appears in "mississippi" . We use position to find at what position the character 4 appears. In this case, it is in the fifth position, starting the count from zero . We use some to see if any items in a sequence are numbers. Indeed, there is a number . Finally, we use every to see if every item in the list is a number, which is not the case .

Sequence Functions for Iterating Across a Sequence

One particularly useful generic sequence function is reduce. The reduce function allows you to iterate through a sequence and distill it down into a single result. Here, we use reduce to add together items in a list:

> (reduce #'+ '(3 4 6 5 2))
20

The sum of those numbers turns out to be 20. Here is a diagram that shows exactly what is happening in this example:

image with no caption

On the right side, shown in gray, is our list. On the left side, you can see the pairs of numbers that are fed into the plus (+) function and the intermediate results that are calculated. This shows that the plus function always receives a single intermediate result as well as the next number in the list as its arguments. The only exception to this is in the very first call to the plus function. Since no intermediate result exists when we start, the first time we call the plus function, we promote the number 3, which is at the start of the list, into our intermediate result column. Therefore, the first time the plus function is called, it actually receives two items straight off the top of the list.

Let’s look at a slightly more complicated example, this time using our own reduction function. We’re going to find the largest even number in the list:

 > (reduce (lambda (best item)
             (if (and (evenp item) (> item best))
                 item
               best))
            '(7 4 6 5 2)
           :initial-value 0)
  6

Our reduction function, which we pass to reduce to distill down our answer from the list, has two arguments . The first argument is the best value we’ve found so far—in other words, the largest even number we’ve found so far. The second argument is the next number from the list.

Our reduce function needs to return as a result the new best number. Therefore, if the latest number is better than the previous best , we return it . Otherwise, we return the previous best .

Remember that the first number in the list we’re reducing will be used as a starting value. If this is a problem, we can instead pass an explicit initial value to the reduce function by passing in a keyword parameter named :initial-value .

Specifying an initial value for the reduce function is often necessary, or a bug can sneak into your code. In our example, it could allow an odd number at the front of the list to erroneously be deemed the best large even number. Let’s see what happens if we leave out the initial value.

> (reduce (lambda (best item)
            (if (and (evenp item) (> item best))
                item
              best))
          '(7 4 6 5 2))
7

Yes, things go horribly, horribly wrong, as a result of not specifying an initial reduce value.

Another great benefit of the reduce function is that it is generic, as is true for all these sequence functions. This means that it can reduce lists, arrays, or strings in exactly the same way, and you can use reduce to write functions that are oblivious to the difference between these different sequence types.

Earlier, I mentioned that it would be convenient to be able to write a single function that could sum together numbers in lists or arrays equally well. Now we can write such a function:

> (defun sum (lst)
     (reduce #'+ lst))
SUM
> (sum '(1 2 3))
6
> (sum (make-array 5 :initial-contents '(1 2 3 4 5)))
15
> (sum "blablabla")
Error: The value # is not of type NUMBER.
image with no caption

sum is blissfully unaware of the difference between arrays and lists; it works on both. However, since addition doesn’t make any sense for characters, the sum function returns an error when used on a string.

Another function that is useful for iterating across a sequence is the map function. This function is identical in behavior to mapcar. However, unlike mapcar, the map function works on all sequence types, not just lists. You specify the type of sequence to return from the mapping by passing an extra argument to the map function.

Here is an example of map:

 > (map 'list
         (lambda (x)
            (if (eq x #s)
                 #S
                 x))
         "this is a string")
 (#	 #h #i #S #  #i #S #  #a #  #S #	 #
 #i #
 #g)

In this example, we’re turning every s character in a string to its uppercase version. The mapping function we pass into map simply checks if the current character is an s and returns the uppercase S if it is .

The result of this calculation is a list of characters . This is because we told the map function that we wanted a list as a result . Had we asked for a string instead, a string would have been our result.

Two More Important Sequence Functions

The subseq function lets you pull a subsequence out of a larger sequence by specifying starting and ending points:

> (subseq "america" 2 6)
"eric"

As you can see, the word america contains the name eric, starting from the second character and ending at the sixth character.

The sort function lets you pass it an arbitrary function to use for the sorting. In this case, we’re just using the less-than (<) function:

> (sort '(5 8 2 4 9 3 6) #'<)
(2 3 4 5 6 8 9)

There are many more sequence functions than we’ve discussed so far, but the examples in this chapter will get you off to a good start.

Note

For a comprehensive list of sequence functions, and indeed all Common Lisp functions, visit the Common Lisp Hyperspec at http://www.snipurl.com/rz3h0—an exhaustive, but daunting, description of all Common Lisp has to offer.

Creating Your Own Generic Functions with Type Predicates

Common Lisp, like virtually all other Lisps, is a dynamically typed language. This means that parameters or variables in your code can hold any type of data—symbols, strings, numbers, functions, or whatever else you want to place in them. In fact, the same parameter or variable can even hold different types of data at different times in a running program.

Therefore, it makes sense to have a bunch of functions that tell you whether a variable has a certain type of data in it. For instance, you can check whether you have a number with numberp:

> (numberp 5)
T

The type predicates you will probably use most frequently are arrayp, characterp, consp, functionp, hash-table-p, listp, stringp, and symbolp.

You can use type predicates to write functions that handle different types of data generically. Suppose we wanted to write a function that lets us add both numbers or lists. Here’s one way we could write such a function:

> (defun add (a b)
     (cond ((and (numberp a) (numberp b)) (+ a b))
           ((and (listp a) (listp b)) (append a b))))
ADD
> (add 3 4)
7
> (add '(a b) '(c d))
(A B C D)

In this add function, we use predicates to see if the arguments passed in are numbers or lists, and then we act appropriately. If we aren’t given two numbers or two lists, it simply returns nil.

Although you can write functions supporting multiple types of data using type predicates, most Lispers wouldn’t write an add function this way, for the following reasons:

A single, monolithic function for all types:

This is fine for just two types, but if we wanted to handle a dozen or more types, our function would quickly turn into a giant monstrosity.

Modifications required to accommodate new cases:

We would need to change the add function whenever we want to support a new type, increasing the chance that we would break existing code. Ideally, we would like to handle each new situation by itself without touching already working code.

Hard to understand:

It is hard to see exactly what the main cond statement is doing and if the types are all being routed to the right place.

Performance:

The resulting function might be slow. For instance, a Lisp interpreter/compiler might be able to create faster code for appending two lists if it knew for sure that both items were lists when the appending happens. However, in our first attempt at the add function, the type of the two arguments is never really completely obvious. Our compiler would need a bit of smarts to be able to tell from the condition (and (listp a) (listp b)) that both variables are guaranteed to be lists. Life would be easier for the compiler if we explicitly stated the types of arguments for each type situation.

Because it is so useful to be able to have a single function that does different things when given certain datatypes, the Common Lisp command defmethod lets us define multiple versions of a function that each supports different types. When that function is called, Lisp checks the argument types at the time of the call and chooses the correct version of the function automatically. The proper term for having a compiler/interpreter choose among different versions of a function based on argument types is type dispatching.

Here’s how we would write our add function using defmethod:

> (defmethod add ((a number) (b number))
     (+ a b))
ADD
> (defmethod add ((a list) (b list))
     (append a b))
ADD
> (add 3 4)
7
> (add '(a b) '(c d))
(A B C D)

As you can see, this version of the add function handles every type of situation with a separate function, and new cases can be added without modifying existing code. Overall, the code is much easier to understand. Also, the compiler can see the type of the parameters and may be able to write faster code using this knowledge.

The defmethod function is like defun, except that it allows us to write multiple functions with the same name. When using defmethod, we can explicitly state the type of each parameter in the function’s argument list so that Lisp can use these type declarations to figure out the correct version of add for each situation.

If you’re familiar with the world of OOP, the word method probably has a special meaning to you. Since this new command is called defmethod, does it have anything to do with OOP? In short, yes. This command can be used not only with Common Lisp’s built-in types, but also with structures you’ve created with defstruct. The combination of defstruct and defmethod basically constitutes a simple object system.

Now we’ll use this object system to write a game!

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.116.36.194