The functional programming style can lead to slow code, at least in the hands of a novice programmer. We used the functional style to develop the core of Dice of Doom. Hence, this first version of our game is excruciatingly inefficient. We had to limit our game to a 2-by-2 board to make it playable. But now we can increase our board size to 3-by-3, as we optimize our game engine.
Let’s increase the parameters controlling the board size to make this happen. You may not want to play a game at this new size until you’ve implemented all the optimizations throughout the rest of this chapter, unless you are an extremely patient person and don’t mind having the computer take minutes building the initial game tree and deciding on moves.
(defparameter *board-size* 3)
(defparameter *board-hexnum* (* *board-size* *board-size*))
There, we’ve upgraded the board size to 3 by 3.
The rest of this chapter covers some important techniques for optimizing functional code. These techniques apply to all programs written in the functional style, which includes Dice of Doom. In later chapters, we’ll add other optimizations. Eventually, we’ll be able to play against an AI player on much more spacious boards, while still having elegant code written in the functional style.
Before we start optimizing Dice of Doom, there is an important Lisp programming concept we need to discuss: closures. Closures are extra bits of data from the outside world that are captured whenever a lambda function is created. To understand the hows and whys of capturing variables in a closure, consider the following example:
> (defparameter *foo* (lambda ()
                        5))
*FOO*
> (funcall *foo*)
5
In this example, we’re creating a new, unnamed function, and then setting *foo* equal to this function. Next, we call this function using the funcall command. As you would expect, the value returned from this function is 5. All the lambda function does is return this number.
Next, consider this more interesting example:
> (defparameter *foo* (let ((x 5))
                        (lambda ()
                          x)))
*FOO*
This version of *foo* is exactly the same as the previous version, except that we first declare a local variable x, which is set to 5. Then, in the body of the lambda, we return x. So, what do you think will happen if we call this new version of *foo*?
The reason this is a tough question is that x is declared as a “local” variable. However, x (apparently) no longer exists by the time we call *foo*, since we’re already long past the point at which the body of the let expression was evaluated.
Let’s try it out and see what happens:
> (funcall *foo*)
5
Holy cow! Somehow the lambda expression we created remembered what x was at the time it was created. The variable x, which we previously thought of as a local variable, has somehow managed to live on past the block in which it was created!
When we first covered let expressions in Chapter 2, you learned that advanced Lispers prefer to call variables created with a let expression lexical variables. Now you can see why: A variable created in this way does not need to be local. If it is captured in a closure (by using the variable inside a lambda expression), it can live on after the let expression has finished.
To understand how closures work, remember that Lisp uses garbage collection. In fact, it was the first language to have this feature. Garbage collection means that you never have to “free” variables (as you do in C programming). The Lisp compiler/interpreter is smart enough to know when variables are no longer in use and destroys them automatically.
Garbage collection will happen at some arbitrary future time after you’ve exited a let expression. Periodically, Lisp searches its memory for items that are no longer referenced anywhere and can therefore be safely destroyed. If Lisp notices that a variable defined in a let is no longer used by anything, it will destroy that variable.
However, if you create a lambda expression within the let expression (as we did previously), it’s possible for those variables to live on, referenced from within the lambda expression. In that case, the garbage collector will leave those variables alone. Basically, you’ve created variables that are permanent, at least as long as the lambda expression doesn’t fall out of use and get garbage collected.
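To see such “permanent” variables in action, here is a small sketch of our own (make-counter is not part of the game code): each call to make-counter captures a fresh lexical variable in a closure, so every counter keeps a private count that survives between calls.

```lisp
;; A hypothetical sketch, not from the game: each call to make-counter
;; captures its own n in a closure, giving every counter a private,
;; persistent count.
(defun make-counter ()
  (let ((n 0))
    (lambda ()
      (incf n))))

(defparameter *counter* (make-counter))
;; Successive (funcall *counter*) calls return 1, then 2, then 3, ...
```

Because each invocation of make-counter creates a new let binding, two counters never interfere with each other.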
You can do a lot of cool things using closures. They’re often used for caching small pieces of information between uses of a function. For instance, here is a function that remembers what line number is currently being printed:
> (let ((line-number 0))
    (defun my-print (x)
      (print line-number)
      (print x)
      (incf line-number)
      nil))
MY-PRINT
> (my-print "this")
0
"this"
nil
> (my-print "is")
1
"is"
nil
> (my-print "a")
2
"a"
nil
> (my-print "test")
3
"test"
nil
In order to keep track of the line number, we first create a lexical variable named line-number. Next, we declare our my-print function using defun, in the body of the let. This command creates a lambda function behind the scenes, which also lets us generate a closure.
Within the body of the my-print function, we can then print line-number, and even mutate it using incf. (incf just adds one to a variable.) Because the line-number variable is captured in the closure, it can “live on” between calls to my-print, allowing us to count line numbers.
The first optimization we’re going to perform is called memoization. This technique makes use of closures. Memoization works only for functions written in the functional style. As you know, the behavior of a function in the functional style depends only on the arguments passed into it. Also, the only action of a function in the functional style is to calculate a value to return to the caller.
This suggests an obvious optimization: What if we remember the arguments and result of each call of this function? Then, if the function ever gets called again with the same arguments, we won’t need to recalculate the result. Instead, we can simply return the precalculated result.
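As a warm-up, here is a sketch of our own (slow-fib is not part of the game code) showing the memoization pattern we’re about to apply to the game functions: we grab the old function object, then redefine the function to consult a hash table first.

```lisp
;; A hypothetical warm-up example: a deliberately slow, purely
;; functional Fibonacci.
(defun slow-fib (n)
  (if (< n 2)
      n
      (+ (slow-fib (- n 1)) (slow-fib (- n 2)))))

;; Memoize it: save the old function object, then redefine slow-fib to
;; check the hash table of previous argument/result pairs first.
(let ((old-fib (symbol-function 'slow-fib))
      (previous (make-hash-table)))
  (defun slow-fib (n)
    (or (gethash n previous)
        (setf (gethash n previous) (funcall old-fib n)))))
```

Note that the recursive calls inside the old definition still go through the symbol slow-fib, so they too hit the memoized version, collapsing the exponential call tree.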
Several functions in Dice of Doom can benefit from memoization.
Let’s start with the neighbors function, which lets us know which hexagons on the board can be attacked from a given location:
> (neighbors 0)
(3 1 4)
What neighbors is telling us is that if we want to attack other hexagons on the board from hexagon 0, we can reach only hexagon 3, 1, or 4 (based on our new 3-by-3 board size).
As you may remember, the neighbors function needed to do all kinds of ugly checking for the edges of the board, since hexagons along the edges are limited in the hexagons they can attack. However, since the shape of the board never changes mid-game, these numbers never change for a given board position. This makes neighbors a perfect candidate for memoization! Here is the code that accomplishes this:
(let ((old-neighbors (symbol-function 'neighbors))
      (previous (make-hash-table)))
  (defun neighbors (pos)
    (or (gethash pos previous)
        (setf (gethash pos previous)
              (funcall old-neighbors pos)))))
Let’s dissect this code to make sense of what’s happening. First, we save the old version of the neighbors function in a local variable named old-neighbors. The symbol-function command simply retrieves the function bound to a symbol. Using symbol-function here allows us to retain access to the old value of neighbors, even after we define a new function with the same name, as we’ll do shortly.

Next, we define a local variable previous, which will hold all the arguments and results the function has ever seen. This can be represented as a hash table, where the arguments are the hash keys and the results are the values.

Now we define a new neighbors function that overrides the old definition of neighbors. This new definition adds memoization to the old version of the function. First, we look up the argument pos in the hash table and return the stored result, if one is available. Otherwise, we call the old definition of the function (that’s why we needed to create the old-neighbors lexical variable) and add the new argument/result pair to the hash table. Since setf returns the value being set, this command also causes the newly calculated result to be returned to the caller of neighbors.
Be careful not to declare the memoized version of the neighbors function more than once without also redeclaring the original version of the function. Otherwise, the neighbors function will be wrapped in multiple unsightly layers of memoization, since there is no check for whether the memoization has already been done.
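One possible safeguard, sketched here as our own suggestion rather than code from the game, is to mark the symbol once it has been wrapped and skip the wrapping thereafter:

```lisp
;; A hypothetical guard against double-memoization: record on the
;; symbol's property list that neighbors has already been wrapped, and
;; only perform the wrapping the first time this form is evaluated.
(unless (get 'neighbors 'memoized)
  (setf (get 'neighbors 'memoized) t)
  (let ((old-neighbors (symbol-function 'neighbors))
        (previous (make-hash-table)))
    (defun neighbors (pos)
      (or (gethash pos previous)
          (setf (gethash pos previous)
                (funcall old-neighbors pos))))))
```

Re-evaluating this form is now harmless: the property-list flag stops a second layer from being added.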
The biggest payoff by far for memoization in our program will be in the game-tree function. This makes sense if you think about how a board game works: very often, you can reach the same board position by performing the same moves in a slightly different order. In our naive version of the game-tree function, every different move sequence leads to a completely different branch in the game tree, which we build in a totally repetitive and inefficient way.
In the memoized version of the game-tree code, the function can say to itself, “Hey, I’ve seen that board position before!” and can then share branches of the game tree. Here is a memoized version of game-tree that does this:
(let ((old-game-tree (symbol-function 'game-tree))
      (previous (make-hash-table :test #'equalp)))
  (defun game-tree (&rest rest)
    (or (gethash rest previous)
        (setf (gethash rest previous)
              (apply old-game-tree rest)))))
As you can see, this memoization is virtually identical to the one we used for the neighbors function. The only difference is that we set the hash table to use equalp instead of eql (the default) as the test on the key. This is because the key (that is, the list of arguments to game-tree) contains the game board, in the form of an array. With the test function set to equalp, Lisp will check every hexagon on the board and make sure it matches before reusing a previous calculation.
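The difference between the two tests is easy to demonstrate with a small sketch of our own: two arrays with the same contents are distinct objects, so they are never eql, but they are equalp.

```lisp
;; A hypothetical demonstration: eql compares object identity, while
;; equalp compares arrays element by element.
(defparameter *tab-eql* (make-hash-table))               ; default test: eql
(defparameter *tab-equalp* (make-hash-table :test #'equalp))

(setf (gethash (vector 1 2 3) *tab-eql*) 'found)
(setf (gethash (vector 1 2 3) *tab-equalp*) 'found)

;; A freshly built vector is a brand-new object, so the eql table
;; misses, while the equalp table matches on the contents:
;; (gethash (vector 1 2 3) *tab-eql*)    returns NIL
;; (gethash (vector 1 2 3) *tab-equalp*) returns FOUND
```

This is exactly why game-tree, whose key contains a board array, needs the equalp test to ever score a cache hit.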
Another function that will benefit greatly from memoization is the rate-position function. Here it is, memoized:
(let ((old-rate-position (symbol-function 'rate-position))
      (previous (make-hash-table)))
  (defun rate-position (tree player)
    (let ((tab (gethash player previous)))
      (unless tab
        (setf tab (setf (gethash player previous) (make-hash-table))))
      (or (gethash tree tab)
          (setf (gethash tree tab)
                (funcall old-rate-position tree player))))))
We need to do something a bit special for the memoization on this function to work correctly, because of the tree argument passed into rate-position. The game tree is potentially huge, so we need to make sure we never compare a game tree object with equal (or a similar comparison function that is slow with large lists). Instead, we want to compare it with eql. Because of this, we handle the memoization of each of the two parameters to rate-position (tree and player) separately. We accomplish this by using nested hash tables.
First, we create an outer hash table with the default eql test. Then we define a tab variable that looks up one of our parameters (player) in the outer hash table, retrieving an inner hash table. If tab is not found in the outer hash table, we create a new, empty inner hash table and store it in the outer hash table under the same key. The rest of the function is similar to our previous examples, except that we now use the inner hash table, with the tree argument as the key.
This memoization will bring us a step closer to having larger, and more fun, boards for Dice of Doom.
You use memoization for optimizing the performance of code written in the functional style. However, memoization code is not, in itself, written in the functional style. It cannot be, since it requires you to maintain and update a table of previous calls to the target function.
The next technique we’re going to use to optimize our functional program is called tail call optimization. To understand this concept, let’s study a simple function that calculates the length of a list:
> (defun my-length (lst)
    (if lst
        (1+ (my-length (cdr lst)))
        0))
MY-LENGTH
> (my-length '(fie foh fum))
3
The my-length function should be pretty easy for you to understand at this point. First, it checks whether the list is empty. If not, it recursively calls itself on the tail of the list and adds one to the total, using the 1+ function. If the list is empty, the function just returns 0.
It turns out that this function is actually quite inefficient. We can easily see this by trying to use it against a really big list:
> (defparameter *biglist* (loop for i below 100000 collect 'x))
*BIGLIST*
> (my-length *biglist*)

*** - Program stack overflow. RESET
Calling this function in CLISP actually causes the program to crash! (Other Common Lisp compilers/interpreters may do better, depending on whether the compiler writers use any special tricks to anticipate this common pitfall in Lisp code.)
This happens because of the 1+ function. It tells Lisp, “First, figure out the length of the shorter list, then call 1+ on the result.”
The problem is that each time we call my-length recursively, Lisp must remember that we need to add one to the result later on, once the length of the tail of the list has been figured out. Since the list is 100,000 items long, it must remember this 99,999 times before it can perform a single addition! The CLISP interpreter places a reminder for each of these pending additions on the program stack, which eventually overflows, crashing the program.
So how do we avoid this problem? We do it by rewriting our my-length function like so:
>(defun my-length (lst)
(labels ((f (lst acc)
(if lst
(f (cdr lst) (1+ acc))
acc)))
(f lst 0)))
MY-LENGTH
> (my-length '(fie foh fum))
3
Here, we define a local function f that will act as our list-eater. This function takes an extra parameter, often called an accumulator, here shortened to acc. This acc argument keeps a running count of how many items in the list we have encountered so far. When we initially call the function f, we set acc to 0.
With this accumulator available, when f calls itself recursively, it no longer needs to add one to the result afterward. Instead, it just adds one to the accumulator. Once we reach the end of the list (lst is nil), acc will equal the total number of items in the list, so we can just return it.
What is important here is that the very last thing the function f does, in the case where more items remain on the list, is call itself recursively. (The other branch of the if expression doesn’t count, since it isn’t evaluated when the condition is true.) When a function in Lisp calls itself (or another function) as its very last action, we call this action a tail call. A smart Lisp compiler, on seeing a tail call, can say to itself, “Hey, since I don’t need to do anything more after calling f again, I can just go straight to f, without needing to put the current program context on the stack.”
This is actually similar to performing a GOTO in BASIC or a longjmp in C++. In all of these cases, we just “forget” where we came from, which is very fast and doesn’t thrash the stack. However, in the case of a tail call in Lisp, it is also perfectly safe. Anyone who has used GOTO or longjmp knows they’re anything but safe!
Notice that there are two different definitions of lst in the preceding example code. One is an argument to the my-length function, and the other is an argument to the function f. The values of these two lst arguments will deviate as the program runs and f is called recursively. However, within the function f, the version in its own argument list takes precedence. This process of hiding one variable behind another through precedence is called variable shadowing.
I used variable shadowing in the my-length function so it would be impossible for me to accidentally use the “wrong list” when writing the code inside the function f. Other programmers dislike this technique, since having similarly named variables with different values can lead to confusion. You’ll need to decide which of these arguments you find more convincing and whether you’ll use variable shadowing in your own code.
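Here is a minimal sketch of our own illustrating shadowing in isolation (shadow-demo is a made-up name, not game code):

```lisp
;; A hypothetical illustration of variable shadowing: the let-bound x
;; hides the parameter x for the body of the let.
(defun shadow-demo (x)
  (let ((x (* x 2)))  ; this x shadows the parameter x
    x))               ; the inner x takes precedence here
```

Within the let body, every reference to x sees the doubled value; the original parameter is unreachable there, which is exactly the effect used in my-length.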
Unfortunately, you can’t be 100 percent sure in Common Lisp that a compiler/interpreter will perform tail call optimizations. It is not required by the ANSI Common Lisp standard. (The situation is actually different in the Scheme dialect, since Scheme has a strict requirement for tail call optimization.)
However, most Common Lisp compilers support this feature, although CLISP requires some extra cajoling to make tail call optimization work for some functions, including our example function. The reason for this is that tail calls can actually lead to performance problems themselves, in some esoteric cases. Also, when we debug a program, it’s nice to be able to look at the full call stack; tail call optimizations will prevent this, since, by their nature, they will minimize the information available on the stack.
Here’s the extra step we need to take to get CLISP to tail call optimize the my-length function:
(compile 'my-length)
Calling this function tells CLISP to run the my-length function through its full compiler, which includes a tail call optimization step. Now we can run my-length against our jumbo-sized list!
> (my-length *biglist*)
100000
One function in our game that could definitely benefit from tail call optimization is the add-new-dice function. Here’s the fully optimized version:
(defun add-new-dice (board player spare-dice)
  (labels ((f (lst n acc)
             (cond ((zerop n) (append (reverse acc) lst))
                   ((null lst) (reverse acc))
                   (t (let ((cur-player (caar lst))
                            (cur-dice (cadar lst)))
                        (if (and (eq cur-player player)
                                 (< cur-dice *max-dice*))
                            (f (cdr lst)
                               (1- n)
                               (cons (list cur-player (1+ cur-dice)) acc))
                            (f (cdr lst) n (cons (car lst) acc))))))))
    (board-array (f (coerce board 'list) spare-dice ()))))
As before, we’re performing the list-eating in a function called f, which also has an accumulator. However, this time the acc variable will contain a list of newly updated hexagons with extra dice. We can now call f in tail call position in two places, where we cons new hexagons onto the acc variable.

Once we’ve processed the whole list of hexagons on the board, we can just return acc. However, since we’ve consed items onto acc as we went along the list, acc will actually be reversed. Therefore, we need to perform an extra call to reverse at the very end.
We have now explored some basic techniques for optimizing computer programs written in the functional style.
Now let’s enjoy the fruits of our labor. The following is a full game against the AI player on a 3-by-3 board. As you can see, on an evenly matched starting board, the computer is now practically unbeatable.
>(play-vs-computer (game-tree (gen-board) 0 0 t))
current player = a
      b-1 a-2 a-3
     a-1 b-1 b-2
    b-2 a-2 b-3
choose your move:
1. 1 -> 4
2. 1 -> 0
3. 2 -> 5
4. 7 -> 4
3
current player = a
      b-1 a-2 a-1
     a-1 b-1 a-2
    b-2 a-2 b-3
choose your move:
1. end turn
2. 1 -> 4
3. 1 -> 0
4. 5 -> 4
5. 7 -> 4
1
current player = b
      b-1 a-3 a-1
     a-1 b-1 a-2
    b-2 a-2 b-3
current player = b
      b-1 a-3 a-1
     b-1 b-1 a-2
    b-1 a-2 b-3
current player = a
      b-1 a-3 a-1
     b-1 b-1 a-2
    b-1 a-2 b-3
choose your move:
1. 1 -> 4
2. 1 -> 0
3. 5 -> 4
4. 7 -> 4
5. 7 -> 3
6. 7 -> 6
1
current player = a
      b-1 a-1 a-1
     b-1 a-2 a-2
    b-1 a-2 b-3
choose your move:
1. end turn
2. 4 -> 0
3. 4 -> 3
4. 7 -> 3
5. 7 -> 6
1
current player = b
      b-1 a-1 a-1
     b-1 a-2 a-2
    b-1 a-2 b-3
current player = b
      b-1 a-1 a-1
     b-1 a-2 b-2
    b-1 a-2 b-1
current player = a
      b-2 a-1 a-1
     b-1 a-2 b-2
    b-1 a-2 b-1
choose your move:
1. 4 -> 3
2. 4 -> 8
3. 7 -> 3
4. 7 -> 6
5. 7 -> 8
2
current player = a
      b-2 a-1 a-1
     b-1 a-1 b-2
    b-1 a-2 a-1
choose your move:
1. end turn
2. 7 -> 3
3. 7 -> 6
1
current player = b
      b-2 a-1 a-1
     b-1 a-1 b-2
    b-1 a-2 a-1
current player = b
      b-1 b-1 a-1
     b-1 a-1 b-2
    b-1 a-2 a-1
current player = a
      b-1 b-1 a-1
     b-1 a-1 b-2
    b-1 a-2 a-1
choose your move:
1. 7 -> 3
2. 7 -> 6
1
current player = a
      b-1 b-1 a-1
     a-1 a-1 b-2
    b-1 a-1 a-1
choose your move:
1. end turn
1
current player = b
      b-1 b-1 a-1
     a-1 a-1 b-2
    b-1 a-1 a-1
current player = b
      b-1 b-1 b-1
     a-1 a-1 b-1
    b-1 a-1 a-1
current player = a
      b-1 b-1 b-1
     a-1 a-1 b-1
    b-1 a-1 a-1
The winner is b