Pattern 20Customized Control Flow

Intent

To create focused, custom-control flow abstractions

Overview

Using the right control flow abstraction for the job can help us write clearer code. For instance, Ruby includes an unless operator, which can be used to do something unless a conditional is true. Good Ruby code uses this operator over if and the not operator, since it’s clearer to read.

No language has every useful control flow abstraction built in, though. Functional programming languages give us a way to create our own using higher-order functions. For instance, to create a control flow structure that executes a piece of code n times and prints out the average time for the runs, we can write a function that takes another function and invokes it n times.

However, just using higher-order functions leaves us with a verbose syntax for our custom control flow. We can do better. In Clojure we can use the macro system, and in Scala we’ve got a bag of tricks that include blocks and by name parameters.

Sample Code: Choose One of Three

Let’s start off with a look at a basic custom control structure, choose, which chooses between three different options. We’ll explore two different implementations: the first will use higher-order functions and the second will explore how we can improve on our first solution by providing some syntactic sugar.

In Scala

Our choose function takes an integer between 1 and 3 and three functions. It then executes the corresponding function, as the following code shows:

ScalaExamples/src/main/scala/com/mblinn/mbfpp/functional/ccf/Choose.scala
 
def​ choose[E](num: ​Int​, first: () => E, second: () => E, third: () => E) =
 
if​ (num == 1) first()
 
else​ ​if​ (num == 2) second()
 
else​ ​if​ (num == 3) third()

This is a straightforward use of higher-order functions. Let’s take a look at how we’d use it:

 
scala>​ simplerChoose(2,
 
| () => println("hello, world"),
 
| () => println("goodbye, cruel world"),
 
| () => println("meh, indifferent world"))
 
goodbye, cruel world

It works as we’d expect; however, the need to wrap our actions up in functions is cumbersome. A better syntax would be if we could just pass naked expressions into the choose, as we do in the following imaginary REPL session:

 
scala>​ simplerChoose(2,
 
| println("hello, world"),
 
| println("goodbye, cruel world"),
 
| println("meh, indifferent world"))
 
goodbye, cruel world

Let’s see how to make this syntax real, starting with a very simple case. In the following REPL output, we define a test function with a single argument, expression. The body of the function just attempts to execute the expression. We then call test with the single argument println("hello, world").

 
scala>​ def test[E](expression: E) = expression
 
test: (expression: Unit)Unit
 
 
scala>​ test(println("hello, world"))
 
hello, world

It appears that this works and our expression is evaluated, since "hello, world" is printed to the console. But what happens if we try to execute our expression twice? Let’s find out in the following REPL snippet:

 
scala>​ def testTwice[E](expression: E) = {
 
| expression
 
| expression
 
| }
 
testTwice: (expression: Unit)Unit
 
 
scala>​ testTwice(println("hello, world"))
 
hello, world

The string "hello, world" is only printed to the console once! This is because Scala will, by default, evaluate an expression at the time it’s passed into a function and then pass in the value of the evaluated expression. This is known as pass by value, and it’s usually what we want and expect. For instance, in the following example, it prevents the expression from being evaluated twice:

 
scala>​ def printTwice[E](expression: E) = {
 
| println(expression)
 
| println(expression)
 
| }
 
printTwice: [E](expression: E)Unit
 
 
scala>​ printTwice(5 * 5)
 
25
 
25

However, this is the opposite of what we need when writing custom control structures. Scala gives us an alternative calling semantic called pass by name. Using pass by name means that we pass a name for the expression into the function rather than the evaluated value of the expression. We can then refer to that name inside the function body to have it be evaluated on demand.

To make a function argument pass by name rather than by value, we can use => after the parameter name and before the type annotation. The following REPL snippet rewrites our test function to use pass-by-name calling:

 
scala>​ def testByName[E](expression: => E) {
 
| expression
 
| expression
 
| }
 
testByName: [E](expression: => E)Unit
 
 
scala>​ testByName(println("hello, world"))
 
hello, world
 
hello, world

Now that we understand the difference between pass by value and pass by name, we can write a simplerChoose function that takes naked expressions. The following code snippet does so:

ScalaExamples/src/main/scala/com/mblinn/mbfpp/functional/ccf/Choose.scala
 
def​ simplerChoose[E](num: ​Int​, first: => E, second: => E, third: => E) =
 
if​ (num == 1) first
 
else​ ​if​ (num == 2) second
 
else​ ​if​ (num == 3) third

Now we can use our naked expression syntax, as in the following REPL output:

 
scala>​ simplerChoose(2,
 
| println("hello, world"),
 
| println("goodbye, cruel world"),
 
| println("meh, indifferent world"))
 
goodbye, cruel world

Clojure’s approach to custom control flow is quite different and involves its powerful macro system. Let’s take a look!

In Clojure

Let’s start off our Clojure sample with a look at a simple version of choose that relies on higher-order functions. We take three functions and an integer indicating which one to run, as the following code shows:

ClojureExamples/src/mbfpp/functional/ccf/ccf_examples.clj
 
(​defn​ choose [​num​ ​first​ ​second​ third]
 
(​cond
 
(​=​ 1 ​num​) (​first​)
 
(​=​ 2 ​num​) (​second​)
 
(​=​ 3 ​num​) (third)))

To use it, we pass in our integer and function arguments:

 
=> (choose 2
 
(fn [] (println "hello, world"))
 
(fn [] (println "goodbye, cruel world"))
 
(fn [] (println "meh, indifferent world")))
 
goodbye, cruel world
 
nil

However, we’d like to avoid the need to wrap our actions up into functions and instead write code that looks like the following REPL session:

 
=> (choose 2
 
(println "hello, world")
 
(println "goodbye, cruel world")
 
(println "meh, indifferent world"))
 
goodbye, cruel world
 
nil

To see how we can get there, we’ll need to take a short detour into one of Clojure’s most powerful features, its macro system. Along the way, we’ll answer the age-old question of why Lisp has such a different syntax.

Clojure Macros

Macros are a form of metaprogramming: they are pieces of code that transform other pieces of code. This concept has surprising depth in Clojure and other Lisps.

To see why, let’s do a thought experiment. The builder we introduced in Pattern 4, Replacing Builder for Immutable Object, is verbose to write. One way to cut down on verbosity is to create a skeletal Java class with nothing but attributes in it, and then write code to generate the builder based on those attributes.

A block diagram of this approach is described in the following figure:

images/GenerateBuilders.png

Figure 15. Metaprogramming in Java. Code generating a Builder class

Here, the builder creator is a piece of code that’s responsible for taking in a skeletal Java class with nothing but attributes and producing a builder based on it. This is much like the support that IDEs have to generate getters and setters.

To do so, the builder creator needs some understanding of the input Java code. For such a simple task, the builder creator can just treat the file as text and read the input file line by line, figuring out which lines correspond to variable declarations as it goes.

However, what if we needed to manipulate our input Java code in a more complex way? Say we wanted to modify certain methods to log out the time they were invoked. This would be difficult to do: how do we know when a method starts and ends if we’re just going through the file line by line?

The difficulty is that our simple code generator is treating the Java file as plain text. Complicated language applications like compilers will go through a series of passes to generate an abstract syntax tree or AST. The following diagram is a simplified representation of this process.

images/SimpleParser.png

Figure 16. Simplified Compiler. Stages in a simplified compiler

The AST represents code at a more abstract level in terms of things like methods and classes, rather than as simple text data. For instance, a Java compiler written in Java might have Method and VariableDefinition classes as parts of its AST, among other things.

This makes the AST representation of code the most convenient representation to manipulate programmatically. However, in most programming languages, the AST is hidden away inside the compiler and requires intimate knowledge of the compiler to manipulate.

Lisps, including Clojure, are different. The syntax of Clojure is defined in terms of core Clojure data structures, like lists and vectors. For instance, let’s take a close look at a humble function definition:

 
(​defn​ say-hello [​name​] (​println​ (​str​ ​"Hello, "​ ​name​)))

This is just a list with four elements in it. The first is the symbol defn, the second is the symbol say-hello, the third is a vector, and the fourth is another list. When Clojure evaluates a list, it assumes that the first element is something that can be called, like a function, a macro, or a compiler built-in, and it assumes that the rest of the list is made of arguments.

Other than that, it’s just a list like any other! We can see this by using a single quote, which turns off evaluation on the form it’s applied to. In the following snippet we take the first element from two lists—the first list is a list of four integers, the second is the function definition we just introduced:

 
=> (first '(1 2 3 4))
 
1
 
=> (first '(defn say-hello [name] (println (str "Hello, " name))))
 
defn

Since Clojure code is just Clojure data, it’s very easy to write code to manipulate it. Clojure’s macro system provides a convenient hook to do this manipulation at compile time, as the figure shows.

images/LispReader.png

Figure 17. Read, Evaluate, Compile. Going from Clojure text to bytecode

Let’s dig into this process in a bit more depth. In Clojure, the process of going from a sequence of characters to data structures is called reading, as described in the first box in the diagram. Instead of being some magic hidden away inside of the compiler, it’s a facility that’s available to the programmer.

Some forms are available that will read from various sources, such as files or strings. Here, we use the string version of read to read a vector from a string and take its first element:

 
=> (first (read-string "[1 2 3]"))
 
1

The read form has a partner, eval, as shown in the second step of the diagram. This takes a data structure and evaluates it according to a simple set of evaluation rules we discuss in TinyWeb in Clojure. In the following snippet, we use eval to evaluate a def after we’ve read it in from a string:

 
=> (eval (read-string "(def foo 1)"))
 
#'user/foo
 
=> foo
 
1

You may have seen eval in languages like Ruby or Javascript; however, there’s a crucial difference between that eval and the one Clojure has. In Clojure and other Lisps, eval operates on data structures that have been read in, rather than on strings.

This means it’s possible to do much more sophisticated manipulations, since we don’t have to build up our code using raw string manipulation. The macro expansion step as described in the diagram provides a convenient hook for us to do exactly this.

A macro is just a function with a few key differences. The arguments to a macro are not evaluated, just like the call-by-name arguments we used in Scala in In Scala. A macro is run before compile time, and it returns the code to be compiled. This gives us a formal, built-in way of doing the sort of manipulations we introduced in our Java builder-generator thought experiment.

We define a macro with the defmacro built in. In addition, a few other Clojure features help us build macros by controlling evaluation. These are the backtick, , also known as the syntax quote, and the tilde, ~, also known as unquote.

Together, these features let us build up code templates for use in macros. Syntax quote turns evaluation off inside the form it’s applied to, and it expands any symbol name out to be fully qualified by its namespace. Unquote, as the name suggests, lets us turn evaluation back on inside a syntax quote.

In the following snippet, we use syntax quote and unquote together. The symbol foo and form (+ 1 1) don’t get evaluated, but since we apply unquote to number-one, it does.

 
=> (def number-one 1)
 
#'mbfpp.functional.ccf.ccf-examples/number-one
 
=> `(foo (+ 1 1) ~number-one)
 
(mbfpp.functional.ccf.cff-examples/foo (clojure.core/+ 1 1) 1)

The output looks a bit noisy because syntax quote has namespace-qualified foo and +.

Now that we’ve introduced macros, let’s see how we can use them to simplify choose. We do so by writing a macro, simplerChoose. The simplerChoose macro takes in a number and three forms, and it returns a cond expression that evaluates the appropriate form. The code for simplerChoose is in the following snippet:

ClojureExamples/src/mbfpp/functional/ccf/ccf_examples.clj
 
(​defmacro​ simpler-choose [​num​ ​first​ ​second​ third]
 
`(​cond
 
(​=​ 1 ~​num​) ~​first
 
(​=​ 2 ~​num​) ~​second
 
(​=​ 3 ~​num​) ~third))

Before running it, we can use macroexpand-1 to see what code the macro generates, as we do in the following REPL session:

 
=> (macroexpand-1
 
'(simpler-choose 1 (println "foo") (println "bar") (println "baz")))
 
(clojure.core/cond
 
(clojure.core/= 1 1) (println "foo")
 
(clojure.core/= 2 1) (println "bar")
 
(clojure.core/= 3 1) (println "baz"))

As we can see, the macro expands out to a cond statement, as we’d expect. Now if we run it, it works as we’d expect, without the need to wrap our actions up in functions!

 
=> (simpler-choose 2
 
(println "hello, world")
 
(println "goodbye, cruel world")
 
(println "meh, indifferent world"))
 
goodbye, cruel world
 
nil

Clojure’s macro system is one of its most powerful features, and it explains why Clojure has the syntax it does. In order for the magic to work, Clojure code has to be written in terms of simple Clojure data structures, a property known as homoiconicity.

Sample Code: Average Timing

Let’s take a look at a more involved instance of Customized Control Flow. Here we’ll create a custom control abstraction that executes an expression a given number of times and returns the average time of the executions. This is handy for quick and dirty performance testing.

In Scala

In Scala, our solution is two functions. The first, timeRun, takes an expression, runs it, and returns the time it took. The second, avgTime, takes an expression and a number of times to evaluate it and then returns the average time it took. It uses timeRun as a helper function.

The code for our Scala solution follows:

ScalaExamples/src/main/scala/com/mblinn/mbfpp/functional/ccf/Choose.scala
 
def​ timeRun[E](toTime: => E) = {
 
val​ start = System.currentTimeMillis
 
toTime
 
System.currentTimeMillis - start
 
}
 
def​ avgTime[E](times: ​Int​, toTime: => E) = {
 
val​ allTimes = ​for​ (_ <- Range(0, times)) ​yield​ timeRun(toTime)
 
allTimes.sum / times
 
}

As advertised, this gives us a way to get the average runtime for a statement:

 
scala>​ avgTime(5, Thread.sleep(1000))
 
res0: Long = 1001

Let’s break this down a bit more using the REPL. The meat of avgTime is the following expression:

 
val​ allTimes = ​for​ (_ <- Range(0, times)) ​yield​ timeRun(toTime)

If we substitute some expressions in by hand, we can see this generates a sequence of run times. The underscore in the for binding indicates that we don’t actually care about what the values of the Range expression are bound to, since we’re just using it to run our statement a set number of times:

 
scala>​ val allTimes = for (_ <- Range(0, 5)) yield timeRun(Thread.sleep(1000))
 
allTimes: scala.collection.immutable.IndexedSeq[Long] =
 
Vector(1000, 1001, 1000, 1001, 1001)

From there, we use sum to calculate the sum of all runtimes, and we divide by the number of runs to get the average:

 
scala>​ allTimes.sum / 5
 
res2: Long = 1000

One other interesting element of this solution is how we pass a by-name parameter through two different functions. The toTime parameter is passed into avgTime, and from there into timeRun. It’s not evaluated until it’s used in timeRun.

The ability to chain together calls using by-name parameters is important because it lets us break up the code for more complicated instances of Customized Control Flow.

In Clojure

In Clojure, our solution consists of a macro, avg-time, and a function, time-run. The avg-time macro generates code that uses time-run to time runs of the passed-in statement and then calculate its average.

The code for our Clojure solution follows:

ClojureExamples/src/mbfpp/functional/ccf/ccf_examples.clj
 
(​defn​ time-run [to-time]
 
(​let​ [start (System/currentTimeMillis)]
 
(to-time)
 
(​-​ (System/currentTimeMillis) start)))
 
 
(​defmacro​ avg-time [times to-time]
 
`(​let​ [total-time#
 
(​apply​ ​+​ (​for​ [_# (​range​ ~times)] (time-run (​fn​ [] ~to-time))))]
 
(​float​ (​/​ total-time# ~times))))

Here, we use it to calculate the average time for a test statement:

 
=> (avg-time 5 (Thread/sleep 1000))
 
1000.8

Let’s dig into how time-run works in a bit more detail, starting with a Clojure feature we introduce in this sample: automatic generated symbols, or gensyms. To avoid accidental variable capture in macros, whenever we need a symbol in our generated code, we need to generate a unique symbol.

The way we do this in Clojure is to append a symbol name with a hash sign when we use one inside of a syntax quote. As the snippet below shows, Clojure will expand the gensym out to a fairly unique symbol:

 
=> `(foo# foo#)
 
(foo__2230__auto__ foo__2230__auto__)

We use gensyms for total-time and _. The second one might seem a little strange, since we’re just using underscore to indicate that we don’t care about the values in range, just as we did in Scala.

If we don’t make it a generated symbol, Clojure will qualify the symbol in the current namespace, but making it a gensym causes Clojure to generate a unique symbol for it. We demonstrate this below:

 
=> `(_)
 
(mbfpp.functional.ccf.ccf-examples/_)
 
=> `(-#)
 
(-__1308__auto__)

Now let’s examine the heart of avg-time. The following syntax-quoted let statement serves as a template for the code that the macro will generate. As we can see, the meat of the solution is a for statement that wraps the expression in to-time in a function and runs it through time-run the requested number of times:

 
`(​let​ [total-time#
 
(​apply​ ​+​ (​for​ [_# (​range​ ~times)]
 
(time-run (​fn​ [] ~to-time))))] (​float​ (​/​ total-time# ~times))))

To test this out, we can use macroexpand-1 to look at the code it generates, as we do in the following REPL session:

 
=> (​macroexpand-1​ '(avg-time 5 (Thread/sleep 100)))
 
(clojure.core/let
 
[total-time__1489__auto__
 
(clojure.core/apply
 
clojure.core/+
 
(clojure.core/for
 
[___1490__auto__ (clojure.core/range 5)]
 
(mbfpp.functional.ccf.cff-examples/time-run
 
(clojure.core/fn [] (Thread/sleep 1000)))))]
 
(clojure.core/float (clojure.core// total-time__1489__auto__ 5)))
 
nil

Since all the symbols are either gensyms or are fully qualified by their namespace, this can be a bit hard to read! If I’m having trouble understanding how a macro works, I like to manually convert the output from macroexpand-1 into the code that I’d write by hand. To do this, you generally just need to remove the namespaces from fully qualified symbols and the generated part of gensyms. I’ve done so in the following code:

 
(​let
 
[total-time
 
(​apply​ ​+​ (​for​ [n (​range​ 5)] (time-run (​fn​ [] (Thread/sleep 100)))))]
 
(​float​ (​/​ total-time 5)))

As you can see, this cleaned-up output is much simpler to understand. I’ve also found that the process of going through the generated code by hand will help any bugs in the macro to surface.

Discussion

Both Scala and Clojure let us create customized control flow abstractions, but the way they go about doing so is very different. In Scala, they’re runtime abstractions. We’re just writing functions and passing statements into them. The trick is that we can control when those statements are evaluated using by-name parameters.

In Clojure we use the macro system, which takes advantage of Clojure’s homoiconic nature. Macros are a compile-time concern rather than a runtime one. As we saw, they allow us to fairly easily write code that writes code by using syntax quote as a template for the code we want to produce.

Clojure’s approach is more general, but that’s only possible because of Clojure’s homoiconicity. In order to approximate Clojure-style macros in a nonhomoiconic language like Scala, the language would have to provide hooks into the compiler that let a programmer manipulate ASTs and other compiler artifacts.

This is a difficult task, but Scala does have experimental support for this sort of compile time macro in Scala 1.10. Using this style of macro is more difficult than using a Clojure-style macro, since it requires some knowledge of compiler internals.

Since Scala macros are experimental, and since Scala provides other ways to implement Customized Control Flow, we won’t cover them here.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.15.26.221