Where’s My for Loop?

Clojure has no for loop and no direct mutable variables. Clojure provides indirect mutable references, but these must be explicitly called out in your code. See Chapter 6, State and Concurrency for details. So how do you write all that code you’re accustomed to writing with for loops?

Rather than create a hypothetical example, we decided to grab a piece of open source Java code (sort of) randomly, find a method with some for loops and variables, and port it to Clojure. We opened the Apache Commons project, which is very widely used. We selected the StringUtils class in Commons Lang, assuming that such a class would require little domain knowledge to understand. We then browsed for a method that had multiple for loops and local variables and found indexOfAny:

 // From Apache Commons Lang, http://commons.apache.org/lang/
 public static int indexOfAny(String str, char[] searchChars) {
   if (isEmpty(str) || ArrayUtils.isEmpty(searchChars)) {
     return -1;
   }
   for (int i = 0; i < str.length(); i++) {
     char ch = str.charAt(i);
     for (int j = 0; j < searchChars.length; j++) {
       if (searchChars[j] == ch) {
         return i;
       }
     }
   }
   return -1;
 }

indexOfAny walks str and reports the index of the first char that matches any char in searchChars, returning -1 if no match is found.

Here are some example results from the documentation for indexOfAny:

 StringUtils.indexOfAny(null, *) = -1
 StringUtils.indexOfAny("", *) = -1
 StringUtils.indexOfAny(*, null) = -1
 StringUtils.indexOfAny(*, []) = -1
 StringUtils.indexOfAny("zzabyycdxx", ['z','a']) = 0
 StringUtils.indexOfAny("zzabyycdxx", ['b','y']) = 3
 StringUtils.indexOfAny("aba", ['z']) = -1

indexOfAny contains two ifs, two fors (four branches in all), three possible points of return, and three mutable local variables (i, j, and ch). The method is 14 lines long, as counted by David A. Wheeler’s SLOCCount.[18]

Now let’s build a Clojure index-of-any, step by step. If we just wanted to find the matches, we could use a Clojure filter. But we want to find the index of a match. So we create indexed, a function that takes a collection and returns an indexed collection:

 (defn indexed [coll] (map-indexed vector coll))

indexed returns a sequence of pairs of the form [idx elt]. Try indexing a string:

 (indexed "abcde")
 -> ([0 \a] [1 \b] [2 \c] [3 \d] [4 \e])

Next, we want to find the indices of all the characters in the string that match the search set.

Create an index-filter function that is similar to Clojure’s filter but that returns the indices instead of the matches themselves:

 (defn index-filter [pred coll]
   (when pred
     (for [[idx elt] (indexed coll) :when (pred elt)] idx)))

Clojure’s for is not a loop but a sequence comprehension (see Transforming Sequences). The index/element pairs of (indexed coll) are bound to the names idx and elt. The comprehension yields the value of idx only for those pairs where (pred elt) is truthy, that is, anything other than false or nil.
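To see the comprehension on its own, here is a minimal sketch that applies the same destructuring and :when filter to a hand-written vector of pairs. The names pairs and b-indices are ours, chosen only for illustration:

```clojure
;; Hypothetical sample data: index/element pairs like those indexed produces.
(def pairs [[0 \a] [1 \b] [2 \c] [3 \b]])

;; Destructure each pair into idx and elt, keep only the pairs whose
;; element is \b, and yield the index of each one.
(def b-indices
  (for [[idx elt] pairs :when (= elt \b)]
    idx))

b-indices
;; -> (1 3)
```

No counter is ever incremented here; the indices arrive as data, already paired with their elements.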

Clojure sets are functions that test membership in the set. So you can pass a set of characters and a string to index-filter and get back the indices of all characters in the string that belong to the set. Try it with a few different strings and character sets:

 (index-filter #{\a \b} "abcdbbb")
 -> (0 1 4 5 6)

 (index-filter #{\a \b} "xyz")
 -> ()
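The set-as-function behavior that index-filter relies on can also be tried in isolation. The following is a small sketch; the vowels set is a hypothetical example of ours, not something defined elsewhere in this chapter:

```clojure
;; Hypothetical example set, used only for illustration.
(def vowels #{\a \e \i \o \u})

;; Calling a set returns the element if it is a member (truthy) ...
(vowels \e)
;; -> \e

;; ... and nil if it is not (falsey).
(vowels \z)
;; -> nil

;; So a set can stand in wherever a predicate is expected.
(filter vowels "clojure")
;; -> (\o \u \e)
```

One caveat worth knowing: a set that contains false or nil returns that falsey value on a match, so such a set does not work as a predicate.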

At this point, we’ve accomplished more than the stated objective. index-filter returns the indices of all the matches, and we need only the first index. So, index-of-any simply takes the first result from index-filter:

 (defn index-of-any [pred coll]
   (first (index-filter pred coll)))

Test that index-of-any works correctly with a few different inputs:

 (index-of-any #{\z \a} "zzabyycdxx")
 -> 0
 (index-of-any #{\b \y} "zzabyycdxx")
 -> 3

As the following table shows, the Clojure version is simpler than the imperative version by every metric.

 Metric               LOC  Branches  Exits/Method  Variables
 Imperative version    14         4             3          3
 Functional version     6         1             1          0

What accounts for the difference?

  • The imperative indexOfAny must deal with several special cases: null or empty strings, a null or empty set of search characters, and the absence of a match. These special cases add branches and exits to the method. With a functional approach, most of these kinds of special cases just work without any explicit code.

  • The imperative indexOfAny introduces local variables to traverse collections (both the string and the character set). By using higher-order functions such as map and sequence comprehensions such as for, the functional index-of-any avoids all need for variables.
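To check the claim that the special cases just work, we can replay the edge cases from the indexOfAny documentation against the Clojure version. This is a sketch that repeats the three definitions from above so it can be pasted into a fresh REPL. Note that the Clojure functions report a nonmatch as nil rather than -1:

```clojure
(defn indexed [coll] (map-indexed vector coll))

(defn index-filter [pred coll]
  (when pred
    (for [[idx elt] (indexed coll) :when (pred elt)] idx)))

(defn index-of-any [pred coll]
  (first (index-filter pred coll)))

;; nil or empty string: there is nothing to index, so there is no first match.
(index-of-any #{\a} nil)   ;; -> nil
(index-of-any #{\a} "")    ;; -> nil

;; nil predicate: the when guard in index-filter returns nil.
(index-of-any nil "abc")   ;; -> nil

;; Empty set: it matches no element.
(index-of-any #{} "abc")   ;; -> nil

;; No match at all.
(index-of-any #{\z} "aba") ;; -> nil
```

The only explicit check is the when in index-filter; every other case falls out of how first and the sequence functions treat nil and empty collections.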

Unnecessary complexity tends to snowball. For example, the special case branches in the imperative indexOfAny use the magic number -1 to indicate a nonmatch. Should the magic number be a symbolic constant? Whatever you think the right answer is, the question itself disappears in the functional version.

While shorter and simpler, the functional index-of-any is also vastly more general:

  • indexOfAny searches a string, while index-of-any can search any sequence.

  • indexOfAny matches against a set of characters, while index-of-any can match against any predicate.

  • indexOfAny returns the first match, while index-filter returns all the matches and can be further composed with other filters.

As an example of how much more general the functional index-of-any is, you could use code like we just wrote to find the third occurrence of “heads” in a series of coin flips:

 (nth
 (index-filter #{:h} [:t :t :h :t :h :t :t :t :h :h])
 2)
 -> 8
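The other generalizations listed above can be exercised the same way. The following sketch again repeats the definitions so it stands alone; the sample inputs are ours and purely illustrative:

```clojure
(defn indexed [coll] (map-indexed vector coll))

(defn index-filter [pred coll]
  (when pred
    (for [[idx elt] (indexed coll) :when (pred elt)] idx)))

(defn index-of-any [pred coll]
  (first (index-filter pred coll)))

;; Any predicate, not just a set of characters, over a vector of numbers.
(index-of-any even? [1 3 5 6 7])
;; -> 3

;; Any sequence, here a list, returning every matching index.
(index-filter neg? '(3 -1 4 -1 -5))
;; -> (1 3 4)

;; The result is an ordinary sequence, so it composes with other
;; sequence functions: the indices of the first two heads.
(take 2 (index-filter #{:h} [:t :t :h :t :h :t :t :t :h :h]))
;; -> (2 4)
```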

So, writing index-of-any in a functional style, without loops or variables, is simpler, less error prone, and more general than the imperative indexOfAny. On larger units of code, these advantages become even more telling.
