Lazy and Infinite Sequences

Most Clojure sequences are lazy; in other words, elements are not calculated until they’re needed. Using lazy sequences has many benefits:

  • You can postpone expensive computations that may not in fact be needed.
  • You can work with huge data sets that don’t fit into memory.
  • You can delay I/O until it’s absolutely needed.

Consider the following code, which produces prime numbers using wheel factorization, and the expression that follows it:[19]

 (ns examples.primes)
 ;; Taken from clojure.contrib.lazy-seqs
 ; primes cannot be written efficiently as a function, because
 ; it needs to look back on the whole sequence. contrast with
 ; fibs and powers-of-2 which only need a fixed buffer of 1 or 2
 ; previous values.
 (def primes
   (concat
    [2 3 5 7]
    (lazy-seq
     (let [primes-from
           (fn primes-from [n [f & r]]
             (if (some #(zero? (rem n %))
                       (take-while #(<= (* % %) n) primes))
               (recur (+ n f) r)
               (lazy-seq (cons n (primes-from (+ n f) r)))))
           wheel (cycle [2 4 2 4 6 2 6 4 2 4 6 6 2 6 4 2
                         6 4 6 8 4 2 4 2 4 8 6 4 6 2 4 6
                         2 6 6 4 2 4 6 2 6 4 2 4 2 10 2 10])]
       (primes-from 11 wheel)))))

 (require '[examples.primes :refer :all])
 (def ordinals-and-primes (map vector (iterate inc 1) primes))
 -> #'user/ordinals-and-primes

ordinals-and-primes includes pairs like [5, 11] (11 is the fifth prime number). Both ordinals and primes are infinite, but ordinals-and-primes fits into memory just fine, because it’s lazy. Just take what you need from it:

 (take 5 (drop 1000 ordinals-and-primes))
 -> ([1001 7927] [1002 7933] [1003 7937] [1004 7949] [1005 7951])
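
To see what the wheel contributes to primes, the following sketch lists the candidates it generates before any divisibility test is applied. The names wheel and candidates are illustrative and are not part of examples.primes; the increments are copied from the listing above.

```clojure
;; Illustrative sketch; `wheel` and `candidates` are hypothetical names.
(def wheel
  (cycle [2 4 2 4 6 2 6 4 2 4 6 6 2 6 4 2
          6 4 6 8 4 2 4 2 4 8 6 4 6 2 4 6
          2 6 6 4 2 4 6 2 6 4 2 4 2 10 2 10]))

;; Running sums of the wheel increments, starting at 11.
(def candidates (reductions + 11 wheel))

(take 8 candidates)
;; -> (11 13 17 19 23 29 31 37)

;; No candidate is divisible by 2, 3, 5, or 7...
(every? (fn [n] (not-any? #(zero? (rem n %)) [2 3 5 7]))
        (take 1000 candidates))
;; -> true

;; ...but some candidates are composite (121 = 11 * 11),
;; which is why primes still applies trial division.
(some #{121} (take 50 candidates))
;; -> 121
```

The wheel only skips multiples of the first four primes. Everything else is filtered by the some/take-while test inside primes-from.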

When should you prefer lazy sequences? Most of the time. Most sequence functions return lazy sequences, so you “pay” only for what you use. More importantly, lazy sequences do not require any special effort on your part. In the previous example, iterate and map return lazy sequences, and primes is itself a lazy sequence, so ordinals-and-primes gets laziness “for free.”
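
The following sketch makes the “pay only for what you use” idea observable by counting how often a function runs. The names calls and slow-square are hypothetical and exist only for this illustration.

```clojure
;; Illustrative sketch; `calls` and `slow-square` are hypothetical names.
(def calls (atom 0))

(defn slow-square [n]
  (swap! calls inc)   ; record that the function actually ran
  (* n n))

;; iterate yields an unchunked sequence, so map realizes one element at a time.
(def squares (map slow-square (iterate inc 1)))

@calls
;; -> 0   (nothing has been computed yet)

(doall (take 3 squares))
;; -> (1 4 9)

@calls
;; -> 3   (only the consumed elements were computed)
```

With chunked sources such as range or a vector, Clojure may realize elements in batches of 32, so the count can be higher than the number of elements you consume.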

Lazy sequences are critical to functional programming in Clojure. How to Be Lazy explores creating and using lazy sequences in much greater detail. Additionally, Eager Transformations talks about those cases when you should prefer non-lazy approaches.

When you’re viewing a large sequence from the REPL, you may want to use take to prevent the REPL from evaluating the entire sequence. In other contexts, you may have the opposite problem. You’ve created a lazy sequence, and you want to force the sequence to evaluate fully. The problem usually arises when the code generating the sequence has side effects. Consider the following sequence, which embeds side effects via println:

 (def x (for [i (range 1 3)] (do (println i) i)))
 -> #'user/x

Newcomers to Clojure are surprised that the previous code prints nothing. Since the definition of x doesn’t actually use the elements, Clojure does not evaluate the comprehension to get them. You can force evaluation with doall:

 (doall coll)

doall forces Clojure to walk the elements of a sequence and returns the elements as a result:

 (doall x)
 | 1
 | 2
 -> (1 2)
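
One way to watch this happen is to ask the sequence whether it has been realized, before and after forcing it. This is a minimal sketch, and y is a hypothetical name chosen so that it does not clash with x.

```clojure
;; Illustrative sketch; `y` is a hypothetical name.
(def y (for [i (range 1 3)] (do (println i) i)))

(realized? y)
;; -> false

(doall y)
;; | 1
;; | 2
;; -> (1 2)

(realized? y)
;; -> true

(doall y)   ; elements are cached, so nothing is printed the second time
;; -> (1 2)
```

Because a realized lazy sequence caches its elements, the side effects run at most once no matter how many times the sequence is walked.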

You can also use dorun:

 (dorun coll)

dorun walks the elements of a sequence without keeping past elements in memory. As a result, dorun can walk collections too large to fit in memory.

 (def x (for [i (range 1 3)] (do (println i) i)))
 -> #'user/x
 
 (dorun x)
 | 1
 | 2
 -> nil

The nil return value is a telltale reminder that dorun does not hold a reference to the entire sequence. The dorun and doall functions help you deal with side effects, while most of the rest of Clojure discourages side effects, so you’ll usually not need these functions.
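
The difference in return values can be sketched with a side effect that is easier to check than printing. The names seen and record! are hypothetical and are used only for this illustration.

```clojure
;; Illustrative sketch; `seen` and `record!` are hypothetical names.
(def seen (atom 0))

(defn record! [n]
  (swap! seen + n)   ; side effect: accumulate into the atom
  n)

;; dorun walks the whole sequence for its side effects and discards the elements.
(dorun (map record! (range 1 1001)))
;; -> nil

@seen
;; -> 500500

;; doall performs the same walk but returns the realized sequence.
(count (doall (map record! (range 1 1001))))
;; -> 1000

@seen
;; -> 1001000
```

Both calls run every side effect. The choice between them comes down to whether you need the elements afterward.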
