© Alejandro Serrano Mena 2019
Alejandro Serrano Mena, Practical Haskell, https://doi.org/10.1007/978-1-4842-4480-7_5

5. Laziness and Infinite Structures

Alejandro Serrano Mena
Utrecht, The Netherlands

In previous chapters I introduced several of the pillars of Haskell programming: the pure functional paradigm and the strongly typed nature of the language, which nevertheless allows powerful type constructs such as parametric polymorphism and type classes. This chapter will be devoted to understanding the unique evaluation model of Haskell, based on laziness , and the consequences of that choice.

In short, lazy evaluation means that only the necessary parts of an expression are computed, and this is done at the last possible moment. For example, if you have an expression such as head [2+3, 5*7], the multiplication is never performed at runtime because that value is irrelevant for the result of the expression: head uses only the first element in the list. As you may know, this way of performing evaluation is quite different from that of most other programming languages. It opens the door to some interesting new idioms, such as working with infinite and cyclic structures without caring about their special nature.
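You can check this behavior for yourself. In the sketch below (lazyHead is a name chosen here for illustration), the second element of the list is undefined, which would crash the program if it were ever evaluated; taking the head still succeeds because the unused element is never touched.

```haskell
-- The second element would raise an error if evaluated,
-- but head only ever forces the first one.
lazyHead :: Int
lazyHead = head [2 + 3, undefined]

main :: IO ()
main = print lazyHead  -- prints 5
```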

However, as a developer, you also need to be conscious of the trade-offs this model has, especially in the area of memory usage and performance. You will see the most typical problems that arise because of the laziness in Haskell code and learn about Haskell’s strictness to overcome them. Strictness annotations are available in pattern matching and in data declarations. The time and memory profiler that comes bundled with GHC will be an incredible tool to spot these problems; in this chapter you will look at its basic usage.

An Infinite Number of Time Machines

First you are going to see how Haskell can cope with infinite and cyclic structures without imposing any burden on the developer. In this section, you will look at the declaration and usage of these kinds of values in an intuitive way. The next section will discuss how Haskell is able to represent this information so that the code works.

As I have mentioned, some kinds of time machines allow travel to only certain years in history. But the store is known for always having a time machine for travel to any particular point in time. The problem here is that the world may never end, so the set of all time machines is infinite! Let's declare a small data type for holding time machines, with the manufacturer and the year to which travel is permitted.
data TimeMachine = TM { manufacturer :: String, year :: Integer }
                 deriving (Eq, Show)
You can write a function that returns all the time machines from a year n on. Notice that there's no guard or base case that stops the production of more time machines.
timeMachinesFrom :: String -> Integer -> [TimeMachine]
timeMachinesFrom mf y = TM mf y : timeMachinesFrom mf (y+1)
And from there, here are all the time machines made by Timely Inc. from year 100 on:
timelyIncMachines :: [TimeMachine]
timelyIncMachines = timeMachinesFrom "Timely Inc." 100
If you now load the file in the GHC interpreter, you can get the first elements of the list, using the built-in take function. The system doesn’t enter into any kind of infinite loop while evaluating the Timely Inc. machines.
*Chapter5.Infinite> take 3 timelyIncMachines
[TM {manufacturer = "Timely Inc.", year = 100}
,TM {manufacturer = "Timely Inc.", year = 101}
,TM {manufacturer = "Timely Inc.", year = 102}]
You can also try to find the first of those machines that travels after 2018.
*Chapter5.Infinite> import Data.List
*Chapter5.Infinite Data.List> find (\(TM { year = y }) -> y > 2018) timelyIncMachines
Just (TM {manufacturer = "Timely Inc.", year = 2019})
But if you try to compute the length of the list or to find an element that does not exist (in this case, a time machine that travels to year 10), the interpreter will enter into an infinite computation and will never return. To halt the execution in the console, you should press Ctrl+C.
*Chapter5.Infinite> length timelyIncMachines
-- Never stops
*Chapter5.Infinite> find (\(TM { year = y }) -> y == 10) timelyIncMachines
-- Never stops

Somehow, Haskell knows how to treat an infinite list, given that you observe only a finite part of it during runtime. On the other hand, the evaluation of an expression such as length timelyIncMachines involves traversing the entire list, so it doesn’t end.

Infinite lists are useful in some other situations. For example, in a previous chapter you wrote a function that, given an input list, returned a new list of tuples where each element was decorated by its position in the list. For that matter, you used the zip function.
*Chapter5.Infinite> (\list -> zip [1 .. length list] list) "abcd"
[(1,'a'),(2,'b'),(3,'c'),(4,'d')]
But to do so, you had to traverse the list twice: once to get its length and once again to zip both lists. A better way is to remember that zip stops when one of the lists ends, so you can use an infinite list of numbers as the first argument to zip. Let's write a declaration that holds the list of all numbers from 1 on.
allNumbers :: [Integer]
allNumbers = allNumbersFrom 1
allNumbersFrom :: Integer -> [Integer]
allNumbersFrom n = n : allNumbersFrom (n+1)
Now you can write the same function easily.
*Chapter5.Infinite> zip allNumbers "abcd"
[(1,'a'),(2,'b'),(3,'c'),(4,'d')]
Or even better, you can use Haskell's infinite list ranges. The notation [1 ..] describes the list that starts at 1 and continues with all the successive integers (in this case it has no end, because there are infinitely many of them).
*Chapter5.Infinite> zip [1 .. ] "abcd"
[(1,'a'),(2,'b'),(3,'c'),(4,'d')]

Note

The notation [e ..] does not necessarily imply that an infinite list is created, but rather that the list will hold e and all the elements after it. For example, [False ..] is equivalent to writing [False, True] because the ordering in Bool is False < True and there are no more elements in that type.
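You can verify this with a couple of lines; allBools is a name chosen here for illustration.

```haskell
-- [False ..] enumerates every Bool from False on: just False and True.
allBools :: [Bool]
allBools = [False ..]

main :: IO ()
main = print allBools  -- [False,True]
```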

There are even more tricks with infinite lists. Let's look now at an interesting way to define Fibonacci numbers. If you remember from previous chapters, the nth Fibonacci number is defined as the sum of the Fibonacci numbers at steps n-1 and n-2. Let's look at it from a different perspective. Say you already have the list of all the Fibonacci numbers; position n in this list holds the nth Fibonacci number. If you take the tail of that list, the list is moved one step forward, so position n holds the (n+1)th Fibonacci number. And finally comes the magic: if you sum these two lists element by element, you get the Fibonacci numbers moved two positions. This is depicted in Figure 5-1.
../images/316945_2_En_5_Chapter/316945_2_En_5_Fig1_HTML.jpg
Figure 5-1

Properties of the list of Fibonacci numbers, graphically

You can use this remark to define the list of all Fibonacci numbers. The first two elements will be 0 and 1 (this is fixed by the definition). Then, you obtain the rest of the list by adding elements one at a time, with the list moved one element forward. This element-by-element addition is what you get using zipWith (+).
fibonacci :: [Integer]
fibonacci = 0 : 1 : zipWith (+) fibonacci (tail fibonacci)
Obtaining the nth Fibonacci number is now equivalent to obtaining the element in position n-1 (like in C or Java, lists are indexed starting with 0). The (!!) function in Data.List is exactly the function you need.
*Chapter5.Infinite> import Data.List
*Chapter5.Infinite Data.List> fibonacci !! 20
6765
In addition to list ranges and hand-written recursive definitions, the Prelude module includes some built-in functions for generating infinite lists. As a special offer, Timely Inc. supplies an infinite number of time machines to travel to the year 2020. One way to define this could be as follows:
infinite2020Machines :: [TimeMachine]
infinite2020Machines = TM "Timely Inc." 2020 : infinite2020Machines
But another way to do so would be using the repeat combinator, which just creates infinite copies of the same value in a list.
*Chapter5.Infinite> take 3 $ repeat $ TM "Timely Inc." 2020
[TM {manufacturer = "Timely Inc.", year = 2020}
,TM {manufacturer = "Timely Inc.", year = 2020}
,TM {manufacturer = "Timely Inc.", year = 2020}]
In addition to a single value, you can also repeat a sequence of values in order. In a special offer, you may have a set of time machines with a 20 percent discount, with the particular property that you have to sell one for 2005, then one for 1994, then one for 908, and then start again from 2005. You declare this infinite list with cycle.
specialOffer :: [TimeMachine]
specialOffer = cycle [TM m 2005, TM m 1994, TM m 908]
               where m = "Timely Inc."
You can see how values are repeated by looking at the first four values.
*Chapter5.Infinite> take 4 specialOffer
[TM {manufacturer = "Timely Inc.", year = 2005}
,TM {manufacturer = "Timely Inc.", year = 1994}
,TM {manufacturer = "Timely Inc.", year = 908}
,TM {manufacturer = "Timely Inc.", year = 2005}]
The values don't always need to be equal. The iterate function generates values by applying a function to a value to get a second value, then applying the same function to that second value to get the third, and so on. You can picture the infinite list that will be generated as follows:
iterate f x = [ x, f x, f (f x), f (f (f x)), ... ]
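As a quick sketch of iterate at work, here are the powers of two, where each element is obtained by doubling the previous one (powersOfTwo is a name chosen here for illustration):

```haskell
-- iterate (*2) 1 builds the infinite list [1, 2, 4, 8, ...].
powersOfTwo :: [Integer]
powersOfTwo = iterate (*2) 1

main :: IO ()
main = print (take 8 powersOfTwo)  -- [1,2,4,8,16,32,64,128]
```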
This gives you another way to implement Fibonacci. In particular, the fibonacci2 list will hold pairs of values; at position n you find the pair (nth Fibonacci number, (n+1)th Fibonacci number). From one of these tuples, you can build the next element by shifting one position to the left and adding the two numbers to get the (n+2)th Fibonacci number. In code, this translates to the following:
fibonacci2 :: [Integer]
fibonacci2 = map fst $ iterate (\(n,n1) -> (n1,n+n1)) (0,1)
You can use this new function like you used the previous one.
*Chapter5.Infinite> fibonacci2 !! 20
6765

In Exercise 5-1 you will see how infinite lists can even give you a glimpse into history.

Exercise 5-1: The Sieve of Eratosthenes

Eratosthenes was a Greek mathematician from the third century BC. One of his best-known inventions is the prime sieve, an algorithm for computing the list of all prime numbers. It works in the following way:
  • Start with a list of all the numbers from 2 on.

  • Take the first number, in this case 2, and drop from the list of numbers all its multiples, that is, all numbers n such that the remainder of n and 2 is 0.

  • Now take the next number (in this case it will be 3) in the filtered list and repeat the operation: filter out all the multiples of that number.

  • Repeat the process with the first number remaining in the filtered list.

Implement the sieve of Eratosthenes using the techniques outlined in this section. The solution should take the form of a declaration primes :: [Integer], which contains all the prime numbers.
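If you get stuck, one possible sketch follows exactly the steps above: take the first number of the list as a prime, then recursively sieve whatever survives the filtering of its multiples. (The helper name sieve is one choice among many; try the exercise on your own first.)

```haskell
-- One possible solution sketch: keep the first number, drop its
-- multiples from the rest, and repeat on whatever survives.
primes :: [Integer]
primes = sieve [2 ..]
  where sieve (p:xs) = p : sieve [n | n <- xs, n `mod` p /= 0]

main :: IO ()
main = print (take 10 primes)  -- [2,3,5,7,11,13,17,19,23,29]
```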

Lazy Evaluation Model

At this point, you should be convinced that Haskell can indeed work with infinite values (or, at least, with infinite lists). However, it may seem a bit like black magic. Of course, this is not the case: the ability to work in this way is the result of the strategy that Haskell follows for evaluating expressions, which departs greatly from other programming languages. In this section, I will introduce this lazy strategy and point out some of the most common problems with it.

Understanding Evaluation in Haskell

Most programming languages follow a strict evaluation model. In other words, whenever a compound expression is found, it's immediately transformed into a simpler version (maybe including calls to functions or methods) before the larger expression is evaluated. Most importantly, arguments to a function are evaluated before the control flow enters the body of the function. Here's an example of the steps that would be followed in this model to evaluate a simple expression:
head [3+2, 7*5] => head [5, 35]  -- we evaluate the arguments to head
                => 5             -- and then we execute the function itself
Under this kind of evaluation, an expression like head timelyIncMachines would cause an infinite loop because there’s no point at which to stop going further and further in the list. In the following code, I reproduce the first steps of this infinite loop, which will continue as shown with three dots. Take the time to understand this example until you are completely sure about why this example loops.
head timelyIncMachines
     => head (timeMachinesFrom "Timely Inc." 100)
     => head (TM "Timely Inc." 100 : timeMachinesFrom "Timely Inc." 101)
     => head (TM "Timely Inc." 100 : TM "Timely Inc." 101 :
              timeMachinesFrom "Timely Inc." 102)
     => head (TM "Timely Inc." 100 : TM "Timely Inc." 101 :
              TM "Timely Inc." 102 : ...)

In contrast, Haskell tries to evaluate expressions as late as possible. In this example, it won’t initially evaluate the expressions that make the elements in the list. When it finds a call to head, it obtains the first element, which will still be an unevaluated expression, 3+2. Since you want to print the result of this expression on the screen, it will continue only by computing the addition, until it arrives at the same final value, that is, 5. This kind of evaluation is known as nonstrict or lazy.

I have been intuitively using the idea of “evaluating as late as possible.” But that approach is not directly applicable to the example of the infinite list because for getting the head you have to enter the body of timeMachinesFrom, which would then give rise to a loop. The extra bit of information you need to know is that, by default, Haskell evaluates an expression only until a constructor is found. The rest of the value will be left unevaluated. Its spot will be occupied by a placeholder indicating how that specific field can be computed. This placeholder is called a thunk.

Applying what I’ve just described to the evaluation of the head of the infinite list, timeMachinesFrom will just produce a (:) constructor with a thunk for the element and another thunk for the rest of the list. When you apply head to it, you get back the first of the thunks. If you want to show the value on the screen, the thunk has to be unwrapped, and the recipe to create the value that is held in the thunk must be followed. Figure 5-2 shows these steps graphically. The first three transitions are the actual evaluation of thunks. In the fourth state, you arrive at a point where you can evaluate the case expression because you can choose a pattern based on the already evaluated value. The final steps are already plain evaluation of thunks. The next thunk to be evaluated in each step is shown with a bolder frame.
../images/316945_2_En_5_Chapter/316945_2_En_5_Fig2_HTML.png
Figure 5-2

Evaluation of head timelyIncMachines

One important feature of lazy evaluation is that once a thunk has been evaluated, the result is saved, and the thunk is not evaluated again if the result of that expression is needed elsewhere. This is a great feature because it means you pay only once for the cost of evaluating each expression in your application. Furthermore, the pure nature of Haskell also helps in sharing thunks that refer to the same expressions, which means it can reuse the evaluation in some part of the program in other places. For example, Figure 5-3 shows how the memory layout changes when executing (head allNumbers, head (tail allNumbers), tail allNumbers). Since allNumbers is a list, the Haskell runtime environment keeps a reference to the same expression from all the appearances of that value. This is shown in Figure 5-3 as different arrows pointing to the same expression. Exercise 5-2 allows you to try this sharing.
../images/316945_2_En_5_Chapter/316945_2_En_5_Fig3_HTML.jpg
Figure 5-3

(head allNumbers, head (tail allNumbers), tail allNumbers)

Exercise 5-2: Evaluating Fibonacci

Write down the evaluation steps of the expression fibonacci !! 3, where fibonacci is the infinite list of the Fibonacci numbers, as defined previously in this chapter.

This would be impossible in a language that allows printing while computing a value. Let's assume that during its evaluation allNumbers outputs "Natural numbers rule!". If you share the same value for allNumbers, the string is printed only once. But in many languages, including C and Java, you would expect it to be shown three times, once per reference to allNumbers. As you can see, side effects make it impossible to apply these sharing optimizations, which are key to good performance in Haskell programs.

It should be noted that only expressions are shared. This should not be confused with memoizing a function, that is, caching the results for arguments that have already been provided. Here's an example:
(allNumbersFrom 1, allNumbersFrom 2)

Even though allNumbersFrom 1 will call allNumbersFrom 2, the evaluation of allNumbersFrom 2 in allNumbersFrom 1 and in the previous expression will not be shared.
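One way to observe sharing in action is with the trace function from Debug.Trace, a debugging utility that prints a message at the moment a thunk is actually evaluated. In the sketch below (sharedSum is a name chosen here for illustration), the message appears only once even though x is used twice, because both uses point at the same thunk.

```haskell
import Debug.Trace (trace)

-- Both uses of x refer to the same thunk, so the trace message
-- is printed only once, when the addition first forces x.
sharedSum :: Int
sharedSum = x + x
  where x = trace "evaluating the thunk" (2 + 3)

main :: IO ()
main = print sharedSum  -- prints 10 (and the trace message once, on stderr)
```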

One final issue that remains to be explained is how cyclic structures are represented. When a definition refers to itself, Haskell maintains an actual cycle in memory: the self-reference points back to the very same expression. For example, Figure 5-4 shows the evaluation of repeat e.
../images/316945_2_En_5_Chapter/316945_2_En_5_Fig4_HTML.jpg
Figure 5-4

Evaluation of repeat e

Evaluation Strategies

In this section you saw two examples of evaluation strategies: ways in which the computation proceeds and the order in which parts of the expressions are evaluated. In addition to those two, more strategies have been developed.

What I have called strict evaluation is also known as call by value. Sometimes, especially in object-oriented languages, this is changed to call by reference, where you don’t receive values as arguments but boxes holding those values.

Lazy evaluation is sometimes referred to as call by need, which is a special case of the more general strategy of call by name, in which function arguments are not evaluated before the body of the function but are substituted directly. The difference is that, in general, call by name may evaluate the same expression more than once, whereas call by need uses thunks to do it only once.

Problems with Laziness

Laziness is often a blessing, but sometimes it can also be a curse. As usual in computer science, there's a trade-off in lazy evaluation. Delaying evaluation until needed may result in less computation and allows some programming idioms unavailable in other languages. On the other hand, it may create many thunks, filling up memory until the operating system starts paging to disk, which makes the program slower. Let's look at this problem with the help of your old friends, the folds.

Let’s build a picture showing how foldr (+) 0 [1,2,3] is evaluated, showing explicitly the thunks. Each thunk will hold the recipe to convert it into a proper value inside it. This is depicted in Figure 5-5.
../images/316945_2_En_5_Chapter/316945_2_En_5_Fig5_HTML.jpg
Figure 5-5

Evaluation of foldr (+) 0 [1,2,3]

Until the interpreter reaches the final step of foldr, it cannot proceed with the additions. This means that for each element in the list, a new thunk is created. Now you understand why, when requested to interpret the following line of code, the computer starts to sweat and later halts with an error.
*Chapter5.Problems> foldr (+) 0 [1 .. 1000000000]
Depending on your system, you may get one of these two errors:
<interactive>: out of memory (requested 1048576 bytes)
** Exception: stack overflow

Note

At first sight, the culprit could also be the sheer length of the list. However, if you perform some other computation over it that doesn't create thunks in between, such as length [1 .. 1000000000], you can see that the system responds correctly (the actual speed will depend on the capacity of your computer to handle big integers).

The shape of the evaluation using foldr is something like (1 + (2 + (3 + (... + <thunk>)))), so it cannot continue because at each point during evaluation it knows about only one argument to (+). So, if you use parentheses in another way, making the evaluation look like ((((1 + 2) + 3) + ...) + <thunk>), the problem may be gone. You already know how to do it: using foldl.
*Chapter5.Problems> foldl (+) 0 [1 .. 1000000000]
<interactive>: out of memory (requested 1048576 bytes)

But here you face a similar situation: (+) has at each step all of its arguments, but since you do not request the result until the end of the list, many thunks have to be created.

The solution is to force evaluation. Basically, you need to tell Haskell to evaluate the (n+m) thunks before proceeding with the rest of the computation, overriding the default lazy behavior. The seq function in the Prelude module allows you to do so. In its most general form, a forcing expression is written a `seq` b: Haskell ensures that the expression a is evaluated before b. Usually, a is part of the expression b. Let's write an addition operation using this operator that doesn't suffer from memory problems.
sumForce :: [Integer] -> Integer
sumForce xs = sumForce' xs 0
   where sumForce' []     z = z
         sumForce' (y:ys) z = let s = z + y in s `seq` sumForce' ys s

When executing sumForce [1 .. 1000000000], the interpreter may take a lot of time, but no memory problem will arise, and eventually an answer will be given. The idiom x `seq` f x is so common that there is a special operator, $! (strict application), to perform this task. So, you can rewrite the expression s `seq` sumForce' ys s in the previous piece of code as sumForce' ys $! (z+y).
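Using strict application, the whole accumulator-forcing loop can be sketched like this (sumForce2 is a name chosen here for illustration; the behavior matches sumForce above):

```haskell
-- Same strict accumulation as sumForce, written with ($!):
-- the accumulator is forced before each recursive call.
sumForce2 :: [Integer] -> Integer
sumForce2 xs = go xs 0
  where go []     z = z
        go (y:ys) z = go ys $! (z + y)

main :: IO ()
main = print (sumForce2 [1 .. 100])  -- 5050
```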

Note

Once again, you see a familiar fold-like pattern in the previous code. Prelude includes a foldl' function that forces the accumulated value before passing it into the next step. To avoid a memory leak, you could have written the previous example as foldl' (+) 0 [1 .. 1000000000].
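In code, the fold-based version is a one-liner (strictSum is a name chosen here for illustration; a smaller range keeps the example quick to run):

```haskell
import Data.List (foldl')

-- foldl' forces the accumulator at every step,
-- so no chain of (+) thunks builds up.
strictSum :: [Integer] -> Integer
strictSum = foldl' (+) 0

main :: IO ()
main = print (strictSum [1 .. 1000000])  -- 500000500000
```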

Once again, I stress that Haskell evaluates something only until a constructor is found. The fields are left as thunks until some further computation needs the information enclosed by them. This is true also for seq. If you want to be sure that some part of a larger value is evaluated before continuing, you should explicitly get that value and force it (you will see by the end of the chapter that if you don’t want any thunk inside a value, you can use deep strict evaluation with deepseq ). This is enough for this case because the first constructor that will be encountered will be the integer value coming from the addition.

Now that you know about forcing evaluation, you should resist the temptation to use it everywhere you suspect a memory problem. Forcing expressions destroys the lazy nature of the language and may also lead to cases where a previously terminating expression no longer terminates. Think of taking the head of an infinite list: if you make Haskell force the entire list, it will never reach the end, thus entering an infinite computation. If you suspect a memory leak, you should first use profiling to find the correct spot and then think carefully about whether using seq will hurt the applicability of your functions to the kinds of arguments that are expected.

Pattern Matching and Laziness

As you can see, this interplay of delays using thunks and forcing their evaluation is important. For that reason, you should have a clear idea of when computation takes place. In addition to explicit seq or ($!), another place where the compiler or interpreter needs to evaluate thunks is on pattern matching; it needs to evaluate up to the point that it knows which of the corresponding branches has to be taken.

There's a GHC extension to patterns, called BangPatterns, which allows you to force the evaluation of some parts of a pattern. Concretely, you can write ! before any part of the pattern; when a match is tried against that pattern, the expression at that point is evaluated up to a constructor, and only then is the match attempted. For example, you can write a function that adds all the years from a list of time machines, using this syntax both to force the addition at each step and to ensure that the year in each time machine is also evaluated (so the addition does not receive a thunk as its second argument):
{-# LANGUAGE BangPatterns #-}
sumYears :: [TimeMachine] -> Integer
sumYears xs = sumYears' xs 0
  where sumYears' []            z = z
        sumYears' (TM _ !y :ys) z = let !s = z + y in sumYears' ys s

Interestingly enough, and because of this evaluation forcing in pattern matching, Haskell also includes a way to delay evaluation in matching phases: an irrefutable pattern. Matching against it never fails, but it's destructured only when some of its constituent parts are needed. One use case for irrefutable patterns involves a function that you know will succeed for the inputs you care about. For example, say you are finding an element in a list and you have made sure beforehand that the element exists, so find will always return a Just value. In that case, you can delay the computation of find a bit and evaluate it only when you need the constituent value.

For a more explicit example, suppose you have this function:
lengthyOperation = if lengthyPredicate then Just something else Nothing
Say you know that the lengthyPredicate will be true in some situation. If you write a regular matching as follows, then you will force the lengthyOperation to be evaluated just to choose the branch:
case lengthyOperation of
  Just something -> ...
  Nothing        -> ...
But since you know that the first one will be the selected one, you can delay the computation a bit more using an irrefutable pattern, like so:
case lengthyOperation of
  ~(Just something) -> ...
Remember that a pattern such as that never fails. So, if you come to a situation where lengthyOperation returns Nothing and you use something inside the body of the match, you will get an error.
Prelude> case Nothing of ~(Just e) -> "hello, " ++ e ++ "!"
"hello, *** Exception: Irrefutable pattern failed for pattern (Just e)

Note

Irrefutable patterns are rarely used, but in some cases they are the key to code that performs well. You shouldn't worry too much about understanding all the cases where they may be applicable, but knowing of their existence may come in handy, especially when reading the source of some built-in functions.

Strict Functions

It won't be long until you read in the documentation of some package that a function is strict in one or several of its arguments. At a high level, this means the argument will have to be evaluated if it is still in thunk form, so you should take care to provide in that position an expression that won't lead to nontermination.

Formally, in Haskell there is a canonical value called undefined that represents all those computations that don't end. Because undefined never returns, it can be given any type you want, so you can have undefined :: a. Incidentally, this typing makes undefined a perfect placeholder for code you haven't yet written, such as when you want to check that your current code passes type checking.

A function f is then called strict on its argument if f undefined = undefined; that is, if given a nonterminating argument, the function itself does not terminate. One example of a strict function is head. But a function defined as g x = 1 isn’t, because, given any argument, it returns 1.

Intuitively, a function being strict in an argument means that it needs to inspect that argument to produce its result; a non-strict function can return without ever looking at it. The way a function is strict may be subtler than in the previous examples. For example, head undefined is undefined, but head (1 : undefined) isn't.
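You can check both claims directly in a small sketch: g never inspects its argument, so even undefined is acceptable, while head needs only the first cons cell of its argument.

```haskell
-- g is not strict in its argument: it returns 1 without looking at it.
g :: Int -> Int
g _ = 1

main :: IO ()
main = do
  print (g undefined)                     -- 1: the argument is never forced
  print (head (1 : undefined :: [Int]))   -- 1: only the first cell is needed
```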

Profiling with GHC

The GHC compiler can generate statistics about runs of your program to give you more insight into where computation effort is spent. In this section you will focus on two kinds of profiling: time profiling, which gathers information about the amount of time spent in each of the functions in the system, and memory profiling, which allows you to look at the values holding the largest amounts of memory. The profiling output in GHC assigns time or memory to so-called cost centers, and the information and summaries are always related to them. By default, cost centers are assigned to functions, but more can be added using annotations.

To use profiling, you can no longer build the code as a library. You need to build a full executable that could be run from the command line. So, let’s first look briefly at the modifications you need to do in the Cabal file to include an executable.

Each Haskell project may include several executables, identified by a name, and with a reference to the file that includes the entry point of the application as a function main of type IO (). For now, it’s only important to know that IO allows you to perform side effects such as printing onto the screen; you will take a closer look at this type in Chapter 9. Each of these executables is defined as a new block in the Cabal file. The name is specified after the executable keyword that heads the block, and the file containing the entry function is declared inside the main-is property. It’s important to note that the main-is property needs a reference to the file itself (including the extension), not a module name like other properties. The auxiliary modules are defined inside the other-modules property, and dependencies are specified in the build-depends property, as in library blocks.

For example, the following declaration includes an executable profiling-example whose main function is defined in the Main.hs file and that uses the Chapter5.Annotations module. The only dependency is the base package.
executable profiling-example
  build-depends:   base >= 4
  hs-source-dirs:  src
  main-is:         Main.hs
  other-modules:   Chapter5.Annotations
To check that it works, create the Main.hs file. It’s mandatory for the module name that defines an entry point to be called Main and to contain the main :: IO () function, so the file should be named accordingly. In this example, the executable just prints "Hello!" using the putStrLn function.
module Main where
main :: IO ()
main = putStrLn "Hello!"
Then, call cabal new-build or stack build in the project folder. Now you can run it in the console. Both tools provide a shortcut for running an executable in the package being built.
$ cabal new-run profiling-example  # for Cabal
$ stack exec profiling-example     # for Stack
Hello!
To enable profiling, you must tell the compiler that’s what you want to do. In Cabal you do so by running cabal new-build --enable-profiling. This creates a new cabal.project.local file with a single line.
profiling: True
Setting up Stack is quite similar. You need to include the following lines in the stack.yaml file, which also specifies the resolver of the project.
build:
  library-profiling: true
  executable-profiling: true
Then you need to run cabal new-build or stack build to apply the new configuration options. Once your executable is compiled with runtime options support, you specify the runtime options for your program in the command line between +RTS and -RTS. Time and allocation profiling is specified with -p. Thus, for calling the previous executable while gathering time information, you need to run the following:
$ cabal new-run profiling-example -- +RTS -p -RTS  # for Cabal
$ stack exec profiling-example -- +RTS -p -RTS     # for Stack
Hello!
At first sight, nothing has changed in the execution. But if you look carefully at the project folder, a new file called profiling-example.prof has been created. Since the running time of the program is near zero, no interesting profiling output is generated, so let's instead profile a program that computes the factorial of 100,000 and outputs it to the screen. Change Main.hs to read as follows:
module Main where
main :: IO ()
main = putStrLn $ show result
result :: Integer
result = foldr (*) 1 [1 .. 100000]
Now run cabal new-build and the executable in profiling mode. The contents of profiling-example.prof should be similar to the following:
        total time  =        5.68 secs   (5680 ticks @ 1000 us, 1 processor)
        total alloc = 9,981,322,816 bytes  (excludes profiling overheads)
COST CENTRE MODULE  %time %alloc
result      Main     94.1   99.7
main        Main      5.9    0.3
                                                 individual    inherited
COST CENTRE MODULE                 no.  entries  %time %alloc  %time %alloc
MAIN        MAIN                   42        0    0.0    0.0  100.0  100.0
 main       Main                   85        0    1.2    0.0    1.2    0.0
 CAF        Main                   83        0    0.0    0.0   98.8  100.0
  result    Main                   86        1   94.1   99.7   94.1   99.7
  main      Main                   84        1    4.7    0.3    4.7    0.3
 CAF        GHC.IO.Encoding        79        0    0.0    0.0    0.0    0.0
 CAF        GHC.Conc.Signal        77        0    0.0    0.0    0.0    0.0
 CAF        GHC.IO.Handle.FD       71        0    0.0    0.0    0.0    0.0
 CAF        GHC.IO.Encoding.Iconv  66        0    0.0    0.0    0.0    0.0
 CAF        GHC.Show               63        0    0.0    0.0    0.0    0.0

The information is divided into three parts. The first part is a summary of the total time and memory used by the program. The second part shows the cost centers (in this case, functions) that contribute the most to that cost. In this case, it’s obvious that result takes most of the time and memory, and main uses a smaller part of it. The last part is a more detailed view that shows the information about time and allocation in a tree. In this case, a lot of internal functions dealing with file handles and encoding are shown, but in a larger application fewer of these will be shown because they will be buried in a larger call stack.
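When the automatically generated cost centers are too coarse, GHC lets you introduce your own with an SCC pragma; each named center then appears as its own row in the .prof report. Here is a minimal sketch of the factorial program with two such centers (the names buildList and multiply are purely illustrative):

```haskell
module Main where

main :: IO ()
main = print result

result :: Integer
result = {-# SCC "multiply" #-}
           foldr (*) 1 ({-# SCC "buildList" #-} [1 .. 100000])
```

After rebuilding with profiling enabled and running with +RTS -p -RTS, the report breaks down how much time and allocation each of the two named expressions is responsible for.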

However, the most interesting part is heap profiling, which allows for the production of a graph stating the consumption of memory throughout the program. The most important ways to run heap profiling are breaking it down by cost centers, which is specified using the -h runtime option, and breaking it down by type, whose option is -hy. Other ways to group consumption, such as by module, by closure, and so on, are available but not used often. If you run the previous program with any of those options, a new file is produced, called profiling-example.hp. This file is raw information; to convert it to a graph, you have to use the hp2ps tool that comes bundled with GHC. Running hp2ps -c profiling-example.hp will produce a (color) PostScript file that can then be viewed.

Figure 5-6 shows the output of running profiling-example -- +RTS -h -RTS at the command line and then processing the resulting heap profile.
../images/316945_2_En_5_Chapter/316945_2_En_5_Fig6_HTML.jpg
Figure 5-6

Graphical output of heap profiling by cost centers (run with -h)

You can see here that there is a big increase in memory usage in the first tenths of a second, shown in orange. This reflects the large creation of thunks I spoke about in the “Problems with Laziness” section. Then, the memory decreases as the thunks are evaluated. A second spike, shown at the extreme right edge of the graph, highlights the increase in memory when the resulting number has to be converted to a string.

The breakdown by types can illustrate the way memory is used in the system. Now run the application with the -hy runtime option and produce the graph. The result looks like Figure 5-7.
../images/316945_2_En_5_Chapter/316945_2_En_5_Fig7_HTML.jpg
Figure 5-7

Graphical output of heap profiling by types (run with -hy)

As you can see, at the beginning most of the memory belongs to elements of type Integer, which corresponds to those thunks I talked about. As you go further in the execution, the ARR_WORDS type uses more memory. This encompasses the memory used by basic types such as evaluated integers (you see that it grows as the number gets larger) and strings.

Let's profile another version to see how the profiling output confirms my initial thoughts on the problem of memory exhaustion. Replace the code to use foldl' (imported from Data.List) instead of foldr.
result :: Integer
result = foldl' (*) 1 [1 .. 100000]
The graph in Figure 5-8, obtained by running this new version through heap profiling by types, shows that now intermediate thunks are not created because evaluation is forced at each step, so you don’t see the initial spike of Integer values. Memory is used only in ARR_WORDS.
../images/316945_2_En_5_Chapter/316945_2_En_5_Fig8_HTML.jpg
Figure 5-8

Graphical output of heap profiling by types, foldl' version

As you have seen, profiling is a great tool for spotting problems in both time and memory because it lets you focus on the points that are really wasting those resources instead of guessing where the leak comes from. I suggest profiling some larger programs to become comfortable with the profiler and its output so that you can be productive with the tool in the near future.

Strictness Annotations

This section gives more insight into GHC internals. You may safely skip this section on a first reading, but you should return to it later because you will greatly benefit from this information. This section will help you understand the memory and time used by your program, and it gives you the tools to improve your application in those respects.

In general, you can think of a value in Haskell as being represented in memory as some header stating its type and the constructor used to build it, followed by references to each of the fields composing that value. Basic types, such as integers or characters, deviate from this layout and are represented just by the header and the value itself. One Individual client, as defined in the previous chapters, would then conform to the representation shown in Figure 5-9.
../images/316945_2_En_5_Chapter/316945_2_En_5_Fig9_HTML.jpg
Figure 5-9

Representation of Individual 1 (Person "Andrea" "Blacksmith")

Remember that before being completely evaluated, expressions in Haskell are represented by thunks. Figure 5-9 shows the memory representation when an expression is completely evaluated. If some parts of it were still to be computed, the references will point to thunks containing the code to be executed.

This representation is flexible but carries some performance penalties. First, you may be creating thunks for values that you know will be needed later or that must stay evaluated to keep the data type efficient. For example, if you implement a new kind of list that stores its length, it makes little sense to compute that length lazily: at the moment you query it, a long chain of suspended computations would have to run, and performance would suffer.

In that case, you want the length to be a strict field. Following the same syntax of bang patterns in matching, strict fields are declared by writing ! before the type of the field itself. As a result, every time a new value of that type is created, the expressions in the strict positions will be forced to evaluate in the same fashion as if you had included an explicit seq. A possible implementation of your lists with length could be as follows:
data ListL a = ListL !Integer [a]
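To see the strict field in action, consider how a cons operation for this type would keep the length up-to-date. Because of the bang, the addition is forced as soon as the new ListL value is built, so querying the length never has to chase a chain of thunks (consL and lengthL are hypothetical helpers, not part of any library):

```haskell
data ListL a = ListL !Integer [a]

consL :: a -> ListL a -> ListL a
consL x (ListL n xs) = ListL (n + 1) (x : xs)  -- n + 1 is evaluated here, not delayed

lengthL :: ListL a -> Integer
lengthL (ListL n _) = n  -- constant time: the count is already a plain number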

The memory representation of values also makes generous use of references to associate values with field positions. This means every time you want to access a field in a value, you need to traverse one reference. Once again, this is flexible and allows you to have potentially extensive structures in memory. However, it can be overkill for accessing small fields, such as integer ones, whose value could be directly encoded in the space that is taken by the reference (the size of a pointer in the target architecture). A field in that situation is said to be unpacked.

Unpacking fields is a special feature of the GHC compiler, and it’s declared via an {-# UNPACK #-} annotation right before the field declaration. For example, you could decide to unpack the identifiers of all the constructors of the Client data type to make it more efficient.
data Client = GovOrg     {-# UNPACK #-} !Int String
            | Company    {-# UNPACK #-} !Int String Person String
            | Individual {-# UNPACK #-} !Int Person
            deriving Show

Note that not all fields can be unpacked; it depends on the type of field. Basic types, such as integers or characters, are eligible. Other data types can be used only if they consist of just one constructor and all their fields are also unpacked. This makes it possible to unpack a tuple of types that are unpackable themselves but forbids unpacking a list. Trying to unpack a String field will also produce a warning since it’s just a list of Char.

In many cases, you should consider whether for your particular application you prefer a lazier or a stricter implementation of your data structures. Laziness delays the moment of evaluation and allows you to compute only what is strictly needed for the program but has the trade-offs of larger memory consumption and more uncertainty over when the evaluation will take place.

Some packages, such as containers, provide both lazy and strict implementations of the data structures. For example, you have both implementations of maps living in different modules: Data.Map.Lazy and Data.Map.Strict. By default, the module Data.Map uses the lazy versions. The difference in this case, stated in the documentation, is that in the strict version both keys and values are forced before being saved in the map, whereas in the lazy version this is done only for keys.
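You can observe this difference directly: a bottom value stored in a lazy map is harmless until you look it up, whereas the strict insert forces it right away. A small sketch of this behavior:

```haskell
import qualified Data.Map.Lazy   as ML
import qualified Data.Map.Strict as MS

lazySize :: Int
lazySize = ML.size (ML.insert 1 (error "never evaluated" :: Int) ML.empty)
-- Evaluating lazySize is fine: the value thunk is stored but never forced.

strictSize :: Int
strictSize = MS.size (MS.insert 1 (error "forced on insert" :: Int) MS.empty)
-- Evaluating strictSize throws, because MS.insert forces the value.
```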

Even Deeper

In some cases you need to evaluate an expression until no thunks are left. For that purpose, the deepseq package provides, in its module Control.DeepSeq, the deepseq and ($!!) functions, similar to seq and ($!), respectively, but which also force all subexpressions instead of stopping at the outermost constructor.
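The difference between the two is easiest to see on a nested value: seq evaluates only to the outermost constructor, so an error hidden inside a pair survives, while deepseq descends into the components.

```haskell
import Control.DeepSeq

shallow :: ()
shallow = (1 :: Int, error "hidden") `seq` ()
-- Fine: seq stops at the pair constructor, so the error thunk is never forced.

deep :: ()
deep = (1 :: Int, error "hidden" :: Int) `deepseq` ()
-- Throws "hidden": deepseq forces both components of the pair.
```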

If you want your data types to support deep evaluation with deepseq, you have to make them instances of the NFData type class. Implementing the instance is quite easy; you just need to force all the fields and then return (). Here's an example for Client:

import Control.DeepSeq

instance NFData Client where
  rnf (GovOrg i n) = i `deepseq` n `deepseq` ()
  rnf (Company i n (Person f l) r) =
    i `deepseq` n `deepseq` f `deepseq` l `deepseq` r `deepseq` ()
  rnf (Individual i (Person f l)) = i `deepseq` f `deepseq` l `deepseq` ()
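As a side note, writing rnf by hand scales poorly as types grow. The deepseq package can also derive the instance through GHC generics: an empty instance declaration picks up a default rnf. A sketch, assuming Person and Client derive Generic (this requires the DeriveGeneric extension):

```haskell
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics (Generic)
import Control.DeepSeq

data Person = Person String String deriving Generic
data Client = GovOrg Int String
            | Company Int String Person String
            | Individual Int Person
            deriving Generic

instance NFData Person  -- empty instances: rnf comes from the Generic default
instance NFData Client
```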

The same warnings for forcing with seq apply to deepseq, but they are even stronger because the latter forces even more evaluation to take place.

Summary

In this chapter you looked at several ways to use the evaluation model of Haskell, based on laziness.
  • You saw how lazy evaluation allows you to work with seemingly infinite or cyclic structures, making for elegant patterns in the code.

  • I explained the lazy evaluation model, including the special role of thunks in delaying evaluation until a value is needed while increasing sharing of evaluated computations.

  • You looked at the shortcomings of lazy evaluation, the most important being increased memory consumption and uncertainty about the moment in which a thunk will become evaluated.

  • You learned how to annotate the code using seq, or strictness annotations in both pattern matching and data types, to work around these problems.

  • The GHC profiler is a powerful tool for detecting time and space leaks. I covered its basic usage and the interpretation of its results in this chapter.
