At this point you know many of the features and intricacies of the Haskell language and many of its libraries. This chapter won’t teach you any more about the language but rather will focus on some tools that help you in the process of coding libraries and applications. These tools support good engineering practices within Haskell.
An important and often overlooked practice as you program is to write good documentation for the source. Even though Haskell is a high-level language, there’s always room for documenting the purpose of each data type and function in your modules. Haddock is bundled with the Haskell Platform, and it creates beautiful documentation pages from special comments in your code.
The main set of tools I will cover is related to testing. Automatically testing your code ensures that any change in the code does not affect the behavior that it should have. Programmers today see unit testing as an essential activity, and you will learn how to create unit tests in Haskell with HUnit.
Haskell was the first language to support randomized testing in addition to unit testing, with its QuickCheck library. This form of testing generates random calls to your functions and ensures that certain properties of the output are satisfied, freeing the tester from the task of writing hundreds of test cases. Another tool called SmallCheck builds on this idea but performs exhaustive testing . This means a certain piece of code is called not with random inputs but with all possible inputs up to a size limit.
Testing can show that a bug exists in your implementation but testing along does not give you complete assurance of their absence. The next step is to formally verify that the implementation satisfies the properties that are requested. You’ll get a glimpse of how this can be done using LiquidHaskell, an extension of Haskell with refinement types.
Documenting Binary Trees with Haddock
Even without any extra comments, you can generate a summary of the modules in the package by running cabal haddock or stack haddock in the root of your project. A set of HTML pages will be created in the dist/doc/html folder in your package if using Cabal, or .stack-work/install if using Stack. If you open them, you’ll see the documentation in the same format in which the documentation from Hackage is presented. However, the information there is rough: only a list of modules and, inside each module, a list of all the definitions. It would be nice to incorporate comments and examples into each of the functions, data types, and type classes.
The way you can convey extra information to be included in those HTML pages is via documentation comments . Documentation comments are written in the same file where the code is written, which helps you keep both code and documentation in sync. They follow the same syntax as regular comments, but they start with a special marker. This idea is found in other documentation systems such as Javadoc and Doxygen.
Caution
Haskell single-line comments start with -- followed by a space. You should respect this syntax in documentation comments and use a space between -- and either | or ^, as the example shows.
If you refer to other declarations by writing it in single quotes, Haddock will generate a link to that place. This functionality is also available for linking to modules; in that case, you need to write its name in double quotes.
Unnumbered lists are declared by prefixing each of the items by *. In the case of numbered lists, you can choose between writing n. or (n) for the nth element. Be aware that each item should be surrounded by blank lines.
You can introduce pieces of code by prefixing each line with >. Additionally, Haddock supports the declaration of properties, which are prefixed by prop> and of interpreter examples, in which the input is written after >>> and the output immediately after.
If you look carefully at Figure 15-1, you will notice some hyperlinks labeled as Source. The target of each of those links is the definition of that element in a colored version of your source. You can get them in your own documentation by running cabal haddock --hyperlink-source instead of plain cabal haddock. Notice that you need to have the package hscolour in your system for this feature to work.
HLint
A Haskell compiler checks whether your source code complies with the syntax and semantics of the language. However, there are cases where your code is correct but it’s not easy to read or contains some fragment that is usually seen as confusing and may lead to errors in the future. HLint is a tool that suggests changes to Haskell source in order to make the code much clearer.
HLint covers many style issues in code. HLint is also aware of laziness, so it ensures that your code never evaluates more – which could have a negative performance impact – when suggesting a change. Furthermore, HLint can apply the suggestions directly to your code.
Unit Testing with HUnit
Unit testing is a methodology for checking that your functions perform their work as intended. Following these methods, the programmer writes a set of test cases, with each of them executing a piece of code and describing the expected result. These tests can be run at the developer’s discretion or can be used to check automatically that code that was working fine doesn’t break after changes (this is called regression testing ).
To provide a unified interface to the various testing methods available, the Haskell community has created several test frameworks. In this section I’ll introduce the Tasty library, but many others are available. HTF, test-framework, and HSpec are other interesting options. The good news is that any of these test frameworks can be integrated into your Cabal project and run through the same command-line option.
Declaring Tests in Cabal
In principle, you could create the application that runs your tests as a mere executable in your package. However, Cabal embodies the idea of tests in its core and has special support for them. In this section you’ll learn how to add a test executable to your Cabal project.
Inside a Cabal file, a test is defined in its own stanza, like library and executable parts were. Thus, the declaration of a test inside the project file is headed by test-suite followed by the name of the test suite. The key field to include in that stanza is the type one. Cabal has extensible support for different kinds of test providers that can run the tests in the project in different ways.
The focus in this section will be on type being equal to exitcode-stdio-1.0, which is the best-supported option at the moment of writing. By specifying that test provider, you declare to Cabal that your tests should be compiled as an executable file. The file will run to execute the tests and will tell Cabal whether the tests were successful via its exit code. The good news is that all the test frameworks in Hackage are knowledgeable about this way of running them and make it easy to integrate them as a Cabal test.
Notice that I’ve declared that the source files for my tests will reside in the test directory. It’s a good practice to keep source code and tests in different folders to make it easier to find the place where you include new functionality or new checks of behavior.
Now you’re ready to run the tests that you’ll declare in the next section through the cabal or stack command-line tools. This functionality is quite useful because it centralizes all the project information in a single place.
Writing Unit Tests
Note
TestTree is one of the types used in the Tasty framework. It’s not related to the binary trees you’ve been using as examples throughout the chapter.
In this case there’s a top-level group called “Tasty Tests” that has another group called “HUnit Tests” inside. This nested group is where the code puts the test case defined previously.
As you can see, the file shows the hierarchical structure declared in the source file.
HUnit is a small library, but it embodies the most common uses of unit testing. In Exercise 15-1, you will write some extra tests to check the implementation of binary trees. Following the exercise, you will learn a little about a framework named HSpec.
Exercise 15-1.Unit Testing Binary Trees
Write test cases to check that once you add an element to a binary tree, the size (the number of internal nodes) is increased by 1. Additionally, write test cases to check that deleting an element indeed removes it from the binary tree.
Hspec
As you can see, Hspec embodies a more textual style for describing tests. The aim is to be closer to the language used in the specification phase, making it easier to write the tests that check that a further implementation is correct.
Randomized Testing with QuickCheck
Unit testing is the most common approach to checking that your implementation satisfies the requirements. However, unit testing imposes quite a big load of work. For each requirement, you need to create several test cases, which you must ensure cover a big enough number of possible scenarios. Typically, you include test cases for extreme values, empty collections, invalid data, and so on.
Instead of having to think about all this by yourself, it would be great if you could express your specifications at a higher level and then test whether those properties hold for your program. Haskell’s usage of higher-order functions makes it easy to express those properties. For example, the property “reversing a list twice is the same as leaving it as is” can be written as reverse . reverse = id. The bad news is that doing so automatically is a task that’s impossible to achieve for every single property in the wild (but you’ll see in the next section how formal verification can help you in those cases where an entire proof of correctness is possible).
QuickCheck tries to bring these two worlds together. Using this library, you express how your program should behave in the form of high-level properties. Then, the tool creates a lot of random values that your program is tested against. If you use a sufficiently large set of tests (hundreds or thousands), your confidence in that property holding is increased. Furthermore, QuickCheck includes a shrinking component that is used when a value that doesn’t satisfy your specification is found. Its task is trying to make that value as small as possible, helping you in reproducing and tracking the source of the bug.
Testing List Properties
As you can see, the definition starts similarly to an HUnit test. You call the testProperty function and give a name to the property. Then, you define the body of the property as a function, which should hold for every possible value of the arguments of that function. In other words, you can think of the property as having “for all” before each of the arguments to the body function.
Note
In some cases you will need to restrict the types of the arguments to QuickCheck properties. The reason is that it’s not possible to create random values for every possible type in Haskell. For example, you cannot generate a list of functions. Thus, in the example, the code works only for lists of integers, even though the reverse' function is applicable to any kind of list.
The message is telling you that after trying three times, it was able to find an example where the property does not hold (although given that QuickCheck tests in a random fashion, the number in the counterexample may be different in your case). Then, it shrank the list until it made the example small. Indeed, a singleton list is the smallest example where the reverse' function fails.
Test your understanding of QuickCheck by doing Exercise 15-2.
Exercise 15-2. Quickchecking Reverse
Add more QuickCheck properties on the reverse' function that was defined at the beginning of the section. For example, applying it twice returns the original result (reverse' . reverse' == id), or the head of the list is the last element of the reversed list.
After checking that the properties do not hold for the initial implementation, change the code to be correct. Does the new version pass all the tests?
Testing Binary Tree Properties
You can generate values of other data types by calling their corresponding Gen instance. In the case of binary trees, this should be done for the elements in the nodes.
You can make a random choice between several generators via the oneof function, which takes a list of them and at runtime gives back one of the specified generators. A more refined version of oneof is frequency, which in addition to the possible outcomes includes a relative frequency in which each generator should appear. You’ll see how to use this latter function to decide at each point whether to generate a node or a leaf of the tree.
If you need your values to satisfy a certain condition, you can add a call to suchThat. This function will make sure that the given predicate holds for the random values.
One of the most important properties of a random generator is that it should stop and produce a value at some point. QuickCheck refines that idea and asks the generators to give back values of a certain maximum size . Think of the size as some intrinsic measure of “how big” a value is. The length of a list, the number of leaves in a tree, and the absolute value of an integer are some examples of these sizes. If your data type admits this idea, you can get the information of the wanted size via the sized QuickCheck function.
However, if you think about it, this is not a correct property to check. It may be the case that the tree t did have a copy of the element n, so inserting and deleting the item will leave that initial copy. Thus, the result of treeFind will not be Nothing, but Just n.
These examples show how QuickCheck allows you to get a higher-level view on the tests. Instead of focusing on single test cases, you declare properties, which are closer to the original specification.
SmallCheck
QuickCheck uses random data to test properties of your code. SmallCheck follows a similar spirit, but instead of creating random values, it tests properties with every possible value from a set. For example, you could check the reverse . reverse == id property on every list up to a certain length.
The public interface of SmallCheck is almost the same as that of QuickCheck. It’s also integrated in almost every Haskell test framework, including Tasty and Hspec
Formal Verification with LiquidHaskell
Testing is a very useful tool to gain confidence about your code performing correctly. But even when using randomized testing, your code only runs on a subset of all possible inputs. There are other techniques which give you full guarantees over your code, these are collectively known as formal verification. Although these techniques differ in many conceptual and technical aspects, in all cases the workflow consists on describing the intention of your code in some formal language, and the running some tool (maybe completely automatic, maybe with some human intervention) to verify that the code complies with that specification.
The type system in Haskell can be seen as a form of formal verification, in which types are the formal language and the compiler automatically checks for compliance. Stronger type systems, like dependent types, allow for more invariants to be express and checked. In this section we are going to introduce LiquidHaskell, which extends regular Haskell with refinement types . In a nutshell, a refinement type is the combination of a type (in the usual sense) with a predicate over the values. For example, “integers with are greater than 0” combines the type “integer” with the predicate “greater than 0”. The main selling point of LiquidHaskell is that you do not need to learn other language to implement your code, only a refined version of types to describe your specification.
Installing LiquidHaskell
Obtaining LiquidHaskell is as simple as running cabal install liquidhaskell in the command line. Unfortunately, a successful installation does not mean that everything is in place to verify your programs. In addition you need Z3, a so-called SMT solver, which is used internally by LiquidHaskell to verify the assertions. At the moment of writing you can get binaries for Z3 at https://github.com/Z3Prover/z3/releases . Regardless of the operating system, both liquid – LiquidHaskell’s main executable – and z3 need to the on your PATH.
At the moment of writing, LiquidHaskell is not compatible with the latest version of GHC, only with the 8.4 series. If you are using a newer version of the compiler, my suggestion is to follow the instructions in LiquidHaskell’s repository1 and install LiquidHaskell using Stack.
You can verify that the function obeys its specification by running liquid File.hs in the command line. If everything is fine the text SAFE appears in your screen. You can try to break this invariant and see how LiquidHaskell no longer accepts the code.
Note
The other annotation instructs LiquidHaskell for not checking whether your functions terminate for all inputs. This is another of the interesting features of LiquidHaskell (in fact, very few of your functions should loop indefinitely) but we shall not focus on it.
This error message tells you that the second branch of the function does not obey its specification. Indeed, the size of the tree is not increased in that case.
Caution
One of the weak points of LiquidHaskell is the poor explanations given whenever the specification is not followed. In most cases you just get a pair of expected versus actual refinement types, but no indication of why the latter is not good enough.
The error message has been left out for conciseness, but it points directly towards the third equation in treeInsert. Indeed, it your read instead Node y l (treeInsert x r), in the original version the value x is lost if it is greater than y.
Describing Binary Search Trees
To check that you understand how LiquidHaskell works, Exercise 15-3 asks you to implement some more functions over trees.
Exercise 15-3. Verifying Binary Trees
This function should combine both trees into a single one with all the elements. Think about properties such as: how does the elements and size look like?
This was just a brief introduction to LiquidHaskell. Some details we have left without treatment are LiquidHaskell’s ability to check termination (you never go into an infinite loop) and totality (you cover all possible cases) of your functions. Also, the properties we checked upon trees where quite simple, but there is much more you can do. One important subset is dependent properties, in which the refinement of an argument depends on the value of the previous one. Finally, LiquidHaskell gives you the power to not only automatically check for compliance with respect to a specification, but also to prove manually properties about your program in an equational reasoning style.
Summary
Haddock is the preferred tool for associating pieces of documentation to Haskell declarations and creating beautiful visualizations for it.
Cabal has support for including tests in packages. You learned how to declare a test-suite stanza and got familiar with several Haskell test frameworks: Tasty and Hspec.
The HUnit library for unit testing was introduced.
You read about the benefits of randomized testing and learned how to create these kinds of tests using the QuickCheck library.
As a final touch, you saw examples of how to formally verify your code by annotating it and running it through LiquidHaskell.