3 Developing an application: Stock quotes

This chapter covers

  • Designing a standalone multimodule Haskell program with dependencies
  • Dealing with dates, text, and command-line arguments
  • Parsing CSV files and plotting charts
  • Employing type classes for practical needs

There is a common pattern for many utility programs: we have some data in a form that is not convenient for analysis and want to present the data, visually or textually. While implementing such a program, we have to address many issues, such as interfacing with a user, designing data types for an application domain, reusing external packages for parts of the program, and more. We should also think about language features that can help us in terms of correctness, performance, and an ability to extend functionality, if needed.

In this chapter, we’ll explore the process of developing such a program. I’ll start by describing inputs and outputs, then move on to design issues with data types, functions, and modules, followed by a discussion of useful Haskell packages and implementation details. We’ll also see how type classes can make our programs much more flexible and resilient to changes.

3.1 Setting the scene

The overall task is as follows: we take the historical quotes data for some joint-stock company in CSV format (a text file with comma-separated values), analyze this data, and prepare a statistical report (as a text and an HTML document) with a chart. Figure 3.1 presents the overall data flow of the resulting program: we need to read the CSV file into a collection of some data type values and then process this collection in order to gather statistical information, plot a chart, and prepare a final report.

Figure 3.1 Overall dataflow: from a CSV file to an HTML report

Let’s discuss expected inputs and outputs for this project. Once we’ve presented them, we’ll talk about project structure that will help us to achieve all our goals.

Example: Processing stock quote data

stockquotes/

data/quotes.csv

stockquotes

We can use Haskell to write a program in any application domain.

It’s possible to find a Haskell library for almost any problem.

3.1.1 Inputs

Here is a fragment of the input data file, data/quotes.csv:

day,close,volume,open,high,low
2019-05-01,210.520004,64827300,209.880005,215.309998,209.229996
2019-05-02,209.149994,31996300,209.839996,212.649994,208.130005
2019-05-03,211.75,20892400,210.889999,211.839996,210.229996
2019-05-06,208.479996,32443100,204.289993,208.839996,203.5
2019-05-07,202.860001,38763700,205.880005,207.419998,200.830002
2019-05-08,202.899994,26339500,201.899994,205.339996,201.75
2019-05-09,200.720001,34908600,200.399994,201.679993,196.660004
...

The first line lists the names of the six fields, and every other line of this file contains their corresponding values, as follows:

  • day is the date of the stock transaction.

  • close is the share price at the close of business.

  • volume is the total number of shares of a stock traded during the day.

  • open is the price at the opening.

  • high is the highest price during the day.

  • low is the lowest price during the day.

Besides this file (technically, its name), the application expects the following inputs:

  • The name of the company, to make information in reports more specific

  • The name of the HTML file to generate

  • The flag for whether to plot charts

  • The flag for whether to print statistical information in text

As usual, we want to give a user some choice so we ask for flags and names.

3.1.2 Outputs

This is not a book on financial analysis or trend prediction. I’ll limit myself to computing very simple characteristics, such as the mean, minimum, and maximum values of the fields and the number of days between reaching minimum and maximum values. The following is a sample “statistical report” I plan to generate:

+-------------+-------------+----------+----------+----------------------+
| Quote Field | Mean        | Min      | Max      | Days between Min/Max |
+-------------+-------------+----------+----------+----------------------+
| Open        | 202.04      | 175.44   | 224.80   | 100                  |
| Close       | 202.16      | 173.30   | 223.59   | 100                  |
| High        | 204.10      | 177.92   | 226.42   | 101                  |
| Low         | 200.32      | 170.27   | 222.86   | 101                  |
| Volume      | 27869192.38 | 11362000 | 69281400 | 28                   |
+-------------+-------------+----------+----------+----------------------+
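To make these characteristics concrete, here is a minimal sketch of how the mean and the number of days between the minimum and maximum could be computed for one field. The names mean and daysBetweenMinMax and the list-of-pairs representation are assumptions made for this illustration only; the project's actual data types are designed later in the chapter.

```haskell
import Data.List (maximumBy, minimumBy)
import Data.Ord (comparing)
import Data.Time (Day, diffDays, fromGregorian)

-- Arithmetic mean; assumes a non-empty list.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Days between the dates on which a field reaches its minimum
-- and its maximum value; assumes a non-empty list.
daysBetweenMinMax :: [(Day, Double)] -> Integer
daysBetweenMinMax qs = abs (diffDays dMax dMin)
  where
    (dMin, _) = minimumBy (comparing snd) qs
    (dMax, _) = maximumBy (comparing snd) qs

-- Closing prices (rounded) for three days from the sample file.
closes :: [(Day, Double)]
closes =
  [ (fromGregorian 2019 5 1, 210.52)
  , (fromGregorian 2019 5 3, 211.75)
  , (fromGregorian 2019 5 9, 200.72)
  ]

main :: IO ()
main = do
  print (mean (map snd closes))
  print (minimum (map snd closes), maximum (map snd closes))
  print (daysBetweenMinMax closes)
```

In this sample, the maximum is reached on May 3 and the minimum on May 9, so the last line prints 6.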

Stock quote information is traditionally presented with charts, so we’ll generate them, too. Figure 3.2 demonstrates two sample charts:

  • A candlesticks chart with a line for closing prices

  • A volumes bar chart

 

Figure 3.2 Stock quote charts for an imaginary company

Figure 3.3 explains the meaning of a candlestick. It shows all the day prices (open, close, high, and low) and whether the price is rising over the day. If the opening price is lower than the closing one, then the candlestick body is shown as white. Otherwise (the price is falling over the day), it is filled with a color.

Figure 3.3 Meaning of candlestick components

We also want to generate an HTML report that consists of the following:

  • Charts

  • Statistical information

  • Raw data

The latter two items are best presented with HTML tables. Figure 3.4 illustrates sample tables generated in an HTML document.

Figure 3.4 Tables in HTML report

The next question is how to organize a workflow that will drive us from input data to outputs.

3.1.3 Project structure

Let’s think about what we should do in this project:

  • Process command-line arguments.

  • Read quote data from a CSV file.

  • Compute statistics.

  • Plot charts.

  • Prepare reports on statistical info in text and HTML.

Depending on the supplied arguments, some of these stages may be skipped.

It is a good practice to split the required functionality over several modules, for example:

  • Params for describing command-line arguments and processing them

  • QuoteData for describing data types we are going to use throughout the project

  • StatReport for computing statistics and preparing a report in a text form

  • HtmlReport for generating a report in an HTML document

  • Charts for plotting charts

Of course, we also need the Main module to connect the program components and drive the whole program. Figure 3.5 demonstrates the module structure for this program, with arrows pointing to the imported modules.

Figure 3.5 Module structure for the stock quote data processing project

Tip This diagram was created with the help of the graphmod utility from Hackage, developed by Iavor S. Diatchki. The graphmod utility produces a .dot file. These files can be later processed by the graphviz set of tools for graph visualization.

While describing this project, I assume that you follow along by reading, exploring in GHCi, and running the code in the hid-examples package (the stockquotes folder). Alternatively, you could develop your own solution in an independent manner. For those using the latter way, I’ll provide the necessary details to set up a project from scratch.

Setting up a cabal project

A Haskell package for a project is a directory containing the following:

  • Source code, usually organized in subfolders

  • A .cabal file, which describes the package content, dependencies, and build instructions, among many other things

  • A stack.yaml file, which is a necessary file if we use stack as a building tool

Suppose we are in a fresh directory. Let’s create an src subfolder with several files for Haskell modules in it, as follows:

  • Main.hs

  • Params.hs

  • QuoteData.hs

  • Charts.hs

  • StatReport.hs

  • HtmlReport.hs

Every module should start with a module declaration featuring its name (which should be the same as the filename without an extension), for example:

module QuoteData where

The Main.hs file should also contain the main function. Let’s start with the simplest one:

main :: IO ()
main = putStrLn "Stock quotes processing project"

Once we are done with modules, we create a stockquotes.cabal file in the root folder of the project with the following content:

cabal-version:  >=1.10
name:           stockquotes
version:        0.0.1
synopsis:       Stockquotes processes historical stock quotes data.
build-type:     Simple
 
executable stockquotes
  hs-source-dirs: src
  main-is: Main.hs
  other-modules: Params QuoteData StatReport Charts HtmlReport
  build-depends:
      base
  default-language: Haskell2010

At this stage we can build and run the project as follows:

$ cabal build
$ cabal run stockquotes

If we want to use stack, we should also add a one-line stack.yaml file:

resolver: lts-14.27

This line fixes a set of packages we can use as external libraries. Building and running with stack is done as follows:

$ stack build
$ stack exec stockquotes

The stack utility reads a .cabal file and builds a project based on information there. In what follows, we’ll add some source code and specify additional dependencies in the stockquotes.cabal file (in the build-depends section).

Note We’ll get back to a Haskell project structure, corresponding files, and cabal/stack commands in chapter 4.

Main project data types, functions, and a flowchart

We are ready to describe the program functionality with types and functions. Figure 3.6 is an informal flowchart that presents the proposed structure of the program. There, you can see user input and both the I/O and pure parts of the program.

Figure 3.6 Processing stock quote data: program structure flowchart

First, we’ll need data types to represent the following:

  • Command-line arguments (Params)

  • Quote data for one day (QuoteData)

  • Some collection with all the data (let’s call it QuoteDataCollection for now)

  • Computed statistical information (StatInfo)

  • A report as an HTML document (Html)

We’ll postpone the definition of these data types until we have enough information on what exactly should be in there.

The program should start by reading user input in the form of command-line arguments (normally a list of Strings) and then either do its job or inform the user about the correct way to run it. Once we have command-line arguments parsed to Params, we can start working with data, as follows:

work :: Params -> IO ()

In this function, we’ll need to read the stock quote data from the CSV file as follows:

readQuotes :: FilePath -> IO QuoteDataCollection

Compute the statistical information (purely!):

statInfo :: QuoteDataCollection -> StatInfo

Prepare the text report (again, purely!):

textReport :: StatInfo -> String

The simplest way to plot charts is to generate files with them, so we’ll have to stick with IO for this task, as shown next:

plotChart :: QuoteDataCollection -> IO ()

Finally, we generate (purely) and save an HTML document to a file:

htmlReport :: QuoteDataCollection -> StatInfo -> Html
saveHtml :: FilePath -> Html -> IO ()

We’ll refine the types and names of these functions later, but even now, they clearly represent the program functionality.
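To see how these pieces might fit together, here is a minimal sketch of the work function built on stubs. The placeholder types, the Params field names (fname, chart, htmlFile, silent), and the stub bodies are all assumptions made for illustration; the real definitions are developed later in the chapter.

```haskell
import Control.Monad (unless, when)

-- Placeholder types; the real ones are defined later in the chapter.
data Params = Params
  { fname    :: FilePath        -- hypothetical field names
  , chart    :: Bool
  , htmlFile :: Maybe FilePath
  , silent   :: Bool
  }

data QuoteDataCollection = QuoteDataCollection
data StatInfo = StatInfo
newtype Html = Html String

-- Stubs with the signatures given in the text.
readQuotes :: FilePath -> IO QuoteDataCollection
readQuotes _ = pure QuoteDataCollection

statInfo :: QuoteDataCollection -> StatInfo
statInfo _ = StatInfo

textReport :: StatInfo -> String
textReport _ = "<statistics table goes here>"

plotChart :: QuoteDataCollection -> IO ()
plotChart _ = putStrLn "plotting chart..."

htmlReport :: QuoteDataCollection -> StatInfo -> Html
htmlReport _ _ = Html "<html></html>"

saveHtml :: FilePath -> Html -> IO ()
saveHtml path (Html content) = writeFile path content

-- The I/O driver: the pure functions are called in between I/O actions.
work :: Params -> IO ()
work params = do
  quotes <- readQuotes (fname params)
  let si = statInfo quotes
  unless (silent params) $ putStrLn (textReport si)
  when (chart params) $ plotChart quotes
  -- Save the HTML report only if a file name was given.
  mapM_ (\path -> saveHtml path (htmlReport quotes si)) (htmlFile params)

main :: IO ()
main = work (Params "data/quotes.csv" True Nothing False)
```

Note that only readQuotes, plotChart, and saveHtml live in IO; everything else stays pure, which is the division this project aims for.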

3.2 Exploring design space

While implementing this project, we should discuss and solve many common practical problems, including the following:

  • Representing data—We have to use several data types, including something for dates.

  • Parsing CSV files—We can either employ an ad hoc solution or use some external library.

  • Formatting reports—A report is text with data structured in some way. This should be addressed with flexibility and extensibility in mind. Generating HTML is another practical task that should be thought of.

  • Plotting charts—We do want to use sophisticated packages here.

  • Designing the UI—We are implementing a terminal application. Consequently, we should deal with command-line arguments. Prepare to see Semigroup and Applicative in action!

  • Maintaining a clear division between pure and I/O parts of the program—We’ll aim to keep the latter as small as possible.

In this section, I’ll present various options and make a choice for this particular project. Remember, this is still a study example. To keep things simple, I’ll leave out performance, exception handling, testing, and many other issues for now.

3.2.1 Designing the user interface

I want this application to have a command-line interface because I find that the most efficient. In all the previous examples, we’ve analyzed the command-line arguments manually. As a result, all our arguments were strictly positional: the user had to specify them in the positions expected by the program. Traditionally, mandatory arguments are positional, but program behavior can be tweaked with a set of options or flags beginning with a dash in any position. Consequently, parsing a command line becomes a nontrivial problem, because we have to analyze all the arguments and build a data structure that contains all the parameters.

We’ve already discussed the set of program parameters. The following is one possible way to include all of them in the command-line arguments:

Usage: stockquotes FILE [-n|--name ARG] [-c|--chart] [--html FILE]
                        [-s|--silent]
  Stock quotes data processing
 
Available options:
  FILE                     CSV file name
  -n,--name ARG            Company name
  -c,--chart               Generate chart
  --html FILE              Generate HTML report
  -s,--silent              Don't print statistics
  -h,--help                Show this help text

This is a rather standard way of presenting command-line interfaces. We have one positional argument, namely, the name of the CSV data file, and several short and long flags and options. Remember that flags can be given in any order and any position. Moreover, they can be omitted altogether.

One option could be traversing a list of command-line arguments (retrieved from getArgs) and filling some Map or association list with them. Fortunately, we have other options. The two most popular Haskell libraries for parsing command-line arguments are

  • optparse-applicative by Paolo Capriotti

  • cmdargs by Neil Mitchell

Both of them force a distinction between command-line arguments and the data type for storing configuration parameters. To use these libraries, we first describe our options by associating them with a configuration data structure. Then the library uses this description to parse the command line: it gives us a well-formed configuration if the user specified the arguments correctly and generates an error message otherwise. Both libraries can also generate a usage description like the one we’ve just seen.

In this book, I’ve chosen the optparse-applicative library because it features a very nice example of Applicative in practice. That said, both libraries are well suited to industrial applications and are widely adopted by the community.
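As a preview, the interface shown earlier could be described with optparse-applicative roughly as follows. This is only a sketch: the Params record, its field names, and the names mkParams and opts are assumptions made for illustration, and the library calls are the ones I believe it exports (strArgument, strOption, switch, info, helper, execParser).

```haskell
import Control.Applicative (optional, (<**>))
import Options.Applicative

-- Hypothetical configuration record for this sketch.
data Params = Params
  { fname    :: FilePath
  , company  :: Maybe String
  , chart    :: Bool
  , htmlFile :: Maybe FilePath
  , silent   :: Bool
  } deriving Show

-- Each field gets its own small parser; Applicative combines them.
mkParams :: Parser Params
mkParams =
  Params
    <$> strArgument (metavar "FILE" <> help "CSV file name")
    <*> optional (strOption (long "name" <> short 'n' <> help "Company name"))
    <*> switch (long "chart" <> short 'c' <> help "Generate chart")
    <*> optional (strOption (long "html" <> metavar "FILE" <> help "Generate HTML report"))
    <*> switch (long "silent" <> short 's' <> help "Don't print statistics")

-- Adds the --help option and a program description.
opts :: ParserInfo Params
opts = info (mkParams <**> helper)
            (fullDesc <> progDesc "Stock quotes data processing")

main :: IO ()
main = execParser opts >>= print
```

The option modifiers are combined with <>, which is where Semigroup comes into play, and the record is assembled with <$> and <*>, which is Applicative at work.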

Note In my opinion, these Haskell libraries for dealing with command-line arguments are extremely powerful when compared with other programming languages. As we’ll see, unique Haskell features contribute a lot to this power.

3.2.2 Dealing with input data

We have a CSV file as the input. Most of its components are simply numbers: volume is an integer number, whereas all prices are floating-point numbers. For prices, we could use fixed-point numbers as we discussed in the previous chapter. In fact, share prices deserve their own data type able to deal with rounding errors, different currencies, and localization. One could use the safe-decimal or safe-money packages to deal with these problems. The simplest solution, though, is to use the Double type, so let’s stick with it as it perfectly suits our goals.

One of the CSV file fields, day, is interesting. Let’s discuss representing dates in Haskell.

Dates and times

You should take into account many factors when using dates and times in software. First, we have to decide which calendar to use. These days, the most straightforward solution is to stick with the Gregorian calendar, but there are other options as well. Processing dates in the first millennium AD would require using the Julian calendar, whereas writing software for businesses or governments (with fiscal years in mind) might result in employing the ISO week date system (as defined in ISO 8601). Referring to time means dealing with timestamps, moments in time with respect to time zones, or durations. Time zones introduce the issue of daylight saving time. It could get much worse: what about so-called leap seconds, which are irregularly added to some years to compensate for variations in the rate of the Earth’s rotation about its axis?

Fortunately, the time package in Haskell is sophisticated enough to deal with all these technicalities. It employs the type system to prevent users from making mistakes in mixing times and dates. The Day type represents a date in the Gregorian calendar (which is stored as a count of days, with zero being the day November 17, 1858). We can use it for the day field. Many types for times and durations are also available in the time package. All of them can be imported from the Data.Time module of this package. We’ll use some of them later in this book.

Apart from these data types, the time package provides many functions, including

  • Constructing dates and times from integer values (like years, months, days, hours, minutes, and seconds)

  • Parsing dates and times from strings (with an ability to specify an expected format)

  • Formatting dates and times into strings (by specified formats and with rather limited localization)

  • Getting the current date and time (this clearly requires IO)

  • Manipulating dates and times, such as by adding date intervals or computing differences
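The following short sketch touches each item of this list using functions from Data.Time. The sample dates and the names firstDay and lastDay are chosen for illustration only.

```haskell
import Data.Time

-- Constructing dates from integer values.
firstDay, lastDay :: Day
firstDay = fromGregorian 2019 5 1
lastDay  = fromGregorian 2019 5 9

main :: IO ()
main = do
  -- Manipulating dates: differences and shifts.
  print (diffDays lastDay firstDay)
  print (addDays 30 firstDay)
  -- The Day representation: a count of days from November 17, 1858.
  print (toModifiedJulianDay (fromGregorian 1858 11 17))
  -- Parsing with an expected format; Nothing signals a failure.
  print (parseTimeM True defaultTimeLocale "%Y-%m-%d" "2019-05-03" :: Maybe Day)
  print (parseTimeM True defaultTimeLocale "%Y-%m-%d" "not a date" :: Maybe Day)
  -- Formatting into a string.
  putStrLn (formatTime defaultTimeLocale "%d %b %Y" firstDay)
  -- Getting the current date requires IO.
  today <- utctDay <$> getCurrentTime
  print today
```

Note that diffDays and addDays work with Integer values, so the type system keeps day counts apart from the dates themselves.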

Tip All the details about the time package are presented in the documentation on Hackage (https://hackage.haskell.org/package/time). I also recommend reading “A Haskell Time Library Tutorial” (https://two-wrongs.com/haskell-time-library-tutorial.html) by Christoffer Stjernlöf.

Parsing CSV files

Parsing data files is a well-known programming task. CSV files have a very simple structure. To read them, we could use basic Text processing facilities, such as the lines, splitOn, and read functions. One line of a CSV file could be parsed into the QuoteData type and a complete file into [QuoteData] as follows:

  1. Split the file content into a list of lines.

  2. Skip the first line with the names of the fields.

  3. Transform all other lines into values of the QuoteData type.

Transforming file lines into QuoteData is the most challenging task here: we should split the line into components, parse the date from the first component, and turn the others into Double values. Then we should create the QuoteData value from the extracted components.
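The three steps above could be sketched as follows. This is a deliberately naive illustration, not the approach this project ends up with: the QuoteData record here is a provisional one with the fields from the file, and the names parseLine, parseQuotes, and sample are assumptions. It relies on read and parseTimeOrError, both of which throw an exception on malformed input.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import Data.Time (Day, defaultTimeLocale, parseTimeOrError)

-- Provisional record for this sketch only.
data QuoteData = QuoteData
  { day    :: Day
  , volume :: Int
  , open   :: Double
  , close  :: Double
  , high   :: Double
  , low    :: Double
  } deriving Show

-- Step 3: split a line into components and convert each of them.
-- The CSV column order is day,close,volume,open,high,low.
parseLine :: T.Text -> QuoteData
parseLine line =
  case map T.unpack (T.splitOn "," line) of
    [d, c, v, o, h, l] ->
      QuoteData
        { day    = parseTimeOrError True defaultTimeLocale "%Y-%m-%d" d
        , close  = read c
        , volume = read v
        , open   = read o
        , high   = read h
        , low    = read l
        }
    _ -> error ("unexpected number of fields: " ++ T.unpack line)

-- Steps 1 and 2: split into lines and skip the header.
parseQuotes :: T.Text -> [QuoteData]
parseQuotes = map parseLine . drop 1 . T.lines

sample :: T.Text
sample = T.unlines
  [ "day,close,volume,open,high,low"
  , "2019-05-01,210.520004,64827300,209.880005,215.309998,209.229996"
  , "2019-05-03,211.75,20892400,210.889999,211.839996,210.229996"
  ]

main :: IO ()
main = mapM_ print (parseQuotes sample)
```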

Such a naive implementation could use the plain old “garbage in, garbage out” strategy. Any formatting errors in the original file, such as the wrong number of fields, wrong date formats, or non-numeric text in the numeric fields, would result in an exception, leading to the program halting. We could deal with errors differently, for example:

  • Ignore incorrect lines silently, or report them to the user.

  • Interpolate missing values somehow using neighboring values.

  • Stop reading the file after encountering an error.

All of these strategies make a manual implementation much harder. Alternatively, we could use some powerful parsing libraries, such as parsec. But again, dealing with CSV file irregularities and corner cases can be quite cumbersome.

Fortunately, we have a third path. We could use an external package, cassava, designed specifically for parsing CSV files. The cassava package allows us to avoid hand-rolled CSV parsing and replace it with a carefully crafted, highly efficient implementation. As always, there is a price to pay, as follows:

  • We have to describe our data in terms of this package by defining a conversion for file content into stock quote data fields. This can be done by implementing instances of the FromField type class that comes with the cassava package.

  • We have to work with Vector (found in the Data.Vector module from the vector package), the data structure that is given to us as a result of parsing. The good news is that doing so can be almost transparent for us, thanks to the Foldable type class that has an instance for Vector.

It seems that cassava is the best approach in this case, so we’ll use it here.

Making design decisions

Note the choice we are making here:

  • Manual naive implementation—quick and dirty, many issues including dealing with errors and bad performance

  • Common powerful instruments (such as full-blown parsers in this case)—may be hard to learn and difficult to use

  • Specific tool with its own limitations and choices made for us

This situation is quite common. In general, we have to think carefully before making a choice.

3.2.3 Formatting reports

Our expected results include a statistical table in text form, charts, and a full report in HTML. Although the charts seem completely independent from the rest of the results, text and HTML share some formatting. For example, we’ve decided to use Double for prices. Clearly, we should use a fixed number of decimal places when reporting them. For another example, both text and HTML contain tables. Wouldn’t it be nice to have some common subsystem for tables? Let’s see our options on preparing reports first and then move to charts.

We already know the fmt package, which provides a set of functions and type classes for formatting text information. There is no need to look for something similar. This library can be used at the level of formatting individual values as follows:

  • If it is a value of some specific type, then we can implement a Buildable instance for it.

  • If it is a value of some type that is already known to fmt, then we can use functions and format expressions to get the desired output.

As an example of the latter, the fmt package provides the fixedF function, which produces a Builder with a fixed number of decimal digits for any Double.

Another good thing about the fmt library is that it is highly polymorphic. For example, the pretty function can take anything of the Buildable a type and produce String, or Text, or even a printed value, depending on what we need in the particular context.
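Here is a small sketch of both points: fixedF for a fixed number of decimal digits and the same formatting expression producing different result types. The helper name showPrice and the sample values are assumptions made for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import Fmt (fixedF, fmt, fmtLn, pretty, (+|), (|+))

-- Two decimal digits for any price, rendered as a String.
showPrice :: Double -> String
showPrice = fmt . fixedF 2

main :: IO ()
main = do
  let price = 210.520004 :: Double
      -- The same kind of expression can produce Text...
      asText = "Close: " +| fixedF 2 price |+ "" :: Text
  putStrLn (showPrice price)
  print asText
  -- ...or be printed directly.
  fmtLn ("Volume: " +| (64827300 :: Int) |+ "")
  -- pretty works for any Buildable value, here producing a String.
  putStrLn (pretty price)
```

The result type is chosen by the context, so no explicit conversions between String and Text are needed.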

Printing tables can be implemented in several ways. We could prepare a list of rows and format every row as columns via any formatting library. We could also use matrices with a Text cell for every value. Unfortunately, with this approach it is quite hard to get a nice output. Dealing with column widths can be quite cumbersome: to do that correctly, we need to at least precalculate every value in the column, which would require a lot of manual work. As usual with Haskell, there is also a library for this. In this case, we could use the colonnade package. To use this library, we describe our columns first (their names and instructions on how to format individual values) and then provide data for rows (as a list of row data types). The colonnade package itself supports printing tables in text form exclusively, but it has adapters for HTML generation also. Relying on good libraries is a guiding principle for us in this chapter, so let’s use the colonnade package.
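The idea of describing columns first and supplying rows later could look roughly like this. The StatRow type, its fields, and the sample values are hypothetical; headed and ascii are the colonnade functions as I understand them from the package documentation.

```haskell
import Colonnade (Colonnade, Headed, ascii, headed)
import Numeric (showFFloat)

-- A hypothetical row type for this sketch.
data StatRow = StatRow
  { fieldName :: String
  , meanVal   :: Double
  , minVal    :: Double
  , maxVal    :: Double
  }

-- Format a number with two decimal digits.
showFixed :: Double -> String
showFixed x = showFFloat (Just 2) x ""

-- Column descriptions: a header and a way to render a cell from a row.
colStats :: Colonnade Headed StatRow String
colStats = mconcat
  [ headed "Quote Field" fieldName
  , headed "Mean" (showFixed . meanVal)
  , headed "Min" (showFixed . minVal)
  , headed "Max" (showFixed . maxVal)
  ]

rows :: [StatRow]
rows =
  [ StatRow "Open" 202.04 175.44 224.80
  , StatRow "Close" 202.16 173.30 223.59
  ]

main :: IO ()
main = putStr (ascii colStats rows)
```

The library computes the column widths itself, which is exactly the tedious part of a manual implementation.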

Generating HTML manually is also possible, but it’s definitely not a good option. Instead, we could use one of the following libraries:

  • The blaze-html package by Jasper Van der Jeugt and Simon Meier

  • The lucid package by Chris Done

Both libraries support HTML generation in a very clear manner. We could use either, but let’s take blaze-html because it is more popular, judging by the number of downloads on Hackage.
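To give a first impression of blaze-html, here is a minimal sketch that builds a tiny page and renders it to a String. The page function and its content are made up for illustration; the real report is developed later.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Text.Blaze.Html5 as H
import Text.Blaze.Html.Renderer.String (renderHtml)

-- A hypothetical page skeleton; HTML elements are ordinary functions,
-- and nesting is expressed with do blocks.
page :: String -> H.Html
page company = H.docTypeHtml $ do
  H.head $ H.title (H.toHtml company)
  H.body $ do
    H.h1 (H.toHtml ("Stock quotes for " ++ company))
    H.p "Statistics and charts go here."

main :: IO ()
main = putStrLn (renderHtml (page "Sample Company"))
```

Building the Html value is pure; only printing or saving the rendered document requires IO.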

Tip I recommend reading about both the blaze-html and lucid libraries because they are good examples of designing a library for Haskell that features convenience and performance. The starting point could be a tutorial on blaze-html (https://jaspervdj.be/blaze/tutorial.html) and a blog post on lucid (https://chrisdone.com/posts/lucid/), both written by the authors of the libraries. The lucid library is more novel. Chris Done likes to argue for why his lucid library is better. The discussion in the blog post is very instructive.

The colonnade package for representing tables can be used together with the blaze-html and lucid libraries. This is possible thanks to the lucid-colonnade and blaze-colonnade packages. A generic backend-agnostic library with different backends is an extremely popular approach for designing packages within the Haskell community.

3.2.4 Plotting charts

Unfortunately, Haskell is not the best language for presenting data in visual form. Such languages as Python and R provide much better infrastructure and tooling. Nevertheless, we can still draw 2-D charts and plots in Haskell. In this project, we’ll use the Chart package by Tim Docker for that. It’s a good example of a package built on top of other sophisticated Haskell packages, so it’s instructive to discuss its ideas and implementation.

The Chart package allows describing a chart we want to plot. A chart in terms of this package is a deeply nested data structure. We construct elements we are interested in (such as layouts, axes, legends, data points) one by one, leaving all others with their default values. This package supports line plots, bar plots, pie charts, and even candlestick plots out of the box. One difficulty about this package is that it uses lenses to access elements of the chart structure. This approach is very powerful. In this chapter, we’ll see how to use it without a deep understanding of what lenses are. For basic charts with default parameters, it’s possible to avoid using lenses altogether because we can use simple wrappers. Unfortunately, this is not our case. We’ll cover lenses in greater detail in chapter 14.

The Chart package requires a backend for generating image files. We’ll use Chart-diagrams to generate SVG files because it is the simplest one. It’s also possible to generate PNG or JPG images, as well as other graphical formats. Unfortunately, this would require a lot of system dependencies that may be hard to install correctly on Windows. There should be no problems with SVG in any operating system.
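For a basic chart with default parameters, the simple wrappers mentioned earlier are enough. The following sketch, modeled on the examples from the Chart wiki, writes a line chart to an SVG file; the data, the name closing, and the output file name are assumptions made for illustration.

```haskell
import Graphics.Rendering.Chart.Backend.Diagrams (toFile)
import Graphics.Rendering.Chart.Easy

-- A few (rounded) closing prices indexed by day number.
closing :: [(Int, Double)]
closing = zip [1 ..] [210.52, 209.15, 211.75, 208.48, 202.86]

-- Plot one line with all the default settings and save it as SVG.
main :: IO ()
main = toFile def "closing.svg" $ plot (line "close" [closing])
```

Candlesticks and combined charts need more control over the chart structure, which is where lenses come in.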

The Chart package has a wiki page on GitHub (https://github.com/timbod7/haskell-chart/wiki) with several examples on how to use it for plotting various sorts of graphs.

3.2.5 Project dependencies overview

I’ve already mentioned several packages apart from base that we’ll need for this project. In fact, we’ll need two more:

  • text provides the Text processing facilities that we’ll use in almost every example in this book.

  • bytestring is needed for reading CSV files (cassava expects data in the form of a byte string) and saving HTML reports to a file.

All the packages we need are listed in table 3.1.

Table 3.1 Used packages

Package                       Used for
text                          Efficient text processing and input/output
bytestring                    Efficient input/output for binary data
time                          Dealing with dates and times
fmt                           Formatting text
blaze-html                    HTML generation
colonnade, blaze-colonnade    Generating tables in text and HTML form
Chart, Chart-diagrams         Drawing charts
cassava                       Parsing CSV files
optparse-applicative          Processing command-line arguments

If you are working with the hid-examples package, all these packages are installed automatically on the first build. If you’ve created your own project for working through this example, then you need to specify all these dependencies in the build-depends section of the stockquotes.cabal file as follows:

build-depends:
      base
    , text >=1.2 && <1.3
    , bytestring >=0.10 && <0.11
    , time >=1.8 && <1.11
    , fmt >=0.5 && <0.7
    , colonnade >=1.1 && <1.3
    , blaze-html >=0.9 && <0.10
    , blaze-colonnade >=1.1 && <1.3
    , Chart >=1.8 && <1.10
    , Chart-diagrams >=1.8 && <1.10
    , cassava >=0.5 && <0.6
    , optparse-applicative >=0.14 && <0.16
  default-language: Haskell2010

At the first build after editing the .cabal file, all these dependencies will be installed. We’ll discuss many issues with version numbers in the next chapter. It may be the case that you need to edit some of the upper bounds to get this package compiled. If you run into problems, the easiest solution could be to find the corresponding section in the hid-examples.cabal file and copy its content.

While installing these packages, many others are also installed as dependencies. In total, there are 108 external packages used to build this project. It is almost impossible to write useful programs without referring to external packages. We’ll meet many other packages later in this book that you can use in your own projects.

3.3 Implementation details

In this section, I’ll go over all the details to implement the stock quote processing project. We already know all the inputs and outputs. We’ve settled on the external packages we are going to use. My plan is as follows:

  1. We’ll start with describing data and cooking it in a way suitable for both reading from CSV file and processing (the QuoteData module).

  2. Once we have our data ready, we will plot charts. Remember that charts represent only input data, not the statistics (the Charts module).

  3. Then we’ll move to preparing reports and discuss how to compute all the statistics, format the results, and generate tables both in text and HTML forms (the StatReport and HtmlReport modules).

  4. After that, we’ll describe program configuration and command-line arguments (the Params module).

  5. Finally, we’ll connect everything together in the Main module.

Note the style of the descriptions. I don’t attempt to present every piece of code. After all, looking over it on GitHub or in a text editor is much more convenient. Instead, I show the most interesting fragments and comment on them in terms of Haskell features used, programming tricks and techniques, and external library facilities.

3.3.1 Describing data

Remember that we have input data in the form of the data/quotes.csv file, as shown here:

day,close,volume,open,high,low
2019-05-01,210.520004,64827300,209.880005,215.309998,209.229996
2019-05-02,209.149994,31996300,209.839996,212.649994,208.130005
...

This is a CSV file with named fields. We have six fields in every line: the first one represents the date, the third one is an integer value, and all others are floating-point numbers.

Example: Representing and processing data

stockquotes/QuoteData.hs

stockquotes

Type classes and instances help external packages work with our data.

Now it’s time to declare a data type corresponding to one line of the CSV file as follows:

data QuoteData = QuoteData {
                   day :: Day,
                   volume :: Int,
                   open :: Double,
                   close :: Double,
                   high :: Double,
                   low :: Double
                 }

This data type should be used in different situations, such as when parsing a CSV file or computing statistics, so we need to make it suitable for that.

Cooking data for cassava

To describe our data in a form suitable for the cassava package, we’ll derive or define instances of several type classes, namely:

  • Generic from the GHC.Generics module to give cassava instances for working with our data types using generic programming machinery (more on that in chapter 12).

  • FromNamedRecord from the Data.Csv module to allow cassava to read a CSV file with named fields; this will be possible thanks to the same names being used in the CSV file and the QuoteData data type.

  • FromField from the Data.Csv module to teach cassava how to parse Day values; cassava can parse values of many types from base, but it doesn’t know how to deal with other types.

Note the heavy use of type classes. This is a quite common idiom in Haskell. The library cannot imagine all types it is used with (those types may not even exist yet). Instead, it describes constraints and behavior with type classes. Now it’s our responsibility to provide the corresponding instances in order to use the library. Thus, type class instances build a bridge between the library’s interface (API in the form of type classes and functions that rely on them) and the client’s data types. We’ll see the same idea at work many times later in this book.

The code we are going to write requires the following GHC extensions for instance derivation:

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE DeriveAnyClass #-}

We’ll get back to these extensions in chapter 12. For now, it’ll suffice to know that they extend the behavior of the deriving clause in data type declarations.

We’ll also need to import the following modules:

import Data.Time (Day, parseTimeM, defaultTimeLocale)
import Data.ByteString.Char8 (unpack)
import GHC.Generics (Generic)
import Data.Csv (FromNamedRecord, FromField (..))

Note how we limit imports: we avoid introducing unnecessary names from the imported modules by specifying what we actually need. The syntax (..) in the import lists refers to everything inside. For example, FromField (..) in the import list for the Data.Csv module refers to the FromField type class and every method of this type class. We can use the same syntax for algebraic data types and their value constructors. Alternatively, we could list names of methods and value constructors explicitly if we want to import only some of them.

The cassava package knows nothing about the Day type, so we need to teach this library how to parse it. We can do this by implementing an instance of the FromField type class, which is defined in Data.Csv as follows:

class FromField a where
  parseField :: Field -> Parser a
  {-# MINIMAL parseField #-}

This type class defines how to parse a field of type a. The type Field is a synonym for ByteString (that’s why we’ve imported unpack), and Parser is a monadic parser used inside cassava. We already know the monad interface, so we don’t even need to think about what this Parser is about—it’s a monad, and that’s enough. One possible instance for Day follows:

instance FromField Day where
  parseField = parseTimeM True defaultTimeLocale "%Y-%m-%d" . unpack

We first unpack the given ByteString into a String and then use the parseTimeM function. This function can work in any monad that is able to report failures (that is, any instance of MonadFail) to parse a Day value; it will report a failure to the underlying monad in case of errors. For example, we can use it in the Maybe context and get Nothing if parsing fails. In this case, it will be called in the context of the Parser monad.

Reporting failures in monads

Not every Monad features an ability to report failures. It comes through the MonadFail type class with only one fail method. In the past, this method was part of the Monad type class. Fortunately, it’s not anymore. If some particular monadic context has an ability to report failures, then we expect an instance of the MonadFail type class for it.

In addition to a String with a value, the parseTimeM function takes an expected date or time format, a date/time locale, and a flag for whether to accept leading and trailing spaces in the given String.
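As a quick illustration, here is a minimal sketch that runs the same parser in the Maybe context. The names parseDay and parseDayStrict are introduced here for illustration only; the two functions differ only in the whitespace flag.

```haskell
import Data.Time (Day, defaultTimeLocale, parseTimeM)

-- Parses dates such as "2019-05-01" in the Maybe monad.
-- The first argument (True) accepts leading and trailing whitespace.
parseDay :: String -> Maybe Day
parseDay = parseTimeM True defaultTimeLocale "%Y-%m-%d"

-- The same parser with the whitespace flag switched off.
parseDayStrict :: String -> Maybe Day
parseDayStrict = parseTimeM False defaultTimeLocale "%Y-%m-%d"
```

A malformed input such as "yesterday" gives Nothing here, whereas the FromField instance reports the same failure to cassava's Parser monad.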

Exploring functions in GHCi

Remember that you can always explore functions in GHCi yourself as follows:

$ cabal repl stockquotes
ghci> import Data.Time
ghci> :type parseTimeM
...
ghci> :doc parseTimeM
...

Exploring functions from external packages is easier within a cabal project, because all the dependencies are already available.

Once we have the FromField instance for the Day type, we can derive a corresponding instance for the QuoteData itself as follows:

data QuoteData = QuoteData {
                   ...
                 }
  deriving (Generic, FromNamedRecord)

This is enough for cassava to decode a CSV file and create a QuoteData value from every line if the given file is parsed correctly. Note that we had to write only nine lines of code for that. Well, technically, it’s only one line of code. Everything else is about instances and deriving them.

Cooking data for computing statistics

There is not much difference in computing minimums or maximums for opening or closing share prices. If we have an array-like structure for those values, we could process them uniformly. Of course, we have Int values for volumes as well, not only Double, and we still need to compute minimums and maximums for the volume field. We cannot put Int and Double into one list or any other array-like data structure. This is not a problem for programming languages featuring dynamic typing, such as Python. In Haskell, it is a problem.

My solution is as follows:

  • I am going to introduce a data type for referring to those fields of the QuoteData data type that require statistical processing.

  • I’ll write a function that transforms both Int and Double QuoteData components into Double, based on the required field information.

Clearly, transforming everything to Double is not the best idea for Haskell, but it’s going to work. As an aside, we’ll have to ignore the fractional part later when presenting originally integer data in reports.

The first part of this plan follows:

data QField = Open | Close | High | Low | Volume
  deriving (Eq, Ord, Show, Enum, Bounded)
 
field2fun :: QField -> QuoteData -> Double
field2fun Open = open
field2fun Close = close
field2fun High = high
field2fun Low = low
field2fun Volume = fromIntegral . volume

We’ve defined a value constructor for every numeric field in QuoteData and mapped it to the corresponding record field (technically, an accessor function). Most accessors already have the type QuoteData -> Double; volume returns an Int, so we compose it with fromIntegral. As a result, we’ll be able to write something like field2fun qf q to access any required field qf from a value q of the QuoteData type.
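The Enum and Bounded instances derived for QField make it possible to enumerate all the fields. The following self-contained sketch uses a simplified Quote record (a stand-in for QuoteData without the day field) and two hypothetical helpers, allFields and describe:

```haskell
-- A simplified stand-in for QuoteData (the day field is omitted).
data Quote = Quote
  { volume :: Int
  , open, close, high, low :: Double
  }

data QField = Open | Close | High | Low | Volume
  deriving (Eq, Ord, Show, Enum, Bounded)

field2fun :: QField -> Quote -> Double
field2fun Open = open
field2fun Close = close
field2fun High = high
field2fun Low = low
field2fun Volume = fromIntegral . volume

-- Enum and Bounded let us list every field without naming them by hand.
allFields :: [QField]
allFields = [minBound .. maxBound]

-- Pairs every field with its value for the given quote.
describe :: Quote -> [(QField, Double)]
describe q = [ (qf, field2fun qf q) | qf <- allFields ]
```

If a new constructor is added to QField later, allFields picks it up automatically, and the compiler can warn about the missing field2fun clause when incomplete-pattern warnings are enabled.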

That’s it for the QuoteData module. Note that we never defined the QuoteDataCollection data type for storing a collection of the QuoteData values. In fact, we don’t need it. Any such collection can well be Foldable t => t QuoteData. An interface we have from the Foldable type class is enough to do whatever we want with the data. Of course, we could do more things and do them with better performance if we knew the exact internal representation of this collection, but that’s beyond the scope of this chapter.
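To see why a Foldable constraint is enough, consider the following sketch. The Quote record and the functions totalVolume and highestClose are illustrative assumptions, not part of the project; the point is that the same code works with a list, a NonEmpty, or even a Maybe.

```haskell
import Data.List.NonEmpty (NonEmpty (..))

-- A simplified stand-in for QuoteData.
data Quote = Quote { close :: Double, volume :: Int }

-- Sums up volumes in any Foldable container.
totalVolume :: Foldable t => t Quote -> Int
totalVolume = foldr (\q acc -> volume q + acc) 0

-- Finds the highest closing price, if there is at least one quote.
-- It relies on the Ord instance for Maybe, where Nothing < Just x.
highestClose :: Foldable t => t Quote -> Maybe Double
highestClose = foldr (\q acc -> max (Just (close q)) acc) Nothing

-- The same two quotes stored in different containers.
asList :: [Quote]
asList = [Quote 1.5 10, Quote 2.5 20]

asNonEmpty :: NonEmpty Quote
asNonEmpty = Quote 1.5 10 :| [Quote 2.5 20]
```

Both totalVolume asList and totalVolume asNonEmpty give the same result, because neither function depends on how the collection is represented.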

Exploring quote data in GHCi

I’ve implemented a small readQuotes wrapper function for loading data from a CSV file to a [QuoteData] list to play with them in GHCi as follows:

$ cabal repl stockquotes
ghci> quotes <- readQuotes "data/quotes.csv"
ghci> day $ head quotes
2019-05-01
ghci> field2fun High $ last quotes
220.960007

All further functions we are going to implement will work with the resulting data, because the list type implements the Foldable type class.

3.3.2 Plotting charts

Haskell is an old and powerful language. Consequently, over the years, Haskell has acquired many sophisticated ideas and techniques, or styles, which we can use to write our code. Different libraries promote very different programming styles. We’ll see one such style in this section on plotting charts. I’d call it a lens-based declarative description.

Example: Plotting charts

stockquotes/Charts.hs

stockquotes

We can plot charts in Haskell with the Chart package.

A chart is a deeply nested data structure with a lens interface.

Let’s look at the chart in figure 3.7 and discuss its structure.

Figure 3.7 Stock quote charts for an imaginary company

First, this chart consists of two charts. In terms of the Chart package, this chart has a stacked layout with two layouts inside it. The layout is one plotting area with a background (grid), axes, a title, a legend, and plotted data. Our first layout contains candles and a line representing closing prices. Our second layout contains volume bars. Stacked layouts share one x-axis; that is a very useful feature in our case.

Layouts in the Chart package

Besides stacked layouts, the Chart library also supports individual layouts, two-sided layouts (with different axes on the left- and right-hand sides of the plotting area), and a grid layout when charts are organized in a free-form grid.

Second, let’s describe individual layouts. The upper one has a title, two axes, and two plots—one with a line, another with candles. The lower one has two axes and a bar plot, with no title.

Third, we are going deeper. Axes, a legend, and plotting area grids are pretty standard. Thanks to the stacked layout, we have the shared x-axis for dates and the common legend for two charts. It’s possible to tweak these components, too, but I prefer the default view.

Fourth, we have data plots, a line, candles, and bars. Besides data, all these plots have many properties like colors, line widths, fill styles, and data row labels. We can either use default values or set them as we like.

Note the pattern. We have some default values on every level of a chart description, and we can set them to something different if we want. We can imagine a big, deeply nested data structure: the chart itself, layouts, layout components, all the way down to individual values. This is a perfect use case for lenses, an approach to working with deeply nested data. We’ll use them to set values on the deeper levels of a data structure.
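To get a feeling for how setting a deeply nested value works, here is a minimal hand-rolled sketch of lenses using only base. It is not the implementation used by the Chart package or the lens library; the types Plot and LineStyle, the lenses title, style, and width, and the defPlot value are all made up for illustration. Only the (.~) operator mimics the real one.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Identity (Identity (..))

-- A van Laarhoven lens focusing on a part a inside a structure s.
type Lens s a = forall f. Functor f => (a -> f a) -> s -> f s

-- A setter in the spirit of (.~) from lens libraries.
infixr 4 .~
(.~) :: Lens s a -> a -> s -> s
l .~ x = runIdentity . l (const (Identity x))

data LineStyle = LineStyle { _width :: Double, _color :: String }
  deriving (Eq, Show)

data Plot = Plot { _title :: String, _style :: LineStyle }
  deriving (Eq, Show)

width :: Lens LineStyle Double
width f s = (\w -> s { _width = w }) <$> f (_width s)

style :: Lens Plot LineStyle
style f p = (\st -> p { _style = st }) <$> f (_style p)

title :: Lens Plot String
title f p = (\t -> p { _title = t }) <$> f (_title p)

-- Default values on every level.
defPlot :: Plot
defPlot = Plot "" (LineStyle 1 "black")

-- We override only what we need; style . width reaches two levels deep.
myPlot :: Plot
myPlot = title .~ "Close"
       $ style . width .~ 2
       $ defPlot
```

Lenses compose with the ordinary function composition operator, so style . width is itself a lens from a Plot to the width of its line. The color keeps its default value because we never mention it.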

So, we describe our chart level by level and never say how to draw all this stuff. This is an example of a declarative description. The library knows better how to draw. Let’s get to the code that follows:

module Charts (plotChart) where                         -- exports only one function, which plots a chart
 
import Data.Foldable (toList)                           -- we need only one function from this module
import Graphics.Rendering.Chart.Easy                    -- exports everything we need
import Graphics.Rendering.Chart.Backend.Diagrams        -- backend to draw a chart in SVG format
 
import QuoteData

We have only one function to plot everything:

plotChart :: Foldable t =>         -- interface to a data collection
             String                -- chart title
             -> t QuoteData        -- quote data collection
             -> FilePath           -- file path to save a chart image
             -> IO ()              -- saving files requires IO

The plotChart function saves a prepared chart to a file via the diagrams backend as follows:

plotChart title quotes fname = do
    _ <- renderableToFile fileOptions fname (toRenderable chart)
    pure ()
  where
    fileOptions = FileOptions (800, 600) SVG loadSansSerifFonts
    ...

The chart variable is the most interesting here. Before defining it, we prepare our data for injecting into a chart description as follows:

    (candles, closings, volumes) = unzip3 $
      [ (Candle day low open 0 close high,
        (day, close),
        (day, [volume])) | QuoteData {..} <- toList quotes ]

Three lists, namely candles, closings and volumes, have a form that is expected by the Chart library. To plot candles, we have to provide a list of Candle data type values. For lines, it requires a list of pairs (x, y). For bars, we give a list of (x, [...]) where [...] is a list of bars for every data point. We have only one bar here, a volume.

Note the QuoteData {..} syntax. It requires enabling the RecordWildCards GHC extension.

RecordWildCards GHC extension

The RecordWildCards GHC extension allows bringing all the record fields into scope without mentioning their names explicitly. We can use it for both accessing these values in pattern matching and constructing records. For example, if we are given the QuoteData, we could define a function over it as follows:

isRising :: QuoteData -> Bool
isRising QuoteData {..} = close > open

Or we can construct a new record, like so:

zeroQD :: Day -> QuoteData
zeroQD day = let close = 0
                 open = 0
                 high = 0
                 low = 0
                 volume = 0
             in QuoteData {..}

Note that the day argument is also captured, so we don’t have to introduce a local binding for it.

Remember, code should have a LANGUAGE pragma mentioning the RecordWildCards extension at the beginning of the file in order to use it.

The chart itself is a data structure with a list of layouts inside it, as shown next:

    chart = slayouts_layouts .~             -- sets a field value
        [ StackedLayout candlesLayout,
          StackedLayout volumesLayout
        ]
      $ def                                 -- default value for a stacked layout

Technically, chart is a value of the StackedLayouts data type. This is a record with two fields: one for a list of inner layouts, and another one for a flag for whether to compress legends from individual layouts into one legend below. We use the lens (.~) operator to set a value for the first field and leave the second one with a default value (which is True).

The candlesLayout function describes the first layout with candles and closing prices, as shown here:

    candlesLayout =
       layout_title .~ title
     $ layout_plots .~ [ toPlot $ qline "Close" closings green,
                         toPlot $ candle "Candle" candles cyan ]
     $ def

This function returns a Layout, which is a record with more than a dozen fields. We’ve set only two of them here and left all the others with their default values.

The second layout, shown next, is even simpler:

    volumesLayout =
       layout_plots .~ [ plotBars $ bars "Volume" volumes gray ]
     $ def

I set more fields as follows to describe plots themselves, but the idea is the same:

    candle label values color =
       plot_candle_line_style  .~ lineStyle 1 gray
     $ plot_candle_fill .~ True
     $ plot_candle_rise_fill_style .~ fillStyle white
     $ plot_candle_fall_fill_style .~ fillStyle color
     $ plot_candle_tick_length .~ 0
     $ plot_candle_width .~ 3
     $ plot_candle_values .~ values
     $ plot_candle_title .~ label
     $ def
 
    qline label values color =
       plot_lines_style .~ lineStyle 1 color
     $ plot_lines_values .~ [values]
     $ plot_lines_title  .~ label
     $ def
 
    bars label values color =
       plot_bars_titles .~ [label]
     $ plot_bars_values .~ values
     $ plot_bars_item_styles .~ [(fillStyle color, Nothing)]
     $ def

These descriptions rely on a couple of helper functions, defined as follows:

    fillStyle color = solidFillStyle (opaque color)
 
    lineStyle n color =
       line_width .~ n
     $ line_color .~ opaque color
     $ def

Tweaking charts

I recommend working on this chart description. Change colors and styles, and add plots. Try to use LayoutLR to combine all the plots into one plotting area. It’s very easy to experiment in GHCi, as shown here:

$ cabal repl stockquotes
ghci> quotes <- readQuotes "data/quotes.csv"
ghci> plotChart "Sample quotes" quotes "chart.svg"

This will allow you to understand the Chart library much better.

Tip Unfortunately, with a stacked layout, it’s impossible to have individual layouts of varying heights. All of them have to be the same height. Why not hack on that? There is an issue on GitHub (https://github.com/timbod7/haskell-chart/issues/152) that discusses this problem. For the chart in this section, I’d like to have a 3:1 ratio with three parts for candles and one for volumes. We could do that with a grid layout, but it doesn’t support shared axes.

That’s it for charts. About 70 lines of code to get a quite informative, professional-looking chart. Not bad, eh?

3.3.3 Preparing reports

In this subsection, we’ll implement an important piece of functionality: we’ll compute statistics about our data and build reports in text form and in HTML.

Example: Preparing the statistical report

stockquotes/StatReport.hs

stockquotes

We can use higher-order functions to make code much smaller.

The colonnade package greatly simplifies printing tabular data.

Computing statistics

Remember, we’ve decided to do all the statistics computations with Double, although some of the fields are Int, namely, volumes. Even worse, minimum and maximum of volumes are integers, but the mean value should always be a floating-point number.

To deal with these issues, I introduce a default number of decimal places for floating-point values and a type for representing a statistic value together with the number of decimal places expected, as shown next:

decimalPlacesFloating = 2
 
data StatValue = StatValue {
    decimalPlaces :: Int,
    value :: Double
  }

When we compute such a value, we know precisely how many decimal places are meaningful for it, either zero or two, or maybe four. Later in this book, we’ll see a better way to deliver configuration values than having the decimalPlacesFloating constant all over the program.

Once again, this is not the best decision in terms of types and separation of concerns. Why on earth should we combine computations with formatting? Well, it’s simple and practical. I’m sorry.

What do we need from the given data? We want to map over it and fold it into a single value, nothing else. Consequently, (Functor t, Foldable t) => t QuoteData should suffice. We could require Traversable t instead, which conveniently extends both Functor and Foldable, but there is no need to constrain our data beyond what is actually required.

The analysis has two dimensions: the chosen statistic (minimum, maximum, mean, and number of days between the minimum and the maximum) and the specific record field (open, close, high, low, and volume). Clearly, computing the minimum is the same for any field, and extracting a field from the quote data does not depend on the computed statistic.

The following data type can be used to represent all the statistics for one field:

data StatEntry = StatEntry {
    qfield :: QField,
    meanVal :: StatValue,
    minVal :: StatValue,
    maxVal :: StatValue,
    daysBetweenMinMax :: Int
  }

We compute means with almost no information about our actual data. Once we have something Foldable with Fractional values inside, as shown next, we are good to go:

mean :: (Fractional a, Foldable t) => t a -> a
mean xs = sum xs / fromIntegral (length xs)
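Note that this definition divides by zero for an empty collection, which gives NaN for Double. The following sketch repeats mean and adds a hypothetical safeMean wrapper (not part of the project) that makes the empty case explicit:

```haskell
mean :: (Fractional a, Foldable t) => t a -> a
mean xs = sum xs / fromIntegral (length xs)

-- A hypothetical safe variant: an empty collection has no mean.
safeMean :: (Fractional a, Foldable t) => t a -> Maybe a
safeMean xs
  | null xs = Nothing
  | otherwise = Just (mean xs)
```

In our project, the CSV file is expected to contain at least one line of data, so the plain mean is good enough there.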

Computing a number of days is also quite generic. As we relate prices and volumes to days, we have to supply a Foldable with the whole QuoteData inside, as follows:

import Data.Ord (comparing)
import Data.Foldable (minimumBy, maximumBy)
import Data.Time (diffDays)
 
...
 
computeMinMaxDays :: (Ord a, Foldable t) =>
                     (QuoteData -> a) -> t QuoteData -> (a, a, Int)
computeMinMaxDays get quotes = (get minQ, get maxQ, days)
  where
    cmp = comparing get
    minQ = minimumBy cmp quotes
    maxQ = maximumBy cmp quotes
    days = fromIntegral $ abs $ diffDays (day minQ) (day maxQ)

Note that this function allows us to work with individual fields without changing their type to Double:

ghci> :type computeMinMaxDays open quotes
computeMinMaxDays open quotes :: (Double, Double, Int)
ghci> :type computeMinMaxDays volume quotes
computeMinMaxDays volume quotes :: (Int, Int, Int)

Now we can compute all the statistics into [StatEntry] as follows:

statInfo :: (Functor t, Foldable t) => t QuoteData -> [StatEntry]
-- Computes statistics for all the fields in the QField data type
statInfo quotes = fmap qFieldStatInfo [minBound .. maxBound]
  where
    -- Volumes are presented without a fractional part.
    decimalPlacesByQField Volume = 0
    decimalPlacesByQField _ = decimalPlacesFloating
 
    qFieldStatInfo qfield =
      let
        -- Getter to access the particular field
        get = field2fun qfield
        (mn, mx, daysBetweenMinMax) = computeMinMaxDays get quotes
        -- Decimal places for the particular field
        decPlaces = decimalPlacesByQField qfield
        -- The mean value always has a fractional part.
        meanVal = StatValue decimalPlacesFloating
                            (mean $ fmap get quotes)   -- extracts a Foldable with one field
        minVal = StatValue decPlaces mn
        maxVal = StatValue decPlaces mx
      in StatEntry {..}                                -- uses RecordWildCards to fill a record

Isn’t it interesting to see what we’ve just computed? Well, we should explain to GHCi how to print all these values before that. That is our next goal.

Formatting individual values

Let’s use the fmt package to format values. We have at least two types to write Buildable instances for: StatValue and StatEntry. In fact, we need the latter only to print the corresponding values in GHCi, but that is still useful.

The Buildable instance for StatValue is very simple: we just apply formatting for Double as follows:

instance Buildable StatValue where
  build sv = fixedF (decimalPlaces sv) (value sv)
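For comparison, the same idea can be expressed with base only. This sketch replaces fixedF from fmt with showFFloat from the Numeric module; the render function is a name made up for this illustration and produces a String instead of a Builder.

```haskell
import Numeric (showFFloat)

data StatValue = StatValue
  { decimalPlaces :: Int
  , value :: Double
  }

-- Renders the value with the requested number of decimal places.
render :: StatValue -> String
render sv = showFFloat (Just (decimalPlaces sv)) (value sv) ""
```

The number of decimal places travels together with the value, so a mean price and a minimum volume can be rendered by the same function.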

The same formatting should be applied when printing all the prices in the HTML report, so let’s define an auxiliary function for that next:

showPrice :: Double -> Builder
showPrice = fixedF decimalPlacesFloating

The StatEntry value has many fields, and we use Builder operators to define the final formatting as follows:

instance Buildable StatEntry where
  build StatEntry {..} =
           ""+||qfield||+": "
             +|meanVal|+" (mean), "
             +|minVal|+" (min), "
             +|maxVal|+" (max), "
             +|daysBetweenMinMax|+" (days)"

Remember, this code requires two GHC extensions, namely, RecordWildCards and OverloadedStrings.

With these two instances, we can explore statistics information for our sample quotes data as shown next:

ghci> quotes <- readQuotes "data/quotes.csv"
ghci> si = statInfo quotes
ghci> import Fmt
ghci> pretty $ unlinesF si
Open: 202.04 (mean), 175.44 (min), 224.80 (max), 100 (days)
Close: 202.16 (mean), 173.30 (min), 223.59 (max), 100 (days)
High: 204.10 (mean), 177.92 (min), 226.42 (max), 101 (days)
Low: 200.32 (mean), 170.27 (min), 222.86 (max), 101 (days)
Volume: 27869192.38 (mean), 11362000 (min), 69281400 (max), 28 (days)

We definitely need tables, don’t we? Stay with me.

Printing a table

Let’s look at the ideas behind the colonnade package. It allows defining the structure of a table that is a collection of columns. Every column is defined by the column header and a function to extract and format a value from the data structure corresponding to one row of the table. Tables in colonnade are Monoid values. Every column is a one-columned table. If we combine two columns with (<>), we get a two-columned table. Once a table structure is ready, we supply a list of row values. The library then prepares data, computes column widths, and formats output in a tabular form. All this functionality is provided by the Colonnade module.
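The idea of a monoidal table can be modeled in a few lines. The following toy sketch is not how colonnade is implemented; Table, column, render, and people are hypothetical names, and the rendering is much cruder than what the ascii function produces.

```haskell
import Data.List (intercalate, transpose)

-- A table is a list of columns: a header plus a way to render a cell.
newtype Table a = Table [(String, a -> String)]

-- Combining two tables concatenates their columns.
instance Semigroup (Table a) where
  Table xs <> Table ys = Table (xs ++ ys)

instance Monoid (Table a) where
  mempty = Table []

-- A one-column table.
column :: String -> (a -> String) -> Table a
column header cell = Table [(header, cell)]

-- Renders the header and rows, padding cells to the column width.
render :: Table a -> [a] -> String
render (Table cols) rows = unlines (map formatRow allRows)
  where
    allRows = map fst cols : [ [ cell r | (_, cell) <- cols ] | r <- rows ]
    widths = map (maximum . map length) (transpose allRows)
    pad w s = s ++ replicate (w - length s) ' '
    formatRow = intercalate " | " . zipWith pad widths

-- Two one-column tables combined into a two-column one.
people :: Table (String, Int)
people = column "Name" fst <> column "Age" (show . snd)
```

The table structure knows nothing about the actual rows until we call render, which mirrors the separation between describing columns and supplying data.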

To organize tabular printing as a text for [StatEntry], we define a list of columns, mconcat them, and then call an ascii function that produces a String formatted as a table, as shown next:

textReport :: [StatEntry] -> String
textReport = ascii colStats
  where
    colStats = mconcat
      [ headed "Quote Field" (show . qfield)
      , headed "Mean" (pretty . meanVal)
      , headed "Min" (pretty . minVal)
      , headed "Max" (pretty . maxVal)
      , headed "Days between Min/Max" (pretty . daysBetweenMinMax)
      ]

Note how we use the pretty function from fmt, which formats the given Buildable as expected by the context. In this case, the ASCII backend from colonnade expects a String for every cell value. So the pretty function returns a String.

Let’s print this report immediately as follows:

ghci> quotes <- readQuotes "data/quotes.csv"
ghci> putStr $ textReport $ statInfo quotes
+-------------+-------------+----------+----------+----------------------+
| Quote Field | Mean        | Min      | Max      | Days between Min/Max |
+-------------+-------------+----------+----------+----------------------+
| Open        | 202.04      | 175.44   | 224.80   | 100                  |
| Close       | 202.16      | 173.30   | 223.59   | 100                  |
| High        | 204.10      | 177.92   | 226.42   | 101                  |
| Low         | 200.32      | 170.27   | 222.86   | 101                  |
| Volume      | 27869192.38 | 11362000 | 69281400 | 28                   |
+-------------+-------------+----------+----------+----------------------+

Writing monadic one-liners with (>>=)

For quick and dirty GHCi exploring, we could have the same table printed as a one-liner as follows:

ghci> readQuotes "data/quotes.csv" >>= putStr . textReport . statInfo

I believe that using monadic bind (>>=) explicitly from time to time helps to understand monads better. After all, it’s just a function. In this particular case, it is implemented for sequencing IO actions. Nothing special.
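To compare the two styles side by side, here is a small sketch with made-up names (shoutDo and shoutBind). Both functions take an IO action producing a String and return an action producing the same text in uppercase:

```haskell
import Data.Char (toUpper)

-- The do-notation version.
shoutDo :: IO String -> IO String
shoutDo load = do
  s <- load
  pure (map toUpper s)

-- The same pipeline with an explicit bind and function composition.
shoutBind :: IO String -> IO String
shoutBind load = load >>= pure . map toUpper
```

The compiler translates the do block into calls to (>>=), so the second version is essentially what the first one becomes after desugaring.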

That’s it. We’ve combined the following to get a tabular view:

  • A data type representing rows

  • Formatting of individual values via fmt

  • A monoidal table structure (headers, cell data, and their formatting) via colonnade

  • ASCII backend from colonnade to produce a resulting String

Let’s move to preparing an HTML report and apply the same ideas to get an HTML table.

Generating an HTML document

The structure of the document we want to generate follows:

<html>
  <head>
    <title>...</title>
    <style>...</style>
  </head>
  <body>
    <h1>Charts</h1>
    <!-- charts -->
    <h1>Statistics Report</h1>
    <!-- statistics table -->
    <h1>Stock Quotes Data</h1>
    <!-- quote data table -->
  </body>
</html>

HTML is quite verbose. Fortunately, it is possible to alleviate this verbosity with a library.

Example: Preparing the report in HTML

stockquotes/HtmlReport.hs

stockquotes

The blaze-html package provides a monadic interface for generating HTML.

The blaze-colonnade package generates tables in HTML.

To build an HTML document, we use the following two modules:

import Text.Blaze.Html5 as H
import Text.Blaze.Html5.Attributes (src)

Note the unqualified aliased import of the Text.Blaze.Html5 module. We usually do that when some names from the module are ambiguous but others are not. For example, this module provides head and body functions for the corresponding HTML tags. With such an import declaration, we write H.head to avoid ambiguity with the head function over lists and leave body unqualified.
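The same import style works with any module. Here is a base-only sketch using Data.List.NonEmpty, which also exports a head function; firstOr is a hypothetical helper introduced only for this illustration.

```haskell
import Data.List.NonEmpty as NE

-- nonEmpty is unambiguous, so it needs no qualifier.
-- NE.head must be qualified, because the Prelude also exports head.
firstOr :: a -> [a] -> a
firstOr fallback xs = maybe fallback NE.head (nonEmpty xs)
```

Writing head without the NE prefix here would result in an ambiguity error, just like an unqualified head next to the Text.Blaze.Html5 import.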

Let’s start with tables. We prepare to format individual values first, as shown next:

viaFmt :: Buildable a => a -> Html
viaFmt = text . pretty

The text function expects Text for input, and it outputs an Html value. Consequently, the pretty function from fmt will provide Text from the given Buildable.

Next, we describe table structures:

colStats :: Colonnade Headed StatEntry Html
colStats = mconcat
      -- The quote field is formatted as italic; string converts String to Html.
      [ headed "Quote Field" (i . string . show . qfield)
      , headed "Mean" (viaFmt . meanVal)
      , headed "Min" (viaFmt . minVal)
      , headed "Max" (viaFmt . maxVal)
      , headed "Days between Min/Max" (viaFmt . daysBetweenMinMax)
      ]
 
colData :: Colonnade Headed QuoteData Html
colData = mconcat
      [ headed "Day" (viaFmt . day)
      -- The showPrice function leaves only two decimal places.
      , headed "Open" (viaFmt . showPrice . open)
      , headed "Close" (viaFmt . showPrice . close)
      , headed "High" (viaFmt . showPrice . high)
      , headed "Low" (viaFmt . showPrice . low)
      , headed "Volume" (viaFmt . volume)
      ]

The Text.Blaze.Colonnade module provides two useful functions to look at these table structures in action. The encodeHtmlTable function takes Attributes for the table (it’s a Monoid, and we can use mempty for no attributes), table structure, and a list of raw data. It returns an Html value that can be printed via the printCompactHtml function. We can look at the HTML code generated with the table structures just defined as follows:

ghci> quotes <- readQuotes "data/quotes.csv"
ghci> si = statInfo quotes
ghci> import Text.Blaze.Colonnade
ghci> printCompactHtml (encodeHtmlTable mempty colStats si)
<table>
    <thead>
        <tr>
            <th>QuoteField</th>
            <th>Mean</th>
            <th>Min</th>
            <th>Max</th>
            <th>DaysbetweenMin/Max</th>
        </tr>
    </thead>
    <tbody>
     ...
    </tbody>
</table>

Note This representation is so compact that the printCompactHtml function has stripped away all spaces inside the th tags. Fortunately, the main HTML-rendering function does better. As for this function, the documentation warns: “The implementation is inefficient and incorrect in many corner cases. [...] Use of this function is discouraged.” Okay, we just wanted to look at our table in GHCi.

Now we are ready to generate the whole HTML document as shown here:

htmlReport :: (Functor t, Foldable t) =>
              String -> t QuoteData -> [StatEntry] -> [FilePath] -> ByteString
htmlReport docTitle quotes statEntries images = renderHtml $ docTypeHtml $ do
     H.head $ do
       title $ string docTitle
       style tableStyle
     body $ do
       unless (null images) $ do
         h1 "Charts"
         -- Generates the img tags with the src attributes
         -- pointing to the image files provided
         traverse_ ((img!).src.toValue) images
 
       h1 "Statistics Report"
       encodeHtmlTable mempty colStats statEntries
 
       h1 "Stock Quotes Data"
       encodeHtmlTable mempty colData quotes
  where
    tableStyle = "table {border-collapse: collapse}" <>
                 "td, th {border: 1px solid black; padding: 5px}"

Only one line of this code may be hard to understand: the one where we traverse the list of images:

traverse_ ((img!).src.toValue) images

As usual, our main way to understand what is going on is to look at the types. We can use :type and :info in GHCi after importing all the modules we use, as shown next:

images :: [FilePath]
img :: Html
type Html = Markup
type Markup = MarkupM ()
(!) :: Attributable h => h -> Attribute -> h
src :: AttributeValue -> Attribute
toValue :: ToValue a => a -> AttributeValue
traverse_  :: (Foldable t, Applicative f) => (a -> f b) -> t a -> f ()

Now we can use this information to specify types as follows:

type Html = MarkupM ()
img! :: Attribute -> Html
(img!).src.toValue :: ToValue a => a -> Html
traverse_ :: (FilePath -> Html) -> [FilePath] -> Html
traverse_ ((img!).src.toValue) images :: Html

Finally, we’ve got Html as expected! It turned out, also, that Html is a monadic context with the () value in it. The resulting HTML for images follows:

<h1>Charts</h1><img src="chart.svg"/>
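To see how traverse_ accumulates markup, we can replace MarkupM with a much simpler writer-like applicative from base: a pair whose first component is a Monoid. ToyHtml, imgTag, and imgTags below are hypothetical names; this is an analogy and says nothing about how blaze-html is implemented.

```haskell
import Data.Foldable (traverse_)

-- A writer-like stand-in for Html: accumulated markup and a unit result.
type ToyHtml = (String, ())

-- Produces one img tag as an "effect" and returns () as a value.
imgTag :: FilePath -> ToyHtml
imgTag path = ("<img src=\"" ++ path ++ "\"/>", ())

-- traverse_ runs imgTag for every path and combines the effects in order.
imgTags :: [FilePath] -> ToyHtml
imgTags = traverse_ imgTag
```

The results of the individual actions are discarded, and only the accumulated markup matters. That is why traverse_ and a context holding the () value fit together so well here.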

We can also note the following style definition for HTML tables with the <> semigroup operation:

tableStyle = "table {border-collapse: collapse}" <>
             "td, th {border: 1px solid black; padding: 5px}"

Well, HTML and CSS are all about monads and monoids. Checkmate, my cheerful frontend developers.

The blaze-html package relies heavily on the OverloadedStrings GHC extension to turn every String literal into a value of the Html type.

I believe everything else in this code is self-explanatory. This is a sign of a good library. A monadic interface with do blocks guarantees that we store all the information about a document we provide somewhere inside the Html value. Compare this approach with the lens-based interface to the Chart library. Both approaches are extensively used in Haskell. It’s crucial to get used to both of them.

The Html we have is rendered as a ByteString that we can export to a file. Note that htmlReport is a pure function. We don’t have IO in its type. Everything in this module works in a pure part of our program.

We are done with reporting. Let’s describe a user interface and connect everything together.

3.3.4 Implementing the user interface

As usual in Haskell, we strive to turn the user input into something explicitly typed as quickly as possible. In the case of command-line arguments (a list of String values), this means parsing them into some record. The optparse-applicative package, which we’ll use in this section for parsing command-line arguments, follows exactly this approach. It is an example of a highly regarded, professional, purely declarative (thanks to good abstractions) package with great documentation and many use cases. I don’t attempt to describe all of its features but limit myself to a short demonstration.

Example: Describing and processing command-line arguments

stockquotes/Params.hs

stockquotes

We can describe command-line arguments declaratively.

The interface we want follows:

Usage: stockquotes FILE [-n|--name ARG] [-c|--chart] [--html FILE]
                        [-s|--silent]
  Stock quotes data processing
 
Available options:
  FILE                     CSV file name
  -n,--name ARG            Company name
  -c,--chart               Generate chart
  --html FILE              Generate HTML report
  -s,--silent              Don't print statistics
  -h,--help                Show this help text

And the following is a record with all the information:

data Params = Params {
                fname :: FilePath
              , company :: Maybe Text
              , chart :: Bool
              , htmlFile :: Maybe FilePath
              , silent :: Bool
              }

The question is this: How do we relate one to another? Well, we describe every field as a command-line argument and provide an injection into a Params value. The optparse-applicative library provides an Applicative interface for that with the <$> and <*> operators. Every field description is combined via <*>, and the final injection is done by the <$>. Field descriptions are constructed with <> from Semigroup. Here is the code:

mkParams :: Parser Params
mkParams =
  Params <$>                               -- final injection
             strArgument                   -- mandatory positional FilePath argument
               (metavar "FILE" <> help "CSV file name")  -- name and help text in the output
         <*> optional (strip <$> strOption -- optional Maybe Text argument, whitespace stripped
               (long "name" <> short 'n' <>              -- long- and short-option descriptors
                help "Company name"))                    -- help text
         <*> switch                        -- a switch corresponds to a Bool field
               (long "chart" <> short 'c' <>
                help "Generate chart")
         <*> optional (strOption $         -- optional Maybe FilePath argument
               long "html" <> metavar "FILE" <>
               help "Generate HTML report")
         <*> switch
               (long "silent" <> short 's' <>
                help "Don't print statistics")

Remember that FilePath is an alias for String, so optparse-applicative doesn’t have to distinguish them.

Note also that Parser is an Applicative, and the Params value is a result of computations in this context. We apply the multiparametric value constructor Params via the <$> operator (which comes from Functor, the superclass of Applicative). We know that there should be exactly five arguments (as in the Params record). All of them are provided one by one via the <*> operator. The type of every argument is checked against the type of the corresponding field. It’s impossible to describe a Bool field as a String argument: GHC would complain immediately.

All the strArgument, strOption, and switch functions take Semigroup-based combinations of properties, and every such function refers to exactly one Params field at the same position.

Now that we’ve described the correspondence between command-line arguments and Params fields, we should supply additional usage information, which is printed if the user specifies the --help or -h switch, and then run the actual parsing. The following code sample demonstrates how to do that:

cmdLineParser :: IO Params             -- the parser is run as an IO action
cmdLineParser = execParser opts        -- runs parsing
  where
    opts = info (mkParams <**> helper) -- augments mkParams with switches for the help screen
                (fullDesc <> progDesc "Stock quotes data processing")
                                       -- provides additional information for the help screen

We have an IO action here because we need access to command-line arguments. The result of the computation has the Params type. In this action, we extend command-line arguments prepared earlier with the standard help screen and execute the parser.
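If we want to try a parser without touching the real command line, optparse-applicative also offers execParserPure, which takes the list of arguments explicitly. The following is a minimal sketch under that assumption; MiniParams, miniParser, miniInfo, and parseArgs are hypothetical names for a reduced version of the Params record, not part of the project.

```haskell
import Options.Applicative

-- Hypothetical, reduced version of the Params record.
data MiniParams = MiniParams
  { miniFile  :: FilePath
  , miniChart :: Bool
  } deriving (Eq, Show)

-- One field description per record field, in the same order.
miniParser :: Parser MiniParams
miniParser =
  MiniParams
    <$> strArgument (metavar "FILE" <> help "CSV file name")
    <*> switch (long "chart" <> short 'c' <> help "Generate chart")

-- The parser extended with the standard help switches.
miniInfo :: ParserInfo MiniParams
miniInfo = info (miniParser <**> helper) fullDesc

-- Pure parsing: Nothing means the arguments were rejected
-- (or the help screen was requested).
parseArgs :: [String] -> Maybe MiniParams
parseArgs = getParseResult . execParserPure defaultPrefs miniInfo
```

In GHCi, parseArgs ["quotes.csv", "-c"] yields a MiniParams value with the switch set, whereas parseArgs [] yields Nothing because the mandatory FILE argument is missing.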

3.3.5 Connecting parts

The last thing to do is to connect all the parts of this project together. Namely, we should

  • Get Params from command-line arguments via cmdLineParser.

  • Read the CSV file.

  • Compute the statistics.

  • Prepare and print a text report, if required.

  • Generate the charts, if required.

  • Prepare and export the HTML report, if required.

Let’s do all that now.

Example: Connecting parts in the Main module

stockquotes/Main.hs

stockquotes

The Main module connects everything in the IO part of the program.

We’ll split the job between three functions: main, work, and generateReports. The main function is responsible for running the command-line parser and delegates everything else to work, as shown next:

main :: IO ()
main = cmdLineParser >>= work

The work function takes the constructed Params as an argument, reads and decodes the CSV file, and runs generateReports, if everything goes well, as follows:

work :: Params -> IO ()
work params = do
  csvData <- BL.readFile (fname params)
  case decodeByName csvData of
    Left err -> putStrLn err
    Right (_, quotes) -> generateReports params quotes

We read a ByteString (from Data.ByteString.Lazy, imported with the prefix BL) from the file and decode it with the decodeByName function from the cassava package’s Data.Csv module. This function has the following type signature:

decodeByName :: FromNamedRecord a
             => BL.ByteString
             -> Either String (Header, Vector a)

The quotes value is later used as a value of the following type:

(Functor t, Foldable t) => t QuoteData

The type checker figures out that the a type variable in the type signature for decodeByName refers to QuoteData. Remember, we’ve derived an instance of FromNamedRecord for it.
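To experiment with decodeByName outside the project, we can decode a small in-memory CSV document. This sketch assumes the cassava, vector, and bytestring packages. The DayClose record, its hand-written FromNamedRecord instance, and the sample values are made up for illustration; the project's QuoteData has more fields and gets its instance by deriving.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Lazy as BL
import Data.Csv (FromNamedRecord (..), decodeByName, (.:))
import Data.Vector (Vector)
import qualified Data.Vector as V

-- Hypothetical two-field record standing in for QuoteData.
data DayClose = DayClose
  { dcDay   :: String
  , dcClose :: Double
  } deriving (Eq, Show)

-- Fields are looked up by the column names from the CSV header.
instance FromNamedRecord DayClose where
  parseNamedRecord r = DayClose <$> r .: "day" <*> r .: "close"

-- Made-up sample data with a header line.
sampleCsv :: BL.ByteString
sampleCsv = "day,close\n2019-05-01,210.52\n2019-05-02,209.15\n"

-- The result type fixes the type variable a to DayClose;
-- we drop the header and keep the Vector of rows.
decodeSample :: Either String (Vector DayClose)
decodeSample = snd <$> decodeByName sampleCsv

-- The day of the first row, if decoding succeeded and a row exists.
firstDay :: Either String (Maybe String)
firstDay = fmap (fmap dcDay . (V.!? 0)) decodeSample
```

Just as in work, no type annotation is needed at the call to decodeByName itself: the type of decodeSample is enough for the type checker to select the instance.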

The Vector type comes from the vector package. This package provides an efficient implementation of Int-indexed arrays with many optimizations for loop-like operations.

In the case of correct decoding, we get a Vector of QuoteData values. Vector implements both the Functor and Foldable type classes. Thus, all our code for computing statistics and preparing reports remains intact (and it performs quite well, thanks to the Vector instances of Functor and Foldable).
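The benefit of the (Functor t, Foldable t) constraint is easy to demonstrate with the base library alone. The meanOf helper below is hypothetical (it is not the chapter's statInfo); it merely shows that one definition serves several container types without changes.

```haskell
import Data.List.NonEmpty (NonEmpty (..))

-- Hypothetical helper: the mean of a projected field over any container
-- that is both a Functor and a Foldable; Nothing for an empty container.
meanOf :: (Functor t, Foldable t) => (a -> Double) -> t a -> Maybe Double
meanOf f xs
  | null xs   = Nothing
  | otherwise = Just (sum (fmap f xs) / fromIntegral (length xs))

-- The same function applied to a list...
listMean :: Maybe Double
listMean = meanOf fst [(1, 'a'), (2, 'b'), (3, 'c')]

-- ...and to a NonEmpty, with no changes to meanOf.
nonEmptyMean :: Maybe Double
nonEmptyMean = meanOf id (10 :| [20])
```

Because Vector has instances of both classes too, a call such as meanOf over a Vector would type-check in exactly the same way.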

The generateReports function, shown next, does the rest of the job:

generateReports :: (Functor t, Foldable t) => Params -> t QuoteData -> IO ()
generateReports Params {..} quotes = do
  unless silent $ putStr textRpt
  when chart $ plotChart title quotes chartFname
  saveHtml htmlFile htmlRpt
 where
   statInfo' = statInfo quotes
   textRpt = textReport statInfo'
   htmlRpt = htmlReport title quotes statInfo' [chartFname | chart]
 
   withCompany prefix = maybe mempty (prefix <>) company
   chartFname = unpack $ "chart" <> withCompany "_" <> ".svg"
   title = unpack $ "Historical Quotes" <> withCompany " for "
 
   saveHtml Nothing _ = pure ()
   saveHtml (Just f) html = BL.writeFile f html

Note the use of the maybe function and the Monoid instance for Text in the company name processing (the withCompany function). Once we have Text values of the chart filename and the title, we convert them to Strings expected by other functions with the unpack function.
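The company name processing can be tried in isolation. In the following sketch, the local definitions of generateReports are lifted to top-level functions that take the company name explicitly; the names chartFnameFor and titleFor are hypothetical and introduced only for this illustration. It assumes the text package.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text, unpack)

-- With no company we get mempty (the empty Text);
-- otherwise the company name with the given prefix.
withCompany :: Maybe Text -> Text -> Text
withCompany company prefix = maybe mempty (prefix <>) company

-- Chart file name, with an optional "_company" part.
chartFnameFor :: Maybe Text -> FilePath
chartFnameFor company =
  unpack $ "chart" <> withCompany company "_" <> ".svg"

-- Report title, with an optional " for company" part.
titleFor :: Maybe Text -> String
titleFor company =
  unpack $ "Historical Quotes" <> withCompany company " for "
```

For example, chartFnameFor Nothing gives "chart.svg", and chartFnameFor (Just "ibm") gives "chart_ibm.svg". Returning mempty in the Nothing case is what lets us use <> unconditionally on both sides.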

Unless we were asked to be silent, we print the report to the console. When asked to generate charts, we plot them. Finally, we export the HTML report into the file with the given name if provided. That is all for this project.

Summary

  • Use the time package whenever processing dates and times.

  • Choose your own favorite package for representing textual data: formatting and fmt are good candidates.

  • Drawing charts is easy with the Chart package; give it a try.

  • Try the cassava package for parsing CSV files.

  • Use the optparse-applicative package for parsing command-line arguments and generating default help screens.

  • Learn to build HTML documents with the blaze-html package.

  • Monad, Applicative, Functor, Foldable, Semigroup, and Monoid are our friends in practice.

  • Use Haskell for everything you need.
