© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
A. Serrano MenaPractical Haskellhttps://doi.org/10.1007/978-1-4842-8581-7_13

13. Strong Types for Describing Offers

Alejandro Serrano Mena1  
(1)
Utrecht, The Netherlands
 

You’ve already finished two of the main tasks in this book: implementing some data mining algorithms for analyzing clients and their purchases and looking at several ways to save the information about purchases in a durable container. This chapter is the starting point of the last task: creating a domain-specific language (DSL) for expressing offers to be applied in the Time Machine Store. You will provide a set of combinators to create the offers, which are initially basic offers.

Not all offers expressible in this language will be valid. The interesting part is that you can use the type system to help avoid many of the wrong offers, and this chapter will teach you the features needed to do that.

I will introduce you first to generalized algebraic data types (GADTs), which are a way to create more concrete constructors for your data types. Then, you will see how to encode other invariants using different techniques. In particular, this chapter discusses both functional dependencies (FDs) and type families (TFs), along with singletons .

Once you’ve read this chapter, you’ll have a thorough grasp of the complete Haskell type system. You will be able to understand code that uses the advanced features described here, and you also will be able to use them in your own code.

Domain-Specific Languages

A domain-specific language (DSL) refers usually to a language not intended for expressing any computer program but rather data or computations about a specific problem space. For example, SQL can be seen as a DSL for expressing database queries. Haskell is not a DSL, but rather a general-purpose language, because it’s not tied to any particular application domain.

The existence of DSLs is not new in computer science. Many of the programs you use contain one or several DSLs at their core. For example, CSS is a DSL for expressing presentation attributes of markup elements, regular expressions are a DSL for expressing patterns over a string, hardware designers use specific languages for designing circuits, and so on.

Again, a DSL is not intended to express just any domain but rather a specific one. This means a DSL is tightly linked to the things you intend to express, allowing you to communicate those concepts using exactly the abstractions you need for that domain. This gives you two advantages :
  • You no longer need developers to write the business rules in your program. If your DSL really expresses the rules using the usual abstractions for the intended domain, the experts on that specific domain can be the ones writing the rules. This decreases the mismatch between domain experts and programmers.

  • You need to implement only a core DSL, which is usually quite small, and then use the new abstractions to build the rest of the system. This makes the programs easier to write, easier to understand, and easier to maintain because the abstraction of the domain is not scattered between different moving parts.

These advantages are important in the software engineering process. As a result, the concept of DSLs has risen in importance, and many problems are now tackled by means of a custom language for a domain. In fact, many tools are geared toward designing DSLs, and this chapter will make the case that Haskell is one of best in this field.

Embedding Your Language in Haskell

Any DSL you design will ultimately be integrated inside a larger application . That larger application will be written in a general-purpose language, like Haskell. Thus, it’s important to consider the ways in which the DSL can hook into that other language, usually called the host.

One possibility is to make the DSL completely independent of the host language. In that case, you need to develop a full set of tools for this language. At the very least, you need to write a parser for the language. Thus, you get a lot of freedom in the design, but you need to put a lot of effort into its implementation. This way of integrating your language, called external or stand-alone, is useful when the DSL forms the core of your application and you want to provide extra tools (such as debuggers and IDE integration). Think of HTML as a perfect example of this kind of DSL.

In contrast, you can also develop an embedded or internal DSL , where your language is expressed using some of the constructs of the host language. The main advantage of this approach is that you don’t need to write as many of the tools, and the host language can help you in designing the DSL in such a way that you don’t need extra passes to detect those values of your DSL that are illegal. Furthermore, you are not tied to using only one DSL at once; you can embed several of them in your application. The main disadvantage is that you’re limited by the host language. In particular, the internals of your DSL may be exposed in the case of an error. Haskell has several features that make it a good host language for embedded DSLs :
  • Its powerful type system allows you to express invariants that valid values in your DSL must satisfy. This means the extra safety Haskell provides is also in your language.

  • Haskell syntax is terse, so it doesn’t introduce many strange symbols in your DSL. Using operators and the mixfix application, you can make the values in your new language resemble a natural language description.

  • Type classes, especially applicatives, monads, and do notation, provide a convenient framework for expressing the abstractions that are part of a DSL. This means Haskell developers can use a well-known notation for many concepts.

Many of the libraries I’ve talked about in this book are actually embedded DSLs: parser combinators like attoparsec for describing grammars only need to use special functions, a query language such as Esqueleto uses a combination of type classes and custom data types, and digestive-functors expresses forms using applicative style and a set of basic ways to treat a form value. As you can see, the features of Haskell are used differently by each DSL, depending on how convenient the features are.

There are two main trends in how embedded DSLs express their behavior in Haskell. A shallow embedding encodes the values in your language directly by their meaning inside Haskell (this meaning is called the language’s interpretation). For example, say you create a DSL for stack operations; its shallow embedding would represent each operation with a Haskell function , as this code shows:
pop :: [a] -> [a]
pop (x:xs) = xs
pop _      = error "empty stack"
push :: a -> [a] -> [a]
push x xs = x:xs
-- A value in our language: its interpretation directly works on a stack
v :: [Int] -> [Int]
v = push 1 . pop . pop . push 2
A deep embedding separates the use of the language into two parts. First, you create a representation of the value in your language as a syntax tree, using Haskell data types. The previous example would be deeply embedded as follows:
data StackOps a = Pop (StackOps a) | Push a (StackOps a) | End
-- The same value represented as a syntax tree
v :: StackOps Int
v = Push 1 $ Pop $ Pop $ Push 2 $ End
Next, you give an interpretation of the syntax tree that converts it to its meaning inside Haskell. The advantage is that now you can add some extra optimizations as you go; for example, Pop after Push is like performing no operation at all.
interpret :: StackOps a -> [a] -> [a]
interpret (Pop (Push _ ops)) stack = interpret ops stack
interpret (Pop ops)          stack = tail $ interpret ops stack
interpret (Push e ops)       stack = e : interpret ops stack
interpret End                stack = stack

You’ve seen the two advantages of using deep instead of shallow embedding. First, you can give more than one interpretation; that is, you can treat the values of your embedded DSL in different ways depending on the situation. Additionally, you can inspect the syntax tree before creating the meaning and implementing optimizations or statistics about your value.

In this chapter, I will show how to design a deep embedding of the offers language inside Haskell. The next chapter will explain a conceptual tool , attribute grammars, and a concrete implementation of those ideas, UUAGC , which help you express the behavior of one of the offers applied to a basket of products.

The Offers Language

Let’s create a DSL for expressing the offers in the Time Machine Store. In addition to empowering managers to directly encode the offers for your application (so you don’t need to manually implement them each time), implementing the language will be a good way to learn more Haskell. You don’t need a big language for the second objective, so I will keep the core DSL small. As a general guide, you should always try to make your DSL as small as possible because it is easier to work with a small core. For complex needs, you can write functions that generate compound expressions out of the simple language you have created.

The first things the offers language needs to provide are the basic offers that the Store may use. In this case, you have three of them: giving something as a present, discounting some percentage of the price, and discounting some absolute value from the price. From these basic offers, you will be able to generate values in the DSL by combining offers and extra pieces of data. In this case, the combinations can be split into three groups :
  1. 1.

    The language may restrict the offer to be valid only on a set of products.

     
  2. 2.

    By default, all the offers are valid for an indefinite period of time. The language will provide ways to constrain the starting and ending dates of the offer to extend the offer for a longer time.

     
  3. 3.

    The language will allow you to make an offer be the union of two offers (e.g., give a free balloon and a 10 percent discount), be just the best out of two (e.g., either give a 10 percent or give a $5 discount), or be conditional upon satisfying some property in the purchase (such as applying an offer only when the client purchases more than $100 worth of products).

     
From this description, the data declaration for the syntax tree is straightforward to obtain.
data Offer a = Present a
             | PercentDiscount  Float
             | AbsoluteDiscount Float
             | Restrict [a] (Offer a)
             -- product restriction (1)
             | From Integer (Offer a)
             -- time restriction (2)
             | Until Integer (Offer a)
             | Extend Integer (Offer a)
             | Both (Offer a) (Offer a)
             -- offer combinators (3)
             | BetterOf (Offer a) (Offer a)
             | If (Expr a) (Offer a) (Offer a)
             deriving Show
In some cases when using conditional expressions , you may need to express that no offer is given if some condition is not satisfied. For those cases, you would need a “no offer” value. One approach would be to include an extra constructor in the data type. However, you can see that an absolute discount of $0 is equivalent to no offer. Thus, you can keep the core as is and define this offer in terms of the others.
noOffer :: Offer a
noOffer = AbsoluteDiscount 0
The missing part from the code of the Offer type is the declaration of the Expr data type that will encode the expressions that may appear as conditions over the purchase. This should include the amount and prices of several of the items in the shopping basket, comparisons between those quantities, and Boolean combinations of conditions (and, or, and not). Notice in the following code how you need to lift basic integer and floating-point values into the language via the IVal and FVal constructors :
data Expr a
  = AmountOf a | PriceOf a
  -- information about the cart
  | TotalNumberProducts | TotalPrice
  -- lifting numerical values
  | IVal Integer | FVal Float
  -- arithmetic
  | (Expr a) :+: (Expr a) | (Expr a) :*: (Expr a)
  -- comparison
  | (Expr a) :<: (Expr a) | (Expr a) :<=: (Expr a)
  | (Expr a) :>: (Expr a) | (Expr a) :>=: (Expr a)
  -- boolean operations
  | (Expr a) :&&: (Expr a) | (Expr a) :||: (Expr a) | Not (Expr a)
  deriving Show
For example, let’s express the offer “for the next 30 days, you will get the best of these two details: either getting a discount of $10 off your final bill and getting a balloon as a present or, if you buy more than $100 of products, a 5 percent discount.” The value for this offer is as follows:
v :: Offer String
v = Until 30 $ BetterOf (AbsoluteDiscount 10.0)
                        (Both (Present "balloon")
                              (If (TotalPrice :>: IVal 100)
                                  (PercentDiscount 5.0)
                                  noOffer))

These data types are the core of our DSL. But as Exercise 13-1 shows, you can add some helper functions to make it easier to describe offers that follow the same pattern often.

EXERCISE 13-1. OFFER PATTERNS
You’ve seen how noOffer could be defined in terms of more basic constructors, keeping the core language simple. Following the same idea, write definitions for the following patterns of offers:
  • period f d o will constrain the offer o for the following d days starting from day f. Remember that From and Until have as arguments specific points in time, not lengths.

  • allOf os should be the conjunction with all the offers in the list os.

Then, express the following offer: “From the third day and for five days hence, you will get a free balloon, a free chocolate muffin, and a 10 percent discount in the Time Machine Store.” Check whether your expression corresponds to the correct offer expressed using the core DSL.

In addition to constraining the kind of basic offers that you can express, there are two further requirements that all values in your DSL must satisfy. The first one is the Presents Rule: at some time during the year, the number of presents that will be given for free with a purchase is limited. Thus, the system should check that constraint. The second one is the Duration Rule: you don’t want to allow offers if they violate a time restriction. You’ll learn how to enforce these two rules in the language later in this chapter.

Adding Safety to the Expression Language

I emphasized in the introduction to the chapter that Haskell’s strong type system helps you to forbid incorrect values in your DSL. In the first implementation I showed previously, you can create such incorrect values without much problem. Take the following example, which creates an Expr value by taking the disjunction of the price and some other expression. But a price alone is not a Boolean value , so you wouldn’t be able to give any meaning to it.
incorrectExpression :: Expr Char
incorrectExpression = TotalPrice :||: (TotalNumberProducts :<: PriceOf 'a')
The remedy, which is common to all the examples, is to add a tag to the types involved in the DSL and constrain the ways in which the values of different types can be combined between them:
  • Amounts, prices of items, and constant values (those created through the FVal or IVal constructors) should be tagged as numbers.

  • Comparisons will take as arguments only values tagged as numbers and will produce a value tagged as a Boolean.

  • Boolean operators will combine only those expressions tagged as Booleans.

  • The final expression in an offer must be tagged as a Boolean.

The perfect way to add this tag to expressions is by adding a new type parameter to the Expr data type . In that case, Expr a t will be an expression over products of type a tagged with type t. Now, if each combinator that makes up an expression is a regular function instead of a constructor, you could enforce the constraints by restricting the signature of the function. Here’s an example, but you would have a similar function for each combinator:
(:||:) :: Expr a Bool -> Expr a Bool -> Expr a Bool

The problem is that plain data declarations do not allow you to return different types depending on the constructor of a value. All constructors uniformly construct values of the type expressed after the data keyword. Generalized algebraic data types (GADTs) lift that restriction ; now each of the constructors of the data type can return a different set of type parameters to the type being defined. That is, in the definition of Expr a t, you can return Expr a Bool, Expr Int Int, and so on, but not Offer Char.

The syntax starts with the same keyword, data, followed by the name and type variables of the type to define. But instead of equal signs, you need to write where and list the constructors via its signatures. Your expression data type written as a GADT becomes the following:
{-# LANGUAGE GADTs #-}
data Expr a r where
  AmountOf            :: a -> Expr a Integer
  PriceOf             :: a -> Expr a Float
  TotalNumberProducts :: Expr a Integer
  TotalPrice          :: Expr a Float
  IVal                :: Integer -> Expr a Integer
  FVal                :: Float -> Expr a Float
  (:+:)               :: Num n => Expr a n -> Expr a n -> Expr a n
  (:*:)               :: Num n => Expr a n -> Expr a n -> Expr a n
  (:<:)               :: Num n => Expr a n -> Expr a n -> Expr a Bool
  (:<=:)              :: Num n => Expr a n -> Expr a n -> Expr a Bool
  (:>:)               :: Num n => Expr a n -> Expr a n -> Expr a Bool
  (:>=:)              :: Num n => Expr a n -> Expr a n -> Expr a Bool
  (:&&:)              :: Expr a Bool -> Expr a Bool -> Expr a Bool
  (:||:)              :: Expr a Bool -> Expr a Bool -> Expr a Bool
  Not                 :: Expr a Bool -> Expr a Bool

Since the arithmetic and comparison operators need to work on both Integer and Float values , we cannot use a type there directly. Instead, we use a Num constraint, since both types are instances of that type class.

Now if you try to build some code with the incorrect expression that started this section, you will get a compile error because the expression will not type check. This is the first example of using the strong type system to constrain the kind of values that the DSL can express.

GADTs solve another problem that will appear in one way or another later. Suppose you want to interpret one expression using the original Expr data type. This interpretation will be a function that, given a list of products as (name, price) pairs, returns the result of applying the expression to it. Since the result of the expression can be either an Integer, a Float, or a Bool, you need a sum type as a return value. Here’s a small excerpt of the interpretation for the (:||:) case; notice how you need to take care of type mismatches explicitly in the code:
data ExprR = EInt Integer | EFloat Float | EBool Bool
interpretExpr :: Expr a -> [(a,Float)] -> ExprR
interpretExpr (e1 :||: e2) list =
  case (interpretExpr e1 list, interpretExpr e2 list) of
    (EBool b1, EBool b2) -> EBool (b1 || b2)
    _                    -> error "type error"
interpretExpr ...               = ...
But if you use your GADT, you no longer need to create a special data type for the return value of the expression because the resulting value can depend on the tag in the Expr type . This gives you a new way of achieving polymorphism. Before, you could return a value of a subpart of your input type, and now, thanks to GADTs, that type doesn’t have to be uniform over all constructors, thus making the function return values from different types. Check how this is the case in the interpretation function for the (:||:) and (:+:) cases, as shown here:
interpretExpr :: Eq a => Expr a t -> [(a,Float)] -> t
interpretExpr (e1 :+: e2)  list
  = interpretExpr e1 list + interpretExpr e2 list
interpretExpr (e1 :||: e2) list
  = interpretExpr e1 list || interpretExpr e2 list
interpretExpr ... = ...

This interpretation function still needs some love. Exercise 13-2 asks you to complete the work.

EXERCISE 13-2. COMPLETE INTERPRETATION FOR EXPRESSIONS

Write the missing code of the interpretExpr function for the cases of Expr being defined by a regular ADT and by a GADT. In the case of PriceOf, you should take into account that the same product may appear more than once in the purchase list.

Even though here you’re tagging types using built-in types from the Prelude module , nothing stops you from using any other type here. Actually, it may be the case that you create new data types only for tagging other types. In that case, you don’t even need any constructors in the declaration since you will never use them. You can create empty data declarations if you enable the EmptyDataDecls extension .

I’ll show an example of how this could be useful. In the Time Machine Store, there will be many users. But not all of them will have the same role in the store; some will be administrators or store managers (who are able to change everything), some will be store people (who are allowed only to update products in the database), and some will be regular users (who are the ones making the purchases). You can tag the level of access to the store using a set of empty data types .
{-# LANGUAGE EmptyDataDecls, GADTs #-}
data AllowEverything
data AllowProducts
data AllowPurchases
data Person = Person :: { firstName :: String, lastName :: String }
data User r where
  Admin        :: Person -> User AllowEverything
  StoreManager :: Person -> User AllowEverything
  StorePerson  :: Person -> User AllowProducts
  Client       :: Person -> User AllowPurchases
Now a function that should be called only by people with access to everything can be defined to require the value tagged with the correct type.
changePurchaseFinalPrice
  :: User AllowEverything -> Purchase -> Float -> Purchase
changePurchaseFinalPrice = ...

Time traveling is a tiresome task, so users of your time machines need to eat snacks from time to time. However, people have constraints on the food they can take, such as vegetarian, no pork, low salt, and so on. Exercise 13-3 asks you to build a GADT that represents those snacks tagged with constraints.

EXERCISE 13-3. SNACKS FOR TIME TRAVELERS

Create a GADT representing a set of possible snacks for the Time Machine Store. The snacks must be tagged with a type defining whether it’s OK for vegetarians and whether it contains pork. Use empty data types as shown in this section.

Type-Level Programming

It seems that this idea of tagging types with extra information is quite successful, at least for expressions. However, the way in which you can do it is quite limited because you can use only other types and only as constants or variables that you don’t inspect.

One way to have a more powerful type system is by allowing values, in addition to types, to take part in other types. For example, you might be interested in tagging lists with their length and expressing things such as “the length of append l1 l2 is the sum of the lengths of l1 and l2.” As you can see, in this case, the tag is not another type, but rather a number (or a value in the world of Haskell). Dependent type systems open the last barrier that Haskell imposes between values and types and allow you to create types depending on values, as the list tagged with its length that was being discussed.

Haskell is not a dependently typed language, though; it imposes a clear separation between the world of terms or expressions and the world of types. In Haskell , values are only allowed to depend on other values (via regular function parameters) or on types (via parametric or ad hoc polymorphism), whereas types are allowed to depend only on other types (via type variables). On the other hand, it’s forbidden to use terms as parameters to types. When working with GADTs , you used some empty data types as tags. This approach mimics partly dependent typing but allows tagging only with constant values, not performing any operation on the types. The rest of the chapter is devoted to showing different ways in which you could describe operations that work on types. All these methods are known collectively as type-level programming techniques.

DEPENDENTLY TYPED LANGUAGES

Dependent typing is an extensive area of knowledge. It is expected that Haskell gets more and more of these features as time progresses. If you want to learn more about dependent types, you can read Type-Drive Development with Idris and Programming in Idris: A Tutorial both by Edwin Brady,1 “Dependently Typed Programming in Agda” by Ulf Norell and James Chapman,2 Certified Programming with Dependent Types by Adam Chlipala,3 or Software Foundations by Benjamin C. Pierce et al.4 (the latter two using Coq).

The programming techniques that will be presented in the rest of the chapter are usually categorized as advanced Haskell features . You don’t need to understand every detail of functional dependencies and type families to be a proficient Haskell programmer. Indeed, these features are recent additions to the Haskell language, so their use is not widespread yet.

However, type-level programming is becoming an increasingly important technique. More recent libraries, such as Persistent and Yesod, make heavy use of them. Even though you may skip some of this material in a first reading, you should come back to it in the future. It will help you understand many of the error messages and design decisions of those libraries, and it will also help you build better applications.

Two Styles of Programming

Type-level programming in Haskell generates many of its ideas from enhancements in ad hoc polymorphism. There are two different ways you can simulate parts of dependent typing in Haskell:
  • Functional dependencies (FDs) allow you to constrain the parameters of type classes. Given the correct constraints, the Haskell compiler can infer a unique type, which can be seen as the result of a type-level function.

  • Type families (TFs) let you create functions that assign a type given a set of other types. Recent versions of GHC include two kinds of type families. Closed type families are the closest to the intuitive notion of type function, and open type families are like type classes in the sense that you may add a new rule to an open type family at any point, like you may add an instance to a type class.

Both ways are equally powerful, so in principle it doesn’t matter which one you choose to encode your invariants. However, in some cases, it’s easier to use FDs, and in other cases, it’s better to use TFs. As a rule of thumb, start using TFs (because the type-level concepts they expose are closer to the simple Haskell level) and use FDs if you need more expressiveness in the relations between types.

Caution

Although they have the same power, mixing FDs and TFs in the same code can become challenging. Thus, if you depend on a library that exposes FDs or TFs, you should use the same technique to avoid further problems. The situation, however, may improve in newer versions of the GHC compiler.

Representing Natural Numbers

Since the tags that will be used to check the Presents Rule are natural numbers, you must know how they are encoded as values previous to using FDs or TFs to represent them at the type level. This section tries to give a fast-paced introduction to natural numbers . Feel free to skip it if you already know the standard data type for natural numbers and how addition, maximum, and minimum are coded using them.

The most common way to represent natural numbers as a data type is based on the axioms stated by the 19th-century mathematician Giuseppe Peano. In particular, he gave two rules for constructing numbers:
  • Zero is a natural number.

  • For every natural number n, there exists a successor of n, which is also a natural number.

You can encode those axioms in a Haskell data declaration like so:
data Number = Zero | Succ Number deriving Show
The number 1, for example, is the successor of 0; the number 2 is the successor of 1; and so on.
one :: Number
one = Succ Zero
two :: Number
two = Succ one  -- Succ (Succ Zero)
Note

You can think of this encoding of natural numbers as lists in which you don’t care about elements. Then, addition would be concatenation of lists, taking the minimum would be returning the list with the smallest number of elements, and so on.

Now let’s move to the operations. The first one you will need to use is addition. Like with most Haskell data types, the best way to design a function over Number is to handle each constructor and use recursion for nested values. In this case, you have two different constructors :
  • If you add 0 to any natural number y, the result is y.

  • If you add the successor of x to y, this is equal to the successor of the addition of x and y. Since the successor is equivalent to (+1), you can see this as encoding the algebraic law that reads (x + 1) + y = (x + y) + 1.

In Haskell syntax, the branches of the plus' function are written as follows:
plus' :: Number -> Number -> Number
plus' Zero     y = y
plus' (Succ x) y = Succ (plus' x y)
You can test whether the function works correctly by summing up 1 and 2, for example.
*Chapter13.Numbers> plus' one two
Succ (Succ (Succ Zero))

The result is the number 3 represented in this way, as you can see in the output.

For the maximum, there are also two cases to consider. First, any of the numbers can be 0, in which case you know for sure that the other number is greater than or equal to 0. The other case is when both numbers are successors , for example, x + 1 and y + 1. In this case, the maximum can be computed by recursively computing the maximum of x and y and then adding 1.
max' :: Number -> Number -> Number
max' Zero y = y
max' x Zero = x
max' (Succ x) (Succ y) = Succ (max' x y)

The minimum function has a similar skeleton. Exercise 13-4 asks you to write the full code.

EXERCISE 13-4. MINIMUM OF NATURAL NUMBERS

Write a function min' of type Number -> Number -> Number that computes the minimum value of the two natural numbers given as arguments.

Functional Dependencies

As stated in the previous section, functional dependencies represent one of the ways in which you can encode type-level operations in Haskell. However, the original intention of FDs was to enhance the type class mechanism by constraining the set of types that can be related via a multiparameter type class. In this section, I’ll start looking at this original aim and then move on to encoding the Presents Rule via FDs.

Categories of Products with FDs

Let’s diverge for a moment from the offers language and focus on a completely different problem . Until now, all the products in the Time Machine Store were represented using the same data type, Product. However, the information for describing a time machine is not the same as that needed to describe a book or a costume. Thus, it may be interesting to make Product a type class and make data types represent different categories of products.

In addition to different fields to describe them, you use different products in different ways. For example, you travel with a time machine, but you read a book. It would be interesting to specify, for each category of products, which operations are available to perform upon them. Then, you could include specific instructions on how to perform each operation and a specification of which operation should be used to test the product.

From this discussion, the task is to create a Product type class with two parameters, one defining the category of products and the other one the operations that the category supports. The following code includes as operations the price of a product and the operation functions discussed earlier:
class Product p op where
  price :: p -> Float
  perform :: p -> op -> String
  testOperation :: p -> op
Note

Remember that you need to enable the MultiParamTypeClasses extension for GHC to accept this code.

Given a simple data type for representing time machines and their operations, writing its instance declaration is straightforward. The following code shows a possible way in which you could do this:
data TimeMachine = TimeMachine { model :: String }
                 deriving Show
data TimeMachineOps = Travel Integer | Park deriving Show
instance Product TimeMachine TimeMachineOps where
  price _ = 1000.0
  perform (TimeMachine m) (Travel y)
     = "Travelling to " ++ show y ++ " with " ++ m
  perform (TimeMachine m) Park
     = "Parking time machine " ++ m
  testOperation _ = Travel 0
Of course, the main aim for creating a type class is to write a function that works on any kind of Product. For example, a function could get the total price of a list of products of the same category, and another one performs the test operation on a concrete product. The definitions are as follows:
totalAmount :: Product p op => [p] -> Float
totalAmount = foldr (+) 0.0 . map price
performTest :: Product p op => p -> String
performTest p = perform p $ testOperation p
The problem is that this code will not compile. Instead, you will get several errors similar to the following:
src/Chapter13/CategoriesFnDeps.hs:
    Could not deduce (Product p op0) arising from a use of `price'
    from the context (Product p op)
      bound by the type signature for
                 totalAmount :: Product p op => [p] -> Float
      at src/Chapter13/CategoriesFnDeps.hs:
    The type variable `op0' is ambiguous
    Possible fix: add a type signature that fixes these type variable(s)
    Note: there is a potential instance available:
      instance Product TimeMachine TimeMachineOps
        -- Defined at src/Chapter13/CategoriesFnDeps.hs:18:10

The error message tells you that there’s not enough information to infer which operation type corresponds to each product category. But the intent of the code is clear. For TimeMachine, the corresponding operation type is TimeMachineOps, and only that one. Somehow, the compiler should infer that when you’re using a TimeMachine, the operations will always belong to TimeMachineOps .

The problem is that the type class declaration, as it stands, does not declare this intention in any way. Any two types p and op could be related via Product. For example, you can add an instance that uses TimeMachine as the category, but book operations are as follows:
data Book = Book { title :: String, author :: String, rating :: Integer }
          deriving Show
data BookOps = Read | Highlight | WriteCritique deriving Show
instance Product TimeMachine BookOps where
  price _         = 500.0
  perform _ _     = "What?!"
  testOperation _ = Read  -- ??

The compiler won’t complain because the code is perfectly fine. But now you can see why the definitions of totalAmount or testOperation were ambiguous. Potentially, there is another instance where a declaration different from Product TimeMachine TimeMachineOps may be applicable.

The solution is to add a constraint to the type class declaration that exactly expresses that given a specific category of products, only one possibility is available for the set of operations. This is done via a functional dependency, a concept from database theory that describes exactly these scenarios. Functional dependencies are written in the head of the class declaration , separated from the name and type variables by the | sign, and with commas between each of them. Each functional dependency, in turn, follows the same schema, x 1 ... x n -> y 1 ... y m, expressing that for each unique substitution of the types x 1 to x n, there’s only one possible compound value of y 1 to y m. Note that you need to enable the FunctionalDependencies extension to use this syntax in your own type classes.

In this case, categories constrain the operations, so the functional dependency to add is p -> op. The refined head of the definition of the type class should be changed to the following:
class Product p op | p -> op where
Once you do this, the compiler will complain about two different Product instances given for a TimeMachine.
src/Chapter13/CategoriesFnDeps.hs:
    Functional dependencies conflict between instance declarations:
      instance Product TimeMachine TimeMachineOps
        -- Defined at src/Chapter13/CategoriesFnDeps.hs
      instance Product TimeMachine BookOps
        -- Defined at src/Chapter13/CategoriesFnDeps.hs

Now the compiler allows the definition of totalPrice and performTest because it knows that given a category for products, only one possible set of operations will be available, so it can select them.

Functional dependencies are helpful once you understand when they are needed in a type class declaration . The “Functional Dependencies in Monad Classes ” sidebar describes their use in monad classes. Then Exercise 13-5 proposes a task for helping you understand these ideas.

FUNCTIONAL DEPENDENCIES IN MONAD CLASSES

When the type classes supporting the lifting of the basic operation in each monad transformer (MonadState, MonadReader, MonadWriter, etc.) were discussed in Chapter 7, functional dependencies appeared in the class declarations but weren’t explained.

The crux of a type class such as MonadState is that it declares both the monad that performs the operations and the type of elements saved in the state because they are both needed in the signatures of some operations. The functional dependency states that given a specific monad, the type of the state values is automatically known. Think of the State Int monad, for example. From its signature, you already know that only Int can be the type of elements saved in the state.

EXERCISE 13-5. PRODUCTS AND BAGS

In the store, two kinds of bags are available: big and small. Create new data types called BigBag and SmallBag for representing each kind of bag. Then, add a new parameter to the Product type class for stating which bag you should use for each category of products. In principle, time machines should go on big bags, whereas books need only small ones. Think carefully about extra functional dependencies .

Vectors Using FDs

At first sight, it seems that functional dependencies have nothing to do with type-level operations in Haskell. However, a second look will show how you can encode type-level functions in this way.

To begin with, let’s create the representation of natural numbers at type level. Once again, the best way is to use empty data types. For each constructor in the original declaration, an empty data type is needed. In the case of numbers, two of them will be used: one for representing the zero tag and one for successors .
{-# LANGUAGE EmptyDataDecls #-}
data Zero
data Succ n
Caution

Notice that in almost every example in this section, the code will use more and more GHC extensions. These will be shown using the LANGUAGE pragma. Most of the extensions are needed only for enabling certain syntactic constructs, and the compiler will tell you to enable them if you forget, so you don’t need to worry too much about them.

With only the data types just given, you can represent lists tagged with their length. Following the usual convention in dependently typed languages, lists tagged with numbers will be called Vects.
{-# LANGUAGE GADTs #-}
data Vect n a where
  VNil  :: Vect Zero a
  VCons :: a -> Vect n a -> Vect (Succ n) a
To check that our vectors record their length correctly, let’s ask the interpreter for the type of a list having three elements :
*Chapter12.VecFnDeps> :t VCons 'a' (VCons 'b' (VCons 'c' VNil))
VCons 'a' (VCons 'b' (VCons 'c' VNil))
  :: Vect (Succ (Succ (Succ Zero))) Char
You can see that the first argument of Vect is the representation of the number 3 using the Peano encoding. An immediate use for this additional information is to create a completely type-safe head which only works with vectors of at least one element:
headVect :: Vect (Succ n) a -> a
headVect (VCons x _) = x

In fact, if you try to pattern match with VNil, the compiler rejects such declaration, because an empty vector cannot have a length of the form (Succ n).

Now it’s time to use the type class system in your favor. Each type-level operation will be encoded as a type class that will have as variables the input arguments to the type-level operation and an extra one that represents the result of the operation. For example, class Plus x y z represents “the result of the addition of x and y is z,” or in other terms x + y = z. But in order to be a function, you must explicitly say that for any pair of values x and y, there’s only one possible result z. Specifying that is a perfect job for a functional dependency. Thus, the entire type class declaration representing type-level addition is as follows:
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
class Plus x y z | x y -> z
Note

Since the Plus type class is used only for its results at type level, it’s not necessary to include any function in its body. In that case, Haskell allows you to omit the where keyword from the declaration.

The type class declaration is just describing the number of arguments to the type-level function. For expressing the rules that make the operation, you need to write different instances. Usually, these instances correspond to each of the cases in a function definition. Let’s see how they look for addition.
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}
instance Plus Zero x x
instance Plus x y z => Plus (Succ x) y (Succ z)

This is expressing the same logic for Peano addition but in a backward style of reasoning. The first instance encodes the rule of adding 0 to a number, whose result is the same as the second argument. The second rule is a bit subtler; it’s expressing that if you know that x + y = z, you can infer the addition (x + 1) + y, which will be exactly z + 1. In some sense, the declaration is reversing the way in which you write the pattern matching on the arguments and handling the recursion via a call to a smaller instance.

To understand how this encoding works, Exercise 13-6 asks you to define binary tree tagged with height, in a similar fashion to vectors.

EXERCISE 13-6. BINARY TREES TAGGED WITH HEIGHT

Create a BinaryTree data type with two constructors: Leafs, which contain only one element, and Nodes, which contain an element and two subtrees. The type must be tagged with the height of the tree. Hint: Implement the max' function for Peano numbers using a type class Max and functional dependencies.

Enforcing the Presents Rule with FDs

The next step is to use these new data types and type class inside the declaration of Offer. The first step is adding a variable to the type . We shall use this additional argument to tag the offer with the maximum number of presents inside it.
data Offer a p where
The basic constructor Present should reflect that one present is given with it. On the other hand, neither a percentage discount nor an absolute discount adds any present to the offer, so their tags should be zero. Remember that instead of plain numbers, you need to use Peano numerals.
  Present          :: a -> Offer a (Succ Zero)
  PercentDiscount  :: Float -> Offer a Zero
  AbsoluteDiscount :: Float -> Offer a Zero
The combination of several offers is straightforward: when two offers are applied at the same time, you get the presents of both.
  Both :: Plus p q r => Offer a p -> Offer a q -> Offer a r
As you can see, the code includes a context with the Plus type class . This is expressing that if p + q = r, then the union of an offer with p presents and another one with q presents is an offer with r presents. If you check in the interpreter for the type of a test present, the compiler will follow the rules for the Plus type class to get a final type. Here’s one example of its output:
*> :t let p = Present 'a' in Both p (BetterOf p (Both p p))
let p = Present 'a' in Both p (BetterOf p (Both p p))
  :: Offer Char (Succ (Succ (Succ (Succ Zero))))
The other combinator for offers is BetterOf. Since you don’t know in advance which of the branches will be taken, you cannot compute the exact amount of presents for that offer. But you can still get some bounds: you always get at most the larger amount from any of the two suboffers. Here, we use the Max type class you have been asked to define in Exercise 13-6.
BetterOf :: Max p q r => Offer a p -> Offer a q -> Offer a r
Another interesting constructor is the restriction one, which should take the minimum between the number of elements in a list and the number of presents. Let’s again follow the same steps to use functional dependencies for encoding this type-level operation . Declare the type class with an extra argument for the result and specify that the last argument functionally depends on all the rest.
class Min x y z | x y -> z
For each rule in the function definition, you must include an instance declaration. Remember that the style of programming must be backward; in other words, you must specify the recurring conditions before the full result.
instance Min Zero     y    Zero
instance Min (Succ x) Zero Zero
instance Min x y z => Min (Succ x) (Succ y) (Succ z)
The final part is to use the type class in the declaration of the constructor. Since we need to know the number of elements in the list, we change from using a simple list [a] to a list tagged with its length Vect n a. Furthermore, we restrict ourselves to nonempty lists.
{-# LANGUAGE FlexibleContexts #-}
  Restrict :: Min (Succ n) p r => Vect (Succ n) a -> Offer a p -> Offer a r

There are still some constructors left for the full offers language. Exercise 13-7 asks you to write the rest of them, encoding the type-level functions needed as functional dependencies.

EXERCISE 13-7. OFFERS WITH FUNCTIONAL DEPENDENCIES

Include the constructors for the remaining offers language: From, Until, Extend, and If. Write them following the same technique from this section. In the case of conditionals, the same considerations as BetterOf should be taken: you cannot always compute the exact value, but you can approximate it.

A LOGIC TYPE-LEVEL LANGUAGE

If you’ve ever used a logic programming language, such as Prolog or Datalog, you may find some resemblance between the way you encode type-level functions using instance declarations and the way you write predicates on these languages. This relation is indeed true: programming with functional dependencies exposes a logic programming style in a Haskell type system.

This extra paradigm brought into Haskell is the main criticism with functional dependencies. The programmer should not change programming styles when moving from term-level to type-level coding. Type families, as you will see later, have a more functional style in their declaration.

Type Families

You’ve already seen how functional dependencies can empower you to construct a stronger type system. This section will show you how to express the same kind of invariants but using the language of type families (TFs). In short, a type family is a function at the type level. It gives you a type given some other types as parameters, but in an ad hoc way and in contrast to parametric polymorphism. The type family may appear in the top level of your module or inside a type class. You will see the purpose of each of them in this section.

Vectors Using TFs

The definition of a type family is usually quite simple given a corresponding definition at the term level. In the example you’ve been dealing with, the first type-level function that you need is addition, so I will show you that one first. The same data types that were used for encoding natural numbers earlier with FDs will be used with type families.
{-# LANGUAGE EmptyDataDecls #-}
data Zero
data Succ n
The Vect data type will also be reused.
{-# LANGUAGE GADTs #-}
data Vect n a where
  VNil  :: Vect Zero a
  VCons :: a -> Vect n a -> Vect (Succ n) a
A TF starts with the type family keywords, followed by the declaration of its name and arguments and the where keyword. This first line defines the signature of the type-level function. After the signature, the type family is defined using pattern matching on its arguments. There are two small syntactic differences between TFs and regular functions. Whereas term-level functions must start with a lowercase letter, type family names must start with an uppercase one. In this case, the code will refer to addition at the type level with the name Plus. The other difference is that all the rules for a TF must be indented, in contrast to regular functions where rules appeared at the same indentation level of signatures. This is the definition of addition using TFs:
{-# LANGUAGE TypeFamilies #-}
type family Plus x y where
  Plus Zero     x = x
  Plus (Succ x) y = Succ (Plus x y)

As you can see, the type instances completely mimic the definition of the (+) function on regular, term-level, natural numbers. This similarity makes it easier to port code from term level to type level if you use type families.

One function which can be defined using Plus is the append of two vectors. If you have a vector of length n and another of length m, the resulting vector has length n+m. The compiler is able to check that fact in the following code:
append :: Vect x a -> Vect y a -> Vect (Plus x y) a
append VNil         ys = ys
append (VCons x xs) ys = VCons x (append xs ys)
Caution

The previous code uses closed TFs, an extension available only since version 7.8.1 of the GHC compiler. You can always use open TFs for maximum backward compatibility, which are described in the next section.

Enforcing the Presents Rule with TFs

The next step is to use Plus inside the definition of the Offer data type . As in the case of FDs, the basic constructors use the representation of the numbers 0 and 1 using the Peano data type.
data Offer a p where
  Present          :: a -> Offer a (Succ Zero)
  PercentDiscount  :: Float -> Offer a Zero
  AbsoluteDiscount :: Float -> Offer a Zero
The place where the Plus TF is expected to be used is in the Both constructor. When using TFs, you don’t need to use the type context part of the signature; you can just use an applied TF in the place where a type is expected. For example, in this constructor, Plus p q is used in lieu of a type:
  Both :: Offer a p -> Offer a q -> Offer a (Plus p q)
To reinforce the steps you need to follow to use a TF for encoding a type-level function , let’s use the same process for restriction. Looking at the previous sections, you can see that in that case you need to define a type-level minimum function. The following code does so with a TF Min:
type family Min x y where
  Min Zero     y        = Zero
  Min x        Zero     = Zero
  Min (Succ x) (Succ y) = Succ (Min x y)
The corresponding Restrict constructor is easily updated from FDs to TFs:
  Restrict :: Vect (Succ n) a -> Offer a p -> Offer a (Min (Succ n) p)

As in the previous case, it’s your task (in Exercise 13-8) to write the rest of the cases of the data type.

EXERCISE 13-8. OFFERS WITH TYPE FAMILIES

Include the constructors for the rest of the offers language: From, Until, Extend, BetterOf, and If. At this point, you’ll know that you’ll need a type-level function that encodes the maximum of two natural numbers in the last two cases. Define it using type families.

Categories of Products with TFs

The introduction to type-level programming in Haskell stated that functional dependencies and type families had the same power of expressiveness. You’ve already seen how both can encode type-level functions, but there’s still the question of how to use type families to solve the problem with categories of products.

One possibility is to define a TF that assigns the type of operations to each type of product. Following the instructions from the previous section, an Operation function would resemble the following:
type family Operation x where
  Operation TimeMachine = TimeMachineOps
  Operation Book        = BookOps

This is not a satisfactory solution, though. When you used a type class along with an FD, you had the option of adding a new product along with its sets of operations at any moment, via a new instance declaration. However, if you follow the approach I’ve just shown, with the previous definition of a TF, you need to add both an instance to the Product type class and a new equation to the Operation type family. This makes the type class and the type family tightly coupled and thus less maintainable.

The problem in this case does not lie in the use of type families but rather in the fact that the type family is defined as a closed one. GHC supports two kinds of type families. The closed ones, which I’ve already introduced, cannot be enlarged with more rules after its definition. In contrast, open type families define a partial function that can be refined or enlarged in other parts of the code.

To define a type family as open, drop the final where keyword from its declaration.
type family Operation x
Each time you want to add a new rule to the type family, you have to include it after the type instance keywords . For example, the previous two relations between products and operations would read as follows:
type instance Operation TimeMachine = TimeMachineOps
type instance Operation Book        = BookOps

Notice that in this case the instance declarations appear at the same level of the signature, since they may be defined in completely different modules.

TYPE INSTANCES MUST NOT OVERLAP

There’s an important difference between closed and open type families. Closed type families follow the usual evaluation model. The first rule is tried; if it doesn’t match, the second rule is tried, and so on, until a match is ultimately found. This can be done because rules from a closed type family have a defined order. However, rules from an open type family come from different, unrelated places and have no order. If the compiler found that two patterns match, it wouldn’t be able to know which choice to take. Thus, GHC forces type instance declarations to not overlap.

Take as an example the definition of the Min TF shown earlier. If the compiler had to compute the result of Min Zero Zero, both the first and second rules would match. It knows that it should use the first one because the TF is closed. But if you were to define Min as an open TF (e.g., because your version of GHC is earlier than 7.8.1), you would need to refine the declaration to make type instances not overlap. You could achieve it by making the second rule fire only when the first argument is larger than zero.
  type instance Min Zero     y        = Zero
  type instance Min (Succ x) Zero     = Zero
  type instance Min (Succ x) (Succ y) = Succ (Min x y)

Now each application of the open TF Min has one and only one rule to apply.

In many of the cases where functional dependencies are used, you want to create a type-level function but also enforce each implementer of the type class to add a new rule to that type-level function. Or from another point of view, you want to add a type-level function inside the type class. When you enable the TypeFamilies extension, type class declarations are allowed to contain both term-level and type-level function signatures, as desired. The type-level signatures inside a type class are known as associated types.

Note

Remember that types have a simple kind system (the only possible kinds are * and function-like kinds such as * -> *) that checks whether the application of type constructors is correct. If you don’t remember all the details, you can refer to Part 1 of the book, where I introduced the kind system.

Let’s see how you would rework5 the Product type class to use associated types instead of functional dependencies. If you remember the original definition, there was a p variable representing the category of products and an op variable for the operations for that category. The latter was functionally dependent on the former, so it’s a perfect candidate for being changed into an associated type. Thus, the new code drops the op parameter and adds a type-level function called Operation , as follows:
class Product p where
  type Operation p :: *
  price :: p -> Float
  perform :: p -> Operation p -> String
  testOperation :: p -> Operation p
As you can see, any appearance of op in the old declaration is now replaced with Operation p, which gives back the type corresponding to the operations. An implementation of Operation must also appear in every instantiation of the Product type class, like the following one for TimeMachine:
instance Product TimeMachine where
  type Operation TimeMachine = TimeMachineOps
  price _ = 1000.0
  perform (TimeMachine m) (Travel y)
    = "Travelling to " ++ show y ++ " with " ++ m
  perform (TimeMachine m) Park
    = "Parking time machine " ++ m
  testOperation _ = Travel 0

Like in the previous section, the type of bag that each category of products needs should be encoded at the type level. Exercise 13-9 asks you to do so, now using associated types.

EXERCISE 13-9. PRODUCTS AND BAGS, REDUX

Using the previously defined data types BigBag and SmallBag, represent the kind of bag each category of product needs. To do this, add a new associated type to the Product type class. Finally, include the instances that express that time machines should go on big bags, whereas books need only small ones.

For the simple scenario of one of the types being completely dependent on the rest of the variables in the type class (like operations and bags in this section), associated types usually make more explicit that only one possibility can be chosen in each instance. However, functional dependencies shouldn’t be overlooked, because they allow richer expression of dependence.

There’s one last addition to the type system brought by type families. Consider the following function, performTestFromOther , which executes the test operation of a product on a completely different product:
performTestFromOther p q = perform p $ testOperation q
The task now is to give a type signature that is as abstract as possible to allow the function to be used in the largest variety of situations. A first approximation is to require p and q to be of the same type, which should implement the Product type class . Indeed, if you try to add the following type signature to the function, the compiler will accept the definition:
performTestFromOther :: Product p => p -> p -> String
But this signature is overly restrictive. You don’t need both arguments to have the same type. The only thing you need is for them to support the same operations, that is, for their Operation associated type to coincide. This kind of requisite can be expressed in Haskell using an equality constraint x ~ y, which states that x and y must be syntactically equal after all type families have been resolved. The most general constraint for this case is then as follows:
performTestFromOther :: (Product p, Product q, Operation p ~ Operation q)
                     => p -> q -> String
Actually, if you try to compile the code without giving an explicit type signature, GHC will give you this more general type as a hint. You could directly copy that signature in your code, removing the initial forall part that is implicit for every free variable in the signature . Here’s an example:
Warning: Top-level binding with no type signature:
      performTestFromOther
        :: forall p p1.
           (Product p, Product p1, Operation p ~ Operation p1) =>
           p -> p1 -> String

Equality constraints are not often seen in handwritten code, but they are essential to understanding GHC error messages. In most of the instances where the type families do not coincide, the compiler will warn you about an equality constraint (one with ~) not being respected.

DATA FAMILIES

Being completely correct, the previous sections didn’t introduce type families but a subclass of them called type synonym families. When using this subclass, you need to use it in both parameters and return expression types that were already defined elsewhere.

There is another kind of TF, called a data family, where you directly list the constructors for each instance of the family instead of using another type. For example, the Product type class and its TimeMachine instance could also be defined as follows:
class Product p where
  data Operation p
  price :: p -> Float
  perform :: p -> Operation2 p -> String
  testOperation :: p -> Operation2 p
instance Product TimeMachine where
  data Operation TimeMachine = Travel Integer | Park
  price _ = 1000.0
  perform (TimeMachine m) (Travel y) = "Travelling to " ++ show y ++ " with " ++ m
  perform (TimeMachine m) Park       = "Parking time machine " ++ m
  testOperation _ = Travel 0

Type synonym and data families are not interchangeable, though, because they have different properties (e.g., data families can be partially applied, while type synonym families can’t).

Many of the techniques in this chapter are applied in libraries in Hackage to make stronger guarantees about the code that is executed. You’ve already used associated types, although you didn’t know it back then, when writing the schema description in Persistent.6 Another interesting package that uses type-level programming is HList, which allows you to create lists whose elements have different types.

Data Type Promotion and Singletons

From version 7.4.1 on, GHC includes an extension called data type promotion. When enabled, for each data type at the term level, a new set of types is created at the type level (the type is promoted). Furthermore, a new kind is associated to each of these promoted data types, leading to safer type-level programming . In this section, you’ll see how to take advantage of this feature and how to promote functions in addition to data types using the singletons package.

A Further Refinement to the Presents Rule

Let’s start from scratch with the implementation of the Presents Rule (if you’re writing the code as you read, start a new empty module). But now instead of creating empty data types for representing 0 and successors, the code will use a regular data declaration and turn on the DataKinds extension – which enables data type promotion. The code is similar to the following:
{-# LANGUAGE DataKinds #-}
data Nat = Zero | Succ Nat
If on the same file you include the declaration of lists tagged with their length from the section on functional dependencies, the code will compile just fine. As a reminder, here’s the definition of the Vect data type:
{-# LANGUAGE GADTs #-}
data Vect n a where
  VNil  :: Vect Zero a
  VCons :: a -> Vect n a -> Vect (Succ n) a
It seems that you’ve used the data type from the term level inside a type, as you would do in a dependently typed language! However, this would defy the strict separation between types and terms in the Haskell language. The truth is that when the DataKinds extension is enabled, the compiler creates a copy of the data type in the type level. Conceptually and for now, you can think of the source file with DataKinds as being equivalent to the following:
data Nat = Zero | Succ Nat
data Zero
data Succ nat
The compiler can distinguish between constructors and type names because they live in separate worlds. In the rare event in which the compiler could not make that distinction, you can use the syntax 'Identifier to refer explicitly to the Identifier at the type level. For example, you may write the definition of Vect as more explicit about Zero and Succ being promoted types.
data Vect n a where
  VNil  :: Vect 'Zero a
  VCons :: a -> Vect n a -> Vect ('Succ n) a
The next step for porting the code to this new file is to declare a type family Plus , which will encode addition. You can copy the code from the previous section (including the TypeFamilies pragma). If you now ask for the kind of Plus, you get
*> :kind Plus
Plus :: Nat -> Nat -> Nat
The compiler has inferred the right kinds.7 Following the same idea of being explicit as adding type signatures to every definition, we can add kind annotations to our Plus type family to indicate the kind of arguments and the result kind:
type family Plus (x :: Nat) (y :: Nat) :: Nat where
Note

Remember that kinds are used to categorize types, like types do for values. In Haskell, without data type promotion, all the fully applied data types have kind *. However, this extension opens the door to user-defined kinds, as the example shows.

The next step is to use this new approach to your advantage. For example, right now I haven’t explicitly forbidden writing a type such as Vect Int Char . But this makes no sense. The only possible values for the first variable in Vect should be those from the kind Nat (which are the one representing numbers). You can write it explicitly in the type declaration.
data Vect (n :: Nat) a where
Note

In this case, the annotation is not needed because the compiler is able to infer that the kind of n must be Nat. But it’s still interesting to make it explicit, at least for documentation purposes and to ensure that a change in your code doesn’t make the compiler infer a different type.

This same idea should be applied to Offer, which has a type parameter that should take values only from type-level natural numbers. The corresponding refinement of this data type should be declared as follows:
data Offer a (p :: Nat) where

As you can see, the DataKinds extension brings to type-level programming most of the safety that types give to the Haskell programs at the term level. Furthermore, the automatic promotion makes it easier to declare the types that will be used for programming at the type level. However, the kind system is not as powerful as the type system (e.g., you don’t have anything like kind classes), but for most of the type-level programming, this extension should be more than enough.

Cooking with Singletons

The DataKinds extension is really useful , but it doesn’t give you the full package. Your data types can be promoted to the type level seamlessly, but you still need to define your type-level functions using either FDs or TFs. If you are using some functionality that was already available at the term level, this means that you need to duplicate code and to do so in different styles of programming, hurting the readability and maintainability of the code.

The singletons package provides a pragmatic solution to this problem. Using the metaprogramming facilities of Template Haskell, it creates type-level versions of the term-level functions you ask for. Although the two worlds are still separated, this library creates the illusion that the same constructors and functions are used in both levels seamlessly.

To start using the package, you must add it as a dependency of your project and import the Data.Singletons.TH module in your source file. The expansion of Template Haskell blocks user type families and data type promotion, so in addition to the metaprogramming extensions, you need to enable them in your source file, or the code will refuse to compile.

The module provides a lot of functionality, but the most interesting one for your needs here is promote. Using it, you can create type-level versions of both data types and functions. The functions will be encoded at the type level using TFs, with a name resulting in changing the first letter of the function name into uppercase. For example, here’s how you could promote natural number operations that have been guiding you in the chapter:
{-# LANGUAGE GADTs, DataKinds, TypeFamilies, UndecidableInstances #-}
{-# LANGUAGE TemplateHaskell #-}
import Data.Singletons.TH hiding (Min)
$(promote [d|
  data Nat = Zero | Succ Nat
           deriving (Show, Eq)
  plus :: Nat -> Nat -> Nat
  plus Zero     y = y
  plus (Succ x) y = Succ (plus x y)
  min :: Nat -> Nat -> Nat
  min Zero     _        = Zero
  min _        Zero     = Zero
  min (Succ x) (Succ y) = Succ (min x y)
  |])
Caution

The code inside the block starting from [d| to |] must be indented to work correctly. Double-check this fact when working with singletons.

THE SINGLETONS PRELUDE

In this section, we are defining our own version of promoted natural numbers and several operations over them. This is not needed, though, since the singletons package contains a quite complete copy of Haskell’s Prelude in the type level. You can find it in the Data.Singletons.Prelude module.

Enforcing the Duration Rule

Since now you’re able to reuse most of your regular Haskell knowledge in the type level via promotion , you may think about encoding a more complicated invariant: the Duration Rule. To do so, let’s create a data type for ranges of time. Three cases must be handled: a range that is open only at the end, which means that the offer is applicable from a specific point in time (e.g., an infinite range would be an open range starting at 0); a closed range with start and end points; and an empty range, which is the one to be forbidden. As a further example, let’s define a type-level Infinite range using promotion. This code uses the promoted Nat functionality from the previous section:
$(promote [d|
  data Range = Empty | Open Nat | Closed Nat Nat
  infinite :: Range
  infinite = Open Zero
  |])
Now let’s create the TFs that will be ultimately called when applying the From and Until constructors of the Offer data type. For the first case, the function will be restrictFrom and should take as arguments the range of days that the offer is available for before the restriction and the new point in time for the initial day of the offer. The code is a bit long but should be straightforward to understand.
$(promote [d|
  data Comparison = Less' | Equal' | Greater'
  compare' :: Nat -> Nat -> Comparison
  compare' Zero     Zero     = Equal'
  compare' Zero     (Succ _) = Less'
  compare' (Succ _) Zero     = Greater'
  compare' (Succ x) (Succ y) = compare' x y
  restrictFrom :: Nat -> Range -> Range
  restrictFrom _ Empty = Empty
  restrictFrom n (Open f)
    = restrictFrom1 n f (compare' n f)
  restrictFrom n (Closed f t)
    = restrictFrom2 n f t (compare' n f) (compare' n t)
  restrictFrom1 :: Nat -> Nat -> Comparison -> Range
  restrictFrom1 n _ Greater' = Open n
  restrictFrom1 _ f Equal'   = Open f
  restrictFrom1 _ f Less'    = Open f
  restrictFrom2 :: Nat -> Nat -> Nat -> Comparison -> Comparison -> Range
  restrictFrom2 _ _ _ Greater' Greater' = Empty
  restrictFrom2 _ _ _ Greater' Equal'   = Empty
  restrictFrom2 n _ t Greater' Less'    = Closed n t
  restrictFrom2 _ f t Equal'   _        = Closed f t
  restrictFrom2 _ f t Less'    _        = Closed f t
  |])
Caution

I am adding ticks to the end of some names to disambiguate them from the ones in the singletons’ Prelude.

Almost any use of singletons leads to the UndecidableInstances extension . This is because the compiler cannot prove that the TFs you’ve created will always terminate their execution when you have nested TFs. This nesting appears in the promoted TFs because of the call to compare inside restrictFrom. However, if you know that the function you wrote will terminate, it’s safe to unveil that restriction and tell the compiler to accept that function without further termination checks.

The first step for tagging your offers with duration information is to enlarge the initial type with a new type variable of kind Range, which was promoted previously. To keep the examples concise, I will use the original Offer data type, instead of the one tagged with a number of presents. The declaration of the GADT reads as follows:
data Offer a (r :: Range) where
The basic combinators for offers (presents and discounts) have an infinite duration by default. To write them down, you can use the Infinite TF that was created in the previous promotion.
  Present          :: a -> Offer a Infinite
  PercentDiscount  :: Float -> Offer a Infinite
  AbsoluteDiscount :: Float -> Offer a Infinite
As a first approximation, you could write the time restriction From as follows:
  From :: (n :: Nat) -> Offer a d -> Offer a (RestrictFrom n d)
However, this will make the compiler quite unhappy and will make it show the following type error :
src/Chapter13/CheckDurationPromotion.hs:
    Expected a type, but `(n :: Nat)' has kind `Nat'
    In the type `(n :: Nat)'
    In the definition of data constructor `From'
    In the data declaration for `Offer'

The problem is that you cannot use promoted kinds in a constructor; only types of kind * are allowed. In particular, using Nat is forbidden.

Fortunately, there’s a construction that helps you overcome this difficulty. The idea is to create a regular data type that carries as a tag the type-level number you need. In that way, you have a value that you can use at runtime, and the tag gives the type-level information. This data type will have only one possible value for each possible tag. For that reason, they are called singleton data types. For example, a singleton type SNat corresponding to the Nat promoted kind would read as follows:
data SNat (n :: Nat) where
  SZero :: SNat Zero
  SSucc :: SNat n -> SNat (Succ n)
By the way in which this type is constructed, given a type of kind Nat, only one value of SNat is possible. For example, the only inhabitant of SNat (Succ (Succ Zero)) is SSucc (SSucc SZero). So, you’ve reflected the type corresponding to 2 as a runtime value, as desired. If you look at the output from GHCi, you can spot the explicit reference to promoted constructors, which are shown with the ' sign in front of them.
*Chapter13.CheckDurationPromotion> :t SSucc (SSucc SZero)
SSucc (SSucc SZero)
  :: SNat ('Succ ('Succ 'Zero))
The next step is to use this singleton type in the constructor From that uses as runtime arguments only values of kind *. Notice how you have access to the type-level number n from the argument to SNat.
  From :: SNat n -> Offer a d -> Offer a (RestrictFrom n d)
Another easy construction involves moving from the SNat singleton data type back to the initial Nat data type before its promotion. Once the conversion is done, you have a runtime value that you may use in regular functions. The following piece of code defines that conversion and uses it to print an offer restriction:
toNat :: SNat n -> Nat
toNat SZero     = Zero
toNat (SSucc n) = Succ (toNat n)
printDateRestriction :: Offer a r -> String
printDateRestriction (From n _)  = "From " ++ show (toNat n)
printDateRestriction (Until n _) = "Until" ++ show (toNat n)
printDateRestriction _           = "No date restriction"
Once again, the creation of a singleton type given a data type that is promoted is just boilerplate. The singletons library includes another function of singletons that supersedes promote and that generates singleton types along with promoting data types and functions. For example, if you want to generate SNat automatically, you can include in your source file the following:
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE EmptyCase #-}
{-# LANGUAGE InstanceSigs #-}
$(singletons [d|
  data Nat = Zero | Succ Nat
           deriving (Show, Eq)
  |])
In addition, the singletons library includes a SingI type class with only one function, sing. The purpose of this function is to create the unique value of a singleton type given its corresponding promoted data type. In this way, you don’t have to write the constructors as you did before with SNat. For example, the inhabitants corresponding to the first four natural numbers at the type level can be written like so:
zero :: SNat Zero
zero = sing   -- results in SZero
one :: SNat (Succ Zero)
one = sing    -- results in SSucc SZero
two :: SNat (Succ (Succ Zero))
two = sing    -- results in SSucc (SSucc SZero)
three :: SNat (Succ (Succ (Succ Zero)))
three = sing  -- results in SSucc (SSucc (SSucc SZero))
The singletons package also provides a type class for converting from a singleton type to the regular data type. It’s called SingE and has a single function fromSing. Instead of the homemade toNat function, you could use that function to turn the singleton three into a term-level Nat.
*Chapter13.CheckDurationPromotion> fromSing three
Succ (Succ (Succ Zero))

There are some constructors left for the full Offer data type. Exercise 13-10 asks you to finish the job.

EXERCISE 13-10. OFFERS WITH SINGLETONS
Add the rest of constructors to the Offer GADT: Restrict should keep its duration; Until should change the duration in a similar way to From; Both, BetterOf, and If must compute the smallest duration range that includes those of both arguments (i.e., the intersection). At this point, you can use Offer to build a complete offer and compute its range.
*> let one = SSucc SZero                    -- build the singleton 1
*> let three = SSucc (SSucc (SSucc SZero))  -- build the singleton 3
*> :t let p = Present 'a' in Both (From one p) (BetterOf p (Until three p))
let p = Present 'a' in Both (From one p) (BetterOf p (Until three p))
  :: Offer Char ('Closed ('Succ 'Zero) ('Succ ('Succ ('Succ 'Zero))))
TYPE-LEVEL LITERALS

GHC provides extra features for tagging types with either natural numbers or strings. If you enable the TypeOperators extension and import the GHC.TypeLits module, you can use numbers and string literals at those places where you need a type of kind Nat (for natural numbers) or Symbol (for strings).

Using these literals, the Vect data type could have been declared as follows:
data Vect n a where
  VNil  :: Vect 0 a
  VCons :: a -> Vect n a -> Vect (n + 1) a

Note that this module provides only a small set of operations on natural numbers, namely, addition, product, and exponentiation, as well as a type class that encodes whether two type-level numbers are related as being less or equal.

Singleton types put an end to the bird’s-eye view on type-level programming in Haskell. The relation between term-level data types and functions, their corresponding promoted types and kinds, and the singleton types is subtle, but each one serves a purpose:
  • data and function declarations express how to create values that Haskell can use to compute at runtime.

  • Promoted data types and kinds and type-level functions expressed as either FDs or TFs are evaluated at compile time and allow you to tag values with stronger types that introduce extra invariants.

  • Singleton types are the bridge between the worlds. When you need a type-level value that should also be reflected at runtime, you should use them.

Exercise 13-11 provides an exercise on a different domain to help you better understand these relations.

EXERCISE 13-11. RECTANGLES AND BOUNDING BOXES
For this exercise, you will use the following data type, which represents images built from rectangles. The Rect constructor represents a single rectangle, and then you can combine images (with Union), take just the common part (with Intersection), or put together several copies in a row (using Replicate).
data Rectangle = R { topLeft :: (Nat, Nat), bottomRight :: (Nat, Nat) }
data Image = Rect Rectangle
           | Union Image Image
           | Intersection Image Image
           | Replicate Nat Image

You must tag images with their bounding box, that is, the smallest rectangle that contains the whole of the image. For example, if you have the union of the rectangles from (1,0) to (5,4) and from (0,1) to (3,2), the bounding box is (0,0) to (5,4). You can draw them on paper to convince yourself about that.

The type-level calculations should be developed using the techniques from the singletons package. Think carefully about which places need singleton types.

Summary

In this chapter, you explored many of the advanced features of the Haskell type system while designing a domain-specific language for expressing offers for the Time Machine Store.
  • This chapter introduced the concepts of external/stand-alone and internal/embedded domain-specific languages and the difference between deep and shallow embedding.

  • Generalized algebraic data types allow constructors of a data type to build values with different types; you used that extra functionality to create type-safe expressions.

  • You were introduced to the idea of tagging a type with some extra information, which allows you to check stronger invariants at compile time.

  • You explored several possibilities for doing type-level programming in Haskell, including functional dependencies, type families, and data type promotion. The main characteristic of Haskell in this aspect is the separation between the term and type worlds.

  • Functional dependencies and associated types refine the type class mechanism in Haskell and were covered in this chapter.

  • Data type promotion and the singletons library make it possible for you to move declarations from the term level to the type level. Furthermore, it extends the kind system to provide safer type-level programming in Haskell.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.133.13.76