CHAPTER 5

images

Understanding Types in Functional Programming

F# is a typed language, and F# programmers often use types in sophisticated ways. In this chapter, you learn about the foundations of types, focusing on how types are defined and used in F# functional programming. You also look closely at generics, and closely related to generics is the notion of subtyping. Generics and subtyping combine to allow you to write code that is generic over families of types. You will see how F# uses automatic generalization to automatically infer generic types for your code, and the chapter covers some of the basic generic functions in the F# libraries, such as generic comparison, hashing, and binary serialization.

Exploring Some Simple Type Definitions

Given that F# is a typed language, you will often need to declare new shapes of types via type definitions and type abbreviations. This chapter covers only some of the simpler type definitions that are useful and succinct workhorses for functional programming. F# also lets you define a range of sophisticated type definitions related to object-oriented programming, discussed in Chapter 6. Often, however, these aren’t required in basic functional programming.

Defining Type Abbreviations

Type abbreviations are the simplest type definitions:

type index = int
type flags = int64
type results = string * System.TimeSpan * int * int

It’s common to use lowercase names for type abbreviations, but it’s certainly not compulsory. Type abbreviations aren’t concrete, because they simply alias an existing type. For example, when doing .NET programming with F#, they’re expanded during the process of compiling F# code to the format shared among multiple .NET languages. Because of this, a number of restrictions apply to type abbreviations. For example, you can’t augment them with additional members, as can be done for concrete types, such as records, discriminated unions, and classes. In addition, you can’t truly hide a type abbreviation using a signature (see Chapter 7).

Defining Record Types

The simplest concrete type definitions are records. Here’s an example:

type Person =
    {Name : string
     DateOfBirth : System.DateTime}

You can construct record values by using the record labels:


> {Name = "Bill"; DateOfBirth = new System.DateTime(1962, 09, 02)};;

val it : Person = {Name="Bill"; DateOfBirth = 09/02/1962 ...}

Here, semicolons are used to separate each field, which are optional in the longer, one-field-per-line form above. Should there be a conflict between labels among multiple records, you can also construct record values by using an explicit type annotation:


> ({Name = "Anna"; DateOfBirth = new System.DateTime(1968, 07, 23)} : Person);;

val it : Person = {Name="Anna"; DateOfBirth = 23/07/1968 ... }

Other times, referencing the containing type for any given record label using the type.field syntax helps to avoid conflicts. Record values are often used to return results from functions:

type PageStats =
    {Site : string;
     Time : System.TimeSpan;
     Length : int;
     NumWords : int;
     NumHRefs : int}

This technique works well when returning a large number of heterogeneous results:

//Using the time, http and getWords functions from Chapter 3.
let stats site =
    let url = "http://" + site
    let html, t = time (fun () -> http url)
    let words = html |> getWords
    let hrefs = words |> Array.filter (fun s -> s = "href")
    {Site = site; Time = t; Length = html.Length;
     NumWords = words.Length; NumHRefs = hrefs.Length}

Here is the type of stats:


val stats : site:string -> PageStats

Here is how F# Interactive shows the results of applying the function:


> stats "www.live.com";;

val it : PageStats = {Site = "www.live.com";
                      Time = 00:00:03.0941770 {Days = 0; ...};
                      Length = 12139;
                      NumWords = 892;
                      NumHRefs = 11;}

Handling Non-Unique Record Field Names

Record labels need not be unique among multiple record types. Here is an example:

type Person =
    {Name : string
     DateOfBirth : System.DateTime}
type Company =
    {Name : string
     Address : string}

When record names are non-unique, constructions of record values may need to use object expressions in order to indicate the name of the record type, thus disambiguating the construction. For example, consider the following type definitions:

type Dot = {X : int; Y : int}
type Point = {X : float; Y : float}

On lookup, record labels are accessed using the dot (.) notation in the same way as properties. One slight difference is that in the absence of further qualifying information, the type of the object being accessed is inferred from the record label. This is based on the latest set of record labels in scope from record definitions and uses of open. For example, given the previous definitions, you have the following:


> let coords1 (p:Point) = (p.X, p.Y);;

val coords1 : p:Point -> float * float

> let coords2 (d:Dot) = (d.X, d.Y);;

val coords2 : d:Dot -> int * int

> let dist p = sqrt (p.X * p.X + p.Y * p.Y);; // use of X and Y implies type "Point"
val dist : p:Point -> float

The accesses to the labels X and Y in the first two definitions have been resolved using the type information provided by the type annotations. The accesses in the third definition have been resolved using the default interpretation (taking the last definition that fits) of record field labels in the absence of any other qualifying information.

Cloning Records

Records support a convenient syntax to clone all the values in the record, creating a new value with some values replaced. Here is a simple example:

type Point3D = {X : float; Y : float; Z : float}

let p1 = {X = 3.0; Y = 4.0; Z = 5.0}
let p2 = {p1 with Y = 0.0; Z = 0.0}val p1 : Point3D = {X = 3.0; Y = 4.0; Z = 5.0;}

val p2 : Point3D = {X = 3.0; Y = 0.0; Z = 0.0;}

The definition of p2 is identical to this:

let p2 = {X = 3.0; Y = 0.0; Z = 0.0}

This expression form doesn’t mutate the values of a record, even if the fields of the original record are mutable.

Defining Discriminated Unions

The second kind of concrete type definition discussed in this section is a discriminated union. Here is a very simple example:

type Route = int
type Make = string
type Model = string
type Transport =
    | Car of Make * Model
    | Bicycle
    | Bus of Route

Each alternative of a discriminated union is called a discriminator. You can build values by using the discriminator much as if it were a function:


> let ian = Car("BMW","360");;

val ian : Transport = Car ("BMW","360")

> let don = [Bicycle; Bus 8];;

val don : Transport list = [Bicycle; Bus 8]

> let peter = [Car ("Ford","Fiesta"); Bicycle];;

val peter : Transport list = [Car ("Ford","Fiesta"); Bicycle];;

You can also use discriminators in pattern matching:

let averageSpeed (tr : Transport) =
    match tr with
    | Car _ -> 35
    | Bicycle -> 16
    | Bus _ -> 24

Discriminated unions can include recursive references (the same is true of records and other type definitions). This is frequently used when representing structured languages via discriminated unions, a topic covered in depth in Chapter 9:

type Proposition =
    | True
    | And of Proposition * Proposition
    | Or of Proposition * Proposition
    | Not of Proposition

Recursive functions can be used to traverse such a type. For example:

let rec eval (p : Proposition) =
    match p with
    | True -> true
    | And(p1,p2) -> eval p1 && eval p2
    | Or (p1,p2) -> eval p1 || eval p2
    | Not(p1) -> not (eval p1)

Several of the types you’ve already met are defined as discriminated unions. For example, the 'T option type and the F# type of immutable lists are effectively defined as:

type 'T option =
    | None
    | Some of 'T

type 'T list =
    | ([])
    | (::) of 'T * 'T list

A broad range of tree-like data structures are conveniently represented as discriminated unions. For example:

type Tree<'T> =
    | Tree of 'T * Tree<'T> * Tree<'T>
    | Tip of 'T

You can use recursive functions to compute properties of trees:

let rec sizeOfTree tree =
    match tree with
    | Tree(_, l, r) -> 1 + sizeOfTree l + sizeOfTree r
    | Tip _ -> 1

Here is the inferred type of sizeOfTree:


val sizeOfTree : tree:Tree<'a> -> int

Here is an example of a constructed tree term and the use of the size function:


> let smallTree = Tree ("1", Tree ("2", Tip "a", Tip "b"), Tip "c");;

val smallTree : Tree<string> = Tree ("1",Tree ("2",Tip "a",Tip "b"),Tip "c")

> sizeOfTree smallTree;;

val it : int = 5

Chapters 9 and 12 discuss symbolic manipulation based on trees.

images Note  Discriminated unions are a powerful and important construct and are useful when modeling a finite, sealed set of choices. This makes them a perfect fit for many constructs that arise in applications and symbolic analysis libraries. They are, by design, nonextensible: subsequent modules can’t add new cases to a discriminated union. This is deliberate: you get strong and useful guarantees by placing a limit on the range of possible values for a type.

Using Discriminated Unions as Records

Discriminated union types with only one data tag are an effective way to implement record-like types:

type Point3D = Vector3D of float * float * float

let origin = Vector3D(0., 0., 0.)
let unitX = Vector3D(1., 0., 0.)
let unitY = Vector3D(0., 1., 0.)
let unitZ = Vector3D(0., 0., 1.)

These are particularly effective because they can be decomposed using succinct patterns in the same way as tuple arguments:

let length (Vector3D(dx, dy, dz)) = sqrt(dx * dx + dy * dy + dz * dz)

This technique is most useful for record-like values where there is some natural order on the constituent elements of the value (as shown earlier) or where the elements have different types.

It might also be helpful to be aware that when pattern matching against single-case discriminated union values in a match construct, the use of the pipe | operator is optional. This is also the case above, in the type definition.

Defining Multiple Types Simultaneously

Multiple types can be declared together to give a mutually recursive collection of types, including record types, discriminated unions, and abbreviations. The type definitions must be separated by the keyword and:

type Node =
    {Name : string;
     Links : Link list}and Link =
    | Dangling
    | Link of Node

Understanding Generics

F# constructs such as lists, tuples, and function values are all generic, which means they can be instantiated with multiple different types. For example, int list, string list, and (int * int) list are all instantiations of the generic family of F# list types. Likewise, int -> int and string -> int are both instantiations of the generic family of F# function types. The F# library and the .NET Framework have many other generic types and operations in addition to these.

Generic constructs are always represented through the use of type variables, which in F# syntax are written 'T, 'U, 'a, 'Key, and so on. For example, the definition of the Set type in the F# library begins like this:

type Set<'T> = ...

Type abbreviations can also be generic, e.g.

type StringMap<'T> = Map<string, 'T>

type Projections<'T, 'U> = ('T -> 'U) * ('U -> 'T)

Values can also be generic. A typical generic value is List.map, whose type is as follows:


val map : ('T -> 'U) -> 'T list -> 'U list

Each time you name a generic type or value, the F# type system must infer instantiations for the type variables involved. For example, in Chapter 3, you used List.map fetch over an input list, where fetch had the following type:


val fetch : url:string -> string * string

In this case, the type variable 'T in the signature of List.map is instantiated to string, and the type variable 'U is instantiated to string * string, giving a return type of (string * string) list.

Generic values and functions such as List.map are common in F# programming; they’re so common that you usually don’t even write the declarations of the type variables in the types of these values. Sometimes, however, the declaration point of these variables is made explicit in output from tools and the F# compiler. For example, you may see this:


val map<'T, 'U> : ('T -> 'U) -> 'T list -> 'U list

Frequently, type variables have an implicit scope, governed by the rules of automatic generalization discussed in the section “Writing Generic Functions.” This means you can introduce type variables simply by writing them as part of the type annotations of a function:

let rec map (f : 'T -> 'U) (l : 'T list) =
    match l with
    | h :: t -> f h :: map f t
    | [] -> []

If you want, you can also write the type parameters explicitly on a declaration. You typically have to use each type variable at least once in a type annotation in order to relate the type parameters to the actual code:

let rec map<'T, 'U> (f : 'T -> 'U) (l : 'T list) =
    match l with
    | h :: t -> f h :: map f t
    | [] -> []

images Note  By convention, uppercase type-variable names are used for user-declared type parameters, and lowercase names are used for inferred type parameters. In general, the style TypeName<'T> is preferred for F# types, although for historical reasons, the style 'T TypeName is used for list, option, reference, and array types.

Writing Generic Functions

A key feature of F# is the automatic generalization of code. The combination of automatic generalization and type inference makes many programs simpler, more succinct, and more general. It also greatly enhances code reuse. Languages without automatic generalization force programmers to compute and explicitly write down the most general type of their functions, and often this is so tedious that programmers don’t take the time to abstract common patterns of data manipulation and control.

For example, type parameters are automatically introduced when you write simple functions that are independent of some of their arguments:

let getFirst (a,b,c) = a

The inferred type of getFirst is reported as follows:


val getFirst : a:'a * b:'b * c:'c -> 'a

Here, getFirst has been automatically inferred to be generic. The function is generic in three type variables, where the result type is the first element of the input tuple type. Automatic generalization is applied when a let or member definition doesn’t fully constrain the types of inputs or outputs. You can tell automatic generalization has been applied by the presence of type variables in an inferred type and ultimately by the fact that you can reuse a construct with multiple types.

Automatic generalization is particularly useful when taking functions as inputs. For example, the following takes two functions as input and applies them to each side of a tuple:

let mapPair f g (x, y) = (f x, g y)

The generalized, inferred type is as follows:


val mapPair : f:('a -> 'b) -> g:('c -> 'd) -> x:'a * y:'c -> 'b * 'd

Some Important Generic Functions

The F# and .NET libraries include definitions for some important generic functions. You saw a number of these in action in earlier chapters. It’s important to have a working knowledge of these building blocks, because often your code will automatically become generic when you use these primitives.

Generic Comparison

The first primitives are all related to generic comparison, also often called structural comparison. Every time you use operators such as <, >, <=, >=, =, <>, compare, min, and max in F# code, you’re using generic comparison. All of these operators are located in the Microsoft.FSharp.Core.Operators module, which is opened by default in all F# code. Some important data structures also use generic comparison internally; for example, you may also be using generic comparison when you use F# collection types such as Microsoft.FSharp.Collections.Set and Microsoft.FSharp.Collections.Map. This is discussed in the documentation for these types. The type signatures of the basic generic comparison operators are shown here:


val compare : 'T -> 'T -> int when 'T : comparison
val (=) : 'T -> 'T -> bool when 'T : equality
val (<) : 'T -> 'T -> bool when 'T : comparison
val (<=) : 'T -> 'T  -> bool when 'T : comparison
val (>) : 'T  -> 'T  -> bool when 'T : comparison
val (>=) : 'T  -> 'T  -> bool when 'T : comparison
val (min) : 'T  -> 'T  -> 'T  when 'T : comparison
val (max) : 'T  -> 'T  -> 'T  when 'T : comparison

All of these routines are constrained, which means they may be used only on a subset of types that are known to support either equality (for =) or ordered comparison (for the others). It may help to think of those that implement ordered comparison as being implemented in terms of compare, which returns 0 if the arguments are equal and returns –1 and 1 for less than and greater than, respectively.

On ordinary simple types, such as integers, generic comparison works by invoking the default .NET behavior for these types, giving the natural ordering for these types. For strings, culture-neutral ordinal comparison is used, which means the local culture settings on your machine don’t affect string comparison (see System.Globalization, System.Threading.Thread.CurrentThread.CurrentCulture, and String.Compare for more information about local culture settings and culture-specific string ordering). Most other .NET base types implement the System.IComparable interface, such as System.DateTime values, and generic comparison uses these implementations where necessary.

You can also use the comparison operators on most structured types. For example, you can use them on F# tuple values, where a lexicographic left-to-right comparison is used:


> ("abc", "def") < ("abc", "xyz");;

val it : bool = true

> compare (10, 30) (10, 20);;

val it : int = 1

Likewise, you can use generic comparison with list and array values:


> compare [10; 30] [10; 20];;

val it : int = 1

> compare [|10; 30|] [|10; 20|];;

val it : int = 1

> compare [|10; 20|] [|10; 30|];;

val it : int = -1

Generic Hashing

Generic hashing is an important partner of generic comparison. The primary primitive function used to invoke generic hashing is hash, again located in the Microsoft.FSharp.Core.Operators module. The type signature is:


val hash : 'T -> int when 'T : equality

Again, this is a constrained operation, requiring that the type support equality. Most types support some form of equality, even if it’s the default reference equality of .NET.

When used with simple structural types, the function returns an integer that gives a stable hash of the input value:


> hash 100;;

val it : int = 100

> hash "abc";;

val it : int = 536991770

> hash (100, "abc");;

val it : int = 536990974

Generic hashing is implemented similarly to generic comparison. Like generic comparison, generic hashing should generally be used only with base types, such as integers, and with structural types built using records and discriminated unions.

For the most part, generic hashing and comparison are implemented efficiently—code is autogenerated for each type definition where possible fast-path comparison techniques are used. For example, the generated code uses primitive Common IL/native instructions for integer comparisons. This means that in practice, structural comparison is typically fast when used with appropriately sized keys. However, you should consider the following before using generic comparison over complex new data types:

  • When using .NET collections, consider passing the HashIdentity.Structural or ComparisonIdentity.Structural parameters to the object constructor of the collection. This helps the F# compiler optimize the performance of the hashing, equality, and comparison functions for the collection instance.  
  • Hashing, comparison, and equality on tuples can be slower than expected in some circumstances where the F# compiler can’t optimize the hashing, equality, and comparison functions for these types. Consider replacing uses of tuples as keys by the use of a new, named key type, often using a union type with a single case, e.g. type Key = Key of string * int. This allows the compiler to optimize the hashing.
  • Consider customizing the behavior of generic hashing, comparison, and equality for new types you define, at least when those types will be used as keys in a data structure. You can do this by implementing the System.IComparable interface and overriding the System.Object.Equals method, covered in Chapter 8. You can customize generic hashing for new types by either overriding the GetHashCode method or implementing the Microsoft.FSharp.Core.IStructuralHash interface, also covered in Chapter 8.
  • Both ordered comparison and equality (in combination with hashing) can be used to build interesting indexed data structures. Collections built using these techniques are efficient over small keys. Performance issues may arise, however, if they’re used repeatedly over large structured keys. In this case, using custom comparison may be appropriate.
Generic Pretty-Printing

Some useful generic functions do generic formatting of values. The simplest ways to access this functionality is to use the %A specifiers in printf format strings. Here is an example:


> sprintf "result = %A" ([1], [true]);;

val it : string = "result = ([1], [true])"

This code uses .NET and F# reflection to walk the structure of values to build a formatted representation of the value. You format structural types such as lists and tuples using the syntax of F# source code. Unrecognized values are formatted by calling the .NET ToString() method for these values. F# and .NET reflection are discussed in more depth toward the end of Chapter 17.

Generic Boxing and Unboxing

Two useful generic functions convert any F# data to and from the universal type System.Object (the F# type obj):


val box : 'T -> obj
val unbox : obj -> 'T

Here are some simple examples of these functions in action:


> box 1;;

val it : obj = 1

> box "abc";;

val it : obj = "abc"

> let stringObj = box "abc";;

val stringObj : obj = "abc"

> (unbox<string> stringObj);;

val it : string = "abc"

> (unbox stringObj : string);;

val it : string = "abc"

Note that using unbox generally requires you to specify the target type, given either as an explicit type parameter unbox<string> or as a type constraint (unbox stringObj: string)—these forms are equivalent. A runtime check is performed on unboxing to ensure that the object can be safely converted to the target type. Values of type obj carry dynamic type information, and attempting to unbox a value to an incompatible type raises an error:


> (unbox stringObj : int);;

System.InvalidCastException: Specified cast is not valid.
   at <StartupCode$FSI_0046>.$FSI_0046.main@()
Stopped due to error

Boxing is important partly because many early .NET libraries provide operations through functions that accept arguments of type obj. You see an example in the next section. Furthermore, some .NET APIs are dynamically typed, and almost all parameters are of type obj. This was common before generics were added in v2.0 of the .NET framework.

Generic Binary Serialization via the .NET Libraries

The .NET libraries provide an implementation of generic binary serialization that is useful as a quick and easy way of saving computed values to disk and sending values over the network. Let’s use this as an example to see how you can define building-block generic operations using functionality in the .NET libraries combined with box and unbox. You first define functions with the following signatures:


val writeValue : outputStream:System.IO.Stream -> x:'T -> unit
val readValue : inputStream:Stream -> 'a

The function writeValue takes an arbitrary value and writes a binary representation of its underlying object graph to the given I/O stream. The function readValue reverses this process, in much the same way that unbox reverses the process performed by box. Here are the implementations of the functions in terms of the .NET binary serializer located in the namespace System.Runtime.Serialization.Formatters.Binary:

open System.IO
open System.Runtime.Serialization.Formatters.Binary

let writeValue outputStream (x : 'T) =
    let formatter = new BinaryFormatter()
    formatter.Serialize(outputStream, box x)

let readValue inputStream =
    let formatter = new BinaryFormatter()
    let res = formatter.Deserialize(inputStream)
    unbox res

Note that box and unbox are used in the implementation, because the Serialize and Deserialize functions accept and return a value of type obj. Here is an example of how to use the functions to write a value of type Microsoft.FSharp.Collections.Map<string,string> to a FileStream and read it back in again:

open System.IO
let addresses = Map.ofList ["Jeff", "123 Main Street, Redmond, WA 98052"
                            "Fred", "987 Pine Road, Phila., PA 19116"
                            "Mary", "PO Box 112233, Palo Alto, CA 94301"]



let fsOut = new FileStream("Data.dat", FileMode.Create)
writeValue fsOut addresses
fsOut.Close()
let fsIn = new FileStream("Data.dat", FileMode.Open)
let res : Map<string, string> = readValue fsIn
fsIn.Close()

The final result of this code when executed interactively is:


> res;;

val it : Map<string,string> =
  map
    [("Fred", "987 Pine Road, Phila., PA 19116");
     ("Jeff", "123 Main Street, Redmond, WA 98052");
     ("Mary", "PO Box 112233, Palo Alto, CA 94301")]

Note that values of type Map<string,string> are printed interactively as sequences of key/value pairs. Also, a type annotation is required when reading the data back in using readValue, and a runtime type error results if the types of the objects reconstructed from the binary data don’t match this type annotation.

The .NET Framework provides several other generic serializers that differ in output format and operational characteristics. The most important of these are based around the notion of DataContract serialization:

  • System.Runtime.Serialization.Json.DataContractJsonSerializer: Used for serializing public data objects into the popular JavaScript Object Notation (JSON) format, using formats that aren’t specifically tied to particular .NET types but that are, rather, compatible with a range of types
  • System.Runtime.Serialization.DataContractSerializer: Used for serializing public data objects into XML, using formats that aren’t specifically tied to particular .NET types but that are, rather, compatible with a range of types
  • System.Runtime.Serialization.NetDataContractSerializer: Used for serializing arbitrary object graphs, including private data and closures, into XML using the same idioms as the other DataContract serializers

In addition, you can use some older serializers with F#. For example, System.Runtime.Serialization.XmlSerializer is used for serializing public data into XML. This serializer is now used less frequently; you should generally use NetDataContractSerializer instead. You can also write your own generic serializer, for example using the techniques described in Chapter 8.

Making Things Generic

The following sections discuss how to make existing code more generic (that is, reusable) and how to systematically represent the abstract parameters of generic algorithms.

Generic Algorithms through Explicit Arguments

A common pattern in F# programming is to accept function parameters in order to make an algorithm abstract and reusable. A simple sample is the following generic implementation of Euclid’s algorithm for finding the highest common factor (HCF) of two numbers:

let rec hcf a b =
    if a = 0 then b
    elif a < b then hcf a (b - a)
    else hcf (a - b) b

The type of this function is:


val hcf : a:int -> b:int -> int

For example:


> hcf 18 12;;

val it : int = 6

> hcf 33 24;;

val it : int = 3

This algorithm isn’t generic, however, because as written, it works only over integers. In particular, although the operator (-) is by default overloaded in F#, each use of the operator must be associated with at most one type decided at compile time. This restriction is discussed in more detail in the section “Understanding Generic Overloaded Operators.” In addition, the constant 0 is an integer and isn’t overloaded.

Despite this, this algorithm can be easily generalized to work over any type. To achieve this, you must provide an explicit zero, a subtraction function, and an ordering. Here’s one way:

let hcfGeneric (zero, sub, lessThan) =
    let rec hcf a b =
        if a = zero then b
        elif lessThan a b then hcf a (sub b a)
        else hcf (sub a b) b
    hcf

The inferred, generic type of this function is as follows:


val hcfGeneric :
  zero:'a * sub:('a -> 'a -> 'a) * lessThan:('a -> 'a -> bool) ->
    ('a -> 'a -> 'a) when 'a : equality

      The numeric type being manipulated has type 'a in the inferred type, and the result is a computed function. This approach uses techniques for computing functions similar to those discussed in Chapter 3. Here are some examples of using this generic function:

let hcfInt = hcfGeneric (0, (-), (<))
let hcfInt64  = hcfGeneric (0L, (-), (<))
let hcfBigInt = hcfGeneric (0I, (-), (<))

Note that when you instantiate the generic function for these cases, you’re drawing on particular instances of the default overloaded operator (-). You can check that the code is executing correctly as follows:


> hcfInt 18 12;;

val it : int = 6

> hcfBigInt 1810287116162232383039576I 1239028178293092830480239032I;;

val it : System.Numerics.BigInteger = 33224

Generic Algorithms through Function Parameters

The generic implementation from the previous section took three related parameters for zero, comparison, and subtraction. It’s common practice to package related operations together. One way to do this is to use a concrete record type containing function values:

type Numeric<'T> =
    {Zero : 'T;
     Subtract : ('T -> 'T -> 'T);
     LessThan : ('T -> 'T -> bool);}

let intOps = {Zero = 0; Subtract = (-); LessThan = (<)}
let bigintOps = {Zero = 0I; Subtract = (-); LessThan = (<)}
let int64Ops = {Zero = 0L; Subtract = (-); LessThan = (<)}

let hcfGeneric (ops : Numeric<'T>) =
    let rec hcf a b =
        if a = ops.Zero then b
        elif ops.LessThan a b then hcf a (ops.Subtract b a)
        else hcf (ops.Subtract a b) b
    hcf

let hcfInt = hcfGeneric intOps
let hcfBigInt = hcfGeneric bigintOps

The inferred types are as follows and the hcfGeneric type is now simpler:


val hcfGeneric : ops:Numeric<'T> -> ('T -> 'T -> 'T) when 'T : equality
val hcfInt : (int -> int -> int)
val hcfBigInt :
  (System.Numerics.BigInteger -> System.Numerics.BigInteger -> System.Numerics.BigInteger)

To double-check that everything works as before:


> hcfInt 18 12;;

val it : int = 6

> hcfBigInt 1810287116162232383039576I 1239028178293092830480239032I;;

val it : System.Numerics.BigInteger = 33224

Record types such as Numeric<'T>, often called dictionaries of operations, are similar to vtables from object-oriented programming and the compiled form of type classes from Haskell. As you’ve seen, dictionaries such as these can be represented in different ways according to your tastes, using tuples or records. For larger frameworks, a carefully constructed classification of object interface types is often used in place of records. Here is an interface type definition that plays the same role as the record in the previous example:

type INumeric<'T> =
    abstract Zero : 'T
    abstract Subtract : 'T * 'T -> 'T
    abstract LessThan : 'T * 'T -> bool

You can implement interface types with object expressions similarly to record values:

let intOps =
    {new INumeric<int> with
        member ops.Zero = 0
        member ops.Subtract(x, y) = x - y
        member ops.LessThan(x, y) = x < y}

val intOps : INumeric<int>

The code for Euclid’s algorithm using interface types is essentially the same as for the code based on record types:

let hcfGeneric (ops : INumeric<'T>) =
    let rec hcf a b =
        if a = ops.Zero then b
        elif ops.LessThan(a, b) then hcf a (ops.Subtract(b, a))
        else hcf (ops.Subtract(a, b)) b
    hcf

GENERIC ALGORITHMS AND FUNCTION PARAMETERS

Generic Algorithms through Inlining

An  additional way to make code generic is to mark a function as inline. You can use this technique for code that uses F# operators, such as float, int, +, -, *, /, and other arithmetic operations. As we’ll discuss in the section “Understanding Generic Overloaded Operators,” each use of these operators is normally statically resolved. For example, consider this code:

let convertToFloat x = float x

In the F# type system, the type of the float function is 'T -> float (requires member op_Explicit). This means the type 'T must support an explicit conversion from 'T to float. Thus, you can use it with integers, singles, doubles, decimals, and so on, but you can’t use it with any old type. The 'T is constrained.

As mentioned earlier, F# operators such as float are statically resolved. This means each use of float is associated with one statically inferred type. The same applies to other operators such as +, *, and so on. These operators also generally default to working over integers. For example, this shows float being used with three different statically resolved types:

float 3.0 + float 1 + float 3L

To make code such as this generic, you can use inline. For example, consider:

let inline convertToFloatAndAdd x y = float x + float y

      convertToFloatAndAdd can be used with an integer and a float, or a decimal and an integer, or two decimals. You can use inline to write code and algorithms that are implicitly generic over arithmetic type while still maintaining type safety. This gives you generic, type-safe code while ensuring static resolution of basic numeric operations, making it easy to reason about the efficiency of your code.

You can also apply this technique to the earlier HCF algorithm:

let hcfGeneric (ops : INumeric<'T>) =
    let rec hcf a b =
        if a = ops.Zero then b
        elif ops.LessThan(a, b) then hcf a (ops.Subtract(b, a))
        else hcf (ops.Subtract(a, b)) b
    hcf

The type of this function is:


val inline hcf :
  a: ^a -> b: ^a ->  ^a
    when  ^a : (static member get_Zero : ->  ^a) and  ^a : comparison and
          ^a : (static member ( - ) :  ^a *  ^a ->  ^a)

For example:


> hcf 18 12;;

val it : int = 6

> hcf 18I 12I;;

val it : System.Numerics.BigInteger = 6

The algorithm is now generic. You should use this technique sparingly, but it can be extremely powerful. Note the use of the F# library primitive GenericZero to get a zero value for an arbitrary type supporting a Zero static property.

One variation on this technique provides a simple wrapper that delegates to a non-inlined generic routine. For example, the following code delegates to the non-inlined routine hcfGeneric, defined earlier in this section:

let inline hcf a b =
    hcfGeneric
        {new INumeric<'T> with
            member ops.Zero = LanguagePrimitives.GenericZero<'T>
            member ops.Subtract(x, y) = x - y
            member ops.LessThan(x, y) = x < y}
        a b

This gives you a generic function hcfGeneric that can be used with any explicit set of arithmetic operations, and a generic routine hcf that can be used with any type that supports an implicit set of arithmetic operations.

More on Different Kinds of Types

For the most part, programming in F# involves using types in a simple way: each type has some values, and types are related by using explicit functions to map from one type to another. In reality, however, types in F# are more sophisticated than this statement implies. First, F# types are really .NET types, and .NET makes some distinctions among different kinds of types that are occasionally important, such as between value types and reference types.

      Furthermore, .NET and F# support hierarchical relationships between types through subtyping. The following sections first cover .NET types from the F# perspective and then cover subtyping.

Reference Types and Value Types

The .NET documentation and other .NET languages often describe types as either reference types or value types. You can use System.String and System.DateTime as a typical example of each. First note that both of these types are immutable. That is, given a value of type System.String, you can’t modify the contents of the string; the same is true for System.DateTime. This is by far the most important fact you need to know about these types: for immutable types the reference/value type distinction is relatively unimportant. It’s still useful, however, to know the following:

  • Representation: Values of type System.String are single pointers into the garbage-collected heap where the actual data for the string reside. Two System.String values may point to the same data. In contrast, values of type System.DateTime are somewhat larger blobs of integer data (64-bits of data, in this case), and no data live on the garbage-collected heap. The full data for the value are copied as needed.
  • Boxing: All .NET types can be marshaled to and from the .NET type System.Object (the F# type obj) by using F# functions such as box and unbox. All reference types are trivially marshaled to this type without any change of representation or identity, so for reference types, box is a no-op. Boxing a value type involves a heap allocation, resulting in a new object. Often this object is immediately discarded after the operation on the obj type has been performed.

If a value can be mutated, then the distinction between value types and reference types is more serious. Fortunately, essentially all mutable values in the .NET libraries are reference types, which means mutation actually mutates the heap-allocated data referred to by the reference.

Other Flavors of .NET Types

The .NET type system makes some additional distinctions among types that are occasionally significant for F# programming:

  • Delegate types: Delegate types , such as System.Windows.Forms.MouseEventHandler, are a form of named function type supported by all .NET languages. They tend not to be as convenient to use as F# function types, because they don’t support compositional operations such as pipelining and forward composition, but .NET APIs use delegates extensively. To create a delegate value, you name the delegate type and pass it a function value that accepts the same arguments expected by the delegate type, such as MouseEventHandler(fun sender args -> printf "click! ").
  • Attribute types: Types derived from the System.Attribute class are used to add metadata to source-code declarations and typically end in the suffix Attribute. You can access these attributes via .NET and F# reflection. You can add attributes to F# declarations using the [<...>] syntax. For example, System.ObsoleteAttribute marks a function as obsolete, and the F# compiler produces a warning if the function is used. Attribute names such as this one, when used inside the [<…>] syntax, can omit the …Attribute suffix, which is then automatically inserted by the F# compiler.
  • Exception types: Types derived from the System.Exception class are used to represent raised exceptions. Chapter 4 discussed exceptions in detail.
  • Enum types: .NET enum types are simple integer-like value types associated with a particular name. They’re typically used for specifying flags to APIs; for example, FileMode in the System.IO namespace is an enum type with values such as FileMode.Open and FileMode.Create. .NET enum types are easy to use from F# and can be combined using bitwise AND, OR, and XOR operations using the &&&, |||, and ^^^ operators. Most commonly, the ||| operator is used to combine multiple flags. On occasion, you may have to mask an attribute value using &&& and compare the result to enum 0. You see how to define .NET-compatible enum types in F# at the end of Chapter 6.

Understanding Subtyping

Simple types are related in simple ways. For example, values of the type int are distinct from values of the type float, and values of one record type are distinct from values of another. This approach to types is powerful and often sufficient, partly because type inference and function values make it easy to write generic code, although .NET and F# also support hierarchical relationships between types through subtyping. Subtyping is a key concept of object-oriented programming and is discussed in more detail in Chapter 6. In addition, you can use subtyping in conjunction with pure functional programming, because it offers one technique to make algorithms generic over a restricted family of types.

The following sections explain how these constructs appear to the F# programmer and the role they play in F# programming. Subtyping in F# is the same as that used by the .NET Framework, so if you’re familiar with another .NET language, you already know how things work.

Casting Up Statically

You can explore how subtyping works by using F# Interactive. First, let’s look at how some of the F# types you’ve already seen relate to the type obj:


> let xobj = (1 :> obj);;

val xobj : obj = 1

> let sobj = ("abc" :> obj);;

val sobj : obj = "abc"

This example shows the subtyping relationship through the use of the built-in coercion (or upcast) operator, which is :>. This operator converts a value to any of its supertypes in precisely the same way as the box function.

The previous code indicates the subtyping between ordinary types and the type obj. Subtyping occurs between other kinds of types as well (Chapters 3 and 6 discuss the various kinds of type definitions, such as records, discriminated unions, classes, and interfaces):

  • All types are subtypes of System.Object (called obj in F#).
  • Record and discriminated union types are subtypes of the interface types they implement.
  • Interfaces types are subtypes of the other interfaces they extend.
  • Class types are subtypes of both the interfaces they implement and the classes they extend.
  • Array types are subtypes of the .NET type System.Array.
  • Value types (types such as int32 that are abbreviations of .NET value types) are subtypes of the .NET type System.ValueType. Likewise, .NET enumeration types are subtypes of System.Enum.

Casting Down Dynamically

Values that may have subtypes carry a runtime type, and you can use runtime-type tests to query the type of an object and convert it to one of the subtypes. You can do this in three main ways: the unbox operation, downcasts, and pattern type tests. We’ve already explained the unbox function. As with most object-oriented languages, the upcasts performed through subtyping are reversible through the use of downcasts—in other words, by using the:?> operator. Consider these examples:


> let boxedObject = box "abc";;

val boxedObject : obj

> let downcastString = (boxedObject :?> string);;

val downcastString : string = "abc"

Downcasts are checked at runtime and all values of the obj type are implicitly annotated with the runtime type of the value. The operator :?> raises an exception if the object isn’t of a suitable type:


> let xobj = box 1;;

val xobj : obj = 1

> let x = (xobj :?> string);;

System.InvalidCastException: Unable to cast object of type 'System.Int32' to type 'System.
String'.
   at <StartupCode$FSI_0037>.$FSI_0037.main@()

Performing Type Tests via Pattern Matching

A more convenient way of performing dynamic type tests uses type-test patterns—in particular the :? pattern construct, which you encountered in Chapter 4 in the context of catching various .NET exception types. This example uses a pattern-type test to query the dynamic type of a value of type obj:

let checkObject (x : obj) =
    match x with
    | :? string -> printfn "The object is a string"
    | :? int -> printfn "The object is an integer"
    | _ -> printfn "The input is something else"

> checkObject (box "abc");;

The object is a string

Such a pattern may also bind the matched value at its more specific type:

let reportObject (x : obj) =
    match x with
    | :? string as s -> printfn "The input is the string '%s'" s
    | :? int as d -> printfn "The input is the integer '%d'" d
    | _ -> printfn "the input is something else"

>  reportObject (box 17);;

The input is the integer '17'

Knowing When Upcasts Are Applied Automatically

Like most object-oriented languages, F# automatically applies upcasts whenever a function or method is called or wherever a value is used. If a parameter to an F# function has a named type, then the function implicitly accepts parameters that are any subtype, assuming that type supports subtyping. This is particularly common when working with .NET libraries that use subtyping heavily. For example:


> open System.Windows.Forms;;
> let setTextOfControl (c : Control) (s : string) = c.Text <- s;;

val setTextOfControl : c:Control -> s:string -> unit

> let form = new Form();;

val form : Form = System.Windows.Forms.Form, Text:

> let textBox = new TextBox();;

val textBox : TextBox = System.Windows.Forms.TextBox, Text:

> setTextOfControl form "Form Text";;
> setTextOfControl textBox "Text Box Text";;

Here, the function setTextOfControl is used on two different subtypes of Control. When functions have parameters that accept named types such as Control, you don’t need to supply an upcast explicitly.

F# doesn’t apply upcasts in all the situations where other object-oriented languages do, however. In practice, this means you sometimes have to add explicit upcasts to your code to throw away information. For example, if each branch of an if ... then ... else ... construct returns different types, then you need to upcast the results of one or both of the branches. This is shown by the type error given for the following code, which returns Console.In (a TextReader) from one branch and the results of File.OpenText (a StreamReader) from the other branch:

open System
open System.IO

let textReader  =
    if DateTime.Today.DayOfWeek = DayOfWeek.Monday
    then Console.In
    else File.OpenText("input.txt")

The error reported is as follows:


error FS0001: This expression was expected to have type
    TextReader    
but here has type    
    StreamReader

StreamReader is a subtype of TextReader, so the code can be corrected by throwing away the information that the returned type is a StreamReader:

open System
open System.IO

let textReader =
    if DateTime.Today.DayOfWeek = DayOfWeek.Monday
    then Console.In
    else (File.OpenText("input.txt") :> TextReader)

Upcasts are applied automatically in the following situations:

  • When passing arguments to functions and all members associated with .NET and F# objects and types. This applies when parameters have named types, such as TextReader, rather than generic types.
  • When calling functions with flexible parameter types (see in the following section), such as #TextReader.
  • When assigning into fields and properties.
  • When accessing members using dot notation. For example, given a value of type StreamReader, all the members associated with TextReader can also be accessed without needing to apply an upcast.

Note particularly that upcasts aren’t applied automatically for the result of a function. For example, an upcast is needed here, to coerce a StreamReader to a TextReader, despite the presence of an explicit type annotation for the return type:

let getTextReader () : TextReader = (File.OpenText("input.txt") :> TextReader)

Flexible Types

F# also has the notion of a flexible type constraint, which is shorthand for a generic function and a constraint on the type variable. You could equally write this:


> open System.Windows.Forms;;
> let setTextOfControl (c : 'T when 'T :> Control) (s : string) = c.Text <- s;;

val setTextOfControl: c:#Control -> s:string -> unit

Automatic generalization lifts the constraints implicit in types of the form #type to the nearest enclosing function or member definition. Flexible type constraints sometimes occur when you’re working with sequence values. For example, consider this function from the F# library:


module Seq =
    ...
    val concat : seq<#seq<'T>> -> seq<'T>
    ...

When implicit flexibility of arguments is taken into account, the signature of concat means that Seq.concat accepts a list of lists, or a list of arrays, or an array of lists, and so forth. For example:

Seq.concat [[1;2;3]; [4;5;6]]
Seq.concat [[|1; 2; 3|]; [|4; 5; 6|]] ]

Troubleshooting Type-Inference Problems

The following sections cover some of the techniques you can use to understand the type-inference process and to debug problems when inferred types aren’t as expected.

Using a Visual Editing Environment

The best and most important technique to debug and understand type inference is to use a visual editing environment for F#. For example, Visual Studio performs interactive type-checking as you’re writing code. Such tools display errors interactively and show inferred types as you move the mouse pointer over identifiers.

Using Type Annotations

Type inference in F# works through a process of type-constraint propagation. As a programmer, you can add further type constraints to your code in several ways. For example, you can do the following:

  • Add a rigid type constraint using the : notation, such as let t : float = 5.0.
  • Add a type constraint to an argument, such as let setTextOfControl (c : Control) (s : string) = c.Text <- s.
  • Apply a function that accepts only a particular type of argument, such as let f x = String.length x. Here, the use of String.length generates a constraint on the type of x.

Adding type annotations to and removing them from your code is the standard technique to troubleshoot type-inference problems. For example, the following code doesn’t type-check:


> let getLengths inp = inp |> Seq.map (fun y -> y.Length) ;;

error FS0072: Lookup on object of indeterminate type based on information prior to this
program point. A type annotation may be needed prior to this program point to constrain the
type of the object. This may allow the lookup to be resolved.

You can easily solve this problem by adding a type annotation, such as to y:

let getLengths inp =
    inp |> Seq.map (fun (y : string) -> y.Length)

You can also use type annotations to discover why code isn’t as generic as you think it should be. For example, the following code has a mistake, and the F# type checker says the code is less generic than expected:

let printSecondElements (inp : seq<'T * int>) =
    inp
    |> Seq.iter (fun (x, y) -> printfn "y = %d" x)

> ... enter the code above ...

warning FS0064: This construct causes code to be less generic than indicated by the type
annotations. The type variable 'T has been constrained to be type 'int'.

The mistake here is that you’re printing the variable x instead of y, but it’s not always so easy to spot what has caused this kind of problem. One way to track down this problem is to temporarily change the generic type parameter to some arbitrary, unrelated type. After all, if code is generic, then you should be able to use any type; and by changing the type variable to an unrelated type, you can track down where the inconsistency first arises. For example, let’s change 'T to the type PingPong:

type PingPong = Ping | Pong

let printSecondElements (inp : seq<PingPong * int>) =
    inp
    |> Seq.iter (fun (x,y) -> printfn "y = %d" x)

You now get a different and in many ways more informative error message, localized to the first place that the value x marked with type PingPong is used in an unexpected way:


> ... enter the code above ...

error FS0001: The type 'PingPong' is not compatible with any of the types byte,int16,int32,int
64,sbyte,uint16,uint32,uint64,nativeint,unativeint, arising from the use of a printf-style
format string

Understanding the Value Restriction

F# sometimes requires a little help before a definition is automatically generalized. In particular, only function definitions and simple immutable data expressions are automatically generalized; this is called the value restriction. For example, the following definition doesn’t result in a generic type and gives an error:


> let empties = Array.create 100 [];;

error FS0030: Value restriction. The value 'empties' has been inferred to have generic type

    val empties : '_a list []    

Either define 'empties' as a simple data term, make it a function with explicit arguments or,
if you do not intend for it to be generic, add a type annotation.

The code attempts to create an array of empty lists. The error message indicates that type inference has given empties the type '_a list []. The underscore (_) indicates that the type variable 'a is ungeneralized, meaning this code isn’t fully generic. It would make no sense to give empties the truly generic type 'a list [], because this would imply that you’ve created an array of lists somehow suitable for use with any type 'a. In reality, any particular array should have one specific type, such as int list [] or string list [], but not both. (If it were usable with both types, then you could store an integer list in the array and fetch it out as a string list!)

The value restriction ensures that declarations don’t result in this kind of confusion; automatic generalization isn’t applied to declarations unless they’re functions or simple, immutable data constructs. One way to think of this is that you can create concrete objects only after the type-inference problem is sufficiently constrained so that every concrete object created at the top level of your program has a ground type—a type that doesn’t contain any ungeneralized type variables.

The value restriction doesn’t apply to simple immutable data constants or function definitions. For example, the following declarations are all automatically generalized, giving the generic types shown:

let emptyList = []
let initialLists = ([], [2])
let listOfEmptyLists = [[]; []]
let makeArray ()  = Array.create 100 []

val emptyList : 'a list
val initialLists : 'a list * int list
val listOfEmptyLists : 'a list list
val makeArray : unit -> 'a list []

Working Around the Value Restriction

The value restriction crops up with mild regularity in F# coding—particularly when you’re using F# Interactive, where the scope of type inference is at the granularity of each entry sent to the tool rather than an entire file, and hence, fewer constraints are placed on these code fragments. You can work around the value restriction in several ways, depending on what you’re trying to do.

Technique 1: Constrain Values to Be Nongeneric

The first technique applies when you make a definition such as empties, shown earlier, but you meant to create one value. Recall that this definition was problematic because it’s not clear what type of array is being created:

let empties = Array.create 100 []

If you meant to create one value with a specific type, then use an explicit type annotation:

let empties : int list [] = Array.create 100 []

The value is then not generic, and the value restriction doesn’t apply.

Technique 2: Ensure Generic Functions Have Explicit Arguments

The next technique applies when you’re defining generic functions. In this case, make sure you define them with explicit arguments. For example, look at the following definition of a function value:

let mapFirst = List.map fst

You may expect this to be a generic function with type ('a * 'b) list -> 'a list. This isn’t what happens. Type variables are automatically generalized at true syntactic function definitions—that is, function definitions with explicit arguments. The function mapFirst has implicit arguments. Fortunately, it’s easy to work around this by making the arguments explicit (or in academic terms, performing η [eta] abstraction):

let mapFirst inp = List.map fst inp

This function now has the following type:


val mapFirst : inp:('a * 'b) list -> 'a list

When there is only one argument, our favorite way of writing the extra arguments is as follows:

let mapFirst inp = inp |> List.map (fun (x, y) -> x)

The same problem arises when you try to define functions by composition. For example:

let printFstElements = List.map fst >> List.iter (printf "res = %d")

The arguments here are implicit, which causes problems. This definition isn’t automatically generalized because this isn’t a syntactic function. Again, make the arguments explicit:

let printFstElements inp = inp |> List.map fst |> List.iter (printf "res = %d")
Technique 3: Add Dummy Arguments to Generic Functions When Necessary

Look at this definition again:

let empties = Array.create 100 []

It’s possible that you really did intend to define a function that generates arrays of different types on demand. In this case, add a dummy argument:

let empties () = Array.create 100 []
let intEmpties : int list [] = empties()
let stringEmpties : string list [] = empties()

The dummy argument ensures that empties is a function with generic arguments and that it can be automatically generalized.

Technique 4: Add Explicit Type Arguments When Necessary

You can use one final technique to make a value generic. This one is rarely used, but it is very handy when it’s required. It’s normally used when you’re defining values that are generic but that are neither functions nor simple data expressions. For example, let’s define a sequence of 100 empty lists:

let emptyLists = Seq.init 100 (fun _ -> [])

The expression on the right isn’t a function or simple data expression (it’s a function application), so the value restriction applies. One solution is to add an extra dummy argument, as in the previous section. If you’re designing a library, however, this can seem very artificial. Instead, you can use the following declaration form to make emptyLists generic:

let emptyLists<'T> : seq<'T list> = Seq.init 100 (fun _ -> [])

You can now use emptyLists as a type function to generate values of different types. For example:


> Seq.length emptyLists;;

val it : int = 100

> emptyLists<int>;;

val it : seq<int list> = seq [[]; []; []; []; ...]

> emptyLists<string>;;

val it : seq<string list> = seq [[]; []; []; []; ...]

Some values and functions in the F# library are defined in this way, including typeof<_>, Seq.empty, and Set.empty. Chapter 17 covers typeof.

Understanding Generic Overloaded Operators

There is one further important way in which code doesn’t automatically become generic in F#: when you use overloaded operators such as +, -, *, and /, or overloaded conversion functions such as float and int64. For example, the following is not a generic function:

let twice x = (x + x)

In the absence of further information, the type of this function is as follows:


val twice : x:int -> int

This is because the overloaded operator + defaults to operating on integers. If you add type annotations or further type constraints, you can force the function to operate on any type supporting the overloaded + operator:

let twiceFloat (x : float) = x + x

val twiceFloat : x:float -> float

The information that resolves a use of an overloaded operator can come after the use of the operator:

let threeTimes x = (x + x + x)
let sixTimesInt64 (x:int64) = threeTimes x + threeTimes x

val threeTimes : x:int64 -> int64
val sixTimesInt64 : x:int64 -> int64

Note how the constraint in the definition of sixTimesInt64 is the only mention of int64 and affects the type of threeTimes. The technical explanation is that overloaded operators such as + give rise to floating constraints, which can be resolved later in the type-inference scope.

Summary

F# is a typed language, and F# programmers often use types in sophisticated ways. In this chapter, you learned about the foundations of types, focusing on how types are used in functional programming and with generics and subtyping in particular. The next chapter covers the related topics of object-oriented and modular programming in F#.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.141.28.107