© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
K. Eason, Stylish F# 6, https://doi.org/10.1007/978-1-4842-7205-3_3

3. Missing Data

Kit Eason, Farnham, Surrey, UK

Not even a thought has arisen; is there still a sin or not?

—Zen Koan, 10th Century CE

This is a chapter about nothing! Specifically, it’s about how we handle the absence of data in our programs. It’s a more important topic than you might think at first: bugs caused by incorrect handling of missing data, typically manifested as “null reference errors,” are distressingly common in Object-Oriented programs. And this still happens, despite code to avoid such errors forming a significant proportion of the line count of many C# code bases.

In this chapter I’ll try to convince you how serious a problem this is and show you the many features and idioms that F# offers to mitigate and even eliminate this class of error.

A Brief History of Null

When computer scientist Tony Hoare invented the concept of null in 1965 in developing ALGOL W, his purpose was to represent a thing or property that is potentially present but might not be present in a particular situation. Take the ALGOL W program in Listing 3-1, where null values are used extensively.
RECORD PERSON (
    STRING(20) NAME;
    INTEGER AGE;
    LOGICAL MALE;
    REFERENCE(PERSON) FATHER, MOTHER, YOUNGESTOFFSPRING, ELDERSIBLING
);
REFERENCE(PERSON) PROCEDURE YOUNGESTUNCLE (REFERENCE(PERSON) R);
    BEGIN
        REFERENCE(PERSON) P, M;
        P := YOUNGESTOFFSPRING(FATHER(FATHER(R)));
        WHILE (P ¬= NULL) AND (¬ MALE(P)) OR (P = FATHER(R)) DO
            P := ELDERSIBLING(P);
        M := YOUNGESTOFFSPRING(MOTHER(MOTHER(R)));
        WHILE (M ¬= NULL) AND (¬ MALE(M)) DO
            M := ELDERSIBLING(M);
        IF P = NULL THEN
            M
        ELSE IF M = NULL THEN
            P
        ELSE
            IF AGE(P) < AGE(M) THEN P ELSE M
    END
Listing 3-1

Some ALGOL W code that uses null

Here, the ability to have a null or an actual value is used to model, for example, the fact that a person might or might not have an elder sibling. Null and nonnull instance values are used as flags to go down various branches of code. The modeling is definitely a bit fuzzy: for instance, FATHER and MOTHER are also nullable, even though everyone has a mother and father. Perhaps this models the fact that we might not know who they are. This kind of ambiguity was excusable in the 1960s, but coding patterns in the style of Listing 3-1 are still surprisingly common, even though there are now well-known techniques for modeling such relationships much more explicitly.

Of course, things have improved somewhat since 1965: in C#, for example, we now have the null coalescing operator ??, which allows us to retrieve either the nonnull value of some source item, or some other value, typically a default. As of C# 6.0, we also have the null-conditional operators ?. and ?[] that allow us to reach into an object or array for a property or indexed item and safely return null if either the object with the property, or the property itself, is null.

Despite these improvements, we all regularly see problems caused by null-based modeling. Spotting a ticketing machine or timetable display that has crashed with a null reference error can brighten any programmer’s commute. Figure 3-1 shows a less high-profile but equally typical example: Team Explorer in Visual Studio 2017 exposing a null reference exception during a git syncing operation.
Figure 3-1

Visual Studio 2017 Team Explorer exposing a null reference exception

What has happened in these cases (typically) is that code has tried to access some property or method of an object that is itself null, such as the arrival time of the first train when there is no known first train.

It’s common to blame the programmer in these situations, attributing such errors either to incompetence or to outdated practices and technologies. But it isn’t as simple as that. I took a look at the GitHub issue list of a very modern, reputable, high-profile C# code base. (I won’t be so rude as to name it.) When I checked (in April 2021) for mentions of null references in that GitHub issue list, I got hundreds of hits, many of which were still open (Table 3-1). (There will of course be some double counting in these figures.)
Table 3-1

Null Reference Mentions in a Major C# Code Base Issue List

Search Term              Open   Closed
NullReferenceException    201      521
null reference            264      811
null-ref                   73      191
nullref                     2       24

Incidentally, when I updated these figures for the new edition of this book, the figures in all but two categories had gone up since 2018. Clearly, it isn’t just “bad programmers” making these mistakes: null reference errors are accidents waiting to happen. Rather than blaming the operator, we should follow the basic principles of ergonomics and design such errors out of the technology at the language level.

At the time of writing, the primary approach in C# is still to “code around” the problem of null, which works (if you remember to do it) but does have a cost. I analyzed several open-source C# code bases and found that the proportion of lines involved in managing nulls (null checks, uses of null-coalescing and null-conditional operators) amounted to between 3% and 5% of the significant lines of code. Not crippling by any means, but certainly a significant distraction. Anything we can do to make this process easier has a worthwhile payoff.

The conclusion must be that paying attention to missing data, and spending some time learning the techniques to handle it correctly or avoid it completely, are among the most useful things you can do as you learn idiomatic F# coding.

Option Types vs. Null

F#’s answer to the problem of potentially absent values is the option type. If you’ve coded in F# at all, you are probably familiar with option types, but please bear with me for a few moments while I establish very clearly what option types are and what they are not.

Fundamentally, the option type is just another Discriminated Union (DU), a type that represents several case values, each of which may have a different type of payload. Just in case you aren’t fully conversant with DUs, Listing 3-2 shows a general example: a type that can represent the dimensions of a square, a rectangle, or a circle. The Shape DU is made generic (the <'T> part) so that we could express the dimensions in any type we wanted – single precision, double precision, integer pixels, or whatever.
        type Shape<'T> =
            | Square of height:'T
            | Rectangle of height:'T * width:'T
            | Circle of radius:'T
Listing 3-2

Example of a Discriminated Union

Conceptually, the F# option type is just the same: you can think of it as being a generic DU as shown in Listing 3-3. (Actually, within the compiler, it’s not quite as simple as that. For one thing, the option type has its own keyword: option.)
        type Option<'T> =
            | Some of 'T
            | None
Listing 3-3

The Option type viewed as a Discriminated Union

One obvious difference between Shape and Option is that one of the cases of Option takes no payload at all, which makes sense because we can’t know the value of something that, by definition, doesn’t exist. DU cases without payloads are perfectly fine.

Listings 3-4 and 3-5 show us creating and pattern matching on the Shape DU and the Option DU in exactly the same way, to illustrate that there is nothing really magical about the Option DU.
        type Shape<'T> =
            | Square of height:'T
            | Rectangle of height:'T * width:'T
            | Circle of radius:'T
        let describe (shape : Shape<float>) =
            match shape with
            | Square h -> sprintf "Square of height %f" h
            | Rectangle(h, w) -> sprintf "Rectangle %f x %f" h w
            | Circle r -> sprintf "Circle of radius %f" r
        let goldenRect = Rectangle(1.0, 1.61803)
        // Rectangle 1.000000 x 1.618030
        printfn "%s" (describe goldenRect)
Listing 3-4

Creating and using the Shape DU

        let myMiddleName = Some "Brian"
        let herMiddleName = None
        let displayMiddleName (name : Option<string>) =
            match name with
            | Some s -> s
            | None -> ""
        // >>>Brian<<<
        printfn ">>>%s<<<" (displayMiddleName myMiddleName)
        // >>><<<
        printfn ">>>%s<<<" (displayMiddleName herMiddleName)
Listing 3-5

Creating and using the Option DU

The Shape type and the (built-in) Option type are treated in comparable ways in Listings 3-4 and 3-5 – the only real difference is that we could have declared the displayMiddleName function’s argument using string option instead of Option<string>, thus:
        let displayMiddleName (name : string option) = ...

I could have done this because the compiler offers a special keyword for option types. I only used the Option<string> version in Listing 3-5 to highlight the fact that option types are DUs. In practice, you should use the option keyword as this is built into the language, making it widely understood and performant.
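As a quick check that the two spellings really do name the same type, the following sketch binds a value annotated one way to a name annotated the other way. The value names are invented for illustration.

```fsharp
// The keyword form of the annotation.
let viaKeyword : string option = Some "Brian"

// This binding compiles only because string option and
// Option<string> are the same type.
let viaGeneric : Option<string> = viaKeyword

// Comparing the runtime types confirms it.
let sameType = (typeof<string option> = typeof<Option<string>>)

// true
printfn "%b" sameType
```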

Consuming Option Types

How does all this help us step away from the risky world of nullable types, where we are always one missed null check away from a NullReferenceException? The difference from using nulls is that – provided we don’t deliberately bypass F# idioms – we are forced by the compiler to consider both the Some and None cases whenever we consume an option type. Consider Listing 3-6, where we have a billing details record that might, or might not, have a separate delivery address. (Again, this isn’t great modeling – see the next few sections for some improvements.)
        type BillingDetails = {
            name : string
            billing : string
            delivery : string option }
        let myOrder = {
            name = "Kit Eason"
            billing = "112 Fibonacci Street\nErehwon\n35813"
            delivery = None }
        let hisOrder = {
            name = "John Doe"
            billing = "314 Pi Avenue\nErewhon\n15926"
            delivery = Some "16 Planck Parkway\nErewhon\n62291" }
        // Error: the expression was expected to have type 'string'
        // but here has type 'string option'
        printfn "%s" myOrder.delivery
        printfn "%s" hisOrder.delivery
Listing 3-6

Modeling an optional delivery address using an Option type

Note how at the end of Listing 3-6, we try to treat the orders’ delivery addresses as strings, not as string options, which are a different type. This causes a compiler error for both the myOrder and hisOrder cases, not just a runtime error in the myOrder case. This is the option type protecting us by forcing us to consider the has-data and no-data possibilities at the point of consumption.

This raises the question: How are we supposed to access the underlying value or payload? There are several ways to do this, some more straightforward than others, so in the next few sections, we’ll go through these and examine their benefits and costs.

Pattern Matching on Option Types

Since an option type is a Discriminated Union, the obvious way to get at its payload (when there is one) is pattern matching with a match expression (Listing 3-7).
        // BillingDetails type and examples as Listing 3-6.
        let addressForPackage (details : BillingDetails) =
            let address =
                match details.delivery with
                | Some s -> s
                | None -> details.billing
            sprintf "%s\n%s" details.name address
        // Kit Eason
        // 112 Fibonacci Street
        // Erehwon
        // 35813
        printfn "%s" (addressForPackage myOrder)
        // John Doe
        // 16 Planck Parkway
        // Erewhon
        // 62291
        printfn "%s" (addressForPackage hisOrder)
Listing 3-7

Accessing an option type’s payload using pattern matching

Consuming option types using explicit pattern matching in this way has clear trade-offs. The big advantage is that it’s simple: everyone familiar with the basics of F# syntax will be familiar with it, and the reader doesn’t require knowledge of other libraries (or even computer science theory!) to understand what is going on. The disadvantage is that it’s a little verbose and pipeline unfriendly.

I’ll present alternatives in future sections, but before I do, let me say this: if you, and anyone maintaining your code, aren’t completely comfortable with the basics of option types – comfortable to the extent that everyone is ready and keen to move on to more fluent methods of consumption – I’d advise that you stick with good old-fashioned pattern matching, at least for a while. As with many other areas of F# coding, trying to get too clever too quickly can lead to some pretty obscure code and a definite blurring of the principles of motivational transparency and semantic focus.

The Option Module

Once you are ready to go beyond pattern matching, you can start using some of the functions available in the Option module. I personally found the Option module functions a little hard to get my head around at first. I suspect this is because English language descriptions of these functions don’t make much sense without examples – so proceed with this section slowly!

The Option.defaultValue Function

Let me start off with the equivalent code, in the Option module world, to that presented in Listing 3-7 – that is, getting either a string representing a delivery address or a default value (Listing 3-8).
        type BillingDetails = {
            name : string
            billing : string
            delivery : string option }
        let addressForPackage (details : BillingDetails) =
            let address =
                Option.defaultValue details.billing details.delivery
            sprintf "%s\n%s" details.name address
Listing 3-8

Defaulting an Option Type Instance using Option.defaultValue

The usage of addressForPackage is exactly the same as in Listing 3-7, so I haven’t repeated the usage here.

Option.defaultValue is pretty straightforward: you give it an option type as its second argument (in this case, details.delivery), and it’ll either return the underlying value of that instance if there is one or instead the value you give it in the first parameter (in this case, details.billing). One thing that might confuse you is the ordering of the parameters – the default value first and the option value second. The reason for this is to make the function “pipeline friendly.” The usefulness of this becomes clear if we apply Option.defaultValue as part of a pipeline, as in Listing 3-9.
        let addressForPackage (details : BillingDetails) =
            let address =
                details.delivery
                |> Option.defaultValue details.billing
            sprintf "%s\n%s" details.name address
Listing 3-9

Using Option.defaultValue in a pipeline

The Option.iter Function

The Option module also offers a function to do something imperative with an option type, for example, printing out its payload or writing it to a file. It’s called Option.iter, by analogy with functions like Array.iter that “do something imperative” with each element of a collection. If the value is Some, it performs the specified imperative action once using the payload; otherwise, it does nothing at all. The function printDeliveryAddress in Listing 3-10 prints "Delivery address: <address>" if there is such an address; otherwise, it takes no action.
        let printDeliveryAddress (details : BillingDetails) =
            details.delivery
            |> Option.iter
                (fun address ->
                    printfn "Delivery address:\n%s\n%s" details.name address)
        // No output at all
        myOrder |> printDeliveryAddress
        // Delivery address:
        // John Doe
        // 16 Planck Parkway
        // Erewhon
        // 62291
        hisOrder |> printDeliveryAddress
Listing 3-10

Using Option.iter to take an imperative action if a value is populated

There are additional Option module functions analogous to their collection-based cousins. These include Option.count, which produces 1 if the value is Some, otherwise 0, and Option.toArray and Option.toList, which produce a collection of length 1 containing the underlying value, otherwise an empty collection.
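A minimal sketch of these collection-like functions follows. The value names are invented for illustration; the functions themselves are the ones named above.

```fsharp
let present = Some "16 Planck Parkway"
let absent : string option = None

// Option.count: 1 for Some, 0 for None
let presentCount = Option.count present
let absentCount = Option.count absent

// Option.toArray and Option.toList: one element for Some, empty for None
let presentArray = Option.toArray present
let absentList = Option.toList absent

// 1 0
printfn "%i %i" presentCount absentCount
```

Because the results are ordinary collections, they can be handed on to any Array or List function without further unwrapping.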

Option.map and Option.bind

The two Option module functions that I personally struggled most with were Option.map and Option.bind, so we’ll spend a little more time on them. The documented behavior of these functions is a good example of descriptions of function behavior in English not being terribly useful (Table 3-2). (It may be that the descriptions are more helpful if – unlike me – you have a computer science or formal functional programming background!)
Table 3-2

Documented Behavior of the Option.map and Option.bind Functions

Function      Description
Option.map    Transforms an option value by using a specified mapping function
Option.bind   Invokes a function on an optional value that itself yields an option

The Option.map Function

Option.map is a way to apply a function to the underlying value of an option type if it exists and to return the result as a Some case; and if the input value is None, to return None without using the function at all. An example probably says it better: Listing 3-11 is a variation on printDeliveryAddress.
        let printDeliveryAddress (details : BillingDetails) =
            details.delivery
            |> Option.map
                (fun address -> address.ToUpper())
            |> Option.iter
                (fun address ->
                    printfn "Delivery address:\n%s\n%s"
                        (details.name.ToUpper()) address)
        // No output at all
        myOrder |> printDeliveryAddress
        // Delivery address:
        // JOHN DOE
        // 16 PLANCK PARKWAY
        // EREWHON
        // 62291
        hisOrder |> printDeliveryAddress
Listing 3-11

Using Option.map to optionally apply a function, returning an option type

Here, the requirement is to print a delivery address in capitals if it exists, otherwise to do nothing. We combine Option.map, to do the uppercasing when necessary, with Option.iter, to do the printing.

Another way of thinking of Option.map is in diagram form (Figure 3-2).
Figure 3-2

Option.map as a diagram

In the None case (top of the diagram), the None effectively passes through untouched and never goes near the uppercasing operation. In the Some case (bottom of the diagram), the payload is uppercased and comes out as a Some value. At this point, we start to see the beginnings of the “Railway Oriented Programming” paradigm, which we’ll discuss in detail in Chapter 11.

The Option.bind Function

Option.bind is so similar to Option.map that I found it very hard to get my head around the difference. (Indeed, I still often catch myself trying each of them until the compiler errors go away!) I think the best way to start is to compare the signatures of Option.map and Option.bind (Table 3-3).
Table 3-3

Type Signatures for Option.map and Option.bind

Function      Signature
Option.map    ('T -> 'U) -> 'T option -> 'U option
Option.bind   ('T -> 'U option) -> 'T option -> 'U option

Look at them carefully: the only difference is that the “binder” function needs to return an option type ('U option) rather than an unwrapped type ('U). The usefulness of this is that if you have a series of operations, each of which might succeed (returning Some value) or fail (returning None), you can pipe them together without any additional ceremony. Execution of your pipeline effectively “bails out” after the first step that returns None, because subsequent steps just pass the None through to the end without attempting to do any processing.
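To make the difference in signatures concrete, here is a small sketch that passes the same option-returning function to both Option.map and Option.bind. The function tryParseInt and the value names are invented for illustration.

```fsharp
open System

// A function that might fail, so it returns an option.
let tryParseInt (s : string) =
    match Int32.TryParse(s) with
    | true, i -> Some i
    | false, _ -> None

// Option.map wraps whatever the function returns, so an
// option-returning function produces a nested option: Some (Some 42)
let mapped : int option option = Some "42" |> Option.map tryParseInt

// Option.bind adds no extra wrapper, so the result stays flat: Some 42
let bound : int option = Some "42" |> Option.bind tryParseInt

// A failing step yields None...
let failed : int option = Some "forty-two" |> Option.bind tryParseInt

// ...and a None input never reaches the function at all.
let skipped : int option = None |> Option.bind tryParseInt
```

The nested result from Option.map is the usual sign that Option.bind was the function you wanted.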

Think about a situation where we need to take the delivery address from the previous example, pull out the last line of the address, check that it is a postal code by trying to convert it into an integer, and then look up a delivery hub (a package-sorting center) based on the postal code. The point is that several of these operations might “fail,” in the sense of returning None.
  • The delivery address might not be specified (i.e., have a value of None).

  • The delivery address might exist but be an empty string, hence having no last line from which to get the postal code.

  • The last line might not be convertible to a postal code.

(I’ve made some simplifying assumptions here: I’m ignoring the billing address; I’m ignoring any validation that might in practice mean the delivery address isn’t an empty string; I’m assuming that a postal code must simply be an integer; and I’m assuming that the hub lookup always succeeds.) Listing 3-12 shows what the code to achieve all this looks like.
        open System
        type BillingDetails = {
            name : string
            billing : string
            delivery : string option }
        let tryLastLine (address : string) =
            let parts =
                address.Split([|'\n'|],
                              StringSplitOptions.RemoveEmptyEntries)
            // Could also just do parts |> Array.tryLast
            match parts with
            | [||] ->
                None
            | parts ->
                parts |> Array.last |> Some
        let tryPostalCode (codeString : string) =
            match Int32.TryParse(codeString) with
            | true, i -> i |> Some
            | false, _ -> None
        let postalCodeHub (code : int) =
            if code = 62291 then
                "Hub 1"
            else
                "Hub 2"
        let tryHub (details : BillingDetails) =
            details.delivery
            |> Option.bind tryLastLine
            |> Option.bind tryPostalCode
            |> Option.map postalCodeHub
        let myOrder = {
            name = "Kit Eason"
            billing = "112 Fibonacci Street\nErehwon\n35813"
            delivery = None }
        let hisOrder = {
            name = "John Doe"
            billing = "314 Pi Avenue\nErewhon\n15926"
            delivery = Some "16 Planck Parkway\nErewhon\n62291" }
        // None
        myOrder |> tryHub
        // Some "Hub 1"
        hisOrder |> tryHub
Listing 3-12

Using Option.bind to create a pipeline of might-fail operations

In Listing 3-12, we have a tryLastLine function that splits the address by line breaks and returns the last line if there is one, otherwise None. Similarly, tryPostalCode attempts to convert a string to an integer and returns Some value only if that succeeds. The postalCodeHub function does a super-naive lookup (in reality, it would be some kind of database lookup) and always returns a value. We bring all these together in tryHub, which uses two Option.bind calls and an Option.map call to apply each of these operations in turn to get us from an optional delivery address to an optional delivery hub.

This is a really common pattern in idiomatic F# code: a series of Option.bind and Option.map calls to get from one state to another, using several steps, each of which can fail. Common though it is, it is quite a high level of abstraction, and it’s one of those things where you have to understand everything before you understand anything. So if you aren’t comfortable using it for now – don’t. A bit of nested pattern matching isn’t the worst thing in the world! I’ll return to this topic in Chapter 11 when we talk about “Railway Oriented Programming,” at which point perhaps it’ll make a little more sense.
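For comparison, here is a sketch of the same lookup written with nested pattern matching instead of Option.bind and Option.map. The name tryHubNested is invented for this illustration, the helper functions are condensed restatements of those in Listing 3-12, and the address lines are assumed to be separated by line breaks.

```fsharp
open System

type BillingDetails = {
    name : string
    billing : string
    delivery : string option }

// Last line of the address, or None if there are no lines.
let tryLastLine (address : string) =
    address.Split([|'\n'|], StringSplitOptions.RemoveEmptyEntries)
    |> Array.tryLast

let tryPostalCode (codeString : string) =
    match Int32.TryParse(codeString) with
    | true, i -> Some i
    | false, _ -> None

let postalCodeHub (code : int) =
    if code = 62291 then "Hub 1" else "Hub 2"

// Each level of nesting handles one step that might return None.
let tryHubNested (details : BillingDetails) =
    match details.delivery with
    | None -> None
    | Some address ->
        match tryLastLine address with
        | None -> None
        | Some line ->
            match tryPostalCode line with
            | None -> None
            | Some code -> Some (postalCodeHub code)

let hisOrder = {
    name = "John Doe"
    billing = "314 Pi Avenue\nErewhon\n15926"
    delivery = Some "16 Planck Parkway\nErewhon\n62291" }

// Some "Hub 1"
let hisHub = tryHubNested hisOrder
```

Every `| None -> None` branch here is the "pass the None through" behavior that Option.bind supplies for free, which is why the pipeline version is shorter.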

Option Type No-Nos

Using option types can be frustrating at first. There’s often a strong temptation to bypass the pattern-matching or bind/map approach and instead tear open the package by examining the IsSome and Value properties that the option type offers (Listing 3-13).
        // Accessing payload via .IsSome and .Value
        // Don't do this!
        let printDeliveryAddress (details : BillingDetails) =
            if details.delivery.IsSome then
                printfn "Delivery address:\n%s\n%s"
                    (details.name.ToUpper())
                    (details.delivery.Value.ToUpper())
Listing 3-13

Antipattern: accessing Option type payloads using IsSome and Value

Don’t do this! You’d be undermining the whole infrastructure we have built up for handling potentially missing values in a composable way.

Some people would also consider explicit pattern matching using a match expression (in the manner of Listing 3-7) an antipattern too and would have you always use the equivalent functions from the Option module. I think this is advice that’s great in principle but isn’t always easy to follow; you’ll get to fluency with Option.map, Option.bind, and so forth in due course. In the meantime, a bit of pattern matching isn’t going to hurt anyone, and the lower level of abstraction may make your code more comprehensible to nonadvanced collaborators.

Designing Out Missing Data

So far, we’ve been accepting the admittedly not-great modeling embodied in our original BillingDetails type. (As a reminder, this is repeated in Listing 3-14.)
        type BillingDetails = {
            name : string
            billing : string
            delivery : string option }
Listing 3-14

The BillingDetails type

The reason I say this is not great is that it isn’t clear under what circumstances the delivery address might not be there. (You might have to look elsewhere in the code to find out, which is a violation of the principle of semantic focus.) We can certainly improve on this. Let’s think about what the business rules might be for the BillingDetails type:
  • There must always be a billing address.

  • There might be a different delivery address, but…

  • There must be no delivery address if the product isn’t a physically deliverable one, such as a download.

A good way to model this kind of thing is to express the rules as Discriminated Union cases. Listing 3-15 shows how this might play out.
        type Delivery =
            | AsBilling
            | Physical of string
            | Download
        type BillingDetails = {
            name : string
            billing : string
            delivery : Delivery }
Listing 3-15

Modeling delivery address possibilities using a DU

In the new Delivery type, we’ve enumerated the three business possibilities: that the delivery address is the same as the billing address, that the delivery address is a separate physical address, or that the product is a download that does not need a physical address. Only in the Physical case do we need a string in which to store the address. In Listing 3-16, I’ve shown how it feels to consume the revamped BillingDetails type.
        let tryDeliveryLabel (billingDetails : BillingDetails) =
            match billingDetails.delivery with
            | AsBilling ->
                billingDetails.billing |> Some
            | Physical address ->
                address |> Some
            | Download -> None
            |> Option.map (fun address ->
                sprintf "%s\n%s" billingDetails.name address)
        let deliveryLabels (billingDetails : BillingDetails seq) =
            billingDetails
            // Seq.choose is a function which calls the specified function
            // (in this case tryDeliveryLabel) and filters for only those
            // cases where the function returns Some(value). The values
            // themselves are returned.
            |> Seq.choose tryDeliveryLabel
        let myOrder = {
            name = "Kit Eason"
            billing = "112 Fibonacci Street\nErehwon\n35813"
            delivery = AsBilling }
        let hisOrder = {
            name = "John Doe"
            billing = "314 Pi Avenue\nErewhon\n15926"
            delivery = Physical "16 Planck Parkway\nErewhon\n62291" }
        let herOrder = {
            name = "Jane Smith"
            billing = "9 Gravity Road\nErewhon\n80665"
            delivery = Download }
        // seq
        //     [ "Kit Eason
        //        112 Fibonacci Street
        //        Erehwon
        //        35813";
        //       "John Doe
        //        16 Planck Parkway
        //        Erewhon
        //        62291"]
        [ myOrder; hisOrder; herOrder ]
        |> deliveryLabels
Listing 3-16

Consuming the improved BillingDetails type

In Listing 3-16, I’ve imagined that we want a function that generates delivery labels only for those orders that require physical delivery. I’ve divided the task up into two parts:
  • The tryDeliveryLabel function uses a match expression to extract the relevant address. Then (when it exists), it uses Option.map to pair this with the customer name to form a complete label.

  • The deliveryLabels function takes a sequence of billingDetails items and applies tryDeliveryLabel to each item. Then it uses Seq.choose both to pick out those items where Some was returned and to extract the payloads of these Some values. (I go into more detail about Seq.choose and related functions in Chapter 4.)

Viewed in the light of the principles I laid out in Chapter 1, the code in Listings 3-15 and 3-16 is much better:
  • It has good semantic focus. You can tell without looking elsewhere what functions such as tryDeliveryLabel will do and why.

  • It has good revisability. Let’s say you realize that you want to support an additional delivery mechanism: so-called “Click and Collect,” where the customer comes to a store to collect their item. You might start by adding a new case to the Delivery DU, maybe with a store ID payload. From then on, the compiler would tell you all the points in existing code that you needed to change, and it would be pretty obvious how to add new features such as a function to list click-and-collect orders and their store IDs.

  • It has good motivational transparency. You aren’t left wondering why a particular delivery address is None. The reasons why an address might or might not exist are right there in the code. Other developers both “above you” in the stack (e.g., someone designing a view model for a UI) and “below you” (e.g., someone consuming the data to generate back-end fulfilment processes) can be clear about when and why certain items should and should not be present.
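The revisability point can be sketched as follows. This is only an illustration under stated assumptions: the ClickAndCollect case, its integer store ID payload, and the collectionsFor function are all hypothetical names, not part of the listings above.

```fsharp
type Delivery =
    | AsBilling
    | Physical of string
    | Download
    // Hypothetical new case; the int store ID payload is an assumption.
    | ClickAndCollect of storeId:int

type BillingDetails = {
    name : string
    billing : string
    delivery : Delivery }

// Hypothetical helper: pair each click-and-collect order's
// customer name with the ID of the store they will collect from.
let collectionsFor (orders : BillingDetails seq) =
    orders
    |> Seq.choose (fun order ->
        match order.delivery with
        | ClickAndCollect storeId -> Some (order.name, storeId)
        | AsBilling | Physical _ | Download -> None)

let orders = [
    { name = "Kit Eason"
      billing = "112 Fibonacci Street\nErehwon\n35813"
      delivery = AsBilling }
    { name = "Jane Smith"
      billing = "9 Gravity Road\nErewhon\n80665"
      delivery = ClickAndCollect 42 } ]

// [("Jane Smith", 42)]
let collections = orders |> collectionsFor |> List.ofSeq
```

With the new case in place, any existing match expression over Delivery that doesn’t yet handle ClickAndCollect is flagged by the compiler as an incomplete pattern match, which is how it points you at the code that needs revisiting.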

Modeling like this, where we use DUs to provide storage only for the DU cases where a value is required, brings us toward the nirvana of “Making Illegal State Unrepresentable,” an approach that I believe does more to eliminate bugs than any other coding philosophy I’ve come across.

Interoperating with the Nullable World

In this section, I’ll talk a bit about the implications of nullability when interoperating between F# and C#. There shouldn’t be anything too unexpected here, but when working in F#, it’s always worth bearing in mind the implications of interop scenarios.

Leaking In of Null Values

If you’re of a skeptical frame of mind, you’ll realize that there is a pretty big hole in my suggestion so far in this chapter (i.e., the claim that you can protect against null values by wrapping things in option types or Discriminated Unions). The hole is that (if it is a nullable reference type like a string), the wrapped type could still have a value of null. So, for example, the code in Listing 3-17 will compile fine, but it will fail with a null reference exception at runtime.
        type BillingDetails = {
            name : string
            billing : string
            delivery : string option }
        let printDeliveryAddress (details : BillingDetails) =
            details.delivery
            |> Option.map
                (fun address -> address.ToUpper())
            |> Option.iter
                (fun address ->
                    printfn "Delivery address:\n%s\n%s"
                        (details.name.ToUpper()) address)
        let dangerOrder = {
            name = "Will Robinson"
            billing = "2 Jupiter Avenue\nErewhon\n199732"
            delivery = Some null }
        // NullReferenceException
        printDeliveryAddress dangerOrder
Listing 3-17

A null hiding inside an Option type

(As an aside, and perhaps a little surprisingly, doing a printfn "%s" null or a sprintf "%s" null is fine – formatting a string with %s produces output as if the string was a nonnull, empty string. The problem in Listing 3-17 is the call to the ToUpper() method of a null instance.)
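The aside is easy to check for yourself in F# Interactive. Here is a minimal sketch; the names nothing and formatted are purely illustrative and don't appear in the listings.

```fsharp
// A null string reference. The type annotation tells F# what kind of null this is.
let nothing : string = null

// Formatting with %s treats the null as though it were an empty string,
// so no exception is thrown here.
let formatted = sprintf "[%s]" nothing

// Prints: []
printfn "%s" formatted
```

Contrast this with nothing.ToUpper(), which would throw a NullReferenceException because it calls an instance method on the null reference.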

Obviously, you wouldn’t knowingly write code exactly like Listing 3-17, but it does indicate how we are at the mercy of anything calling our code that might pass us a null. This doesn’t mean that the whole exercise of using option types or DUs is worthless. Option types and other DU wrappers are primarily useful because they make the intention of our code clear. But it does mean that, at the boundary of the code we consider to be safe, we need to validate for or otherwise deal with null values.

Defining a SafeString Type

One generalized way to deal with incoming nulls is to define a new wrapper type and perform the validation in its constructor (Listing 3-18).
        type SafeString (s : string) =
            do
                if s = null then
                    raise <| System.ArgumentException()
            member __.Value = s
            override __.ToString() = s
        type BillingDetails = {
            name : SafeString
            billing :  SafeString
            delivery : SafeString option }
        let printDeliveryAddress (details : BillingDetails) =
            details.delivery
            |> Option.map
                (fun address -> address.Value.ToUpper())
            |> Option.iter
                (fun address ->
                    printfn "Delivery address: %s %s"
                        (details.name.Value.ToUpper()) address)
        // ArgumentException at construction time
        let dangerOrder = {
            name = SafeString "Will Robinson"
            billing = SafeString "2 Jupiter Avenue Erewhon 199732"
            delivery = SafeString null |> Some }
Listing 3-18

Validating strings on construction

Having done this, one would need to require all callers to provide us with a SafeString rather than a string type.

It’s a tempting pattern, but frankly, things like nullable strings are so ubiquitous in .NET code that hardly anyone bothers. The overhead of switching to and from such null-safe types so that one can consume them and use them in .NET calls requiring string arguments is just too much to cope with. This is particularly true in the case of mixed-language code bases, where, like it or not, nullable strings are something of a lingua franca.

Using Option.ofObj

We can fight the battle at a different level by using some more functions from the Option module; there are several very useful functions here to help mediate between the nullable and the nonnullable worlds. The first of these is Option.ofObj, which takes a reference type instance and returns that same instance wrapped in an option type. It returns Some value if the input was nonnull, or None if the input was null. This is invaluable at the boundaries of your system, when callers might give you nulls (Listing 3-19).
        let myApiFunction (stringParam : string) =
            let s =
                stringParam
                |> Option.ofObj
                |> Option.defaultValue "(none)"
            // You can do things here knowing that s isn't null
            printfn "%s" (s.ToUpper())
        // HELLO
        myApiFunction "hello"
        // (NONE)
        myApiFunction null
Listing 3-19

Using Option.ofObj
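The same trick works when it is your code calling a .NET API that can return null. As a sketch, Environment.GetEnvironmentVariable returns null when a variable isn't set, so it can be wrapped once at the boundary. The helper names tryGetEnv and describeEnv, and the variable name used below, are my own inventions for illustration.

```fsharp
open System

// Hypothetical helper: look up an environment variable without ever
// handing a null back to the rest of the code.
let tryGetEnv (name : string) : string option =
    // GetEnvironmentVariable returns null when the variable is not set.
    Environment.GetEnvironmentVariable(name)
    |> Option.ofObj

// Callers now have to deal with the "not set" case explicitly.
let describeEnv (name : string) : string =
    match tryGetEnv name with
    | Some value -> sprintf "%s is set to %s" name value
    | None -> sprintf "%s is not set" name

// Assumes no variable with this (made-up) name exists on your machine.
printfn "%s" (describeEnv "STYLISH_FSHARP_SURELY_UNSET")
```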

Using Option.ofNullable

If you have an instance of System.Nullable (e.g., a nullable integer), you can use Option.ofNullable to smoothly transition it into an option type (Listing 3-20).
        open System
        let showHeartRate (rate : Nullable<int>) =
            rate
            |> Option.ofNullable
            |> Option.map (fun r -> r.ToString())
            |> Option.defaultValue "N/A"
        // 96
        showHeartRate (System.Nullable(96))
        // N/A
        showHeartRate (System.Nullable())
Listing 3-20

Using Option.ofNullable

Incidentally, Listing 3-20 was inspired by my exercise watch, which occasionally tells me that my heart rate is null.

Leaking Option Types and DUs Out

Clearly, the flipside of letting nulls leak into our F# code is the potential for leakage outward of F#-specific types such as the option type and Discriminated Unions in general. It’s possible to create and consume these types in languages such as C# using compiler-generated sugar such as the NewCase constructor and the .IsCase, .Tag, and .Item properties, plus a bit of casting. However, it’s generally regarded as bad manners to force callers to do so, if those callers might not be written in F#. Again, some functions in the Option module come to the rescue.

Using Option.toObj

Option.toObj is the mirror image of Option.ofObj. It takes an option type and returns either the underlying value if it is Some or null if it is None. Listing 3-21 shows how we might handle returning a nullable “location” string for a navigation UI.
        open System
        let random = new Random()
        let tryLocationDescription (locationId : int) =
            // In reality this would be attempting
            // to get the location from a database etc.
            let r = random.Next(1, 100)
            if r < 50 then
                Some (sprintf "Location number %i" r)
            else
                None
        let tryLocationDescriptionNullable (locationId : int) =
            tryLocationDescription locationId
            |> Option.toObj
        // Sometimes null, sometimes "Location number #"
        tryLocationDescriptionNullable 99
Listing 3-21

Using Option.toObj

Alternatively, you might want to repeat the kind of pattern used in standard functions like System.Double.TryParse(), which return a Boolean value indicating success or failure, and place the result of the operation (if successful) into a “by reference” parameter (Listing 3-22). This is a pattern that might feel more natural if the function is being called from C#.
    open System
    let random = new Random()
    let tryLocationDescription (locationId : int, description : string byref) : bool =
        // In reality this would be attempting
        // to get the description from a database etc.
        let r = random.Next(1, 100)
        if r < 50 then
            description <- sprintf "Location number %i" r
            true
        else
            description <- null
            false
Listing 3-22

Returning success or failure as a Boolean, with result in a reference parameter
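A function in this style can also be called from F#, by passing the address of a mutable local using the & operator. The sketch below is an assumption-laden variation on Listing 3-22: I've swapped the random lookup for a deterministic rule (only even IDs "exist") purely so the behavior is repeatable, and describeLocation is a hypothetical caller.

```fsharp
let tryLocationDescription (locationId : int, description : string byref) : bool =
    // Assumption for illustration only: even location IDs are "found".
    if locationId % 2 = 0 then
        description <- sprintf "Location number %i" locationId
        true
    else
        description <- null
        false

let describeLocation (locationId : int) : string =
    // The byref argument must point at a mutable value.
    let mutable description : string = null
    if tryLocationDescription (locationId, &description) then
        description
    else
        "(unknown location)"

// Location number 42
printfn "%s" (describeLocation 42)
// (unknown location)
printfn "%s" (describeLocation 7)
```

Notice how much ceremony the F# caller needs compared with simply matching on an option; this is a shape to offer to C# callers, not one to use between F# functions.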

Using Option.toNullable

It won’t surprise you to learn that Option.toNullable is the counterpart of Option.ofNullable. It gets you from an option type to a nullable type, for example, Nullable<int>. Listing 3-23 shows us getting a heart rate from an unreliable sensor and returning either null or a heart rate value. (Clearly, unlike my exercise watch, the UI would need to know how to handle the null case!)
        open System
        let random = new Random()
        let getHeartRateInternal() =
            // In reality this would be attempting
            // to get a heart rate from a sensor:
            let rate = random.Next(0, 200)
            if rate = 0 then
                None
            else
                Some rate
        let tryGetHeartRate () =
            getHeartRateInternal()
            |> Option.toNullable
Listing 3-23

Using Option.toNullable
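To see what a caller actually receives, here is a small sketch using fixed values in place of the random sensor. The value names (present, absent, roundTripped) are illustrative only.

```fsharp
open System

// Option -> Nullable: Some becomes a Nullable holding the value,
// and None becomes an empty Nullable.
let present : Nullable<int> = Some 72 |> Option.toNullable
let absent : Nullable<int> = (None : int option) |> Option.toNullable

// true false
printfn "%b %b" present.HasValue absent.HasValue

// Nullable -> Option takes us back to where we started.
let roundTripped : int option = present |> Option.ofNullable

// Some 72
printfn "%A" roundTripped
```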

The Future of Null

At the time of writing, there is some light at the end of the tunnel regarding nulls in the .NET framework. (Hopefully, the light is not of the oncoming-train variety!) C# 8.0 allows you to specify that reference types such as strings are not nullable by default. This feature is opt-in; when switched on, you have to use specific syntax (adding a question mark to the declaration – see Listing 3-24) to declare a reference type as nullable. In due course, this should make it less likely that C# code that calls our nice clean F# code will send us null values by accident. At the time of writing, however, this feature is turned off by default, so the impact for the time being is likely to be small.
class Person
{
    public string FirstName;   // Not null
    public string? MiddleName; // May be null
    public string LastName;    // Not null
}
Listing 3-24

C# 8.0 Syntax for nullable and nonnullable types

The ValueOption Type

In addition to option types, F# offers a type called ValueOption. This is analogous to the option type, except that it is a value type (i.e., a struct) rather than a reference type. This means that instances of ValueOption are stored on the stack or inline in their parent array, which can help performance in some scenarios. Listing 3-25 shows usage of the ValueOption type. Note the new voption keyword and the ValueSome and ValueNone case names.
    let valueOptionString (v : int voption) =
        match v with
        | ValueSome x ->
            sprintf "Value: %i" x
        | ValueNone ->
            sprintf "No value"
    // "No value"
    ValueOption.ValueNone
    |> valueOptionString
    // "Value: 99"
    ValueOption.ValueSome 99
    |> valueOptionString
Listing 3-25

Using the ValueOption type

There is also a ValueOption module that contains useful functions like ValueOption.bind, ValueOption.map, ValueOption.count, and ValueOption.iter, which behave in the same way that we described for the Option module previously.
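As a sketch of how these functions compose, the following pipeline mirrors the kind of Option pipelines shown earlier, but using the struct-based type throughout. The functions tryHalve and describeHalf are hypothetical examples of my own.

```fsharp
// Hypothetical operation that can fail: only even numbers can be halved exactly.
let tryHalve (n : int) : int voption =
    if n % 2 = 0 then ValueSome (n / 2) else ValueNone

let describeHalf (n : int) : string =
    ValueSome n
    |> ValueOption.bind tryHalve                 // may produce ValueNone
    |> ValueOption.map (sprintf "Half is %i")    // only runs for ValueSome
    |> ValueOption.defaultValue "No value"       // supplies the fallback

// Half is 4
printfn "%s" (describeHalf 8)
// No value
printfn "%s" (describeHalf 7)
// 1
printfn "%i" (ValueOption.count (tryHalve 8))
```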

Using ValueOption values can have performance benefits in some kinds of code. To quote the documentation for value option types:

Not all performance-sensitive scenarios are “solved” by using structs. You must consider the additional cost of copying when using them instead of reference types. However, large F# programs commonly instantiate many optional types that flow through hot paths, and in such cases, structs can often yield better overall performance over the lifetime of a program.

The only way to be sure is to experiment with realistic volumes and processing paths.

Recommendations

Here are the key points I’d like you to take away from this chapter.
  • Avoid using null values to represent things that legitimately might not be set. Instead, use Discriminated Unions to model explicit cases when a value is or is not relevant, and only have storage for the value in the cases where it is relevant. If DUs make things too complicated, or if it is obvious from the immediate context why a value might not be set, model it as an option type.

  • To make your option-type handling more fluent, consider using functions from the Option module such as Option.bind, Option.map, and Option.defaultValue to create little pipelines that get you safely through one or more processing stages, each of which might fail. But don’t get hung up on this – pattern matching is also fine. What’s not fine is accessing the .IsSome and .Value properties of an option type!

  • At the boundary of your system, consider using Option.ofObj and Option.ofNullable to move incoming nullable values into the option world and Option.toObj and Option.toNullable for option values leaving your code for other languages.

  • Avoid exposing option types and DUs in APIs if callers might be written in C# or other languages that might not understand F# types.

  • Remember the voption type and ValueOption module for optional values you want to be stored as structs. Using voption may have performance benefits.
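To pull a couple of these recommendations together, here is a minimal sketch of a boundary function that accepts and returns plain (possibly null) strings, while using an option pipeline on the inside. The function normalizePostcode and its rules are assumptions made up for this illustration, not part of any listing.

```fsharp
open System

// Hypothetical API function: callers in other languages pass and receive
// ordinary nullable strings; option types never leak out.
let normalizePostcode (raw : string) : string =
    raw
    |> Option.ofObj                                        // null -> None
    |> Option.map (fun s -> s.Trim().ToUpperInvariant())   // tidy up real values
    |> Option.bind (fun s ->                               // blank -> None
        if String.IsNullOrWhiteSpace s then None else Some s)
    |> Option.toObj                                        // None -> null

// GU9 7AA
printfn "%s" (normalizePostcode "  gu9 7aa ")
// true
printfn "%b" (isNull (normalizePostcode null))
// true
printfn "%b" (isNull (normalizePostcode "   "))
```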

Summary

In this chapter, you learned how to stop thinking of null values and other missing data items as rare cases to be fended off as an afterthought in your code. You found out how to embrace and handle missing data stylishly using F#’s rich toolbox, including option types, value option types, Discriminated Unions, pattern matching, and the Option and ValueOption modules. These techniques may not come easily at first, but after a while, you’ll wonder how you managed in any other way.

In the next chapter, we’ll look at how to use F#’s vast range of collection functions, functions that allow you to process collections such as arrays, lists, and IEnumerable values with extraordinary fluency.

Exercises

Exercise 3-1 – Supporting Click and Collect

Take the code from Listing 3-16 and update it to support the following scenario:

There is an additional delivery type called “Click and Collect.”

When a BillingDetails instance’s delivery value is “Click and Collect,” we need to store an integer StoreId value but no delivery address. (We still store a billing address as for the other cases.)

Write and try out a function called collectionsFor. It needs to take an integer StoreId and a sequence of BillingDetails instances and return a sequence of “Click-and-Collect” instances for the specified store.

Exercise 3-2 – Counting Nonnulls
You have a BillingDetails type and some orders in this form:
    type BillingDetails = {
        name : string
        billing : string
        delivery : string option }
    let myOrder = {
        name = "Kit Eason"
        billing = "112 Fibonacci Street Erehwon 35813"
        delivery = None }
    let hisOrder = {
        name = "John Doe"
        billing = "314 Pi Avenue Erewhon 15926"
        delivery = None }
    let herOrder = {
        name = "Jane Smith"
        billing = null
        delivery = None }
    let orders = [| myOrder; hisOrder; herOrder |]

What is the most concise function you can write to count the number of BillingDetails instances that have a nonnull billing address? (Ignore the delivery address.)

Hint: One way to solve this is using two functions from the Option module. Option.ofObj is one of them. The other one we only mentioned in passing, earlier in this chapter. You might also want to use Seq.map and Seq.sumBy.

Exercise Solutions

This section shows solutions for the exercises in this chapter.

Exercise 3-1 – Supporting Click And Collect

You can achieve the requirement by adding a new case called ClickAndCollect of int to the Delivery DU (or ClickAndCollect of storeId:int).

Then your collectionsFor function can do a Seq.choose, containing a lambda that maps the ClickAndCollect back into Some, using a when clause to check the StoreId. All other cases can be mapped to None, meaning they don’t appear in the results at all.
module Exercise_03_03 =
    type Delivery =
        | AsBilling
        | Physical of string
        | Download
        | ClickAndCollect of int
    type BillingDetails = {
        name : string
        billing : string
        delivery : Delivery }
    let collectionsFor (storeId : int) (billingDetails : BillingDetails seq) =
        billingDetails
        |> Seq.choose (fun d ->
            match d.delivery with
            | ClickAndCollect s when s = storeId ->
                Some d
            | _ -> None)
    let myOrder = {
        name = "Kit Eason"
        billing = "112 Fibonacci Street Erehwon 35813"
        delivery = AsBilling }
    let yourOrder = {
        name = "Alison Chan"
        billing = "885 Electric Avenue Erewhon 41878"
        delivery = ClickAndCollect 1 }
    let theirOrder = {
        name = "Pana Okpik"
        billing = "299 Relativity Drive Erewhon 79245"
        delivery = ClickAndCollect 2 }
    // { name = "Alison Chan"
    //   billing = "885 Electric Avenue Erewhon 41878"
    //   delivery = ClickAndCollect 1 }
    [ myOrder; yourOrder; theirOrder ]
    |> collectionsFor 1
    |> Seq.iter (printfn "%A")
You’ll also have to add a new case to the pattern match in the tryDeliveryLabel function to ensure it ignores Click-and-Collect instances.
           | ClickAndCollect _
               -> None
Exercise 3-2 – Counting Nonnulls
There are many ways to do this. You can, for example, use Seq.map to pick out the billing address, another Seq.map with an Option.ofObj to map from nulls to None and nonnulls to Some, and Seq.sumBy with an Option.count to count the Some values. Remember, Option.count returns 1 when there is a Some and 0 when there is a None.
    let countNonNullBillingAddresses (orders : seq<BillingDetails>) =
        orders
        |> Seq.map (fun bd -> bd.billing)
        |> Seq.map Option.ofObj
        |> Seq.sumBy Option.count
    countNonNullBillingAddresses orders
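If conciseness is the goal, one possible variation is to fold the three stages into a single Seq.sumBy. This is just a sketch of an alternative, repeating the type and data from the exercise so that it stands alone.

```fsharp
type BillingDetails = {
    name : string
    billing : string
    delivery : string option }

let countNonNullBillingAddresses (orders : seq<BillingDetails>) =
    orders
    // Option.ofObj maps null to None; Option.count gives 0 for None, 1 for Some.
    |> Seq.sumBy (fun bd -> bd.billing |> Option.ofObj |> Option.count)

let orders = [|
    { name = "Kit Eason"; billing = "112 Fibonacci Street Erehwon 35813"; delivery = None }
    { name = "John Doe"; billing = "314 Pi Avenue Erewhon 15926"; delivery = None }
    { name = "Jane Smith"; billing = null; delivery = None } |]

// 2
printfn "%i" (countNonNullBillingAddresses orders)
```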