Chapter 4. Type safety

This chapter covers

  • Avoiding the primitive obsession antipattern
  • Enforcing constraints during instance construction
  • Increasing safety by adding type information
  • Increasing flexibility by hiding and restoring type information

Now that we know how to use the basic types provided by our programming language and how to compose them to create new types, let’s look at how we can make our programs safer by using types. By safer, I mean reducing the opportunity for bugs.

There are a couple of ways to achieve this by creating new types that encode additional information: meanings and guarantees. The former, which we’ll cover in the first section, removes the opportunity for us to misinterpret a value, such as mistaking a mile for a kilometer. The latter allows us to encode guarantees such as “an instance of this type will never be less than 0” in the type system. Both techniques make our code safer: we eliminate invalid values from the set of possible values represented by a type, and we catch misunderstandings as early as we can, preferably at compile time or, failing that, at run time as soon as we instantiate our types. When we have an instance of one of our types, from then on we know what it represents and that it is a valid value.

Because we’re discussing type safety, we’ll also look at how we can add and hide information from the type checker manually. If we somehow know more than the type checker does, we can tell it to trust us and pass our information down to it. On the other hand, if the type checker knows too much and ends up impeding our work, we can make it “forget” some of the typing information, giving us more flexibility at the cost of safety. These techniques are not to be used lightly, as they move the responsibility of proper type checking from the type checker to us as developers, but as we’ll see, there are some legitimate scenarios in which these techniques are desired.

4.1. Avoiding primitive obsession to prevent misinterpretation

In this section, we’ll see how using basic types to represent values and implicitly assuming what those values represent can cause problems when two different parts of the code, often written by different developers, make incompatible assumptions (figure 4.1).

Figure 4.1. The numeric value 1000 could represent 1,000 dollars or 1,000 miles. Two different developers could interpret it as two very different measures.

We can rely on the type system to make those assumptions explicit by defining types to describe them, in which case the type checker can detect incompatibilities and signal them before anything bad happens.

Let’s say we have a function addToBill() that takes as its argument a number. The function is supposed to add the price of an item to a bill. Because the argument is of type number, we could pass it a distance between cities in miles, also represented as a number. We end up adding miles to a price total, and the type checker doesn’t suspect anything!

Figure 4.2. Having an explicit Currency type makes it clear that the value does not represent 1,000 miles, but rather a dollar amount.

On the other hand, if we make our addToBill() function take an argument of type Currency and our distance between cities is represented as a type Miles, the code will not compile (figure 4.2).
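As a minimal sketch of this idea, the following code defines hypothetical Currency and Miles wrappers. The literal kind member is an assumption made here to keep the two classes from being structurally interchangeable; the listings later in this section use a unique symbol for the same purpose.

```typescript
// Hypothetical wrapper types. The literal `kind` tag is an assumption used
// here so that TypeScript does not treat the two classes as the same shape.
class Currency {
    readonly kind: "Currency" = "Currency";
    readonly value: number;

    constructor(value: number) {
        this.value = value;
    }
}

class Miles {
    readonly kind: "Miles" = "Miles";
    readonly value: number;

    constructor(value: number) {
        this.value = value;
    }
}

let billTotal: Currency = new Currency(0);

// Only a Currency can be added to the bill.
function addToBill(price: Currency): void {
    billTotal = new Currency(billTotal.value + price.value);
}

addToBill(new Currency(1000));       // OK: a dollar amount

const distance: Miles = new Miles(1000);
console.log(distance.value);         // 1000, but in miles
// addToBill(distance);              // Does not compile: Miles is not a Currency
```

Uncommenting the last line should make the compiler reject the program, which is exactly the mistake we want it to catch.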

4.1.1. The Mars Climate Orbiter

The Mars Climate Orbiter disintegrated because a component developed by Lockheed used a different unit of measure (pound-force seconds) for momentum than a component developed by NASA, which consumed that measure (in metric units). Let’s imagine how the code looked for the two components. The trajectoryCorrection() function consumes a measurement as Newton-seconds, or Ns (the metric unit for momentum), whereas the provideMomentum() function produces a measure in pound-force seconds, or lbfs, as shown in the next listing.

Listing 4.1. Sketch of incompatible components
function trajectoryCorrection(momentum: number) {    1
    if (momentum < 2 /* Ns */)  {                    2
        disintegrate();
    }

    /* ... */
}

function provideMomentum() {
    trajectoryCorrection(1.5 /* lbfs */);            3
}

  • 1 trajectoryCorrection takes momentum as an argument of type number.
  • 2 If momentum is less than 2 Ns, disintegrate.
  • 3 provideMomentum passes in a measurement of 1.5 lbfs.

Converting to metric, 1 lbfs equals 4.448222 Ns. From the perspective of the provideMomentum() function, the value provided is good, because 1.5 lbfs is about 6.67 Ns, way more than the 2 Ns lower limit. What went wrong? The main issue in this case is that both components treated momentum as a number, implicitly assuming the unit in which it was measured. trajectoryCorrection() interpreted the momentum as 1.5 Ns, less than the 2 Ns lower limit, and inappropriately triggered the disintegration.

Let’s see whether we can leverage the type system to prevent such catastrophic misunderstandings. Let’s make the unit of measure explicit by defining a Lbfs type and a Ns type in listing 4.2. Both types wrap a number, as the actual measure is still a value. We will use a unique symbol for each type because TypeScript considers types to be compatible if they have the same shape, as we will see when we discuss subtyping. The unique symbol trick makes it so that one type can’t be implicitly interpreted as the other. Not all languages require this additional unique symbol member. We’ll explain this trick in chapter 7; for now, we’ll focus on the new types defined.

Listing 4.2. Pound-force second and Newton-second types
declare const NsType: unique symbol;         1

class Ns {
    readonly value: number;                  2
    [NsType]: void;                          1

    constructor(value: number) {
        this.value = value;
    }
}

declare const LbfsType: unique symbol;

class Lbfs {
    readonly value: number;                  3
    [LbfsType]: void;                        3

    constructor(value: number) {
        this.value = value;
    }
}

  • 1 TypeScript-specific way to ensure that other objects with the same shape can’t be interpreted as this type
  • 2 Ns effectively just wraps a value of type number.
  • 3 Similarly, Lbfs type wraps a number and a unique symbol.

Now that we have our two separate types, we can easily implement a conversion between them because we know the ratio. Let’s look at the following listing to see a conversion from lbfs to Ns, which we need in our updated trajectoryCorrection() code.

Listing 4.3. Converting lbfs to Ns
function lbfsToNs(lbfs: Lbfs): Ns {
    return new Ns(lbfs.value * 4.448222);     1
}

  • 1 Take the lbfs value, multiply by the ratio, and return a Ns value.

Going back to the Mars Climate Orbiter, we can reimplement the two functions to use the new types. trajectoryCorrection() expects a Ns momentum (and will still disintegrate if the value is less than 2 Ns), and provideMomentum() still produces values as lbfs. But now we can’t simply take the value produced by provideMomentum() and pass it to trajectoryCorrection(), because the returned value and the function argument have different types. We have to explicitly convert from one to the other, using our lbfsToNs() function, as the following listing shows.

Listing 4.4. Updated components
function trajectoryCorrection(momentum: Ns) {         1
    if (momentum.value < new Ns(2).value)  {          1
        disintegrate();
    }

    /* ... */
}

function provideMomentum() {
    trajectoryCorrection(lbfsToNs(new Lbfs(1.5)));    2
}

  • 1 trajectoryCorrection now takes an argument of type Ns and compares it with 2 Ns.
  • 2 provideMomentum generates a 1.5 lbfs value and has to convert it to Ns.

If we omitted the conversion lbfsToNs(), the code would simply not compile, and we would get the following error: Argument of type 'Lbfs' is not assignable to parameter of type 'Ns'. Property '[NsType]' is missing in type 'Lbfs'.

Let’s review what happened: we started with two components that both manipulated momentum values, but even though they used different units when handling those values, they both represented the values simply as number. To avoid misinterpretations, we created a couple of new types, one to represent each unit of measure, which effectively left no room for misinterpretation. If a component explicitly deals with Ns, it can’t accidentally consume a Lbfs value.

Also note that the assumptions that showed up in the code as comments in our first example (1.5 /* lbfs */) became code in our final implementation (new Lbfs(1.5)).
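To check the arithmetic end to end, here is a small self-contained sketch. It restates the two types with a literal unit tag (an assumption standing in for the unique symbol member of listing 4.2), and isBelowLimit() is a hypothetical helper that mirrors the comparison made in trajectoryCorrection().

```typescript
// Restatement of the chapter's types. The literal `unit` tag is an assumption
// that replaces the unique symbol member so the sketch runs on its own.
class Ns {
    readonly unit: "Ns" = "Ns";
    readonly value: number;

    constructor(value: number) {
        this.value = value;
    }
}

class Lbfs {
    readonly unit: "lbfs" = "lbfs";
    readonly value: number;

    constructor(value: number) {
        this.value = value;
    }
}

function lbfsToNs(lbfs: Lbfs): Ns {
    return new Ns(lbfs.value * 4.448222);
}

// Hypothetical helper mirroring the check in trajectoryCorrection().
function isBelowLimit(momentum: Ns): boolean {
    return momentum.value < new Ns(2).value;
}

const produced: Lbfs = new Lbfs(1.5);
const converted: Ns = lbfsToNs(produced);

console.log(converted.value.toFixed(3));   // "6.672"
console.log(isBelowLimit(converted));      // false: well above the 2 Ns limit
// isBelowLimit(produced);                 // Does not compile: Lbfs is not Ns
```

With the units made explicit, the 1.5 lbfs measurement arrives as roughly 6.67 Ns, and the check no longer fires by mistake.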

4.1.2. The primitive obsession antipattern

In the same way that design patterns capture reusable software designs that are highly reliable and effective, antipatterns are common designs that are ineffective and counterproductive when a better alternative exists. The preceding example is an instance of a well-known antipattern called primitive obsession. Primitive obsession turns up when we rely on basic types to represent everything: a postal code is a number, a phone number is a string, and so on.

If we fall into this trap, we leave a lot of room for errors like the one we saw in this section. That’s because the meaning of the values is not explicitly captured in the type system. If I consume a momentum value given as a number, I, the developer, implicitly assume that it is a Newton-second value. The type checker does not have enough information to detect when two developers make incompatible assumptions. When this assumption is explicitly captured as a type declaration, and I consume a momentum value given as a Ns instance, the type checker can verify when someone else is attempting to give me a Lbfs instance instead and not allow the code to compile.

Even though a postal code is a number, that doesn’t mean we should store it as a value of type number. We should never interpret momentum as a postal code.

If the entities you represent are simple values, such as physical measurements and postal codes, consider defining them as new types, even if these types simply wrap a number or a string. This practice gives the type system more information to work with in analyzing our code and eliminates a whole class of errors caused by incompatible assumptions, not to mention that it makes the code more readable. For contrast, compare the first definition of trajectoryCorrection(), which is trajectoryCorrection(momentum: number), with the second one, which is trajectoryCorrection(momentum: Ns). The second one gives more information to readers of the code as to what its contract is. (Expected momentum is in Ns.)

So far, we’ve seen how we can wrap primitive types into other types to encode more information. Now let’s move on to see how we can provide even more safety by restricting the range of allowed values for a given type.

4.1.3. Exercise

1

What is the safest way to represent a weight measurement?

  1. As a number
  2. As a string
  3. As a custom Kilograms type
  4. As a custom Weight type

4.2. Enforcing constraints

In chapter 3, we talked about composition and how to take basic types and combine them to represent more complex concepts, such as representing a point on a 2D plane as a pair of number values, one for each of the X and Y coordinates. Now let’s look at what we can do when the basic types we get out of the box allow for more values than we need.

Let’s take, as an example, a measure of temperature. We’re going to avoid primitive obsession and declare a Celsius type to make it clear which unit of measure we expect the temperature to have. This type will also simply wrap a number.

We have an additional constraint, though: we should never have a temperature less than absolute zero, which is –273.15 degrees Celsius. One option is to check whenever we use an instance of this type that the value is a valid one. This option leaves room for error, though: even if we always remember to add the check, a new developer on the team who doesn’t know the pattern might forget it. Wouldn’t it be better to make sure that we can never get an invalid value?

We can do this in two ways: via the constructor or via a factory.

4.2.1. Enforcing constraints with the constructor

We can implement the constraint in the constructor and handle a value that’s too small in one of the two ways we saw when we looked at integer overflow. One option is to throw an exception when the value is invalid and disallow creation of the object.

Listing 4.5. Constructor throwing on invalid value
declare const celsiusType: unique symbol;

class Celsius {
    readonly value: number;                       1
    [celsiusType]: void;

    constructor(value: number) {
        if (value < -273.15) throw new Error();   2

        this.value = value;
    }
}

  • 1 The value is immutable, so when it’s initialized, it can’t be changed.
  • 2 Constructor throws if we attempt to create an invalid temperature.

We ensure that the value stays valid after construction by making it readonly. Another option would be to make it private and access it with a getter (so that the value can be retrieved but not set).
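The private-member-plus-getter variant might look like the following sketch. The degrees field name, the error message, and the literal kind tag (used here in place of the unique symbol member) are assumptions made for illustration.

```typescript
class Celsius {
    // Assumption: a literal tag replaces the unique symbol member of listing 4.5.
    readonly kind: "Celsius" = "Celsius";
    // Hypothetical private field; callers can't read or write it directly.
    private readonly degrees: number;

    constructor(value: number) {
        if (value < -273.15) throw new Error(`Invalid temperature: ${value}`);

        this.degrees = value;
    }

    // Getter only: the value can be retrieved but not set.
    get value(): number {
        return this.degrees;
    }
}

const boiling = new Celsius(100);
console.log(boiling.value);      // 100
// boiling.value = 0;            // Does not compile: value has no setter

let rejected = false;
try {
    new Celsius(-300);           // Below absolute zero, so the constructor throws
} catch {
    rejected = true;
}
console.log(rejected);           // true
```

Either way, callers can read the temperature, but nothing outside the constructor can put an invalid number into an existing instance.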

We can also implement our constructor to coerce the value to be a valid one: anything less than -273.15 becomes -273.15.

Listing 4.6. Constructor coercing an invalid value
declare const celsiusType: unique symbol;

class Celsius {
    readonly value: number;
    [celsiusType]: void;

    constructor(value: number) {
        if (value < -273.15) value = -273.15;     1

        this.value = value;
    }
}

  • 1 Instead of throwing, we “fix” the value.

Either of the two approaches is valid, depending on the scenario. We can also use a factory function instead. A factory is a class or function whose main job is to create another object.

4.2.2. Enforcing constraints with a factory

A factory is useful when we don’t want to throw an exception, but to return undefined or some other value that is not a temperature and represents failure to create a valid instance. A constructor can’t do this because it doesn’t return: it either finishes initializing its instance or throws. Another reason to use a factory is when the logic required to construct and validate an object is complex, in which case it might make sense to implement it outside the constructor. As a rule of thumb, constructors shouldn’t do heavy lifting—just get the object members initialized.

Let’s look at how an implementation of a factory works in the following listing. We will make the constructor private so that only the factory method can call it. The factory will be a static method on our class. It will return either a Celsius instance or undefined.

Listing 4.7. Factory returning undefined on invalid value
declare const celsiusType: unique symbol;

class Celsius {
    readonly value: number;
    [celsiusType]: void;

    private constructor(value: number) {                        1
        this.value = value;
    }

    static makeCelsius(value: number): Celsius | undefined {    2
        if (value < -273.15) return undefined;                  3

        return new Celsius(value);
    }
}

  • 1 Constructor is now private because it doesn’t perform any checks itself.
  • 2 Factory returns either a valid Celsius instance or undefined.
  • 3 Constraint is enforced in the factory, which is the only way to create Celsius instances.

In all these cases, we have the additional guarantee that if we have an instance of Celsius, its value will never be less than -273.15. The advantage of performing the check when an instance of the type is created and ensuring that the type can’t be created in other ways is that you are guaranteed a valid value whenever you see an instance of the type being passed around.

Instead of checking whether the instance is valid when using it, which usually means performing the check in multiple places, we perform the check just once and make it impossible for an invalid object of the type to exist.
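From the caller’s side, the factory’s return type forces us to deal with the failure case before we can touch the value. The sketch below repeats the factory with a literal kind tag (an assumption replacing the unique symbol member), and describeTemperature() is a hypothetical consumer.

```typescript
class Celsius {
    // Assumption: a literal tag replaces the unique symbol member of listing 4.7.
    readonly kind: "Celsius" = "Celsius";
    readonly value: number;

    private constructor(value: number) {
        this.value = value;
    }

    static makeCelsius(value: number): Celsius | undefined {
        if (value < -273.15) return undefined;

        return new Celsius(value);
    }
}

// Hypothetical consumer: the compiler won't let us read .value until we
// have ruled out undefined.
function describeTemperature(input: number): string {
    const temperature = Celsius.makeCelsius(input);

    if (temperature === undefined) return "invalid temperature";

    return `${temperature.value} degrees Celsius`;
}

console.log(describeTemperature(21.5));   // "21.5 degrees Celsius"
console.log(describeTemperature(-300));   // "invalid temperature"
// new Celsius(-300);                     // Does not compile: constructor is private
```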

This technique goes beyond simple value wrappers like Celsius, of course. We can ensure that a Date object created from a year, a month, and a day is valid and disallow dates like June 31. There are many cases in which the basic types at our disposal don’t allow us to impose the restrictions we want directly, in which case we can create types that encapsulate additional constraints and provide the guarantee that they can’t exist with invalid values.
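For the date example, a factory along the following lines could reject impossible dates. This is only a sketch: CalendarDate, make(), and the helper names are hypothetical, and it assumes the Gregorian leap-year rule.

```typescript
// Hypothetical validated date type; not the built-in Date.
class CalendarDate {
    readonly year: number;
    readonly month: number;   // 1 to 12
    readonly day: number;

    private constructor(year: number, month: number, day: number) {
        this.year = year;
        this.month = month;
        this.day = day;
    }

    static make(year: number, month: number, day: number): CalendarDate | undefined {
        if (!CalendarDate.isWhole(year) || !CalendarDate.isWhole(month)
            || !CalendarDate.isWhole(day)) return undefined;
        if (month < 1 || month > 12) return undefined;
        if (day < 1 || day > CalendarDate.daysInMonth(year, month)) return undefined;

        return new CalendarDate(year, month, day);
    }

    private static isWhole(n: number): boolean {
        return isFinite(n) && Math.floor(n) === n;
    }

    private static daysInMonth(year: number, month: number): number {
        const isLeap = (year % 4 === 0 && year % 100 !== 0) || year % 400 === 0;

        if (month === 2) return isLeap ? 29 : 28;
        if (month === 4 || month === 6 || month === 9 || month === 11) return 30;
        return 31;
    }
}

console.log(CalendarDate.make(2024, 6, 30) !== undefined);   // true
console.log(CalendarDate.make(2024, 6, 31) !== undefined);   // false: no June 31
console.log(CalendarDate.make(2023, 2, 29) !== undefined);   // false: 2023 is not a leap year
```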

Next, let’s look at how we can add and hide typing information throughout our code and when this practice is useful.

4.2.3. Exercise

1

Implement a Percentage type that represents a value between 0 and 100. Values smaller than 0 should become 0, and values larger than 100 should become 100.

4.3. Adding type information

Although type checking has strong theoretical foundations, most programming languages provide shortcuts that allow us to bypass the type checks and tell the compiler to treat a value as a certain type. We are effectively saying, “Trust us; we know what this type is better than you do.” This is called a type cast, a term you might have heard before.

Type Cast

A type cast converts the type of an expression to another type. Each programming language has its own rules about which conversions are valid and which are not, which can be done automatically by the compiler, and which must be done with additional code (figure 4.3).

Figure 4.3. With casting, we can turn a value of type 16-bit signed integer into a UTF-8 encoded character.

4.3.1. Type casting

An explicit type cast is a cast that allows us to tell the compiler to treat a value as though it had a certain type. In TypeScript, we do a cast to NewType by adding <NewType> in front of the value or by adding as NewType after the value.
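As a quick illustration of the syntax, the sketch below casts a value the compiler knows nothing about. The variable names are made up for the example, and the prefix form appears only in a comment because it is not available in .tsx files.

```typescript
// JSON.parse can return anything, so we keep the result as `unknown`.
const parsed: unknown = JSON.parse("42");

// console.log(parsed + 1);        // Does not compile: parsed is of type unknown

// We know the input held a number, so we tell the compiler to trust us.
const answer = parsed as number;

// Equivalent prefix form, as used in this chapter's listings:
//     const answer = <number>parsed;

console.log(answer + 1);           // 43
```

The cast itself does nothing at run time; it only changes what the type checker believes about the value.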

This technique can be dangerous when misused: if we bypass the type checker, we get a run-time error if we attempt to use a value as something it is not. I can cast my Bike, which I can ride(), to a SportsCar, for example, but I still won’t be able to drive() it, as the following listing shows.

Listing 4.8. Type cast causing a run-time error
class Bike {
    ride(): void { /* ... */ }
}

class SportsCar {
    drive(): void { /* ... */ }
}

let myBike: Bike = new Bike();                                    1
                                                                  1
myBike.ride();                                                    1

let myPretendSportsCar: SportsCar = <SportsCar><unknown>myBike;   2

myPretendSportsCar.drive();                                       3

  • 1 myBike is created as type Bike, so we can call ride() on it.
  • 2 We can tell the compiler to treat it as a SportsCar, which we assign to myPretendSportsCar.
  • 3 Calling drive() on myPretendSportsCar causes a run-time error.

Here, we can tell the type checker to let us pretend that we have a SportsCar, but that doesn’t mean we actually have one. Calling drive results in the following exception being thrown: TypeError: myPretendSportsCar.drive is not a function.

We had to cast myBike first to the unknown type and then to a SportsCar because the TypeScript compiler realizes that the Bike and SportsCar types don’t overlap. (A valid value of one of the types can never be a valid value of the other.) So simply writing <SportsCar>myBike still causes a compile error. Instead, we first say <unknown>myBike, which tells the compiler to forget the type of myBike. Then we can say, “Trust us; it’s a SportsCar.” But as we saw, this still causes a run-time error. In other languages, it can cause a crash. In general, casting a value to a type it doesn’t actually have is not valid. So when would a cast be useful?

4.3.2. Tracking types outside the type system

Sometimes, we know more than the type checker. Let’s revisit the Either implementation from chapter 3. It stores a value of TLeft or TRight type, and a boolean flag keeps track of whether the value is TLeft, as shown in the next listing.

Listing 4.9. Revisiting Either implementation
class Either<TLeft, TRight> {
    private readonly value: TLeft | TRight;               1
    private readonly left: boolean;                       2

    private constructor(value: TLeft | TRight, left: boolean) {
        this.value = value;
        this.left = left;
    }

    isLeft(): boolean {
        return this.left;
    }

    getLeft(): TLeft {
        if (!this.isLeft()) throw new Error();            3
                                                          3
        return <TLeft>this.value;                         3
    }

    isRight(): boolean {
        return !this.left;
    }

    getRight(): TRight {
        if (!this.isRight()) throw new Error();

        return <TRight>this.value;
    }

    static makeLeft<TLeft, TRight>(value: TLeft) {
        return new Either<TLeft, TRight>(value, true);    4
    }

    static makeRight<TLeft, TRight>(value: TRight) {
        return new Either<TLeft, TRight>(value, false);   4
    }
}

  • 1 We store a value of type TLeft or type TRight.
  • 2 We keep track of whether it is a TLeft or not by using the left property.
  • 3 When we want to get a TLeft, we check whether we are storing the right type; then we cast to TLeft.
  • 4 The makeLeft factory initializes left to true; makeRight initializes it to false.

This allows us to combine two types into a sum type that can represent a value from either of them. If we look closely, though, the value we are storing has type TLeft | TRight. After we assign it, the type checker no longer knows whether the actual value we stored was a TLeft or a TRight. From now on, it will consider value to be either of the two. This is what we want while storing the value, but at some point, we would like to use it.

The compiler will not allow us to pass a value of type TLeft | TRight to a function that expects a TLeft value, because if our value is in fact TRight, we are going to be in trouble. If we have a triangle or a square, we can’t necessarily pass that through a triangular slot. It would work to have a triangle to pass through it. But what if we have a square (figure 4.4)?

Figure 4.4. If we have a triangle or a square, we can’t say for sure whether the actual shape we have will pass through a triangular slot. It will if it’s a triangle, but it won’t if it’s a square.

Trying to do something like this results in a compiler error, which is good. But we know something the type checker doesn’t: we know from when we set the value whether it came from a TLeft or a TRight. If we created our object by using makeLeft(), we set left to true. If we created our object by using makeRight(), we set left to false, as shown in the next listing. We are keeping track of this fact even if the type checker forgets.

Listing 4.10. makeLeft and makeRight
class Either<TLeft, TRight> {
    private readonly value: TLeft | TRight;
    private readonly left: boolean;                        1

    private constructor(value: TLeft | TRight, left: boolean) {
        this.value = value;
        this.left = left;                                  2
    }


    /* ... */

    static makeLeft<TLeft, TRight>(value: TLeft) {
        return new Either<TLeft, TRight>(value, true);     3
    }

    static makeRight<TLeft, TRight>(value: TRight) {
        return new Either<TLeft, TRight>(value, false);    3
    }
}

  • 1 left tells us whether we are storing a TLeft.
  • 2 left is assigned in the private constructor that only makeLeft and makeRight can call.
  • 3 makeLeft and makeRight initialize left to the appropriate value.

When we want to take the value out, as a caller, it is our responsibility to first check which of the two types the value is. If we have an Either<Triangle, Square> and want a Triangle, we start by calling isLeft(). If true is returned, we call getLeft() and end up with a Triangle, as the following listing shows.

Listing 4.11. Triangle or Square
declare const triangleType: unique symbol;
class Triangle {                              1
    [triangleType]: void;
    /* ... */
}

declare const squareType: unique symbol;
class Square {
    [squareType]: void;                       1
    /* ... */
}

function slot(triangle: Triangle) {
    /* ... */
}

let myTriangle: Either<Triangle,Square>
    = Either.makeLeft(new Triangle());        2

if (myTriangle.isLeft())
    slot(myTriangle.getLeft());               3

  • 1 Triangle and Square types
  • 2 From here on, myTriangle.value is of type Triangle | Square; the compiler no longer knows that we placed a Triangle there.
  • 3 getLeft() casts the value back to a Triangle.

Internally, our getLeft() implementation performs whatever checks it needs (in this case by checking that this.isLeft() is true) and handles an invalid call however we want (in this case by throwing Error). When all that is out of the way, it casts the value to the type. The type checker forgot which type the value was when we assigned it, but we kept track of it in left, so now we remind the type checker, as shown in the following code.

Listing 4.12. isLeft() and getLeft()
class Either<TLeft, TRight> {
    private readonly value: TLeft | TRight;
    private readonly left: boolean;

    /* ... */

    isLeft(): boolean {
        return this.left;                        1
    }

    getLeft(): TLeft {
        if (!this.isLeft()) throw new Error();   2

        return <TLeft>this.value;                3
    }

    /* ... */
}

  • 1 Clients can check whether the value stored is of type TLeft by calling isLeft().
  • 2 In case the value is of the wrong type, we can handle the error. In this case, we throw an Error. An alternative would be to return undefined.
  • 3 The value is cast to the type TLeft.

In this case, we don’t need the <unknown> cast: a value of the type TLeft | TRight could be a valid value of type TLeft, so the compiler won’t complain and will trust us with the cast.

When used correctly, casting is powerful because it allows us to refine the type of a value. If we have a Triangle | Square, and we know that it is a Triangle, we can cast it to a Triangle, which the compiler will allow us to fit through a triangular slot.
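Outside of Either, the same refinement can be sketched as follows. pickShape() is a hypothetical producer whose return type hides which shape it created, and the literal kind members are an assumption used to keep the two classes structurally distinct.

```typescript
class Triangle {
    readonly kind: "Triangle" = "Triangle";
}

class Square {
    readonly kind: "Square" = "Square";
}

function slot(triangle: Triangle): string {
    return `accepted a ${triangle.kind}`;
}

// Hypothetical producer: the declared return type hides the actual shape.
function pickShape(wantTriangle: boolean): Triangle | Square {
    return wantTriangle ? new Triangle() : new Square();
}

// Knowledge we track ourselves, outside the type system.
const askedForTriangle = true;
const shape = pickShape(askedForTriangle);

// slot(shape);                            // Does not compile: could be a Square
const result = slot(shape as Triangle);    // We remind the compiler what we know

console.log(result);                       // "accepted a Triangle"
```

No cast through unknown is needed here, because Triangle is one of the possibilities of Triangle | Square. The cast is only as trustworthy as our bookkeeping: had we asked for a square, the compiler would still have accepted it.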

In fact, most type checkers do several such casts automatically without requiring us to write any code.

Implicit and Explicit Type Casts

An implicit type cast, also known as coercion, is a type cast that is performed automatically by the compiler. It doesn’t require any code to be written. Such casts are usually safe. By contrast, an explicit type cast is a type cast that we need to specify with code. This type cast effectively bypasses the rules of the type system, and we should use it with care.

4.3.3. Common type casts

Let’s look at a few common types of casts, both implicit and explicit, and see how they can be useful.

Upcasts and downcasts

One example of a common type cast is interpreting an object of a type that inherits from another type as its parent type. If our base class is Shape, and we have a Triangle, we can always use a Triangle whenever a Shape is required, as shown in the following code.

Listing 4.13. Upcast
class Shape {
    /* ... */
}

declare const triangleType: unique symbol;

class Triangle extends Shape {              1
    [triangleType]: void;
    /* ... */
}

function useShape(shape: Shape) {           2
    /* ... */
}

let myTriangle: Triangle = new Triangle();

useShape(myTriangle);                       3

  • 1 The Triangle type extends Shape.
  • 2 useShape() expects an argument of type Shape.
  • 3 We can pass a Triangle to it, and it is automatically cast to Shape.

Inside the body of useShape(), the compiler treats the argument as a Shape, even if we passed in a Triangle. Interpreting a derived class (Triangle) as a base class (Shape) is called an upcast. If we know for sure that our Shape is actually a Triangle, we can cast it back to Triangle, but this cast needs to be explicit. Casting from a parent class to a derived class is called a downcast, shown in the next listing, and most strongly typed languages don’t do this automatically.

Listing 4.14. Downcast
class Shape {
    /* ... */
}

declare const triangleType: unique symbol;

class Triangle extends Shape {
    [triangleType]: void;
    /* ... */
}

function useShape(shape: Shape, isTriangle: boolean) {    1
    if (isTriangle) {
        let triangle: Triangle = <Triangle>shape;         2
        /* ... */
    }
    /* ... */
}

let myTriangle: Triangle = new Triangle();

useShape(myTriangle, true);                               3

  • 1 This version of the function has an additional argument that tracks whether a triangle was passed in.
  • 2 If the argument is in fact a triangle, we can get the type back with a cast.
  • 3 The caller needs to set this flag correctly; otherwise, a run-time error occurs.

Unlike an upcast, a downcast is not safe. Although it’s easy to tell from a derived class what its parent is, the compiler can’t automatically determine, given a parent class, which of the possible derived classes a value might be.

Some programming languages store additional type information at run time and include an is operator, which can be used to query the type of an object. When we are creating a new object, its associated type is stored alongside, so even if we upcast away some of the type information from the compiler, at run time we can check whether we have an instance of a certain type with if (shape is Triangle) ....

Languages and run times that implement this kind of run-time type information provide a safer way to store and query for types, as there is no risk that this information will get out of sync with the objects. This comes at the cost of storing additional data in memory for each object instance.
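TypeScript has no is operator of this kind, but for objects created from classes, JavaScript’s instanceof operator plays a similar role, and the compiler narrows the type inside the check. The sketch below uses hypothetical describe() and radius members; note that instanceof inspects how the object was constructed, so it would not recognize a plain object that merely has the same shape.

```typescript
class Shape {
    /* ... */
}

class Triangle extends Shape {
    describe(): string {
        return "triangle";
    }
}

class Circle extends Shape {
    readonly radius: number = 1;
}

function useShape(shape: Shape): string {
    // Run-time check: was this object constructed as a Triangle?
    if (shape instanceof Triangle) {
        // Inside this block, the compiler treats shape as a Triangle.
        return shape.describe();
    }

    return "some other shape";
}

console.log(useShape(new Triangle()));   // "triangle"
console.log(useShape(new Circle()));     // "some other shape"
```

Compared with passing an isTriangle flag, there is nothing for the caller to get wrong: the check looks at the object itself.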

In chapter 7, when we discuss subtyping, we will look at more complex upcasts and talk about variance. For now, we’ll move on to talk about widening and narrowing casts.

Widening casts and narrowing casts

Another common implicit cast is from an integer type with a fixed number of bits—say, an 8-bit unsigned integer—to another integer type that represents values with more bits—say, a 16-bit unsigned integer. You can do this implicitly because a 16-bit unsigned integer can represent any 8-bit unsigned integer value and more. This type of cast is called a widening cast.

On the other hand, casting a signed integer to an unsigned integer is dangerous, as a negative number can’t be represented by an unsigned integer. Similarly, casting an integer with more bits to an integer with fewer bits, such as a 16-bit unsigned integer to an 8-bit unsigned integer, would work only for values that the smaller type can represent.

This type of cast is called a narrowing cast. Some compilers force you to be explicit when performing a narrowing cast because it’s dangerous. Being explicit helps, in that it makes it clear you didn’t do it unintentionally. Other compilers allow narrowing casts but issue a warning. Run-time behavior when the value doesn’t fit the new type is similar to the integer overflow that we discussed in chapter 2: depending on the language, we get an error or the value gets chopped so that it fits in the new type (figure 4.5).

Figure 4.5. Example of widening and narrowing casts. The widening cast is safe: the gray squares represent the extra bits we get, so no information can be lost. On the other hand, the narrowing cast is dangerous: the black squares represent bits that no longer fit in the new type.
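TypeScript’s number type has no fixed width, so to observe this behavior we can borrow JavaScript’s typed arrays, which store fixed-width integers. In the following sketch, toUint8() and toUint16() are hypothetical helpers that write a number into a one-element typed array and read it back.

```typescript
// Hypothetical helper: store a value as an 8-bit unsigned integer.
function toUint8(value: number): number {
    const cell = new Uint8Array(1);
    cell[0] = value;
    return cell[0];
}

// Hypothetical helper: store a value as a 16-bit unsigned integer.
function toUint16(value: number): number {
    const cell = new Uint16Array(1);
    cell[0] = value;
    return cell[0];
}

// Widening: every 8-bit value fits in 16 bits, so nothing is lost.
console.log(toUint16(toUint8(200)));   // 200

// Narrowing: 0x1234 needs 16 bits; only the low 8 bits (0x34) survive.
console.log(toUint8(0x1234));          // 52

// Signed to unsigned: -1 can't be represented, so it wraps around.
console.log(toUint8(-1));              // 255
```

Typed arrays chop the value silently rather than raising an error, which is one of the two run-time behaviors described above.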

Casts are not to be used lightly, as they bypass the type checker, effectively eliminating all the goodness that type checking brings us. They are useful tools, though, especially when we have more information than the compiler does and want to push that information back to the compiler. After we tell the compiler what we know, it can use that information in further analysis. Going back to the Triangle | Square example, after we tell the compiler our value is a Triangle, there can be no Square value farther on. This technique is similar to the one discussed in section 4.2, in which we looked at enforcing constraints, but here, instead of performing a run-time check, we simply tell the compiler to trust us.

In the next section, we’ll look at a few other situations in which it’s useful to make the compiler “forget” typing information.

4.3.4. Exercises

1

Which of the following casts are considered to be safe?

  1. Upcasts
  2. Downcasts
  3. Upcasts and downcasts
  4. Neither

2

Which of the following casts are considered to be unsafe?

  1. Widening casts
  2. Narrowing casts
  3. Widening and narrowing casts
  4. Neither

4.4. Hiding and restoring type information

One example of hiding type information is wanting to have a collection that can contain a combination of values of different types. If the collection contains values of just one type, such as a bag of cats, it’s easy, because we know that whenever we pull something out from the bag, it’s going to be a cat. If we want to put groceries in the bag too, when we pull something out, we might end up with either a cat or a grocery item (figure 4.6).

Figure 4.6. If we have a bag that contains only cats, we can bet that whichever item we pull out of it will be a cat. If the bag can also contain groceries, we are no longer able to guarantee what we will pull out.

A collection with items of the same type, like our bag of cats, is also called a homogenous collection. Because all items have the same type, we don’t need to hide their type information. A collection of items of different types is also known as a heterogenous collection. In this case, we need to hide some of the typing information to declare such a collection.

4.4.1. Heterogenous collections

A document can contain text, pictures, or tables. When we work with the document, we want to keep all its constituent parts together, so we will store them in some collection. But what is the type of the elements of that collection? There are several ways to implement this, all of which involve hiding some type information.

Base type or interface

We can create a class hierarchy and say that all items in the documents must be part of some hierarchy. If everything is a DocumentItem, we can store a collection of DocumentItem values even if, when we add items to the collection, we add types such as Paragraph, Picture, and Table. Similarly, we can declare an IDocumentItem interface and say that the array contains only types that implement this interface, as shown in the following listing.

Listing 4.15. A collection of types implementing IDocumentItem
interface IDocumentItem {                     1
    /* ... */
}

class Paragraph implements IDocumentItem {    2
    /* ... */
}

class Picture implements IDocumentItem {      2
    /* ... */
}

class Table implements IDocumentItem {        2
    /* ... */
}

class MyDocument {
    items: IDocumentItem[];                   3

    /* ... */
}

  • 1 IDocumentItem is the common interface for document elements.
  • 2 Paragraph, Picture, and Table all implement IDocumentItem.
  • 3 We store document items as an array of IDocumentItem objects.

We’ve hidden some of the typing information, so we no longer know whether a particular item in the collection is a Paragraph, a Picture, or a Table, but we know that it implements the DocumentItem or IDocumentItem contract. If we need only behavior specified by that contract, we can work with the elements of the collection as is. If we need an exact type, such as a picture that we want to pass to an image-enhancing add-on, we have to downcast the DocumentItem or IDocumentItem back to a Picture.
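
As a minimal sketch of both cases, the example below gives the interface a hypothetical describe() method and gives Picture a hypothetical enhance() method; neither appears in the listing above. It uses an instanceof check as one possible way to restore the Picture type before using Picture-only behavior.

```typescript
interface IDocumentItem {
    describe(): string;                  // hypothetical shared behavior
}

class Paragraph implements IDocumentItem {
    constructor(public text: string) {}
    describe(): string { return `Paragraph: ${this.text}`; }
}

class Picture implements IDocumentItem {
    constructor(public url: string) {}
    describe(): string { return `Picture: ${this.url}`; }
    enhance(): string { return `enhanced ${this.url}`; }   // Picture-only behavior
}

const items: IDocumentItem[] = [
    new Paragraph("Hello"),
    new Picture("cat.png"),
];

// Behavior that comes from the contract needs no cast.
const descriptions: string[] = items.map(item => item.describe());

// Picture-specific behavior requires restoring the type first.
const enhanced: string[] = [];
for (const item of items) {
    if (item instanceof Picture) {
        enhanced.push(item.enhance());   // item is treated as a Picture here
    }
}
```

The run-time check makes the downcast safe: the compiler accepts item.enhance() only inside the branch where the check has succeeded.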

Sum type or variant

If we know up front all the types we are dealing with, we can use a sum type, as shown in listing 4.16. We can define our document as an array of Paragraph | Picture | Table (in which case we must track what each item in the collection is by some other means) or as a type such as Variant<Paragraph, Picture, Table> (which keeps track internally of the type it stores).

Listing 4.16. A collection of types as a sum type
class Paragraph {                              1
    /* ... */
}

class Picture {                                1
    /* ... */
}

class Table {                                  1
    /* ... */
}

class MyDocument {
    items: (Paragraph | Picture | Table)[];    2

    /* ... */
}

  • 1 Paragraph, Picture, and Table no longer implement an interface.
  • 2 The document item collection is now an array of objects, each of which can be any of the three types.

Both Paragraph | Picture | Table and Variant<Paragraph, Picture, Table> options allow us to store a set of items that don’t need to have anything in common (no common base type or implemented interface). The advantage is that we don’t impose anything on the types in the collection. The disadvantage is that there is not much we can do with the items in the list without casting them back down to their actual types or, in the Variant case, calling visit() and having to provide functions for each of the possible types in the collection.

As a reminder, because a type like Variant keeps track internally of which type it actually stores, just as Either does, it knows which function to pick from a set of functions passed to visit().
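
The following sketch illustrates the "one function per possible type" idea on the plain Paragraph | Picture | Table union. The visitItem() helper and the class fields are hypothetical; this is not the Variant type itself, only an approximation that uses instanceof checks to decide which function to call.

```typescript
class Paragraph { constructor(public text: string) {} }
class Picture { constructor(public url: string) {} }
class Table { constructor(public rows: number) {} }

type DocumentItem = Paragraph | Picture | Table;

// Hypothetical helper: picks the function that matches the item's actual type.
function visitItem<R>(
    item: DocumentItem,
    onParagraph: (p: Paragraph) => R,
    onPicture: (p: Picture) => R,
    onTable: (t: Table) => R
): R {
    if (item instanceof Paragraph) return onParagraph(item);
    if (item instanceof Picture) return onPicture(item);
    return onTable(item);    // the only remaining possibility is Table
}

const items: DocumentItem[] = [
    new Paragraph("Hi"),
    new Picture("cat.png"),
    new Table(3),
];

const summaries: string[] = items.map(item => visitItem(
    item,
    p => `text: ${p.text}`,
    p => `image: ${p.url}`,
    t => `table with ${t.rows} rows`
));
```

Because the caller must supply a function for every member of the union, no case can be forgotten, which is the main benefit over scattering casts through the code.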

Unknown type

At an extreme, we can say we have a collection that can contain anything. As shown in listing 4.17, TypeScript provides the type unknown, which can hold a value of any type, so an array of unknown represents that kind of collection. Most object-oriented programming languages have a common base type that is the parent of all other types, usually called Object. We’ll cover this topic in depth in chapter 7 when we discuss subtyping.

Listing 4.17. A collection of unknown type
class MyDocument {
    items: unknown[];      1
    /* ... */
}

  • 1 The elements of the array can be anything.

This technique allows us to have a document containing anything. Types don’t need to have a shared contract, and we don’t even need to know beforehand what the types do. On the other hand, there’s even less we can do with the elements of this collection. We’ll almost always have to cast them to other types, so we have to keep track of their original types in another way.
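
One way to keep track of the original types, sketched below, is to store a tag next to each value. The TaggedItem wrapper, its kind field, and the class fields are assumptions made for this example and are not part of listing 4.17.

```typescript
class Paragraph { constructor(public text: string) {} }
class Picture { constructor(public url: string) {} }

// Hypothetical wrapper: the tag records, as data, what the value really is.
type TaggedItem = { kind: "paragraph" | "picture"; value: unknown };

const items: TaggedItem[] = [
    { kind: "paragraph", value: new Paragraph("Hello") },
    { kind: "picture", value: new Picture("cat.png") },
];

const urls: string[] = [];
for (const item of items) {
    // item.value.url would not compile: unknown exposes no members.
    if (item.kind === "picture") {
        const picture = <Picture>item.value;   // we trust the tag, not the compiler
        urls.push(picture.url);
    }
}
```

The cast is only as trustworthy as the tag. If some code stores a Paragraph under the "picture" tag, the compiler will not notice.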

Table 4.1 summarizes the different approaches and trade-offs.

Table 4.1. Pros and cons of heterogenous list implementations

| Approach     | Pros | Cons |
|--------------|------|------|
| Hierarchy    | Can easily use any property or method of the base type without casting | Types in the collection must be related by base type or implemented interface |
| Sum type     | No requirement that types be related | Need to cast back to actual type to use items if we don’t have Variant’s visit() |
| Unknown type | Can store anything | Need to keep track of actual types and cast back to them to use items |

All these examples have pros and cons, depending on how flexible we want our collection to be in terms of what can be stored there and how often we expect to have to restore the items to their original types. That being said, all the examples hide some amount of type information when we put items in the collection. Another example of hiding and restoring type information is serialization.

4.4.2. Serialization

When we write information to a file and want to load it back and use it in our program, or when we connect to an internet service and send and retrieve some data, that data travels as a sequence of bits. Serialization is the process of taking a value of a certain type and encoding it as a sequence of bits. The opposite operation, deserialization, involves taking a sequence of bits and decoding it into a data structure we can work with (figure 4.7).

Figure 4.7. A compact car with two doors and front-wheel drive serialized as JSON and then deserialized back into a car

The exact encoding depends on the protocol we use. It can be JSON, XML, or any other of the multitude of available protocols. From a type perspective, the important part is that after serialization, we end up with a value that should be equivalent to the typed value we started with, but all typing information becomes unavailable to the type system. Effectively, we end up with a string or an array of bytes. The JSON.stringify() method takes an object and returns a JSON representation of that object as a string. If we stringify a Cat, as the next listing shows, we can write the result to disk, to the network, or even to the screen, but we cannot get it to meow().

Listing 4.18. Serializing a cat
class Cat {
    meow() {                                              1
        /* ... */
    }
}

let serializedCat: string = JSON.stringify(new Cat());    2

// serializedCat.meow();                                  3

  • 1 A Cat type that has a meow() method.
  • 2 We serialize a Cat object as a JSON string by using JSON.stringify().
  • 3 Obviously, we can’t use a method like meow() because serializedCat is a string.

We still know what the value is, but the type checker no longer does. The opposite operation involves taking a serialized object and turning it back into a typed value. In this case, we can use the JSON.parse() method, which takes a string and returns the JavaScript value that the string encodes. Because the string can encode a value of any shape, the result of calling it is of type any.

The any type

TypeScript provides an any type. This type is used for interoperability with JavaScript when typing information is unavailable. any is a dangerous type because the compiler does no type checking on instances of this type, which can be freely converted to and from any other type. It’s up to the developer to ensure that no misinterpretations happen.
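
To illustrate the danger, the short sketch below calls a method that does not exist on a value of type any. The JSON text and variable names are made up for this example.

```typescript
const value: any = JSON.parse('{"name":"Tom"}');

let failedAtRunTime = false;
try {
    value.meow();    // compiles, but the parsed object has no meow() method
} catch (e) {
    failedAtRunTime = e instanceof TypeError;
}

// With unknown, the same call is rejected at compile time until we narrow
// or cast, so the mistake cannot reach run time unnoticed.
const safer: unknown = value;
// safer.meow();    // does not compile
```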

If we know that we have a serialized Cat, we can assign it to a new Cat object by using Object.assign() as shown in the following listing, and then cast it back to its type, as Object.assign() returns a value of type any.

Listing 4.19. Deserializing a Cat
class Cat {
    meow() {
        /* ... */
    }
}

let serializedCat: string = JSON.stringify(new Cat());

let deserializedCat: Cat =
    <Cat>Object.assign(new Cat(), JSON.parse(serializedCat));    1

deserializedCat.meow();                                          2

  • 1 We deserialize the object by using JSON.parse(), assign it to a new Cat instance, and cast it to the Cat type.
  • 2 We can call meow() on the object, as it is of type Cat and has a meow() method.
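
The sketch below suggests why the Object.assign() step matters. It assumes a hypothetical name field on Cat so that there is some data to serialize. A cast by itself changes only what the compiler believes, whereas Object.assign() copies the parsed fields onto a real Cat instance that has the methods.

```typescript
class Cat {
    name: string = "Tom";    // hypothetical field so there is data to serialize
    meow(): string { return `${this.name} says meow`; }
}

const serialized: string = JSON.stringify(new Cat());   // '{"name":"Tom"}'

// Casting alone: the compiler now believes this is a Cat, but it is a plain object.
const plain = <Cat>JSON.parse(serialized);
const plainHasMeow: boolean = typeof plain.meow === "function";       // false

// Object.assign() copies the parsed fields onto a real Cat instance.
const restored: Cat = Object.assign(new Cat(), JSON.parse(serialized));
const restoredHasMeow: boolean = typeof restored.meow === "function"; // true
```

Methods live on the class prototype and are not part of the JSON text, which is why only the second object can meow().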

In some cases, we can get and deserialize any number of possible types, in which case it might be a good idea to encode some of the typing information in the serialized object too. We can define a protocol in which each object is prefixed with a character that represents its type. Then we can encode a Cat and prefix the resulting string with "c" for Cat. If we get a serialized object, we check the first character. If it’s "c", we can safely restore our Cat. If it’s "d", for Dog, we know not to deserialize a Cat, as shown in the following listing.

Listing 4.20. Serializing and tracking type
class Cat {
    meow() { /* ... */ }
}

class Dog {
    bark() { /* ... */ }
}

function serializeCat(cat: Cat): string {
    return "c" + JSON.stringify(cat);                                  1
}

function serializeDog(dog: Dog): string {
    return "d" + JSON.stringify(dog);                                  2
}

function tryDeserializeCat(from: string): Cat | undefined {            3
    if (from[0] != "c") return undefined;                              4

    return <Cat>Object.assign(new Cat(), JSON.parse(from.substr(1)));  5
}

  • 1 We serialize a Cat object by prefixing a “c” to the JSON representation.
  • 2 We serialize a Dog object by prefixing a “d” to the JSON representation.
  • 3 Given a serialized Cat or Dog, we can attempt to deserialize a Cat.
  • 4 If the first character is not “c”, return undefined because we can’t deserialize a Cat.
  • 5 Otherwise, JSON.parse() the rest of the string and assign it to a Cat object.

If we serialize a Cat object and call tryDeserializeCat() on its serialized representation, we get back a Cat object. If, on the other hand, we serialize a Dog object and call tryDeserializeCat(), we get back undefined. Then we can check to see whether we got an undefined and see whether we have a Cat, as shown in the next listing.

Listing 4.21. Deserializing with tracked type
let catString: string = serializeCat(new Cat());                 1
let dogString: string = serializeDog(new Dog());                 1

let maybeCat: Cat | undefined = tryDeserializeCat(catString);    2

if (maybeCat != undefined) {                                     3
    let cat: Cat = <Cat>maybeCat;                                4
    cat.meow();                                                  4
}

maybeCat = tryDeserializeCat(dogString);                         5

  • 1 We serialize a Cat and a Dog to strings.
  • 2 Calling tryDeserializeCat gives us either a Cat or undefined.
  • 3 We can check whether we got a Cat.
  • 4 If we did, we can cast to Cat and get an object we can call meow() on.
  • 5 Attempting to deserialize a Cat object from a serialized Dog object will give us undefined.
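
The same prefix protocol can drive a single function that restores whichever type the prefix names. The tryDeserialize() helper below is a hypothetical extension of the listings above, and the return values of meow() and bark() are assumptions made so the result can be observed.

```typescript
class Cat { meow(): string { return "meow"; } }
class Dog { bark(): string { return "woof"; } }

function serializeCat(cat: Cat): string { return "c" + JSON.stringify(cat); }
function serializeDog(dog: Dog): string { return "d" + JSON.stringify(dog); }

// Hypothetical helper: reads the prefix and restores the type it names.
function tryDeserialize(from: string): Cat | Dog | undefined {
    const payload = from.substring(1);
    switch (from[0]) {
        case "c": return Object.assign(new Cat(), JSON.parse(payload));
        case "d": return Object.assign(new Dog(), JSON.parse(payload));
        default: return undefined;    // unknown prefix: refuse to guess
    }
}

const pet = tryDeserialize(serializeDog(new Dog()));

const sound: string =
    pet instanceof Dog ? pet.bark()
    : pet instanceof Cat ? pet.meow()
    : "nothing";
```

The prefix is type information stored as data. The function reads it back and hands the type checker a Cat | Dog | undefined that the caller can then narrow.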

The reason why we can compare maybeCat with undefined, even though we couldn’t compare Triangle with TLeft previously, is that undefined is a special unit type in TypeScript. The undefined type has a single possible value, which is undefined. In the absence of this type, we can always use a type like Optional<Cat>. We described Optional<T> in chapter 3 as a type that contains a value of type T or nothing.
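
As a side note, when the compiler option strictNullChecks is enabled, comparing against undefined also narrows the type, so the explicit cast becomes optional. The sketch below assumes that setting; findCat() is a hypothetical stand-in for a function such as tryDeserializeCat().

```typescript
class Cat { meow(): string { return "meow"; } }

// Hypothetical stand-in for a function that may or may not produce a Cat.
function findCat(found: boolean): Cat | undefined {
    return found ? new Cat() : undefined;
}

const maybeCat: Cat | undefined = findCat(true);

let sound: string = "silence";
if (maybeCat !== undefined) {
    sound = maybeCat.meow();    // narrowed to Cat inside this block, no cast needed
}
```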

As we’ve seen throughout this chapter, types enable whole new levels of safety for our code. We can capture what would’ve been implicit assumptions in type declaration and make them explicit by avoiding primitive obsession and letting the type checker make sure that we don’t misinterpret values. We can further restrict the allowed values of a certain type and ensure that constraints are met during instance creation, so that we have a guarantee that when we have an instance of a given type, it will always be valid.

On the other hand, we want to be more flexible in some situations and handle multiple types in the same way. In such situations, we can hide some of the type information and expand the possible values that a variable can take. In most cases, we would still like to keep track of the original type of the value so we can restore it later. We do that outside the type system by storing the type somewhere else, such as in another variable. As soon as we no longer need the extra flexibility and want to rely on the type checker again, we can restore the type by using a type cast.

4.4.3. Exercises

1

Which type should we use if we want to assign any possible value to it?

  1. any
  2. unknown
  3. any | unknown
  4. Either any or unknown

2

What is the best way to represent an array of numbers and strings?

  1. (number | string)[]
  2. number[] | string[]
  3. unknown[]
  4. any[]

Summary

  • The primitive obsession antipattern shows up when we declare values as basic types and make implicit assumptions about their meaning.
  • The alternative to primitive obsession is defining types that explicitly capture the meaning of the values and prevent misinterpretations.
  • If we have additional constraints that we want to impose but can’t at compile time, we can enforce them in constructors or factories, so that when we have an object of the type, we are guaranteed that it is valid.
  • Sometimes, we know more than the type checker does, as we can store typing information outside the type system itself as data.
  • We can use this information to perform safe type casts, adding more information for the type checker.
  • We may want to treat different types the same way, perhaps to store values of different types in a single collection or serialize them.
  • We can hide type information by casting to a type that includes our type, a type our type inherits from, a sum type, or a type that can store values of any other type.

So far we’ve looked at basic types, ways to compose them, and other ways in which we can leverage the type systems to increase the safety of our code. In chapter 5, we’ll look at something radically different: What new possibilities will be open to us when we can assign types to functions and treat functions like any other values in our code?

Answers to exercises

Avoiding primitive obsession to prevent misinterpretation

1

c—Specifying the measurement unit is a safer approach.

 

Enforcing constraints

1

Here is a possible solution:

declare const percentageType: unique symbol;

class Percentage {
    readonly value: number;
    [percentageType]: void;

    private constructor(value: number) {
        this.value = value;
    }

    static makePercentage(value: number): Percentage {
        if (value < 0) value = 0;
        if (value > 100) value = 100;

        return new Percentage(value);
    }
}

 

Adding type information

1

a—Upcasts are safe (casting child to parent type).

2

b—Narrowing casts are unsafe (might lose information).

 

Hiding and restoring type information

1

b—unknown is a safer option than any.

2

a—unknown and any remove too much type information.

 
