Chapter 8. Type Hints in Functions

I learned a painful lesson that for small programs, dynamic typing is great. For large programs you need a more disciplined approach. And it helps if the language gives you that discipline rather than telling you “Well, you can do whatever you want”.1

Guido van Rossum

In 2006, PEP 3107 introduced the function annotation syntax for Python 3.0. For nine years that syntactic feature had no standard meaning, to allow for experimentation. The Python community tried different ways of using annotations and converged on PEP 484 — Type Hints, which gives them a very specific meaning, supported by the typing module since Python 3.5.

PEP 484 introduced a gradual type system to Python. Besides Microsoft’s TypeScript, other languages with gradual type systems are Dart (the language of the Flutter SDK, created by Google), and Hack (a dialect of PHP supported by Facebook’s HHVM virtual machine). The Mypy type checker itself started as a language: a gradually typed dialect of Python with its own interpreter. Guido van Rossum convinced the creator of Mypy, Jukka Lehtosalo, to make it a tool for checking annotated Python code.

The best usability feature of gradual typing is that annotations are always optional. With static type systems, most type constraints are easy to express, many are cumbersome, some are hard, and a few are impossible.2 You may very well write an excellent piece of Python code, with good test coverage and passing tests, but still be unable to add type hints that satisfy a type checker. That’s ok, just leave out the problematic type hints and ship it!

This chapter focuses on Python’s type hints in function signatures. [Link to Come] explores type hints in the context of classes, and other typing module features.

The major topics in this chapter are:

  • A hands-on introduction to gradual typing with Mypy.

  • The complementary perspectives of duck typing and nominal typing.

  • Overview of the main categories of types that can appear in annotations—this is about 60% of the chapter.

  • Function signature overloading.

  • Type hinting variadic parameters (*args, **kwargs).

  • Runtime access to annotations.

What’s new in this chapter

Most of the content of this chapter is new. Type hints appeared in Python 3.5 after I wrapped up the first edition of Fluent Python.

In addition to the new content, I moved section “Reading annotations at runtime” from Chapter 7 because holding type hints is now the only recommended use for __annotations__.

Now let’s review the essence of gradual typing, then see it in practice through an example.

About gradual typing

A gradual type system:

Is optional.

By default, the type checker should not emit warnings for code that has no type hints. Instead, if a type cannot be inferred, the type checker assumes the Any type, which is consistent with all other types.

Does not catch type errors at runtime.

Type hints are used by static type checkers, linters, and IDEs to raise warnings; they do not prevent inconsistent values from being passed to functions or assigned to variables at runtime.

Does not enhance performance.

Type annotations provide data that could help generate optimized bytecode or machine code, but such optimizations are not implemented in any Python runtime that I am aware of as of early 2020.3

Gradual typing in practice

Let’s see how gradual typing works in practice, starting with a simple function and gradually adding type hints to it, guided by Mypy.

Note

There are several Python type checkers compatible with PEP 484, including Google’s Pytype, Microsoft’s Pyright, Facebook’s Pyre—in addition to type checkers embedded in IDEs such as PyCharm. I picked Mypy for the examples because it’s the best known, by far. However, one of the others may be a better fit for some projects or teams. Pytype, for example, is designed to handle code bases with no type hints and still provide useful advice. It is more lenient than Mypy, and can also generate annotations for your code.

We will annotate a show_count function that returns a string with a count and a singular or plural word, depending on the count:

>>> show_count(99, 'bird')
'99 birds'
>>> show_count(1, 'bird')
'1 bird'
>>> show_count(0, 'bird')
'no bird'

Example 8-1 shows the source code of show_count, without annotations.

Example 8-1. show_count from messages.py without type hints.
def show_count(count, word):
    if count == 0:
        return f'no {word}'
    elif count == 1:
        return f'{count} {word}'
    return f'{count} {word}s'

Starting with Mypy

To begin type checking, I run the mypy command on the messages.py module:

/no_hints/ $ pip install mypy
[lots of messages omitted...]
/no_hints/ $ mypy messages.py
Success: no issues found in 1 source file

Mypy finds no problem with the code in Example 8-1 when run with its default settings.4

If a function signature has no annotations, Mypy ignores it by default. That’s the spirit of gradual typing.

For this example, I also have pytest unit tests. This is the code in messages_test.py.

Example 8-2. messages_test.py without type hints.
from pytest import mark

from messages import show_count

@mark.parametrize('qty, expected', [
    (1, '1 part'),
    (2, '2 parts'),
])
def test_show_count(qty, expected):
    got = show_count(qty, 'part')
    assert got == expected

def test_show_count_zero():
    got = show_count(0, 'part')
    assert got == 'no part'

Let’s check messages_test.py:

$ mypy messages_test.py
messages_test.py:3: error: Skipping analyzing 'pytest':
 found module but no type hints or library stubs
messages_test.py:3: note: See
 https://mypy.readthedocs.io/en/latest/running_mypy.html#missing-imports
Found 1 error in 1 file (checked 1 source file)

The problem is that messages_test imports pytest, which doesn’t have type hints as I write this.5 I add this comment to the import line to make Mypy ignore pytest:

from pytest import mark  # type: ignore

Now Mypy doesn’t report any issues with messages_test.py.

Making Mypy More Strict

The command-line option --disallow-untyped-defs makes Mypy flag any function definition that does not have type hints for all its parameters and for its return value.

Using --disallow-untyped-defs on the test file produces three errors and a note:

/no_hints/ $ mypy --disallow-untyped-defs messages_test.py
messages.py:14: error: Function is missing a type annotation
messages_test.py:10: error: Function is missing a type annotation
messages_test.py:15: error: Function is missing a return type annotation
messages_test.py:15: note: Use "-> None" if function does not return a value
Found 3 errors in 2 files (checked 1 source file)

For the first steps with gradual typing, I prefer to use another option: --disallow-incomplete-defs. Initially, it tells me nothing:

/no_hints/ $ mypy --disallow-incomplete-defs messages_test.py
Success: no issues found in 1 source file

But now, I may add just the return type to show_count in messages.py:

def show_count(count, word) -> str:

This is enough to make Mypy look at it. Using the same command line as before to check messages_test.py leads Mypy to look at messages.py again:

/no_hints/ $ mypy --disallow-incomplete-defs messages_test.py
messages.py:14: error: Function is missing a type annotation for one or more arguments
Found 1 error in 1 file (checked 1 source file)

Now I can gradually add type hints function by function, without getting warnings about functions that I haven’t annotated. This is a fully annotated signature that satisfies Mypy:

def show_count(count: int, word: str) -> str:
Tip

Instead of providing options like --disallow-incomplete-defs on the command line, it’s better to create a configuration file as described in the Mypy configuration file documentation. You can have global settings and per-module settings. Here is a good mypy.ini to get started—which also ignores pytest:

[mypy]
python_version = 3.8
warn_unused_configs = True
disallow_incomplete_defs = True
[mypy-pytest]
ignore_missing_imports = True

A Default Parameter Value

The show_count function—first shown in Example 8-1—has an obvious limitation: it only works with regular nouns. If the plural can’t be spelled by appending an 's', we should let the user provide the plural form, like this:

>>> show_count(3, 'mouse', 'mice')
'3 mice'

Let’s do a little “type driven development”. First we add a test that uses that third argument. Don’t forget to add the return type hint -> None to the test function; otherwise Mypy will not check it.

def test_irregular() -> None:
    got = show_count(2, 'child', 'children')
    assert got == '2 children'

Mypy detects the error:

/hints_2/ $ mypy messages_test.py
messages_test.py:22: error: Too many arguments for "show_count"
Found 1 error in 1 file (checked 1 source file)

Now I edit show_count, adding the optional plural parameter:

Example 8-3. show_count from hints_2/messages.py with an optional parameter.
def show_count(count: int, singular: str, plural: str = '') -> str:
    if count == 0:
        return f'no {singular}'
    elif count == 1:
        return f'1 {singular}'
    else:
        if plural:
            return f'{count} {plural}'
        else:
            return f'{count} {singular}s'

The following details are not mandatory, but are considered good style for type hints:

  • There should be no space between the parameter name and the :, and one space after the :.

  • There should be one space on each side of the = that precedes the default parameter value.

Tip

The PEP 8 style guide has different recommendations for default parameter values, depending on the use of type hints. Here is an actual example from section Other Recommendations in PEP 8:

def munge(input: AnyStr, sep: AnyStr = None, limit=1000):

If there is a type hint, there must be spaces around the = preceding the default value. If there is no type hint, there should be no spaces.

Using None as a default

In Example 8-3 the parameter plural is annotated as str, and the default value is '', so there is no type conflict.

I like that solution, but in other contexts None is a better default. If the optional parameter expects a mutable type, then None is the only sensible default—as we saw in “Mutable Types as Parameter Defaults: Bad Idea”.

To have None as the default for the plural parameter, here is how the signature would look:

from typing import Optional

def show_count(count: int, singular: str, plural: Optional[str] = None) -> str:

Let’s unpack that:

  • Optional[str] means plural may be a str or None.

  • You must explicitly provide the default value = None.

If you don’t assign a default value to plural, the Python runtime will treat it as a required parameter. Remember: at runtime, type hints are ignored.
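For instance, assuming the Optional-based signature above with the body of Example 8-3, a call that violates the annotations is not rejected by the interpreter; it simply runs (a hedged sketch):

>>> show_count('three', 'part')  # wrong type for count, yet no runtime error
'three parts'

A type checker, on the other hand, would flag that call for passing a str where an int is expected.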

Note that we need to import Optional from the typing module. When importing types, it’s good practice to use the syntax from typing import X, to reduce the length of the function signatures.

Warning

Optional is not a great name, because that annotation does not make the parameter optional. What makes it optional is assigning a default value to the parameter. Optional[str] just means: the type of this parameter may be str or NoneType. In the Haskell and Elm languages, a similar type is named Maybe.

Type hints for Python 2.7 and 3.x

Companies with large Python 2 codebases have learned that type hints are very helpful when migrating to Python 3. It’s possible to annotate code that will run on Python 2.7 and Python 3.x using special comments described in PEP 484.

This is how the final version of the show_count signature looks, using syntax that works in Python 2.7 and 3.x—also supported by Mypy and other type checkers:

from typing import Optional

def show_count(count, singular, plural=None):
    # type: (int, str, Optional[str]) -> str

Note that only the parameter types appear in the comment.

If the parameter list is too long, the signature may be annotated like this:

from typing import Optional

def show_count(count,       # type: int
               singular,    # type: str
               plural=None  # type: Optional[str]
               ):
    # type: (...) -> str

The last type comment would be exactly as shown: the ... replaces the parameter types already given, and the -> str defines the return type.

For more details, see Suggested syntax for Python 2.7 and straddling code in PEP 484.

Stub files

Another way of making type hints compatible with Python 2.7 and 3.x is to use stub files, which contain just annotated function and class declarations—much like header files in C and C++. Mypy, PyCharm, and other type checkers know how to read stub files, and they share the typeshed project, a collection of stub files for the Python standard library and popular external packages like Flask, attr, requests, etc. I will not cover how to create and manage stub files. If you are interested, see PEP 484 section Stub Files and PEP 561—Distributing and Packaging Type Information.

Now that we’ve had a first practical view of gradual typing, let’s consider what the concept of type means in practice.

Types are defined by supported operations

There are many definitions of the concept of type in the literature. Here we assume that type is a set of values and a set of functions that one can apply to these values.

PEP 483: The Theory of Type Hints

In practice, it’s more useful to consider the set of supported operations as the defining characteristic of a type.6

For example, from the point of view of applicable operations, what are the valid types for n in the following function?

def double(n):
    return n * 2

The n parameter type may be numeric (int, complex, Fraction, numpy.uint32 etc.) but it may also be a sequence (str, tuple, list, array), an N-dimensional numpy.array or any other type that implements or inherits a __mul__ method that accepts an int argument.
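For instance, here is a hedged sketch (the Booster class is hypothetical, not from the book) showing that any type implementing __mul__ with an int argument works with the untyped double:

from fractions import Fraction

def double(n):  # the same untyped double as above
    return n * 2

class Booster:  # hypothetical class whose __mul__ accepts an int
    def __init__(self, level: int):
        self.level = level

    def __mul__(self, factor: int) -> 'Booster':
        return Booster(self.level * factor)

print(double(Fraction(1, 3)))    # prints 2/3: numeric types work
print(double('A'))               # prints AA: sequences work
print(double(Booster(5)).level)  # prints 10: any type with a suitable __mul__ works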

However, consider this annotated double. Please ignore the missing return type for now, let’s focus on the parameter type:

from collections import abc

def double(n: abc.Sequence):
    return n * 2

A type checker will reject that code. If you tell Mypy that n is of type abc.Sequence, it will flag n * 2 as an error because the Sequence ABC does not implement or inherit the __mul__ method. At runtime, that code will work with concrete sequences such as str, tuple, list, array etc.—as well as numbers, because at runtime the type hints are ignored. But the type checker only cares about what is explicitly declared, and abc.Sequence has no __mul__.

That’s why the title of this section is “Types are defined by supported operations”. The Python runtime accepts any object as the n argument for both versions of the double function. The computation n * 2 may work, or it may raise TypeError if the operation is not supported by n. In contrast, Mypy will declare n * 2 wrong while analyzing the annotated double source code, because it’s an unsupported operation for the declared type: n: abc.Sequence.

In a gradual type system, we have the interplay of two different views of types:

Duck typing

The view adopted by Smalltalk—the pioneering OO language—as well as Python and Ruby. Objects have types, but variables (including parameters) are untyped. In practice, it doesn’t matter what the declared type of the object is, only what operations it actually supports. If I can invoke birdie.quack(), then birdie is a duck in this context. By definition, duck typing is only enforced at runtime, when operations on objects are attempted. This is more flexible than nominal typing, at the cost of allowing more errors at runtime.7

Nominal typing

The view adopted by C++, Java, and C#, supported by annotated Python. Objects and variables have types. But objects only exist at runtime, and the type checker only cares about the source code where variables (including parameters) are annotated with type hints. If Duck is a subclass of Bird, you can assign a Duck instance to a parameter annotated as birdie: Bird. But in the body of the function, the type checker considers the call birdie.quack() illegal, because birdie is nominally a Bird—even if at runtime it’s actually a Duck. Nominal typing is enforced statically, before the program is run. This is more rigid than duck typing, with the advantage of catching some bugs earlier in a build pipeline, or even as the code is typed in an IDE.

Here is a silly example that contrasts duck typing and nominal typing, as well as static type checking and runtime behavior8:

Example 8-4. birds.py
class Bird:
    pass

class Duck(Bird):  1
    def quack(self):
        print('Quack!')

def alert(birdie):  2
    birdie.quack()

def alert_duck(birdie: Duck) -> None:  3
    birdie.quack()

def alert_bird(birdie: Bird) -> None:  4
    birdie.quack()
1

Duck is a subclass of Bird.

2

alert has no type hints, so the type checker ignores it.

3

alert_duck takes one argument of type Duck.

4

alert_bird takes one argument of type Bird.

Type checking birds.py with Mypy, we see a problem:

/birds/ $ mypy birds.py
birds.py:16: error: "Bird" has no attribute "quack"
Found 1 error in 1 file (checked 1 source file)

Just by analyzing the source code, Mypy sees that alert_bird is problematic: the type hint declares the birdie parameter with type Bird, but the body of the function calls birdie.quack()—and the Bird class has no such method.

Now let’s try to use the birds module in daffy.py:

Example 8-5. daffy.py
from birds import *

daffy = Duck()
alert(daffy)       1
alert_duck(daffy)  2
alert_bird(daffy)  3
1

Valid call, because alert has no type hints.

2

Valid call, because alert_duck takes a Duck argument, and daffy is a Duck.

3

Valid call, because alert_bird takes a Bird argument, and daffy is also a Bird—the superclass of Duck.

Running Mypy on daffy.py raises the same error about the quack call in the alert_bird function defined in birds.py:

/birds/ $ mypy daffy.py
birds.py:16: error: "Bird" has no attribute "quack"
Found 1 error in 1 file (checked 1 source file)

But Mypy sees no problem with daffy.py itself: the three function calls are OK.

Now, if you run daffy.py, this is what you get:

…/birds/ $ python3 daffy.py
Quack!
Quack!
Quack!

Everything works! Duck typing FTW!

At runtime, Python doesn’t care about declared types. It uses duck typing only. Mypy flagged an error in alert_bird, but calling it with daffy works fine at runtime. This may surprise many Pythonistas at first: a static type checker will sometimes find errors in programs that we know will execute.

However, if months from now you are tasked with extending the silly bird example, you may be grateful for Mypy. Consider this woody.py module which also uses birds:

Example 8-6. woody.py
from birds import *

woody = Bird()
alert(woody)
alert_duck(woody)
alert_bird(woody)

Mypy finds two errors while checking woody.py:

/birds/ $ mypy woody.py
birds.py:16: error: "Bird" has no attribute "quack"
woody.py:5: error: Argument 1 to "alert_duck" has incompatible type "Bird"; expected "Duck"
Found 2 errors in 2 files (checked 1 source file)

The first error is in birds.py: the birdie.quack() call in alert_bird, which we’ve seen before. The second error is in woody.py: woody is an instance of Bird, so the call alert_duck(woody) is invalid because that function requires a Duck. Every Duck is a Bird, but not every Bird is a Duck.

At runtime, none of the calls in woody.py succeed. Given that woody is a Bird:

  • alert(woody) fails, and Mypy could not help us because there are no type hints in alert.

  • alert_duck(woody) fails, and Mypy saw the problem: Argument 1 to "alert_duck" has incompatible type "Bird"; expected "Duck".

  • alert_bird(woody) fails, and Mypy has been telling us since Example 8-4 that the body of the alert_bird function is wrong: "Bird" has no attribute "quack".

This little experiment shows that duck typing is easier to get started with and more flexible, but allows unsupported operations to cause errors at runtime. Nominal typing detects errors before runtime, but sometimes rejects code that actually runs—such as the call alert_bird(daffy) in Example 8-5. Even if it sometimes works, the alert_bird function is misnamed: its body does require an object that supports the .quack() method, which Bird doesn’t have.

In this silly example, the functions are one-liners. But in real code they could be longer, they could pass the birdie argument to more functions, and the origin of the birdie argument could be many frames away, making it hard to pinpoint the cause of a runtime error. The type checker prevents many such errors from ever happening at runtime.

Note

The value of type hints is questionable in the tiny examples that fit in a book. The benefits grow with the size of the codebase. That’s why companies with millions of lines of Python code—like Dropbox, Google, and Facebook—invested in teams and tools to support the company-wide adoption of type hints, and have significant and increasing portions of their Python codebases type checked in their CI pipelines.

In this section we explored the relationship of types and operations in duck typing and nominal typing, starting with the simple double() function—which we left without proper type hints. Now we will tour the most important types used for annotating functions. We’ll see a good way to add type hints to double() when we reach “Protocols”. But before we get to that, there are more fundamental types to know.

Types usable in annotations

Pretty much any Python type can be used in type hints, but there are restrictions and recommendations. In addition, the typing module introduced special constructs with semantics that are sometimes surprising.

This section covers all the major types you can use with annotations:

  • typing.Any;

  • Simple types and classes;

  • typing.Optional and typing.Union;

  • Generic collections, including tuples and mappings;

  • typing.TypedDict—for type hinting dicts used as records;

  • Abstract Base Classes—and a few you should not use;

  • Generic iterables;

  • Parameterized generics and the TypeVar class;

  • typing.Protocols—the key to static duck typing;

  • typing.Callable;

  • typing.NoReturn—a good way to end this list.

We’ll cover each of these in turn, starting with a type that is strange, apparently useless, but crucially important.

The Any type

The keystone of any gradual type system is the Any type, also known as the dynamic type. When a type checker sees an untyped function like this:

def double(n):
    return n * 2

It assumes this:

def double(n: Any) -> Any:
    return n * 2

That means the n argument and the return value can be of any type, including different types. Any is assumed to support every possible operation.

Contrast Any with object. Consider this signature:

def double(n: object) -> object:

This function also accepts arguments of every type, because every type is a subtype of object.

However, a type checker will reject this function:

def double(n: object) -> object:
    return n * 2

The problem is that object does not support the __mul__ operation. This is what Mypy reports:

/birds/ $ mypy double_object.py
double_object.py:2: error: Unsupported operand types for * ("object" and "int")
Found 1 error in 1 file (checked 1 source file)

More general types have narrower interfaces, i.e., they support fewer operations. The object class implements fewer operations than abc.Sequence, which implements fewer operations than abc.MutableSequence, which implements fewer operations than list.

But Any is a magic type that sits at the top and the bottom of the type hierarchy. It’s simultaneously the most general type—so that an argument n: Any accepts values of every type—and the most specialized type, supporting every possible operation. At least, that’s how the type checker understands Any.

Of course, no type can support every possible operation, so using Any prevents the type checker from fulfilling its core mission: detecting potentially illegal operations before your program crashes with a runtime exception.
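To make this concrete, here is a hedged sketch: once a value has the type Any, the checker accepts any operation on it, even one that will obviously fail at runtime.

from typing import Any

def double(n: Any) -> Any:
    return n * 2            # accepted: Any is assumed to support *

x = double('A')             # x is inferred as Any
x.nonexistent_method()      # also accepted by the type checker; fails at runtime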

is-subtype-of versus is-consistent-with

Traditional object-oriented nominal type systems rely on the is-subtype-of relationship. Given a class C1 and a subclass C2, then C2 is-subtype-of C1.

Consider this code:

class C1:
    ...

class C2(C1):
    ...

def f1(p: C1) -> None:
    ...

o2 = C2()

f1(o2)  # OK

The call f1(o2) is an application of the Liskov Substitution Principle—LSP. Barbara Liskov9 actually defined is-subtype-of in terms of supported operations: if an object of type C2 substitutes an object of type C1 and the program still behaves correctly, then C2 is-subtype-of C1.

Continuing from the previous code, this shows a violation of the LSP:

def f2(p: C2) -> None:
    ...

o1 = C1()

f2(o1)  # type error

From the point of view of supported operations, this makes perfect sense: as a subclass, C2 inherits and must support all operations that C1 does. So an instance of C2 can be used anywhere an instance of C1 is expected. But the reverse is not necessarily true: C2 may implement additional methods, so an instance of C1 may not be used everywhere an instance of C2 is expected. This focus on supported operations is reflected in the name behavioral subtyping, also used to refer to the LSP.

In a gradual type system, there is another relationship: is-consistent-with, which applies wherever is-subtype-of applies, except when the type Any is involved.

The rules for is-consistent-with are:

  1. Given T1 and a subtype T2, then T2 is-consistent-with T1 (Liskov substitution).

  2. Every type is-consistent-with Any: you can pass objects of every type to an argument declared of type Any.

  3. Any is-consistent-with every type: you can always pass an object of type Any where an argument of another type is expected.

Considering the previous definitions of the objects o1 and o2, here are examples of valid code, illustrating rules #2 and #3:

def f3(p: Any) -> None:
    ...

o0 = object()
o1 = C1()
o2 = C2()

f3(o0)  #
f3(o1)  #  all OK: rule #2
f3(o2)  #

def f4():  # implicit return type: `Any`
    ...

o4 = f4()  # inferred type: `Any`

f1(o4)  #
f2(o4)  #  all OK: rule #3
f3(o4)  #

Every gradual type system needs a wildcard type like Any. Now we can explore the rest of the types used in annotations.

Simple types and classes

Simple types like int, float, str, and bytes may be used directly in type hints. Concrete classes from the standard library, external packages, or user-defined classes—FrenchDeck, Vector2d, and Duck—may also be used in type hints.

Abstract Base Classes are also useful in type hints. We’ll get back to them as we study collection types, and in “Abstract Base Classes”.

Among classes, is-consistent-with is defined like is-subtype-of: a subclass is consistent with all its superclasses.

However, “practicality beats purity” so there is an important exception:

int is consistent with complex

There is no nominal subtype relationship between the built-in types int, float and complex: they are direct subclasses of object. But PEP 484 declares that int is-consistent-with float, and float is-consistent-with complex. It makes sense in practice: int implements all operations that float does, and int implements additional ones as well—bitwise operations like &, |, << etc. The end result is: int is-consistent-with complex. For i = 3, i.real is 3, and i.imag is 0.
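Here is a brief hedged sketch of the practical effect (the function names are mine, not from the book):

def half(x: float) -> float:
    return x / 2

def magnitude(z: complex) -> float:
    return abs(z)

half(3)       # OK for Mypy: int is-consistent-with float
magnitude(3)  # OK for Mypy: int is-consistent-with complex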

Optional and Union types

We saw the Optional special type in “Using None as a default”. It solves the problem of having None as a default, as in this example from that section:

from typing import Optional

def show_count(count: int, singular: str, plural: Optional[str] = None) -> str:

The construct Optional[str] is actually a shortcut for Union[str, None] which means the type of plural may be str or None.

The ord built-in function’s signature is a simple example of Union—it accepts str or bytes, and returns an int:10

def ord(c: Union[str, bytes]) -> int: ...
Tip

There are functions that accept a str or bytes argument but return str if the argument was str, or bytes if the argument was bytes. In those cases, the return type is determined by the input type, so Union is not an accurate solution. To properly annotate such functions, we need a type variable—presented in “Parameterized generics and TypeVar”—or overloading, which we’ll see in “Overloaded signatures”.

Here is an example of a function that takes a str, but may return a str or a float:

from typing import Union

def parse_token(token: str) -> Union[str, float]:
    try:
        return float(token)
    except ValueError:
        return token

If possible, avoid creating functions that return Union types, as they put an extra burden on the user—forcing them to check the type of the returned value at runtime to know what to do with it. But parse_token above is a reasonable use case in the context of a simple expression evaluator.
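To make that burden concrete, here is a hedged sketch of what a caller of parse_token has to do before using the result:

result = parse_token('3.1416')
if isinstance(result, float):  # narrow the Union before numeric operations...
    print(result * 2)
else:                          # ...or before str operations
    print(result.upper())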

Union[] requires at least two types. Nested Union types have the same effect as a flattened Union. So this type hint:

Union[A, B, Union[C, D, E]]

is the same as:

Union[A, B, C, D, E]

Union is more useful with types that are not consistent with each other. For example: Union[int, float] is rarely useful as a type hint, because int is-consistent-with float. Usually it’s better to use float to annotate the parameter, then it will accept int values as well.

Note

A new syntax for Union is under consideration: instead of Union[str, float], we could write str | float. See PEP 604 — Complementary syntax for Union[]. The status of PEP 604 is “Draft” as I write this. It was intended for Python 3.9, but missed the feature freeze for beta 1 and now is slated for 3.10.

Generic collections

Most Python collections are heterogeneous: for example, you can put any mixture of different types in a list. However, in practice that is not very useful: if you put objects in a list, you are likely to want to operate on them later, and usually this means they share at least one common method.11

With Python’s type hints, a collection can be annotated with a generic type to constrain the type of the elements in the collection. For example:

def tokenize(text: str) -> list[str]:
    return text.upper().split()

In Python 3.9, that means tokenize returns a list where all items are str.

However, list and the other built-in collections only support that notation in Python 3.7 and 3.8 with a __future__ import:

from __future__ import annotations

def tokenize(text: str) -> list[str]:
    return text.upper().split()

Sadly, that __future__ import does not work with Python 3.5 or 3.6, nor is it supported by Mypy 0.770—the version I am using as I write this.

So this is how to annotate tokenize in a way that works with Python ≥ 3.5 and Mypy in May 2020:

from typing import List

def tokenize(text: str) -> List[str]:
    return text.upper().split()

Note that you need to import the List type from the typing module.

To annotate a list that can hold any type of object, the type hint would be List[Any]. That’s the same as writing just List, or even list.

Besides List, the typing module defines dozens of types that are derived from existing standard library classes with the added feature of supporting generic type notation with []. Table 8-1 lists the collections that are not mappings.

Note

I’d rather use the built-in collections as generics, but I chose to import the generic collections from the typing module because when this book is released most readers will probably be using Python 3.8 or earlier.

Table 8-1. Collection types and their type hint equivalents (mapping types excluded)

collection                         type hint equivalent
list                               typing.List
set                                typing.Set
frozenset                          typing.FrozenSet
collections.deque                  typing.Deque
collections.abc.MutableSequence    typing.MutableSequence
collections.abc.Sequence           typing.Sequence
collections.abc.Collection         typing.Collection
collections.abc.Container          typing.Container
collections.abc.Set                typing.AbstractSet
collections.abc.MutableSet         typing.MutableSet

Python 3.9 implements PEP 585—Type Hinting Generics In Standard Collections, which means the collection types in the left column of Table 8-1 support the generic [], so you can write set[str] and don’t need to import typing.Set.

Warning

With the release of 3.9, the type hint equivalents in the right column of Table 8-1 become redundant and are deprecated. Type checkers are expected to warn about this deprecation. To minimize runtime impact, Python itself will issue no warnings, and the type hint equivalents will only be removed from the typing module five years after Python 3.9 is released.

As of May 2020, there is no good way to annotate array.array taking into account the typecode constructor argument which determines whether integers or floats are stored in the array.12

Tuple

There are three ways to annotate tuple types.

The first is using typing.Tuple and specifying the type of each field in the tuple. For example, to accept a tuple with city name, population and country—('Shanghai', 24.28, 'China')—the type hint would be Tuple[str, float, str].

Consider a function that takes a pair of geographic coordinates and returns a Geohash, used like this:

>>> shanghai = 31.2304, 121.4737
>>> geohash(shanghai)
'wtw3sjq6q'

This is how geohash is defined, using the geolib package from PyPI:

Example 8-7. coordinates.py with the geohash function.
from typing import Tuple

from geolib import geohash as gh  # type: ignore

PRECISION = 9

def geohash(lat_lon: Tuple[float, float]) -> str:
    return gh.encode(*lat_lon, PRECISION)

The second way is using typing.NamedTuple—as seen in Chapter 5. Here is a variation of Example 8-7 with NamedTuple:

Example 8-8. coordinates_named.py with the NamedTuple Coordinates and the geohash function.
from typing import Tuple, NamedTuple

from geolib import geohash as gh  # type: ignore

PRECISION = 9

class Coordinate(NamedTuple):
    lat: float
    lon: float

def geohash(lat_lon: Coordinate) -> str:
    return gh.encode(*lat_lon, PRECISION)

def display(lat_lon: Tuple[float, float]) -> str:
    lat, lon = lat_lon
    ns = 'N' if lat >= 0 else 'S'
    ew = 'E' if lon >= 0 else 'W'
    return f'{abs(lat):0.1f}°{ns}, {abs(lon):0.1f}°{ew}'

As explained in “Overview of data class builders”, typing.NamedTuple is a factory for tuple subclasses, so Coordinate is-consistent-with Tuple[float, float], but the reverse is not true—after all, Coordinate has extra methods added by NamedTuple, like ._asdict(), and could also have user-defined methods.
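Here is a brief hedged sketch of that one-way consistency, assuming the definitions in Example 8-8:

shanghai = 31.2304, 121.4737            # inferred as Tuple[float, float]
display(Coordinate(31.2304, 121.4737))  # OK: Coordinate is-consistent-with Tuple[float, float]
geohash(shanghai)                       # flagged by Mypy: Coordinate expected, plain tuple given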

The third way is for annotating tuples of unspecified length that are used as immutable lists: you must specify a single type, followed by a comma and ... (that’s Python’s ellipsis token, made of three periods, not Unicode U+2026—HORIZONTAL ELLIPSIS).

For example, the type for a tuple with int elements is Tuple[int, ...].

The ellipsis indicates that any number of elements >= 1 is acceptable. There is no way to specify fields of different types for tuples of unspecified length.

Here is a columnize function that transforms a sequence into a table of rows and cells, in the form of a list of tuples of unspecified length. This is useful to display items in columns, like this:

>>> animals = 'drake fawn heron ibex koala lynx tahr xerus yak zapus'.split()
>>> table = columnize(animals)
>>> table
[('drake', 'koala', 'yak'), ('fawn', 'lynx', 'zapus'), ('heron', 'tahr'),
 ('ibex', 'xerus')]
>>> for row in table:
...     print(''.join(f'{word:10}' for word in row))
...
drake     koala     yak
fawn      lynx      zapus
heron     tahr
ibex      xerus

Example 8-9 shows the implementation of columnize. Note the return type, List[Tuple[str, ...]].

Example 8-9. columnize.py returns a list of tuples of strings.
from typing import Sequence, List, Tuple

def columnize(sequence: Sequence[str], num_columns: int = 0) -> List[Tuple[str, ...]]:
    if num_columns == 0:
        num_columns = round(len(sequence) ** .5)
    num_rows, remainder = divmod(len(sequence), num_columns)
    num_rows += bool(remainder)
    return [tuple(sequence[i::num_rows]) for i in range(num_rows)]

The annotations Tuple[Any, ...], Tuple, and tuple mean the same thing.

In codebases supporting only Python 3.9 or later, the recommended signature for Example 8-9 is:

from collections.abc import Sequence

def columnize(sequence: Sequence[str],
              num_columns: int = 0) -> list[tuple[str, ...]]:

Note there is no typing import. The list and tuple built-ins are used, as well as collections.abc.Sequence instead of typing.Sequence.

Warning

PEP 585—Type Hinting Generics In Standard Collections also affects the use of typing.Tuple, which is deprecated in Python 3.9 and will be removed five years after that release.

Generic mappings

Generic mapping types are annotated as MappingType[KeyType, ValueType].

For example, a JSON object must have string keys, but the values can be anything, so this would be written as Dict[str, Any].
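For example, a function that parses a JSON object could be annotated like this (a hedged sketch; the load_record name is mine, not from the book):

import json
from typing import Any, Dict

def load_record(text: str) -> Dict[str, Any]:
    # json.loads returns Any; assuming the input is a JSON object,
    # we declare the result as a dict with str keys and arbitrary values.
    return json.loads(text)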

Example 8-10 shows a practical use of a function returning an inverted index to search Unicode characters by name—a variation of Example 4-21 more suitable for server-side code that we’ll study in [Link to Come].

Given starting and ending Unicode character codes, name_index returns a Dict[str, Set[str]] which is an inverted index mapping each word to a set of characters that have that word in their names. For example, after indexing ASCII characters from 32 to 64, here are the sets of characters mapped to the words 'SIGN' and 'DIGIT', and how to find the character named 'DIGIT EIGHT':

>>> index = name_index(32, 65)
>>> index['SIGN']
{'$', '>', '=', '+', '<', '%', '#'}
>>> index['DIGIT']
{'8', '5', '6', '2', '3', '0', '1', '4', '7', '9'}
>>> index['DIGIT'] & index['EIGHT']
{'8'}

Below is the source code for charindex.py with the name_index function. Besides a Dict[] type hint, this example has three features appearing for the first time in the book.

Example 8-10. charindex.py
import sys
import re
import unicodedata
from typing import Dict, Set, Iterator

RE_WORD = re.compile(r'\w+')
STOP_CODE = sys.maxunicode + 1

def tokenize(text: str) -> Iterator[str]:  1
    """return iterable of uppercased words"""
    for match in RE_WORD.finditer(text):
        yield match.group().upper()

def name_index(start: int = 32, end: int = STOP_CODE) -> Dict[str, Set[str]]:
    index: Dict[str, Set[str]] = {}  2
    for char in (chr(i) for i in range(start, end)):
        if name := unicodedata.name(char, ''):  3
            for word in tokenize(name):
                index.setdefault(word, set()).add(char)
    return index
1

tokenize is a generator function. [Link to Come] is about generators.

2

The local variable index is annotated. Without the hint, Mypy says: error: Need type annotation for 'index' (hint: "index: Dict[<type>, <type>] = ..."). We’ll come back to variable type hints in [Link to Come].

3

I used the walrus operator := in the if condition. It assigns the result of the unicodedata.name() call to name, and the whole expression evaluates to that result. When the result is '', that’s falsy and the index is not updated.13

The typing module also defines several mapping types and view types, listed in Table 8-2.

Table 8-2. Mapping types, views, and their type hint equivalents

collection                         type hint equivalent
dict                               typing.Dict
collections.defaultdict            typing.DefaultDict
collections.OrderedDict            typing.OrderedDict
collections.Counter                typing.Counter
collections.ChainMap               typing.ChainMap
collections.abc.Mapping            typing.Mapping
collections.abc.MutableMapping     typing.MutableMapping
collections.abc.MappingView        typing.MappingView
collections.abc.KeysView           typing.KeysView
collections.abc.ItemsView          typing.ItemsView
collections.abc.ValuesView         typing.ValuesView

Warning

With the release of Python 3.9, the collection types in the left column of Table 8-2 support generic notation [] and the type hint equivalents on the right are deprecated and will be removed five years later.

To instantiate an annotated defaultdict, a variable type hint is required for Python 3.8 and earlier—because typing.DefaultDict cannot be called to construct an object, and collections.defaultdict does not accept the generic syntax. Here is an example:

my_dict: DefaultDict[str, List[int]] = defaultdict(list)

TypedDict

Note

This is a rather long section for one specific type. However, exploring TypedDict with Mypy illustrates important points about gradual typing in Python. In particular, it’s tempting to use TypedDict to protect against errors while handling dynamic data structures like JSON API responses. But the examples here make clear that correct handling of JSON is essentially a matter of runtime validation, not static type checking.

The mapping types we just saw limit all values to have the same type.

However, Python dictionaries are often used as records, with field names as keys. For example, consider a record describing a book in JSON or Python:

{"isbn": "0134757599",
 "title": "Refactoring, 2e",
 "authors": ["Martin Fowler", "Kent Beck"],
 "pagecount": 478}

Before Python 3.8, there was no good way to annotate a record like that. Here are two possibilities, both unsatisfactory:

Dict[str, Any]

The values may be of any type.

Dict[str, Union[str, int, List[str]]]

Hard to read, and doesn’t preserve the relationship between field names and field types: title is supposed to be a str; it can’t be an int or a List[str].

Python 3.8 solved that problem by implementing PEP 589—TypedDict: Type Hints for Dictionaries with a Fixed Set of Keys.

Here is a simple TypedDict:

Example 8-11. books.py: the BookDict definition.
from typing import TypedDict, List
import json

class BookDict(TypedDict):
    isbn: str
    title: str
    authors: List[str]
    pagecount: int

At first glance, typing.TypedDict may seem like a data class builder, similar to the @dataclass decorator or typing.NamedTuple—both covered in Chapter 5.

The syntactic similarity is misleading. TypedDict is very different. It exists only for the benefit of type checkers, and has no runtime effect.

TypedDict provides two things:

  1. Class-like syntax to annotate a dict with type hints for the value of each “field”.

  2. A constructor that tells the type checker to expect a dict with the keys and values as specified.

At runtime, a TypedDict constructor such as BookDict is a placebo: it has the same effect as calling the dict constructor with the same arguments.

The fact that BookDict creates a plain dict also means that:

  • The “fields” in the pseudo-class definition don’t create instance attributes.

  • You can’t write initializers with default values for the “fields”.

  • Method definitions are not allowed.

Let’s explore the behavior of a BookDict at runtime.

Example 8-12. Using a BookDict, but not quite as intended.
>>> from books import BookDict
>>> pp = BookDict(title='Programming Pearls',  1
...               authors='Jon Bentley',  2
...               isbn='0201657880',
...               pagecount=256)
>>> pp  3
{'title': 'Programming Pearls', 'authors': 'Jon Bentley', 'isbn': '0201657880',
 'pagecount': 256}
>>> type(pp)
<class 'dict'>
>>> pp.title  4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'dict' object has no attribute 'title'
>>> pp['title']
'Programming Pearls'
>>> BookDict.__annotations__  5
{'isbn': <class 'str'>, 'title': <class 'str'>, 'authors': typing.List[str],
 'pagecount': <class 'int'>}
1

You can call BookDict like a dict constructor with keyword arguments, or passing a dict argument—including a dict literal.

2

Oops… I forgot that authors takes a list. But gradual typing means no type checking at runtime.

3

The result of calling BookDict is a plain dict…

4

… therefore you can’t read the data using object.field notation.

5

To get the type hints at runtime, read BookDict.__annotations__.

Without a type checker, TypedDict is as useful as comments: it may help people read the code, but that’s it. In contrast, the class builders from Chapter 5 are useful even if you don’t use a type checker because at runtime they generate or enhance a custom class that you can instantiate. They also provide several useful methods or functions listed in Table 5-1.

Example 8-13 builds a valid BookDict and tries some operations on it. This shows how TypedDict enables Mypy to catch errors, shown in Example 8-14.

Example 8-13. demo_books.py: legal and illegal operations on a BookDict.
from books import BookDict
from typing import TYPE_CHECKING

def demo() -> None:  1
    book = BookDict(  2
        isbn='0134757599',
        title='Refactoring, 2e',
        authors=['Martin Fowler', 'Kent Beck'],
        pagecount=478
    )
    authors = book['authors'] 3
    if TYPE_CHECKING:  4
        reveal_type(authors)  5
    authors = 'Bob'  6
    book['weight'] = 4.2
    del book['title']


if __name__ == '__main__':
    demo()
1

Remember to add a return type, so that Mypy doesn’t ignore the function.

2

This is a valid BookDict: all the keys are present, with values of the correct types.

3

Mypy will infer the type of authors from the annotation for the 'authors' key in BookDict.

4

typing.TYPE_CHECKING is only True when the program is being type checked. At runtime, it’s always False.

5

The previous if statement prevents reveal_type(authors) from being called at runtime. reveal_type is not a runtime Python function, but a debugging facility provided by Mypy. That’s why there is no import for it. See its output in Example 8-14.

6

The last three lines of the demo function are illegal. They will cause the error messages in Example 8-14.

Type checking demo_books.py from Example 8-13, this is what we get:

Example 8-14. Type checking demo_books.py.
/typedict/ $ mypy demo_books.py
demo_books.py:13: note: Revealed type is 'builtins.list[builtins.str]'  1
demo_books.py:14: error: Incompatible types in assignment
                  (expression has type "str", variable has type "List[str]")  2
demo_books.py:15: error: TypedDict "BookDict" has no key 'weight'  3
demo_books.py:16: error: Key 'title' of TypedDict "BookDict" cannot be deleted  4
Found 3 errors in 1 file (checked 1 source file)
1

This note is the result of reveal_type(authors).

2

The type of the authors variable was inferred from the type of the book['authors'] expression that initialized it. You can’t assign a str to a variable of type List[str]. Type checkers usually don’t allow the type of a variable to change.14

3

Cannot assign to a key that is not part of the BookDict definition.

4

Cannot delete a key that is part of the BookDict definition.

Now let’s see BookDict used in function signatures, to type check function calls.

Imagine you need to generate XML from book records, similar to this:

<BOOK>
  <ISBN>0134757599</ISBN>
  <TITLE>Refactoring, 2e</TITLE>
  <AUTHOR>Martin Fowler</AUTHOR>
  <AUTHOR>Kent Beck</AUTHOR>
  <PAGECOUNT>478</PAGECOUNT>
</BOOK>

If you were writing MicroPython code to embed in a tiny microcontroller, you might write a function like this:15

Example 8-15. books.py: to_xml function.
AUTHOR_EL = '<AUTHOR>{}</AUTHOR>'

def to_xml(book: BookDict) -> str:  1
    elements: List[str] = []  2
    for key, value in book.items():
        if isinstance(value, list):  3
            elements.extend(
                AUTHOR_EL.format(n) for n in value)  4
        else:
            tag = key.upper()
            elements.append(f'<{tag}>{value}</{tag}>')
    xml = '\n\t'.join(elements)
    return f'<BOOK>\n\t{xml}\n</BOOK>'
1

The whole point of the example: using BookDict in the function signature.

2

It’s often necessary to annotate collections that start empty; otherwise Mypy can’t infer the type of the elements.16

3

Mypy understands isinstance checks, and treats value as a list in this block.

4

When I used key == 'authors' as the condition for the if guarding this block, Mypy found an error in this line: "object" has no attribute "__iter__", because it inferred the type of value returned from book.items() as object, which doesn’t support the __iter__ method required by the generator expression. With the isinstance check, this works because Mypy knows that value is a list in this block.

Here is a function that parses a JSON str and returns a BookDict:

Example 8-16. books_any.py: from_json function.
def from_json(data: str) -> BookDict:
    whatever = json.loads(data)  1
    return whatever  2
1

The return type of json.loads() is Any.17

2

I can return whatever—of type Any—because Any is consistent with every type, including the declared return type, BookDict.

The second point of Example 8-16 is very important to keep in mind: Mypy will not flag any problem in this code, but at runtime the value in whatever may not conform to the BookDict structure—in fact, it may not be a dict at all!

If you run Mypy with --disallow-any-expr it will complain about the two lines in the body of from_json:

/typedict/ $ mypy books_any.py --disallow-any-expr
books.py:30: error: Expression has type "Any"
books.py:31: error: Expression has type "Any"
Found 2 errors in 1 file (checked 1 source file)

In this case, the type error can be silenced by adding a type hint to the initialization of the whatever variable, as in Example 8-17:

Example 8-17. books.py: from_json function with variable annotation.
def from_json(data: str) -> BookDict:
    whatever: BookDict = json.loads(data)  1
    return whatever  2
1

--disallow-any-expr does not cause errors when an expression of type Any is immediately assigned to a variable with a type hint.

2

Now whatever is of type BookDict, the declared return type.

Warning

Don’t be lulled into a false sense of type safety by Example 8-17! Looking only at the source code, the type checker cannot predict that json.loads() will return anything that resembles a BookDict. Only runtime validation can guarantee that.

Static type checking is unable to prevent errors with code that is inherently dynamic, such as json.loads(), which builds Python objects of different types at runtime. Example 8-18, Example 8-19, and Example 8-20 demonstrate this.

Example 8-18. demo_not_book.py: from_json returns an invalid BookDict, and to_xml accepts it.
from books import to_xml, from_json
from typing import TYPE_CHECKING

def demo() -> None:
    NOT_BOOK_JSON = """
        {"title": "Andromeda Strain",
         "flavor": "pistachio",
         "authors": true}
    """
    not_book = from_json(NOT_BOOK_JSON)  1
    if TYPE_CHECKING:  2
        reveal_type(not_book)
        reveal_type(not_book['authors'])

    print(not_book)  3
    print(not_book['flavor'])  4

    xml = to_xml(not_book)  5
    print(xml)  6


if __name__ == '__main__':
    demo()
1

This line does not produce a valid BookDict—see the content of NOT_BOOK_JSON.

2

Let’s have Mypy reveal a couple of types.

3

This should not be a problem, as print can handle object and every subtype.

4

BookDict has no 'flavor' key, but the JSON source does… what will happen?

5

Remember the signature: def to_xml(book: BookDict) -> str:

6

What will the XML output look like?

Checking demo_not_book.py with Mypy:

Example 8-19. Mypy report for demo_not_book.py, reformatted for clarity.
/typedict/ $ mypy demo_not_book.py
demo_not_book.py:12: note: Revealed type is
   'TypedDict('books.BookDict', {'isbn': builtins.str,
                                 'title': builtins.str,
                                 'authors': builtins.list[builtins.str],
                                 'pagecount': builtins.int})'  1
demo_not_book.py:13: note: Revealed type is 'builtins.list[builtins.str]'  2
demo_not_book.py:16: error: TypedDict "BookDict" has no key 'flavor'  3
Found 1 error in 1 file (checked 1 source file)
1

The revealed type is the nominal type, not the runtime content of not_book.

2

Again, this is the nominal type of not_book['authors'], as defined in BookDict. Not the runtime type.

3

This error is for line print(not_book['flavor']): that key does not exist in the nominal type.

Now let’s run demo_not_book.py.

Example 8-20. Output of running demo_not_book.py.
…/typedict/ $ python3 demo_not_book.py
{'title': 'Andromeda Strain', 'flavor': 'pistachio', 'authors': True}  1
pistachio  2
<BOOK>  3
        <TITLE>Andromeda Strain</TITLE>
        <FLAVOR>pistachio</FLAVOR>
        <AUTHORS>True</AUTHORS>
</BOOK>
1

This is not really a BookDict.

2

The value of not_book['flavor'].

3

to_xml takes a BookDict argument, but at runtime it’s more flexible: garbage in, garbage out.

Example 8-20 shows that demo_not_book.py outputs nonsense, but has no runtime errors. Using a TypedDict while handling JSON data did not provide much type safety.

If you look at the code for to_xml in Example 8-15 through the lens of duck typing, the argument book must provide an .items() method that returns an iterable of tuples like (key, value) where:

  • key must have an .upper() method;

  • value can be anything.

The point of this demonstration: when handling data with a dynamic structure, such as JSON or XML, TypedDict is absolutely not a replacement for data validation at runtime.

Tip

If you are interested in runtime validation of JSON schemas with type annotations, check out the pydantic package on PyPI.
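Without adding a dependency, a hand-rolled runtime check is also possible. Here is a minimal, hypothetical sketch for the shape of BookDict from Example 8-11 (the is_book_dict helper is mine, not from the book):

def is_book_dict(data: object) -> bool:
    """Check at runtime that data has the keys and value types of BookDict."""
    if not isinstance(data, dict):
        return False
    authors = data.get('authors')
    return (
        isinstance(data.get('isbn'), str)
        and isinstance(data.get('title'), str)
        and isinstance(authors, list)
        and all(isinstance(name, str) for name in authors)
        and isinstance(data.get('pagecount'), int)
    )

A stricter from_json could call such a check and raise an exception instead of returning garbage.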

TypedDict has more features, including support for optional keys, a limited form of inheritance, and an alternative declaration syntax for Python versions before 3.6. These will be covered in [Link to Come], section [Link to Come].

Abstract Base Classes

Be conservative in what you send, be liberal in what you accept.

Postel’s law, a.k.a. the Robustness Principle

Table 8-1 and Table 8-2 list several abstract classes from collections.abc. Ideally, a function should accept arguments of those abstract types—or their type hint equivalents before Python 3.9—and not concrete types. This gives more flexibility to the caller.

Consider this function signature:

def name2hex(name: str, color_map: Mapping[str, int]) -> str:

Using typing.Mapping allows the caller to provide an instance of dict, defaultdict, ChainMap, a UserDict subclass or any other type that is a subtype of Mapping.

In contrast, consider this signature:

def name2hex(name: str, color_map: Dict[str, int]) -> str:

Now color_map must be a dict or one of its subtypes, such as DefaultDict or OrderedDict. In particular, a subclass of collections.UserDict would not pass the type check for color_map, despite being the recommended way to create user-defined mappings, as we saw in “Subclassing UserDict”. Mypy would reject a UserDict or an instance of a class derived from it, because UserDict is not a subclass of dict; they are siblings: both are subclasses of abc.MutableMapping.18

Therefore, in general it’s better to use typing.Mapping or typing.MutableMapping instead of dict or typing.Dict as a parameter type. If the name2hex function doesn’t need to mutate the given color_map, the most accurate type hint for color_map is typing.Mapping. That way, the caller doesn’t need to provide an object that implements methods like setdefault, pop and update which are part of the MutableMapping interface, but not of Mapping. This has to do with the second part of Postel’s law: “be liberal in what you accept”.
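Here is a hedged sketch of a body for name2hex, just to make the signature concrete (the hex formatting convention is my assumption):

from typing import Mapping

def name2hex(name: str, color_map: Mapping[str, int]) -> str:
    # Only read access is needed, so Mapping is an accurate, liberal hint.
    return f'#{color_map[name]:06x}'

print(name2hex('teal', {'teal': 0x008080}))  # prints #008080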

Postel’s law also tells us to be conservative in what we send. The return value of a function is always a concrete object, so the return type hint should be a concrete type, as in the example from “Generic collections”, which uses list[str] assuming the code will run on Python 3.9 or later; otherwise the typing equivalent List[str] should be used.

def tokenize(text: str) -> list[str]:
    return text.upper().split()

Under the entry of typing.List, the Python documentation says:

Generic version of list. Useful for annotating return types. To annotate arguments it is preferred to use an abstract collection type such as Sequence or Iterable.

A similar comment appears in the entries for typing.Dict and typing.Set.

Remember that most ABCs from collections.abc and other concrete classes from collections, as well as built-in collections, support generic type hint notation like collections.deque[str] starting with Python 3.9. The corresponding typing collections will only be needed to support code written in Python 3.8 or earlier. The full list of classes that became generic appears in section Implementation of PEP 585—Type Hinting Generics In Standard Collections.

To wrap up our discussion of ABCs in type hints, we need to talk about Numbers.

Stay away from the Numeric Tower

The numbers module is a little-known corner of the standard library, present since Python 2.6. It defines a hierarchy of ABCs with Number at the top, then Complex, Real, Rational, and Integral. Those ABCs allow isinstance checks independent of specific implementations. For example, isinstance(x, numbers.Real) is True for x of type float, but also for NumPy types like float32, longdouble, etc. PEP 484 section The Numeric Tower rejects the numbers ABCs and handles the built-in types complex, float, and int as special cases, as explained in “int is consistent with complex”. Mypy does not support the use of the numbers ABCs in type hints.19
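Here is a hedged sketch of the practical consequence: annotate with the built-in numeric types and rely on the special-casing described in “int is consistent with complex”.

def average(a: float, b: float) -> float:
    # float instead of numbers.Real: the numbers ABCs are not usefully
    # supported by Mypy, and int arguments are still accepted because
    # int is-consistent-with float.
    return (a + b) / 2

average(1, 2)      # OK for Mypy and at runtime
average(1.5, 2.5)  # OK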

Iterable

The typing.List documentation I just quoted recommends Sequence and Iterable for function parameter type hints.

One example of an Iterable parameter appears in the math.fsum function from the standard library:

def fsum(__seq: Iterable[float]) -> float:
Tip

As of Python 3.8, the standard library has very few annotations but the Typeshed project has stub files for it. The signature for math.fsum is in /stdlib/2and3/math.pyi. The leading underscores in __seq are a PEP 484 convention explained in “Annotating positional-only and variadic parameters”.

Example 8-21 is another example using an Iterable parameter that produces items that are Tuple[str, str]. Here is how the function is used:

>>> l33t = [('a', '4'), ('e', '3'), ('i', '1'), ('o', '0')]
>>> text = 'mad skilled noob powned leet'
>>> from replacer import zip_replace
>>> zip_replace(text, l33t)
'm4d sk1ll3d n00b p0wn3d l33t'

And here is how it’s implemented:

Example 8-21. replacer.py
from typing import Iterable, Tuple

FromTo = Tuple[str, str]  1

def zip_replace(text: str, changes: Iterable[FromTo]) -> str:  2
    for from_, to in changes:
        text = text.replace(from_, to)
    return text
1

FromTo is a type alias: I assigned Tuple[str, str] to FromTo, to make the signature of zip_replace more readable.

2

changes needs to be an Iterable[FromTo]; that’s the same as Iterable[Tuple[str, str]], but shorter and easier to read.

A type alias

PEP 613—Explicit Type Aliases introduces a special type, TypeAlias, to make the assignments that create type aliases more visible and easier to typecheck. PEP 613 is approved but was not implemented in Python 3.9 before the feature freeze in May 2020. When TypeAlias lands in the typing module, we’ll use it like this:

from typing import TypeAlias, Tuple

FromTo: TypeAlias = Tuple[str, str]

Iterable versus Sequence

Both math.fsum and replacer.zip_replace must iterate over the entire Iterable arguments to return a result. Given an endless iterable such as the itertools.cycle generator as input, these functions would consume all memory and crash the Python process. Despite this potential danger, it is fairly common in modern Python to offer functions that accept an Iterable input even if they must process it completely to return a result. That gives the caller the option of providing input data as a generator instead of a pre-built sequence, potentially saving a lot of memory if the number of input items is large.
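
For instance, because changes is declared as Iterable[FromTo], a caller of zip_replace may pass a generator expression instead of a pre-built list; a small sketch:

from replacer import zip_replace

pairs = [('a', '4'), ('e', '3')]
changes = ((from_, to) for from_, to in pairs)  # lazy Iterable[Tuple[str, str]]
print(zip_replace('leet speak', changes))  # -> 'l33t sp34k'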

On the other hand, the columnize function from Example 8-9 needs a Sequence parameter, and not an Iterable, because it must get the len() of the input to decide the number of rows.

Like Sequence, Iterable is best used as a parameter type. It’s too vague as a return type. A function should be more precise about the concrete type it returns.

Closely related to Iterable is the Iterator type, used as a return type in Example 8-10. We’ll get back to it in [Link to Come] which is about generators and classic iterators.

Parameterized generics and TypeVar

A parameterized generic is a generic type, written as List[T], where T is a type variable that will be bound to a specific type with each usage. This allows a parameter type to be reflected in the return type.

Example 8-22 defines sample, a function that takes two arguments: a Sequence of elements of type T, and an int. It returns a List of elements of the same type T, picked at random from the first argument.

Here are two examples that illustrate the behavior of sample (a console sketch after Example 8-22 demonstrates both):

  1. If called with a tuple of type Tuple[int, ...]—which is-consistent-with Sequence[int]—then the type parameter is int, so the return type is List[int];

  2. If called with a str—which is-consistent-with Sequence[str]—then the type parameter is str, so the return type is List[str].

This is the implementation:

Example 8-22. sample.py
from random import shuffle
from typing import Sequence, List, TypeVar

T = TypeVar('T')

def sample(population: Sequence[T], size: int) -> List[T]:
    if size < 1:
        raise ValueError('size must be >= 1')
    result = list(population)
    shuffle(result)
    return result[:size]
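
Here is a quick console sketch of the two cases listed above; the items returned will vary, because sample shuffles the population:

>>> from sample import sample
>>> sample((3, 1, 4, 1, 5, 9), 2)  # Tuple[int, ...] in, so Mypy infers List[int]
[9, 4]
>>> sample('gradual', 3)           # str in, so Mypy infers List[str]
['r', 'a', 'l']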

Why is TypeVar needed?

Python’s parser doesn’t recognize the parameterized generic type notation as a special case, so the name T in the example must be introduced in the local namespace by calling the typing.TypeVar constructor. You may have studied other languages, such as Java, C#, or TypeScript, which have supported parameterized generics since their inception. Those languages don’t require the symbol for a type variable to be declared beforehand, so they have no equivalent of Python’s TypeVar class.

Another example is the statistics.mode function from the standard library, which returns the most common data point from a series.

Here is one usage example from the documentation:

>>> mode([1, 1, 2, 3, 3, 3, 3, 4])
3

Without using a TypeVar, mode could have this signature:

Example 8-23. mode_float.py: mode that operates on float and subtypes.20
from collections import Counter
from typing import Iterable

def mode(data: Iterable[float]) -> float:
    pairs = Counter(data).most_common(1)
    if len(pairs) == 0:
        raise ValueError('no mode for empty data')
    return pairs[0][0]

Many uses of mode involve int or float values, but Python has other numerical types, and it is desirable that the return type follows the element type of the given Iterable. We can improve on that with a TypeVar. Let’s start with a simple but wrong parameterized signature:

from typing import Iterable, TypeVar

T = TypeVar('T')

def mode(data: Iterable[T]) -> T:

When it first appears in the signature, the type parameter T can be any type. The second time it appears, it will mean the same type as the first.

Therefore, every iterable is-consistent-with Iterable[T], including iterables of unhashable types that collections.Counter cannot handle. We need to restrict the possible types assigned to T. We’ll see two ways of doing that in the next two sections.
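
Before that, here is a quick sketch of the problem: Mypy accepts this call, even though it fails at runtime:

mode([[1, 2], [1, 2], [3]])  # Mypy accepts: T is bound to List[int]...
# ...but at runtime Counter raises TypeError: unhashable type: 'list'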

TypeVar with constraints

TypeVar accepts extra positional arguments to constrain the type parameter. So the signature can be improved like this, to accept more number types:

from typing import Iterable, TypeVar
from decimal import Decimal
from fractions import Fraction

NumberT = TypeVar('NumberT', float, Decimal, Fraction)

def mode(data: Iterable[NumberT]) -> NumberT:

That’s better than before, and it was the signature for mode in the statistics.pyi stub file on typeshed on May 25, 2020.
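
With the constrained NumberT, Mypy rejects element types outside that list; roughly like this sketch (the exact error wording may differ):

mode([Fraction(1, 3), Fraction(1, 3), Fraction(2, 3)])  # OK: NumberT is Fraction
mode(['red', 'blue', 'blue'])
# error: Value of type variable "NumberT" of "mode" cannot be "str"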

However, the statistics.mode documentation includes this example:

>>> mode(["red", "blue", "blue", "red", "green", "red", "red"])
'red'

In a hurry, we could just add str to the NumberT definition:

NumberT = TypeVar('NumberT', float, Decimal, Fraction, str)

That certainly works, but NumberT is badly misnamed if it accepts str. More importantly, we can’t keep listing types forever as we realize mode can deal with them. We can do better with another feature of TypeVar, introduced next.

Bounded TypeVar

Looking at the body of mode in Example 8-23, we see that the Counter class is used for ranking. Counter is based on dict, therefore the element type of the data iterable must be hashable.

At first, this signature may seem to work:

from typing import Iterable, Hashable

def mode(data: Iterable[Hashable]) -> Hashable:

Now the problem is that the type of the returned item is Hashable: an ABC that implements only the __hash__ method. So the type checker will not let us do anything with the return value except call hash() on it. Not very useful.

The solution is another optional parameter of TypeVar: the bound keyword parameter. It sets an upper bound for the acceptable types. In Example 8-24, we have bound=Hashable, which means the type parameter may be Hashable or any subtype of it.21

Example 8-24. mode_hashable.py: same as Example 8-23, with a more flexible signature.
from collections import Counter
from typing import Iterable, Hashable, TypeVar

HashableT = TypeVar('HashableT', bound=Hashable)

def mode(data: Iterable[HashableT]) -> HashableT:
    pairs = Counter(data).most_common(1)
    if len(pairs) == 0:
        raise ValueError('no mode for empty data')
    return pairs[0][0]
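
With the bounded HashableT, both the numeric example and the str example from the statistics.mode documentation pass the type check, and each call returns the element type it was given:

>>> from mode_hashable import mode
>>> mode([1, 1, 2, 3, 3, 3, 3, 4])  # return type inferred as int
3
>>> mode(['red', 'blue', 'blue', 'red', 'green', 'red', 'red'])
'red'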

The typing.TypeVar constructor has other optional parameters—covariant and contravariant—that we’ll cover in [Link to Come], [Link to Come]. Let’s conclude this introduction to TypeVar with AnyStr.

The AnyStr predefined type variable

The typing module includes a predefined TypeVar named AnyStr. It’s defined like this:

AnyStr = TypeVar('AnyStr', bytes, str)

AnyStr is used in many functions that accept either bytes or str, and return values of the given type.
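
Here is a minimal sketch with a hypothetical double_it function: whichever of the two types goes in, the same type comes out.

from typing import AnyStr

def double_it(value: AnyStr) -> AnyStr:
    return value + value

double_it('abc')   # return type is str
double_it(b'abc')  # return type is bytes
double_it(123)     # Mypy rejects this: int is neither bytes nor str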

Now, on to typing.Protocol, a new feature of Python 3.8 that can support more Pythonic use of type hints.

Protocols

Note

In Object-Oriented programming, the concept of a “protocol” as an informal interface is as old as Smalltalk, and is an essential part of Python from the beginning. However, in the context of type hints, a protocol is a typing.Protocol subclass defining an interface that a type checker can verify. Both kinds of protocols are covered in [Link to Come]. This is just a brief introduction in the context of function annotations.

The Protocol type as presented in PEP 544—Protocols: Structural subtyping (static duck typing) is similar to interfaces in Go: a protocol type is defined by specifying one or more methods, and the type checker verifies that those methods are implemented where that protocol type is required.

In Python, a protocol definition is written as a typing.Protocol subclass. However, classes that implement a protocol don’t need to inherit, register or declare any relationship with the class that defines the protocol. It’s up to the type checker to find the available protocol types and enforce their usage.

Here is a problem that can be solved with the help of Protocol and TypeVar. Suppose you want to create a function top(it, n) that returns the largest n elements of the iterable it:

>>> top([4, 1, 5, 2, 6, 7, 3], 3)
[7, 6, 5]
>>> l = 'mango pear apple kiwi banana'.split()
>>> top(l, 3)
['pear', 'mango', 'kiwi']
>>>
>>> l2 = [(len(s), s) for s in l]
>>> l2
[(5, 'mango'), (4, 'pear'), (5, 'apple'), (4, 'kiwi'), (6, 'banana')]
>>> top(l2, 3)
[(6, 'banana'), (5, 'mango'), (5, 'apple')]

A parameterized generic top would look like this:

Example 8-25. top function with an undefined T type parameter.
def top(series: Iterable[T], length: int) -> List[T]:
    ordered = sorted(series, reverse=True)
    return ordered[:length]

The problem is how to constrain T. It cannot be Any or object, because the elements of series must work with sorted. The sorted built-in actually accepts Iterable[Any], but that’s because the optional argument key takes a function that computes an arbitrary sort key from each element. What happens if you don’t provide key and give sorted a list of plain objects? Let’s try that:

>>> l = [object() for _ in range(4)]
>>> l
[<object object at 0x10fc2fca0>, <object object at 0x10fc2fbb0>,
<object object at 0x10fc2fbc0>, <object object at 0x10fc2fbd0>]
>>> sorted(l)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '<' not supported between instances of 'object' and 'object'

That’s interesting: sorted needs the < operator on the elements of the iterable.

Is this all it takes? Let’s do another quick experiment:

>>> class Spam:
...     def __init__(self, n): self.n = n
...     def __lt__(self, other): return self.n < other.n
...     def __repr__(self): return f'Spam({self.n})'
...
>>> l = [Spam(n) for n in range(5, 0, -1)]
>>> l
[Spam(5), Spam(4), Spam(3), Spam(2), Spam(1)]
>>> sorted(l)
[Spam(1), Spam(2), Spam(3), Spam(4), Spam(5)]

That confirms it: I can sort a list of Spam because Spam implements __lt__—the special method that supports the < operator.22

So the T type parameter in Example 8-25 should be limited to types that implement __lt__. In Example 8-24 we needed a type parameter that implemented __hash__, so we were able to use typing.Hashable as the upper bound for the type parameter. But now there is no suitable type in typing or abc to use, so we need to create it.

Here is the new Comparable type, a Protocol:

Example 8-26. comparable.py: definition of a Comparable Protocol type:
from typing import Protocol, Any

class Comparable(Protocol):  1
    def __lt__(self, other: Any) -> bool: ...  2
1

A protocol is a subclass of typing.Protocol.

2

The body of the protocol has one or more method definitions, with ... in their bodies.

A type T is-consistent-with a protocol P if T implements all the methods defined in P, with matching type signatures.

Given Comparable, we can now define this working version of top:

Example 8-27. top.py: definition of the top function using a TypeVar with bound=Comparable:
from typing import TypeVar, Iterable, List
from comparable import Comparable

CT = TypeVar('CT', bound=Comparable)

def top(series: Iterable[CT], length: int) -> List[CT]:
    ordered = sorted(series, reverse=True)
    return ordered[:length]

Let’s test-drive top. Example 8-28 shows part of a test suite for use with pytest. It tries calling top with a generator expression that yields Tuple[int, str] items, and then with a list of object instances. With the list of object, we expect to get a TypeError exception.

Example 8-28. top_test.py: partial listing of a pytest test suite
from typing import Iterator, Tuple, TYPE_CHECKING

import pytest

from top import top

def test_top_tuples() -> None:
    fruit = 'mango pear apple kiwi banana'.split()
    series: Iterator[Tuple[int, str]] = (
        (len(s), s) for s in fruit)
    length = 3
    expected = [(6, 'banana'), (5, 'mango'), (5, 'apple')]
    result = top(series, length)
    if TYPE_CHECKING:
        reveal_type(series)
        reveal_type(expected)
        reveal_type(result)
    assert result == expected

def test_top_objects_error() -> None:
    series = [object() for _ in range(4)]
    if TYPE_CHECKING:
        reveal_type(series)
    with pytest.raises(TypeError) as exc:
        top(series, 3)
    assert "'<' not supported" in str(exc)

The above tests pass—but they would pass anyway, even without type hints in top.py. More to the point, if I check that test file with Mypy, this is what I get:

/comparable/ $ mypy top_test.py
top_test.py:27: note: Revealed type is 'typing.Iterator[Tuple[builtins.int, builtins.str]]'
top_test.py:28: note: Revealed type is 'builtins.list[Tuple[builtins.int, builtins.str]]'
top_test.py:29: note: Revealed type is 'builtins.list[Tuple[builtins.int, builtins.str]]'
top_test.py:35: note: Revealed type is 'builtins.list[builtins.object*]'
top_test.py:37: error: Value of type variable "CT" of "top" cannot be "object"
Found 1 error in 1 file (checked 1 source file)

The type check shows that the TypeVar is working as intended:

  • in test_top_tuples, reveal_type confirms that the type returned by the top call is what we expected: given an Iterator[Tuple[int, str]], we got List[Tuple[int, str]];

  • in test_top_objects_error, reveal_type shows the series argument type is List[object];

  • Mypy flags the error: the element type of the series Iterable cannot be object.

A key advantage of a protocol type over ABCs is that a type needs no nominal connection with a specific protocol type to be consistent with it. I don’t need to derive or register str, tuple, float, set, etc. with Comparable to be able to use them where a Comparable parameter is expected. They only need to implement __lt__. And the type checker will still be able to do its job, because Comparable is explicitly defined as a Protocol—in contrast with the implicit protocols that are common with duck typing, which are invisible to the type checker.

The special Protocol class was introduced in PEP 544—Protocols: Structural subtyping (static duck typing). Example 8-27 demonstrates why this feature is known as static duck typing: the solution to annotate the series parameter of top was to say “The nominal type of series doesn’t matter, as long as it implements __lt__“. Python’s duck typing always allowed us to say that implicitly, but the job of type checkers was much harder. A type checker can’t read CPython’s source code in C, or perform console experiments to find out that sorted only requires that the elements support <.

Now we are able to express this in code that the type checker can read. That’s why it makes sense to say that typing.Protocol gives us static duck typing.23

There’s more to see about typing.Protocol, but we’ll leave that to Part IV, where [Link to Come] contrasts structural typing, duck typing, and ABCs—another approach to formalizing “classic” protocols.

Callable

To annotate callback parameters or function objects returned by higher-order functions, the typing module provides the Callable type, which is parameterized like this:

Callable[[ParamType1, ParamType2], ReturnType]

The parameter list—[ParamType1, ParamType2]—can have 0 or more types.

Here is an example in context:

def repl(input_fn: Callable[[Any], str] = input) -> None:

The repl function is part of a simple interactive interpreter.24

During normal usage, the repl function uses Python’s input built-in to read expressions from the user. However, for automated testing or for integration with other input means, repl accepts an optional input_fn parameter: a Callable with the same parameter and return types as input.

The built-in input() has this signature on typeshed:

def input(__prompt: Any = ...) -> str: ...

That function is-consistent-with this Callable type hint:

Callable[[Any], str]

As another example, in Chapter 10, the Order.__init__ method in Example 10-3 uses this signature:

class Order:  # the Context

    def __init__(
        self,  1
        customer: Customer,
        cart: Sequence[LineItem],
        promotion: Optional[Callable[['Order'], float]] = None,  2
    ) -> None:  3
1

self rarely needs a type hint.25

2

promotion may be None, or Callable[[Order], float]: a function that takes an Order and returns float.

3

__init__ always returns None, but I recommend adding the return type hint for it anyway.26

Note that the Order type appears as the string 'Order' in the Callable type hint, otherwise Python would raise NameError: name 'Order' is not defined—because the Order class is not defined until Python reads the whole body of the class—an issue we’ll discuss in [Link to Come]: Class Metaprogramming.

Tip

PEP 563—Postponed Evaluation of Annotations adds support for forward references in annotations, avoiding the need to write Order as string in the previous example. However, that feature is only enabled when from __future__ import annotations is used at the top of the module, to avoid breaking code that does weird things in the annotations. See a summary of PEP 563 in What’s New In Python 3.7.

There is no syntax to annotate optional or keyword arguments in Callable[]. The documentation says “such function types are rarely used as callback types”. If you need a type hint to match a function with a dynamic signature, replace the whole parameter list with ..., like this: Callable[..., ReturnType].
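
Here is a sketch of that escape hatch, with a hypothetical run_callback function:

from typing import Callable

def run_callback(callback: Callable[..., str]) -> str:
    # The ... tells the type checker not to verify the parameters;
    # only the return type of callback is checked.
    return callback()

run_callback(input)  # accepted: input() returns str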

NoReturn

This is a special type used only to annotate the return type of functions that never return. Usually, they exist to raise exceptions. There are dozens of such functions in the standard library.

For example: sys.exit() raises SystemExit, to terminate the Python process.

Its signature in typeshed is:

def exit(__status: object = ...) -> NoReturn: ...

The __status parameter is positional-only, and it has a default value. Stub files don’t spell out the default values: they use ... instead. The type of __status is object, which means it may also be None, therefore it would be redundant to mark it Optional[object].
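
Writing your own never-returning function is simple enough; here is a hypothetical sketch:

from typing import NoReturn

def fatal(msg: str) -> NoReturn:
    # This function never returns normally: it always raises.
    raise SystemExit(f'fatal error: {msg}')

After a call to fatal, a type checker such as Mypy can treat the remaining code in that branch as unreachable.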

This ends our overview of the major groups of types used in type hints.

Overloaded signatures

When the return type of a function depends on the type of one parameter, using a TypeVar can be enough. But sometimes the return type depends on the type of more than one parameter. The solution then is to use the @typing.overload decorator.

Consider the sum built-in function. This is help(sum) from the console:

sum(iterable, /, start=0)
    Return the sum of a 'start' value (default: 0) plus an iterable of numbers

    When the iterable is empty, return the start value.
    This function is intended specifically for use with numeric values and may
    reject non-numeric types.

On typeshed, sum is annotated like this, in stdlib/2and3/builtins.pyi:

@overload
def sum(__iterable: Iterable[_T]) -> Union[_T, int]: ...
@overload
def sum(__iterable: Iterable[_T], start: _S) -> Union[_T, _S]: ...

First let’s look at the overall structure of the code with overloads. In a stub file (.pyi), that’s all there would be about sum—the implementation is elsewhere, and may even be written in C.

You can also use @overload in a regular Python module, by writing the overloaded signatures right before the function’s actual signature and implementation. Example 8-30 shows how sum would appear annotated and implemented in a Python module.

Example 8-30. mysum.py: definition of the sum function with overloaded signatures:
from functools import reduce  1
from operator import add
from typing import overload, Iterable, Union, TypeVar

T = TypeVar('T')
S = TypeVar('S')  2

@overload
def sum(it: Iterable[T]) -> Union[T, int]: ...  3
@overload
def sum(it: Iterable[T], /, start: S) -> Union[T, S]: ...  4
def sum(it, /, start=0):  5
    return reduce(add, it, start)
1

I’m lazy, so I’ll use functools.reduce and operator.add to implement sum.

2

We need this second, different TypeVar, as we’ll see in the second overload.

3

This signature is for the simple case: sum(my_iterable). The result type may be T—the type of the elements that my_iterable yields—or it may be int if the iterable is empty, because the default value of the start parameter is 0.

4

When start is given, it can be of any type S, so the result type is Union[T, S]. This is why we need S. If I reused T for the type of start, then it would have to be the same type as the elements of Iterable[T], and this is not what we want.

5

The signature of the actual function implementation has no type hints.

That’s seven lines to annotate a one-line function. Probably overkill, I know. At least it wasn’t a foo function. If you want to learn about @overload by reading code, typeshed has hundreds of examples.

As it turns out, the handy APIs we call Pythonic are often hard to annotate. On typeshed, the stub file for Python’s built-in functions has 186 overloads as I write this—more than any other in the standard library.27

Take advantage of gradual typing

Aiming for 100% of annotated code may lead to type hints that add lots of noise but little value. Annotation obsession can also lead to bloated, unpleasant APIs. Sometimes it’s better to be pragmatic and leave a piece of code without type hints.

This wraps up our coverage of Python’s gradual type system for now. [Link to Come] covers type hints in class definitions, as well as other concepts such as variance, type erasure, and type casting.

The last sections in this epic chapter are about positional and variadic parameters, and the function attributes where type hints are stored at runtime.

Annotating positional-only and variadic parameters

Recall the tag function from Example 7-10. The last time we saw its signature was in section “Positional-only parameters”:

def tag(name, /, *content, class_=None, **attrs):

Here is tag, fully annotated, written in several lines—a common convention for long signatures, with line breaks the way the Black formatter would do it:

from typing import Optional

def tag(
    name: str,
    /,
    *content: str,
    class_: Optional[str] = None,
    **attrs: str,
) -> str:

Note the type hint *content: str for the arbitrary positional parameters: this means all those arguments must be of type str. The type of content in the function body will be Tuple[str, ...].

The type hint for the arbitrary keyword arguments is **attrs: str. In this example, the type of attrs will be Dict[str, str]. For a type hint like **settings: float, the type of settings would be Dict[str, float].
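
A short sketch with a hypothetical settings_report function shows how those types appear inside the body:

def settings_report(*flags: str, **settings: float) -> str:
    # In the body, flags is Tuple[str, ...] and settings is Dict[str, float].
    lines = [f'flag: {flag}' for flag in flags]
    lines += [f'{name} = {value}' for name, value in settings.items()]
    return '\n'.join(lines)

settings_report('verbose', 'debug', timeout=2.5, retries=3)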

The / notation for positional-only parameters is only available since Python 3.8. In Python 3.7 or earlier, that’s a syntax error. The PEP 484 convention is to prefix each positional-only parameter name with two underscores. Here is the tag signature again, now in two lines, using the PEP 484 convention:

from typing import Optional

def tag(__name: str, *content: str, class_: Optional[str] = None,
        **attrs: str) -> str:

Mypy understands and enforces both ways of declaring positional-only parameters.

Reading annotations at runtime

Warning

When PEP 3107 introduced the function annotation syntax and the __annotations__ attribute, the community was encouraged to experiment with them. Now the experimentation phase is over. Any use of annotations that is not compatible with PEP 484 is officially deprecated since PEP 563—Postponed Evaluation of Annotations was accepted for Python 3.7. See section Non-typing usage of annotations in PEP 563.

At runtime, as a module is loaded, Python reads the type hints in functions, classes and modules and stores them in attributes named __annotations__.

For example, Example 8-31 shows an annotated version of the clip signature from Example 7-15.

Example 8-31. Annotated clip function
def clip(text: str, max_len: int = 80) -> str:

No processing is done with the annotations at runtime. They are merely stored as a dict in the __annotations__ attribute of the function:

>>> from clip_annot import clip
>>> clip.__annotations__
{'text': <class 'str'>, 'max_len': <class 'int'>, 'return': <class 'str'>}

The item with key 'return' holds the return value annotation marked with -> in the function declaration in Example 8-31.

As far as I know, the only part of the Python standard library that uses function annotations for any purpose is the @functools.singledispatch decorator, covered in “Single Dispatch Generic Functions” in the next chapter. Class-level __annotations__ are extensively used in dataclasses and typing.NamedTuple, as we saw in Chapter 5. But those packages don’t deal with function or method annotations at all.

The inspect.signature() function knows how to extract the annotations, as Example 8-32 shows.

Example 8-32. Extracting annotations from the function signature
>>> from clip_annot import clip
>>> from inspect import signature
>>> sig = signature(clip)
>>> sig.return_annotation
<class 'str'>
>>> for param in sig.parameters.values():
...     note = repr(param.annotation).ljust(13)
...     print(note, ':', param.name, '=', param.default)
<class 'str'> : text = <class 'inspect._empty'>
<class 'int'> : max_len = 80

The signature function returns a Signature object, which has a return_annotation attribute and a parameters dictionary mapping parameter names to Parameter objects. Each Parameter object has its own annotation attribute. That’s how Example 8-32 works.

FastAPI, a modern Web framework, supports annotations to automate request processing. For example, a price argument annotated as price: float is automatically converted from a string in the request to the float expected by the function.

Chapter summary

PEP 484 is the biggest change in the history of Python since the unification of types and classes in Python 2.2, which happened in 2001.

Although its use is optional in theory, in some contexts it is becoming mandatory.

We started with a brief introduction to the concept of gradual typing and then switched to a hands-on approach. It’s hard to see how gradual typing works without a tool that actually reads the type hints, so we developed an annotated function guided by Mypy error reports. That section ended with another practical matter: how to annotate code that must run under Python 2.7 and 3.x.

Back to the theory of gradual typing, we explored how it is a hybrid of Python’s traditional duck typing and the nominal typing more familiar to users of Java, C++, and other statically typed languages.

Most of the chapter was devoted to presenting the major groups of types used in annotations. Many of the types we covered are related to familiar Python object types, such as collections, tuples, and callables—extended to support generic notation like Sequence[float]. Many of those types are temporary surrogates implemented in the typing module before the standard types were changed to support generics in Python 3.9.

Some of the types are special entities. Any, Optional, Union, and NoReturn have nothing to do with actual objects in memory, but exist only in the abstract domain of the type system.

We studied parameterized generics and type variables, which bring more flexibility to type hints without sacrificing type safety.

Parameterized generics become even more expressive with the use of Protocol. Because it appeared only in Python 3.8, Protocol is not widely used yet—but it is hugely important. Protocol enables static duck typing: the essential bridge between Python’s duck typed core and the nominal typing that allows type checkers to catch bugs.

While covering some of these types we experimented with Mypy to see type checking errors and inferred types with the help of Mypy’s magic reveal_type() function.

The next couple of sections covered overloaded function signatures and how to annotate positional-only and variadic parameters.

Finally, we saw how type hints can be found at runtime in the __annotations__ attribute of functions. That’s just one of a rich set of attributes that can be read with the help of the inspect module, which includes the Signature.bind method to apply the flexible rules that Python uses to bind actual arguments to declared parameters.

The unification of types and classes in 2001 was a major change that benefited every Python user. It made the language more powerful and easier to use. Type hints mostly benefit one category of users: professional software developers. That is certainly a very important category, but Python’s greatest strength is the diversity of its user base. Students, journalists, makers, artists, traders, activists, and researchers in every field are some of the user groups for whom the added complexity of the type system may not bring enough value to be justified.

Fortunately, type hints are an optional feature. Let us keep Python accessible to the widest user base and stop preaching that all Python code should have type hints—as I’ve seen in public sermons by typing evangelists.

Bernat Gabor wrote in his excellent post The state of type hints in Python:

Type hints should be used whenever unit tests are worth writing.

I am a big fan of testing, but I also do a lot of exploratory coding. When I am exploring, tests and type hints are not helpful. They are a drag.

Our BDFL emeritus led this push towards type hints in Python, so it’s only fair that this chapter starts and ends with his words:

I wouldn’t like a version of Python where I was morally obligated to add type hints all the time. I really do think that type hints have their place but there are also plenty of times that it’s not worth it, and it’s so wonderful that you can choose to use them.28

Guido van Rossum

Further Reading

The best introductions to Python’s type hints that I found were Bernat Gabor’s The state of type hints in Python—which I just quoted—and Geir Arne Hjelle’s Python Type Checking (Guide). Hypermodern Python Chapter 4: Typing by Claudio Jolowicz is a shorter introduction that also covers runtime type checking.

For deeper coverage, the Mypy documentation is the best source. It is valuable regardless of the type checker you are using, because it has tutorial and reference pages about Python typing in general—not just about the Mypy tool itself. There you will also find super useful cheat sheets for type hints in Python 3 and Python 2.

The typing module documentation is a good quick reference, but it doesn’t go into much detail. The ultimate references are the PEP documents related to typing. There are 17 of them as of May 2020. PEPs are written by and for core developers, so they are usually not light reading and assume a lot of prior knowledge from the reader. Table 8-3 lists the typing PEPs in chronological order, with links on the titles.

Awesome Python Typing is a good collection of links to tools and references.

PEP 3107—Function Annotations and PEP 563—Postponed Evaluation of Annotations cover all about the __annotations__ attributes. PEP 362—Function Signature Object is worth reading if you intend to use the inspect module that implements that feature.

Table 8-3. PEPs about typing. Those marked with * are important enough to be mentioned in the opening paragraph of typing documentation. PEPs 604 and 612 targeted Python 3.9 but weren’t approved in time for the beta 1 feature freeze.
PEP  | Title                                                            | Python | Year
484* | Type Hints                                                       | 3.5    | 2014
483* | The Theory of Type Hints                                         | n/a    | 2014
482  | Literature Overview for Type Hints                               | n/a    | 2015
526* | Syntax for Variable Annotations                                  | 3.6    | 2016
544* | Protocols: Structural subtyping (static duck typing)             | 3.8    | 2017
557  | Data Classes                                                     | 3.7    | 2017
560  | Core support for typing module and generic types                 | 3.7    | 2017
561  | Distributing and Packaging Type Information                      | 3.7    | 2017
563  | Postponed Evaluation of Annotations                              | 3.7    | 2017
586* | Literal Types                                                    | 3.8    | 2018
585  | Type Hinting Generics In Standard Collections                    | 3.9    | 2019
589* | TypedDict: Type Hints for Dictionaries with a Fixed Set of Keys  | 3.8    | 2019
591* | Adding a final qualifier to typing                               | 3.8    | 2019
593  | Flexible function and variable annotations                       | 3.9    | 2019
604  | Complementary syntax for Union[] (draft)                         | 3.9    | 2019
612  | Parameter Specification Variables (draft)                        | 3.9    | 2019
613  | Explicit Type Aliases                                            | 3.10   | 2020

1 From YouTube video of A Language Creators’ Conversation: Guido van Rossum, James Gosling, Larry Wall & Anders Hejlsberg, streamed live on April 2, 2019. Quote starts at 1:32:05, edited for brevity. Full transcript available at https://github.com/standupdev/language-creators.

2 For example, recursive types are not supported in Python as of May 2020—see typing module issue #182 Define a JSON type and Mypy issue #731 Support recursive types

3 A just-in-time compiler like the one in PyPy has much better data than type hints: it monitors the Python program as it runs, and determines the actual types used at runtime, generating optimized machine code on the fly.

4 I am using Mypy 0.770, the most recent release as I write this on April 28, 2020. The Mypy Introduction warns it “is officially beta software. There will be occasional changes that break backward compatibility.” Therefore, you may get different results than I did.

5 The pytest I’m using is version 5.4.1.

6 Python doesn’t provide syntax to control the set of possible values for a type—except in Enum types. For example, using type hints you can’t define Quantity as an integer between 1 and 1000, or AirportCode as a 3-letter combination. NumPy offers uint8, int16 and other machine-oriented numeric types, but in the Python standard library we only have types with very small sets of values (NoneType, bool) or extremely large sets (float, int, str, all possible tuples etc.).

7 Duck typing is a weaker form of structural typing, which Python 3.8 also supports with the introduction of typing.Protocol. This is covered later in this chapter—in “Protocols”—with more details in [Link to Come].

8 Sorry about the silly example. Inheritance is often overused and hard to justify in examples that are realistic yet simple, so please accept this animal example as a quick illustration of subtyping.

9 MIT Professor, programming language designer, and Turing Award recipient. Wikipedia: Barbara Liskov.

10 To be more precise, ord only accepts str or bytes with len(s) == 1. But the type system currently can’t express this constraint.

11 In ABC—the language that most influenced Python in its roots—each list was constrained to accept values of a single type: the type of the first item you put into it.

12 An even deeper problem is how to typecheck integer ranges to prevent OverflowError at runtime when adding elements to arrays. For example, an array with typecode='B' can only hold int values from 0 to 255. Currently, Python’s static type system is not up to this challenge.

13 I will use := when it makes sense in examples, but I don’t cover it in the book. Please see PEP 572—Assignment Expressions for all the gory details.

14 As of May 2020, Pytype allows it. But its FAQ says it will be disallowed in the future. See question “Why didn’t pytype catch that I changed the type of an annotated variable?” in the Pytype FAQ.

15 I prefer to use the lxml package to generate and parse XML: it’s easy to get started, full-featured, and fast. Three friends of mine contributed to it: Martijn Faassen, Paul Everitt, and Sidnei da Silva. Unfortunately, lxml and Python’s own ElementTree don’t fit the limited RAM of my hypothetical microcontroller.

16 The Mypy documentation discusses this in its Common issues and solutions page, section Types of empty collections.

17 It’s hard to give a more precise return type hint for json.loads(). Brett Cannon, Guido van Rossum, and others have been discussing this since 2016 in Mypy issue #182: Define a JSON type.

18 Actually, dict is a virtual subclass of abc.MutableMapping. The concept of a virtual subclass is explained in [Link to Come]. For now, know that issubclass(dict, abc.MutableMapping) is True, despite the fact that dict is implemented in C and does not inherit anything from abc.MutableMapping, but only from object.

19 See Mypy issue int is not a Number?

20 The implementation here is simpler than the one in the Python standard library statistics module.

21 I contributed this solution to typeshed, and that’s how mode is annotated on statistics.pyi as of May 26, 2020.

22 How wonderful it is to open an interactive console and rely on duck typing to explore language features like I just did. I badly miss this kind of exploration when I learn new languages that don’t support it.

23 I don’t know who invented the term static duck typing, but it became more popular with the success of the Go language, which has a feature very similar to typing.Protocol. In Go, they call them “interfaces”, but they are much closer to Python’s protocols than to Java’s interfaces.

24 REPL stands for read-eval-print-loop, the common code pattern in interactive interpreters.

25 We’ll see cases where self is annotated in [Link to Come], [Link to Come].

26 As a special case for __init__, if at least one parameter has a type hint, Mypy does not complain about the missing return type by default. But if you forget this rule and __init__ is completely untyped, then it will not be type checked.

27 In the “Soapbox” I discuss the downside of typing, exemplified by the max function with 6 overloads.

28 From YouTube video of Type Hints by Guido van Rossum (March 2015), Quote starts at 13’40”. I did some light editing for clarity.
