12
Decorator Design Techniques

Python offers us many ways to create higher-order functions. In Chapter 5, Higher-Order Functions, we looked at two techniques: defining a function that accepts a function as an argument, and defining a subclass of Callable, which is either initialized with a function or called with a function as an argument.

One of the benefits of decorating functions is that it can create composite functions. These are single functions that embody functionality from several sources. It’s often helpful to have the decoration syntax as a way to express complex processing.

We can also use decorators to identify classes or functions, often building a registry—a collection of related definitions. We may not necessarily create a composite function when building a registry.

In this chapter, we’ll look at the following topics:

  • Using a decorator to build a function based on another function

  • The wraps() function in the functools module; this can help us build decorators

  • The update_wrapper() function, which may be helpful in the rare case when we want access to the original function as well as the wrapped function

12.1 Decorators as higher-order functions

The core idea of a decorator is to transform some original function into a new function. Used like this, a decorator creates a composite function based on the decorator and the original function being decorated.

A decorator can be used in one of the two following ways:

  • As a prefix that creates a new function with the same name as the base function, as follows:

    @decorator 
    def base_function() -> None: 
        pass
  • As an explicit operation that returns a new function, possibly with a new name:

    def base_function() -> None: 
        pass 
     
    base_function = decorator(base_function)

These are two different syntaxes for the same operation. The prefix notation has the advantages of being tidy and succinct. The prefix location is also more visible to some readers. The suffix notation is explicit and slightly more flexible.

While the prefix notation is common, there is one reason for using the suffix notation: we may not want the resulting function to replace the original function. We may want to execute the following command, which allows us to use both the decorated and the undecorated functions:

new_function = decorator(base_function)

This will build a new function, named new_function(), from the original function. When using the @decorator syntax, the original function is no longer available for use. Indeed, once the name is reassigned to a new function object, the original object may have no remaining references, and the memory it once occupied may be eligible for reclamation.
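As a minimal sketch of keeping both versions available, consider the following. The doubled() decorator here is a hypothetical example, not one of this chapter's decorators; it only serves to show that the suffix notation can bind the result to a new name.

```python
from collections.abc import Callable
from functools import wraps

def doubled(function: Callable[[float], float]) -> Callable[[float], float]:
    """Hypothetical decorator: the wrapper doubles the base function's result."""
    @wraps(function)
    def wrapper(value: float) -> float:
        return 2 * function(value)
    return wrapper

def base_function(x: float) -> float:
    return x + 1

# Suffix notation with a new name: base_function keeps its original binding.
new_function = doubled(base_function)

print(base_function(3))  # 4
print(new_function(3))   # 8
```

Both names remain usable because the assignment did not rebind base_function.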

A decorator is a function that accepts a function as an argument and returns a function as the result. Because functions are first-class objects in Python, this capability is a built-in feature of the language. Superficially, it may seem like we can update or adjust the internal code structure of a function.

Python doesn’t work by adjusting the internals of a function. Rather than messing about with the byte codes, Python uses a cleaner approach of defining a new function that wraps the original function. It’s easier to process the argument values or the result and leave the original function’s core processing alone.

We have two phases of higher-order functions involved in defining a decorator; they are as follows:

  • At definition time, a decorator function applies a wrapper to a base function and returns the new, wrapped function. The decoration process can do some one-time-only evaluation as part of building the decorated function. Complex default values can be computed, for example.

  • At evaluation time, the wrapping function can (and usually does) evaluate the base function. The wrapping function can pre-process the argument values or post-process the return value (or do both). It’s also possible that the wrapping function may avoid calling the base function. In the case of managing a cache, for example, the primary reason for wrapping is to avoid expensive calls to the base function.
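The two phases can be made visible with a small sketch. The traced() decorator and the events list are hypothetical names introduced only for this illustration; the list records when each part of the decorator runs.

```python
from collections.abc import Callable
from functools import wraps

events: list[str] = []  # hypothetical log, used only to show the ordering

def traced(function: Callable[[int], int]) -> Callable[[int], int]:
    # Definition time: this runs once, when the decorator is applied.
    events.append(f"decorating {function.__name__}")

    @wraps(function)
    def wrapper(value: int) -> int:
        # Evaluation time: this runs on every call of the decorated function.
        events.append(f"calling with {value}")
        return function(value)
    return wrapper

@traced
def square(n: int) -> int:
    return n * n

square(2)
square(3)
print(events)
# ['decorating square', 'calling with 2', 'calling with 3']
```

The "decorating" entry appears exactly once, no matter how many times square() is later evaluated.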

Here’s an example of a decorator:

from collections.abc import Callable 
from functools import wraps 
 
def nullable(function: Callable[[float], float]) -> Callable[[float | None], float | None]: 
    @wraps(function) 
    def null_wrapper(value: float | None) -> float | None: 
        return None if value is None else function(value) 
    return null_wrapper

We almost always want to use the @wraps decorator when creating our own decorators to ensure that the decorated function retains the attributes of the original function. Copying the __name__ and __doc__ attributes, for example, ensures that the resulting decorated function has the name and docstring of the original function.
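A short comparison suggests what is lost without @wraps. The two decorators below, plain_deco() and wraps_deco(), are hypothetical and differ only in whether the wrapper is decorated with @wraps.

```python
from collections.abc import Callable
from functools import wraps

def plain_deco(function: Callable[[float], float]) -> Callable[[float], float]:
    # No @wraps: the wrapper keeps its own name and has no docstring.
    def wrapper(value: float) -> float:
        return function(value)
    return wrapper

def wraps_deco(function: Callable[[float], float]) -> Callable[[float], float]:
    @wraps(function)  # copies __name__, __doc__, and related attributes
    def wrapper(value: float) -> float:
        return function(value)
    return wrapper

@plain_deco
def half_plain(x: float) -> float:
    """Return half of x."""
    return x / 2

@wraps_deco
def half_wrapped(x: float) -> float:
    """Return half of x."""
    return x / 2

print(half_plain.__name__, half_plain.__doc__)      # wrapper None
print(half_wrapped.__name__, half_wrapped.__doc__)  # half_wrapped Return half of x.
```

Tools such as help() and doctest rely on these attributes, which is one reason to keep them intact.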

The resulting composite function, defined as the null_wrapper() function in the definition of the decorator, is also a type of higher-order function that combines the original function, the function callable object, in an expression that preserves the None values. Within the resulting null_wrapper() function, the original function callable object is not an explicit argument; it is a free variable that will get its value from the context in which the null_wrapper() function is defined.

The @nullable decorator’s return value is the newly minted function. It will be assigned to the original function’s name. It’s important that decorators only return functions and they don’t attempt to process data. Decorators use meta-programming: code that creates more code. The resulting null_wrapper() function is the function intended to process the application’s data.

The typing module makes it particularly easy to describe the types of a null-aware function and its null-aware result, using the Optional type definitions or the | type operator. The definition float | None or Optional[float] means Union[float, None]; either a None object or a float object matches the type hint’s description.

As an example, we’ll assume we have a scaling function that converts input data from nautical miles to statute miles. This might be used with geolocation data that did calculations in nautical miles. The essential conversion from nautical miles, n, to statute miles, s, is a multiplication: s = 1.15078 × n.

We can apply our @nullable decorator to create a composite function as follows:

@nullable 
def st_miles(nm: float) -> float: 
    return 1.15078 * nm

This will create a function, st_miles(), which is a null-aware version of a small mathematical operation. The decoration process returns a version of the null_wrapper() function that invokes the original st_miles() function. This result will be named st_miles() and will have the composite behavior of both the wrapper and the original base function.

We can use this composite st_miles() function as follows:

>>> some_data = [8.7, 86.9, None, 43.4, 60] 
>>> scaled = map(st_miles, some_data) 
>>> list(scaled) 
[10.011785999999999, 100.002782, None, 49.94385199999999, 69.04679999999999]

We’ve applied the function to a collection of data values. The None value politely leads to a None result. There was no exception processing involved.

As a second example, here’s how we can create a null-aware rounding function using the same decorator:

@nullable 
def nround4(x: float) -> float: 
    return round(x, 4)

This function is a partial application of the round() function, wrapped to be null-aware. We can use this nround4() function to create a better test case for our st_miles() function as follows:

>>> some_data = [8.7, 86.9, None, 43.4, 60] 
>>> scaled = map(st_miles, some_data) 
>>> [nround4(v) for v in scaled] 
[10.0118, 100.0028, None, 49.9439, 69.0468]

This rounded result will be independent of any platform considerations. It’s very handy for doctest testing.

As an alternative implementation, we could also create these null-aware functions using the following code:

st_miles_2: Callable[[float | None], float | None] = ( 
    nullable(lambda nm: nm * 1.15078) 
) 
nround4_2: Callable[[float | None], float | None] = ( 
    nullable(lambda x: round(x, 4)) 
)

We didn’t use the @nullable decorator in front of the function definition def statement. Instead, we applied the nullable() function to another function defined as a lambda form. These expressions have the same effect as a decorator in front of a function definition.

Note how it is challenging to apply type hints to lambda forms. The variable nround4_2 is given a type hint of Callable with an argument list of float | None and a return type of float | None. The use of the Callable hint is appropriate only for positional arguments. In cases where there will be keyword arguments or other complexities, see https://mypy.readthedocs.io/en/stable/additional_features.html?highlight=callable#extended-callable-types.

The @nullable decorator makes an assumption that the decorated function is unary. We would need to revisit this design to create a more general-purpose null-aware decorator that works with arbitrary collections of arguments.

In Chapter 13, The PyMonad Library, we’ll look at an alternative approach to this problem of tolerating the None values. The PyMonad library defines a Maybe class of objects, which may have a proper value or may be the None value.

12.1.1 Using the functools update_wrapper() function

The @wraps decorator applies the update_wrapper() function to preserve a few attributes of a wrapped function. In general, this does everything we need by default. This function copies a specific list of attributes from the original function to the resulting function created by a decorator.

The update_wrapper() function relies on a global variable defined in the functools module to determine what attributes to preserve. The WRAPPER_ASSIGNMENTS variable defines the attributes that are copied by default. The default value is this tuple of attribute names to copy (newer releases extend it; Python 3.12, for example, adds '__type_params__'):

('__module__', '__name__', '__qualname__', '__doc__',
'__annotations__')

It’s difficult to make meaningful modifications to this list. The internals of the def statement aren’t open to simple modification or change. This detail is mostly interesting as a piece of reference information.

If we’re going to create callable objects, then we may have a class that provides some additional attributes as part of the definition. This could lead to a situation where a decorator must copy these additional attributes from the original wrapped callable object to the wrapping function being created. However, it seems simpler to make these kinds of changes through object-oriented class design, rather than exploit tricky decorator techniques.

12.2 Cross-cutting concerns

One general principle behind decorators is to allow us to build a composite function from the decorator and the original function to which the decorator is applied. The idea is to have a library of common decorators that can provide implementations for common concerns.

We often call these cross-cutting concerns because they apply across several functions. These are the sorts of things that we would like to design once through a decorator and have them applied in relevant classes throughout an application or a framework.

Concerns that are often centralized as decorator definitions include the following:

  • Logging

  • Auditing

  • Security

  • Handling incomplete data

A logging decorator, for example, may write standardized messages to the application’s log file. An audit decorator may write details surrounding a database update. A security decorator may check some runtime context to be sure that the login user has the necessary permissions.

Our example of a null-aware wrapper for a function is a cross-cutting concern. In this case, we’d like to have a number of functions handle the None values by returning the None values instead of raising an exception. In applications where data is incomplete, we may need to process rows in a simple, uniform way without having to write lots of distracting if statements to handle missing values.

12.3 Composite design

The common mathematical notation for a composite function looks as follows:

(f ∘ g)(x) = f(g(x))

The idea is that we can define a new function, (f ∘ g)(x), that combines two other functions, f(y) and g(x).

Python’s multiple-line definition of a composition function can be done through the following code:

@f_deco 
def g(x): 
    something

The resulting function can be essentially equivalent to (f ∘ g)(x). The @f_deco decorator must define and return the composite function by merging an internal definition of f(y) with the provided base function, g(x).

The implementation details show that Python actually provides a slightly more complex kind of composition. The structure of a wrapper makes it helpful to think of Python decorator composition as follows:

(w ∘ g)(x) = (wβ ∘ g ∘ wα)(x) = wβ(g(wα(x)))

A decorator applied to some application function, g(x), will include a wrapper function, w(y), that has two parts. One portion of the wrapper, wα(y), applies to the arguments of the base function; the other portion, wβ(z), applies to the result of the base function.

Here’s a slightly more concrete idea, shown as a @stringify decorator definition:

def stringify(argument_function: Callable[[int, int], int]) -> Callable[[str], str]: 
    @wraps(argument_function) 
    def two_part_wrapper(text: str) -> str: 
        # The "before" part 
        arg1, arg2 = map(int, text.split(",")) 
        int_result = argument_function(arg1, arg2) 
        # The "after" part 
        return str(int_result) 
    return two_part_wrapper

This decorator inserts conversions from string to integer, and integer back to string. Concealing the details of string processing may be helpful when working with CSV files, where the content is always string data.

We can apply this decorator to a function:

>>> @stringify 
... def the_model(m: int, s: int) -> int: 
...     return m * 45 + s * 3 
... 
>>> the_model("5,6") 
'243'

This shows the two places to inject additional processing before as well as after the original function. This emphasizes an important distinction between the abstract concept of functional composition and the Python implementation: it’s possible that a decorator can create either f(g(x)), or g(f(x)), or a more complex fβ(g(fα(x))). The syntax of decoration doesn’t describe which kind of composition will be created.
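The point that identical syntax can yield different compositions can be sketched with two hypothetical decorators. Here abs_before() applies abs() to the argument and abs_after() applies abs() to the result; both names are invented for this illustration.

```python
from collections.abc import Callable
from functools import wraps

def abs_before(function: Callable[[int], int]) -> Callable[[int], int]:
    """Hypothetical: apply abs() to the argument, giving g(abs(x))."""
    @wraps(function)
    def wrapper(x: int) -> int:
        return function(abs(x))
    return wrapper

def abs_after(function: Callable[[int], int]) -> Callable[[int], int]:
    """Hypothetical: apply abs() to the result, giving abs(g(x))."""
    @wraps(function)
    def wrapper(x: int) -> int:
        return abs(function(x))
    return wrapper

@abs_before
def minus_ten_a(x: int) -> int:
    return x - 10

@abs_after
def minus_ten_b(x: int) -> int:
    return x - 10

print(minus_ten_a(-3))  # -7, computed as abs(-3) - 10
print(minus_ten_b(-3))  # 13, computed as abs(-3 - 10)
```

Nothing at the point of decoration reveals which of the two orders is in effect; only the decorator's definition does.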

The real value of decorators stems from the way any Python statement can be used in the wrapping function. A decorator can use if or for statements to transform a function into something used conditionally or iteratively. In the next section, the examples will leverage the try: statement to perform an operation with a standard recovery from bad data. There are many things that can be done within this general framework.

A great deal of functional programming follows the essential (f ∘ g)(x) design pattern. Defining a composite from two smaller functions can help to summarize complex processing. In other cases, it can be more informative to keep the two functions separate.

It’s easy to create composites of the common higher-order functions, such as map(), filter(), and functools.reduce(). Because these functions are relatively simple, a composite function is often easy to describe, and can help to make the code more expressive.

For example, an application may include map(f, map(g, x)). It may be more clear to create a composite function and use a map(f_g, x) expression to describe applying a composite to a collection. We can use f_g = lambda x: f(g(x)) to help explain a complex application as a composition of simpler functions. To make sure the type hints are correct, we’ll almost always want to define individual functions with the def statement.

It’s important to note that there’s no real performance advantage to either technique. The map() function is lazy: with two map() functions, one item will be taken from the source collection, x, processed by the g() function, and then processed by the f() function. With a single map() function, an item will be taken from the source collection, x, and then processed by the f_g() composite function; the memory use is the same.
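The equivalence of the two forms can be checked with a small sketch. The function names to_statute(), round4(), and round4_statute() are assumptions chosen to echo this chapter's nautical-mile example.

```python
def to_statute(nm: float) -> float:
    """Plays the role of g(x)."""
    return 1.15078 * nm

def round4(x: float) -> float:
    """Plays the role of f(y)."""
    return round(x, 4)

def round4_statute(nm: float) -> float:
    """The composite f_g(x), defined with def so the type hints are explicit."""
    return round4(to_statute(nm))

data = [8.7, 86.9, 43.4, 60]

# Two nested lazy maps versus one map over the composite function.
nested = list(map(round4, map(to_statute, data)))
composite = list(map(round4_statute, data))

print(nested == composite)  # True
```

In both forms, each item flows through g and then f before the next item is taken from the source.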

In Chapter 13, The PyMonad Library, we’ll look at an alternative approach to this problem of creating composite functions from individual curried functions.

12.3.1 Preprocessing bad data

One cross-cutting concern in some exploratory data analysis applications is how to handle numeric values that are missing or cannot be parsed. We often have a mixture of float, int, datetime.datetime, and decimal.Decimal currency values that we’d like to process with some consistency.

In other contexts, we have not applicable or not available placeholders instead of data values; these shouldn’t interfere with the main thread of the calculation. It’s often handy to allow the not applicable values to pass through an expression without raising an exception. We’ll focus on three bad-data conversion functions: bd_int(), bd_float(), and bd_decimal(). We’ve left bd_datetime() as an exercise for the reader.

The composite feature we’re adding will be defined first. Then we’ll use this to wrap a built-in conversion function. Here’s a simple bad-data decorator:

from collections.abc import Callable
import decimal
from functools import wraps
from typing import Any, Union, TypeVar, TypeAlias
 
Number: TypeAlias = Union[decimal.Decimal, float] 
NumT = TypeVar("NumT", bound=Number) 
 
def bad_data( 
         function: Callable[[str], NumT] 
) -> Callable[[str], NumT]: 
    @wraps(function) 
    def wrap_bad_data(source: str, **kwargs: Any) -> NumT: 
        try: 
            return function(source, **kwargs) 
        except (ValueError, decimal.InvalidOperation): 
            cleaned = source.replace(",", "") 
            return function(cleaned, **kwargs) 
    return wrap_bad_data

The decorator, @bad_data, wraps a given conversion function, with the parameter name function, to try a second conversion in the event the first conversion fails. The ValueError and decimal.InvalidOperation exceptions are generally indicators of data that has an invalid format: bad data. The second conversion will be attempted after "," characters are removed. This wrapper passes any **kwargs keyword arguments into the wrapped function. This ensures that the wrapped functions can have additional argument values provided.

The type variable NumT is bound to the original return type of the base function being wrapped, the value of the function parameter. The decorator is defined to return a function with the same type, NumT. This type has an upper bound of the union of float and Decimal types. This boundary permits objects that are a subclass of float or Decimal.

The type hints for complex decorator design are evolving rapidly. In particular, PEP 612 ( https://peps.python.org/pep-0612/) defines some new constructs that can allow even more flexible type hints. For decorators that do not make any type changes, we can use generic parameter variables like ParamSpec to capture the actual parameters of the function being decorated. This lets us write generic decorators without having to wrestle with the details of the type hints of the functions being decorated. We’ll note where PEP 612’s ParamSpec and Concatenate will come in useful. Be sure to see the PEP 612 examples when designing generic decorators.

We can use this wrapper to create bad-data-sensitive functions as follows:

from decimal import Decimal 
 
bd_int = bad_data(int) 
bd_float = bad_data(float) 
bd_decimal = bad_data(Decimal)

This will create a suite of functions that can do conversions of good data as well as a limited amount of data cleansing to handle specific kinds of bad data.

It can be difficult to write type hints for some kinds of callable objects. For example, the int() function has optional keyword arguments, with their own complex type hints. Our decorator summarizes these keyword arguments as **kwargs: Any. Ideally, a ParamSpec can be used to capture the details of the parameters for the function being wrapped. See PEP 612 ( https://peps.python.org/pep-0612/) for guidance on creating complex type signatures for callable objects.

The following are some examples of using the bd_int() function:

>>> bd_int("13") 
13 
>>> bd_int("1,371") 
1371 
>>> bd_int("1,371", base=16) 
4977

We’ve applied the bd_int() function to a string that converted neatly and a string with the specific type of punctuation that we’ll tolerate. We’ve also shown that we can provide additional parameters to each of these conversion functions.

We may like to have a more flexible decorator. One feature that we may like to add is the ability to handle a variety of data scrubbing alternatives. Simple "," removal isn’t always what we need. We may also need to remove $ or ° symbols, too. We’ll look at more sophisticated, parameterized decorators in the next section.

12.4 Adding a parameter to a decorator

A common requirement is to customize a decorator with additional parameters. Rather than simply creating a composite (f ∘ g)(x), we can do something a bit more complex. With parameterized decorators, we can create (f(c) ∘ g)(x). We’ve applied a parameter, c, as part of creating the wrapper, f(c). This parameterized composite function, f(c) ∘ g, can then be applied to the actual data, x.

In Python syntax, we can write it as follows:

@deco(arg) 
def func(x): 
    base function processing...

There are two steps to this. The first step applies the parameter to an abstract decorator to create a concrete decorator. Then the concrete decorator, the parameterized deco(arg) function, is applied to the base function definition to create the decorated function.

The effect is as follows:

concrete_deco = deco(arg) 
 
def func(x): 
    base function processing... 
 
func = concrete_deco(func)

The parameterized decorator worked by doing the following three things:

  1. Applied the abstract decorator, deco(), to its argument, arg, to create a concrete decorator, concrete_deco().

  2. Defined the base function, func().

  3. Applied the concrete decorator, concrete_deco(), to the base function to create the decorated version of the function; in effect, it’s deco(arg)(func).
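The three steps can be traced with a runnable sketch. The scaled() decorator is a hypothetical example; it is applied once with the @ syntax and once as the explicit deco(arg)(func) expression.

```python
from collections.abc import Callable
from functools import wraps

FloatFunc = Callable[[float], float]

def scaled(factor: float) -> Callable[[FloatFunc], FloatFunc]:
    """Hypothetical abstract decorator: binds factor, returns a concrete decorator."""
    def concrete_deco(function: FloatFunc) -> FloatFunc:
        @wraps(function)
        def wrapper(x: float) -> float:
            return factor * function(x)
        return wrapper
    return concrete_deco

# Prefix form: scaled(3) is evaluated first, then applied to inc_a.
@scaled(3)
def inc_a(x: float) -> float:
    return x + 1

# Explicit form of the same three steps.
def inc_b(x: float) -> float:
    return x + 1

inc_b = scaled(3)(inc_b)

print(inc_a(2), inc_b(2))  # 9 9
```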

A decorator with arguments involves indirect construction of the final function. We seem to have moved beyond merely higher-order functions into something even more abstract: higher-order functions that create higher-order functions.

We can expand our bad-data-aware decorator to create a slightly more flexible conversion. We’ll define a @bad_char_remove decorator that can accept parameters of characters to remove. The following is a parameterized decorator:

from collections.abc import Callable
import decimal
from functools import wraps
from typing import Any, TypeVar

T = TypeVar('T')
 
def bad_char_remove( 
    *bad_chars: str 
) -> Callable[[Callable[[str], T]], Callable[[str], T]]: 
    def cr_decorator( 
            function: Callable[[str], T] 
    ) -> Callable[[str], T]: 
        def clean_list(text: str, *, to_replace: tuple[str, ...]) -> str: 
            if to_replace: 
                return clean_list( 
                    text.replace(to_replace[0], ""), 
                    to_replace=to_replace[1:] 
                ) 
            return text 
 
        @wraps(function) 
        def wrap_char_remove(text: str, **kwargs: Any) -> T: 
            try: 
                return function(text, **kwargs) 
            except (ValueError, decimal.InvalidOperation): 
                cleaned = clean_list(text, to_replace=bad_chars) 
                return function(cleaned, **kwargs) 
        return wrap_char_remove 
    return cr_decorator

A parameterized decorator has two internal function definitions:

  • The concrete decorator; in this example, the cr_decorator() function. This will have the free variable, bad_chars, bound to the function being built. The concrete decorator is then returned; it will later be applied to a base function. When applied, the decorator will return a new function wrapped inside the wrap_char_remove() function. This new wrap_char_remove() function has type hints with a type variable, T, that claim the wrapped function’s type will be preserved by the new wrap_char_remove() function.

  • The decorating wrapper, the wrap_char_remove() function in this example, will replace the original function with a wrapped version. Because of the @wraps decorator, the __name__ (and other attributes) of the new function will be replaced with those of the base function being wrapped.

The overall decorator, the @bad_char_remove function in this example, has the job of binding the parameter, named bad_chars, to a function and returning the concrete decorator. The type hint clarifies the return value is a Callable object that transforms a Callable function into another Callable function. The language rules will then apply the concrete decorator to the following function definition.

The internal clean_list() function is used by the @bad_char_remove decorator to remove all characters in a given argument value. This is defined as a recursion to keep it very short. It can be optimized into an iteration if necessary. We’ve left that optimization as an exercise for the reader.
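One possible non-recursive form, sketched here with functools.reduce(), keeps the same keyword-only signature as the recursive version. It is shown as a standalone function, and it is only one of several reasonable ways to do the exercise.

```python
from functools import reduce

def clean_list(text: str, *, to_replace: tuple[str, ...]) -> str:
    """Remove each string in to_replace from text, one after another."""
    # reduce() folds the replacements over the text, starting from text itself.
    return reduce(
        lambda cleaned, char: cleaned.replace(char, ""),
        to_replace,
        text,
    )

print(clean_list("$1,701.00", to_replace=("$", ",")))  # 1701.00
```

A plain for statement that reassigns a local variable would work equally well and avoids the lambda.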

We can use the @bad_char_remove decorator to create conversion functions as follows:

from decimal import Decimal 
from typing import Any 
 
@bad_char_remove("$", ",") 
def currency(text: str, **kw: Any) -> Decimal: 
    return Decimal(text, **kw)

We’ve used our @bad_char_remove decorator to wrap a base currency() function. The essential feature of the currency() function is a reference to the decimal.Decimal constructor.

This currency() function will now handle some variant data formats:

>>> currency("13")
Decimal('13')
>>> currency("$3.14")
Decimal('3.14')
>>> currency("$1,701.00")
Decimal('1701.00')

We can now process input data using a relatively simple map(currency, row) expression to convert source data from strings to usable Decimal values. The try:/except: error-handling has been isolated to a function that we’ve used to build a composite conversion function.

We can use a similar design to create null-tolerant functions. These functions would use a similar try:/except: wrapper, but would return the None values. This design variant is left as an exercise for the reader.

This decorator is limited to conversion functions that apply to a single string, and have a type hint like Callable[[str], T]. For generic decorators, it helps to follow the examples in PEP 612 and use the ParamSpec and Concatenate type hints to broaden the domain of the decorators. Because we’re interested in applying the internal clean_list() function to the first argument value, we can look at the conversion function as Callable[Concatenate[str, P], T]. We would define the first parameter as a string, and use a ParamSpec, P, to represent all other parameters of the conversion function.

12.5 Implementing more complex decorators

To create more complex compositions, Python allows the following kinds of function definitions:

@f_wrap 
@g_wrap 
def h(x): 
    return something...

Python permits stacking decorators that modify the results of other decorators. This has a meaning somewhat like (f ∘ g ∘ h)(x). However, the resulting name will be merely h(x), concealing the stack of decorations. Because of this potential confusion, we need to be cautious when creating functions that involve deeply nested decorators. If our intent is simply to handle some cross-cutting concerns, then each decorator should be designed to handle a separate concern while avoiding confusion.
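The order in which stacked decorators apply can be made visible with a sketch. The f_wrap() and g_wrap() definitions below are hypothetical; each one labels the result it receives so the nesting shows up in the output.

```python
from collections.abc import Callable
from functools import wraps

StrFunc = Callable[[str], str]

def f_wrap(function: StrFunc) -> StrFunc:
    @wraps(function)
    def wrapper(x: str) -> str:
        return f"f({function(x)})"
    return wrapper

def g_wrap(function: StrFunc) -> StrFunc:
    @wraps(function)
    def wrapper(x: str) -> str:
        return f"g({function(x)})"
    return wrapper

# The decorator closest to the def is applied first: h = f_wrap(g_wrap(h)).
@f_wrap
@g_wrap
def h(x: str) -> str:
    return f"h({x})"

print(h("x"))      # f(g(h(x)))
print(h.__name__)  # h
```

The name h gives no hint of the two layers wrapped around the base function.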

While many things can be done with decoration, it’s essential to ask if using a decorator creates clear, succinct, expressive programming. When working with cross-cutting concerns, the features of the decorator are often essentially distinct from the function being decorated. This can be a wonderful simplification. Adding logging, debugging, or security checks through decoration is a widely followed practice.

One important consequence of an overly complex design is the difficulty in providing appropriate type hints. When the type hints devolve to simply using Callable[..., Any], the design may have become too difficult to reason about clearly.

12.6 Complicated design considerations

In the case of our data cleanup, the simplistic removal of stray characters may not be sufficient. When working with the geolocation data, we may have a wide variety of input formats that include simple degrees (37.549016197), degrees and minutes (37° 32.94097′), and degrees-minutes-seconds (37° 32′ 56.46″). Of course, there can be even more subtle cleaning problems: some devices will create an output with the Unicode U+00BA character, º, the “masculine ordinal indicator,” instead of the similar-looking degree character, °, which is U+00B0.

For this reason, it is often necessary to provide a separate cleansing function that’s bundled with the conversion function. This function will handle the more sophisticated conversions required by inputs that are as wildly inconsistent in format as latitudes and longitudes are.

How can we implement this? We have a number of choices. Simple higher-order functions are a good choice. A decorator, on the other hand, doesn’t work out terribly well. We’ll look at a decorator-based design to see some limitations to what makes sense in a decorator.

The requirements have the following two orthogonal design considerations:

  • The output conversion from string to int, float, or Decimal, summarized as Callable[[str], T]

  • The input cleaning: removing stray characters and reformatting coordinates, summarized as Callable[[str], str]

Ideally, one of these aspects could be considered as the essential function that gets wrapped, and the other aspect is something that’s included via a decoration. The choice of essence versus wrapper isn’t always clear.

Considering the previous examples, it appears that this should be seen as a three-part composite:

  • The output conversion from string to int, float, or Decimal

  • The input cleansing: either a simple replace or a more complex multiple-character replacement

  • An overall processing function that first attempts the conversion, then does any cleansing as a response to an exception, and then attempts the conversion again

The third part—attempting the conversion and retrying—is the actual wrapper that also forms a part of the composite function. As we noted previously, a wrapper contains an argument phase and a return-value phase, which we can call wα(x) and wβ(x), respectively.

We want to use this wrapper to create a composite of two additional functions. We have two choices for the design. We could include the cleansing function as an argument to the decorator on the conversion, as follows:

@cleanse_before(cleanser) 
def convert(text: str) -> int: 
    # code to convert the text, trusting it was clean 
    return # an int value

This first design claims that the conversion function is central, and the cleansing is an ancillary detail that will modify the behavior but preserve the original intent of the conversion.

Or, we could include the conversion function as an argument to the decorator for a cleansing function as follows:

@then_convert(converter) 
def cleanse(text: str) -> str: 
    # code to clean the text 
    return # the str value for later conversion

This second design claims that the cleansing is central and the conversion is an ancillary detail. This is a bit confusing because the cleansing type is generally Callable[[str], str], while the conversion’s type of Callable[[str], some other type] is what is required for the overall wrapped function.

While both of these approaches can create a usable composite function, the first version has an important advantage: the type signature of the convert() function is also the type signature of the resulting composite function. This highlights a general design pattern for decorators: the type annotations, or signatures, of the function being decorated are the easiest to preserve.

When confronted with several choices for defining a composite function, it is generally easiest to preserve the type hints for the function being decorated. This helps identify the concept that’s central.

Consequently, the @cleanse_before(cleanser) style decorator is preferred. The decorator definition looks like the following example:

import decimal
from collections.abc import Callable
from functools import wraps
from typing import Any, TypeVar

# Also defined earlier in the chapter:
T = TypeVar("T")

def cleanse_before( 
    cleanse_function: Callable[[str], Any] 
) -> Callable[[Callable[[str], T]], Callable[[str], T]]: 
    def concrete_decorator(converter: Callable[[str], T]) -> Callable[[str], T]: 
        @wraps(converter) 
        def cc_wrapper(text: str, **kwargs: Any) -> T: 
            try: 
                return converter(text, **kwargs) 
            except (ValueError, decimal.InvalidOperation): 
                cleaned = cleanse_function(text) 
                return converter(cleaned, **kwargs) 
        return cc_wrapper 
    return concrete_decorator

We’ve defined the following multi-layer decorator:

  • At the heart is the cc_wrapper() function that applies the converter() function. If this fails, then it uses the given cleanse_function() function and then tries the converter() function again.

  • The cc_wrapper() function is built around the given cleanse_function() and a converter() function by the concrete_decorator() decorator. The converter() function is the function being decorated.

  • The top-most layer is the concrete_decorator() function. This decorator has the cleanse_function() function as a free variable.

  • The concrete decorator is created when the decorator interface, cleanse_before(), is evaluated. The interface is customized by providing the cleanse_function as an argument value.

The type hints emphasize the role of the @cleanse_before decorator. It expects some Callable function, named cleanse_function, and it returns a concrete decorator: a function that accepts a Callable[[str], T] and returns a wrapped function of the same Callable[[str], T] type. This is a helpful reminder of how parameterized decorators work.

We can now build a slightly more flexible cleanse and convert function, to_int(), as follows:

def drop_punct2(text: str) -> str: 
    return text.replace(",", "").replace("$", "") 
 
@cleanse_before(drop_punct2) 
def to_int(text: str, base: int = 10) -> int: 
    return int(text, base)

The integer conversion is decorated with a cleansing function. In this case, the cleansing function removes $ and , characters. The integer conversion is wrapped by this cleansing.

The to_int() function defined previously leverages the built-in int() function. An alternative definition that avoids the def statement would be the following:

to_int2 = cleanse_before(drop_punct2)(int)

This uses drop_punct2() to wrap the built-in int() conversion function. Using the mypy tool’s reveal_type() function shows that the type signature for to_int() matches the type signature for the built-in int(). It can be argued that this style is less readable than using a decorator.

We can use this enhanced integer conversion as follows:

>>> to_int("1,701") 
1701 
>>> to_int("42") 
42

The type hints for the underlying int() function have been rewritten (and simplified) for the decorated function, to_int(). This is a consequence of trying to use decorators to wrap built-in functions.

Because of the complexity of defining parameterized decorators, it appears that this is the edge of the envelope. The decorator model doesn’t seem to be ideal for this kind of design. A direct definition of a composite function may be clearer than the machinery required to build decorators.

The alternative is to duplicate a few lines of code that will be the same for all of the conversion functions. We could use:

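import decimal

def to_int_flat(text: str, base: int = 10) -> int:
    try:
        # First attempt: trust that the text is clean.
        return int(text, base)
    except (ValueError, decimal.InvalidOperation):
        # Cleanse the text, then retry the conversion once.
        cleaned = drop_punct2(text)
        return int(cleaned, base)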

Each of the data type conversions will repeat the try-except block. The use of a decorator isolates this design feature in a way that can be applied to any number of conversion functions without explicitly restating the code. Later changes to the design when using this alternative may require editing a number of similar functions instead of changing one decorator.

Generally, decorators work well when we have a number of relatively simple and fixed aspects that we want to include with a given function (or a class). Decorators are also important when these additional aspects can be looked at as infrastructure or as support, and not something essential to the meaning of the application code.

For something that involves multiple orthogonal design aspects, we may want to resort to a callable class definition with various kinds of plugin strategy objects. This might have a simpler class definition than the equivalent decorator. Another alternative to decorators is to look closely at creating higher-order functions. In some cases, partial functions with various combinations of parameters may be simpler than a decorator.
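As a rough illustration of these two alternatives, here is a minimal sketch. The names CleansingConverter and convert_with are hypothetical (they are not defined elsewhere in this chapter), and the sketch only handles ValueError. It shows a callable class holding two strategy objects, and a partial function built from a plain higher-order function.

```python
from collections.abc import Callable
from dataclasses import dataclass
from functools import partial
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class CleansingConverter(Generic[T]):
    """Hypothetical callable class; the two attributes are plug-in strategies."""

    converter: Callable[[str], T]
    cleanser: Callable[[str], str]

    def __call__(self, text: str) -> T:
        try:
            return self.converter(text)
        except ValueError:
            # Retry once with cleansed text, mirroring the decorator's wrapper.
            return self.converter(self.cleanser(text))


def drop_punct2(text: str) -> str:
    return text.replace(",", "").replace("$", "")


def convert_with(
    cleanser: Callable[[str], str], converter: Callable[[str], T], text: str
) -> T:
    """Hypothetical higher-order function with the same try-and-retry logic."""
    try:
        return converter(text)
    except ValueError:
        return converter(cleanser(text))


# Callable-class style: each instance combines two strategies.
to_int_cc = CleansingConverter(int, drop_punct2)
to_float_cc = CleansingConverter(float, drop_punct2)

# Partial-function style: bind the strategies, leaving only the text argument.
to_int_p = partial(convert_with, drop_punct2, int)
```

Both styles keep the cleansing and conversion choices visible at the point where the composite is built, which may be easier to follow than nested decorator layers when there are several independent strategies.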

The typical examples for cross-cutting concerns include logging or security testing. These features can be considered as the kind of background processing that isn’t specific to the problem domain. When we have processing that is as ubiquitous as the air that surrounds us, then a decorator might be an appropriate design technique.

12.7 Summary

In this chapter, we’ve looked at two kinds of decorators: simple decorators with no arguments and parameterized decorators. We’ve seen how decorators involve an indirect composition between functions: the decorator wraps a function (defined inside the decorator) around another function.

Using the functools.wraps() decorator ensures that our decorators will properly copy attributes from the function being wrapped. This should be a piece of every decorator we write.
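To make the effect of functools.wraps() concrete, the following sketch compares two otherwise identical decorators. The names plain and preserving are hypothetical and exist only for this comparison.

```python
from collections.abc import Callable
from functools import wraps
from typing import Any


def plain(function: Callable[..., Any]) -> Callable[..., Any]:
    """A decorator that does not use wraps()."""
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        return function(*args, **kwargs)
    return wrapper


def preserving(function: Callable[..., Any]) -> Callable[..., Any]:
    """The same decorator, with wraps() applied to the wrapper."""
    @wraps(function)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        return function(*args, **kwargs)
    return wrapper


@plain
def double_1(x: int) -> int:
    """Double x."""
    return 2 * x


@preserving
def double_2(x: int) -> int:
    """Double x."""
    return 2 * x


# Without wraps(), the wrapper's own name and (missing) docstring are exposed.
print(double_1.__name__, double_1.__doc__)
# With wraps(), the original name and docstring are copied to the wrapper,
# and the original function is reachable through the __wrapped__ attribute.
print(double_2.__name__, double_2.__doc__, double_2.__wrapped__(21))
```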

In the next chapter, we’ll look at the PyMonad library to express a functional programming concept directly in Python. We don’t require monads generally because Python is an imperative programming language under the hood.

12.8 Exercises

This chapter’s exercises are based on code available from Packt Publishing on GitHub. See https://github.com/PacktPublishing/Functional-Python-Programming-3rd-Edition.

In some cases, the reader will notice that the code provided on GitHub includes partial solutions to some of the exercises. These serve as hints, allowing the reader to explore alternative solutions.

In many cases, exercises will need unit test cases to confirm they actually solve the problem. These are often identical to the unit test cases already provided in the GitHub repository. The reader should replace the book’s example function name with their own solution to confirm that it works.

12.8.1 Datetime conversions

In the Preprocessing bad data section of this chapter, we introduced the concept of data conversion functions that included special not applicable or not available data values. These are often called null values; because of this, a database may have a universal NULL literal. We’ll call them “bad data” because that’s how we often discover them. When examining data for the first time, we find bad data that might represent missing, or not applicable, values.

This kind of data can have any of these possible processing paths:

  • The bad data is silently ignored; it’s excluded from counts and averages. To make this work out, we’ll often want to replace bad values with a consistent object. The None object is a good replacement value.

  • The bad data stops the processing, raising an exception. This is quite easy to implement, since Python tends to do this automatically. In some cases, we want to retry the conversion using alternative rules. We’ll focus on this approach for this exercise.

  • Bad data is replaced with interpolated or imputed values. This often means keeping two versions of a collection of data: the original with bad data present, and a more useful version with replacement values. This isn’t a simple computation.

The idea of our core bad_data() function is to try a conversion, replace known bad punctuation, and then try again. We might, for example, strip “,” and “$” from numeric values.

Earlier in this chapter, we described three bad-data conversion functions: bd_int(), bd_float(), and bd_decimal(). Each of these performed a relatively direct conversion-or-replacement algorithm. We left the bd_datetime() function as an exercise for the reader. In this case, the alternative date formats can lead to a bit more complexity.

We’ll assume that dates must be in one of three formats: “yyyy-mon-dd”, “yyyy-mm-dd”, or “mon-dd” without a year. In the first and third formats, the month name is spelled out. In the second format, the month is numeric. These are handled by the datetime.strptime() function using format strings like "%Y-%b-%d", "%Y-%m-%d", and "%b-%d", respectively.

Write a bd_datetime() function to try multiple date format conversions, looking for one that produces a valid date. In the case of a missing year, the datetime.replace() method can be used to build a final date result with the current year.

Once the basic implementation is complete, create appropriate test cases with a mix of valid and invalid dates.

Be sure to make the design flexible enough that adding another format can be done without too much struggle.

12.8.2 Optimize a decorator

In the Adding a parameter to a decorator section of this chapter, we defined a decorator to replace “bad” characters in a given field and retry an attempted conversion.

This decorator had an internal function, clean_list(), that provided a recursive definition for removing bad characters from a string.

Here’s the Python function definition:

    def clean_list(text: str, *, to_replace: tuple[str, ...]) -> str: 
        ...

This recursion has two cases:

  • When the to_replace argument value is empty, there’s nothing to replace, and the value of the text parameter is returned unchanged.

  • Otherwise, split the to_replace tuple to separate the first character from the remaining characters. Remove any occurrence of the first character from the value of the text parameter and apply this function again using the remaining characters of the to_replace tuple.
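The two cases above can be sketched as follows. This is a reconstruction from the description, not necessarily the exact body used earlier in the chapter; it is the recursive starting point for the exercise, not the solution.

```python
def clean_list(text: str, *, to_replace: tuple[str, ...]) -> str:
    # Base case: nothing left to replace, so the text is returned unchanged.
    if not to_replace:
        return text
    # Recursive case: remove the first character, then recurse on the rest.
    first, *rest = to_replace
    return clean_list(text.replace(first, ""), to_replace=tuple(rest))
```

The recursive call is the last operation performed, which is what makes the transformation into a for statement straightforward.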

Looking back at Chapter 6, Recursions and Reductions, we recall that this kind of tail-call recursion can be transformed into a for statement. Rewrite the clean_list() function to eliminate the recursion.

12.8.3 None-tolerant functions

In the Adding a parameter to a decorator section of this chapter, we saw a design pattern of using a try:/except: wrapper to uncover numbers with spurious punctuation marks. A similar technique can be used to detect None values and pass them through a function, unprocessed.

Write a decorator that can be used for Callable[[float], float] functions that will handle None values gracefully.

If the none-tolerant decorator is called @none_tolerant, here is a test case:

@none_tolerant 
def x2(x: float) -> float: 
    return 2 * x 
 
def test_x2() -> None: 
    assert x2(42.) == 84.0 
    assert x2(None) is None 
    assert list(map(x2, [1, 2, None, 3])) == [2, 4, None, 6]

12.8.4 Logging

A common requirement for debugging is a consistent collection of logging messages. It can become tedious to include a logger.debug() line in a number of closely-related functions. If the functions have a consistent set of type definitions, it can be helpful to define a decorator that can be applied to a number of related functions.

As example functions, we’ll define a collection of “models” that compute an expected result from sample values. We’ll start with a dataclass to define each sample as having an identifier, an observed value, and a timestamp. It looks like this:

import datetime 
from dataclasses import dataclass 

@dataclass(frozen=True) 
class Sample: 
    id: int 
    observation: float 
    date_time: datetime.datetime

We have three models to compute an expected value, e, from the observed value in the sample, s_o:

  • e = 0.7412 × s_o

  • e = 0.9 × s_o − 90

  • e = 0.7724 × s_o^{1.0134}

First, define these three functions with appropriate test cases.

Second, define a @logging decorator to use logger.info() to record the sample value and the computed expectation.

Third, add the @logging decorator in front of each function definition.

Create an overall application that uses logging.basicConfig() to set the logging level to logging.INFO to ensure that the informational messages are seen. (The default logging level only shows warnings and errors.)
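The logging configuration described here might look like the following sketch. The logger name "models" and the message text are hypothetical; force=True is included only so the sketch still takes effect if logging was already configured elsewhere in the process.

```python
import logging
import sys

# Set the root logger's level to INFO; the default level is WARNING.
logging.basicConfig(stream=sys.stderr, level=logging.INFO, force=True)

logger = logging.getLogger("models")  # hypothetical logger name

# Suppressed: DEBUG is below the configured INFO level.
logger.debug("sample details that are not shown")
# Emitted: INFO messages are now seen.
logger.info("sample=%r expected=%r", 1000, 741.2)
```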

This permits creating a consistent logging setup for the three “model” functions. This reflects a complete separation between the logging aspect of the application and the computation of expected values from sample values. Is this separation clear and helpful? Are there circumstances where this separation might not be desirable?

The actual measurements are given here. One of the models is more accurate than the others:




Sample Number    Observed    Actual
1                1000        883
2                1500        1242
3                1500        1217
4                1600        1306
5                1750        1534
6                2000        1805
7                2000        1720



12.8.5 Dry-run check

Applications that can modify the file system require extensive unit testing as well as integration testing. To mitigate risk even further, these applications will often have a “dry-run” mode where file system modifications are logged but not carried out; files are not moved, directories are not deleted, and so on.

The idea here is to write small functions for file system state changes. Each function can then be decorated with a @dry_run_check decorator. This decorator can examine a global variable, DRY_RUN. The decorator writes a log message. If the DRY_RUN value is True, nothing else is done. If the DRY_RUN value is False, the base function is evaluated to make the underlying state changes, such as removing files, or removing directories.

First, define a number of functions to copy a directory. The following state changes need separate functions:

  • Create a new, empty, directory.

  • Copy a file from somewhere in the source directory to the target directory. We can use an expression like offset = source_path.relative_to(source_dir) to compute the relative location of a file in the source directory. We can use target_dir / offset to compute the new location in a target directory. The pathlib.Path objects provide all of the features required.
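The path arithmetic for the file copy can be sketched like this. The directory and file names are hypothetical, and PurePosixPath is used only so the example runs without touching the file system; the real functions would use pathlib.Path.

```python
from pathlib import PurePosixPath

# Hypothetical source and target directories.
source_dir = PurePosixPath("/data/source")
target_dir = PurePosixPath("/backup/target")

# A file somewhere below the source directory.
source_path = source_dir / "reports" / "q1.csv"

# The location of the file relative to the source directory...
offset = source_path.relative_to(source_dir)
# ...and the matching location below the target directory.
target_path = target_dir / offset

print(offset)
print(target_path)
```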

The pathlib.Path.glob() method provides a useful view of a directory’s content. This can be used by an overall function that calls the other two functions to create subdirectories and copy files into them.

Second, define a decorator to block the action if this is a dry run. Apply the decorator to the directory creation function and the file copy function. Note that these two function signatures are different. One function uses a single path, the other function uses two paths.

Third, create a suitable unit test to confirm that dry-run mode goes through the motions, but doesn’t alter the underlying file system. The tmp_path fixture provided by pytest supplies a temporary working directory; using this prevents endlessly having to drop and recreate output directories while debugging.
