Caching and memoization

Caching is a great technique used to improve the performance of a wide range of applications. The idea behind caching is to store expensive results in a temporary location, called a cache, which can reside in memory, on disk, or in a remote location.

Web applications make extensive use of caching. In a web application, it often happens that several users request the same page at the same time. In this case, instead of recomputing the page for each user, the web application can compute it once and serve every user the already rendered page. Ideally, caching also needs an invalidation mechanism so that when the page needs to be updated, we can recompute it before serving it again. Intelligent caching allows web applications to handle a growing number of users with fewer resources. Caching can also be done preemptively, as when the later sections of a video are buffered while you watch it online.
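As a minimal sketch of this pattern (the render_page function, the page names, and the dictionary-based cache below are hypothetical stand-ins, not any particular framework's API):

    # A minimal sketch of an in-memory page cache with explicit
    # invalidation; render_page stands in for an expensive step.
    _page_cache = {}

    def render_page(name):
        # Stand-in for expensive template rendering
        return "<html>Rendered {}</html>".format(name)

    def get_page(name):
        # Serve the cached copy when available; render otherwise
        if name not in _page_cache:
            _page_cache[name] = render_page(name)
        return _page_cache[name]

    def invalidate(name):
        # Drop the cached copy so the next request re-renders it
        _page_cache.pop(name, None)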

Caching is also used to improve the performance of certain algorithms. A great example is computing the Fibonacci sequence. Since computing the next number in the Fibonacci sequence requires the previous two numbers in the sequence, one can store and reuse previous results, dramatically improving the running time. Storing and reusing the results of previous function calls in an application is usually termed memoization, and is one form of caching. Several other algorithms can take advantage of memoization to gain impressive performance improvements, and this programming technique is commonly referred to as dynamic programming.
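To make the idea concrete, here is a minimal sketch of memoization with a hand-rolled decorator (the memoize helper is illustrative; the standard library offers a ready-made version, introduced below):

    def memoize(func):
        # Remember results of previous calls, keyed by the arguments
        cache = {}
        def wrapper(*args):
            if args not in cache:
                cache[args] = func(*args)
            return cache[args]
        return wrapper

    @memoize
    def fibonacci(n):
        return 1 if n <= 1 else fibonacci(n - 1) + fibonacci(n - 2)

Because the recursive calls go through the decorated name, every intermediate Fibonacci number is cached as well.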

The benefits of caching, however, do not come for free. What we are actually doing is sacrificing some space to improve the speed of the application. Additionally, if the cache is stored in a location on the network, we may incur transfer costs and the general time needed for communication. One should evaluate when it is convenient to use a cache and how much space one is willing to trade for an increase in speed.

Given the usefulness of this technique, the Python standard library includes a simple in-memory cache out of the box in the functools module. The functools.lru_cache decorator can be used to easily cache the results of a function. In the following example, we create a function, sum2, that prints a message and returns the sum of two numbers. By running the function twice, you can see that the first time sum2 is executed the "Calculating ..." string is printed, while the second time the result is returned without running the function body:

    from functools import lru_cache

    @lru_cache()
    def sum2(a, b):
        print("Calculating {} + {}".format(a, b))
        return a + b

    print(sum2(1, 2))
    # Output:
    # Calculating 1 + 2
    # 3

    print(sum2(1, 2))
    # Output:
    # 3

The lru_cache decorator also provides other basic features. To restrict the size of the cache, one can set the number of elements that we intend to maintain through the maxsize argument. If we want our cache size to be unbounded, we can specify a value of None. An example usage of maxsize is shown here:

    @lru_cache(maxsize=16)
    def sum2(a, b):
        ...

In this way, as we execute sum2 with different arguments, the cache will reach a maximum size of 16 and, as we keep requesting more calculations, new values will replace older values in the cache. The lru prefix refers to this replacement strategy, least recently used: when the cache is full, the entry that was accessed least recently is discarded first.
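For instance, with a cache limited to two entries, the "Calculating" messages reveal which calls miss the cache and which entry was evicted (a small sketch reusing the sum2 example):

    from functools import lru_cache

    @lru_cache(maxsize=2)
    def sum2(a, b):
        print("Calculating {} + {}".format(a, b))
        return a + b

    sum2(1, 1)  # Calculating 1 + 1 (miss)
    sum2(2, 2)  # Calculating 2 + 2 (miss)
    sum2(1, 1)  # hit: served from the cache
    sum2(3, 3)  # Calculating 3 + 3 (miss; evicts (2, 2))
    sum2(2, 2)  # Calculating 2 + 2 (miss again: it was evicted)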

The lru_cache decorator also adds extra functionality to the decorated function. For example, it is possible to examine the cache performance using the cache_info method, and it is possible to reset the cache using the cache_clear method, as follows:

    sum2.cache_info()
    # Output: CacheInfo(hits=0, misses=1, maxsize=128, currsize=1)
    sum2.cache_clear()
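After cache_clear, the counters start again from zero, which makes it easy to watch hits accumulate (assuming the default maxsize of 128):

    sum2(1, 2)  # miss: the function body runs
    sum2(1, 2)  # hit: served from the cache
    sum2.cache_info()
    # Output: CacheInfo(hits=1, misses=1, maxsize=128, currsize=1)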

As an example, we can see how a problem such as computing the Fibonacci sequence may benefit from caching. We can define a fibonacci function and time its execution:

    def fibonacci(n):
        if n <= 1:
            return 1
        else:
            return fibonacci(n - 1) + fibonacci(n - 2)

    # Non-memoized version
    %timeit fibonacci(20)
    100 loops, best of 3: 5.57 ms per loop

The execution takes 5.57 ms, which is very high for such a small input. Written in this way, the function scales poorly: previously computed Fibonacci numbers are not reused, so the algorithm has an exponential running time of roughly O(2^N).
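One way to see this blowup directly is to count the recursive calls (a small instrumentation sketch; the counter is not part of the algorithm):

    calls = 0

    def fibonacci_counted(n):
        global calls
        calls += 1
        if n <= 1:
            return 1
        return fibonacci_counted(n - 1) + fibonacci_counted(n - 2)

    fibonacci_counted(20)
    print(calls)
    # Output: 21891

Each increment of n multiplies the call count by roughly the golden ratio (about 1.6), confirming the exponential growth.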

Caching can improve this algorithm by storing and reusing the already-computed Fibonacci numbers. To implement the cached version, it is sufficient to apply the lru_cache decorator to the original fibonacci function. Also, to design a proper benchmark, we need to ensure that a new cache is instantiated for every run; to do this, we can use the timeit.repeat function, as shown in the following example:

    import timeit

    setup_code = '''
    from functools import lru_cache
    from __main__ import fibonacci
    fibonacci_memoized = lru_cache(maxsize=None)(fibonacci)
    '''

    results = timeit.repeat('fibonacci_memoized(20)',
                            setup=setup_code,
                            repeat=1000,
                            number=1)
    # min(results) is in seconds; convert to microseconds for display
    print("Fibonacci took {:.2f} us".format(min(results) * 1e6))
    # Output: Fibonacci took 0.01 us

Even though we changed the algorithm by simply adding a decorator, the running time is now much less than a microsecond. The reason is that, thanks to caching, we now have a linear-time algorithm instead of an exponential one: each Fibonacci number is computed only once and then retrieved from the cache.

The lru_cache decorator can be used to implement simple in-memory caching in your application. For more advanced use cases, such as persistent on-disk caching, third-party modules provide more powerful implementations.
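One such option is the joblib package (a third-party dependency, installed separately), whose Memory class persists results to disk; a minimal sketch, with an arbitrary cache directory name:

    from joblib import Memory

    # Cached results are written under the (arbitrary) cachedir
    # directory and survive interpreter restarts
    memory = Memory('cachedir', verbose=0)

    @memory.cache
    def sum2(a, b):
        return a + b

Because the cache lives on disk, it persists across program runs and can be shared between processes.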
