Comprehensions and generators

In this section, we will explore a few simple strategies to speed up Python loops using comprehensions and generators. In Python, comprehensions and generator expressions are fairly well optimized and are often preferable to explicit for-loops. Another reason to use these constructs is readability; even if the speedup over a standard loop is modest, the comprehension and generator syntax is more compact and (most of the time) more intuitive.

In the following example, we can see that both the list comprehension and the generator expression are faster than an explicit loop when combined with the sum function:

    def loop():
        res = []
        for i in range(100000):
            res.append(i * i)
        return sum(res)

    def comprehension():
        return sum([i * i for i in range(100000)])

    def generator():
        return sum(i * i for i in range(100000))

    %timeit loop()
    100 loops, best of 3: 16.1 ms per loop
    %timeit comprehension()
    100 loops, best of 3: 10.1 ms per loop
    %timeit generator()
    100 loops, best of 3: 12.4 ms per loop
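
Outside an IPython session, a similar measurement can be sketched with the standard timeit module. The best_time helper below is a hypothetical name introduced only for this illustration, and the timings it reports will differ from machine to machine:

```python
import timeit


def loop():
    res = []
    for i in range(100000):
        res.append(i * i)
    return sum(res)


def comprehension():
    return sum([i * i for i in range(100000)])


def generator():
    return sum(i * i for i in range(100000))


def best_time(func, number=10, repeat=3):
    """Hypothetical helper: best average time per call, in seconds."""
    # timeit.repeat runs `func` `number` times, and does so `repeat` times.
    timings = timeit.repeat(func, number=number, repeat=repeat)
    return min(timings) / number


if __name__ == "__main__":
    for func in (loop, comprehension, generator):
        print(f"{func.__name__}: {best_time(func) * 1000:.2f} ms per call")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit prints above.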

Just as with lists, we can use a dict comprehension to build dictionaries slightly more efficiently and compactly, as shown in the following code:

    def loop():
        res = {}
        for i in range(100000):
            res[i] = i
        return res

    def comprehension():
        return {i: i for i in range(100000)}

    %timeit loop()
    100 loops, best of 3: 13.2 ms per loop
    %timeit comprehension()
    100 loops, best of 3: 12.8 ms per loop

Efficient looping (especially in terms of memory) can be implemented using iterators and functions such as filter and map. As an example, consider the problem of applying a series of operations to a list using list comprehensions and then taking the maximum value:

    def map_comprehension(numbers):
        a = [n * 2 for n in numbers]
        b = [n ** 2 for n in a]
        c = [n ** 0.33 for n in b]
        return max(c)

The problem with this approach is that every list comprehension allocates a new list, increasing memory usage. Instead of list comprehensions, we can employ generators. Generators are objects that, when iterated over, compute each value on the fly and return it, without storing the whole sequence.
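
As a rough illustration of the difference, we can compare the size of a list with the size of an equivalent generator using sys.getsizeof. This is only a sketch: getsizeof reports the size of the container object itself, not of the integers it refers to, and the exact byte counts depend on the Python version.

```python
import sys

numbers = range(1000000)

# The list comprehension stores one reference per element.
as_list = [n * 2 for n in numbers]

# The generator expression stores only its own small state.
as_gen = (n * 2 for n in numbers)

list_size = sys.getsizeof(as_list)  # grows with the number of elements
gen_size = sys.getsizeof(as_gen)    # stays small regardless of the input size

print(f"list: {list_size} bytes, generator: {gen_size} bytes")
```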

For example, the map function takes at least two arguments--a function and an iterable--and returns a lazy iterator (which behaves like a generator) that applies the function to every element of the collection. The important point is that the operation happens only while we are iterating, and not when map is invoked!
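
The following sketch makes this laziness visible by recording each call to the mapped function. The names calls and double are illustrative only:

```python
calls = []  # records every argument the function actually receives


def double(n):
    calls.append(n)
    return n * 2


lazy = map(double, [1, 2, 3])
assert calls == []  # map() returned immediately; nothing was computed

first = next(lazy)  # pulls exactly one value through the function
assert first == 2 and calls == [1]

rest = list(lazy)  # consumes the remaining values
assert rest == [4, 6] and calls == [1, 2, 3]

assert list(lazy) == []  # the iterator is exhausted and cannot be reused
```

The last line shows a practical consequence: unlike a list, the object returned by map can be iterated over only once.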

We can rewrite the previous function using map, creating intermediate generators rather than lists and thus saving memory by computing the values on the fly:

    def map_normal(numbers):
        a = map(lambda n: n * 2, numbers)
        b = map(lambda n: n ** 2, a)
        c = map(lambda n: n ** 0.33, b)
        return max(c)
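
The same lazy pipeline can also be written with generator expressions instead of map and lambda. The map_generator function below is a variant added for illustration (the name is hypothetical) and is not one of the two versions measured in this section:

```python
def map_generator(numbers):
    # Each expression in parentheses builds a lazy generator;
    # no intermediate list is stored at any stage.
    a = (n * 2 for n in numbers)
    b = (n ** 2 for n in a)
    c = (n ** 0.33 for n in b)
    # max() pulls values through the whole chain one at a time.
    return max(c)


print(map_generator(range(1000)))
```

Which form to prefer is largely a matter of taste; generator expressions avoid the lambda calls and often read more naturally next to list comprehensions.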

We can profile the memory of the two solutions using the memory_profiler extension from an IPython session. The extension provides a small utility, %memit, that will help us evaluate the memory usage of a Python statement in a way similar to %timeit, as illustrated in the following snippet:

    %load_ext memory_profiler
    numbers = range(1000000)
    %memit map_comprehension(numbers)
    peak memory: 166.33 MiB, increment: 102.54 MiB
    %memit map_normal(numbers)
    peak memory: 71.04 MiB, increment: 0.00 MiB

As you can see, the first version increases memory usage by 102.54 MiB, while the second version shows an increment of 0.00 MiB! For the interested reader, more functions that return generators can be found in the itertools module, which provides a set of utilities designed to handle common iteration patterns.
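
As a brief taste of itertools, the sketch below uses three of its functions (count, islice, and chain, plus takewhile). The variable names are illustrative, and this is only a small sample of what the module offers:

```python
import itertools

# count() is an infinite lazy sequence: 0, 1, 2, ...
squares = map(lambda n: n * n, itertools.count())

# islice() takes a finite window from a (possibly infinite) iterator.
first_five = list(itertools.islice(squares, 5))

# chain() iterates over several iterables in turn without concatenating them.
combined = list(itertools.chain(range(3), "ab"))

# takewhile() stops as soon as the predicate fails.
small = list(
    itertools.takewhile(
        lambda n: n < 50,
        map(lambda n: n * n, itertools.count()),
    )
)

print(first_five, combined, small)
```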
