8
FUNCTIONAL PROGRAMMING

Many Python developers are unaware of the extent to which you can use functional programming in Python, which is a shame: with few exceptions, functional programming allows you to write more concise and efficient code. Moreover, Python’s support for functional programming is extensive.

This chapter will cover some of the functional programming aspects of Python, including creating and using generators. You’ll learn about the most useful functional packages and functions available and how to use them in combination to get the most efficient code.

NOTE

If you want to get serious about functional programming, here’s my advice: take a break from Python and learn a heavily functional programming language, such as Lisp. I know it might sound strange to talk about Lisp in a Python book, but playing with Lisp for several years taught me how to “think functional.” You may not develop the thought processes necessary to make full use of functional programming if all your experience comes from imperative and object-oriented programming. Lisp isn’t purely functional itself, but it has more focus on functional programming than you’ll find in Python.

Creating Pure Functions

When you write code using a functional style, your functions are designed to have no side effects: instead, they take an input and produce an output without keeping state or modifying anything not reflected in the return value. Functions that follow this ideal are referred to as purely functional.

Let’s start with an example of a regular, non-pure function that removes the last item in a list:

def remove_last_item(mylist):
    """Removes the last item from a list."""
    mylist.pop(-1)  # This modifies mylist

The following is a pure version of the same function:

def butlast(mylist):
    return mylist[:-1]  # This returns a copy of mylist

We define a butlast() function to work like butlast in Lisp: it returns the list without its last element, but without modifying the original list. Instead, it returns a copy of the list with the change applied, allowing us to keep the original intact.

The practical advantages of functional programming include the following:

Modularity Writing with a functional style forces a certain degree of separation in solving your individual problems and makes sections of code easier to reuse in other contexts. Since the function does not depend on any external variable or state, calling it from a different piece of code is straightforward.

Brevity Functional programming is often less verbose than other paradigms.

Concurrency Purely functional functions are thread-safe and can run concurrently. Some functional languages do this automatically, which can be a big help if you ever need to scale your application, though this is not quite the case yet in Python.

Testability Testing a functional program is incredibly easy: all you need is a set of inputs and an expected set of outputs. Purely functional code is deterministic, so calling the same function over and over with the same arguments will always return the same result, as the sketch below shows.
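For example, the pure butlast() function defined above needs nothing more than a handful of assertions to test. Here is a minimal sketch, assuming the butlast() defined earlier is in scope:

def test_butlast():
    original = [1, 2, 3]
    # The same input always yields the same output...
    assert butlast(original) == [1, 2]
    # ...and the input list is left untouched.
    assert original == [1, 2, 3]
    # Edge case: an empty list simply stays empty.
    assert butlast([]) == []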

Generators

A generator is an object that behaves like an iterator, in that it generates and returns a value on each call to next() until a StopIteration is raised. Generators, first introduced in PEP 255, offer an easy way to create objects that implement the iterator protocol. While writing generators in a functional style is not strictly necessary, doing so makes them easier to write and debug and is a common practice.

To create a generator, just write a regular Python function that contains a yield statement. Python will detect the use of yield and tag the function as a generator. When execution reaches the yield statement, the function returns a value as with a return statement, but with one notable difference: the interpreter will save a stack reference, and this will be used to resume the function’s execution when the next() function is called again.

When functions are executed, the chaining of their execution produces a stack—function calls are said to be stacked on each other. When a function returns, it’s removed from the stack, and the value it returns is passed to the calling function. In the case of a generator, the function does not really return but yields instead. Python therefore saves the state of the function as a stack reference, resuming the execution of the generator at the point it saved when the next iteration of the generator is needed.

Creating a Generator

As mentioned, you create a generator by writing a normal function and including yield in the function’s body. Listing 8-1 creates a generator called mygenerator() that includes three yields, meaning it will produce a value on each of the first three calls to next().

>>> def mygenerator():
...     yield 1
...     yield 2
...     yield 'a'
...
>>> mygenerator()
<generator object mygenerator at 0x10d77fa50>
>>> g = mygenerator()
>>> next(g)
1
>>> next(g)
2
>>> next(g)
'a'
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Listing 8-1: Creating a generator with three iterations

When it runs out of yield statements, StopIteration is raised at the next call to next().

In Python, generators keep a reference to the stack when a function yields something, and they resume this stack when a call to next() is executed again.

The naive approach when iterating over any data without using generators is to build the entire list first, which often consumes memory wastefully.

Say we want to find the first number between 1 and 10,000,000 that’s equal to 50,000. Sounds easy, doesn’t it? Let’s make this a challenge. We’ll run Python with a memory constraint of 128MB and try the naive approach of first building the entire list:

$ ulimit -v 131072
$ python3
>>> a = list(range(10000000))

This naive method first tries to build the whole list in memory. If we run it, we get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError

Uh-oh. Turns out we can’t build a list of 10 million items with only 128MB of memory!

WARNING

In Python 3, range() returns a lazy range object that produces its values on demand as you iterate over it, much like a generator, rather than building a list in memory. To get that behavior in Python 2, you have to use xrange() instead. This function doesn’t exist in Python 3 anymore, since it’s redundant.

Let’s try using a generator instead, with the same 128MB restriction:

$ ulimit -v 131072
$ python3
>>> for value in range(10000000):
...     if value == 50000:
...             print("Found it")
...             break
...
Found it

This time, our program executes without issue. When iterated over, range() produces its integers on demand rather than building the full list in memory. Better still, since we are only interested in finding the value 50,000, only about 50,000 numbers had to be generated before the loop stopped.

By generating values on the fly, generators allow you to handle large data sets with minimal consumption of memory and processing cycles. Whenever you need to work with a huge number of values, generators can help you handle them efficiently.
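As a quick illustration of this on-the-fly behavior, here is a minimal sketch of a generator that produces values one at a time instead of materializing a list (the even_squares() function is just an example for this discussion, not a standard one):

def even_squares(limit):
    """Yield even square numbers smaller than limit, one at a time."""
    for n in range(limit):
        square = n * n
        if square >= limit:
            return  # Ending the function raises StopIteration for the caller
        if square % 2 == 0:
            yield square

# Only the values actually consumed are ever computed.
print(list(even_squares(100)))  # [0, 4, 16, 36, 64]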

Returning and Passing Values with yield

A yield statement also has a less commonly used feature: it can return a value in the same way as a function call. This allows us to pass a value to a generator by calling its send() method. As an example of using send(), we’ll write a function called shorten() that takes a list of strings and returns a list consisting of those same strings, only truncated (Listing 8-2).

def shorten(string_list):
    length = len(string_list[0])
    for s in string_list:
        length = yield s[:length]

mystringlist = ['loremipsum', 'dolorsit', 'ametfoobar']
shortstringlist = shorten(mystringlist)
result = []
try:
    s = next(shortstringlist)
    result.append(s)
    while True:
        number_of_vowels = len(list(filter(lambda letter: letter in 'aeiou', s)))
        # Truncate the next string depending
        # on the number of vowels in the previous one
        s = shortstringlist.send(number_of_vowels)
        result.append(s)
except StopIteration:
    pass

Listing 8-2: Returning and using a value with send()

In this example, we’ve written a function called shorten() that takes a list of strings and returns a list consisting of those same strings, only truncated. The length of each truncated string is equal to the number of vowels in the previous string: loremipsum has four vowels, so the second value returned by the generator will be the first four letters of dolorsit; dolo has only two vowels, so ametfoobar will be truncated to its first two letters am. The generator then stops and raises StopIteration. Our generator thus returns:

['loremipsum', 'dolo', 'am']

Using yield and send() in this fashion allows Python generators to function like coroutines seen in Lua and other languages.
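To give a rough idea of this coroutine style, here is a small sketch (not part of the listing above) of a generator that maintains a running average of the values sent to it:

def running_average():
    """Yield the average of all values sent to this generator so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)            # Prime the coroutine so it reaches its first yield
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
print(avg.send(3))   # 11.0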

PEP 289 introduced generator expressions, making it possible to build one-line generators using a syntax similar to list comprehension:

>>> (x.upper() for x in ['hello', 'world'])
<generator object <genexpr> at 0x7ffab3832fa0>
>>> gen = (x.upper() for x in ['hello', 'world'])
>>> list(gen)
['HELLO', 'WORLD']

In this example, gen is a generator, just as if we had used the yield statement. The yield in this case is implicit.
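Generator expressions are especially handy as arguments to functions that consume an iterable, since no intermediate list is ever built. For instance:

>>> sum(x * x for x in range(10))
285
>>> any(word.startswith('w') for word in ['hello', 'world'])
True

When the generator expression is the only argument, the extra parentheses can be omitted.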

Inspecting Generators

To determine whether a function is considered a generator, use inspect.isgeneratorfunction(). In Listing 8-3, we create a simple generator and inspect it.

>>> import inspect
>>> def mygenerator():
...     yield 1
...
>>> inspect.isgeneratorfunction(mygenerator)
True
>>> inspect.isgeneratorfunction(sum)
False

Listing 8-3: Checking whether a function is a generator

Import the inspect package to use isgeneratorfunction() and then simply pass it the function you want to inspect. Reading the source code of inspect.isgeneratorfunction() gives us some insight into how Python marks functions as being generators (see Listing 8-4).

def isgeneratorfunction(object):
    """Return true if the object is a user-defined generator function.

    Generator function objects provides same attributes as functions.

    See help(isfunction) for attributes listing."""

    return bool((isfunction(object) or ismethod(object)) and
                object.func_code.co_flags & CO_GENERATOR)

Listing 8-4: Source code of inspect.isgeneratorfunction()

The isgeneratorfunction() function checks that the object is a function or a method and that its code has the CO_GENERATOR flag set. This example shows how easy it is to understand how Python works under the hood.
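If you want to see the flag for yourself, the code object is exposed as __code__ in Python 3, and inspect exports the CO_GENERATOR constant, so a quick check looks like this:

>>> import inspect
>>> def mygenerator():
...     yield 1
...
>>> def notagenerator():
...     return 1
...
>>> bool(mygenerator.__code__.co_flags & inspect.CO_GENERATOR)
True
>>> bool(notagenerator.__code__.co_flags & inspect.CO_GENERATOR)
False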

The inspect package provides the inspect.getgeneratorstate() function, which gives the current state of the generator. We’ll use it on mygenerator() here at different points of execution:

>>> import inspect
>>> def mygenerator():
...     yield 1
...
>>> gen = mygenerator()
>>> gen
<generator object mygenerator at 0x7f94b44fec30>
>>> inspect.getgeneratorstate(gen)
'GEN_CREATED'
>>> next(gen)
1
>>> inspect.getgeneratorstate(gen)
'GEN_SUSPENDED'
>>> next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> inspect.getgeneratorstate(gen)
'GEN_CLOSED'

This allows us to determine whether the generator is waiting to be run for the first time (GEN_CREATED), waiting to be resumed by a call to next() (GEN_SUSPENDED), or finished running (GEN_CLOSED). This might come in handy when debugging your generators.
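For example, you could use the generator state to guard against advancing a generator that has already finished. Here is a hypothetical helper, a sketch rather than a standard function:

import inspect

def safe_next(gen, default=None):
    # Skip the call entirely if the generator has already finished running.
    if inspect.getgeneratorstate(gen) == inspect.GEN_CLOSED:
        return default
    return next(gen, default)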

List Comprehensions

List comprehension, or listcomp for short, allows you to define a list’s contents inline with its declaration. A listcomp is wrapped in square brackets like an ordinary list, but it contains an expression that generates the items along with a for loop to iterate over them.

The following example creates a list without using list comprehension:

>>> x = []
>>> for i in (1, 2, 3):
...     x.append(i)
...
>>> x
[1, 2, 3]

And this next example uses list comprehension to make the same list with a single line:

>>> x = [i for i in (1, 2, 3)]
>>> x
[1, 2, 3]

Using a list comprehension presents two advantages: the code is usually shorter, and it requires fewer operations for Python to perform. Rather than creating an empty list and calling append() over and over, Python can build the resulting list in a single operation.

You can use multiple for statements together and use if statements to filter out items. Here we create a list of words and use list comprehension to capitalize each item, split up items with multiple words into single words, and filter out the extraneous or:

>>> x = [word.capitalize()
...      for line in ("hello world?", "world!", "or not")
...      for word in line.split()
...      if not word.startswith("or")]
>>> x
['Hello', 'World?', 'World!', 'Not']

This code has two for loops: the first iterates over the text lines, while the second iterates over words in each of those lines. The final if statement filters out words that start with or to exclude them from the final list.

Using list comprehension rather than for loops is a neat way to define lists quickly. Since we’re still talking about functional programming, it’s worth noting that lists built through list comprehension shouldn’t rely on changing the program’s state: you are not expected to modify any variable while building the list. This usually makes listcomps more concise and easier to read than lists built imperatively.

Note that there’s also syntax for building dictionaries or sets in the same fashion, like so:

>>> {x: x.upper() for x in ['hello', 'world']}
{'hello': 'HELLO', 'world': 'WORLD'}
>>> {x.upper() for x in ['hello', 'world']}
{'HELLO', 'WORLD'}

Functional Functions Functioning

You might repeatedly encounter the same set of problems when manipulating data using functional programming. To help you deal with this situation efficiently, Python includes a number of functions for functional programming. This section will give you a quick overview of some of these built-in functions that allow you to build fully functional programs. Once you have an idea of what’s available, I encourage you to research further and try out functions where they might apply in your own code.

Applying Functions to Items with map()

The map() function takes the form map(function, iterable) and applies function to each item in iterable to return a list in Python 2 or an iterable map object in Python 3, as shown in Listing 8-5.

>>> map(lambda x: x + "bzz!", ["I think", "I'm good"])
<map object at 0x7fe7101abdd0>
>>> list(map(lambda x: x + "bzz!", ["I think", "I'm good"]))
['I thinkbzz!', "I'm goodbzz!"]

Listing 8-5: Using map() in Python 3

You could write an equivalent of map() using list comprehension, like this:

>>> (x + "bzz!" for x in ["I think", "I'm good"])
<generator object <genexpr> at 0x7f9a0d697dc0>
>>> [x + "bzz!" for x in ["I think", "I'm good"]]
['I thinkbzz!', "I'm goodbzz!"]

Filtering Lists with filter()

The filter() function takes the form filter(function or None, iterable) and filters the items in iterable based on the result returned by function. This will return a list in Python 2 or an iterable filter object in Python 3:

>>> filter(lambda x: x.startswith("I "), ["I think", "I'm good"])
<filter object at 0x7f9a0d636dd0>
>>> list(filter(lambda x: x.startswith("I "), ["I think", "I'm good"]))
['I think']

You could also write an equivalent of filter() using list comprehension, like so:

>>> (x for x in ["I think", "I'm good"] if x.startswith("I "))
<generator object <genexpr> at 0x7f9a0d697dc0>
>>> [x for x in ["I think", "I'm good"] if x.startswith("I ")]
['I think']

Getting Indexes with enumerate()

The enumerate() function takes the form enumerate(iterable[, start]) and returns an iterable object that provides a sequence of tuples, each consisting of an integer index (starting with start, if provided) and the corresponding item in iterable. This function is useful when you need to write code that refers to array indexes. For example, instead of writing this:

i = 0
while i < len(mylist):
    print("Item %d: %s" % (i, mylist[i]))
    i += 1

you could accomplish the same thing more efficiently with enumerate(), like so:

for i, item in enumerate(mylist):
    print("Item %d: %s" % (i, item))

Sorting a List with sorted()

The sorted() function takes the form sorted(iterable, key=None, reverse=False) and returns a sorted version of iterable. The key argument allows you to provide a function that returns the value to sort on, as shown here:

>>> sorted([("a", 2), ("c", 1), ("d", 4)])
[('a', 2), ('c', 1), ('d', 4)]
>>> sorted([("a", 2), ("c", 1), ("d", 4)], key=lambda x: x[1])
[('c', 1), ('a', 2), ('d', 4)]
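The reverse argument can be combined with key; for example, to sort the same tuples from the largest to the smallest second element:

>>> sorted([("a", 2), ("c", 1), ("d", 4)], key=lambda x: x[1], reverse=True)
[('d', 4), ('a', 2), ('c', 1)]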

Finding Items That Satisfy Conditions with any() and all()

The any(iterable) and all(iterable) functions return a Boolean depending on the truth values of the elements in iterable. These simple functions are equivalent to the following full Python code:

def all(iterable):
    for x in iterable:
        if not x:
            return False
    return True

def any(iterable):
    for x in iterable:
        if x:
            return True
    return False

These functions are useful for checking whether any or all of the values in an iterable satisfy a given condition. For example, the following checks a list for two conditions:

mylist = [0, 1, 3, -1]
if all(map(lambda x: x > 0, mylist)):
    print("All items are greater than 0")
if any(map(lambda x: x > 0, mylist)):
    print("At least one item is greater than 0")

The difference here is that any() returns True when at least one element meets the condition, while all() returns True only if every element meets the condition. The all() function will also return True for an empty iterable, since none of the elements is False.
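You can check these edge cases directly in the interpreter:

>>> all([])
True
>>> any([])
False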

Combining Lists with zip()

The zip() function takes the form zip(iter1 [,iter2 [...]]). It takes multiple sequences and combines their corresponding elements into tuples. This is useful when you need to combine a list of keys and a list of values into a dict. As with the other functions described here, zip() returns a list in Python 2 and an iterable in Python 3. Here we map a list of keys to a list of values to create a dictionary:

>>> keys = ["foobar", "barzz", "ba!"]
>>> map(len, keys)
<map object at 0x7fc1686100d0>
>>> zip(keys, map(len, keys))
<zip object at 0x7fc16860d440>
>>> list(zip(keys, map(len, keys)))
[('foobar', 6), ('barzz', 5), ('ba!', 3)]
>>> dict(zip(keys, map(len, keys)))
{'foobar': 6, 'barzz': 5, 'ba!': 3}

A Common Problem Solved

There’s one important tool still to cover. Often when working with lists we want to find the first item that satisfies a specific condition. We’ll look at the many ways to accomplish this and then see the most efficient way: the first package.

Finding the Item with Simple Code

We might be able to find the first item to satisfy a condition with a function like this:

def first_positive_number(numbers):
    for n in numbers:
        if n > 0:
            return n

We could rewrite the first_positive_number() function in functional style like this:

def first(predicate, items):
    for item in items:
        if predicate(item):
            return item

first(lambda x: x > 0, [-1, 0, 1, 2])

By using a functional approach where the predicate is passed as an argument, the function becomes easily reusable. We could even write it more concisely, like so:

# Less efficient
list(filter(lambda x: x > 0, [-1, 0, 1, 2]))[0]
# Efficient
next(filter(lambda x: x > 0, [-1, 0, 1, 2]))

Note that the first version may raise an IndexError if no items satisfy the condition, since list(filter()) will then return an empty list.

For simple cases, you can instead pass a generator expression to next(), which avoids building the intermediate list entirely and so cannot raise an IndexError:

>>> a = range(10)
>>> next(x for x in a if x > 3)
4

However, this will raise StopIteration if the condition can never be satisfied. That, too, can be solved by passing a second argument to next(), as shown in Listing 8-6.

>>> a = range(10)
>>> next((x for x in a if x > 10), 'default')
'default'

Listing 8-6: Returning a default value when the condition is not met

This will return a default value rather than an error when a condition cannot be met. Lucky for us, Python provides a package to handle all of this for us.

Finding the Item Using first()

Rather than writing your own version of this function in all of your programs, you can include the small Python package first. Listing 8-7 shows how this package lets you find the first element of an iterable matching a condition.

>>> from first import first
>>> first([0, False, None, [], (), 42])
42
>>> first([-1, 0, 1, 2])
-1
>>> first([-1, 0, 1, 2], key=lambda x: x > 0)
1

Listing 8-7: Finding the first item in a list that satisfies a condition

You can see that the first() function returns the first truthy item in the list or, when a key function is supplied, the first item for which it returns True.

Using lambda with functools

You’ll notice that we’ve used lambda in a good portion of the examples so far in this chapter. The lambda keyword was added to Python to facilitate functional programming functions such as map() and filter(), which otherwise would have required writing an entirely new named function every time you wanted to check a different condition. Listing 8-8 is equivalent to Listing 8-7 but is written without using lambda.

import operator
from first import first

def greater_than_zero(number):
    return number > 0

first([-1, 0, 1, 2], key=greater_than_zero)

Listing 8-8: Finding the first item to meet the condition, without using lambda()

This code works identically to that in Listing 8-7, returning the first value in the list that meets the condition, but it’s a good deal more cumbersome: if we wanted to get the first number in the sequence that’s greater than, say, 42, we’d need to define an appropriate function via def rather than defining it inline with our call to first().

But despite its usefulness in helping us avoid situations like this, lambda still has its problems. The first() function accepts a key argument, which should be a function that receives each item as an argument and returns a Boolean indicating whether it satisfies the condition. However, if the logic we need won’t fit in a single expression, we can’t express it with lambda: a lambda body is limited to one expression and cannot contain statements. That is a significant limitation of lambda.

Instead, we would have to go back to the cumbersome pattern of writing new function definitions for each key we need. Or would we?

The functools package comes to the rescue with its partial() function, which provides us with a more flexible alternative to lambda. The functools.partial() function allows us to create a wrapper function with a twist: rather than changing the behavior of a function, it changes the arguments it receives, like so:

from functools import partial
from first import first

def greater_than(number, min=0):
    return number > min

first([-1, 0, 1, 2], key=partial(greater_than, min=42))

Here we create a new greater_than() function that works just like the old greater_than_zero() from Listing 8-8 by default, but this version allows us to specify the value we want to compare our numbers to, whereas before it was hardcoded. Here, we pass our function and the value we want for min to functools.partial(), and we get back a new function that has min set to 42, just as we want. In other words, we can write a function and use functools.partial() to customize the behavior of our new functions to suit our needs in any given situation.

Even this version can be pared down. All we’re doing in this example is comparing two numbers, and as it turns out, the operator module has built-in functions for exactly that:

import operator
from functools import partial
from first import first

first([-1, 0, 1, 2], key=partial(operator.lt, 0))

This is a good example of functools.partial() working with positional arguments. In this case, the function operator.lt(a, b), which takes two numbers and returns a Boolean that tells us whether the first number is strictly less than the second, is passed to functools.partial(). The 0 we pass to functools.partial() gets assigned to a, and the argument passed to the function returned by functools.partial() gets assigned to b. So this works identically to Listing 8-8 but without using lambda or defining any additional functions.

NOTE

The functools.partial() function is typically useful in place of lambda and should be considered a superior alternative. The lambda keyword is something of an anomaly in the Python language, and dropping it altogether was considered for Python 3 because of its limitation to a single expression.

Useful itertools Functions

Finally, we’ll look at some useful functions in the itertools module in the Python Standard Library that you should be aware of. Too many programmers end up writing their own versions of these functions simply because they aren’t aware that Python provides them out of the box. They are all designed to help you manipulate iterators (that’s why the module is called iter-tools) and therefore are all purely functional. Here I’ll list a few of them and give a brief overview of what they do, and I encourage you to look into them further if they seem of use.

  • accumulate(iterable[, func]) returns a series of accumulated sums (or the results of another binary function passed as func) of the items from iterable.

  • chain(*iterables) iterates over multiple iterables, one after another, without building an intermediate list of all items.

  • combinations(iterable, r) generates all combinations of length r from the given iterable.

  • compress(data, selectors) applies a Boolean mask from selectors to data and returns only the values from data where the corresponding element of selectors is True.

  • count(start, step) generates an endless sequence of values, starting with start and incrementing by step with each iteration.

  • cycle(iterable) loops repeatedly over the values in iterable.

  • repeat(elem[, n]) repeats an element n times, or endlessly if n is not provided.

  • dropwhile(predicate, iterable) skips elements of an iterable from the beginning for as long as predicate is true and then returns all of the remaining elements.

  • groupby(iterable, keyfunc) creates an iterator that groups items by the result returned by the keyfunc() function.

  • permutations(iterable[, r]) returns successive r-length permutations of the items in iterable.

  • product(*iterables) returns an iterable of the Cartesian product of iterables without using a nested for loop.

  • takewhile(predicate, iterable) returns elements of an iterable from the beginning for as long as predicate is true, stopping at the first element for which it is false.
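To give you a feel for a few of these, here is a short interpreter session exercising chain(), accumulate(), compress(), and takewhile() on some sample values:

>>> import itertools
>>> list(itertools.chain([1, 2], [3, 4]))
[1, 2, 3, 4]
>>> list(itertools.accumulate([1, 2, 3, 4]))
[1, 3, 6, 10]
>>> list(itertools.compress(['a', 'b', 'c', 'd'], [True, False, True, False]))
['a', 'c']
>>> list(itertools.takewhile(lambda x: x < 3, [1, 2, 3, 4, 1]))
[1, 2]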

These functions are particularly useful in conjunction with the operator module. When used together, itertools and operator can handle most situations that programmers typically rely on lambda for. Here’s an example of using operator.itemgetter() instead of writing lambda x: x['foo']:

>>> import itertools
>>> a = [{'foo': 'bar'}, {'foo': 'bar', 'x': 42}, {'foo': 'baz', 'y': 43}]
>>> import operator
>>> list(itertools.groupby(a, operator.itemgetter('foo')))
[('bar', <itertools._grouper object at 0xb000d0>), ('baz', <itertools._grouper object at 0xb00110>)]
>>> [(key, list(group)) for key, group in itertools.groupby(a, operator.itemgetter('foo'))]
[('bar', [{'foo': 'bar'}, {'x': 42, 'foo': 'bar'}]), ('baz', [{'y': 43, 'foo': 'baz'}])]

In this case, we could have also written lambda x: x['foo'], but using operator lets us avoid having to use lambda at all.

Summary

While Python is often advertised as being object oriented, it can be used in a very functional manner. A lot of its built-in concepts, such as generators and list comprehension, are functionally oriented and don’t conflict with an object-oriented approach. They also limit the reliance on a program’s global state, for your own good.

Using functional programming as a paradigm in Python can help you make your program more reusable and easier to test and debug, supporting the Don’t Repeat Yourself (DRY) mantra. In this spirit, the standard Python modules itertools and operator are good tools to improve the readability of your functional code.
