Chapter 18. Algorithms

Introduction

Credit: Tim Peters, PythonLabs

Algorithm research is what drew me to Python—and I fell in love. It wasn’t love at first sight, but it was an attraction that grew into infatuation, which grew steadily into love. And that love shows no signs of fading. Why? I’ve worked in fields pushing the state of the art, and, in a paradoxical nutshell, Python code is easy to throw away!

When you’re trying to solve a problem that may not have been solved before, you may have some intuitions about how to proceed, but you rarely know in advance exactly what needs to be done. The only way to proceed is to try things, many things, everything you can think of, just to see what happens. Python makes such exploration easier by minimizing the time and pain from conception to code: if your colleagues are using, for example, C or Java, it’s not unusual for you to try (and discard) six different approaches in Python while they’re still getting the bugs out of their first attempt.

In addition, you will have naturally grown classes and modules that capture key parts of the problem domain, simply because you find the need to keep reinventing them when starting over from scratch. I’ve used many languages in my computer career, and I know of none more productive than Python for prototyping. Best of all, while being an expert is often helpful, moderate skill in Python is much easier to obtain than for many other languages, yet much more productive for research and prototyping than merely moderate skill in any other language I’ve used. You don’t have to be an expert to start!

So if you’re in the research business—and every programmer who doesn’t know everything occasionally is—you’ve got a nearly perfect language in Python. How then do you develop the intuitions that can generate a myriad of plausible approaches to try? Experience is the final answer, as we all get better at what we do often, but studying the myriad approaches other people have tried develops a firm base from which to explore. Toward that end, here are the most inspiring algorithm books I’ve read. They’ll teach you possibilities you may never have discovered on your own:

John Bentley, Programming Pearls and More Programming Pearls (Addison-Wesley)

Every programmer should read these books from cover to cover for sheer joy. The chapters are extended versions of a popular column Bentley wrote for the Communications of the Association for Computing Machinery (CACM). Each chapter is generally self-contained, covering one or two lovely (and often surprising, in the “Aha! why didn’t I think of that?!” sense) techniques of real practical value.

Robert Sedgewick, Algorithms in C++ or Algorithms in C (Addison-Wesley)

These books cover the most important general algorithms, organized by problem domain, and provide brief but cogent explanations, along with working code. The books cover the same material; the difference is in which computer language is used for the code. I recommend the C++ book for Python programmers, because idiomatic Python is closer to C++ than to C. Sedgewick’s use of C++ is generally simple and easily translated to equivalent Python. This is the first book to reach for when you need to tackle a new area quickly.

Donald Knuth, The Art of Computer Programming, series (Addison-Wesley)

For experts (and those who aspire to expertise), this massive series in progress is the finest in-depth exposition of the state of the art. Nothing compares to its unique combination of breadth and depth, rigor, and historical perspective. Note that these books aren’t meant to be merely read; they have to be actively studied, and many valuable insights are scattered in answers to the extensive exercises. While the books include detailed analysis, there’s virtually no working code, except for programs written in assembly language for a hypothetical machine of archaic design (yes, it can be maddeningly obscure). It can be hard going at times, but few books so richly reward the time invested.

To hone your skills, you can practice on an endless source of problems from the wonderful On-Line Encyclopedia of Integer Sequences, at http://www.research.att.com/~njas/sequences/Seis.html. When stress-testing upcoming Python releases, I sometimes pick a sequence at random from its list of sequences needing more terms and write a program that attempts to extend the sequence. Sometimes I’m able to extend a sequence that hasn’t been augmented in years, in large part because Python has so many powerful features for rapid construction of new algorithms. The new terms are then contributed to the database, where they benefit others. Give it a try! You may love it, but even if you hate it, you’ll certainly find it challenging.

Timing and timeit.py

The first edition of this book contained a lengthy discussion of the difficulties in timing alternative approaches. Such difficulties include the fact that the resolution of time.time varies across platforms, and time.clock measures different things on different platforms (e.g., process CPU time on Linux systems, wall-clock time on Windows).

It may still be important for some to learn all those details, but Python 2.3 introduced a new timeit module, which captures best practice and is perfectly suited to timing small programs with a minimum of fuss and pitfalls. Everyone should learn how to use timeit, and basic usage is very easy to learn.

The simplest use of timeit is to pass one or more Python statements on the command line. Of course, shell syntax varies across platforms, so you may need to adjust these statements to the shell you use:

% python timeit.py "100 + 200"10000000 loops, best of 3: 0.0932 usec per loop
% python timeit.py "100 - 200"
10000000 loops, best of 3: 0.0931 usec per loop

As expected, integer addition and subtraction are just about equally expensive. (Don’t fall into the trap of attributing any significance to differences as tiny as this one!) timeit picks the best way of measuring time on your platform and runs your code in a loop. The module tries a few times first to determine how many iterations to use in the loop, aiming at a total loop time between 0.2 and 2 seconds. When it determines a suitable number of iterations for the loop, it then runs the loop three times, reports the shortest time, and computes the time per loop iteration. The iterations per loop, and number of loops to run, can be forced to specific values with command-line options. See the Python Library Reference for details. (It’s part of Python’s online documentation and probably also comes in some handy form with your version of Python.)
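For instance, to force one million iterations per loop and five repetitions of the loop, you can use the -n and -r options (a sketch of the invocation only; I omit the output, since it depends entirely on your machine):

% python timeit.py -n 1000000 -r 5 "100 + 200"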

As always, you should keep your machine as quiet as possible when running timing tests. The primary reason timeit runs three repetitions of the loop and reports the minimum time is to guard against skewed results due to other machine activity. This is especially important when running snippets that do very little work, such as the preceding examples. In such cases, even just one unfortunate interruption can grossly increase the reported time. Even so, on my quiet machine, snippets that run this fast can still yield confusing results:

% python timeit.py "100 + 200; 100 - 200"10000000 loops, best of 3: 0.151 usec per loop
% python timeit.py "100 + 200" "100 - 200"
10000000 loops, best of 3: 0.151 usec per loop

One correct conclusion is that modern Python no longer has a time penalty for writing two statements on two lines, instead of squashing them together on one line separated by a semicolon. Older Pythons generated a SET_LINENO opcode at the start of each logical line of code, and those opcodes consumed time to execute!

A more puzzling result is that adding and subtracting in one shot took 0.151 usec, but adding alone and subtracting alone took 0.0932 usec each. Why didn’t we get 2*0.093 = 0.186 usec in the second experiment? The explanation is quite simple: timeit uses a fast iteration technique and doesn’t try to subtract the iteration overhead from the reported results. When timing very fast snippets, this can be mildly disconcerting. Let’s try to measure the overhead by timing a do-nothing statement:

% python timeit.py "pass"10000000 loops, best of 3: 0.0203 usec per loop

While 0.02 usec is tiny, it’s significant compared to the 0.093 usec reported for an integer add! Of course this effect diminishes to insignificance when timing more expensive operations:

% python timeit.py "100**100"100000 loops, best of 3: 4.04 usec per loop
% python timeit.py "200**200"
100000 loops, best of 3: 9.03 usec per loop
% python timeit.py "100**100" "200**200"
100000 loops, best of 3: 13.1 usec per loop

Large integer exponentiation is much more expensive than adding small integers, and here the sum of the times for doing 100**100 and 200**200 in isolation is very close to the time for doing both at once.

The timeit module supports several other command-line options, and a programmatic interface too, but I’ll defer to the Python Library Reference for that information. To start making productive use of timeit, the only other option you need to know about is the ability to pass “setup” statements on the command line. These statements execute once, outside the loop containing the code you’re timing. Typical setup statements are import statements and code that populates data structures. For example (assuming a backslash is your shell’s way to indicate that a long logical line continues on the next physical line):

% python timeit.py -s "import random" 
  -s "x=range(100000); random.shuffle(x)" "sorted(x)"10 loops, best of 3: 152 msec per loop

For each of the three loops, timeit constructed the randomly ordered array just once, then ran sorted(x) repeatedly inside the loop. This was so expensive that timeit ran only 10 iterations per loop and changed its reporting units from microseconds to milliseconds. (In Python 2.3, timeit always reported in microseconds, but in version 2.4, it tries to be more helpful by picking the appropriate reporting units.) This is very different from:

% python timeit.py "import random" 
  "x=range(100000); random.shuffle(x)" "sorted(x)"10 loops, best of 3: 309 msec per loop

This snippet timed all the operations: importing random, building the list, randomly permuting the list, and sorting the list. As the reported times show, this preparation code (building and shuffling the list) took at least as long as the sorting itself. That may seem surprising: the first two operations take O ( n ) time, but sorting random data takes O ( n log n ) time. Given this, how can the measurement be explained? Why didn’t sorting take longer?

I won’t explain that mystery here but will point out a more significant lesson: timing code always uncovers mysteries, and a timing tool as easy to use as timeit can be addictive. So be careful what you measure! Measuring itself will consume more of your time than you expect. As noted innumerable times by innumerable authors, the speed of most of your code doesn’t matter at all. Find the 10% that consumes most of the time before worrying about any of it. When you find the true bottlenecks, timeit can help you measure the speed of alternatives objectively—and you may be surprised by what you find.
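For completeness, here is a sketch of how the sorted(x) measurement with separate setup code could be performed through timeit’s programmatic interface rather than the command line (the Timer class takes the statement to time and the setup code as strings, mirroring the command-line arguments; the numbers you get will of course differ from those shown above):

import timeit
timer = timeit.Timer('sorted(x)',
                     'import random; x = range(100000); random.shuffle(x)')
# run the 10-iteration loop three times, keep the best, convert to per-loop seconds
print min(timer.repeat(repeat=3, number=10)) / 10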

18.1. Removing Duplicates from a Sequence

Credit: Tim Peters

Problem

You have a sequence that may include duplicates, and you want to remove the duplicates in the fastest possible way, without knowing much about the properties of the items in the sequence. You do not care about the order of items in the resulting sequence.

Solution

The key is to try several approaches, fastest first, and use try/except to handle the failing cases of the faster approaches by falling back to slower approaches. Here’s a function that implements exactly this strategy:

# support 2.3 as well as 2.4
try: set
except NameError: from sets import Set as set
def unique(s):
    """ Return a list of the elements in s in arbitrary order, but without
        duplicates. """
    # Try using a set first, because it's the fastest and will usually work
    try:
        return list(set(s))
    except TypeError:
        pass  # Move on to the next method
    # Since you can't hash all elements, try sorting, to bring equal items
    # together and then weed them out in a single pass
    t = list(s)
    try:
        t.sort( )
    except TypeError:
        del t  # Move on to the next method
    else:
        # the sort worked, so we're fine -- do the weeding
        return [x for i, x in enumerate(t) if not i or x != t[i-1]]
    # Brute force is all that's left
    u = [  ]
    for x in s:
        if x not in u:
            u.append(x)
    return u

Discussion

The purpose of this recipe’s unique function is to take a sequence s as an argument and return a list of the items in s in arbitrary order, but without duplicates. For example, calling unique([1, 2, 3, 1, 2, 3]) returns an arbitrary permutation of [1, 2, 3], calling unique('abcabc') returns an arbitrary permutation of ['a', 'b', 'c'], and calling unique(([1, 2], [2, 3], [1, 2])) returns an arbitrary permutation of [[2, 3], [1, 2]].

The fastest way to remove duplicates from a sequence depends on fairly subtle properties of the sequence elements, such as whether they’re hashable and whether they support full comparisons. The unique function shown in this recipe tries three methods, from fastest to slowest, letting runtime exceptions pick the best method for the sequence at hand.

For fastest speed, all sequence elements must be hashable. When they are, the unique function will usually work in linear time (i.e., O ( n ), or directly proportional to the number of elements in the input, which is good and highly scalable performance behavior).

If it turns out that hashing the elements (e.g., using them as dictionary keys, or, as in this case, set elements) is not possible, the next best situation is when the elements enjoy a total ordering, meaning that each element can be compared to each other element with the < operator. If list( s ).sort() doesn’t raise a TypeError, we can assume that s' elements can be sorted and therefore enjoy a total ordering. Then unique will usually work in O( n log( n )) time. Python lists’ sort method is particularly efficient in the presence of partially ordered data (including, e.g., data with many duplicates), so the sorting approach may be more effective in Python than elsewhere.

If sorting also turns out to be impossible, the sequence items must at least support equality testing, or else the very concept of duplicates can’t really be meaningful for them. In this case, unique works in quadratic time—that is, O( n 2), meaning time proportional to the square of the number of elements in the input: not very scalable, but the least of all evils, given the sequence items’ obviously peculiar nature (assuming we get all the way to this subcase).
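As a quick illustration with made-up data, here is how each of unique’s three strategies might get exercised in turn (the first call uses the set; the second falls back to sorting, because lists aren’t hashable; the third ends up with brute force, because the inner complex numbers don’t even support ordering):

print unique([1, 2, 3, 1, 2, 3])           # hashable items: set-based path
print unique([[1, 2], [2, 3], [1, 2]])     # unhashable but sortable: sort-based path
print unique([[1+2j], [3+4j], [1+2j]])     # neither hashable nor sortable: brute force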

This recipe is a pure example of how algorithm efficiency depends on the strength of the assumptions you can make about the data. You could split this recipe’s function into three distinct functions and directly call the one that best meets your needs. In practice, however, the brute-force method is so slow for large sequences that nothing measurable is lost by simply letting the function as written try the faster methods first.

If you need to preserve the same order of items in the output sequence as in the input sequence, see Recipe 18.2.

See Also

Recipe 18.2.

18.2. Removing Duplicates from a Sequence While Maintaining Sequence Order

Credit: Alex Martelli

Problem

You have a sequence that may include duplicates, and you want to remove the duplicates in the fastest possible way. Moreover, the output sequence must respect the item ordering of the input sequence.

Solution

The need to respect the item ordering of the input sequence means that picking unique items becomes a problem quite different from that explored previously in Recipe 18.1. This requirement often arises in conjunction with a function f that defines an equivalence relation among items: x is equivalent to y if and only if f ( x )== f ( y ). In this case, the task of removing duplicates may often be better described as picking the first representative of each resulting equivalence class. Here is a function to perform this task:

# support 2.3 as well as 2.4
try: set
except NameError: from sets import Set as set
# f defines an equivalence relation among items of sequence seq, and
# f(x) must be hashable for each item x of seq
def uniquer(seq, f=None):
    """ Keeps earliest occurring item of each f-defined equivalence class """
    if f is None:    # f's default is the identity function f(x) -> x
        def f(x): return x
    already_seen = set( )
    result = [  ]
    for item in seq:
        marker = f(item)
        if marker not in already_seen:
            already_seen.add(marker)
            result.append(item)
    return result

Discussion

The previous Recipe 18.1 is applicable only if you are not concerned about item ordering or, in other words, if the sequences involved are meaningful only as the sets of their items, which is often the case. When sequential order is significant, a different approach is needed.

If the items are hashable, it’s not hard to maintain sequence order, keeping only the first occurrence of each value. More generally, we may want uniqueness within equivalence classes, as shown in this recipe’s Solution: the uniquer function accepts as an argument a function f that must return hashable objects, such that f ( x )== f ( y ) if and only if items x and y are equivalent. Identity (in the mathematical sense, not in the Python sense) is used as the default when no argument f is supplied. The added generality of allowing an f different from the identity function costs no added complication whatsoever.

If you need to keep the last occurrence, rather than the earliest occurrence, of an item in each equivalence class, the simplest approach is to reverse the input sequence (or, rather, a copy thereof into a local list, since the input might be immutable or at any rate not support reversing), then, after processing with uniquer, reverse the resulting list:

def uniquer_last(seq, f=None):
    seq = list(seq)
    seq.reverse( )
    result = uniquer(seq, f)
    result.reverse( )
    return result

In Python 2.4, instead of the first three statements of this version of uniquer_last, you could use the single statement:

    result = uniquer(reversed(seq), f)

exploiting the new built-in reversed. However, this Python 2.4 version, while marginally faster, is less general, because it does require seq to be really a sequence, while the previously shown version (and the uniquer function in the “Solution”) works with any iterable seq. For example:

    somelines = uniquer_last(open('my.txt'), str.lower)

binds name somelines to the list of unique lines from text file my.txt, considering two lines equivalent if they’re equal aside from uppercase and lowercase distinctions, picking the last occurring one from each set of equivalent lines, and preserving the order of the lines in the file (phew). If you used Python 2.4’s built-in reversed, this latest snippet would not work, due to reversed’s prerequisites.

If you must deal with nonhashable items, the simplest fallback approach is to use a set-like container that supports the add method and membership testing without requiring items to be hashable. Unfortunately, performance will be much worse than with a real set. Here’s the simplest fallback implementation, which demands of items nothing but support for equality testing:

def uniquer_with_simplest_fallback(seq, f=None):
    if f is None:
        def f(x): return x
    already_seen = set( )
    result = [  ]
    for item in seq:
        marker = f(item)
        try:
            new_marker = marker not in already_seen
        except TypeError:
            class TotallyFakeSet(list):
                add = list.append
            already_seen = TotallyFakeSet(already_seen)
            new_marker = marker not in already_seen
        if new_marker:
            already_seen.add(marker)
            result.append(item)
    return result

A more refined approach would be to use two levels of fallback, the intermediate one based on sorting, as shown previously in Recipe 18.1: membership testing in a sorted list can be performed efficiently by using the Python Standard Library module bisect.
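One possible shape for that intermediate fallback (a sketch, not part of the recipe itself) is a small helper class that keeps markers in a sorted list, offering just the add method and membership testing that uniquer_with_simplest_fallback relies on; an instance of it could then stand in for TotallyFakeSet whenever the markers, while not hashable, do support ordering:

import bisect
class SortedMarkers(object):
    ''' set-like helper for markers that are sortable but not hashable '''
    def __init__(self, items=()):
        self._markers = list(items)
        self._markers.sort()
    def __contains__(self, marker):
        i = bisect.bisect_left(self._markers, marker)
        return i < len(self._markers) and self._markers[i] == marker
    def add(self, marker):
        bisect.insort(self._markers, marker)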

However, remember that you can often use an f that gives you hashable markers for nonhashable items. The built-in function repr can often be useful for this purpose. For example:

lol = [ [1, 2], [  ], [1, 2], [3], [  ], [3, 4], [1, 2], [  ], [2, 1] ]
print uniquer(lol, repr)
# emits:[[1, 2], [  ], [3], [3, 4], [2, 1]]

While the items of lol are lists, and thus are not hashable, the built-in function repr produces representations of each of the items as a string, which is hashable. This enables use of the fast function uniquer. Unfortunately, repr is not useful for nonhashable items of other types, including dict and set. Because of the workings of hash-collision resolution, it’s quite possible to have d1 == d2 and yet repr( d1 )!=repr( d2 ) for two dictionaries d1 and d2, depending on the exact sequences of adds that built each dict. Still, you may be able to build your own repr-like function to work around these issues, depending on the exact nature of your data. Whether repr can help for instances of a certain user-defined type depends on how accurately and usefully that specific type defines the special method __repr__, which repr calls.
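For example, for dictionaries whose keys support a consistent ordering, a marker function along the following lines (a sketch; dict_repr is not part of the recipe) can stand in for plain repr, since sorting the items first makes equal dicts produce equal marker strings:

def dict_repr(d):
    # sort the items so that equal dicts yield equal marker strings
    items = d.items()
    items.sort()
    return repr(items)

You would then call, for example, uniquer(list_of_dicts, dict_repr).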

The task of picking one representative item, out of all of those belonging to each equivalence class, can be generalized. Instead of the simple ideas of implicitly picking the first such item, or the last such item, we can choose among multiple items in the same equivalence class via an arbitrary picking function p that considers both the actual items and their indexes of occurrence. As long as function p can operate pairwise, the key idea is just to keep a dictionary that maps the marker of each equivalence class to the index and item of the representative so far picked for that class. At the end, we reconstruct sequence order by sorting on the indices:

def fancy_unique(seq, f, p):
    """ Keeps "best" item of each f-defined equivalence class, with
        picking function p doing pairwise choice of (index, item) """
    representative = {  }
    for index, item in enumerate(seq):
        marker = f(item)
        if marker in representative:
            # It's NOT a problem to rebind index and item within the
            # for loop: the next leg of the loop does not use their binding
            index, item = p((index, item), representative[marker])
        representative[marker] = index, item
    # reconstruct sequence order by sorting on indices
    auxlist = representative.values( )
    auxlist.sort( )
    return [item for index, item in auxlist]

It’s possible that the picking function cannot operate pairwise, but rather must be presented with the whole bunch of ( index, item ) pairs for each equivalence class in order to pick the best representative of that class (e.g., it may have to get the median of the items in each class as being the best representative of that class). Then we need one pass over the sequence to collect the bunches, followed by a pass over the bunches, to pick the representatives:

def fancier_uniquer(seq, f, p):
    """ Keeps "best" item of each f-defined equivalence class, with
        picking function p choosing appropriate (index, item) for each
        equivalence class from the list of all (index, item) pairs in
        that class """
    bunches = {  }
    for index, item in enumerate(seq):
        marker = f(item)
        bunches.setdefault(marker, [  ]).append((index, item))
    auxlist = [p(candidates) for candidates in bunches.values( )]
    auxlist.sort( )
    return [item for index, item in auxlist]

These fancy approaches that rely on a picking function are useful only for substantial equivalence functions, not for identity, so I removed f’s default value from these versions.

An example of use for fancy_unique may help. Say we’re given a list of words, and we need to get a sublist from it, respecting order, such that no two words on the sublist begin with the same letter. Out of all the words in the original list that begin with each given letter, we need to keep the longest word and, in case of equal lengths, the word appearing later on the list. This sounds complicated, but with fancy_unique to help us, it’s really not that bad:

def complicated_choice(words):
    def first_letter(aword):
        return aword[0].lower( )
    def prefer((indx1, word1), (indx2, word2)):
        if len(word2) > len(word1):
            return indx2, word2
        return indx1, word1
    return fancy_unique(words, first_letter, prefer)

The prefer nested function within complicated_choice is simplified because it knows fancy_unique always calls it with indx2<indx1. So, the older indx2, word2 pair must be returned only when word2 is longer than word1; otherwise, indx1, word1 is always the proper result. The automatic tuple unpacking in prefer’s signature is debatable, stylewise, but I like it (it reminds me of SML or Haskell).
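A similar, purely hypothetical example for fancier_uniquer: keep, for each initial letter, the word of median length among all the words beginning with that letter. This is a task that needs the whole bunch of candidates for a class at once, so a pairwise picking function won’t do:

def median_length_choice(words):
    def first_letter(aword):
        return aword[0].lower()
    def pick_median_length(candidates):
        # candidates is the list of all (index, word) pairs in one class;
        # decorate-sort-undecorate by length, then take the middle entry
        decorated = [(len(word), index, word) for index, word in candidates]
        decorated.sort()
        length, index, word = decorated[len(decorated) // 2]
        return index, word
    return fancier_uniquer(words, first_letter, pick_median_length)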

Out of all the general programming techniques presented in the various functions of this recipe, the idea of writing higher-order functions, which organize a computation and appropriately call back to functions that they receive as arguments, is easily the most precious and widely applicable concept. This idea is well worth keeping in mind in several circumstances—not just for old Haskell-heads, because it works just as well in Python.

See Also

Recipe 18.1.

18.3. Generating Random Samples with Replacement

Credit: Sean Ross

Problem

You need to generate random samples with replacement out of a “population” of items that are held in a sequence.

Solution

A generator for the purpose is quintessentially simple:

import random
def sample_wr(population, _choose=random.choice):
    while True: yield _choose(population)

Discussion

random.sample lets you do random sampling without replacement, and Recipe 18.4, which follows, shows a generator to perform sampling without replacement with excellent memory-consumption characteristics. This recipe provides a generator for sampling with replacement, which is an even simpler task. Basically all the work gets delegated to random.choice. The sample_wr generator shown in this recipe is unbounded: used on its own, it will keep looping forever. To bound the output of an intrinsically unbounded generator, you can use it in a for statement that at some point executes a break, or use other techniques shown in Recipe 19.2.

For example, to make a random string of 50 lowercase ASCII letters:

import itertools
from string import ascii_lowercase
x = ''.join(itertools.islice(sample_wr(ascii_lowercase), 50))

string.ascii_lowercase is exactly the string 'abcdefghijklmnopqrstuvwxyz'. If you didn’t have the sample_wr generator, the equivalent code might be something like:

from string import ascii_lowercase
from random import choice
x = ''.join([ choice(ascii_lowercase) for i in xrange(50) ])

So, the practical added-value of sample_wr is modest, when compared to other available building-blocks. It is, however, preferable to have such a fundamental concept of statistics as sampling with replacement embodied as its own function, rather than having to implement it with an explicit loop over random.choice each time it is needed.

See Also

Library Reference and Python in a Nutshell docs for module random.

18.4. Generating Random Samples Without Replacement

Credit: Raymond Hettinger

Problem

You need to generate random samples without replacement out of a “population” (the integers from 0, included, up to some n, excluded), and you want better memory consumption characteristics than random.sample provides.

Solution

A generator for this purpose requires only constant memory and makes a small number of calls to random.random:

import random
def sample(n, r):
    " Generate r randomly chosen, sorted integers from [0,n) "
    rand = random.random
    pop = n
    for samp in xrange(r, 0, -1):
        cumprob = 1.0
        x = rand( )
        while x < cumprob:
            cumprob -= cumprob * samp / pop
            pop -= 1
        yield n-pop-1

Discussion

random.sample(xrange(10), 3) produces output statistically equal to list(sample(10, 3)) using this recipe’s sample. Differently from random.sample(xrange( n ), r ), this recipe’s sample( n, r ) requires a bounded amount of memory (which does not grow with either r or n) and is guaranteed to make only r calls to random.random. Moreover, this recipe’s sample yields the r numbers of the sample in sorted order, while random.sample returns them in random order—which may be insignificant or a crucial advantage one way or the other, depending on your application’s needs. A definite advantage of random.sample is that its running time is O ( r ), while this recipe’s sample function’s running time is O ( n ).
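For instance, a quick interactive check might look like this (the seeding is purely to make the demo repeatable; without it, you’ll see a different, equally valid sample each time):

import random
random.seed(1234)
print list(sample(10, 3))    # emits three distinct, sorted integers from range(10)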

This recipe was inspired by Knuth’s Algorithm S in Donald E. Knuth, The Art of Computer Programming, Volume 3, Seminumerical Algorithms, in section 3.4.2. However, this recipe has one improvement over Knuth’s algorithm: by tracking a cumulative probability for each selection, this recipe eliminates the need to make n calls to random.random.

A potential major improvement would be to find a direct formula giving the same result as the inner loop: given x, samp, and pop, compute the index of the first sample. Finding this formula would reduce the running time to O ( r ).

See Also

Library Reference and Python in a Nutshell docs about random.

18.5. Memoizing (Caching) the Return Values of Functions

Credit: Paul Moore, Mitch Chapman, Hannu Kankaanpää

Problem

You have a pure function that is often called with the same arguments (particularly a recursive function) and is slow to compute its results. You want to find a simple way to gain substantial improvement in performance.

Solution

The key idea behind memoizing is to store a function’s results in a dictionary, keyed by the arguments that produce each result. This approach makes sense only for a pure function (i.e., one that yields the same result when called more than once with the same arguments). It’s easy to memoize a function by hand. For example, using the recursive Fibonacci function, here is a manually memoized function:

fib_memo = {  }
def fib(n):
    if n < 2: return 1
    if n not in fib_memo:
        fib_memo[n] = fib(n-1) + fib(n-2)
    return fib_memo[n]

Having to code the memoization inside each function to be memoized is repetitive and degrades the function’s readability. An alternative is to encapsulate the memoization mechanics into a closure:

def memoize(fn):
    memo = {  }
    def memoizer(*param_tuple, **kwds_dict):
        # can't memoize if there are any named arguments
        if kwds_dict:
            return fn(*param_tuple, **kwds_dict)
        try:
            # try using the memo dict, or else update it
            try:
                return memo[param_tuple]
            except KeyError:
                memo[param_tuple] = result = fn(*param_tuple)
                return result
        except TypeError:
            # some mutable arguments, bypass memoizing
            return fn(*param_tuple)
    # 2.4 only: memoizer.__name__ = fn.__name__
    return memoizer

Using this memoize closure to memoize fib, the function definition becomes obvious, without caching boilerplate to obscure the algorithm. You must assign the memoize result to the same name, fib, as the recursive function; otherwise, the recursive calls bypass the memoizing:

def fib(n):
    if n < 2: return 1
    return fib(n-1) + fib(n-2)
fib = memoize(fib)

This latest snippet shows that memoize is meant to be used exactly as a Python 2.4 decorator, so in Python 2.4, you could use decorator syntax (instead of the explicit call to memoize):

@memoize
def fib(n):
    if n < 2: return 1
    return fib(n-1) + fib(n-2)

giving exactly the same semantics as the previous snippet.

Discussion

The memoize function is called with just one argument, a function f. It returns a closure memoizer that acts just like f but memoizes its arguments and result if the actual arguments to a call are hashable and positional. Calls with mutable or keyword arguments bypass the memoizing. If you’re worried that such bypassing happens too often, making memoizing counterproductive, you should do a few dry runs that are representative of your intended production usage, with a closure that’s modified to collect statistics. Then you can decide whether memoization is worth using for your specific application. Here’s the modified memoize for this purpose:

def memoize(fn):
    memo = {  }
    def memoizer(*param_tuple, **kwds_dict):
        if kwds_dict:
            memoizer.namedargs += 1
            return fn(*param_tuple, **kwds_dict)
        try:
            memoizer.cacheable += 1
            try:
                return memo[param_tuple]
            except KeyError:
                memoizer.misses += 1
                memo[param_tuple] = result = fn(*param_tuple)
                return result
        except TypeError:
            memoizer.cacheable -= 1
            memoizer.noncacheable += 1
            return fn(*param_tuple)
    memoizer.namedargs = memoizer.cacheable = memoizer.noncacheable = 0
    memoizer.misses = 0
    return memoizer
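For example, a dry run on the memoized Fibonacci function, using the Python 2.4 decorator syntax shown earlier, might look like the following sketch (the counts you get depend, of course, on how many calls you make):

@memoize
def fib(n):
    if n < 2: return 1
    return fib(n-1) + fib(n-2)
for i in xrange(30):
    fib(i)
print 'cacheable calls:      ', fib.cacheable
print 'cache misses:         ', fib.misses
print 'calls with named args:', fib.namedargs
print 'non-cacheable calls:  ', fib.noncacheable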

Functions to be memoized must be pure (i.e., they must have no side effects and must always return the same value whenever they are called with the same arguments). More significantly, memoize returns a closure that does not memoize calls that receive mutable arguments, such as len on a list, nor functions that receive named parameters.

memoize cannot really check the semantics of the functions you wrap. The notions of same value and same arguments are vaguely defined in many cases, so take care. memoize does try to field occasional calls with keyword and mutable arguments (with an interesting mix of checking and try/except), but performance will suffer unless such cases are rare. This is why it’s worth having around a version of memoize that keeps counts of the various possibilities, so that you can check their rarity.

See Also

Recipe 20.4 applies caching to class instances’ attributes.

18.6. Implementing a FIFO Container

Credit: Sébastien Keim, Alex Martelli, Raymond Hettinger, Jeremy Fincher, Danny Yoo, Josiah Carlson

Problem

You need a container that allows element insertion and removal, in which the first element inserted is also the first to be removed (i.e., a first-in first-out, FIFO, queue).

Solution

We can subclass list to implement a Pythonic version of an idea found in Knuth’s Art of Computer Programming: the frontlist/backlist approach to building a FIFO out of two one-way linked lists. Here’s how:

class Fifo(list):
    def __init__(self):
        self.back = [  ]
        self.append = self.back.append
    def pop(self):
        if not self:
            self.back.reverse( )
            self[:] = self.back
            del self.back[:]
        return super(Fifo, self).pop( )

Discussion

Here is a usage example, protected by the usual guard so it runs only when the module executes as a main script rather than being imported:

if __name__ == '__main__':
    a = Fifo( )
    a.append(10)
    a.append(20)
    print a.pop( ),
    a.append(5)
    print a.pop( ),
    print a.pop( ),
    print
# emits:10 20 5

The key idea in class Fifo is to have an auxiliary backlist, self.back, to which incoming items get appended. Outgoing items get popped from the frontlist, self. Each time the frontlist is exhausted, it gets replenished with the reversed contents of the backlist, and the backlist is emptied. The reversing and copying are O ( n ), where n is the number of items appended since the “front list” was last empty, but these operations are performed only once every n times, so the amortized cost of each call to pop is a constant—that is, O(1).

A simpler way to build a FIFO in Python is no doubt to just use a standard list’s append and pop(0) methods—something like:

class FifoList(list):
    def pop(self):
        return super(FifoList, self).pop(0)

However, when using a list in this way, we need to keep in mind that pop( 0 ) is O ( n ), where n is the current length of the list. O ( 1) performance can be ensured by building the FIFO in a slightly less intuitive way, on top of a dictionary:

class FifoDict(dict):
    def __init__(self):
        self.nextin = 0
        self.nextout = 0
    def append(self, data):
        self.nextin += 1
        self[self.nextin] = data
    def pop(self):
        self.nextout += 1
        return dict.pop(self, self.nextout)

In Python 2.4, we also have collections.deque, a double-ended queue type that also ensures O ( 1 ) performance when used as a FIFO (using its append and popleft methods):

import collections
class FifoDeque(collections.deque):
    pop = collections.deque.popleft

To choose among different implementations of the same interface, such as the various Fifo... classes shown in this recipe, the best approach often is to measure their performance on artificial benchmark examples that provide a reasonable simulation of the expected load in your application. I ran some such measurements on a somewhat slow laptop, with Python 2.4, using the timeit module from the Python Standard Library. For a total of 6,000 appends and pops, with a maximum length of 3,000, class Fifo takes about 62 milliseconds, class FifoList about 78, FifoDict about 137, and FifoDeque about 30. Making the problem exactly ten times bigger, we see the advantages of O ( 1 ) behavior (exhibited by all of these classes except FifoList). Fifo takes 0.62 seconds, FifoList 3.8, FifoDict 1.4, and FifoDeque 0.29. Clearly, in Python 2.4, FifoDeque is fastest as well as simplest; if your code has to support Python 2.3, the Fifo class shown in this recipe’s Solution is best.
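A benchmark along these lines can reproduce such measurements (a sketch only: the module name fifo_classes is hypothetical, standing for wherever you keep the various Fifo... classes, and the numbers you get will depend on your machine and Python version):

import timeit
setup = "from fifo_classes import %s as F"
stmt = '''
q = F()
for i in xrange(3000): q.append(i)
for i in xrange(3000): q.pop()
'''
for name in 'Fifo', 'FifoList', 'FifoDict', 'FifoDeque':
    timer = timeit.Timer(stmt, setup % name)
    print name, min(timer.repeat(repeat=3, number=1))   # best time, in seconds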

See Also

Library Reference and Python in a Nutshell docs for built-ins list and dict; Library Reference docs on modules collections (Python 2.4 only) and timeit. Python in a Nutshell’s chapter on optimization; Donald Knuth, The Art Of Computer Programming (exercise 14, section 2.2.1).

18.7. Caching Objects with a FIFO Pruning Strategy

Credit: David M. Wilson, Raymond Hettinger

Problem

You need to build a mapping to be used as a cache, holding up to a fixed number of previous entries and automatically discarding older entries.

Solution

A mapping can implement a relatively small number of core methods and rely on UserDict.DictMixin to provide all the other methods that make up the full official mapping interface. Here is a mapping class for FIFO caching, implemented with this “let DictMixin do it” strategy:

import UserDict
class FifoCache(object, UserDict.DictMixin):
    ''' A mapping that remembers the last `num_entries' items that were set '''
    def __init__(self, num_entries, dct=( )):
        self.num_entries = num_entries
        self.dct = dict(dct)
        self.lst = [  ]
    def __repr__(self):
        return '%s(%r,%r)' % (
            self.__class__.__name__, self.num_entries, self.dct)
    def copy(self):
        return self.__class__(self.num_entries, self.dct)
    def keys(self):
        return list(self.lst)
    def __getitem__(self, key):
        return self.dct[key]
    def __setitem__(self, key, value):
        dct = self.dct
        lst = self.lst
        if key in dct:
            lst.remove(key)
        dct[key] = value
        lst.append(key)
        if len(lst) > self.num_entries:
            del dct[lst.pop(0)]
    def __delitem__(self, key):
        self.dct.pop(key)
        self.lst.remove(key)
    # a method explicitly defined only as an optimization
    def __contains__(self, item):
        return item in self.dct
    has_key = __contains__

Discussion

Here is a typical example of usage for this FifoCache class:

if __name__ == '__main__':
    f = FifoCache(num_entries=3)
    f["fly"] = "foo"
    f["moo"] = "two"
    f["bar"] = "baz"
    f["dave"] = "wilson"
    f["age"] = 20
    print f.keys( )
    # emits ['bar', 'dave', 'age']

For any case where you might use a dictionary object to cache expensive lookups, the FifoCache class shown in this recipe might be a safer alternative for use in long-running applications, whose caches might otherwise consume all system memory if left unchecked.

Thanks to UserDict.DictMixin, class FifoCache exhibits a full dictionary (i.e., mapping) interface: you can substitute an instance of FifoCache wherever you’re using a dictionary (as long as you do want entries that were set “a long time ago” to drop out automatically to make space for newer ones).

In Python 2.4, you can get a faster version of FifoCache by setting self.lst to be an instance of collections.deque rather than a list, and using self.lst.popleft() where this recipe’s solution uses self.lst.pop(0). Since the deque type does not have a remove method, you have to implement that with a little auxiliary function:

def remove_from_deque(d, x):
    for i, v in enumerate(d):
        if v == x:
            del d[i]
            return
    raise ValueError, '%r not in %r' % (x, d)

and use remove_from_deque(self.lst, key) where this recipe’s Solution uses self.lst.remove(key). While, as always, you should measure how useful this optimization is in the context of your specific application, it’s likely to be helpful when num_entries is high, since self.lst.pop( 0 ) on a list self.lst is O ( n ), while self.lst.popleft( ) on a deque self.lst is O ( 1 ). (remove_from_deque, like list.remove, is unfortunately and unavoidably O ( n ).)
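Putting these pieces together, a Python 2.4 variant might look like the following sketch (FifoCacheDeque is not part of the original recipe, just one way to apply the optimization described above):

import collections
class FifoCacheDeque(FifoCache):
    def __init__(self, num_entries, dct=()):
        FifoCache.__init__(self, num_entries, dct)
        self.lst = collections.deque(self.lst)
    def __setitem__(self, key, value):
        if key in self.dct:
            remove_from_deque(self.lst, key)
        self.dct[key] = value
        self.lst.append(key)
        if len(self.lst) > self.num_entries:
            del self.dct[self.lst.popleft()]
    def __delitem__(self, key):
        self.dct.pop(key)
        remove_from_deque(self.lst, key)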

FIFO is not the ideal policy for a cache’s decision about what should “fall off”; a better one would be LRU (Least Recently Used). You can tweak this class’ policy into LRU by subclassing and overriding:

class LRUCache(FifoCache):
    def __getitem__(self, key):
        if key in self.dct:
            self.lst.remove(key)
        else:
            raise KeyError
        self.lst.append(key)
        return self.dct[key]

This variant does ensure the use of the LRU policy without much extra code. Unfortunately, it makes every read access quite costly (O ( n ), where n is the number of entries in the cache at that time), due to the need for the self.lst.remove call. Therefore, this recipe’s official “Solution” uses the simpler implementation, even though FIFO is notoriously suboptimal as a cache replacement strategy.

See Also

Library Reference and Python in a Nutshell docs for module UserDict; Recipe 5.14 also uses UserDict.DictMixin to round up a mapping interface while coding a minimum of boilerplate.

18.8. Implementing a Bag (Multiset) Collection Type

Credit: Raymond Hettinger, Alex Martelli, Matt R

Problem

You need a set-like collection type that lets each element be in the set a number of times. In other words, you need a collection type of the kind that is known as multiset in C++ and SQL, and bag in Smalltalk, Objective C, and Haskell’s Edison module.

Solution

We can implement bag as a class. We could restrict the implementation to language constructs that are present in Python 2.3 or are easy to emulate; however, such restrictions would impose substantial inefficiencies or complications in comparison to a pure Python 2.4 implementation. So, here is a Python 2.4 implementation, with no attempt to support Python 2.3:

from operator import itemgetter
from heapq import nlargest
class bag(object):
    def __init__(self, iterable=( )):
        # start empty, then add the `iterable' if any
        self._data = {  }
        self.update(iterable)
    def update(self, iterable):
        # update from an element->count mapping, or from any iterable
        if isinstance(iterable, dict):
            for elem, n in iterable.iteritems( ):
                self[elem] += n
        else:
            for elem in iterable:
                self[elem] += 1 
    def __contains__(self, elem):
        # delegate membership test
        return elem in self._data
    def __getitem__(self, elem):
        # default all missing items to a count of 0
        return self._data.get(elem, 0)
    def __setitem__(self, elem, n):
        # setting an item to a count of 0 removes the item
        self._data[elem] = n
        if n == 0:
            del self._data[elem]
    def __delitem__(self, elem):
        # delegate to __setitem__ to allow deleting missing items
        self[elem] = 0
    def __len__(self):
        # length is computed on-the-fly
        return sum(self._data.itervalues( ))
    def __nonzero__(self):
        # avoid truth tests using __len__, as it's relatively slow
        return bool(self._data)
    def __eq__(self, other):
        # a bag can only equal another bag
        if not isinstance(other, bag):
            return False
        return self._data == other._data
    def __ne__(self, other):
        # a bag always differs from any non-bag
        return not (self == other)
    def __hash__(self):
        # a bag can't be a dict key nor an element in a set
        raise TypeError
    def __repr__(self):
        # typical string-representation
        return '%s(%r)' % (self.__class__.__name__, self._data)
    def copy(self):
        # make and return a shallow copy
        return self.__class__(self._data)
    __copy__ = copy # For the copy module
    def clear(self):
        # remove all items
        self._data.clear( )
    def __iter__(self):
        # yield each element the # of times given by its count
        for elem, cnt in self._data.iteritems( ):
            for i in xrange(cnt):
                yield elem
    def iterunique(self):
        # yield each element only once
        return self._data.iterkeys( )
    def itercounts(self):
        # yield element-count pairs
        return self._data.iteritems( )     
    def mostcommon(self, n=None):
        # return the n (default: all) most common elements, each as an
        # element-count pair, as a list sorted by descending counts
        if n is None:
            return sorted(self.itercounts( ), key=itemgetter(1), reverse=True)
        it = enumerate(self.itercounts( ))
        nl = nlargest(n, ((cnt, i, elem) for (i, (elem, cnt)) in it))
        return [(elem, cnt) for cnt, i, elem in nl]

Discussion

Python offers several useful container classes, including built-in tuples, lists and dicts, sets (in Python 2.4, sets are built-in; in Python 2.3, they’re in module sets)—which, unlike bags, can be seen as “holding only one instance” of each of their elements—and double-ended queues, deques (in Python 2.4 only, in module collections). This abundance of container classes doesn’t mean there is no use for yet more. The bag (i.e., multiset) presented in this recipe is widely useful, since counting the number of occurrences of different objects is a frequent task, useful in many applications. Rather than coding a bag each time you need such a container (generally based on a dictionary mapping items to counts), it’s better to design and code it once, put it among one’s utilities, and lobby for it to become part of the standard library for a future Python, such as 2.5 (which can be expected sometime in 2006 and will focus on enhancements to the standard library rather than to the core language).

The API offered by the bag class presented in this recipe is largely based on indexing, due to the strong analogy between a bag and a mapping of items to counts. For example:

>>> b = bag('banana')
>>> b['a']
3
>>> b['a'] += 1
>>> b['a']
4
>>> del b['a']          # removes all 'a's from the bag
>>> b['a']
0

Items that are not in the bag can also be used as indices, giving a value (i.e., count) of 0; a lot of bag’s convenience comes from this default. A bag also offers several ways to iterate on it (by unique elements; by elements, each repeated as many times as its count; by ( element, count ) pairs); and also supplies a handy method mostcommon to return ( element, count ) pairs sorted by descending count (all such pairs, or just the top n). An example use of mostcommon:

>>> bag(word for line in open('somefile.txt')
...     for word in line.split( )).mostcommon(5)
[('to', 83), ('for', 71), ('the', 61), ('of', 53), ('and', 52)]

All design choices are tradeoffs. For some applications, it might be more convenient to have bag’s API closer to set’s rather than to dict’s, with an add method, and binary operators, for example, to join two bags returning a new one (as list does with operator + and set does with the “or”, vertical-bar operator |). In most cases, this would be overkill. After all, “a designer knows he has achieved perfection, not when there is nothing left to add, but when there is nothing left to take away” (Antoine de Saint-Exupéry). So, for example, to join two bags, getting a new one, without altering either input bag, code a little function using roughly the same update-based approach you would use with dicts, as follows:

def bagjoin(b1, b2):
    b = bag(b1)
    b.update(b2)
    return b

Just as would be the case for an analogous function joining dicts, this works, not only when b1 and b2 are bags, but also when they are other kinds of objects that can be passed to bag and bag.update—objects such as arbitrary iterables or mappings (generally dictionaries) from items to counts. Such polymorphism comes at negligible cost, and it’s well worth preserving.

Although the crucial design choices in this recipe are those about bag’s API, some implementation choices must be made as well. In this recipe’s code, implementation choices are oriented towards simplicity. In particular, there is no attempt to allow this code to run on Python 2.3. This recipe is optimized for Python 2.4 because it is Python’s current release and is likely to be used soon in lieu of Python 2.3, particularly among Pythonistas who are sensitive to performance issues, given the amount of highly successful effort that was devoted to optimizing version 2.4’s performance. If Python 2.3 support was deemed necessary, it would be best implemented separately, rather than hobbling the primary 2.4 implementation with inefficiencies or complications.

18.9. Simulating the Ternary Operator in Python

Credit: Jürgen Hermann, Alex Martelli, Oliver Steele, Chris Perkins, Brent Burley, Lloyd Goldwasser, Doug Hudgeon

Problem

You want to express in Python the equivalent of C’s so-called ternary operator ?: (as in condition ? iftrue : iffalse).

Solution

There are many ways to skin a ternary operator. An explicit if/else is most Pythonic, although slightly verbose:

for i in range(1, 3):
    if i == 1:
        plural = ''
    else:
        plural = 's'
    print "The loop ran %d time%s" % (i, plural)

Indexing is more compact, and therefore useful, if using the iftrue and iffalse expressions has no side effects:

for i in range(1, 3):
    print "The loop ran %d time%s" % (i, ('', 's')[i != 1])

For the specific case of plurals, there’s also a neat variant using slicing:

for i in range(1, 3):
    print "The loop ran %d time%s" % (i, "s"[i==1:])

Short-circuited logical expressions can deal correctly with side effects:

for i in range(1, 3):
    print "The loop ran %d time%s" % (i, i != 1 and 's' or '')

The output of each of these loops is:

The loop ran 1 time
The loop ran 2 times

However, beware: the short-circuit version (which is necessary when either or both of iftrue and iffalse have side effects) fails if “turned around”:

for i in range(1, 3):
    print "The loop ran %d time%s" % (i, i == 1 and '' or 's')

Since '' evaluates as false, the would-be-ternary expression always evaluates to 's', so that this latest snippet outputs:

The loop ran 1 times
The loop ran 2 times

Therefore, in general, when iftrue and iffalse are unknown at coding time (and therefore either could have side effects or be false), we need more cumbersome constructs, such as:

for i in range(1, 3):
    print "The loop ran %d time%s" % (i, (i == 1 and [''] or ['s'])[0])

or:

for i in range(1, 3):
    print "The loop ran %d time%s" % (i, (lambda:'', lambda:'s')[i!=1]( ))

or even weirder variations:

for i in range(1, 3):
    print "The loop ran %d time%s" % (i, [i==1 and '', i!=1 and 's'][i!=1])
for i in range(1, 3):
    print "The loop ran %d time%s" % (i,
           (i==1 and (lambda:'') or (lambda:'s'))( ))

As you can see, good old if/else is starting to look pretty good when compared to these terse and complicated approaches.

And now for something completely different (for plurals only, again):

for i in range(1, 3):
    print "The loop ran %d time%s" % (i, 's'*(i!=1))

Discussion

Programmers coming to Python from C, C++, or Perl sometimes miss the so-called ternary operator (?:). The ternary operator is most often used for avoiding a few lines of code and a temporary variable for simple decisions, such as printing the plural form of words after a counter, as in this recipe’s examples. In most cases, Python’s preference for making things clear and explicit at the cost of some conciseness is an acceptable tradeoff, but one can sympathize with the withdrawal symptoms of ternary-operator addicts.

Nevertheless, 99.44 times out of 100, you’re best off using a plain if/else statement. If you want your if/else to fit in an expression (so you can use that expression inside a lambda form, list comprehension, or generator expression), put it inside a named local function and use that function in the expression. For the remaining 56 cases out of 10,000, the idioms in this recipe might be useful. A typical case would be when you’re transliterating from another language into Python and need to keep program structure as close as possible to the original, as mentioned in Recipe 4.19.
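For instance, here is a minimal sketch of that advice applied to this recipe’s plural-forming example, with the if/else tucked away in a small named function so that the formatting can live in a list comprehension:

def plural(i):
    if i == 1:
        return ''
    return 's'
messages = ["The loop ran %d time%s" % (i, plural(i)) for i in range(1, 3)]
for msg in messages:
    print msg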

There are several ways to get the ternary operator effect in Python, and this recipe presents a fair selection of the wide range of possibilities. Indexing and slicing are nice but don’t apply to cases in which either or both of the iftrue and iffalse expressions may have side effects. If side effects are an issue, the short-circuiting nature of and/or can help, but this approach may run into trouble when iftrue and iffalse might be Python values that evaluate as false. To resolve both the side-effect and the might-be-false issues, two variants in this recipe mix indexing and function calling or a lambda form, and two more use even weirder mixtures of lambda and indexing and short circuiting.

If you’re not worried about side effects, you could overload slicing syntax to express a ternary operator:

class cond(object):
    def __getitem__(self, sl):
        if sl.start: return sl.stop
        else: return sl.step
cond = cond( )

After executing this snippet, you could code the example presented in this recipe’s Solution as:

for i in range(1, 3):
    print "The loop ran %d time%s" % (i, cond[i==1:'':'s'])

When you slice this cond object, iftrue and iffalse (masquerading as the stop and step attributes of the slice object) are both evaluated in any case, which is the reason this syntax is no use if you must worry about side effects. If you must have syntax sugar, using nullary lambdas may be the least of evils:

def cond(test, when_true, when_false):
    if test:
        return when_true( )
    else:
        return when_false( )

to be used as, for example, print cond(x%2==0, lambda:x//2, lambda:3*x+1).

Note that the lack of a ternary operator in Python is not due to an oversight: it’s a deliberate design decision, made after much debate pro and con. Therefore, you can be sure that Python will never “grow” a ternary operator. For many details about the various proposed syntax forms for a ternary operation, you can see the rejected PEP 308 at http://www.python.org/peps/pep-0308.html.

See Also

Recipe 4.19.

18.10. Computing Prime Numbers

Credit: David Eppstein, Tim Peters, Alex Martelli, Wim Stolker, Kazuo Moriwaka, Hallvard Furuseth, Pierre Denis, Tobias Klausmann, David Lees, Raymond Hettinger

Problem

You need to compute an unbounded sequence of all primes, or the list of all primes that are less than a certain threshold.

Solution

To compute an unbounded sequence, a generator is the natural Pythonic approach, and the Sieve of Eratosthenes, using a dictionary as the supporting data structure, is the natural algorithm to use:

import itertools
def eratosthenes( ):
    '''Yields the sequence of prime numbers via the Sieve of Eratosthenes.'''
    D = {  }  # map each composite integer to its first-found prime factor
    for q in itertools.count(2):     # q gets 2, 3, 4, 5, ... ad infinitum
        p = D.pop(q, None)
        if p is None:
            # q not a key in D, so q is prime, therefore, yield it
            yield q
            # mark q squared as not-prime (with q as first-found prime factor)
            D[q*q] = q
        else:
            # let x <- smallest (N*p)+q which wasn't yet known to be composite
            # we just learned x is composite, with p first-found prime factor,
            # since p is the first-found prime factor of q -- find and mark it
            x = p + q
            while x in D:
                x += p
            D[x] = p
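
Since eratosthenes is an unbounded generator, the natural way to consume it is through itertools; for example (this usage snippet is not part of the original Solution):

import itertools
# the first ten primes
print list(itertools.islice(eratosthenes( ), 10))
# emits: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
# the first prime greater than 100
print itertools.dropwhile(lambda p: p <= 100, eratosthenes( )).next( )
# emits: 101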

Discussion

To compute all primes up to a predefined threshold, rather than an unbounded sequence, it’s reasonable to wonder if it’s possible to use a faster way than good old Eratosthenes, even in the smart variant shown as the “Solution”. Here is a function that uses a few speed-favoring tricks, such as a hard-coded tuple of the first few primes:

def primes_less_than(N):
    # make `primes' a list of known primes < N
    primes = [x for x in (2, 3, 5, 7, 11, 13) if x < N]
    if N <= 17: return primes
    # candidate primes are all odd numbers less than N and over 15,
    # not divisible by the first few known primes, in descending order
    candidates = [x for x in xrange((N-2)|1, 15, -2)
                  if x % 3 and x % 5 and x % 7 and x % 11 and x % 13]
    # make `top' the biggest number that we must check for compositeness
    top = int(N ** 0.5)
    while (top+1)*(top+1) <= N:
        top += 1
    # main loop, weeding out non-primes among the remaining candidates
    while True:
        # get the smallest candidate: it must be a prime
        p = candidates.pop( )
        primes.append(p)
        if p > top:
            break
        # remove all candidates which are divisible by the newfound prime
        candidates = filter(p.__rmod__, candidates)
    # all remaining candidates are prime, add them (in ascending order)
    candidates.reverse( )
    primes.extend(candidates)
    return primes

On a typical small task such as looping over all primes up to 8,192, eratosthenes (on an oldish 1.2 GHz Athlon PC, with Python 2.4) takes 22 milliseconds, while primes_less_than takes 9.7; so, the slight trickery and limitations of primes_less_than can pay for themselves quite handsomely if generating such primes is a bottleneck in your program. Be aware, however, that eratosthenes scales better: its disadvantage shrinks as the threshold grows. If you need all primes up to 199,999, eratosthenes delivers them in 0.88 seconds, while primes_less_than takes 0.65—a much narrower gap than at 8,192.
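
If you want to repeat these measurements on your own machine (the absolute numbers will of course differ), a rough harness along the following lines can be used; primes_up_to is just a throwaway helper, and the setup string assumes the recipe’s functions live in the script being run:

import timeit, itertools
def primes_up_to(N):
    # consume the unbounded generator only up to the threshold N
    return list(itertools.takewhile(lambda p: p < N, eratosthenes( )))
for stmt in 'primes_up_to(8192)', 'primes_less_than(8192)':
    t = timeit.Timer(stmt, 'from __main__ import primes_up_to, primes_less_than')
    print '%s: %.1f milliseconds' % (stmt, min(t.repeat(3, 10)) * 1000.0 / 10)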

Since primes_less_than’s little speed-up tricks can help, it’s natural to wonder whether a perhaps simpler trick can be retrofitted into eratosthenes as well. For example, we might simply avoid wasting work on a lot of even numbers, concentrating on odd numbers only, beyond the initial 2. In other words:

def erat2( ):
    D = {  }
    yield 2
    for q in itertools.islice(itertools.count(3), 0, None, 2):
        p = D.pop(q, None)
        if p is None:
            D[q*q] = q
            yield q
        else:
            x = p + q
            while x in D or not (x&1):
                x += p
            D[x] = p

And indeed, erat2 takes 16 milliseconds, versus eratosthenes' 22, to get primes up to 8,192; 0.49 seconds, versus eratosthenes' 0.88, to get primes up to 199,999. In other words, erat2 scales at least as well as eratosthenes and is consistently faster (by roughly 25% on the smaller task, and by even more on the larger one). Incidentally, if you’re wondering whether it might be even faster to program at a slightly lower level, with q = 3 to start, a while True as the loop header, and a q += 2 at the end of the loop, don’t worry—the slightly higher-level approach using itertools' count and islice functions is repeatably about 4% faster. Other languages may impose a performance penalty for programming at a higher level of abstraction; Python often rewards you for it.

You might keep pushing the same idea yet further, avoiding multiples of 3 as well as even numbers, and so on. However, it would be an exercise in diminishing returns: greater and greater complication for smaller and smaller gain. It’s better to quit while we’re ahead!

If you’re into one-liners, you might want to study the following:

def primes_oneliner(N):
    aux = {  }
    return [aux.setdefault(p, p) for p in range(2, N)
            if 0 not in [p%d for d in aux if p>=d+d]]

Be aware that one-liners, even clever ones, are generally anything but speed demons! primes_oneliner takes 2.9 seconds to complete the same small task (computing primes up to 8,192) which eratosthenes does in 22 milliseconds, and primes_less_than in 9.7—so, you’re slowing things down by 130 to 300 times just for the fun of using a clever, opaque one-liner, which is clearly not a sensible tradeoff. Clever one-liners can be instructive but should almost never be used in production code, not just because they’re terse and make maintenance harder than straightforward coding (which is by far the main reason), but also because of the speed penalties they may entail.

While prime numbers, and number theory more generally, used to be considered purely theoretical problems, nowadays they have plenty of practical applications, starting with cryptography.

See Also

To explore both number theory and its applications, the best book is probably Kenneth Rosen, Elementary Number Theory and Its Applications (Addison-Wesley); http://www.utm.edu/research/primes/ for more information about prime numbers.

18.11. Formatting Integers as Binary Strings

Credit: Antti Kaihola, Scott David Daniels, W.J. van der Laan

Problem

You need to display non-negative integers in binary form—that is, you need to turn them into strings made up of the characters '0' and '1'.

Solution

The best approach, assuming you must perform this task on several numbers in the course of one run of your application, is to first prepare an auxiliary table, for example, with an auxiliary function:

def _bytes_to_bits( ):
    # prepare and return a list of the first 256 int as binary strings
    # start with table of the right length, filled with place-holders
    the_table = 256*[None]
    # we'll repeatedly need to loop on [7, 6, ..., 1, 0], make it once
    bits_per_byte = range(7, -1, -1)
    for n in xrange(256):
        # prepare the nth string: start with a list of 8 place-holders
        bits = 8*[None]
        # work on a copy of n: the shifts below would otherwise destroy
        # the index we need for the final table assignment
        x = n
        for i in bits_per_byte:
            # get the i-th bit as a string, shift x right for next bit
            bits[i] = '01'[x&1]
            x >>= 1
        # join up the 8-character string of 0's and 1's into the table
        the_table[n] = ''.join(bits)
    return the_table
# rebind function's name to the table, function not needed any more
_bytes_to_bits = _bytes_to_bits( )

and then use the auxiliary table to make a fast conversion function that works 8 bits at a time:

def binary(n):
    # check precondition: only non-negative numbers supported
    assert n>=0
    # prepare the list of substrings 8 bits at a time
    bits = [  ]
    while n:
        bits.append(_bytes_to_bits[n&255])
        n >>= 8
    # we need it left-to-right, thus the reverse
    bits.reverse( )
    # strip leading '0's, but ensure at least one is left!
    return ''.join(bits).lstrip('0') or '0'

If you need to perform this task only a very small number of times in the course of one run of your application, you might instead choose to perform the conversion directly, bit by bit—it’s easy, although somewhat slow. Just use the same approach as binary, but 1 bit at a time instead of 8 bits at a time:

def binary_slow(n):
    assert n>=0
    bits = [  ]
    while n:
        bits.append('01'[n&1])
        n >>= 1
    bits.reverse( )
    return ''.join(bits) or '0'
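
For example, with the helper table prepared as above:

print binary(0), binary(5), binary(255), binary(256)
# emits: 0 101 11111111 100000000
print binary_slow(10) == binary(10) == '1010'
# emits: True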

Discussion

If you also need to display negative numbers, you can take two different roads. Either do as the built-in functions hex and oct do, and prepend a minus sign to negative numbers:

def bin_with_sign(n):
    if n<0: return '-'+binary(-n)
    else: return binary(n)

or use two’s complement notation, but in that case you need to know how many bits fit in a “word”, because that’s how two’s complement is defined—in terms of fixed-length words:

def bin_twos_complement(n, bits_per_word=16):
    if n<0: n = (1<<bits_per_word) + n     # i.e., add 2**bits_per_word
    return binary(n)

Function binary produces just as many binary digits as each argument needs, avoiding leading '0's (except the single zero digit needed to avoid displaying an empty string when n is 0). If instead you need to produce a fixed number of binary digits, you could ensure that at string level, which is particularly easy with Python 2.4:

def bin_fixed(n, bits_per_word=16):
    return bin_twos_complement(n, bits_per_word).rjust(bits_per_word, '0')

but it is quite feasible with Python 2.3 as well:

def bin_fixed_23(n, bits_per_word=16):
    result = bin_twos_complement(n, bits_per_word)
    return (('0'*bits_per_word)+result)[-bits_per_word:]
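
For example, with the default 16-bit word size (bin_fixed requires Python 2.4 because of the two-argument rjust):

print bin_with_sign(-10)          # emits: -1010
print bin_twos_complement(-1)     # emits: 1111111111111111
print bin_fixed(10)               # emits: 0000000000001010
print bin_fixed(-10)              # emits: 1111111111110110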

Alternatively, you could generalize some version of the auxiliary _bytes_to_bits function used in the “Solution”, which is indeed oriented to producing fixed-length results. However, using the variable-length version, and a little string manipulation on top of it for the occasional need for fixed-length results, should be enough.

See Also

Library Reference and Python in a Nutshell docs for built-ins oct and hex; Recipe 18.12 for displaying integers in an arbitrary base.

18.12. Formatting Integers as Strings in Arbitrary Bases

Credit: Moon aka Sun, Raymond Hettinger

Problem

You need to display non-negative integers in arbitrary bases—that is, you need to turn them into strings made up of “digit” characters (which may include letters for bases that are > 10).

Solution

A function is clearly the right way to package the “Solution” to this task:

import string
def format(number, radix, digits=string.digits+string.ascii_lowercase):
   """ format the given integer `number' in the given `radix' using the given
       `digits' (default: digits and lowercase ascii letters) """
   if not 2 <= radix <= len(digits):
      raise ValueError, "radix must be in 2..%r, not %r" % (len(digits), radix)
   # build result as a list of "digit"s in natural order (least-significant digit
   # leftmost), at the end flip it around and join it up into a single string
   result = [  ]
   addon = result.append                    # extract bound-method once
   # compute 'sign' (empty for number>=0) and ensure number >= 0 thereafter
   sign = ''
   if number < 0:
      number = -number
      sign = '-'
   elif number == 0:
      sign = '0'
   _divmod = divmod                         # access to locals is faster
   while number:
      # like: rdigit = number % radix; number //= radix
      number, rdigit = _divmod(number, radix)
      # append appropriate string for the digit we just found
      addon(digits[rdigit])
   # append sign (if any), flip things around, and join up into a string
   addon(sign)
   result.reverse( )
   return ''.join(result)

Discussion

Here is a simple usage example, with the usual guard to let us append the example to the same module where we define function format. The usage example runs when the module is run as a main script but not when the module is imported:

if __name__ == '__main__':
   as_str = 'qwertyuioplkjhgfdsazxcvbnm0987654321'
   as_num = 79495849566202193863718934176854772085778985434624775545L
   num = int( as_str, 36 )
   assert num == as_num
   res = format( num, 36 )
   assert res == as_str

This usage example is designed to be totally quiet when everything works fine, emitting messages only in case of problems.

The code in this recipe is designed with careful attention to both generality and performance. The string of digits used by default is made up of all decimal digits followed by lowercase ASCII letters, allowing a radix of up to 36; however, you can pass any sequence of strings (rather than just a string, to be used as a sequence of characters), for example to support even larger bases. Performance is vastly enhanced, with respect to a naive approach to coding, by a few precautions taken in the code—in decreasing order of importance:

  1. Building the result as a list and then using ''.join to create a string containing all the list items. (The alternative of adding each item to a string, one at a time, would be much slower than the ''.join approach.)

  2. Building the result in natural order (least-significant digit leftmost) and flipping it around at the end. Inserting each digit at the front as it gets computed would be slow.

  3. Extracting the bound method result.append into a local variable.

  4. Giving a local name _divmod to the divmod built-in.

Items 3 and 4 speed up lookups that would otherwise exact a small extra price each time through the loop, because lookup of local variables is measurably faster than lookup of built-ins and quite a bit faster than compound-name lookups such as result.append.

Here is an example of how you could use format with “digits” that are not single characters, but rather longer strings:

digs = [ d+'-' for d in
         'zero one two three four five six seven eight nine'.split( ) ]
print format(315, 10, digs).rstrip('-')
# emits:three-one-five
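
Since digits may be any sequence of strings, you can also push the radix beyond 36. For instance (this example is not part of the original recipe), a base-62 alphabet made of digits, lowercase, and uppercase letters works directly:

base62_digits = string.digits + string.ascii_lowercase + string.ascii_uppercase
print format(255, 16)
# emits: ff
print format(123456789, 62, base62_digits)
# emits: 8m0Kx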

See Also

Library Reference and Python in a Nutshell docs for built-ins oct and hex; Recipe 18.11 for displaying integers specifically in binary.

18.13. Converting Numbers to Rationals via Farey Fractions

Credit: Scott David Daniels

Problem

You have a number v (of almost any type) and need to find a rational number (in reduced form) that is as close to v as possible but with a denominator no larger than a prescribed value.

Solution

Farey fractions, whose crucial properties were studied by Augustin Louis Cauchy, are an excellent way to find rational approximations of floating-point values:

def farey(v, lim):
    """ No error checking on args.  lim = maximum denominator.
        Results are (numerator, denominator); (1, 0) is "infinity".
    """
    if v < 0:
        n, d = farey(-v, lim)
        return -n, d
    z = lim - lim    # Get a "0 of the right type" for denominator
    lower, upper = (z, z+1), (z+1, z)
    while True:
        mediant = (lower[0] + upper[0]), (lower[1] + upper[1])
        if v * mediant[1] > mediant[0]:
            if lim < mediant[1]:
                return upper
            lower = mediant
        elif v * mediant[1] == mediant[0]:
            if lim >= mediant[1]:
                return mediant
            if lower[1] < upper[1]:
                return lower
            return upper
        else:
            if lim < mediant[1]:
                return lower
            upper = mediant

For example:

import math
print farey(math.pi, 100)
# emits:(22, 7)

Discussion

The rationals resulting from the algorithm shown in this recipe are in reduced form (meaning that numerator and denominator are mutually prime), but the proof, which was given by Cauchy, is rather subtle (see http://www.cut-the-knot.com/blue/Farey.html).
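
The reduced-form property is also easy to spot-check empirically; here is a throwaway test (not part of the recipe) using Euclid's algorithm:

import random
def gcd(a, b):
    # Euclid's algorithm: gcd of two non-negative integers
    while b:
        a, b = b, a % b
    return a
for i in xrange(1000):
    n, d = farey(random.random( ), 1000)
    assert gcd(n, d) == 1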

You can use farey to compute odds from a probability, such as:

probability = 0.333
n, d = farey(probability, 100)
print "Odds are %d : %d" % (n, d-n)
# emits:Odds are 1 : 2

This recipe’s algorithm is ideally suited for reimplementation in a lower-level language (e.g., C, or assembly, or, maybe best, Pyrex) if you use it heavily. Since the code uses only multiplication and addition, it can play optimally to hardware strengths.

If you are using this recipe in an environment where you call it with a lot of values near 0.0, 1.0, or 0.5 (or other simple fractions), you may find that the algorithm’s convergence is too slow. You can improve convergence in a continued fraction style, by appending to the first if in the farey function:

if v < 0:...
elif v < 0.5:
    n, d = farey((v-v+1)/v, lim)     # lim is wrong; decide what you want
    return d, n
elif v > 1:
    intpart = int(v)                 # whole part of v
    n, d = farey(v - intpart, lim)
    return n + intpart*d, d
...

James Farey was an English geologist and surveyor who wrote a letter to the Journal of Science in 1816. In that letter he observed that, while reading a privately published list of the decimal equivalents of fractions, he had noticed an interesting fact. Consider the set of all the fractions with values between 0 and 1, reduced to the lowest terms, with denominators not exceeding some integer N. Arrange the set in order of magnitude to get a sequence. For example, for N equal to 5, the Farey sequence is:

0/1, 1/5, 1/4, 1/3, 2/5, 1/2, 3/5, 2/3, 3/4, 4/5, 1/1

For any three consecutive fractions in this sequence (e.g., A/B, C/D, E/F), the middle fraction (C/D), called the mediant, is equal to the ratio (A + E)/(B + F). I enjoy envisioning Mr. Farey sitting up late on a rainy English night, reading tables of decimal expansions of fractions by an oil lamp. Calculation has come a long way since his day, and I’m pleased to be able to benefit from his work.

See Also

Library Reference and Python in a Nutshell docs for built-in types int and long; http://www.cut-the-knot.org/blue/Farey.shtml for more information about the Farey Series.

18.14. Doing Arithmetic with Error Propagation

Credit: Mario Hilgemeier

Problem

You have numbers coming from measurements affected by known percentage uncertainties, and you want to perform arithmetic on these numbers while tracking the uncertainty in the results.

Solution

The simplest approach is to code a class that implements arithmetic operators applying the classical physicists’ error-propagation formulas:

import math
class Measurement(object):
    ''' models a measurement with % uncertainty, provides arithmetic '''
    def __init__(self, val, perc):
        self.val = val                            # central value
        self.perc = perc                          # % uncertainty
        self.abs = self.val * self.perc / 100.0   # absolute error
    def __repr__(self):
        return "Measurement(%r, %r)" % (self.val, self.perc)
    def __str__(self):
        return "%g+-%g%%" % (self.val, self.perc)
    def _addition_result(self, result, other_abs):
        new_perc = 100.0 * (math.hypot(self.abs, other_abs) / result)
        return Measurement(result, new_perc)
    def __add__(self, other):
        result = self.val + other.val
        return self._addition_result(result, other.abs)
    def __sub__(self, other):
        result = self.val - other.val
        return self._addition_result(result, other.abs)
    def _multiplication_result(self, result, other_perc):
        new_perc = math.hypot(self.perc, other_perc)
        return Measurement(result, new_perc)
    def __mul__(self, other):
        result = self.val * other.val
        return self._multiplication_result(result, other.perc)
    def __div__(self, other):
        result = self.val / other.val
        return self._multiplication_result(result, other.perc)

Discussion

Here is a typical example of use for this Measurement class:

m1 = Measurement(100.0, 5.5)   # measured value of 100.0 with 5.5% error
m2 = Measurement(50, 2)        # measured value of 50.0 with 2% error
print "m1 = ", m1
print "m2 = ", m2
print "m1 + m2 = ", m1 + m2
print "m1 - m2 = ", m1 - m2
print "m1 * m2 = ", m1 * m2
print "m1 / m2 = ", m1 / m2
print "(m1+m2) * (m1-m2) = ", (m1+m2) * (m1-m2)
print "(m1-m2) / (m1+m2) = ", (m1-m2) / (m1+m2)
# emits:
#m1 =  100+-5.5%
# m2 =  50+-2%
# m1 + m2 =  150+-3.72678%
# m1 - m2 =  50+-11.1803%
# m1 * m2 =  5000+-5.85235%
# m1 / m2 =  2+-5.85235%
# (m1+m2) * (m1-m2) =  7500+-11.7851%
# (m1-m2) / (m1+m2) =  0.333333+-11.7851%

What is commonly known as a percentage error is of course best described as a percentage of uncertainty. For example, when we state that some measured quantity is 100 with an uncertainty of 5.5% (or, equivalently, an absolute uncertainty of 5.5), we mean that we know, with a reasonable level of confidence, that the quantity lies somewhere between 94.5 and 105.5. The error-propagation formulas are somewhat heuristic, rather than rigorous, but they’re quite traditional and have proven over the centuries that they perform acceptably in most large computations in physics or engineering.
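
As a quick sanity check of the numbers printed above (a hand verification, not part of the recipe): for m1 + m2 the absolute errors are 5.5 and 1.0, so the result's percentage error is 100 * hypot(5.5, 1.0) / 150; for m1 * m2 the percentage errors combine directly as hypot(5.5, 2):

import math
print round(100 * math.hypot(5.5, 1.0) / 150, 5)   # emits: 3.72678, as shown for m1 + m2
print round(math.hypot(5.5, 2.0), 5)               # emits: 5.85235, as shown for m1 * m2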

Class Measurement, as shown in this recipe, does not support arithmetic with floats—only arithmetic between instances of Measurement. For those rare occasions when I need, in such computations, numbers that are known “exactly”, it is easiest to input them as “measurements with an error of 0%”. For example, if I have measured some sphere’s radius as 1 meter +- 3%, I can compute the sphere’s volume (with the well-known formula, 4/3 pi times the cube of the radius) as follows:

r = Measurement(1, 3)
v = Measurement(4/3.0*math.pi, 0) * r * r * r
print v
# emits:4.18879+-5.19615%

Avoiding accidental operations with floats that are presumed to be exact, but in fact are not, is quite helpful: this way, when I need to state that a certain number has 0 error, I’m reminded to consider whether things are truly that way. If your applications are different, so that you do need operations between measurements and exact floats all over the place, you can insert, as the first line of every one of the arithmetic special methods, the following statement:

        if isinstance(other, float):
            other = Measurement(other, 0)

Alternatively, you could perform this coercion in a special method named __coerce__, but that approach is considered obsolete and is discouraged in modern Python. If you do perform the coercion in the various arithmetic special methods (__add__, __sub__, etc.), don’t forget to also add the __radd__, etc., equivalents—after all, if you want to be able to code:

    some_measurement * 2.0

you will no doubt also want to be able to code:

    2.0 * some_measurement

and get exactly the same effects. For this purpose, in Python, your class needs to define the various __r... versions of the operator special methods. However, I’m not pursuing this line of reasoning further, because in most cases, you will be best served by not having the implicit ability to do arithmetic in an automatic way between measurements and floats—much like, with Python 2.4’s decimal module, you can’t implicitly do arithmetic in an automatic way between decimal numbers and floats.
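
If you do decide you want mixed arithmetic despite the caveat above, the extra plumbing could look something like the following minimal sketch (not part of the recipe; only addition and multiplication are shown, since subtraction and division are not symmetric and would need separate __rsub__ and __rdiv__ logic):

class MixedMeasurement(Measurement):
    ''' hypothetical variant that treats bare numbers as exact (0% error) '''
    def __add__(self, other):
        if isinstance(other, (int, long, float)):
            other = Measurement(other, 0)
        return Measurement.__add__(self, other)
    __radd__ = __add__             # addition is symmetric
    def __mul__(self, other):
        if isinstance(other, (int, long, float)):
            other = Measurement(other, 0)
        return Measurement.__mul__(self, other)
    __rmul__ = __mul__             # so is multiplication

m = MixedMeasurement(100.0, 5.5)
print m * 2.0, 2.0 * m
# emits: 200+-5.5% 200+-5.5%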

See Also

Library Reference and Python in a Nutshell docs for module math.

18.15. Summing Numbers with Maximal Accuracy

Credit: Yaroslav Bulatov, Connelly Barnes

Problem

Due to the properties of floating point arithmetic, the simplest loop to sum a list of numbers (also embodied in the built-in sum function, as well as similar functions such as add.reduce in add-on modules such as Numeric and numarray) is not maximally accurate. You wish to minimize such numeric inaccuracy.

Solution

A careful and accurate algorithm, using and progressively accumulating partial sums (i.e., partials), can reduce inaccuracy:

import math
def sum_with_partials(arr):
    arr = list(arr)
    size = len(arr)
    iters = int(math.ceil(math.log(size) / math.log(2)))
    step = 1
    for itr in xrange(iters):
        for i in xrange(0, size-step, step+step):
            next_i = i+step
            arr[i] += arr[next_i]
        step += step
    return arr[0]

Discussion

Here is an example of the numeric behavior of this sum_with_partials function compared to that of the built-in sum:

if __name__ == '__main__':
    arr = [0.123456789012345]*10000000
    true_answer = arr[0] * len(arr)
    print '"True" result: %r' % true_answer
    sum_result = sum(arr)
    print '"sum"  result: %r' % sum_result
    sum_p_resu = sum_with_partials(arr)
    print 'sum_p. result: %r' % sum_p_resu
# emits:
#"True" result: 1234567.89012345
# "sum"  result: 1234567.8902233159
# sum_p. result: 1234567.89012345

As you can see, in this case, the built-in sum accumulated a relative error of almost 1e-10 after summing 10 million floats all equal to each other (giving less than 11 digits of accuracy in the result), while sum_with_partials happens to be “perfect” for this case to within machine precision (15 digits of accuracy). Summing just a million copies rather than 10 million lowers sum’s relative error only to a bit more than 1e-11.

If you know that the input argument arr is a list, and you don’t mind destroying that list as part of the summation, you can omit from the body of sum_with_partials the statement:

    arr = list(arr)

and recover a little bit of performance. Without this small enhancement, on one slow laptop, summing a million floats with the built-in sum takes 360 milliseconds, while the more accurate function sum_with_partials shown in this recipe takes 1.8 seconds to perform the same task (a slowdown of five times). In theory, sum_with_partials should be asymptotically faster than built-in sum if you’re doing unbounded-precision arithmetic (e.g., with Python’s built-in longs or other unbounded-precision data types from such add-ons as gmpy, which you can download from http://gmpy.sourceforge.net). To sum a list of n elements with d digits of precision, in unbounded-precision exact arithmetic, sum takes O(n(d + log d)) time, while sum_with_partials takes O(nd). However, I have not been able to observe that effect in empirical measurements.

Most of the time, you probably don’t want to pay the price of slowing down a summation by five times in order to increase your digits of accuracy from 10 or 11 to 15. However, for those occasions in which this tradeoff is right for your applications, and you need to sum millions and millions of floating-point numbers, this recipe might well prove rather useful to you. Another simple way to increase accuracy, particularly when your input numbers are not necessarily all of similar magnitude, is to ensure the small-magnitude ones are summed first. This is particularly easy to code in Python 2.4, although it’s inevitably O(n log n): just sum(sorted(data, key=abs)). Finally, if precision is much more important than speed, consider using decimal.Decimal (which lets you ask for as much precision as you’re willing to wait for and is part of Python 2.4’s standard library). Or you could use gmpy.mpf (which also allows any precision you require, may even be faster, but must be downloaded as part of gmpy from http://gmpy.sourceforge.net.)
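
For instance, here is a rough illustration (not from the recipe, with arbitrary made-up data) of both the magnitude-sorting trick and summation via decimal.Decimal; both require Python 2.4:

import random, decimal
data = [random.uniform(-1, 1) * 10 ** random.randint(-8, 8) for i in xrange(10000)]
print repr(sum(data))                         # plain left-to-right summation
print repr(sum(sorted(data, key=abs)))        # small-magnitude values summed first
decimal.getcontext( ).prec = 40               # as many digits as you care to pay for
print sum((decimal.Decimal(repr(x)) for x in data), decimal.Decimal(0))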

See Also

Recipe 18.16 shows how to use a bounded-precision simulation of floating point to estimate the accuracy of algorithms; ftp://ftp.icsi.berkeley.edu/pub/theory/priest-thesis.ps.Z for Douglas M. Priest’s Ph.D. thesis On Properties of Floating Point Arithmetics: Numerical Stability and the Cost of Accurate Computations, covering this entire field with depth and rigor; gmpy is at http://gmpy.sourceforge.net.

18.16. Simulating Floating Point

Credit: Raymond Hettinger

Problem

You need to simulate floating-point arithmetic in software, either to show to students the details of the various classic problems with floating point (e.g., representation error, loss of precision, failure of distributive, commutative, and associative laws), or to explore the numerical robustness of alternative algorithms.

Solution

We can reproduce every floating-point operation, with explicitly bounded precision, by coding a Python class that overloads all the special methods used in arithmetic operators:

prec = 8             # number of decimal digits (must be under 15)
class F(object):
    def __init__(self, value, full=None):
        self.value = float('%.*e' % (prec-1, value))
        if full is None:
            full = self.value
        self.full = full
    def __str__(self):
        return str(self.value)
    def __repr__(self):
        return "F(%s, %r)" % (self, self.full)
    def error(self):
        ulp = float('1'+('%.4e' % self.value)[-5:]) * 10 ** (1-prec)
        return int(abs(self.value - self.full) / ulp)
    def __coerce__(self, other):
        if not isinstance(other, F):
            return (self, F(other))
        return (self, other)
    def __add__(self, other):
        return F(self.value + other.value, self.full + other.full)
    def __sub__(self, other):
        return F(self.value - other.value, self.full - other.full)
    def __mul__(self, other):
        return F(self.value * other.value, self.full * other.full)
    def __div__(self, other):
        return F(self.value / other.value, self.full / other.full)
    def __neg__(self):
        return F(-self.value, -self.full)
    def __abs__(self):
        return F(abs(self.value), abs(self.full))
    def __pow__(self, other):
        return F(pow(self.value, other.value), pow(self.full, other.full))
    def __cmp__(self, other):
        return cmp(self.value, other.value)

Discussion

The initializer of class F rounds the input value to the given precision (the global constant prec). This rounding produces what is known as representation error because the result is the nearest possible representable value for the specified number of digits. For instance, at three digits of precision, 3.527104 is stored as 3.53, so the representation error is 0.002896.
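
For instance, you can watch this happen directly (this snippet is not part of the recipe; prec is a module-level global, so rebind it before creating instances and restore it afterward, and pass the value twice so that F tracks the representation error, as discussed later in this recipe):

prec = 3
x = F(3.527104, 3.527104)
print x                             # emits: 3.53
print round(x.value - x.full, 6)    # emits: 0.002896  (the representation error)
prec = 8                            # back to the recipe's default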

Since the underlying representation used in this recipe is Python’s ordinary float, the simulation works only up to 15 digits (the typical limit for double-precision floating point). If you need more than 15 digits, you can use Python 2.4’s decimal.Decimal type as the underlying representation. This way, you can get any precision you ask for, although the computation occurs in decimal rather than in binary. Alternatively, to get binary floating point with arbitrarily high precision, use the gmpy Python wrapper for the GMP (Gnu Multiple Precision) multiple-precision arithmetic library, specifically the gmpy.mpf type. One way or another, you need change only the two calls to float in this recipe’s Solution into calls to Python 2.4’s decimal.Decimal, or to gmpy.mpf (requesting the appropriate number of “digits” or bits), to use class F with higher precision than 15 digits. gmpy is at http://gmpy.sourceforge.net.

One key use of this recipe is to show to students the classic failure of associative, commutative, and distributive laws (Knuth, The Art of Computer Programming, vol. 2, pp. 214-15)—for example:

# Show failure of the associative law
u, v, w = F(11111113), F(-11111111), F(7.51111111)
assert (u+v)+w == 9.5111111
assert u+(v+w) == 10
# Show failure of the commutative law for addition
assert u+v+w != v+w+u
# Show failure of the distributive law
u, v, w = F(20000), F(-6), F(6.0000003)
assert u*v == -120000
assert u*w == 120000.01
assert v+w == .0000003
assert (u*v) + (u*w) == .01
assert u * (v+w) == .006

The other main way to use the code in this recipe is to compare the numerical accuracy of different algorithms that compute the same results. For example, we can compare the following three approaches to computing averages:

def avgsum(data):       # Sum all of the elements, then divide
    return sum(data, F(0)) / len(data)
def avgrun(data):       # Make small adjustments to a running mean
    m = data[0]
    k = 1
    for x in data[1:]:
        k += 1
        m += (x-m)/k    # Recurrence formula for mean
    return m
def avgrun_kahan(data): # Adjustment method with Kahan error correction term
    m = data[0]
    k = 1
    dm = 0
    for x in data[1:]:
        k += 1
        adjm = (x-m)/k - dm
        newm = m + adjm
        dm = (newm - m) - adjm
        m = newm
    return m

Here is a way to exercise these approaches and display their errors:

import random
prec = 5
data = [F(random.random( )*10-5) for i in xrange(1000)]
print '%s\t%s\t%s' % ('Computed', 'ULP Error', 'Method')
print '%s\t%s\t%s' % ('--------', '---------', '------')
for f in avgsum, avgrun, avgrun_kahan:
    result = f(data)
    print '%s\t%6d\t\t%s' % (result, result.error( ), f.__name__)
print '\n%r\tbaseline average using full precision' % result.full

Here is typical output from this snippet (the exact numbers in play will be different each time you run it, since what we are summing are random numbers):

Computed        ULP Error        Method
--------        ---------        ------
-0.020086            15                avgsum
-0.020061             9                avgrun
-0.020072             1                avgrun_kahan
-0.020070327734999997        baseline average using full precision

The last example demonstrates how to extract a full-precision floating-point result from an instance of F, by using the full attribute of the instance. This example is helpful for running an algorithm to full precision, as a baseline for seeing the effects of using less precision.

The full-precision result excludes the representation error in the original inputs. For example, with prec = 3 and d = F(3.8761) / F(2.7181), d.full is 1.4264705882352939, the same result as regular division would yield, starting from the nearest representable values, 3.88 / 2.72. This helpful choice isolates accumulated floating-point operation errors from the artifacts of the original data entry. This separation is reasonable because real floating-point systems have to start with representable constants; however, if the original representation error has to be tracked, you can do so by entering the number twice in the call to F—for example, use F(2.7181, 2.7181) rather than F(2.7181).

See Also

Recipe 18.15 for algorithms for accurate sums; gmpy is at http://gmpy.sourceforge.net.

18.17. Computing the Convex Hulls and Diameters of 2D Point Sets

Credit: David Eppstein, Dinu Gherman

Problem

You have a list of 2D points, represented as pairs (x, y), and need to compute the convex hull (i.e., upper and lower chains of vertices) and the diameter (the two points farthest from each other).

Solution

We can easily compute the hulls by the classic Graham’s scan algorithm, with sorting based on coordinates rather than radially. Here is a function to perform this task:

def orientation(p, q, r):
    ''' >0 if p-q-r are clockwise, <0 if counterclockwise, 0 if colinear. '''
    return (q[1]-p[1])*(r[0]-p[0]) - (q[0]-p[0])*(r[1]-p[1])
def hulls(Points):
    ' Graham scan to find upper and lower convex hulls of a set of 2D points '
    U = [  ]
    L = [  ]
    # the natural sort in Python is lexicographical, by coordinate
    Points.sort( )
    for p in Points:
        while len(U) > 1 and orientation(U[-2], U[-1], p) <= 0:
            U.pop( )
        while len(L) > 1 and orientation(L[-2], L[-1], p) >= 0:
            L.pop( )
        U.append(p)
        L.append(p)
    return U, L

Given the hulls, the rotating calipers algorithm provides all pairs of points that are candidates to be the set’s diameter. Here is a function to embody this algorithm:

def rotatingCalipers(Points):
    ''' Given a list of 2d points, finds all ways of sandwiching the points
        between two parallel lines that touch one point each, and yields the
        sequence of pairs of points touched by each pair of lines.  '''
    U, L = hulls(Points)
    i = 0
    j = len(L) - 1
    while i < len(U) - 1 or j > 0:
        yield U[i], L[j]
        # if all the way through one side of hull, advance the other side
        if i == len(U) - 1:
            j -= 1
        elif j == 0:
            i += 1
        # still points left on both lists, compare slopes of next hull edges
        # being careful to avoid divide-by-zero in slope calculation
        elif (U[i+1][1]-U[i][1]) * (L[j][0]-L[j-1][0]) > \
             (L[j][1]-L[j-1][1]) * (U[i+1][0]-U[i][0]):
            i += 1
        else: j -= 1

Given all the candidates, we need only to scan for the max on pairwise point-point distances of candidate pairs of points to get the diameter. Here is a function that performs exactly this task:

def diameter(Points):
    ''' Given a list of 2d points, returns the pair that's farthest apart. '''
    diam, pair = max( [((p[0]-q[0])**2 + (p[1]-q[1])**2, (p,q))
                      for p,q in rotatingCalipers(Points)] )
    return pair
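
For example (this usage snippet is not part of the original recipe; note that diameter sorts the list you pass in, as a side effect of hulls):

points = [(0, 0), (4, 0), (4, 1), (0, 2), (2, 1)]
print diameter(points)
# emits: ((0, 2), (4, 0))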

Discussion

As function hulls shows, we can apply Graham’s scan algorithm without needing an expensive radial sort as a preliminary step: Python’s own built-in sort (which is lexicographical, meaning, in this case, by x coordinate first, and by y coordinate when the x coordinates of two points coincide) is sufficient.

From hulls, we get the upper and lower convex hulls as distinct lists, which, in turn, helps in the rotatingCalipers function: that function can maintain separate indices i and j on the upper and lower hulls, respectively, and still be sure to yield all pairs of sandwich boundary points that are candidates to be the set’s diameter. Given the sequence of candidate pairs, function diameter’s task is quite simple—it boils down to one call to built-in max on a list comprehension (a generator expression would suffice in Python 2.4) that associates the pairwise point distance to each pair of candidate points. We use the squared distance, in fact. There’s no reason to compute a costly square root to get the actual non-squared distance: we’re comparing only distances, and for any non-negative x and y, x < y and sqrt(x) < sqrt(y) always have identical truth values. (In practice, however, using math.hypot(p[0]-q[0], p[1]-q[1]) in the list comprehension gives us just about the same performance.)

The computations in this recipe take care to handle tricky cases, such as pairs of points with the same x coordinate, multiple copies of the same point, colinear triples of points, and slope computations that, if not carefully handled, would produce a division by zero (i.e., again, pairs of points with the same x coordinate). The set of unit tests that carefully probe each of these corner cases is far longer than the code in the recipe itself, which is why it is not included here.

Some of the formulas become a little simpler and more readable when we represent points by complex numbers, rather than as pairs of reals:

def orientation(p, q, r):
    return ((q - p) * (r - p).conjugate( )).imag
...
        # still points left on both lists, compare slopes of next hull edges
        # being careful to avoid divide-by-zero in slope calculation
        elif ((U[i+1] - U[i]) * (L[j] - L[j-1]).conjugate( )).imag > 0:
            i += 1
        else: j -= 1
...
def diameter(Points):
    diam, pair = max([(abs(p-q), (p,q)) for p,q in rotatingCalipers(Points)])
    return pair

If we represent points by complex numbers, of course, we cannot just use Points.sort( ) any more because complex numbers cannot be compared. We need to “pay back” some of the simplification by coding our own sort, such as:

    aux = [ (p.real, p.imag) for p in Points ]
    aux.sort( )
    Points[:] = [ complex(*p) for p in aux ]
    del aux

or equivalently, in Python 2.4:

    Points.sort(key=lambda p: (p.real, p.imag))

Moreover, under the hood, a complex-number-based version is doing more arithmetic: finding the real as well as imaginary components in the first and second formula, and doing an unnecessary square root in the third one. Nevertheless, performance as measured on my machine, despite this extra work, turns out to be slightly better with this latest snippet than with the code in the “Solution”. The reason I’ve not made the complex-numbers approach the “official” one, aside from the complication with sorting, is that you should not require familiarity with complex arithmetic just to understand geometrical computations.

If you’re comfortable with complex numbers, don’t mind the sorting issues, and have to perform many 2D geometrical computations, you should consider representing points as complex numbers and check whether this provides a performance boost, as well as overall simplification to your source code. Among other advantages, representing points as complex numbers lets you use the Numeric package to hold your data, saving much memory and possibly gaining even more performance, when compared to the natural alternative of representing points as (x, y) tuples holding two floats.

See Also

M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf, Computational Geometry: Algorithms and Applications, 2nd ed. (Springer-Verlag).
