6. Object-Oriented Programming

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 6. Object-Oriented Programming

Introduction

Credit: Alex Martelli, author of Python in a Nutshell (O’Reilly)

Object-oriented programming (OOP) is among Python’s greatest strengths. Python’s OOP features continue to improve steadily and gradually, just like Python in general. You could already write better object-oriented programs in Python 1.5.2 (the ancient, long-stable version that was new when I first began to work with Python) than in any other popular language (excluding, of course, Lisp and its variants: I doubt there’s anything you can’t do well in Lisp-like languages, as long as you can stomach parentheses-heavy concrete syntax). For a few years now, since the release of Python 2.2, Python OOP has become substantially better than it was with 1.5.2. I am constantly amazed at the systematic progress Python achieves without sacrificing solidity, stability, and backwards-compatibility.

To get the most out of Python’s great OOP features, you should use them the Python way, rather than trying to mimic C++, Java, Smalltalk, or other languages you may be familiar with. You can do a lot of mimicry, thanks to Python’s power. However, you’ll get better mileage if you invest time and energy in understanding the Python way. Most of the investment is in increasing your understanding of OOP itself: what is OOP, what does it buy you, and which underlying mechanisms can your object-oriented programs use? The rest of the investment is in understanding the specific mechanisms that Python itself offers.

One caveat is in order. For such a high-level language, Python is quite explicit about the OOP mechanisms it uses behind the curtains: they’re exposed and available for your exploration and tinkering. Exploration and understanding are good, but beware the temptation to tinker. In other words, don’t use unnecessary black magic just because you can. Specifically, don’t use black magic in production code. If you can meet your goals with simplicity (and most often, in Python, you can), then keep your code simple. Simplicity pays off in readability, maintainability, and, more often than not, performance, too. To describe something as clever is not considered a compliment in the Python culture.

So what is OOP all about? First of all, it’s about keeping some state (data) and some behavior (code) together in handy packets. “Handy packets” is the key here. Every program has state and behavior—programming paradigms differ only in how you view, organize, and package them. If the packaging is in terms of objects that typically comprise state and behavior, you’re using OOP. Some object-oriented languages force you to use OOP for everything, so you end up with many objects that lack either state or behavior. Python, however, supports multiple paradigms. While everything in Python is an object, you package things as OOP objects only when you want to. Other languages try to force your programming style into a predefined mold for your own good, while Python empowers you to make and express your own design choices.

With OOP, once you have specified how an object is composed, you can instantiate as many objects of that kind as you need. When you don’t want to create multiple objects, consider using other Python constructs, such as modules. In this chapter, you’ll find recipes for Singleton, an object-oriented design pattern that eliminates the multiplicity of instantiation, and Borg, an idiom that makes multiple instances share state. But if you want only one instance, in Python it’s often best to use a module, not an OOP object.

To describe how an object is made, use the class statement:

class SomeName(object):
    """ You usually define data and code here (in the class body). """

SomeName is a class object. It’s a first-class object, like every Python object, meaning that you can put it in lists and dictionaries, pass it as an argument to functions, and so on. You don’t have to include the (object) part in the class header clause—class SomeName: by itself is also valid Python syntax—but normally you should include that part, as we’ll see later.

When you want a new instance of a class, call the class object as if it were a function. Each call returns a new instance object:

anInstance = SomeName( )
another = SomeName( )

anInstance and another are two distinct instance objects, instances of the SomeName class. (See Recipe 4.18 for a class that does little more than this and yet is already quite useful.) You can freely bind (i.e., assign or set) and access (i.e., get) attributes (i.e., state) of an instance object:

anInstance.someNumber = 23 * 45
print anInstance.someNumber                # emits:1035

Instances of an “empty” class like SomeName have no behavior, but they may have state. Most often, however, you want instances to have behavior. Specify the behavior you want by defining methods (with def statements, just like you define functions) inside the class body:

class Behave(object):
    def _ _init_ _(self, name):
        self.name = name
    def once(self):
        print "Hello,", self.name
    def rename(self, newName)
        self.name = newName
    def repeat(self, N):
        for i in range(N): self.once( )

You define methods with the same def statement Python uses to define functions, exactly because methods are essentially functions. However, a method is an attribute of a class object, and its first formal argument is (by universal convention) named self. self always refers to the instance on which you call the method.

The method with the special name _ _init_ _ is also known as the constructor (or more properly the initializer) for instances of the class. Python calls this special method to initialize each newly created instance with the arguments that you passed when calling the class (except for self, which you do not pass explicitly since Python supplies it automatically). The body of _ _init_ _ typically binds attributes on the newly created self instance to appropriately initialize the instance’s state.

Other methods implement the behavior of instances of the class. Typically, they do so by accessing instance attributes. Also, methods often rebind instance attributes, and they may call other methods. Within a class definition, these actions are always done with the self.something syntax. Once you instantiate the class, however, you call methods on the instance, access the instance’s attributes, and even rebind them, using the theobject.something syntax:

beehive = Behave("Queen Bee")
beehive.repeat(3)
beehive.rename("Stinger")
beehive.once( )
print beehive.name
beehive.name = 'See, you can rebind it "from the outside" too, if you want'
beehive.repeat(2)

If you’re new to OOP in Python, you should try, in an interactive Python environment, the example snippets I have shown so far and those I’m going to show in the rest of this Introduction. One of the best interactive Python environments for such exploration is the GUI shell supplied as part of the free IDLE development environment that comes with Python.

In addition to the constructor (_ _init_ _), your class may have other special methods, meaning methods with names that start and end with two underscores. Python calls the special methods of a class when instances of the class are used in various operations and built-in functions. For example, len(x) returns x._ _len_ _( ); a+b normally returns a._ _add_ _(b); a[b] returns a._ _getitem_ _(b). Therefore, by defining special methods in a class, you can make instances of that class interchangeable with objects of built-in types, such as numbers, lists, and dictionaries.

Tip

Each operation and built-in function can try several special methods in some specific order. For example, a+b first tries a._ _add_ _(b), but, if that doesn’t pan out, the operation also gives object b a say in the matter, by next trying b._ _radd_ _(a). This kind of intrinsic structuring among special methods, that operations and built-in functions can provide, is an important added value of such functions and operations with respect to pure OO notation such as someobject.somemethod(arguments).

The ability to handle different objects in similar ways, known as polymorphism, is a major advantage of OOP. Thanks to polymorphism, you can call the same method on various objects, and each object can implement the method appropriately. For example, in addition to the Behave class, you might have another class that implements a repeat method with rather different behavior:

class Repeater(object):
    def repeat(self, N): print N*"*-*"

You can mix instances of Behave and Repeater at will, as long as the only method you call on each such instance is repeat:

aMix = beehive, Behave('John'), Repeater( ), Behave('world')
for whatever in aMix: whatever.repeat(3)

Other languages require inheritance, or the formal definition and implementation of interfaces, in order to enable such polymorphism. In Python, all you need is to have methods with the same signature (i.e., methods of the same name, callable with the same arguments). This signature-based polymorphism allows a style of programming that’s quite similar to generic programming (e.g., as supported by C++’s template classes and functions), without syntax cruft and without conceptual complications.

Python also uses inheritance, which is mostly a handy, elegant, structured way to reuse code. You can define a class by inheriting from another (i.e., subclassing the other class) and then adding or redefining (known as overriding) some methods:

class Subclass(Behave):
    def once(self): print '(%s)' % self.name
subInstance = Subclass("Queen Bee")
subInstance.repeat(3)

The Subclass class overrides only the once method, but you can also call the repeat method on subInstance, since Subclass inherits that method from the Behave superclass. The body of the repeat method calls once n times on the specific instance, using whatever version of the once method the instance has. In this case, each call uses the method from the Subclass class, which prints the name in parentheses, not the original version from the Behave class, which prints the name after a greeting. The idea of a method calling other methods on the same instance and getting the appropriately overridden version of each is important in every object-oriented language, including Python. It is also known as the Template Method Design Pattern.

The method of a subclass often overrides a method from the superclass, but also needs to call the method of the superclass as part of its own operation. You can do this in Python by explicitly getting the method as a class attribute and passing the instance as the first argument:

class OneMore(Behave):
    def repeat(self, N): Behave.repeat(self, N+1)
zealant = OneMore("Worker Bee")
zealant.repeat(3)

The OneMore class implements its own repeat method in terms of the method with the same name in its superclass, Behave, with a slight change. This approach, known as delegation, is pervasive in all programming. Delegation involves implementing some functionality by letting another existing piece of code do most of the work, often with some slight variation. An overriding method often is best implemented by delegating some of the work to the same method in the superclass. In Python, the syntax Classname.method(self, . . .) delegates to Classname’s version of the method. A vastly preferable way to perform superclass delegation, however, is to use Python’s built-in super:

class OneMore(Behave):
    def repeat(self, N): super(OneMore, self).repeat(N+1)

This super construct is equivalent to the explicit use of Behave.repeat in this simple case, but it also allows class OneMore to be used smoothly with multiple inheritance. Even if you’re not interested in multiple inheritance at first, you should still get into the habit of using super instead of explicit delegation to your base class by name—super costs nothing and it may prove very useful to you in the future.

Python does fully support multiple inheritance: one class can inherit from several other classes. In terms of coding, this feature is sometimes just a minor one that lets you use the mix-in class idiom, a convenient way to supply functionality across a broad range of classes. (See Recipe 6.20 and Recipe 6.12, for unusual but powerful examples of using the mix-in idiom.) However, multiple inheritance is particularly important because of its implications for object-oriented analysis—the way you conceptualize your problem and your solution in the first place. Single inheritance pushes you to frame your problem space via taxonomy (i.e., mutually exclusive classification). The real world doesn’t work like that. Rather, it resembles Jorge Luis Borges’ explanation in The Analytical Language of John Wilkins, from a purported Chinese encyclopedia, The Celestial Emporium of Benevolent Knowledge. Borges explains that all animals are divided into:

Those that belong to the Emperor
Embalmed ones
Those that are trained
Suckling pigs
Mermaids
Fabulous ones
Stray dogs
Those included in the present classification
Those that tremble as if they were mad
Innumerable ones
Those drawn with a very fine camelhair brush
Others
Those that have just broken a flower vase
Those that from a long way off look like flies

You get the point: taxonomy forces you to pigeonhole, fitting everything into categories that aren’t truly mutually exclusive. Modeling aspects of the real world in your programs is hard enough without buying into artificial constraints such as taxonomy. Multiple inheritance frees you from these constraints.

Ah, yes, that (object) thing—I had promised to come back to it later. Now that you’ve seen Python’s notation for inheritance, you realize that writing class X(object) means that class X inherits from class object. If you just write class Y:, you’re saying that Y doesn’t inherit from anything—Y, so to speak, “stands on its own”. For backwards compatibility, Python allows you to request such a rootless class, and, if you do, then Python makes class Y an “old-style” class, also known as a classic class, meaning a class that works just like all classes used to work in the Python versions of old. Python is very keen on backwards-compatibility.

For many elementary uses, you won’t notice the difference between classic classes and the new-style classes that are recommended for all new Python code you write. However, it’s important to underscore that classic classes are a legacy feature, not recommended for new code. Even within the limited compass of elementary OOP features that I cover in this Introduction, you will already feel some of the limitations of classic classes: for example, you cannot use super within classic classes, and in practice, you should not do any serious use of multiple inheritance with them. Many important features of today’s Python OOP, such as the property built-in, can’t work completely, if they even work at all, with old-style classes.

In practice, even if you’re maintaining a large body of legacy Python code, the next time you need to do any substantial maintenance on that code, you should take the little effort required to ensure all classes are new style: it’s a small job, and it will ease your future maintenance burden quite a bit. Instead of explicitly having all your classes inherit from object, an equivalent alternative is to add the following assignment statement close to the start of every module that defines any classes:

_ _metaclass_ _ = type

The built-in type is the metaclass of object and of every other new-style class and built-in type. That’s why inheriting from object or any built-in type makes a class new style: the class you’re coding gets the same metaclass as its base. A class without bases can get its metaclass from the module-global _ _metaclass_ _ variable, which is why the “state"ment I suggest suffices to ensure that any classes without explicit bases are made new-style. Even if you never make any other use of explicit metaclasses (a rather advanced subject that is, nevertheless, mentioned in several of this chapter’s recipes), this one simple use of them will stand you in good stead.

Metaclasses do not mean “deep, dark black magic”. When you execute any class statement, Python performs the following steps:

Remember the class name as a string, say n, and the class bases as a tuple, say b.

Execute the body of the class, recording all names that the body binds as keys in a new dictionary d, each with its associated value (e.g., each statement such as def f(self) just sets d['f'] to the function object the def statement builds).

Determine the appropriate metaclass, say M, by inheritance or by looking for name _ _metaclass_ _ in d and in the globals:

if '_ _metaclass_ _' in d: M = d['_ _metaclass_ _']
elif b: M = type(b[0])
elif '_ _metaclass_ _' in globals( ): M = globals( )['_ _metaclass_ _']
else: M = types.ClassType

types.ClassType is the metaclass of old-style classes, so this code implies that a class without bases is old style if the name '_ _metaclass_ _' is not set in the class body nor among the global variables of the current module.

Call M(n, b, d) and record the result as a variable with name n in whatever scope the class statement executed.

So, some metaclass M is always involved in the execution of any class statement. The metaclass is normally type for new-style classes, types.ClassType for old-style classes. You can set it up to use your own custom metaclass (normally a subclass of type), and that is where you may reasonably feel that things are getting a bit too advanced. However, understanding that a class statement, such as:

class Someclass(Somebase):
    _ _metaclass_ _ = type
    x = 23

is exactly equivalent to the assignment statement:

Someclass = type('Someclass', (Somebase,), {'x': 23})

does help a lot in understanding the exact semantics of the class statement.

6.1. Converting Among Temperature Scales

Credit: Artur de Sousa Rocha, Adde Nilsson

Problem

You want to convert easily among Kelvin, Celsius, Fahrenheit, and Rankine scales of temperature.

Solution

Rather than having a dozen functions to do all possible conversions, we can more elegantly package this functionality into a class:

class Temperature(object):
    coefficients = {'c': (1.0, 0.0, -273.15), 'f': (1.8, -273.15, 32.0),
                    'r': (1.8, 0.0, 0.0)}
    def _ _init_ _(self, **kwargs):
        # default to absolute (Kelvin) 0, but allow one named argument,
        # with name being k, c, f or r, to use any of the scales
        try:
            name, value = kwargs.popitem( )
        except KeyError:
            # no arguments, so default to k=0
            name, value = 'k', 0
        # error if there are more arguments, or the arg's name is unknown
        if kwargs or name not in 'kcfr':
            kwargs[name] = value             # put it back for diagnosis
            raise TypeError, 'invalid arguments %r' % kwargs
        setattr(self, name, float(value))
    def _ _getattr_ _(self, name):
        # maps getting of c, f, r, to computation from k
        try:
            eq = self.coefficients[name]
        except KeyError:
            # unknown name, give error message
            raise AttributeError, name
        return (self.k + eq[1]) * eq[0] + eq[2]
    def _ _setattr_ _(self, name, value):
        # maps settings of k, c, f, r, to setting of k; forbids others
        if name in self.coefficients:
            # name is c, f or r -- compute and set k
            eq = self.coefficients[name]
            self.k = (value - eq[2]) / eq[0] - eq[1]
        elif name == 'k':
            # name is k, just set it
            object._ _setattr_ _(self, name, value)
        else:
            # unknown name, give error message
            raise AttributeError, name
    def _ _str_ _(self):
        # readable, concise representation as string
        return "%s K" % self.k
    def _ _repr_ _(self):
        # detailed, precise representation as string
        return "Temperature(k=%r)" % self.k

Discussion

Converting between several different scales or units of measure is a task that’s subject to a “combinatorial explosion”: if we tackle it in the apparently obvious way, by providing a function for each conversion, then, to deal with n different units, we will have to write n * (n-1) functions.

A Python class can intercept attribute setting and getting, and perform computation on the fly in response. This power enables a much handier and more elegant architecture, as shown in this recipe for the specific case of temperatures.

Inside the class, we always hold the measurement in one reference unit or scale, Kelvin (absolute) degrees in the case of this recipe. We allow the setting of the value to happen through any of four attribute names ('k', 'r', 'c', 'f', abbreviations of the scales’ names), and compute and set the Kelvin-scale value appropriately. Vice versa, we also allow the “getting” of the value in any scale, through the same attribute names, computing the result on the fly. (Assuming you have saved the code in this recipe as te.py somewhere on your Python sys.path, you can import it as a module.) For example:

>>> from te import Temperature
>>> t = Temperature(f=70)        # 70 F is...
>>> print t.c                    # ...a bit over 21 C21.1111111111
>>> t.c = 23                     # 23 C is...
>>> print t.f                    # ...a bit over 73 F
73.4

_ _getattr_ _ and _ _setattr_ _ work better than named properties would in this case, since the form of the computation is the same for every attribute (except the reference 'k' one), and we only need to use different coefficients that we can most handily keep in a per-class dictionary, the one we name self.coefficients. It’s important to remember that _ _setattr_ _ is called on every setting of any attribute, so it must delegate to object the setting of attributes, which need to be recorded in the instance (the _ _setattr_ _ implementation in this recipe does just such a delegation for attribute k) and must raise an AttributeError exception for attributes that can’t be set. _ _getattr_ _, on the other hand, is called only upon the “getting” of an attribute that can’t be found by other, “normal” means (e.g., in the case of this recipe’s class, _ _getattr_ _ is not called for accesses to attribute k, which is recorded in the instance and thus gets found by normal means). _ _getattr_ _ must also raise an AttributeError exception for attributes that can’t be accessed.

6.2. Defining Constants

Credit: Alex Martelli

Problem

You need to define module-level variables (i.e., named constants) that client code cannot accidentally rebind.

Solution

You can install any object as if it were a module. Save the following code as module const.py on some directory on your Python sys.path:

class _const(object):
    class ConstError(TypeError): pass
    def _ _setattr_ _(self, name, value):
        if name in self._ _dict_ _:
            raise self.ConstError, "Can't rebind const(%s)" % name
        self._ _dict_ _[name] = value
    def _ _delattr_ _(self, name):
        if name in self._ _dict_ _:
            raise self.ConstError, "Can't unbind const(%s)" % name
        raise NameError, name
import syssys.modules[_ _name_ _] = _const( )

Now, any client code can import const, then bind an attribute on the const module just once, as follows:

const.magic = 23

Once the attribute is bound, the program cannot accidentally rebind or unbind it:

const.magic = 88      # raises const.ConstError
del const.magic       # raises const.ConstError

Discussion

In Python, variables can be rebound at will, and modules, differently from classes, don’t let you define special methods such as _ _setattr_ _ to stop rebinding. An easy solution is to install an instance as if it were a module.

Python performs no type-checks to force entries in sys.modules to actually be module objects. Therefore, you can install any object there and take advantage of attribute-access special methods (e.g., to prevent rebinding, to synthesize attributes on the fly in _ _getattr_ _, etc.), while still allowing client code to access the object with import somename. You may even see it as a more Pythonic Singleton-style idiom (but see Recipe 6.16).

This recipe ensures that a module-level name remains constantly bound to the same object once it has first been bound to it. This recipe does not deal with a certain object’s immutability, which is quite a different issue. Altering an object and rebinding a name are different concepts, as explained in Recipe 4.1. Numbers, strings, and tuples are immutable: if you bind a name in const to such an object, not only will the name always be bound to that object, but the object’s contents also will always be the same since the object is immutable. However, other objects, such as lists and dictionaries, are mutable: if you bind a name in const to, say, a list object, the name will always remain bound to that list object, but the contents of the list may change (e.g., items in it may be rebound or unbound, more items can be added with the object’s append method, etc.).

To make “read-only” wrappers around mutable objects, see Recipe 6.5. You might choose to have class _const’s _ _setattr_ _ method perform such wrapping implicitly. Say you have saved the code from Recipe 6.5 as module ro.py somewhere along your Python sys.path. Then, you need to add, at the start of module const.py:

import ro

and change the assignment self._ _dict_ _[name] = value, used in class _const’s _ _setattr_ _ method to:

    self._ _dict_ _[name] = ro.Readonly(value)

Now, when you set an attribute in const to some value, what gets bound there is a read-only wrapper to that value. The underlying value might still get changed by calling mutators on some other reference to that same value (object), but it cannot be accidentally changed through the attribute of “pseudo-module” const. If you want to avoid such “accidental changes through other references”, you need to take a copy, as explained in Recipe 4.1, so that there exist no other references to the value held by the read-only wrapper. Ensure that at the start of module const.py you have:

import ro, copy

and change the assignment in class _const’s _ _setattr_ _ method to:

    self._ _dict_ _[name] = ro.Readonly(copy.copy(value))

If you’re sufficiently paranoid, you might even use copy.deepcopy rather than plain copy.copy in this latest snippet. However, you may end up paying substantial amounts of memory, as well as losing some performance, by these kinds of excessive precautions. You should evaluate carefully whether so much prudence is really necessary for your specific application. Whatever you end up deciding about this issue, Python offers all the tools you need to implement exactly the amount of constantness you require.

The _const class presented in this recipe can be seen, in a sense, as the “complement” of the NoNewAttrs class, which is presented next in Recipe 6.3. This one ensures that already bound attributes can never be rebound but lets you freely bind new attributes; the other one, conversely, lets you freely rebind attributes that are already bound but blocks the binding of any new attribute.

6.3. Restricting Attribute Setting

Credit: Michele Simionato

Problem

Python normally lets you freely add attributes to classes and their instances. However, you want to restrict that freedom for some class.

Solution

Special method _ _setattr_ _ intercepts every setting of an attribute, so it lets you inhibit the addition of new attributes that were not already present. One elegant way to implement this idea is to code a class, a simple custom metaclass, and a wrapper function, all cooperating for the purpose, as follows:

def no_new_attributes(wrapped_setattr):
    """ raise an error on attempts to add a new attribute, while
        allowing existing attributes to be set to new values.
    """
    def _ _setattr_ _(self, name, value):
        if hasattr(self, name):    # not a new attribute, allow setting
            wrapped_setattr(self, name, value)
        else:                      # a new attribute, forbid adding it
            raise AttributeError("can't add attribute %r to %s" % (name, self))
    return _ _setattr_ _
class NoNewAttrs(object):
    """ subclasses of NoNewAttrs inhibit addition of new attributes, while
        allowing existing attributed to be set to new values.
    """
    # block the addition new attributes to instances of this class
    _ _setattr_ _ = no_new_attributes(object._ _setattr_ _)
    class _ _metaclass_ _(type):
        " simple custom metaclass to block adding new attributes to this class "
        _ _setattr_ _ = no_new_attributes(type._ _setattr_ _)

Discussion

For various reasons, you sometimes want to restrict Python’s dynamism. In particular, you may want to get an exception when a new attribute is accidentally set on a certain class or one of its instances. This recipe shows how to go about implementing such a restriction. The key point of the recipe is, don’t use _ _slots_ _ for this purpose: _ _slots_ _ is intended for a completely different task (i.e., saving memory by avoiding each instance having a dictionary, as it normally would, when you need to have vast numbers of instances of a class with just a few fixed attributes). _ _slots_ _ performs its intended task well but has various limitations when you try to stretch it to perform, instead, the task this recipe covers. (See Recipe 6.18 for an example of the appropriate use of _ _slots_ _ to save memory.)

Notice that this recipe inhibits the addition of runtime attributes, not only to class instances, but also to the class itself, thanks to the simple custom metaclass it defines. When you want to inhibit accidental addition of attributes, you usually want to inhibit it on the class as well as on each individual instance. On the other hand, existing attributes on both the class and its instances may be freely set to new values.

Here is an example of how you could use this recipe:

class Person(NoNewAttrs):
    firstname = ''
    lastname = ''
    def _ _init_ _(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname
    def _ _repr_ _(self):
        return 'Person(%r, %r)' % (self.firstname, self.lastname)
me = Person("Michere", "Simionato")
print me
# emits:Person('Michere', 'Simionato')
# oops, wrong value for firstname, can we fix it?  Sure, no problem!
me.firstname = "Michele"
print me
# emits: Person('Michele', 'Simionato')

The point of inheriting from NoNewAttrs is forcing yourself to “declare” all allowed attributes by setting them at class level in the body of the class itself. Any further attempt to set a new, “undeclared” attribute raises an AttributeError:

try: Person.address = ''
except AttributeError, err: print 'raised %r as expected' % err
try: me.address = ''
except AttributeError, err: print 'raised %r as expected' % err

In some ways, therefore, subclasses of NoNewAttr and their instances behave more like Java or C++ classes and instances, rather than normal Python ones. Thus, one use case for this recipe is when you’re coding in Python a prototype that you already know will eventually have to be recoded in a less dynamic language.

6.4. Chaining Dictionary Lookups

Credit: Raymond Hettinger

Problem

You have several mappings (usually dicts) and want to look things up in them in a chained way (try the first one; if the key is not there, then try the second one; and so on). Specifically, you want to make a single mapping object that “virtually merges” several others, by looking things up in them in a specified priority order, so that you can conveniently pass that one object around.

Solution

A mapping is a generalized, abstract version of a dictionary: a mapping provides an interface that’s similar to a dictionary’s, but it may use very different implementations. All dictionaries are mappings, but not vice versa. Here, you need to implement a mapping which sequentially tries delegating lookups to other mappings. A class is the right way to encapsulate this functionality:

class Chainmap(object):
    def _ _init_ _(self, *mappings):
        # record the sequence of mappings into which we must look
        self._mappings = mappings
    def _ _getitem_ _(self, key):
        # try looking up into each mapping in sequence
        for mapping in self._mappings:
            try:
                return mapping[key]
            except KeyError:
                pass
        # `key' not found in any mapping, so raise KeyError exception
        raise KeyError, key
    def get(self, key, default=None):
        # return self[key] if present, otherwise `default'
        try:
            return self[key]
        except KeyError:
            return default
    def _ _contains_ _(self, key):
        # return True if `key' is present in self, otherwise False
        try:
            self[key]
            return True
        except KeyError:
            return False

For example, you can now implement the same sequence of lookups that Python normally uses for any name: look among locals, then (if not found there) among globals, lastly (if not found yet) among built-ins:

import _ _builtin_ _
pylookup = Chainmap(locals( ), globals( ), vars(_ _builtin_ _))

Discussion

Chainmap relies on minimal functionality from the mappings it wraps: each of those underlying mappings must allow indexing (i.e., supply a special method _ _getitem_ _), and it must raise the standard exception KeyError when indexed with a key that the mapping does not know about. A Chainmap instance provides the same behavior, plus the handy get method covered in Recipe 4.9 and special method _ _contains_ _ (which conveniently lets you check whether some key k is present in a Chainmap instance c by just coding if k in c).

Besides the obvious and sensible limitation of being “read-only”, this Chainmap class has others—essentially, it is not a “full mapping” even within the read-only design choice. You can make any partial mapping into a “full mapping” by inheriting from class DictMixin (in standard library module UserDict) and supplying a few key methods (DictMixin implements the others). Here is how you could make a full (read-only) mapping from ChainMap and UserDict.DictMixin:

import UserDict
from sets import Set
class FullChainmap(Chainmap, UserDict.DictMixin):
    def copy(self):
        return self._ _class_ _(self._mappings)
    def _ _iter_ _(self):
        seen = Set( )
        for mapping in self._mappings:
            for key in mapping:
                if key not in seen:
                    yield key
                    seen.add(key)
    iterkeys = _ _iter_ _
    def keys(self):
        return list(self)

This class FullChainmap adds one requirement to the mappings it holds, besides the requirements posed by Chainmap: the mappings must be iterable. Also note that the implementation in Chainmap of methods get and _ _contains_ _ is redundant (although innocuous) once we subclass DictMixin, since DictMixin also implements those two methods (as well as many others) in terms of lower-level methods, just like Chainmap does. See Recipe 5.14 for more details about DictMixin.

6.5. Delegating Automatically as an Alternative to Inheritance

Credit: Alex Martelli, Raymond Hettinger

Problem

You’d like to inherit from a class or type, but you need some tweak that inheritance does not provide. For example, you want to selectively hide some of the base class’ methods, which inheritance doesn’t allow.

Solution

Inheritance is quite handy, but it’s not all-powerful. For example, it doesn’t let you hide methods or other attributes supplied by a base class. Containment with automatic delegation is often a good alternative. Say, for example, you need to wrap some objects to make them read-only; thus preventing accidental alterations. Therefore, besides stopping attribute-setting, you also need to hide mutating methods. Here’s a way:

# support 2.3 as well as 2.4
try: set
except NameError: from sets import Set as set
class ROError(AttributeError): pass
class Readonly: # there IS a reason to NOT subclass object, see Discussion
    mutators = {
        list: set('''_ _delitem_ _ _ _delslice_ _ _ _iadd_ _ _ _imul_ _
                 _ _setitem_ _ _ _setslice_ _ append extend insert
                 pop remove sort'''.split( )),
        dict: set('''_ _delitem_ _ _ _setitem_ _ clear pop popitem
                 setdefault update'''.split( )),
        }
    def _ _init_ _(self, o):
        object._ _setattr_ _(self, '_o', o)
        object._ _setattr_ _(self, '_no', self.mutators.get(type(o), ( )))
    def _ _setattr_ _(self, n, v):
        raise ROError, "Can't set attr %r on RO object" % n
    def _ _delattr_ _(self, n):
        raise ROError, "Can't del attr %r from RO object" % n
    def _ _getattr_ _(self, n):
        if n in self._no:
            raise ROError, "Can't get attr %r from RO object" % n
        return getattr(self._o, n)

Code using this class Readonly can easily add other wrappable types with Readonly.mutators[sometype] = the_mutators.

Discussion

Automatic delegation, which the special methods _ _getattr_ _, _ _setattr_ _, and _ _delattr_ _ enable us to perform so smoothly, is a powerful, general technique. In this recipe, we show how to use it to get an effect that is almost indistinguishable from subclassing while hiding some names. In particular, we apply this quasi-subclassing to the task of wrapping objects to make them read-only. Performance isn’t quite as good as it might be with real inheritance, but we get better flexibility and finer-grained control as compensation.

The fundamental idea is that each instance of our class holds an instance of the type we are wrapping (i.e., extending and/or tweaking). Whenever client code tries to get an attribute from an instance of our class, unless the attribute is specifically defined there (e.g., the mutators dictionary in class Readonly), _ _getattr_ _ transparently shunts the request to the wrapped instance after appropriate checks. In Python, methods are also attributes, accessed in just the same way, so we don’t need to do anything different to access methods. The _ _getattr_ _ approach used to access data attributes works for methods just as well.

This is where the comment in the recipe about there being a specific reason to avoid subclassing object comes in. Our _ _getattr_ _ based approach does work on special methods too, but only for instances of old-style classes. In today’s object model, Python operations access special methods on the class, not on the instance. Solutions to this issue are presented next in Recipe 6.6 and in Recipe 20.8. The approach adopted in this recipe—making class Readonly old style, so that the issue can be locally avoided and delegated to other recipes—is definitely not recommended for production code. I use it here only to keep this recipe shorter and to avoid duplicating coverage that is already amply given elsewhere in this cookbook.

_ _setattr_ _ plays a role similar to _ _getattr_ _, but it gets called when client code sets an instance attribute; in this case, since we want to make a read-only wrapper, we simply forbid the operation. Remember, to avoid triggering _ _setattr_ _ from inside the methods you code, you must never code normal self.n = v statements within the methods of classes that have _ _setattr_ _. The simplest workaround is to delegate the setting to class object, just like our class Readonly does twice in its _ _init_ _ method. Method _ _delattr_ _ completes the picture, dealing with any attempts to delete attributes from an instance.

Wrapping by automatic delegation does not work well with client or framework code that, one way or another, does type-testing. In such cases, the client or framework code is breaking polymorphism and should be rewritten. Remember not to use type-tests in your own client code, as you probably do not need them anyway. See Recipe 6.13 for better alternatives.

In old versions of Python, automatic delegation was even more prevalent, since you could not subclass built-in types. In modern Python, you can inherit from built-in types, so you’ll use automatic delegation less often. However, delegation still has its place—it is just a bit farther from the spotlight. Delegation is more flexible than inheritance, and sometimes such flexibility is invaluable. In addition to the ability to delegate selectively (thus effectively “hiding” some of the attributes), an object can delegate to different subobjects over time, or to multiple subobjects at one time, and inheritance doesn’t offer anything comparable.

Here is an example of delegating to multiple specific subobjects. Say that you have classes that are chock full of “forwarding methods”, such as:

class Pricing(object):
    def _ _init_ _(self, location, event):
        self.location = location
        self.event = event
    def setlocation(self, location):
        self.location = location
    def getprice(self):
        return self.location.getprice( )
    def getquantity(self):
        return self.location.getquantity( )
    def getdiscount(self):
        return self.event.getdiscount( )and many more such methods

Inheritance is clearly not applicable because an instance of Pricing must delegate to specific location and event instances, which get passed at initialization time and may even be changed. Automatic delegation to the rescue:

class AutoDelegator(object):
    delegates = ( )
    do_not_delegate = ( )
    def _ _getattr_ _(self, key):
        if key not in do_not_delegate:
            for d in self.delegates:
                try:
                    return getattr(d, key)
                except AttributeError:
                    pass
        raise AttributeError, key
class Pricing(AutoDelegator):
    def  _ _init_ _(self, location, event):
        self.delegates = [location, event]
    def setlocation(self, location):
        self.delegates[0] = location

In this case, we do not delegate the setting and deletion of attributes, only the getting of attributes (and nonspecial methods). Of course, this approach is fully applicable only when the methods (and other attributes) of the various objects to which we want to delegate do not interfere with each other; for example, location must not have a getdiscount method; otherwise, it would preempt the delegation of that method, which is intended to go to event.

If a class that does lots of delegation has a few such issues to solve, it can do so by explicitly defining the few corresponding methods, since _ _getattr_ _ enters the picture only for attributes and methods that cannot be found otherwise. The ability to hide some attributes and methods that are supplied by a delegate, but the delegator does not want to expose, is supported through attribute do_not_delegate, which any subclass may override. For example, if class Pricing wanted to hide a method setdiscount that is supplied by, say, event, only a tiny change would be required:

class Pricing(AutoDelegator):
    do_not_delegate = ('set_discount',)

while all the rest remains as in the previous snippet.

6.6. Delegating Special Methods in Proxies

Credit: Gonçalo Rodrigues

Problem

In the new-style object model, Python operations perform implicit lookups for special methods on the class (rather than on the instance, as they do in the classic object model). Nevertheless, you need to wrap new-style instances in proxies that can also delegate a selected set of special methods to the object they’re wrapping.

Solution

You need to generate each proxy’s class on the fly. For example:

class Proxy(object):
    """ base class for all proxies """
    def _ _init_ _(self, obj):
        super(Proxy, self)._ _init_ _(obj)
        self._obj = obj
    def _ _getattr_ _(self, attrib):
        return getattr(self._obj, attrib)
def make_binder(unbound_method):
    def f(self, *a, **k): return unbound_method(self._obj, *a, **k)
    # in 2.4, only: f._ _name_ _ = unbound_method._ _name_ _
    return f
known_proxy_classes = {  }
def proxy(obj, *specials):
    ''' factory-function for a proxy able to delegate special methods '''
    # do we already have a suitable customized class around?
    obj_cls = obj._ _class_ _
    key = obj_cls, specials
    cls = known_proxy_classes.get(key)
    if cls is None:
        # we don't have a suitable class around, so let's make it
        cls = type("%sProxy" % obj_cls._ _name_ _, (Proxy,), {  })
        for name in specials:
            name = '_ _%s_ _' % name
            unbound_method = getattr(obj_cls, name)
            setattr(cls, name, make_binder(unbound_method))
        # also cache it for the future
        known_proxy_classes[key] = cls
    # instantiate and return the needed proxy
    return cls(obj)

Discussion

Proxying and automatic delegation are a joy in Python, thanks to the _ _getattr_ _ hook. Python calls it automatically when a lookup for any attribute (including a method—Python draws no distinction there) has not otherwise succeeded.

In the old-style (classic) object model, _ _getattr_ _ also applied to special methods that were looked up as part of a Python operation. This required some care to avoid mistakenly supplying a special method one didn’t really want to supply but was otherwise handy. Nowadays, the new-style object model is recommended for all new code: it is faster, more regular, and richer in features. You get new-style classes when you subclass object or any other built-in type. One day, some years from now, Python 3.0 will eliminate the classic object model, as well as other features that are still around only for backwards-compatibility. (See http://www.python.org/peps/pep-3000.html for details about plans for Python 3.0—almost all changes will be language simplifications, rather than new features.)

In the new-style object model, Python operations don’t look up special methods at runtime: they rely on “slots” held in class objects. Such slots are updated when a class object is built or modified. Therefore, a proxy object that wants to delegate some special methods to an object it’s wrapping needs to belong to a specially made and tailored class. Fortunately, as this recipe shows, making and instantiating classes on the fly is quite an easy job in Python.

In this recipe, we don’t use any advanced Python concepts such as custom metaclasses and custom descriptors. Rather, each proxy is built by a factory function proxy, which takes as arguments the object to wrap and the names of special methods to delegate (shorn of leading and trailing double underscores). If you’ve saved the “Solution"’s code in a file named proxy.py somewhere along your Python sys.path, here is how you could use it from an interactive Python interpreter session:

>>> import proxy
>>> a = proxy.proxy([  ], 'len', 'iter')   # only delegate _ _len_ _ & _ _iter_ _
>>> a                                    # _ _repr_ _ is not delegated<proxy.listProxy object at 0x0113C370>
>>> a._ _class_ _
<class 'proxy.listProxy'>
>>> a._obj
[  ]
>>> a.append                             # all non-specials are delegated
<built-in method append of list object at 0x010F1A10>

Since _ _len_ _ is delegated, len(a) works as expected:

>>> len(a)0
>>> a.append(23)
>>> len(a)
1

Since _ _iter_ _ is delegated, for loops work as expected, as does intrinsic looping performed by built-ins such as list, sum, max, . . . :

>>> for x in a: print x
...23
>>> list(a)
[23]
>>> sum(a)
23
>>> max(a)
23

However, since _ _getitem_ _ is not delegated, a cannot be indexed nor sliced:

>>> a._ _getitem_ _<method-wrapper object at 0x010F1AF0>
>>> a[1]
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
TypeError: unindexable object

Function proxy uses a “cache” of classes it has previously generated, the global dictionary known_proxy_classes, keyed by the class of the object being wrapped and the tuple of special methods’ names being delegated. To make a new class, proxy calls the built-in type, passing as arguments the name of the new class (made by appending 'Proxy' to the name of the class being wrapped), class Proxy as the only base, and an “empty” class dictionary (since it’s adding no class attributes yet). Base class Proxy deals with initialization and delegation of ordinary attribute lookups. Then, factory function proxy loops over the names of specials to be delegated: for each of them, it gets the unbound method from the class of the object being wrapped, and sets it as an attribute of the new class within a make_binder closure. make_binder deals with calling the unbound method with the appropriate first argument (i.e., the object being wrapped, self._obj).

Once it’s done preparing a new class, proxy saves it in known_proxy_classes under the appropriate key. Finally, whether the class was just built or recovered from known_proxy_classes, proxy instantiates it, with the object being wrapped as the only argument, and returns the resulting proxy instance.

6.7. Implementing Tuples with Named Items

Credit: Gonçalo Rodrigues, Raymond Hettinger

Problem

Python tuples are handy ways to group pieces of information, but having to access each item by numeric index is a bother. You’d like to build tuples whose items are also accessible as named attributes.

Solution

A factory function is the simplest way to generate the required subclass of tuple:

# use operator.itemgetter if we're in 2.4, roll our own if we're in 2.3
try:
    from operator import itemgetter
except ImportError:
    def itemgetter(i):
        def getter(self): return self[i]
        return getter
def superTuple(typename, *attribute_names):
    " create and return a subclass of `tuple', with named attributes "
    # make the subclass with appropriate _ _new_ _ and _ _repr_ _ specials
    nargs = len(attribute_names)
    class supertup(tuple):
        _ _slots_ _ = ( )         # save memory, we don't need per-instance dict
        def _ _new_ _(cls, *args):
            if len(args) != nargs:
                raise TypeError, '%s takes exactly %d arguments (%d given)' % (
                                  typename, nargs, len(args))
            return tuple._ _new_ _(cls, args)
        def _ _repr_ _(self):
            return '%s(%s)' % (typename, ', '.join(map(repr, self)))
    # add a few key touches to our new subclass of `tuple'
    for index, attr_name in enumerate(attribute_names):
        setattr(supertup, attr_name, property(itemgetter(index)))
    supertup._ _name_ _ = typename
    return supertup

Discussion

You often want to pass data around by means of tuples, which play the role of C’s structs, or that of simple records in other languages. Having to remember which numeric index corresponds to which field, and accessing the fields by indexing, is often bothersome. Some Python Standard Library modules, such as time and os, which in old Python versions used to return tuples, have fixed the problem by returning, instead, instances of tuple-like types that let you access the fields by name, as attributes, as well as by index, as items. This recipe shows you how to get the same effect for your code, essentially by automatically building a custom subclass of tuple.

Orchestrating the building of a new, customized type can be achieved in several ways; custom metaclasses are often the best approach for such tasks. In this case, however, a simple factory function is quite sufficient, and you should never use more power than you need. Here is how you can use this recipe’s superTuple factory function in your code, assuming you have saved this recipe’s Solution as a module named supertuple.py somewhere along your Python sys.path:

>>> import supertuple
>>> Point = supertuple.superTuple('Point', 'x', 'y')
>>> Point<class 'supertuple.Point'>
>>> p = Point(1, 2, 3)              # wrong number of fields
Traceback (most recent call last):
                 File "", line 1, in ?
                 File "C:Python24Libsite-packagessuperTuple.py", line 16, in _ _new_ _
                   raise TypeError, '%s takes exactly %d arguments (%d given)' % (
               TypeError: Point takes exactly 2 arguments (3 given)
>>> p = Point(1, 2)                 # let's do it right this time
>>> p
Point(1, 2)
>>> print p.x, p.y
1 2

Function superTuple’s implementation is quite straightforward. To build the new subclass, superTuple uses a class statement, and in that statement’s body, it defines three specials: an “empty” _ _slots_ _ (just to save memory, since our supertuple instances don’t need any per-instance dictionary anyway); a _ _new_ _ method that checks the number of arguments before delegating to tuple._ _new_ _; and an appropriate _ _repr_ _ method. After the new class object is built, we set into it a property for each named attribute we want. Each such property has only a “getter”, since our supertuples, just like tuples themselves, are immutable—no setting of fields. Finally, we set the new class’ name and return the class object.

Each of the getters is easily built by a simple call to the built-in itemgetter from the standard library module operator. Since operator.itemgetter was introduced in Python 2.4, at the very start of our module we ensure we have a suitable itemgetter at hand anyway, even in Python 2.3, by rolling our own if necessary.

6.8. Avoiding Boilerplate Accessors for Properties

Credit: Yakov Markovitch

Problem

Your classes use some property instances where either the getter or the setter is just boilerplate code to fetch or set an instance attribute. You would prefer to just specify the attribute name, instead of writing boilerplate code.

Solution

You need a factory function that catches the cases in which either the getter or the setter argument is a string, and wraps the appropriate argument into a function, then delegates the rest of the work to Python’s built-in property:

def xproperty(fget, fset, fdel=None, doc=None):
    if isinstance(fget, str):
        attr_name = fget
        def fget(obj): return getattr(obj, attr_name)
    elif isinstance(fset, str):
        attr_name = fset
        def fset(obj, val): setattr(obj, attr_name, val)
    else:
        raise TypeError, 'either fget or fset must be a str'
    return property(fget, fset, fdel, doc)

Discussion

Python’s built-in property is very useful, but it presents one minor annoyance (it may be easier to see as an annoyance for programmers with experience in Delphi). It often happens that you want to have both a setter and a “getter”, but only one of them actually needs to execute any significant code; the other one simply needs to read or write an instance attribute. In that case, property still requires two functions as its arguments. One of the functions will then be just “boilerplate code” (i.e., repetitious plumbing code that is boring, and often voluminous, and thus a likely home for bugs).

For example, consider:

class Lower(object):
    def _ _init_ _(self, s=''):
        self.s = s
    def _getS(self):
        return self._s
    def _setS(self, s):
        self._s = s.lower( )
    s = property(_getS, _setS)

Method _getS is just boilerplate, yet you have to code it because you need to pass it to property. Using this recipe, you can make your code a little bit simpler, without changing the code’s meaning:

class Lower(object):
    def _ _init_ _(self, s=''):
        self.s = s
    def _setS(self, s):
        self._s = s.lower( )
    s = xproperty('_s', _setS)

The simplification doesn’t look like much in one small example, but, applied widely all over your code, it can in fact help quite a bit.

The implementation of factory function xproperty in this recipe’s Solution is rather rigidly coded: it requires you to pass both fget and fset, and exactly one of them must be a string. No use case requires that both be strings; when neither is a string, or when you want to have just one of the two accessors, you can (and should) use the built-in property directly. It is better, therefore, to have xproperty check that it is being used accurately, considering that such checks remove no useful functionality and impose no substantial performance penalty either.

6.9. Making a Fast Copy of an Object

Credit: Alex Martelli

Problem

You need to implement the special method _ _copy_ _ so that your class can cooperate with the copy.copy function. Because the _ _init_ _ method of your specific class happens to be slow, you need to bypass it and get an “empty”, uninitialized instance of the class.

Solution

Here’s a solution that works for both new-style and classic classes:

def empty_copy(obj):
    class Empty(obj._ _class_ _):
        def _ _init_ _(self): pass
    newcopy = Empty( )
    newcopy._ _class_ _ = obj._ _class_ _
    return newcopy

Your classes can use this function to implement _ _copy_ _ as follows:

class YourClass(object):
    def _ _init_ _(self):assume there's a lot of work here
    def _ _copy_ _(self):
        newcopy = empty_copy(self)
        copy some relevant subset of self's attributes to newcopy
        return newcopy

Here’s a usage example:

if _ _name_ _ == '_ _main_ _':
    import copy
    y = YourClass( )    # This, of course, does run _ _init_ _
    print y
    z = copy.copy(y)   # ...but this doesn't
    print z

Discussion

As covered in Recipe 4.1, Python doesn’t implicitly copy your objects when you assign them, which is a great thing because it gives fast, flexible, and uniform semantics. When you need a copy, you explicitly ask for it, often with the copy.copy function, which knows how to copy built-in types, has reasonable defaults for your own objects, and lets you customize the copying process by defining a special method _ _copy_ _ in your own classes. If you want instances of a class to be noncopyable, you can define _ _copy_ _ and raise a TypeError there. In most cases, you can just let copy.copy’s default mechanisms work, and you get free clonability for most of your classes. This is quite a bit nicer than languages that force you to implement a specific clone method for every class whose instances you want to be clonable.

A _ _copy_ _ method often needs to start with an “empty” instance of the class in question (e.g., self), bypassing _ _init_ _ when that is a costly operation. The simplest general way to do this is to use the ability that Python gives you to change an instance’s class on the fly: create a new object in a local empty class, then set the new object’s _ _class_ _ attribute, as the recipe’s code shows. Inheriting class Empty from obj._ _class_ _ is redundant (but quite innocuous) for old-style (classic) classes, but that inheritance makes the recipe compatible with all kinds of objects of classic or new-style classes (including built-in and extension types). Once you choose to inherit from obj’s class, you must override _ _init_ _ in class Empty, or else the whole purpose of the recipe is defeated. The override means that the _ _init_ _ method of obj’s class won’t execute, since Python, fortunately, does not automatically execute ancestor classes’ initializers.

Once you have an “empty” object of the required class, you typically need to copy a subset of self’s attributes. When you need all of the attributes, you’re better off not defining _ _copy_ _ explicitly, since copying all instance attributes is exactly copy.copy’s default behavior. Unless, of course, you need to do a little bit more than just copying instance attributes; in this case, these two alternative techniques to copy all attributes are both quite acceptable:

newcopy._ _dict_ _.update(self._ _dict_ _)
newcopy._ _dict_ _ = dict(self._ _dict_ _)

An instance of a new-style class doesn’t necessarily keep all of its state in _ _dict_ _, so you may need to do some class-specific state copying in such cases.

Alternatives based on the new standard module can’t be made transparent across classic and new-style classes, and neither can the _ _new_ _ static method that generates an empty instance—the latter is only defined in new-style classes, not classic ones. Fortunately, this recipe obviates any such issues.

A good alternative to implementing _ _copy_ _ is often to implement the methods _ _getstate_ _ and _ _setstate_ _ instead: these special methods define your object’s state very explicitly and intrinsically bypass _ _init_ _. Moreover, they also support serialization (i.e., pickling) of your class instances: see Recipe 7.4 for more information about these methods.

So far we have been discussing shallow copies, which is what you want most of the time. With a shallow copy, your object is copied, but objects it refers to (attributes or items) are not, so the newly copied object and the original object refer to the same items or attributes objects—a fast and lightweight operation. A deep copy is a heavyweight operation, potentially duplicating a large graph of objects that refer to each other. You get a deep copy by calling copy.deepcopy on an object. If you need to customize the way in which instances of your class are deep-copied, you can define the special method _ _deepcopy_ _:

class YourClass(object):...
    def _ _deepcopy_ _(self, memo):
        newcopy = empty_copy(self)
        # use copy.deepcopy(self.x, memo) to get deep copies of elements
        # in the relevant subset of self's attributes, to set in newcopy
        return newcopy

If you choose to implement _ _deepcopy_ _, remember to respect the memoization protocol that is specified in the Python documentation for standard module copy—get deep copies of all the attributes or items that are needed by calling copy.deepcopy with a second argument, the same memo dictionary that is passed to the _ _deepcopy_ _ method. Again, implementing _ _getstate_ _ and _ _setstate_ _ is often a good alternative, since these methods can also support deep copying: Python takes care of deeply copying the “state” object that _ _getstate_ _ returns, before passing it to the _ _setstate_ _ method of a new, empty instance. See Recipe 7.4 for more information about these special methods.

6.10. Keeping References to Bound Methods Without Inhibiting Garbage Collection

Credit: Joseph A. Knapka, Frédéric Jolliton, Nicodemus

Problem

You want to hold references to bound methods, while still allowing the associated object to be garbage-collected.

Solution

Weak references (i.e., references that indicate an object as long as that object is alive but don’t keep that object alive if there are no other, normal references to it) are an important tool in some advanced programming situations. The weakref module in the Python Standard Library lets you use weak references.

However, weakref’s functionality cannot directly be used for bound methods unless you take some precautions. To allow an object to be garbage-collected despite outstanding references to its bound methods, you need some wrappers. Put the following code in a file named weakmethod.py in some directory on your Python sys.path:

import weakref, new
class ref(object):
    """ Wraps any callable, most importantly a bound method, in
        a way that allows a bound method's object to be GC'ed, while
        providing the same interface as a normal weak reference. """
    def _ _init_ _(self, fn):
        try:
            # try getting object, function, and class
            o, f, c = fn.im_self, fn.im_func, fn.im_class
        except AttributeError:                # It's not a bound method
            self._obj = None
            self._func = fn
            self._clas = None
        else:                                 # It is a bound method
            if o is None: self._obj = None    # ...actually UN-bound
            else: self._obj = weakref.ref(o)  # ...really bound
            self._func = f
            self._clas = c
    def _ _call_ _(self):
        if self.obj is None: return self._func
        elif self._obj( ) is None: return None
        return new.instancemethod(self._func, self.obj( ), self._clas)

Discussion

A normal bound method holds a strong reference to the bound method’s object. That means that the object can’t be garbage-collected until the bound method is disposed of:

>>> class C(object):
...     def f(self):
...         print "Hello"
...     def _ _del_ _(self):
...         print "C dying"
...
>>> c = C( ) 
>>> cf = c.f
>>> del c      # c continues to wander about with glazed eyes...
>>> del cf     # ...until we stake its bound method, only then it goes away:C dying

This behavior is most often handy, but sometimes it’s not what you want. For example, if you’re implementing an event-dispatch system, it might not be desirable for the mere presence of an event handler (i.e., a bound method) to prevent the associated object from being reclaimed. The instinctive idea should then be to use weak references. However, a normal weakref.ref to a bound method doesn’t quite work the way one might expect, because bound methods are first-class objects. Weak references to bound methods are dead-on-arrival—that is, they always return None when dereferenced, unless another strong reference to the same bound-method object exists.

For example, the following code, based on the weakref module from the Python Standard Library, doesn’t print “Hello” but raises an exception instead:

>>> import weakref
>>> c = C( )
>>> cf = weakref.ref(c.f)
>>> cf         # Oops, better try the lightning again, Igor...<weakref at 80ce394; dead>
>>> cf( )( )
Traceback (most recent call last):
               File "", line 1, in ?
               TypeError: object of type 'None' is not callable

On the other hand, the class ref in the weakmethod module shown in this recipe allows you to have weak references to bound methods in a useful way:

>>> import weakmethod
>>> cf = weakmethod.ref(c.f)
>>> cf( )( )     # It LIVES! Bwahahahaha!Hello
>>> del c      # ...and it dies
C dying
>>> print cf( )
None

Calling the weakmethod.ref instance, which refers to a bound method, has the same semantics as calling a weakref.ref instance that refers to, say, a function object: if the referent has died, it returns None; otherwise, it returns the referent. Actually, in this case, it returns a freshly minted new.instancemethod (holding a strong reference to the object—so, be sure not to hold on to that, unless you do want to keep the object alive for a while!).

Note that the recipe is carefully coded so you can wrap into a ref instance any callable you want, be it a method (bound or unbound), a function, whatever; the weak references semantics, however, are provided only when you’re wrapping a bound method; otherwise, ref acts as a normal (strong) reference, holding the callable alive. This basically lets you use ref for wrapping arbitrary callables without needing to check for special cases.

If you want semantics closer to that of a weakref.proxy, they’re easy to implement, for example by subclassing the ref class given in this recipe. When you call a proxy, the proxy calls the referent with the same arguments. If the referent’s object no longer lives, then weakref.ReferenceError gets raised instead. Here’s an implementation of such a proxy class:

class proxy(ref):
    def _ _call_ _(self, *args, **kwargs):
        func = ref._ _call_ _(self)
        if func is None:
            raise weakref.ReferenceError('referent object is dead')
        else:
            return func(*args, **kwargs)
    def _ _eq_ _(self, other):
        if type(other) != type(self):
            return False
        return ref._ _call_ _(self) == ref._ _call_ _(other)

6.11. Implementing a Ring Buffer

Credit: Sébastien Keim, Paul Moore, Steve Alexander, Raymond Hettinger

Problem

You want to define a buffer with a fixed size, so that, when it fills up, adding another element overwrites the first (oldest) one. This kind of data structure is particularly useful for storing log and history information.

Solution

This recipe changes the buffer object’s class on the fly, from a nonfull buffer class to a full buffer class, when the buffer fills up:

class RingBuffer(object):
    """ class that implements a not-yet-full buffer """
    def _ _init_ _(self, size_max):
        self.max = size_max
        self.data = [  ]
    class _ _Full(object):
        """ class that implements a full buffer """
        def append(self, x):
            """ Append an element overwriting the oldest one. """
            self.data[self.cur] = x
            self.cur = (self.cur+1) % self.max
        def tolist(self):
            """ return list of elements in correct order. """
            return self.data[self.cur:] + self.data[:self.cur]
    def append(self, x):
        """ append an element at the end of the buffer. """
        self.data.append(x)
        if len(self.data) == self.max:
            self.cur = 0
            # Permanently change self's class from non-full to fullself._ _class_ _ = _ _Full
    def tolist(self):
        """ Return a list of elements from the oldest to the newest. """
        return self.data
# sample usage
if _ _name_ _ == '_ _main_ _':
    x = RingBuffer(5)
    x.append(1); x.append(2); x.append(3); x.append(4)
    print x._ _class_ _, x.tolist( )
    x.append(5)
    print x._ _class_ _, x.tolist( )
    x.append(6)
    print x.data, x.tolist( )
    x.append(7); x.append(8); x.append(9); x.append(10)
    print x.data, x.tolist( )

Discussion

A ring buffer is a buffer with a fixed size. When it fills up, adding another element overwrites the oldest one that was still being kept. It’s particularly useful for the storage of log and history information. Python has no direct support for this kind of structure, but it’s easy to construct one. The implementation in this recipe is optimized for element insertion.

The notable design choice in the implementation is that, since these objects undergo a nonreversible state transition at some point in their lifetimes—from nonfull buffer to full buffer (and behavior changes at that point)—I modeled that by changing self._ _class_ _. This works just as well for classic classes as for new-style ones, as long as the old and new classes of the object have the same slots (e.g., it works fine for two new-style classes that have no slots at all, such as RingBuffer and _ _Full in this recipe). Note that, differently from other languages, the fact that class _ _Full is implemented inside class RingBuffer does not imply any special relationship between these classes; that’s a good thing, too, because no such relationship is necessary.

Changing the class of an instance may be strange in many languages, but it is an excellent Pythonic alternative to other ways of representing occasional, massive, irreversible, and discrete changes of state that vastly affect behavior, as in this recipe. Fortunately, Python supports it for all kinds of classes.

Ring buffers (i.e., bounded queues, and other names) are quite a useful idea, but the inefficiency of testing whether the ring is full, and if so, doing something different, is a nuisance. The nuisance is particularly undesirable in a language like Python, where there’s no difficulty—other than the massive memory cost involved—in allowing the list to grow without bounds. So, ring buffers end up being underused in spite of their potential. The idea of assigning to _ _class_ _ to switch behaviors when the ring gets full is the key to this recipe’s efficiency: such class switching is a one-off operation, so it doesn’t make the steady-state cases any less efficient.

Alternatively, we might switch just two methods, rather than the whole class, of a ring buffer instance that becomes full:

class RingBuffer(object):
    def _ _init_ _(self,size_max):
        self.max = size_max
        self.data = [  ]
    def _full_append(self, x):
        self.data[self.cur] = x
        self.cur = (self.cur+1) % self.max
    def _full_get(self):
        return self.data[self.cur:]+self.data[:self.cur]
    def append(self, x):
        self.data.append(x)
        if len(self.data) == self.max:
            self.cur = 0
            # Permanently change self's methods from non-full to fullself.append = self._full_append
               self.tolist = self._full_get
    def tolist(self):
        return self.data

This method-switching approach is essentially equivalent to the class-switching one in the recipe’s solution, albeit through rather different mechanisms. The best approach is probably to use class switching when all methods must be switched in bulk and method switching only when you need finer granularity of behavior change. Class switching is the only approach that works if you need to switch any special methods in a new-style class, since intrinsic lookup of special methods during various operations happens on the class, not on the instance (classic classes differ from new-style ones in this aspect).

You can use many other ways to implement a ring buffer. In Python 2.4, in particular, you should consider subclassing the new type collections.deque, which supplies a “double-ended queue”, allowing equally effective additions and deletions from either end:

from collections import deque
class RingBuffer(deque):
    def _ _init_ _(self, size_max):
        deque._ _init_ _(self)
        self.size_max = size_max
    def append(self, datum):
        deque.append(self, datum)
        if len(self) > self.size_max:
            self.popleft( )
    def tolist(self):
        return list(self)

or, to avoid the if statement when at steady state, you can mix this idea with the idea of switching a method:

from collections import deque
class RingBuffer(deque):
    def _ _init_ _(self, size_max):
        deque._ _init_ _(self)
        self.size_max = size_max
    def _full_append(self, datum):
        deque.append(self, datum)
        self.popleft( )
    def append(self, datum):
        deque.append(self, datum)
        if len(self) == self.size_max:
            self.append = self._full_append
    def tolist(self):
        return list(self)

With this latest implementation, we need to switch only the append method (the tolist method remains the same), so method switching appears to be more appropriate than class switching.

6.12. Checking an Instance for Any State Changes

Credit: David Hughes

Problem

You need to check whether any changes to an instance’s state have occurred to selectively save instances that have been modified since the last “save” operation.

Solution

An effective solution is a mixin class—a class you can multiply inherit from and that is able to take snapshots of an instance’s state and compare the instance’s current state with the last snapshot to determine whether or not the instance has been modified:

import copy
class ChangeCheckerMixin(object):
    containerItems = {dict: dict.iteritems, list: enumerate}
    immutable = False
    def snapshot(self):
        ''' create a "snapshot" of self's state -- like a shallow copy, but
            recursing over container types (not over general instances:
            instances must keep track of their own changes if needed).  '''
        if self.immutable:
            return
        self._snapshot = self._copy_container(self._ _dict_ _)
    def makeImmutable(self):
        ''' the instance state can't change any more, set .immutable '''
        self.immutable = True
        try:
            del self._snapshot
        except AttributeError:
            pass
    def _copy_container(self, container):
        ''' semi-shallow copy, recursing on container types only '''
        new_container = copy.copy(container)
        for k, v in self.containerItems[type(new_container)](new_container):
            if type(v) in self.containerItems:
                new_container[k] = self._copy_container(v)
            elif hasattr(v, 'snapshot'):
                v.snapshot( )
        return new_container
    def isChanged(self):
        ''' True if self's state is changed since the last snapshot '''
        if self.immutable:
            return False
        # remove snapshot from self._ _dict_ _, put it back at the end
        snap = self._ _dict_ _.pop('_snapshot', None)
        if snap is None:
            return True
        try:
            return self._checkContainer(self._ _dict_ _, snap)
        finally:
            self._snapshot = snap
    def _checkContainer(self, container, snapshot):
        ''' return True if the container and its snapshot differ '''
        if len(container) != len(snapshot):
            return True
        for k, v in self.containerItems[type(container)](container):
            try:
                ov = snapshot[k]
            except LookupError:
                return True
            if self._checkItem(v, ov):
                return True
        return False
    def _checkItem(self, newitem, olditem):
        ''' compare newitem and olditem.  If they are containers, call
            self._checkContainer recursively.  If they're an instance with
            an 'isChanged' method, delegate to that method.  Otherwise,
            return True if the items differ. '''
        if type(newitem) != type(olditem):
            return True
        if type(newitem) in self.containerItems:
            return self._checkContainer(newitem, olditem)
        if newitem is olditem:
            method_isChanged = getattr(newitem, 'isChanged', None)
            if method_isChanged is None:
                return False
            return method_isChanged( )
        return newitem != olditem

Discussion

I often need change-checking functionality in my applications. For example, when a user closes the last GUI window over a certain document, I need to check whether the document was changed since the last “save” operation; if it was, then I need to pop up a small window to give the user a choice between saving the document, losing the latest changes, or canceling the window-closing operation.

The class ChangeCheckerMixin, which this recipe describes, satisfies this need. The idea is to multiply derive all of your data classes, meaning all classes that hold data the user views and may change, from ChangeCheckerMixin (as well as from any other bases they need). When the data has just been loaded from or saved to persistent storage, call method snapshot on the top-level, document data class instance. This call takes a “snapshot” of the current state, basically a shallow copy of the object but with recursion over containers, and calls the snapshot methods on any contained instance that has such a method. Any time afterward, you can call method isChanged on any data class instance to check whether the instance state was changed since the time of its last snapshot.

As container types, ChangeCheckerMixin, as presented, considers only list and dict. If you also use other types as containers, you just need to add them appropriately to the containerItems dictionary. That dictionary must map each container type to a function callable on an instance of that type to get an iterator on indices and values (with indices usable to index the container). Container type instances must also support being shallowly copied with standard library Python function copy.copy. For example, to add Python 2.4’s collections.deque as a container to a subclass of ChangeCheckerMixin, you can code:

import collections
class CCM_with_deque(ChangeCheckerMixin):
    containerItems = dict(ChangeCheckerMixin.containerItems)
    containerItems[collections.deque] = enumerate

since collections.deque can be “walked over” with enumerate, just like list can.

Here is a toy example of use for ChangeChecherMixin:

if _ _name_ _ == '_ _main_ _':
    class eg(ChangeCheckerMixin):
        def _ _init_ _(self, *a, **k):
            self.L = list(*a, **k)
        def _ _str_ _(self):
            return 'eg(%s)' % str(self.L)
        def _ _getattr_ _(self, a):
            return getattr(self.L, a)
    x = eg('ciao')
    print 'x =', x, 'is changed =', x.isChanged( )
    # emits: x = eg(['c', 'i', 'a', 'o']) is changed = True
    # now, assume x gets saved, then...:
    x.snapshot( )
    print 'x =', x, 'is changed =', x.isChanged( )
    # emits: x = eg(['c', 'i', 'a', 'o']) is changed = False
    # now we change x...:
    x.append('x')
    print 'x =', x, 'is changed =', x.isChanged( )
    # emits: x = eg(['c', 'i', 'a', 'o', 'x']) is changed = True

In class eg we only subclass ChanceCheckerMixin because we need no other bases. In particular, we cannot usefully subclass list because the change-checking functionality works only on state that is kept in an instance’s dictionary; so, we must hold a list object in our instance’s dictionary, and delegate to it as needed (in this toy example, we delegate all nonspecial methods, automatically, via _ _getattr_ _). With this precaution, we see that the isChanged method correctly reflects the crucial tidbit—whether the instance’s state has been changed since the last call to snapshot on the instance.

An implicit assumption of this recipe is that your application’s data class instances are organized in a hierarchical fashion. The tired old (but still valid) example is an invoice containing header data and detail lines. Each instance of the details data class could contain other instances, such as product details, which may not be modifiable in the current activity but are probably modifiable elsewhere. This is the reason for the immutable attribute and the makeImmutable method: when the attribute is set by calling the method, any outstanding snapshot for the instance is dropped to save memory, and further calls to either snapshot or isChanged can return very rapidly.

If your data does not lend itself to such hierarchical structuring, you may have to take full deep copies, or even “snapshot” a document instance by taking a full pickle of it, and check for changes by comparing the new pickle with the last one previously taken. That may be all right on very fast machines, or when the amount of data you’re handling is rather modest. In my tests, however, it shows up as being unacceptably slow for substantial amounts of data on more ordinary machines. This recipe, when your data organization is suitable for its application, can offer better performance. If some of your data classes also contain data that is automatically computed or, for other reasons, does not need to be saved, store such data in instances of subordinate classes (which do not inherit from ChangeCheckerMixin), rather than either holding the data as attributes or storing it in ordinary containers such as lists and dictionaries.

6.13. Checking Whether an Object Has Necessary Attributes

Credit: Alex Martelli

Problem

You need to check whether an object has certain necessary attributes before performing state-altering operations. However, you want to avoid type-testing because you know it interferes with polymorphism.

Solution

In Python, you normally just try performing whatever operations you need to perform. For example, here’s the simplest, no-checks code for doing a certain sequence of manipulations on a list argument:

def munge1(alist):
    alist.append(23)
    alist.extend(range(5))
    alist.append(42)
    alist[4] = alist[3]
    alist.extend(range(2))

If alist is missing any of the methods you’re calling (explicitly, such as append and extend; or implicitly, such as the calls to _ _getitem_ _ and _ _setitem_ _ implied by the assignment statement alist[4] = alist[3]), the attempt to access and call a missing method raises an exception. Function munge1 makes no attempt to catch the exception, so the execution of munge1 terminates, and the exception propagates to the caller of munge1. The caller may choose to catch the exception and deal with it, or terminate execution and let the exception propagate further back along the chain of calls, as appropriate.

This approach is usually just fine, but problems may occasionally occur. Suppose, for example, that the alist object has an append method but not an extend method. In this peculiar case, the munge1 function partially alters alist before an exception is raised. Such partial alterations are generally not cleanly undoable; depending on your application, they can sometimes be a bother.

To forestall the “partial alterations” problem, the first approach that comes to mind is to check the type of alist. Such a naive “Look Before You Leap” (LBYL) approach may look safer than doing no checks at all, but LBYL has a serious defect: it loses polymorphism! The worst approach of all is checking for equality of types:

def munge2(alist):
    if type(alist) is list:       # avery bad idea
        munge1(alist)
    else: raise TypeError, "expected list, got %s" % type(alist)

This even fails, without any good reason, when alist is an instance of a subclass of list. You can at least remove that huge defect by using isinstance instead:

def munge3(alist):
    if isinstance(alist, list):
        munge1(alist)
    else: raise TypeError, "expected list, got %s" % type(alist)

However, munge3 still fails, needlessly, when alist is an instance of a type or class that mimics list but doesn’t inherit from it. In other words, such type-checking sacrifices one of Python’s great strengths: signature-based polymorphism. For example, you cannot pass to munge3 an instance of Python 2.4’s collections.deque, which is a real pity because such a deque does supply all needed functionality and indeed can be passed to the original munge1 and work just fine. Probably a zillion sequence types are out there that, like deque, are quite acceptable to munge1 but not to munge3. Type-checking, even with isinstance, exacts an enormous price.

A far better solution is accurate LBYL, which is both safe and fully polymorphic:

def munge4(alist):
    # Extract all bound methods you need (get immediate exception,
    # without partial alteration, if any needed method is missing):
    append = alist.append
    extend = alist.extend
    # Check operations, such as indexing, to get an exception ASAP
    # if signature compatibility is missing:
    try: alist[0] = alist[0]
    except IndexError: pass    # An empty alist is okay
    # Operate: no exceptions are expected from this point onwards
    append(23)
    extend(range(5))
    append(42)
    alist[4] = alist[3]
    extend(range(2))

Discussion

Python functions are naturally polymorphic on their arguments because they essentially depend on the methods and behaviors of the arguments, not on the arguments’ types. If you check the types of arguments, you sacrifice this precious polymorphism, so, don’t! However, you may perform a few early checks to obtain some extra safety (particularly against partial alterations) without substantial costs.

Polymorphism (from Greek roots meaning “many shapes”) is the ability of code to deal with objects of different types in ways that are appropriate to each applicable type. Unfortunately, this useful term has been overloaded with all sorts of implications, to the point that many people think it’s somehow connected with such concepts as overloading (specifying different functions depending on call-time signatures) or subtyping (i.e., subclassing), which it most definitely isn’t.

Subclassing is often a useful implementation technique, but it’s not a necessary condition for polymorphism. Overloading is right out: Python just doesn’t let multiple objects with the same name live at the same time in the same scope, so you can’t have several functions or methods with the same name and scope, distinguished only by their signatures—a minor annoyance, at worst: just rename those functions or methods so that their name suffices to distinguish them.

Python’s functions are polymorphic (unless you take specific steps to break this very useful feature) because they just call methods on their arguments (explicitly or implicitly by performing operations such as arithmetic and indexing): as long as the arguments supply the needed methods, callable with the needed signatures, and those calls perform the appropriate behavior, everything just works.

The normal Pythonic way of life can be described as the Easier to Ask Forgiveness than Permission (EAFP) approach: just try to perform whatever operations you need, and either handle or propagate any exceptions that may result. It usually works great. The only real problem that occasionally arises is “partial alteration”: when you need to perform several operations on an object, just trying to do them all in natural order could result in some of them succeeding, and partially altering the object, before an exception is raised.

For example, suppose that munge1, as shown at the start of this recipe’s Solution, is called with an actual argument value for alist that has an append method but lacks extend. In this case, alist is altered by the first call to append; but then, the attempt to obtain and call extend raises an exception, leaving alist’s state partially altered, a situation that may be hard to recover from. Sometimes, a sequence of operations should ideally be atomic: either all of the alterations happen, and everything is fine, or none of them do, and an exception gets raised.

You can get closer to ideal atomicity by switching to the LBYL approach, but in an accurate, careful way. Extract all bound methods you’ll need, then noninvasively test the necessary operations (such as indexing on both sides of the assignment operator). Move on to actually changing the object state only if all of this succeeds. From that point onward, it’s far less likely (although not impossible) that exceptions will occur in midstream, leaving state partially altered. You could not reach 100% safety even with the strictest type-checking, after all: for example, you might run out of memory just smack in the middle of your operations. So, with or without type-checking, you don’t really ever guarantee atomicity—you just approach asymptotically to that desirable property.

Accurate LBYL generally offers a good trade-off in comparison to EAFP, assuming we need safeguards against partial alterations. The extra complication is modest, and the slowdown due to the checks is typically compensated by the extra speed gained by using bound methods through local names rather than explicit attribute access (at least if the operations include loops, which is often the case). It’s important to avoid overdoing the checks, and the assert statement can help with that. For example, you can add such checks as assert callable(append) to munge4. In this case, the compiler removes the assert entirely when you run the program with optimization (i.e., with flags -O or -OO passed to the python command), while performing the checks when the program is run for testing and debugging (i.e., without the optimization flags).

6.14. Implementing the State Design Pattern

Credit: Elmar Bschorer

Problem

An object in your program can switch among several “states”, and the object’s behavior must change along with the object’s state.

Solution

The key idea of the State Design Pattern is to objectify the “state” (with its several behaviors) into a class instance (with its several methods). In Python, you don’t have to build an abstract class to represent the interface that is common to the various states: just write the classes for the “state"s themselves. For example:

class TraceNormal(object):
    ' state for normal level of verbosity '
    def startMessage(self):
        self.nstr = self.characters = 0
    def emitString(self, s):
        self.nstr += 1
        self.characters += len(s)
    def endMessage(self):
        print '%d characters in %d strings' % (self.characters, self.nstr)
class TraceChatty(object):
    ' state for high level of verbosity '
    def startMessage(self):
        self.msg = [  ]
    def emitString(self, s):
        self.msg.append(repr(s))
    def endMessage(self):
        print 'Message: ', ', '.join(self.msg)
class TraceQuiet(object):
    ' state for zero level of verbosity '
    def startMessage(self): pass
    def emitString(self, s): pass
    def endMessage(self): pass
class Tracer(object):
    def _ _init_ _(self, state): self.state = state
    def setState(self, state): self.state = state
    def emitStrings(self, strings):
        self.state.startMessage( )
        for s in strings: self.state.emitString(s)
        self.state.endMessage( )
if _ _name_ _ == '_ _main_ _':
    t = Tracer(TraceNormal( ))
    t.emitStrings('some example strings here'.split( ))
# emits:21 characters in 4 strings
    t.setState(TraceQuiet( ))
    t.emitStrings('some example strings here'.split( ))
# emits nothing
    t.setState(TraceChatty( ))
    t.emitStrings('some example strings here'.split( ))
# emits: Message: 'some', 'example', 'strings', 'here'

Discussion

With the State Design Pattern, you can “factor out” a number of related behaviors of an object (and possibly some data connected with these behaviors) into an auxiliary state object, to which the main object delegates these behaviors as needed, through calls to methods of the “state” object. In Python terms, this design pattern is related to the idioms of rebinding an object’s whole _ _class_ _, as shown in Recipe 6.11, and rebinding just certain methods (shown in Recipe 2.14). This design pattern, in a sense, lies in between those Python idioms: you group a set of related behaviors, rather than switching either all behavior, by changing the object’s whole _ _class_ _, or each method on its own, without grouping. With relation to the classic design pattern terminology, this recipe presents a pattern that falls somewhere between the classic State Design Pattern and the classic Strategy Design Pattern.

This State Design Pattern has some extra oomph, compared to the related Pythonic idioms, because an appropriate amount of data can live together with the behaviors you’re delegating—exactly as much, or as little, as needed to support each specific behavior. In the examples given in this recipe’s Solution, for example, the different state objects differ greatly in the kind and amount of data they need: none at all for class TraceQuiet, just a couple of numbers for TraceNormal, a whole list of strings for TraceChatty. These responsibilities are usefully delegated from the main object to each specific “state object”.

In some cases, although not in the specific examples shown in this recipe, state objects may need to cooperate more closely with the main object, by calling main object methods or accessing main object attributes in certain circumstances. To allow this, the main object can pass as an argument either self or some bound method of self to methods of the “state” objects. For example, suppose that the functionality in this recipe’s Solution needs to be extended, in that the main object must keep track of how many lines have been emitted by messages it has sent. Tracer._ _init_ _ will have to add one per-instance initialization self.lines = 0, and the signature of the “state” object’s endMessage methods will have to be extended to def endMessage(self, tracer):. The implementation of endMessage in class TraceQuiet will just ignore the tracer argument, since it doesn’t actually emit any lines; the implementations in the other two classes will each add a statement tracer.lines += 1, since each of them emits one line per message.

As you see, the kind of closer coupling implied by this kind of extra functionality need not be particularly problematic. In particular, the key feature of the classic State Design Pattern, that state objects are the ones that handle state switching (while, in the Strategy Design Pattern, the switching comes from the outside), is just not enough of a big deal in Python to warrant considering the two design patterns as separate.

6.15. Implementing the “Singleton” Design Pattern

Credit: Jürgen Hermann

Problem

You want to make sure that only one instance of a class is ever created.

Solution

The _ _new_ _ staticmethod makes the task very simple:

class Singleton(object):
    """ A Pythonic Singleton """
    def _ _new_ _(cls, *args, **kwargs):
        if '_inst' not in vars(cls):
            cls._inst = type._ _new_ _(cls, *args, **kwargs)
        return cls._inst

Just have your class inherit from Singleton, and don’t override _ _new_ _. Then, all calls to that class (normally creations of new instances) return the same instance. (The instance is created once, on the first such call to each given subclass of Singleton during each run of your program.)

Discussion

This recipe shows the one obvious way to implement the “Singleton” Design Pattern in Python (see E. Gamma, et al., Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley). A Singleton is a class that makes sure only one instance of it is ever created. Typically, such a class is used to manage resources that by their nature can exist only once. See Recipe 6.16 for other considerations about, and alternatives to, the “Singleton” design pattern in Python.

We can complete the module with the usual self-test idiom and show this behavior:

if _ _name_ _ == '_ _main_ _':
    class SingleSpam(Singleton):
        def _ _init_ _(self, s): self.s = s
        def _ _str_ _(self): return self.s
    s1 = SingleSpam('spam')
    print id(s1), s1.spam( )
    s2 = SingleSpam('eggs')
    print id(s2), s2.spam( )

When we run this module as a script, we get something like the following output (the exact value of id does vary, of course):

8172684 spam
8172684 spam

The 'eggs' parameter passed when trying to instantiate s2 has been ignored, of course—that’s part of the price you pay for having a Singleton!

One issue with Singleton in general is subclassability. The way class Singleton is coded in this recipe, each descendant subclass, direct or indirect, will get a separate instance. Literally speaking, this violates the constraint of only one instance per class, depending on what one exactly means by it:

class Foo(Singleton): pass
class Bar(Foo): pass
f = Foo( ); b = Bar( )
print f is b, isinstance(f, Foo), isinstance(b, Foo)
# emitsFalse True True

f and b are separate instances, yet, according to the built-in function isinstance, they are both instances of Foo because isinstance applies the IS-A rule of OOP: an instance of a subclass IS-An instance of the base class too. On the other hand, if we took pains to return f again when b is being instantiated by calling Bar, we’d be violating the normal assumption that calling class Bar gives us an instance of class Bar, not an instance of a random superclass of Bar that just happens to have been instantiated earlier in the course of a run of the program.

In practice, subclassability of “Singleton"s is rather a headache, without any obvious solution. If this issue is important to you, the alternative Borg idiom, explained next in Recipe 6.16 may provide a better approach.

6.16. Avoiding the “Singleton” Design Pattern with the Borg Idiom

Credit: Alex Martelli, Alex A. Naanou

Problem

You want to make sure that only one instance of a class is ever created: you don’t care about the id of the resulting instances, just about their state and behavior, and you need to ensure subclassability.

Solution

Application needs (forces) related to the “Singleton” Design Pattern can be met by allowing multiple instances to be created while ensuring that all instances share state and behavior. This is more flexible than fiddling with instance creation. Have your class inherit from the following Borg class:

class Borg(object):
    _shared_state = {  }
    def _ _new_ _(cls, *a, **k):
        obj = object._ _new_ _(cls, *a, **k)
        obj._ _dict_ _ = cls._shared_state
        return obj

If you override _ _new_ _ in your class (very few classes need to do that), just remember to use Borg._ _new_ _, rather than object._ _new_ _, within your override. If you want instances of your class to share state among themselves, but not with instances of other subclasses of Borg, make sure that your class has, at class scope, the “state"ment:

    _shared_state = {  }

With this “data override”, your class doesn’t inherit the _shared_state attribute from Borg but rather gets its own. It is to enable this “data override” that Borg’s _ _new_ _ uses cls._shared_state instead of Borg._shared_state.

Discussion

Borg in action

Here’s a typical example of Borg use:

if _ _name_ _ == '_ _main_ _':
    class Example(Borg):
        name = None
        def _ _init_ _(self, name=None):
            if name is not None: self.name = name
        def _ _str_ _(self): return 'name->%s' % self.name
    a = Example('Lara')
    b = Example( )                  # instantiating b shares self.name with a
    print a, b
    c = Example('John Malkovich')  # making c changes self.name of a & b too
    print a, b, c
    b.name = 'Seven'               # setting b.name changes name of a & c too 
    print a, b, c

When running this module as a main script, the output is:

name->Lara name->Lara
name->John Malkovich name->John Malkovich name->John Malkovich
name->Seven name->Seven name->Seven

All instances of Example share state, so any setting of the name attribute of any instance, either in _ _init_ _ or directly, affects all instances equally. However, note that the instance’s ids differ; therefore, since we have not defined special methods _ _eq_ _ and _ _hash_ _, each instance can work as a distinct key in a dictionary. Thus, if we continue our sample code as follows:

    adict = {  }
    j = 0
    for i in a, b, c:
        adict[i] = j
        j = j + 1
    for i in a, b, c:
        print i, adict[i]

the output is:

name->Seven 0
name->Seven 1
name->Seven 2

If this behavior is not what you want, add _ _eq_ _ and _ _hash_ _ methods to the Example class or the Borg superclass. Having these methods might better simulate the existence of a single instance, depending on your exact needs. For example, here’s a version of Borg with these special methods added:

class Borg(object):
    _shared_state = {  }
    def _ _new_ _(cls, *a, **k):
        obj = object._ _new_ _(cls, *a, **k)
        obj._ _dict_ _ = cls._shared_state
        return obj
    def _ _hash_ _(self): return 9      # any arbitrary constant integer
    def _ _eq_ _(self, other):
        try: return self._ _dict_ _ is other._ _dict_ _
        except AttributeError: return False

With this enriched version of Borg, the example’s output changes to:

name->Seven 2
name->Seven 2
name->Seven 2

Borg, Singleton, or neither?

The Singleton Design Pattern has a catchy name, but unfortunately it also has the wrong focus for most purposes: it focuses on object identity, rather than on object state and behavior. The Borg design nonpattern makes all instances share state instead, and Python makes implementing this idea a snap.

In most cases in which you might think of using Singleton or Borg, you don’t really need either of them. Just write a Python module, with functions and module-global variables, instead of defining a class, with methods and per-instance attributes. You need to use a class only if you must be able to inherit from it, or if you need to take advantage of the class’ ability to define special methods. (See Recipe 6.2 for a way to combine some of the advantages of classes and modules.) Even when you do need a class, it’s usually unnecessary to include in the class itself any code to enforce the idea that one can’t make multiple instances of it; other, simpler idioms are generally preferable. For example:

class froober(object):
    def _ _init_ _(self):etc, etc
froober = froober( )

Now froober is by nature the only instance of its own class, since name 'froober' has been rebound to mean the instance, not the class. Of course, one might call froober._ _class_ _( ), but it’s not sensible to spend much energy taking precautions against deliberate abuse of your design intentions. Any obstacles you put in the way of such abuse, somebody else can bypass. Taking precautions against accidental misuse is way plenty. If the very simple idiom shown in this latest snippet is sufficient for your needs, use it, and forget about Singleton and Borg. Remember: do the simplest thing that could possibly work. On rare occasions, though, an idiom as simple as this one cannot work, and then you do need more.

The Singleton Design Pattern (described previously in Recipe 6.15) is all about ensuring that just one instance of a certain class is ever created. In my experience, Singleton is generally not the best solution to the problems it tries to solve, producing different kinds of issues in various object models. We typically want to let as many instances be created as necessary, but all with shared state. Who cares about identity? It’s state (and behavior) we care about. The alternate pattern based on sharing state, in order to solve roughly the same problems as Singleton does, has also been called Monostate. Incidentally, I like to call Singleton “Highlander” because there can be only one.

In Python, you can implement the Monostate Design Pattern in many ways, but the Borg design nonpattern is often best. Simplicity is Borg’s greatest strength. Since the _ _dict_ _ of any instance can be rebound, Borg in its _ _new_ _ rebinds the _ _dict_ _ of each of its instances to a class-attribute dictionary. Now, any reference or binding of an instance attribute will affect all instances equally. I thank David Ascher for suggesting the appropriate name Borg for this nonpattern. Borg is a nonpattern because it had no known uses at the time of its first publication (although several uses are now known): two or more known uses are part of the prerequisites for being a design pattern. See the detailed discussion at http://www.aleax.it/5ep.html.

An excellent article by Robert Martin about Singleton and Monostate can be found at http://www.objectmentor.com/resources/articles/SingletonAndMonostate.pdf. Note that most of the disadvantages that Martin attributes to Monostate are really due to the limitations of the languages that Martin is considering, such as C++ and Java, and just disappear when using Borg in Python. For example, Martin indicates, as Monostate’s first and main disadvantage, that “A non-Monostate class cannot be converted into a Monostate class through derivation”—but that is obviously not the case for Borg, which, through multiple inheritance, makes such conversions trivial.

Borg odds and ends

The _ _getattr_ _ and _ _setattr_ _ special methods are not involved in Borg’s operations. Therefore, you can define them independently in your subclass, for whatever other purposes you may require, or you may leave these special methods undefined. Either way is not a problem because Python does not call _ _setattr_ _ in the specific case of the rebinding of the instance’s _ _dict_ _ attribute.

Borg does not work well for classes that choose to keep some or all of their per-instance state somewhere other than in the instance’s _ _dict_ _. So, in subclasses of Borg, avoid defining _ _slots_ _—that’s a memory-footprint optimization that would make no sense, anyway, since it’s meant for classes that have a large number of instances, and Borg subclasses will effectively have just one instance! Moreover, instead of inheriting from built-in types such as list or dict, your Borg subclasses should use wrapping and automatic delegation, as shown previously Recipe 6.5. (I named this latter twist “DeleBorg,” in my paper available at http://www.aleax.it/5ep.html.)

Saying that Borg “is a Singleton” would be as silly as saying that a portico is an umbrella. Both serve similar purposes (letting you walk in the rain without getting wet)—solve similar forces, in design pattern parlance—but since they do so in utterly different ways, they’re not instances of the same pattern. If anything, as already mentioned, Borg has similarities to the Monostate alternative design pattern to Singleton. However, Monostate is a design pattern, while Borg is not; also, a Python Monostate could perfectly well exist without being a Borg. We can say that Borg is an idiom that makes it easy and effective to implement Monostate in Python.

For reasons mysterious to me, people often conflate issues germane to Borg and Highlander with other, independent issues, such as access control and, particularly, access from multiple threads. If you need to control access to an object, that need is exactly the same whether there is one instance of that object’s class or twenty of them, and whether or not those instances share state. A fruitful approach to problem-solving is known as divide and conquer—making problems easier to solve by splitting apart their different aspects. Making problems more difficult to solve by joining together several aspects must be an example of an approach known as unite and suffer!

6.17. Implementing the Null Object Design Pattern

Credit: Dinu C. Gherman, Holger Krekel

Problem

You want to reduce the need for conditional statements in your code, particularly the need to keep checking for special cases.

Solution

The usual placeholder object for “there’s nothing here” is None, but we may be able to do better than that by defining a class meant exactly to act as such a placeholder:

class Null(object):
    """ Null objects always and reliably "do nothing." """
    # optional optimization: ensure only one instance per subclass
    # (essentially just to save memory, no functional difference)
    def _ _new_ _(cls, *args, **kwargs):
        if '_inst' not in vars(cls):
            cls._inst = type._ _new_ _(cls, *args, **kwargs)
        return cls._inst
    def _ _init_ _(self, *args, **kwargs): pass
    def _ _call_ _(self, *args, **kwargs): return self
    def _ _repr_ _(self): return "Null( )"
    def _ _nonzero_ _(self): return False
    def _ _getattr_ _(self, name): return self
    def _ _setattr_ _(self, name, value): return self
    def _ _delattr_ _(self, name): return self

Discussion

You can use an instance of the Null class instead of the primitive value None. By using such an instance as a placeholder, instead of None, you can avoid many conditional statements in your code and can often express algorithms with little or no checking for special values. This recipe is a sample implementation of the Null Object Design Pattern. (See B. Woolf, “The Null Object Pattern” in Pattern Languages of Programming [PLoP 96, September 1996].)

This recipe’s Null class ignores all parameters passed when constructing or calling instances, as well as any attempt to set or delete attributes. Any call or attempt to access an attribute (or a method, since Python does not distinguish between the two, calling _ _getattr_ _ either way) returns the same Null instance (i.e., self—no reason to create a new instance). For example, if you have a computation such as:

def compute(x, y):
    try:lots of computation here to return some appropriate object
    except SomeError:
        return None

and you use it like this:

for x in xs:
    for y in ys:
        obj = compute(x, y)
        if obj is not None:
            obj.somemethod(y, x)

you can usefully change the computation to:

def compute(x, y):
    try:lots of computation here to return some appropriate object
    except SomeError:
        return Null( )

and thus simplify its use down to:

for x in xs:
    for y in ys:
        compute(x, y).somemethod(y, x)

The point is that you don’t need to check whether compute has returned a real result or an instance of Null: even in the latter case, you can safely and innocuously call on it whatever method you want. Here is another, more specific use case:

log = err = Null( )
if verbose:
   log = open('/tmp/log', 'w')
   err = open('/tmp/err', 'w')
log.write('blabla')
err.write('blabla error')

This obviously avoids the usual kind of “pollution” of your code from guards such as if verbose: strewn all over the place. You can now call log.write('bla'), instead of having to express each such call as if log is not None: log.write('bla').

In the new object model, Python does not call _ _getattr_ _ on an instance for any special methods needed to perform an operation on the instance (rather, it looks up such methods in the instance class’ slots). You may have to take care and customize Null to your application’s needs regarding operations on null objects, and therefore special methods of the null objects’ class, either directly in the class’ sources or by subclassing it appropriately. For example, with this recipe’s Null, you cannot index Null instances, nor take their length, nor iterate on them. If this is a problem for your purposes, you can add all the special methods you need (in Null itself or in an appropriate subclass) and implement them appropriately—for example:

class SeqNull(Null):
    def _ _len_ _(self): return 0
    def _ _iter_ _(self): return iter(( ))
    def _ _getitem_ _(self, i): return self
    def _ _delitem_ _(self, i): return self
    def _ _setitem_ _(self, i, v): return self

Similar considerations apply to several other operations.

The key goal of Null objects is to provide an intelligent replacement for the often-used primitive value None in Python. (Other languages represent the lack of a value using either null or a null pointer.) These nobody-lives-here markers/placeholders are used for many purposes, including the important case in which one member of a group of otherwise similar elements is special. This usage usually results in conditional statements all over the place to distinguish between ordinary elements and the primitive null (e.g., None) value, but Null objects help you avoid that.

Among the advantages of using Null objects are the following:

Superfluous conditional statements can be avoided by providing a first-class object alternative for the primitive value None, thereby improving code readability.
Null objects can act as placeholders for objects whose behavior is not yet implemented.
Null objects can be used polymorphically with instances of just about any other class (perhaps needing suitable subclassing for special methods, as previously mentioned).
Null objects are very predictable.

The one serious disadvantage of Null is that it can hide bugs. If a function returns None, and the caller did not expect that return value, the caller most likely will soon thereafter try to call a method or perform an operation that None doesn’t support, leading to a reasonably prompt exception and traceback. If the return value that the caller didn’t expect is a Null, the problem might stay hidden for a longer time, and the exception and traceback, when they eventually happen, may therefore be harder to reconnect to the location of the defect in the code. Is this problem serious enough to make using Null inadvisable? The answer is a matter of opinion. If your code has halfway decent unit tests, this problem will not arise; while, if your code lacks decent unit tests, then using Null is the least of your problems. But, as I said, it boils down to a matter of opinions. I use Null very widely, and I’m extremely happy with the effect it has had on my productivity.

The Null class as presented in this recipe uses a simple variant of the “Singleton” pattern (shown earlier in Recipe 6.15), strictly for optimization purposes—namely, to avoid the creation of numerous passive objects that do nothing but take up memory. Given all the previous remarks about customization by subclassing, it is, of course, crucial that the specific implementation of “Singleton” ensures a separate instance exists for each subclass of Null that gets instantiated. The number of subclasses will no doubt never be so high as to eat up substantial amounts of memory, and anyway this per-subclass distinction can be semantically crucial.

6.18. Automatically Initializing Instance Variables from _ _init_ _ Arguments

Credit: Peter Otten, Gary Robinson, Henry Crutcher, Paul Moore, Peter Schwalm, Holger Krekel

Problem

You want to avoid writing and maintaining _ _init_ _ methods that consist of almost nothing but a series of self.something = something assignments.

Solution

You can “factor out” the attribute-assignment task to an auxiliary function:

def attributesFromDict(d):
    self = d.pop('self')
    for n, v in d.iteritems( ):
        setattr(self, n, v)

Now, the typical boilerplate code for an _ _init_ _ method such as:

    def _ _init_ _(self, foo, bar, baz, boom=1, bang=2):
        self.foo = foo
        self.bar = bar
        self.baz = baz
        self.boom = boom
        self.bang = bang

can become a short, crystal-clear one-liner:

    def _ _init_ _(self, foo, bar, baz, boom=1, bang=2):
        attributesFromDict(locals( ))

Discussion

As long as no additional logic is in the body of _ _init_ _, the dict returned by calling the built-in function locals contains only the arguments that were passed to _ _init_ _ (plus those arguments that were not passed but have default values). Function attributesFromDict extracts the object, relying on the convention that the object is always an argument named 'self', and then interprets all other items in the dictionary as names and values of attributes to set. A similar but simpler technique, not requiring an auxiliary function, is:

    def _ _init_ _(self, foo, bar, baz, boom=1, bang=2): 
        self._ _dict_ _.update(locals( )) 
        del self.self

However, this latter technique has a serious defect when compared to the one presented in this recipe’s Solution: by setting attributes directly into self._ _dict_ _ (through the latter’s update method), it does not play well with properties and other advanced descriptors, while the approach in this recipe’s Solution, using built-in setattr, is impeccable in this respect.

attributesFromDict is not meant for use in an _ _init_ _ method that contains more code, and specifically one that uses some local variables, because attributesFromDict cannot easily distinguish, in the dictionary that is passed as its only argument d, between arguments of _ _init_ _ and other local variables of _ _init_ _. If you’re willing to insert a little introspection in the auxiliary function, this limitation may be overcome:

def attributesFromArguments(d):
    self = d.pop('self')
    codeObject = self._ _init_ _.im_func.func_code
    argumentNames = codeObject.co_varnames[1:codeObject.co_argcount]
    for n in argumentNames:
        setattr(self, n, d[n])

By extracting the code object of the _ _init_ _ method, function attributesFromArguments is able to limit itself to the names of _ _init_ _’s arguments. Your _ _init_ _ method can then call attributesFromArguments(locals( )), instead of attributesFromDict(locals( )), if and when it needs to continue, after the call, with more code that may define other local variables.

The key limitation of attributesFromArguments is that it does not support _ _init_ _ having a last special argument of the **kw kind. Such support can be added, with yet more introspection, but it would require more black magic and complication than the functionality is probably worth. If you nevertheless want to explore this possibility, you can use the inspect module of the standard library, rather than the roll-your-own approach used in function attributeFromArguments, for introspection purposes. inspect.getargspec(self._ _init_ _) gives you both the argument names and the indication of whether self._ _init_ _ accepts a **kw form. See Recipe 6.19 for more information about function inspect.getargspec. Remember the golden rule of Python programming: “Let the standard library do it!”

6.19. Calling a Superclass _ _init_ _ Method If It Exists

Credit: Alex Martelli

Problem

You want to ensure that _ _init_ _ is called for all superclasses that define it, and Python does not do this automatically.

Solution

As long as your class is new-style, the built-in super makes this task easy (if all superclasses’ _ _init_ _ methods also use super similarly):

class NewStyleOnly(A, B, C):
    def _ _init_ _(self):super(NewStyleOnly, self)._ _init_ _( )
               initialization specific to subclass NewStyleOnly

Discussion

Classic classes are not recommended for new code development: they exist only to guarantee backwards compatibility with old versions of Python. Use new-style classes (deriving directly or indirectly from object) for all new code. The only thing you cannot do with a new-style class is to raise its instances as exception objects; exception classes must therefore be old style, but then, you do not need the functionality of this recipe for such classes. Since the rest of this recipe’s Discussion is therefore both advanced and of limited applicability, you may want to skip it.

Still, it may happen that you need to retrofit this functionality into a classic class, or, more likely, into a new-style class with some superclasses that do not follow the proper style of cooperative superclass method-calling with the built-in super. In such cases, you should first try to fix the problematic premises—make all classes new style and make them use super properly. If you absolutely cannot fix things, the best you can do is to have your class loop over its base classes—for each base, check whether it has an _ _init_ _, and if so, then call it:

class LookBeforeYouLeap(X, Y, Z):
    def _ _init_ _(self):
        for base in self_ _class_ _._ _bases_ _:
            if hasattr(base, '_ _init_ _'):
                base._ _init_ _(self)initialization specific to subclass LookBeforeYouLeap

More generally, and not just for method _ _init_ _, we often want to call a method on an instance, or class, if and only if that method exists; if the method does not exist on that class or instance, we do nothing, or we default to another action. The technique shown in the “Solution”, based on built-in super, is not applicable in general: it only works on superclasses of the current object, only if those superclasses also use super appropriately, and only if the method in question does exist in some superclass. Note that all new-style classes do have an _ _init_ _ method: they all subclass object, and object defines _ _init_ _ (as a do-nothing function that accepts and ignores any arguments). Therefore, all new-style classes have an _ _init_ _ method, either by inheritance or by override.

The LBYL technique shown in class LookBeforeYouLeap may be of help in more general cases, including ones that involve methods other than _ _init_ _. Indeed, LBYL may even be used together with super, for example, as in the following toy example:

class Base1(object):
    def met(self):
        print 'met in Base1'
class Der1(Base1):
    def met(self):
        s = super(Der1, self)
        if hasattr(s, 'met'):
            s.met( )
        print 'met in Der1'
class Base2(object):
    pass
class Der2(Base2):
    def met(self):
        s = super(Der2, self)
        if hasattr(s, 'met'):
            s.met( )
        print 'met in Der2'
Der1( ).met( )
Der2( ).met( )

This snippet emits:

met in Base1
met in Der1
met in Der2

The implementation of met has the same structure in both derived classes, Der1 (whose superclass Base1 does have a method named met) and Der2 (whose superclass Base1 doesn’t have such a method). By binding a local name s to the result of super, and checking with hasattr that the superclass does have such a method before calling it, this LBYL structure lets you code in the same way in both cases. Of course, when coding a subclass, you do normally know which methods the superclasses have, and whether and how you need to call them. Still, this technique can provide a little extra flexibility for those occasions in which you need to slightly decouple the subclass from the superclass.

The LBYL technique is far from perfect, though: a superclass might define an attribute named met, which is not callable or needs a different number of arguments. If your need for flexibility is so extreme that you must ward against such occurrences, you can extract the superclass’ method object (if any) and check it with the getargspec function of standard library module inspect.

While pushing this idea towards full generality can lead into rather deep complications, here is one example of how you might code a class with a method that calls the superclass’ version of the same method only if the latter is callable without arguments:

import inspect
class Der(A, B, C, D):
    def met(self):
        s = super(Der, self)
        # get the superclass's bound-method object, or else None
        m = getattr(s, 'met', None)
        try:
            args, varargs, varkw, defaults = inspect.getargspec(m)
        except TypeError:
            # m is not a method, just ignore it
            pass
        else:
            # m is a method, do all its arguments have default values?
            if len(defaults) == len(args):
                # yes! so, call it:
                m( )
        print 'met in Der'

inspect.getargspec raises a TypeError if its argument is not a method or function, so we catch that case with a try/except statement, and if the exception occurs, we just ignore it with a do-nothing pass statement in the except clause. To simplify our code a bit, we do not first check separately with hasattr. Rather, we get the 'met' attribute of the superclass by calling getattr with a third argument of None. Thus, if the superclass does not have any attribute named 'met', m is set to None, later causing exactly the same TypeError that we have to catch (and ignore) anyway—two birds with one stone. If the call to inspect.getargspec in the try clause does not raise a TypeError, execution continues with the else clause.

If inspect.getargspec doesn’t raise a TypeError, it returns a tuple of four items, and we bind each item to a local name. In this case, the ones we care about are args, a list of m’s argument names, and defaults, a tuple of default values that m provides for its arguments. Clearly, we can call m without arguments if and only if m provides default values for all of its arguments. So, we check that there are just as many default values as arguments, by comparing the lengths of list args and tuple defaults, and call m only if the lengths are equal.

No doubt you don’t need such advanced introspection and such careful checking in most of the code you write, but, just in case you do, Python does supply all the tools you need to achieve it.

6.20. Using Cooperative Supercalls Concisely and Safely

Credit: Paul McNett, Alex Martelli

Problem

You appreciate the cooperative style of multiple-inheritance coding supported by the super built-in, but you wish you could use that style in a more terse and concise way.

Solution

A good solution is a mixin class—a class you can multiply inherit from, that uses introspection to allow more terse coding:

import inspect
class SuperMixin(object):
    def super(cls, *args, **kwargs):
        frame = inspect.currentframe(1)
        self = frame.f_locals['self']
        methodName = frame.f_code.co_name
        method = getattr(super(cls, self), methodName, None)
        if inspect.ismethod(method):
            return method(*args, **kwargs)
    super = classmethod(super)

Any class cls that inherits from class SuperMixin acquires a magic method named super: calling cls.super(args) from within a method named somename of class cls is a concise way to call super(cls, self).somename(args). Moreover, the call is safe even if no class that follows cls in Method Resolution Order (MRO) defines any method named somename.

Discussion

Here is a usage example:

if _ _name_ _ == '_ _main_ _':
    class TestBase(list, SuperMixin):
        # note: no myMethod defined here
        pass
    class MyTest1(TestBase):
        def myMethod(self):
            print "in MyTest1"
            MyTest1.super( )
    class MyTest2(TestBase):
        def myMethod(self):
            print "in MyTest2"
            MyTest2.super( )
    class MyTest(MyTest1, MyTest2):
        def myMethod(self):
            print "in MyTest"
            MyTest.super( )
    MyTest( ).myMethod( )
# emits:
#in MyTest
# in MyTest1
# in MyTest2

Python has been offering “new-style” classes for years, as a preferable alternative to the classic classes that you get by default. Classic classes exist only for backwards-compatibility with old versions of Python and are not recommended for new code. Among the advantages of new-style classes is the ease of calling superclass implementations of a method in a “cooperative” way that fully supports multiple inheritance, thanks to the super built-in.

Suppose you have a method in a new-style class cls, which needs to perform a task and then delegate the rest of the work to the superclass implementation of the same method. The code idiom is:

def somename(self, *args):...some preliminary task...
    return super(cls, self).somename(*args)

This idiom suffers from two minor issues: it’s slightly verbose, and it also depends on a superclass offering a method somename. If you want to make cls less coupled to other classes, and therefore more robust, by removing the dependency, the code gets even more verbose:

def somename(self, *args):...some preliminary task...
    try:
        super_method = super(cls, self).somename
    except AttributeError:
        return None
    else:
        return super_method(*args)

The mixin class SuperMixin shown in this recipe removes both issues. Just ensure cls inherits, directly or indirectly, from SuperMixin (alongside any other base classes you desire), and then you can code, concisely and robustly:

def somename(self, *args):...some preliminary task...
    return cls.super(*args)

The classmethod SuperMixin.super relies on simple introspection to get the self object and the name of the method, then internally uses built-ins super and getattr to get the superclass method, and safely call it only if it exists. The introspection is performed through the handy inspect module of the standard Python library, making the whole task even simpler.

Table of Contents for 6. Object-Oriented Programming

Create new playlist

Sign In

Sign Up

Chapter 6. Object-Oriented Programming

Introduction

Tip

6.1. Converting Among Temperature Scales

Problem

Solution

Discussion

See Also

6.2. Defining Constants

Problem

Solution

Discussion

See Also

6.3. Restricting Attribute Setting

Problem

Solution

Discussion

See Also

6.4. Chaining Dictionary Lookups

Problem

Solution

Discussion

See Also

6.5. Delegating Automatically as an Alternative to Inheritance

Problem

Solution

Discussion

See Also

6.6. Delegating Special Methods in Proxies

Problem

Solution

Discussion

See Also

6.7. Implementing Tuples with Named Items

Problem

Solution

Discussion

See Also

6.8. Avoiding Boilerplate Accessors for Properties

Problem

Solution

Discussion

See Also

6.9. Making a Fast Copy of an Object

Problem

Solution

Discussion

See Also

6.10. Keeping References to Bound Methods Without Inhibiting Garbage Collection

Problem

Solution

Discussion

See Also

6.11. Implementing a Ring Buffer

Problem

Solution

Discussion

See Also

6.12. Checking an Instance for Any State Changes

Problem

Solution

Discussion

See Also

6.13. Checking Whether an Object Has Necessary Attributes

Problem

Solution

Discussion

See Also

6.14. Implementing the State Design Pattern

Problem

Solution

Discussion

See Also

6.15. Implementing the “Singleton” Design Pattern

Problem

Solution

Table of Contents for
6. Object-Oriented Programming