Chapter 20. Descriptors, Decorators, and Metaclasses

Introduction

Credit: Raymond Hettinger

I had my power drill slung low on my toolbelt and I said, “Go ahead, honey. Break something.”

Tim Allen

on the challenges of figuring out what to do with a new set of general-purpose tools

This chapter is last because it deals with issues that look or sound difficult, although they really aren’t. It is about Python’s power tools.

Though easy to use, the power tools can be considered advanced for several reasons. First, the need for them rarely arises in simple programs. Second, most involve introspection, wrapping, and forwarding techniques available only in a dynamic language like Python. Third, the tools seem advanced because when you learn them, you also develop a deep understanding of how Python works internally.

Last, as with the power tools in your garage, it is easy to get carried away and create a gory mess. Accordingly, to ward off small children, the tools were given scary names such as descriptors, decorators, and metaclasses (such names as pangalacticgargleblaster were considered a bit too long).

Because these tools are so general purpose, it can be a challenge to figure out what to do with them. Rather than resorting to Tim Allen’s tactics, study the recipes in this chapter: they will give you all the practice you need. And, as Tim Peters once pointed out, it can be difficult to devise new uses from scratch, but when a real problem demands a power tool, you’ll know it when you need it.

Descriptors

The concept of descriptors is easy enough. Whenever an attribute is looked up, an action takes place. By default, the action is a get, set, or delete. However, someday you’ll be working on an application with some subtle need and wish that more complex actions could be programmed. Perhaps you would like to create a log entry every time a certain attribute is accessed. Perhaps you would like to redirect a method lookup to another method. The solution is to write a function with the needed action and then specify that it be run whenever the attribute is accessed. An object with such functions is called a descriptor (just to make it sound harder than it really is).
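
For concreteness, here is a minimal sketch (all names hypothetical, not part of any recipe) of a descriptor that logs every access to an attribute; the recipes later in this chapter show more realistic uses:

class LoggedAttribute(object):
    ''' a data descriptor that logs every get and set of an attribute '''
    def __init__(self, name, default=None):
        self.name = name                  # key used in the instance's __dict__
        self.default = default
    def __get__(self, inst, cls):
        if inst is None:
            return self                   # attribute accessed on the class itself
        print 'getting %s' % self.name
        return inst.__dict__.get(self.name, self.default)
    def __set__(self, inst, value):
        print 'setting %s = %r' % (self.name, value)
        inst.__dict__[self.name] = value
class Account(object):
    balance = LoggedAttribute('balance', 0)
a = Account()
a.balance = 10            # emits: setting balance = 10
print a.balance           # emits: getting balance, then 10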

While the concept of a descriptor is straightforward, there seems to be no limit to what can be done with them. Descriptors underlie Python’s implementation of methods, bound methods, super, property, classmethod, and staticmethod. Learning about the various applications of descriptors is key to mastering the language.

The recipes in this chapter show how to put descriptors straight to work. However, if you want the full details behind the descriptor protocol or want to know exactly how descriptors are used to implement super, property, and the like, see my paper on the subject at http://users.rcn.com/python/download/Descriptor.htm.

Decorators

Decorators are even simpler than descriptors. Writing myfunc=wrapper(myfunc) was the common way to modify or log something about another function, which took place somewhere after myfunc was defined. Starting with Python 2.4, we now write @wrapper just before the def statement that performs the definition of myfunc. Common examples include @staticmethod and @classmethod. Unlike Java declarations, these wrappers are higher-order functions that can modify the original function or take some other action. Their uses are limitless. Some ideas that have been advanced include @make_constants for bytecode optimization, @atexit to register a function to be run before Python exits, @synchronized to automatically add mutual exclusion locking to a function or method, and @log to create a log entry every time a function is called. Such wrapper functions are called decorators (not an especially intimidating name but cryptic enough to ward off evil spirits).
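
As a minimal sketch of the equivalence (wrapper and myfunc are placeholder names, not from any specific library), the following two spellings have exactly the same effect:

def wrapper(func):
    def wrapped(*args, **kwds):
        print 'calling', func.__name__
        return func(*args, **kwds)
    return wrapped
# Python 2.3 and earlier: wrap explicitly, after the def statement
def myfunc():
    return 42
myfunc = wrapper(myfunc)
# Python 2.4: decorator syntax, same effect, stated up front
@wrapper
def myfunc():
    return 42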

Metaclasses

The concept of a metaclass sounds strange only because it is so familiar. Whenever you write a class definition, a mechanism uses the name, bases, and class dictionary to create a class object. For old-style classes that mechanism is types.ClassType. For new-style classes, the mechanism is just type. The former implements the familiar actions of a classic class, including attribute lookup and showing the name of the class when repr is called. The latter adds a few bells and whistles including support for __slots__ and __getattribute__. If only that mechanism were programmable, what you could do in Python would be limitless. Well, the mechanism is programmable, and, of course, it has an intimidating name, metaclasses.

The recipes in this chapter show that writing metaclasses can be straightforward. Most metaclasses subclass type and simply extend or override the desired behavior. Some are as simple as altering the class dictionary and then forwarding the arguments to type to finish the job.

For instance, say that you would like to automatically generate getter methods for all the private variables listed in __slots__. Just define a metaclass M that looks up __slots__ in the mapping, scans for variable names starting with an underscore, creates an accessor method for each, and adds the new methods to the class dictionary:

class M(type):
    def __new__(cls, name, bases, classdict):
        for attr in classdict.get('__slots__', ()):
            if attr.startswith('_'):
                def getter(self, attr=attr):
                    return getattr(self, attr)
                # 2.4 only: getter.__name__ = 'get' + attr[1:]
                classdict['get' + attr[1:]] = getter
        return type.__new__(cls, name, bases, classdict)

Apply the new metaclass to every class where you want automatically created accessor functions:

class Point(object):
    __metaclass__ = M
    __slots__ = ['_x', '_y']

If you now print dir(Point), you will see the two accessor methods as if you had written them out the long way:

class Point(object):
    __slots__ = ['_x', '_y']
    def getx(self):
        return self._x
    def gety(self):
        return self._y

In both cases, among the output of the print statement, you will see the names 'getx' and 'gety'.
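
As a quick check (a sketch, assuming the metaclass M defined above), the generated accessors behave just like handwritten ones:

p = Point()
p._x = 10
print p.getx()
# emits: 10
print [n for n in dir(Point) if n.startswith('get')]
# emits: ['getx', 'gety']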

20.1. Getting Fresh Default Values at Each Function Call

Credit: Sean Ross

Problem

Python computes the default values for a function’s optional arguments just once, when the function’s def statement executes. However, for some of your functions, you’d like to ensure that the default values are fresh ones (i.e., new and independent copies) each time a function gets called.

Solution

A Python 2.4 decorator offers an elegant solution, and, with a slightly less terse syntax, it’s a solution you can apply in version 2.3 too:

import copy
def freshdefaults(f):
    "a decorator to wrap f and keep its default values fresh between calls"
    fdefaults = f.func_defaults
    def refresher(*args, **kwds):
        f.func_defaults = copy.deepcopy(fdefaults)
        return f(*args, **kwds)
    # in 2.4, only: refresher.__name__ = f.__name__
    return refresher
# usage as a decorator, in python 2.4:
@freshdefaults
def packitem(item, pkg=[]):
    pkg.append(item)
    return pkg
# usage in python 2.3: same def statement, then explicitly assign:
# packitem = freshdefaults(packitem)

Discussion

A function’s default values are evaluated once, and only once, at the time the function is defined (i.e., when the def statement executes). Beginning Python programmers are sometimes surprised by this fact; they try to use mutable default values and yet expect that the values will somehow be regenerated afresh each time they’re needed.
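
A quick illustration of this behavior, using a hypothetical undecorated function for contrast:

def plainpack(item, pkg=[]):        # the default list is built just once...
    pkg.append(item)
    return pkg
print plainpack(1)
# emits: [1]
print plainpack(2)
# emits: [1, 2] -- ...and the very same list object is reused at each call

With the freshdefaults decorator of this recipe applied, each call would instead start from a fresh, independent copy of the default list.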

Recommended Python practice is to not use mutable default values. Instead, you should use idioms such as:

def packitem(item, pkg=None):
    if pkg is None:
        pkg = []
    pkg.append(item)
    return pkg

The freshdefaults decorator presented in this recipe provides another way to accomplish the same task. It lets you set as the default value exactly the value you intend that optional argument to have by default: in particular, you don’t have to use None as the default value, rather than (say) an empty list [], as you do in the recommended idiom.

freshdefaults also removes the need to test each argument against a stand-in value (e.g., None) before assigning the intended value: this can be an important simplification when your functions need several optional arguments with mutable default values, as long as all of those default values can be deep-copied.

On the other hand, the implementation of freshdefaults needs several reasonably advanced concepts: decorators, closures, function attributes, and deep copying. All in all, this implementation is no doubt more difficult to explain to beginning Python programmers than the recommended idiom. Therefore, this recipe cannot really be recommended to beginners. However, advanced Pythonistas may find it useful.

See Also

Python Language Reference documentation about decorators; Python Language Reference and Python in a Nutshell documentation about closures and function attributes; Python Library Reference and Python in a Nutshell documentation about standard library module copy, specifically function deepcopy.

20.2. Coding Properties as Nested Functions

Credit: Sean Ross, David Niergarth, Holger Krekel

Problem

You want to code properties without cluttering up your class namespace with accessor methods that are not called directly.

Solution

Functions nested within another function are quite handy for this task:

import math
class Rectangle(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def area():
        doc = "Area of the rectangle"
        def fget(self):
            return self.x * self.y
        def fset(self, value):
            ratio = math.sqrt((1.0*value)/self.area)
            self.x *= ratio
            self.y *= ratio
        return locals()
    area = property(**area())

Discussion

The standard idiom used to create a property starts with defining in the class body several accessor methods (e.g., getter, setter, deleter), often with boilerplate-like method names such as setThis, getThat, or delTheother. More often than not, such accessors are not required except inside the property itself; sometimes (rarely) programmers even remember to del them to clean up the class namespace after building the property instance.

The idiom suggested in this recipe avoids cluttering up the class namespace at all. Just write in the class body a function with the same name you intend to give to the property. Inside that function, define appropriate nested functions, which must be named exactly fget, fset, and fdel, and assign an appropriate docstring to a local variable named doc. Have the outer function return a dictionary whose entries have exactly those names, and no others: returning the locals() dictionary will work, as long as your outer function has no other local variables at that point. If you do have other names in addition to the fixed ones, you might want to code your return statement, for example, as:

return sub_dict(locals(), 'doc fget fset fdel'.split())

using the sub_dict function shown in Recipe 4.13. Any other way to subset a dictionary will work just as well.

Finally, the call to property uses the ** notation to expand a mapping into named arguments, and the assignment rebinds the name to the resulting property instance, so that the class namespace is left pristine.

As you can see from the example in this recipe’s Solution, you don’t have to define all of the four key names: you may, and should, omit some of them if a particular property forbids the corresponding operation. In particular, the area function in the solution does not define fdel because the resulting area attribute must not be deletable.
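
Here is a quick usage sketch of the Rectangle class from this recipe’s Solution (outputs shown as comments):

r = Rectangle(3, 4)
print r.area
# emits: 12
r.area = 48                # scales both sides by the same ratio (here, 2.0)
print r.x, r.y
# emits: 6.0 8.0
del r.area                 # raises AttributeError: can't delete attribute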

In Python 2.4, you can define a simple custom decorator to make this recipe’s suggested idiom even spiffier:

def nested_property(c):
    return property(**c())

With this little helper at hand, you can replace the explicit assignment of the property to the attribute name with the decorator syntax:

    @nested_property
    def area():
        doc = "Area of the rectangle"
        def fget(self):
            # ...the rest of the area function remains exactly the same

In Python 2.4, having a decorator line @deco right before a def name statement is equivalent to having, right after the def statement’s body, an assignment name = deco(name). A mere difference of syntax sugar, but it’s useful: anybody reading the source code of the class knows up front that the function or method you’re def'ing is meant to get decorated in a certain way, not to get used exactly as coded. With the Python 2.3 syntax, somebody reading in haste might possibly miss the assignment statement that comes after the def.

Returning locals works only if your outer function has no other local variables besides fget, fset, fdel, and doc. An alternative idiom to avoid this restriction is to move the call to property inside the outer function:

def area():
    what_is_area = "Area of the rectangle"
    def compute_area(self):
        return self.x * self.y
    def scale_both_sides(self, value):
        ratio = math.sqrt((1.0*value)/self.area)
        self.x *= ratio
        self.y *= ratio
    return property(compute_area, scale_both_sides, None, what_is_area)
area = area()

As you see, this alternative idiom enables us to give different names to the getter and setter accessors, which is not a big deal because, as mentioned previously, accessors are often named in uninformative ways such as getThis and setThat anyway. But, if your opinion differs, you may prefer this idiom, or its slight variant based on having the outer function return a tuple of values for property’s argument rather than a dict. In other words, the variant obtained by changing the last two statements of this latest snippet to:

    return compute_area, scale_both_sides, None, what_is_area
area = property(*area())

See Also

Library Reference and Python in a Nutshell docs on built-in functions property and locals.

20.3. Aliasing Attribute Values

Credit: Denis S. Otkidach

Problem

You want to use an attribute name as an alias for another one, either just as a default value (when the attribute was not explicitly set), or with full setting and deleting abilities too.

Solution

Custom descriptors are the right tools for this task:

class DefaultAlias(object):
    ''' unless explicitly assigned, this attribute aliases to another. '''
    def __init__(self, name):
        self.name = name
    def __get__(self, inst, cls):
        if inst is None:
            # attribute accessed on class, return `self' descriptor
            return self
        return getattr(inst, self.name)
class Alias(DefaultAlias):
    ''' this attribute unconditionally aliases to another. '''
    def __set__(self, inst, value):
        setattr(inst, self.name, value)
    def __delete__(self, inst):
        delattr(inst, self.name)

Discussion

Your class instances sometimes have attributes whose default value must be the same as the current value of other attributes but may be set and deleted independently. For such requirements, custom descriptor DefaultAlias, as presented in this recipe’s Solution, is just the ticket. Here is a toy example:

class Book(object):
    def __init__(self, title, shortTitle=None):
        self.title = title
        if shortTitle is not None:
            self.shortTitle = shortTitle
    shortTitle = DefaultAlias('title')
b = Book('The Life and Opinions of Tristram Shandy, Gent.')
print b.shortTitle
# emits: The Life and Opinions of Tristram Shandy, Gent.
b.shortTitle = "Tristram Shandy"
print b.shortTitle
# emits: Tristram Shandy
del b.shortTitle
print b.shortTitle
# emits: The Life and Opinions of Tristram Shandy, Gent.

DefaultAlias is not what is technically known as a data descriptor class because it has no __set__ method. In practice, this means that, when we assign a value to an instance attribute whose name is defined in the class as a DefaultAlias, the instance records the attribute normally, and the instance attribute shadows the class attribute. This is exactly what’s happening in this snippet after we explicitly assign to b.shortTitle—when we del b.shortTitle, we remove the per-instance attribute, uncovering the per-class one again.
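
You can verify the shadowing directly by looking at the instance’s own dictionary (a quick sketch using the Book class above, with a made-up title):

b = Book('Moby-Dick')
print 'shortTitle' in vars(b)
# emits: False -- the class-level DefaultAlias descriptor is doing the work
b.shortTitle = 'MD'
print 'shortTitle' in vars(b), b.shortTitle
# emits: True MD -- the per-instance attribute now shadows the descriptor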

Custom descriptor class Alias is a simple variant of class DefaultAlias, easily obtained by inheritance. Alias aliases one attribute to another, not just upon accesses to the attribute’s value (as DefaultAlias would do), but also upon all operations of value setting and deletion. It easily achieves this by being a “data descriptor” class, which means that it does have a __set__ method. Therefore, any assignment to an instance attribute whose name is defined in the class as an Alias gets intercepted by Alias' __set__ method. (Alias also defines a __delete__ method, to obtain exactly the same effect upon attribute deletion.)

Alias can be quite useful when you want to evolve a class, which you made publicly available in a previous version, to use more appropriate names for methods and other attributes, while still keeping the old names available for backwards compatibility. For this specific use, you may even want a version that emits a warning when the old name is used:

import warnings
class OldAlias(Alias):
    def _warn(self):
        warnings.warn('use %r, not %r' % (self.name, self.oldname),
                      DeprecationWarning, stacklevel=3)
    def __init__(self, name, oldname):
        super(OldAlias, self).__init__(name)
        self.oldname = oldname
    def __get__(self, inst, cls):
        self._warn()
        return super(OldAlias, self).__get__(inst, cls)
    def __set__(self, inst, value):
        self._warn()
        return super(OldAlias, self).__set__(inst, value)
    def __delete__(self, inst):
        self._warn()
        return super(OldAlias, self).__delete__(inst)

Here is a toy example of using OldAlias:

class NiceClass(object):
    def __init__(self, name):
        self.nice_new_name = name
    bad_old_name = OldAlias('nice_new_name', 'bad_old_name')

Old code using this class may still refer to the instance attribute as bad_old_name, preserving backwards compatibility; when that happens, though, a warning message is presented about the deprecation, encouraging the old code’s author to upgrade the code to use nice_new_name instead. The normal mechanisms of the warnings module of the Python Standard Library ensure that, by default, such warnings are output only once per occurrence and per run of a program, not repeatedly. For example, the snippet:

x = NiceClass(23)
for y in range(4):
    print x.bad_old_name
    x.bad_old_name += 100

emits:

xxx.py:64: DeprecationWarning: use 'nice_new_name', not 'bad_old_name'
  print x.bad_old_name
23
xxx.py:65: DeprecationWarning: use 'nice_new_name', not 'bad_old_name'
  x.bad_old_name += 100
123
223
323

The warning is printed once per line using the bad old name, not repeated again and again as the for loop iterates.

See Also

Custom descriptors are best documented on Raymond Hettinger’s web page: http://users.rcn.com/python/download/Descriptor.htm; Library Reference and Python in a Nutshell docs about the warnings module.

20.4. Caching Attribute Values

Credit: Denis S. Otkidach

Problem

You want to be able to compute attribute values, either per instance or per class, on demand, with automatic caching.

Solution

Custom descriptors are the right tools for this task:

class CachedAttribute(object):
    ''' Computes attribute value and caches it in the instance. '''
    def __init__(self, method, name=None):
        # record the unbound-method and the name
        self.method = method
        self.name = name or method.__name__
    def __get__(self, inst, cls):
        if inst is None:
            # instance attribute accessed on class, return self
            return self
        # compute, cache and return the instance's attribute value
        result = self.method(inst)
        setattr(inst, self.name, result)
        return result
class CachedClassAttribute(CachedAttribute):
    ''' Computes attribute value and caches it in the class. '''
    def __get__(self, inst, cls):
        # just delegate to CachedAttribute, with 'cls' as ``instance''
        return super(CachedClassAttribute, self).__get__(cls, cls)

Discussion

If your class instances have attributes that must be computed on demand but don’t generally change after they’re first computed, custom descriptor CachedAttribute as presented in this recipe is just the ticket. Here is a toy example of use (with Python 2.4 syntax):

class MyObject(object):
    def __init__(self, n):
        self.n = n
    @CachedAttribute
    def square(self):
        return self.n * self.n
m = MyObject(23)
print vars(m)                               # 'square' not there yet
# emits: {'n': 23}
print m.square                              # ...so it gets computed
# emits: 529
print vars(m)                               # 'square' IS there now
# emits: {'square': 529, 'n': 23}
del m.square                                # flushing the cache
print vars(m)                               # 'square' removed
# emits: {'n': 23}
m.n = 42
print vars(m)                               # still no 'square'
# emits: {'n': 42}
print m.square                              # ...so it gets recomputed
# emits: 1764
print vars(m)                               # 'square' IS there again
# emits: {'square': 1764, 'n': 42}

As you see, after the first access to m.square, the square attribute is cached in instance m, so it will not get recomputed for that instance. If you need to flush the cache (for example, because you have just changed m.n and want m.square to be recomputed the next time it is accessed), just del m.square. Remember, attributes can be removed in Python! To use this code in Python 2.3, remove the decorator syntax @CachedAttribute and instead insert the assignment square = CachedAttribute(square) right after the end of the def statement for method square.

Custom descriptor CachedClassAttribute is just a simple variant of CachedAttribute, easily obtained by inheritance: it computes the value by calling a method on the class rather than the instance, and it caches the result on the class, too. This may help when all instances of the class need to see the same cached value. CachedClassAttribute is mostly meant for cases in which you do not need to flush the cache, because its __get__ method usually wipes away the descriptor itself, replacing it on the class with the cached value:

class MyClass(object):
    class_attr = 23
    @CachedClassAttribute
    def square(cls):
        return cls.class_attr * cls.class_attr
x = MyClass()
y = MyClass()
print x.square
# emits: 529
print y.square
# emits: 529
del MyClass.square
print x.square         # raises an AttributeError exception

However, when you do need a cached class attribute with the ability to occasionally flush it, you can still get it with a little trick. To implement this snippet so it works as intended, just add the statement:

class MyClass(MyClass): pass

right after the end of the class MyClass statement and before generating any instance of MyClass. Now, two class objects are named MyClass, a hidden “base” one that always holds the custom descriptor instance, and an outer “subclass” one that is used for everything else, including making instances and holding the cached value if any. Whether this trick is a reasonable one or whether it’s too cute and clever for its own good, is a judgment call you can make for yourself! Perhaps it would be clearer to name the base class MyClassBase and use class MyClass(MyClassBase), rather than use the same name for both classes; the mechanism would work in exactly the same fashion, since it is not dependent on the names of classes.
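
Here is a minimal sketch of the flushable variant, using the clearer MyClassBase naming just mentioned (the class names and values are illustrative only):

class MyClassBase(object):
    class_attr = 23
    @CachedClassAttribute
    def square(cls):
        return cls.class_attr * cls.class_attr
class MyClass(MyClassBase):
    pass
x = MyClass()
print x.square
# emits: 529 -- computed once, then cached on MyClass, not on MyClassBase
MyClassBase.class_attr = 42
del MyClass.square             # flush the cache held by the subclass
print x.square
# emits: 1764 -- the descriptor on MyClassBase computes the value afresh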

See Also

Custom descriptors are best documented at Raymond Hettinger’s web page: http://users.rcn.com/python/download/Descriptor.htm.

20.5. Using One Method as Accessor for Multiple Attributes

Credit: Raymond Hettinger

Problem

Python’s built-in property descriptor is quite handy but only as long as you want to use a separate method as the accessor of each attribute you make into a property. In certain cases, you prefer to use the same method to access several different attributes, and property does not support that mode of operation.

Solution

We need to code our own custom descriptor, which gets the attribute name in __init__, saves it, and passes it on to the accessors. For convenience, we also provide useful defaults for the various accessors. You can still pass in None explicitly if you want to forbid certain kinds of access, but the default is to allow it freely.

class CommonProperty(object):
    def __init__(self, realname, fget=getattr, fset=setattr, fdel=delattr,
                 doc=None):
        self.realname = realname
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        self.__doc__ = doc or ""
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError, "can't get attribute"
        return self.fget(obj, self.realname)
    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError, "can't set attribute"
        self.fset(obj, self.realname, value)
    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError, "can't delete attribute"
        self.fdel(obj, self.realname)

Discussion

Here is a simple example of using this CommonProperty custom descriptor:

class Rectangle(object):
    def __init__(self, x, y):
        self._x = x                    # don't trigger _setSide prematurely
        self.y = y                     # now trigger it, so area gets computed
    def _setSide(self, attrname, value):
        setattr(self, attrname, value)
        self.area = self._x * self._y
    x = CommonProperty('_x', fset=_setSide, fdel=None)
    y = CommonProperty('_y', fset=_setSide, fdel=None)

The idea of this Rectangle class is that attributes x and y may be freely accessed but never deleted; when either of these attributes is set, the area attribute must be recomputed at once. You could alternatively recompute the area on the fly each time it’s accessed, using a simple property for the purpose; however, if area is accessed often and the sides are changed rarely, the architecture of this simple example can obviously be preferable.
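
A quick usage sketch of this Rectangle class (outputs shown as comments):

r = Rectangle(3, 4)
print r.x, r.y, r.area
# emits: 3 4 12
r.x = 5                        # setting a side recomputes area at once
print r.area
# emits: 20
del r.x                        # raises AttributeError: can't delete attribute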

In this simple example of CommonProperty use, we just need to be careful on the very first attribute setting in __init__: if we carelessly used self.x = x, that would trigger the call to _setSide, which, in turn, would try to use self._y before the _y attribute is set.

Another issue worthy of mention is that if any one or more of the fget, fset, or fdel arguments to CommonProperty is defaulted, the realname argument must be different from the attribute name to which the CommonProperty instance is assigned; otherwise, unbounded recursion would occur on trying the corresponding operation (in practice, you’d get a RuntimeError complaining that the maximum recursion depth has been exceeded).

See Also

The Library Reference and Python in a Nutshell documentation for built-ins getattr, setattr, delattr, and property.

20.6. Adding Functionality to a Class by Wrapping a Method

Credit: Ken Seehof, Holger Krekel

Problem

You need to add functionality to an existing class, without changing the source code for that class, and inheritance is not applicable (since it would make a new class, rather than changing the existing one). Specifically, you need to enrich a method of the class, adding some extra functionality “around” that of the existing method.

Solution

Adding completely new methods (and other attributes) to an existing class object is quite simple, since the built-in function setattr does essentially all the work. We need to “decorate” an existing method to add to its functionality. To achieve this, we can build the new replacement method as a closure. The best architecture is to define general-purpose wrapper and unwrapper functions, such as:

import inspect
def wrapfunc(obj, name, processor, avoid_doublewrap=True):
    """ patch obj.<name> so that calling it actually calls, instead,
            processor(original_callable, *args, **kwargs)
    """
    # get the callable at obj.<name>
    call = getattr(obj, name)
    # optionally avoid multiple identical wrappings
    if avoid_doublewrap and getattr(call, 'processor', None) is processor:
        return
    # get underlying function (if any), and anyway def the wrapper closure
    original_callable = getattr(call, 'im_func', call)
    def wrappedfunc(*args, **kwargs):
        return processor(original_callable, *args, **kwargs)
    # set attributes, for future unwrapping and to avoid double-wrapping
    wrappedfunc.original = call
    wrappedfunc.processor = processor
    # 2.4 only: wrappedfunc.__name__ = getattr(call, '__name__', name)
    # rewrap staticmethod and classmethod specifically (iff obj is a class)
    if inspect.isclass(obj):
        if hasattr(call, 'im_self'):
            if call.im_self:
                wrappedfunc = classmethod(wrappedfunc)
        else:
            wrappedfunc = staticmethod(wrappedfunc)
    # finally, install the wrapper closure as requested
    setattr(obj, name, wrappedfunc)
def unwrapfunc(obj, name):
    ''' undo the effects of wrapfunc(obj, name, processor) '''
    setattr(obj, name, getattr(obj, name).original)

This approach to wrapping is carefully coded to work just as well on ordinary functions (when obj is a module) as on methods of all kinds (e.g., bound methods, when obj is an instance; unbound, class, and static methods, when obj is a class). This method doesn’t work when obj is a built-in type, though, because built-ins are immutable.

For example, suppose we want to have “tracing” prints of all that happens whenever a particular method is called. Using the general-purpose wrapfunc function just shown, we could code:

def tracing_processor(original_callable, *args, **kwargs):
    r_name = getattr(original_callable, '__name__', '<unknown>')
    r_args = map(repr, args)
    r_args.extend(['%s=%r' % x for x in kwargs.iteritems()])
    print "begin call to %s(%s)" % (r_name, ", ".join(r_args))
    try:
        result = original_callable(*args, **kwargs)
    except:
        print "EXCEPTION in call to %s" % (r_name,)
        raise
    else:
        print "call to %s result: %r" % (r_name, result)
        return result
def add_tracing_prints_to_method(class_object, method_name):
    wrapfunc(class_object, method_name, tracing_processor)
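
For example, wrapping and later unwrapping a method of a small class might look like the following sketch (the Greeter class is made up purely for illustration):

class Greeter(object):
    def greet(self, name):
        return 'Hello, ' + name
add_tracing_prints_to_method(Greeter, 'greet')
print Greeter().greet('world')
# emits (roughly): begin call to greet(<...Greeter object...>, 'world')
#                  call to greet result: 'Hello, world'
#                  Hello, world
unwrapfunc(Greeter, 'greet')          # restore the original, untraced method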

Discussion

This recipe’s task occurs fairly often when you’re trying to modify the behavior of a standard or third-party Python module, since editing the source of the module itself is undesirable. In particular, this recipe can be handy for debugging, since the example function add_tracing_prints_to_method presented in the “Solution” lets you see on standard output all details of calls to a method you want to watch, without modifying the library module, and without requiring interactive access to the Python session in which the calls occur.

You can also use this recipe’s approach on a larger scale. For example, say that a library that you imported has a long series of methods that return numeric error codes. You could wrap each of them inside an enhanced wrapper method, which raises an exception when the error code from the original method indicates an error condition. Again, a key issue is not having to modify the library’s own code. However, methodical application of wrappers when building a subclass is also a way to avoid repetitious code (i.e., boilerplate). For example, Recipe 5.12 and Recipe 1.24 might be recoded to take advantage of the general wrapfunc presented in this recipe.

Particularly when “wrapping on a large scale”, it is important to be able to “unwrap” methods back to their normal state, which is why this recipe’s Solution also includes an unwrapfunc function. It may also be handy to avoid accidentally wrapping the same method in the same way twice, which is why wrapfunc supports the optional parameter avoid_doublewrap, defaulting to True, to avoid such double wrapping. (Unfortunately, classmethod and staticmethod do not support per-instance attributes, so the avoidance of double wrapping, as well as the ability to “unwrap”, cannot be guaranteed in all cases.)

You can wrap the same method multiple times with different processors. However, unwrapping must proceed last-in, first-out; as coded, this recipe does not support the ability to remove a wrapper from “somewhere in the middle” of a chain of several wrappers. A related limitation of this recipe as coded is that double wrapping is not detected when another unrelated wrapping occurred in the meantime. (We don’t even try to detect what we might call “deep double wrapping.”)

If you need “generalized unwrapping”, you can extend unwrapfunc to return the processor it has removed; then you can obtain generalized unwrapping by unwrapping all the way, recording a list of the processors that you removed, and then pruning that list of processors and rewrapping. Similarly, generalized detection of “deep” double wrapping could be implemented based on this same idea.

Another generalization, to fully support staticmethod and classmethod, is to use a global dict, rather than per-instance attributes, for the original and processor values; functions, bound and unbound methods, as well as class methods and static methods, can all be used as keys into such a dictionary. Doing so obviates the issue with the inability to set per-instance attributes on class methods and static methods. However, each of these generalizations can be somewhat complicated, so we are not pursuing them further here.

Once you have coded some processors with the signature and semantics required by this recipe’s wrapfunc, you can also use such processors more directly (in cases where modifying the source is OK) with a Python 2.4 decorator, as follows:

def processedby(processor):
    """ decorator to wrap the processor around a function. """
    def processedfunc(func):
        def wrappedfunc(*args, **kwargs):
            return processor(func, *args, **kwargs)
        return wrappedfunc
    return processedfunc

For example, to wrap this recipe’s tracing_processor around a certain method at the time the class statement executes, in Python 2.4, you can code:

class SomeClass(object):
    @processedby(tracing_processor)
    def amethod(self, s):
        return 'Hello, ' + s

See Also

Recipe 5.12 and Recipe 1.24 provide examples of the methodical application of wrappers to build a subclass to avoid boilerplate; Library Reference and Python in a Nutshell docs on built-in functions getattr and setattr and module inspect.

20.7. Adding Functionality to a Class by Enriching All Methods

Credit: Stephan Diehl, Robert E. Brewer

Problem

You need to add functionality to an existing class without changing the source code for that class. Specifically, you need to enrich all methods of the class, adding some extra functionality “around” that of the existing methods.

Solution

Recipe 20.6 previously showed a way to solve this task for one method by writing a closure that builds and applies a wrapper, exemplified by function add_tracing_prints_to_method in that recipe’s Solution. This recipe generalizes that one, wrapping methods throughout a class or hierarchy, directly or via a custom metaclass.

Module inspect lets you easily find all methods of an existing class, so you can systematically wrap them all:

import inspect
def add_tracing_prints_to_all_methods(class_object):
    for method_name, v in inspect.getmembers(class_object, inspect.ismethod):
        add_tracing_prints_to_method(class_object, method_name)

If you need to ensure that such wrapping applies to all methods of all classes in a whole hierarchy, the simplest way may be to insert a custom metaclass at the root of the hierarchy, so that all classes in the hierarchy will get that same metaclass. This insertion does normally need a minimum of “invasiveness”—placing a single statement

    __metaclass__ = MetaTracer

in the body of that root class. Custom metaclass MetaTracer is, however, quite easy to write:

class MetaTracer(type):
    def __init__(cls, n, b, d):
        super(MetaTracer, cls).__init__(n, b, d)
        add_tracing_prints_to_all_methods(cls)

Even such minimal invasiveness sometimes is unacceptable, or you need a more dynamic way to wrap all methods in a hierarchy. Then, as long as the root class of the hierarchy is new-style, you can arrange to get function add_tracing_prints_to_all_methods dynamically called on all classes in the hierarchy:

def add_tracing_prints_to_all_descendants(class_object):
    add_tracing_prints_to_all_methods(class_object)
    for s in class_object.__subclasses__():
        add_tracing_prints_to_all_descendants(s)

The inverse function unwrapfunc, in Recipe 20.6, may also be similarly applied to all methods of a class and all classes of a hierarchy.
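
For instance, a matching “unwrap everything” helper might look like the following sketch (the name remove_tracing_prints_from_all_methods is made up here; it assumes every method currently visible on the class was wrapped by the function shown earlier):

def remove_tracing_prints_from_all_methods(class_object):
    for method_name, v in inspect.getmembers(class_object, inspect.ismethod):
        unwrapfunc(class_object, method_name)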

Discussion

We could code just about all functionality of such a powerful function as add_tracing_prints_to_all_descendants in the function’s own body. However, it would not be a great idea to bunch up such diverse functionality inside a single function. Instead, we carefully split the functionality among the various separate functions presented in this recipe and previously in Recipe 20.6. By this careful factorization, we obtain maximum reusability without code duplication: we have separate functions to dynamically add and remove wrapping from a single method, an entire class, and a whole hierarchy of classes; each of these functions appropriately uses the simpler ones. And for cases in which we can afford a tiny amount of “invasiveness” and want the convenience of automatically applying the wrapping to all methods of classes descended from a certain root, we can use a tiny custom metaclass.

add_tracing_prints_to_all_descendants cannot apply to old-style classes. This limitation is inherent in the old-style object model and is one of the several reasons you should always use new-style classes in new code you write: classic classes exist only to ensure compatibility in legacy programs. Besides the problem with classic classes, however, there’s another issue with the structure of add_tracing_prints_to_all_descendants: in cases of multiple inheritance, the function will repeatedly visit some classes.

Since the method-wrapping function is carefully designed to avoid double wrapping, such multiple visits are not a serious problem, costing just a little avoidable overhead, which is why the function was acceptable for inclusion in the “Solution”. In other cases in which we want to operate on all descendants of a certain root class, however, multiple visits might be unacceptable. Moreover, it is clearly not optimal to entwine the functionality of getting all descendants with that of applying one particular operation to each of them. The best idea is clearly to factor out the recursive structure into a generator, which can avoid duplicating visits with the memo idiom:

def all_descendants(class_object, _memo=None):
    if _memo is None:
        _memo = {}
    elif class_object in _memo:
        return
    _memo[class_object] = True
    yield class_object
    for subclass in class_object.__subclasses__():
        for descendant in all_descendants(subclass, _memo):
            yield descendant

Adding tracing prints to all descendants now simplifies to:

def add_tracing_prints_to_all_descendants(class_object):
    for c in all_descendants(class_object):
        add_tracing_prints_to_all_methods(c)

In Python, whenever you find yourself with an iteration structure of any complexity, or recursion, it’s always worthwhile to check whether it’s feasible to factor out the iterative or recursive control structure into a separate, reusable generator, so that all iterations of that form can become simple for statements. Such separation of concerns can offer important simplifications and make code more maintainable.

See Also

Recipe 20.6 for details on how each method gets wrapped; Library Reference and Python in a Nutshell docs on module inspect and the __subclasses__ special method of new-style classes.

20.8. Adding a Method to a Class Instance at Runtime

Credit: Moshe Zadka

Problem

During debugging, you want to identify certain specific instance objects so that print statements display more information when applied to those specific objects.

Solution

The print statement implicitly calls the special method __str__ of the class of each object you’re printing. Therefore, to ensure that printing certain objects displays more information, we need to give those objects new classes whose __str__ special methods are suitably modified. For example:

def add_method_to_objects_class(object, method, name=None):
    if name is None:
        name = method.func_name
    class newclass(object.__class__):
        pass
    setattr(newclass, name, method)
    object.__class__ = newclass
import inspect
def _rich_str(self):
    pieces = []
    for name, value in inspect.getmembers(self):
        # don't display specials
        if name.startswith('__') and name.endswith('__'):
            continue
        # don't display the object's own methods
        if inspect.ismethod(value) and value.im_self is self:
            continue
        pieces.extend((name.ljust(15), '\t', str(value), '\n'))
    return ''.join(pieces)
def set_rich_str(obj, on=True):
    def isrich():
        return getattr(obj.__class__.__str__, 'im_func', None) is _rich_str
    if on:
        if not isrich():
            add_method_to_objects_class(obj, _rich_str, '__str__')
        assert isrich()
    else:
        if not isrich():
            return
        bases = obj.__class__.__bases__
        assert len(bases) == 1
        obj.__class__ = bases[0]
        assert not isrich()

Discussion

Here is a sample use of this recipe’s set_rich_str function, guarded in the usual way:

if __name__ == '__main__':               # usual guard for example usage
    class Foo(object):
        def __init__(self, x=23, y=42):
            self.x, self.y = x, y
    f = Foo()
    print f
    # emits: <__main__.Foo object at 0x38f770>
    set_rich_str(f)
    print f
    # emits:
    # x               23
    # y               42
    set_rich_str(f, on=False)
    print f
    # emits: <__main__.Foo object at 0x38f770>

In old versions of Python (and in Python 2.3 and 2.4, for backwards compatibility on instances of classic classes), intrinsic lookup of special methods (such as the intrinsic lookup for __str__ in a print statement) started on the instance. In today’s Python, in the new object model that is recommended for all new code, the intrinsic lookup starts on the instance’s class, bypassing names set in the instance’s own __dict__. This innovation has many advantages, but, at a first superficial look, it may also seem to have one substantial disadvantage: namely, to make it impossible to solve this recipe’s Problem in the general case (i.e., for instances that might belong to either classic or new-style classes).
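
A minimal sketch of the difference on a new-style class (names hypothetical):

class New(object):
    pass
n = New()
n.__str__ = lambda: 'per-instance __str__'
print n
# emits: <__main__.New object at 0x...> -- print looks up __str__ on the class
print n.__str__()
# emits: per-instance __str__ -- explicit attribute access does find it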

Fortunately, that superficial impression is not correct, thanks to Python’s power of introspection and dynamism. This recipe’s function add_method_to_objects_class shows how to change special methods on a given object obj’s class, without affecting other “sibling” objects (i.e., other instances of the same class as obj’s): very simply, start by changing obj’s class—that is, by setting obj.__class__ to a newly made class object (which inherits from the original class of obj, so that anything we don’t explicitly modify remains unchanged). Once you’ve done that, you can then alter the newly made class object to your heart’s content.

Function _rich_str shows how you can use introspection to display a lot of information about a specific instance. Specifically, we display every attribute of the instance that doesn’t have a special name (starting and ending with two underscores), except the instance’s own bound methods. Function set_rich_str shows how to set the __str__ special method of an instance’s class to either “rich” (the _rich_str function we just mentioned) or “normal” (the __str__ method the object’s original class is coded to supply). To make the object’s __str__ rich, set_rich_str uses add_method_to_objects_class to set __str__ to _rich_str. When the object goes back to “normal”, set_rich_str sets the object’s __class__ back to its original value (which is preserved as the only base class when the object is set to use _rich_str).

See Also

Recipe 20.6 and Recipe 20.7 for other cases in which a class’ methods are modified; documentation on the inspect standard library module in the Library Reference.

20.9. Checking Whether Interfaces Are Implemented

Credit: Raymond Hettinger

Problem

You want to ensure that the classes you define implement the interfaces that they claim to implement.

Solution

Python does not have a formal concept of “interface”, but we can easily represent interfaces by means of “skeleton” classes such as:

class IMinimalMapping(object):
    def __getitem__(self, key): pass
    def __setitem__(self, key, value): pass
    def __delitem__(self, key): pass
    def __contains__(self, key): pass
import UserDict
class IFullMapping(IMinimalMapping, UserDict.DictMixin):
    def keys(self): pass
class IMinimalSequence(object):
    def __len__(self): pass
    def __getitem__(self, index): pass
class ICallable(object):
    def __call__(self, *args): pass

We follow the natural convention that any class can represent an interface: the interface is the set of methods and other attributes of the class. We can say that a class C implements an interface i if C has all the methods and other attributes of i (and, possibly, additional ones).

We can now define a simple custom metaclass that checks whether classes implement all the interfaces they claim to implement:

# ensure we use the best available 'set' type with name 'set'
try:
    set
except NameError:
    from sets import Set as set
# a custom exception class that we raise to signal violations
class InterfaceOmission(TypeError):
    pass
class MetaInterfaceChecker(type):
    ''' the interface-checking custom metaclass '''
    def __init__(cls, classname, bases, classdict):
        super(MetaInterfaceChecker, cls).__init__(classname, bases, classdict)
        cls_defines = set(dir(cls))
        for interface in cls.__implements__:
            itf_requires = set(dir(interface))
            if not itf_requires.issubset(cls_defines):
                raise InterfaceOmission, list(itf_requires - cls_defines)

Any class that uses MetaInterfaceChecker as its metaclass must expose a class attribute __implements__, an iterable whose items are the interfaces the class claims to implement. The metaclass checks the claim, raising an InterfaceOmission exception if the claim is false.

Discussion

Here’s an example class using the MetaInterfaceChecker custom metaclass:

class Skidoo(object):
    ''' a mapping which claims to contain all keys, each with a value
        of 23; item setting and deletion are no-ops; you can also call
        an instance with arbitrary positional args, result is 23. '''
    __metaclass__ = MetaInterfaceChecker
    __implements__ = IMinimalMapping, ICallable
    def __getitem__(self, key): return 23
    def __setitem__(self, key, value): pass
    def __delitem__(self, key): pass
    def __contains__(self, key): return True
    def __call__(self, *args): return 23
sk = Skidoo()

Any code dealing with an instance of such a class can choose to check whether it can rely on certain interfaces:

def use(sk):
    if IMinimalMapping in sk.__implements__:
        ...code using 'sk[...]' and/or 'x in sk'...

You can, if you want, provide much fancier and more thorough checks, for example by using functions from standard library module inspect to check that the attributes being exposed and required are methods with compatible signatures. However, this simple recipe does show how to automate the simplest kind of checks for interface compliance.
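
For instance, a stricter check might compare argument lists with inspect.getargspec; the following helper is only a sketch of that idea (signatures_match is a hypothetical name, not part of the recipe, and the comparison is deliberately simple-minded):

import inspect
def signatures_match(cls, interface):
    ''' True if every method of `interface' exists on `cls' with the
        same argument specification. '''
    for name, itf_method in inspect.getmembers(interface, inspect.ismethod):
        cls_attr = getattr(cls, name, None)
        if not inspect.ismethod(cls_attr):
            return False
        if (inspect.getargspec(itf_method.im_func) !=
                inspect.getargspec(cls_attr.im_func)):
            return False
    return True
print signatures_match(Skidoo, ICallable)
# emits: True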

See Also

Library Reference and Python in a Nutshell docs about module sets, (in Python 2.4 only) the set built-in, custom metaclasses, the inspect module.

20.10. Using __new__ and __init__ Appropriately in Custom Metaclasses

Credit: Michele Simionato, Stephan Diehl, Alex Martelli

Problem

You are writing a custom metaclass, and you are not sure which tasks your metaclass should perform in its __new__ method, and which ones it should perform in its __init__ method instead.

Solution

Any preliminary processing that your custom metaclass performs on the name, bases, or dict of the class being built can affect the way in which the class object gets built only if it occurs in the metaclass’ __new__ method, before your code calls the metaclass’ superclass’ __new__. For example, that’s the only time when you can usefully affect the new class’ __slots__, if any:

class MetaEnsure_foo(type):
    def __new__(mcl, cname, cbases, cdict):
        # ensure instances of the new class can have a '_foo' attribute
        if '__slots__' in cdict and '_foo' not in cdict['__slots__']:
            cdict['__slots__'] = tuple(cdict['__slots__']) + ('_foo',)
        return super(MetaEnsure_foo, mcl).__new__(mcl, cname, cbases, cdict)

Metaclass method __init__ is generally the most appropriate one for any changes that your custom metaclass makes to the class object after the class object is built—for example, continuing the example code for metaclass MetaEnsure_foo:

    def __init__(cls, cname, cbases, cdict):
        super(MetaEnsure_foo, cls).__init__(cname, cbases, cdict)
        cls._foo = 23

Discussion

The custom metaclass MetaEnsure_foo performs a definitely “toy” task presented strictly as an example: if the class object being built defines a __slots__ attribute (to save memory), MetaEnsure_foo ensures that the class object includes a slot _foo, so that instances of that class can have an attribute thus named. Further, the custom metaclass sets an attribute with name _foo and value 23 on each new class object. The point of the recipe isn’t really this toy task, but rather, a clarification on how __new__ and __init__ methods of a custom metaclass are best coded, and which tasks are most appropriate for each.

Whenever you instantiate any class X (whether X is a custom metaclass or an ordinary class) with or without arguments (we can employ the usual Python notation *a, **k to mean arbitrary positional and named arguments), Python internally performs the equivalent of the following snippet of code:

    new_thing = X.__new__(X, *a, **k)
    if isinstance(new_thing, X):
        X.__init__(new_thing, *a, **k)

The new_thing thus built and initialized is the result of instantiating X. If X is a custom metaclass, in particular, this snippet occurs at the end of the execution of a class statement, and the arguments (all positional) are the name, bases, and dictionary of the new class that is being built.

So, your custom metaclass’ __new__ method is the code that has dibs—it executes first. That’s the moment in which you can adjust the name, bases, and dictionary that you receive as arguments, to affect the way the new class object is built. Most characteristics of the class object, but not all, can also be changed later. An example of an attribute that you have to set before building the class object is __slots__. Once the class object is built, the slots, if any, are defined, and any further change to __slots__ has no effect.

The custom metaclass in this recipe carefully uses super to delegate work to its superclass, rather than carelessly calling type.__new__ or type.__init__ directly: the latter usage would be a subtle mistake, impeding the proper working of multiple inheritance among metaclasses. Further, this recipe is careful in naming the first parameters to both methods: cls to mean an ordinary class (the object that is the first argument to a custom metaclass’ __init__), mcl to mean a metaclass (the object that is the first argument to a custom metaclass’ __new__). The common usage of self should be reserved to mean normal instances, not classes nor metaclasses, and therefore it doesn’t normally occur in the body of a custom metaclass. All of these names are a matter of mere convention, but using appropriate conventions promotes clarity, and this use of cls and mcl was blessed by Guido van Rossum himself, albeit only verbally.

The usage distinction between __new__ and __init__ that this recipe advocates for custom metaclasses is basically the same criterion that any class should always follow: use __new__ when you must, only for jobs that cannot be done later; use __init__ for all jobs that can be left until __init__ time. Following these conventions makes life easiest for anybody who must tweak your custom metaclass or make it work well in a multiple inheritance situation, and thus enhances the reusability of your code. __new__ should contain only the essence of your metaclass: stuff that anybody using your metaclass in any way at all must surely want (or else he wouldn’t be using your metaclass!) because it’s stuff that’s not easy to tweak, modify, or override. __init__ is “softer”, so most of what your metaclass is doing to the class objects you generate, should be there, exactly because it will be easier for reusers to tweak or avoid.

See Also

Library Reference and Python in a Nutshell docs on built-ins super and __slots__, and special methods __init__ and __new__.

20.11. Allowing Chaining of Mutating List Methods

Credit: Stephan Diehl, Alex Martelli

Problem

The methods of the list type that mutate a list object in place—methods such as append and sort—return None. To call a series of such methods, you therefore need to use a series of statements. You would like those methods to return self to enable you to chain a series of calls within a single expression.

Solution

A custom metaclass can offer an elegant approach to this task:

def makeChainable(func):
    ''' wrap a method returning None into one returning self '''
    def chainableWrapper(self, *args, **kwds):
        func(self, *args, **kwds)
        return self
    # 2.4 only: chainableWrapper.__name__ = func.__name__
    return chainableWrapper
class MetaChainable(type):
    def __new__(mcl, cName, cBases, cDict):
        # get the "real" base class, then wrap its mutators into the cDict
        for base in cBases:
            if not isinstance(base, MetaChainable):
                for mutator in cDict['__mutators__']:
                    if mutator not in cDict:
                        cDict[mutator] = makeChainable(getattr(base, mutator))
                break
        # delegate the rest to built-in 'type'
        return super(MetaChainable, mcl).__new__(mcl, cName, cBases, cDict)
class Chainable: __metaclass__ = MetaChainable
if __name__ == '__main__':
    # example usage
    class chainablelist(Chainable, list):
        __mutators__ = 'sort reverse append extend insert'.split()
    print ''.join(chainablelist('hello').extend('ciao').sort().reverse())
# emits: oolliheca

Discussion

Mutator methods of mutable objects such as lists and dictionaries work in place, mutating the object they’re called on, and return None. One reason for this behavior is to avoid confusing programmers who might otherwise think such methods build and return new objects. Returning None also prevents you from chaining a sequence of mutator calls, which some Python gurus consider bad style because it can lead to very dense code that may be hard to read.

Some programmers, however, occasionally prefer the chained-calls, dense-code style. This style is particularly useful in such contexts as lambda forms and list comprehensions. In these contexts, the ability to perform actions within an expression, rather than in statements, can be crucial. This recipe shows one way you can tweak mutators’ return values to allow chaining. Using a custom metaclass means the runtime overhead of introspection is paid only rarely, at class-creation time, rather than repeatedly. If runtime overhead is not a problem for your application, it may be simpler for you to use a delegating wrapper idiom that was posted to comp.lang.python by Jacek Generowicz:

class chainable(object):
    def __init__(self, obj):
        self.obj = obj
    def __iter__(self):
        return iter(self.obj)
    def __getattr__(self, name):
        def proxy(*args, **kwds):
            result = getattr(self.obj, name)(*args, **kwds)
            if result is None: return self
            else: return result
        # 2.4 only: proxy.__name__ = name
        return proxy

The use of this wrapper is quite similar to that of classes obtained by the custom metaclass presented in this recipe’s Solution—for example:

print ''.join(chainable(list('hello')).extend('ciao').sort( ).reverse( ))
# emits: oolliheca

See Also

Library Reference and Python in a Nutshell docs on built-in type list and special methods _ _new_ _ and _ _getattr_ _.

20.12. Using Cooperative Super calls with Terser Syntax

Credit: Michele Simionato, Gonçalo Rodrigues

Problem

You like the cooperative style of multiple-inheritance coding supported by the super built-in, but you wish you could use that style in a more terse and direct way.

Solution

A custom metaclass lets us selectively wrap the methods exposed by a class. Specifically, if the second argument of a method is named super, then that argument gets bound to the appropriate instance of the built-in super:

import inspect
def second_arg(func):
    args = inspect.getargspec(func)[0]
    try: return args[1]
    except IndexError: return None
def super_wrapper(cls, func):
    def wrapper(self, *args, **kw):
        return func(self, super(cls, self), *args, **kw)
    # 2.4 only: wrapper._ _name_ _ = func._ _name_ _
    return wrapper
class MetaCooperative(type):
    def _ _init_ _(cls, name, bases, dic):
        super(MetaCooperative, cls)._ _init_ _(cls, name, bases, dic)
        for attr_name, func in dic.iteritems( ):
            if inspect.isfunction(func) and second_arg(func) == "super":
                setattr(cls, attr_name, super_wrapper(cls, func)) 
class Cooperative:
    _ _metaclass_ _ = MetaCooperative

Discussion

Here is a usage example of the custom metaclass presented in this recipe’s Solution, in a typical toy case of “diamond-shaped” inheritance:

if _ _name_ _ == "_ _main_ _":
    class B(Cooperative):
        def say(self):
            print "B",
    class C(B):
        def say(self, super):
            super.say( )
            print "C",
    class D(B):
        def say(self, super):
            super.say( )
            print "D",
    class CD(C, D):
        def say(self, super):
            super.say( )
            print '!'
    CD( ).say( )
# emits: B D C !

Methods that want to access the super-instance just need to use super as the name of their second argument; the metaclass then arranges to wrap those methods so that the super-instance gets synthesized and passed in as the second argument, as needed.

In other words, when a class cls, whose metaclass is MetaCooperative, has methods whose second argument is named super, then, in those methods, any call of the form super.something(*args, **kw) is a shortcut for super(cls, self).something(*args, **kw). This approach avoids the need to pass the class object as an argument to the built-in super.

Class cls may also perfectly well have other methods that do not follow this convention, and in those methods, it may use the built-in super in the usual way: all it takes for any method to be “normal” is to not use super as the name of its second argument, surely not a major restriction. This recipe offers nicer syntax sugar for the common case of cooperative supercalls, where the first argument to super is the current class—nothing more.
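
For instance, building on the Cooperative base class from this recipe’s Solution, a class can freely mix both styles (a small hypothetical sketch):

class Greeter(Cooperative):
    def greet(self):
        print "Greeter",
class Mixed(Greeter):
    def greet(self, super):        # wrapped: 'super' is super(Mixed, self)
        super.greet()
        print "Mixed",
    def shout(self):               # "normal" method: use the built-in super as usual
        super(Mixed, self).greet()
        print "LOUDLY"
# Mixed().greet() emits: Greeter Mixed
# Mixed().shout() emits: Greeter LOUDLY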

See Also

Library Reference and Python in a Nutshell docs on module inspect and the super built-in.

20.13. Initializing Instance Attributes Without Using _ _init_ _

Credit: Dan Perl, Shalabh Chaturvedi

Problem

Your classes need to initialize some instance attributes when they generate new instances. If you do the initialization, as normal, in the _ _init_ _ method of your classes, then, when anybody subclasses your classes, they must remember to invoke your classes’ _ _init_ _ methods. Your classes often get subclassed by beginners who forget this elementary requirement, and you’re getting tired of the resulting support requests. You’d like an approach that beginners subclassing your classes are less likely to mess up.

Solution

Beginners are unlikely to have heard of the _ _new_ _ method, so you can place your initialization there, instead of in _ _init_ _:

# a couple of classes that you write:
class super1(object):
    def _ _new_ _(cls, *args, **kwargs):
        obj = super(super1, cls)._ _new_ _(cls, *args, **kwargs)
        obj.attr1 = [  ]
        return obj
    def _ _str_ _(self):
        show_attr = [  ]
        for attr, value in sorted(self._ _dict_ _.iteritems( )):
            show_attr.append('%s:%r' % (attr, value))
        return '%s with %s' % (self._ _class_ _._ _name_ _,
                               ', '.join(show_attr))
class super2(object):
    def _ _new_ _(cls, *args, **kwargs):
        obj = super(super2, cls)._ _new_ _(cls, *args, **kwargs)
        obj.attr2 = {  }
        return obj
# typical beginners' code, inheriting your classes but forgetting to
# call its superclasses' _ _init_ _ methods
class derived(super1, super2):
    def _ _init_ _(self):
        self.attr1.append(111)
        self.attr3 = ( )
# despite the typical beginner's error, you won't get support calls:
d = derived( )
print d
# emits: derived with attr1:[111], attr2:{  }, attr3:( )

Discussion

One of Python’s strengths is that it does very little magic behind the curtains—close to nothing, actually. If you know Python in sufficient depth, you know that essentially all internal mechanisms are clearly documented and exposed. This strength, however, means that you yourself must do some things that other languages do magically, such as prefixing self. to methods and attributes of the current object and explicitly calling the _ _init_ _ methods of your superclasses in the _ _init_ _ method of your own class.

Unfortunately, Python beginners, particularly if they first learned from other languages where they’re used to such implicit and magical behavior, can take some time adapting to this brave new world where, if you want something done, you do it. Eventually, they learn. Until they have learned, at times it seems that their favorite pastime is filling my mailbox with help requests, in tones ranging from the humble to the arrogant and angry, complaining that “my classes don’t work.” Almost invariably, this complaint means they’re inheriting from my classes, which are meant to ease such tasks as displaying GUIs and communicating on the Internet, and they have forgotten to call my classes’ _ _init_ _ methods from the _ _init_ _ methods of subclasses they have coded.

To deal with this annoyance, I devised the simple solution shown in this recipe. Beginners generally don’t know about the _ _new_ _ method, and what they don’t know, they cannot mess up. If they do know enough to override _ _new_ _, you can hope they also know enough to do a properly cooperative supercall using the super built-in, rather than crudely bypassing your code by directly calling object._ _new_ _. Well, hope springs eternal, or so they say. Truth be told, my hopes lie in beginners’ total, blissful ignorance about _ _new_ _—and this theory seems to work because I don’t get those kinds of help requests any more. The help requests I now receive seem more concerned with how to actually use my classes, rather than displaying fundamental ignorance of Python.

If you work with more advanced but equally perverse beginners, ones quite able to mess up _ _new_ _, you should consider giving your classes a custom metaclass that, in its _ _call_ _ (which executes at class instantiation time), calls a special hidden method on your classes to enable you to do your initializations anyway. That approach should hold you in good stead—at least until the beginners start learning about metaclasses. Of course, “it is impossible to make anything foolproof, because fools are so ingenious” (Roger Berg). Nevertheless, see Recipe 20.14 for other approaches that avoid _ _init_ _ for attribute initialization needs.
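
Here is a minimal sketch of that fallback idea; the metaclass name MetaInitializer and the hidden method name _setup are purely hypothetical conventions, not part of this recipe’s Solution:

class MetaInitializer(type):
    def __call__(cls, *args, **kwargs):
        # build the bare instance first, without running __init__ yet
        obj = cls.__new__(cls, *args, **kwargs)
        # run every hidden _setup method along the MRO, base classes first,
        # so your attributes exist before any subclass's __init__ runs
        for klass in reversed(cls.__mro__):
            setup = klass.__dict__.get('_setup')
            if setup is not None:
                setup(obj)
        # finally run __init__ normally, just as type.__call__ would
        if isinstance(obj, cls):
            obj.__init__(*args, **kwargs)
        return obj
class super3(object):
    __metaclass__ = MetaInitializer
    def _setup(self):
        self.attr1 = []
class forgetful(super3):
    def __init__(self):
        self.attr1.append(111)   # works: _setup already made attr1, no super call needed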

See Also

Library Reference and Python in a Nutshell documentation on special methods _ _init_ _ and _ _new_ _, and built-in super; Recipe 20.14.

20.14. Automatic Initialization of Instance Attributes

Credit: Sébastien Keim, Troy Melhase, Peter Cogolo

Problem

You want to set some attributes to constant values, during object initialization, without forcing your subclasses to call your _ _init_ _ method.

Solution

For constant values of immutable types, you can just set them in the class. For example, instead of the natural looking:

class counter(object):
    def _ _init_ _(self):
        self.count = 0
    def increase(self, addend=1):
        self.count += addend

you can code:

class counter(object):
    count = 0
    def increase(self, addend=1):
        self.count += addend

This style works because self.count += addend, when self.count belongs to an immutable type, is exactly equivalent to self.count = self.count + addend. The first time this code executes for a particular instance self, self.count is not yet initialized as a per-instance attribute, so the per-class attribute is used, on the right of the equal sign (=); but the per-instance attribute is nevertheless the one assigned to (on the left of the sign). Any further use, once the per-instance attribute has been initialized in this way, gets or sets the per-instance attribute.
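
You can watch this rebinding happen with the counter class just shown; the specific numbers are merely illustrative:

c1, c2 = counter(), counter()
c1.increase()
print c1.count, c2.count                          # emits: 1 0
# the += statement created a per-instance attribute on c1 only:
print 'count' in vars(c1), 'count' in vars(c2)    # emits: True False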

This style does not work for values of mutable types, such as lists or dictionaries. Coding this way would then result in all instances of the class sharing the same mutable-type object as their attribute. However, a custom descriptor works fine:

class auto_attr(object):
    def _ _init_ _(self, name, factory, *a, **k):
        self.data = name, factory, a, k
    def _ _get_ _(self, obj, clas=None):
        name, factory, a, k = self.data
        setattr(obj, name, factory(*a, **k))
        return getattr(obj, name)

With class auto_attr at hand, you can now code, for example:

class recorder(object):
    count = 0
    events = auto_attr('events', list)
    def record(self, event):
        self.count += 1
        self.events.append((self.count, event))

Discussion

The simple and standard approach of defining constant initial values of attributes by setting them as class attributes is just fine, as long as we’re talking about constants of immutable types, such as numbers or strings. In such cases, it does no harm for all instances of the class to share the same initial-value object for such attributes, and, when you do such operations as self.count += 1, you intrinsically rebind the specific, per-instance value of the attribute, without affecting the attributes of other instances.

However, when you want an attribute to have an initial value of a mutable type, such as a list or a dictionary, you need a little bit more—such as the auto_attr custom descriptor type in this recipe. Each instance of auto_attr needs to know to what attribute name it’s being bound, so we pass that name as the first argument when we instantiate auto_attr. Then, we have the factory, a callable that will produce the desired initial value when called (often factory will be a type object, such as list or dict); and finally optional positional and keyword arguments to be passed when factory gets called.

The first time you access an attribute named name on a given instance obj, Python finds in obj’s class the descriptor (an instance of auto_attr) and calls the descriptor’s method _ _get_ _, with obj as an argument. auto_attr’s _ _get_ _ calls the factory and sets the result under the right name as an instance attribute, so that any further access to the attribute of that name in the instance gets the actual value.

In other words, the descriptor is designed to hide itself when it’s first accessed on each instance, to get out of the way of further accesses to the attribute of the same name on that same instance. For this purpose, it’s absolutely crucial that auto_attr is technically a nondata descriptor class, meaning it doesn’t define a _ _set_ _ method. As a consequence, an attribute of the same name may be set in the instance: the per-instance attribute overrides (i.e., takes precedence over) the per-class attribute (i.e., the instance of a nondata descriptor class).

You can regard this recipe’s approach as “just-in-time generation” of instance attributes, the first time a certain attribute gets accessed on a certain instance. Beyond allowing attribute initialization to occur without an _ _init_ _ method, this approach may therefore be useful as an optimization: consider it when each instance has a potentially large set of attributes, maybe costly to initialize, and most of the attributes may end up never being accessed on each given instance.
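
A quick usage sketch with the recorder class shown above makes the just-in-time behavior visible:

r = recorder()
print 'events' in vars(r)    # emits: False -- no list built yet
r.record('start')            # first access: auto_attr builds a fresh list
print r.events               # emits: [(1, 'start')]
print 'events' in vars(r)    # emits: True -- now an ordinary instance attribute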

It is somewhat unfortunate that this recipe requires you to pass to auto_attr the name of the attribute it’s getting bound to; unfortunately, auto_attr has no way to find out for itself. However, if you’re willing to add a custom metaclass to the mix, you can fix this little inconvenience, too, as follows:

class smart_attr(object):
    name = None
    def _ _init_ _(self, factory, *a, **k):
        self.creation_data = factory, a, k
    def _ _get_ _(self, obj, clas=None):
        if self.name is None:
            raise RuntimeError, ("class %r uses a smart_attr, so its "
                "metaclass should be MetaSmart, but is %r instead" %
                (clas, type(clas)))
        factory, a, k = self.creation_data
        setattr(obj, self.name, factory(*a, **k))
        return getattr(obj, self.name)
class MetaSmart(type):
    def _ _new_ _(mcl, clasname, bases, clasdict):
        # set all names for smart_attr attributes
        for k, v in clasdict.iteritems( ):
            if isinstance(v, smart_attr):
                v.name = k
        # delegate the rest to the supermetaclass
        return super(MetaSmart, mcl)._ _new_ _(mcl, clasname, bases, clasdict)
# let's let any class use our custom metaclass by inheriting from smart_object
class smart_object:
    _ _metaclass_ _ = MetaSmart

Using this variant, you could code:

class recorder(smart_object):
    count = 0
    events = smart_attr(list)
    def record(self, event):
        self.count += 1
        self.events.append((self.count, event))

Once you start considering custom metaclasses, you have more options for this recipe’s task, automatic initialization of instance attributes. While a custom descriptor remains the best approach when you do want “just-in-time” generation of initial values, if you prefer to generate all the initial values at the time the instance is being initialized, then you can use a simple placeholder instead of smart_attr, and do more work in the metaclass:

class attr(object):
    def _ _init_ _(self, factory, *a, **k):
        self.creation_data = factory, a, k
import inspect
def is_attr(member):
    return isinstance(member, attr)
class MetaAuto(type):
    def _ _call_ _(cls, *a, **k):
        obj = super(MetaAuto, cls)._ _call_ _(*a, **k)
        # set all values for 'attr' attributes
        for n, v in inspect.getmembers(cls, is_attr):
            factory, a, k = v.creation_data
            setattr(obj, n, factory(*a, **k))
        return obj
# let's let any class use our custom metaclass by inheriting from auto_object
class auto_object:
    _ _metaclass_ _ = MetaAuto

Code using this more concise variant looks just about the same as with the previous one:

class recorder(auto_object):
    count = 0
    events = attr(list)
    def record(self, event):
        self.count += 1
        self.events.append((self.count, event))

See Also

Recipe 20.13 for another approach that avoids _ _init_ _ for attribute initialization needs; Library Reference and Python in a Nutshell docs on special method _ _init_ _, and built-ins super and setattr.

20.15. Upgrading Class Instances Automatically on reload

Credit: Michael Hudson, Peter Cogolo

Problem

You are developing a Python module that defines a class, and you’re trying things out in the interactive interpreter. Each time you reload the module, you have to ensure that existing instances are updated to instances of the new, rather than the old class.

Solution

First, we define a custom metaclass, which ensures its classes keep track of all their existing instances:

import weakref
class MetaInstanceTracker(type):
    ''' a metaclass which ensures its classes keep track of their instances '''
    def _ _init_ _(cls, name, bases, ns):
        super(MetaInstanceTracker, cls)._ _init_ _(name, bases, ns)
        # new class cls starts with no instances
        cls._ _instance_refs_ _ = [  ]
    def _ _instances_ _(cls):
        ''' return all instances of cls which are still alive '''
        # get ref and obj for refs that are still alive
        instances = [(r, r( )) for r in cls._ _instance_refs_ _ if r( ) is not None]
        # record the still-alive references back into the class
        cls._ _instance_refs_ _ = [r for (r, o) in instances]
        # return the instances which are still alive
        return [o for (r, o) in instances]
    def _ _call_ _(cls, *args, **kw):
        ''' generate an instance, and record it (with a weak reference) '''
        instance = super(MetaInstanceTracker, cls)._ _call_ _(*args, **kw)
        # record a ref to the instance before returning the instance
        cls._ _instance_refs_ _.append(weakref.ref(instance))
        return instance
class InstanceTracker:
    ''' any class may subclass this one, to keep track of its instances '''
    _ _metaclass_ _ = MetaInstanceTracker

Now, we can subclass MetaInstanceTracker to obtain another custom metaclass, which, on top of the instance-tracking functionality, implements the auto-upgrading functionality required by this recipe’s Problem:

import inspect
class MetaAutoReloader(MetaInstanceTracker):
    ''' a metaclass which, when one of its classes is re-built, updates all
        instances and subclasses of the previous version to the new one '''
    def _ _init_ _(cls, name, bases, ns):
        # the new class may optionally define an _ _update_ _ method
        updater = ns.pop('_ _update_ _', None)
        super(MetaAutoReloader, cls)._ _init_ _(name, bases, ns)
        # inspect locals & globals in the stackframe of our caller
        f = inspect.currentframe( ).f_back
        for d in (f.f_locals, f.f_globals):
            if name in d:
                # found the name as a variable: is it the old class?
                old_class = d[name]
                if not isinstance(old_class, type(cls)):
                    # no, keep trying
                    continue
                # found the old class: update its existing instances
                for instance in old_class._ _instances_ _( ):
                    instance._ _class_ _ = cls
                    if updater: updater(instance)
                    cls._ _instance_refs_ _.append(weakref.ref(instance))
                # also update the old class's subclasses
                for subclass in old_class._ _subclasses_ _( ):
                    bases = list(subclass._ _bases_ _)
                    bases[bases.index(old_class)] = cls
                    subclass._ _bases_ _ = tuple(bases)
                break
class AutoReloader:
    ''' any class may subclass this one, to get automatic updates '''
    _ _metaclass_ _ = MetaAutoReloader

Here is a usage example:

# an 'old class'
class Bar(AutoReloader):
    def _ _init_ _(self, what=23):
       self.old_attribute = what
# a subclass of the old class
class Baz(Bar):
    pass
# instances of the old class & of its subclass
b = Bar( )
b2 = Baz( )
# we rebuild the class (normally via 'reload', but, here, in-line!):
class Bar(AutoReloader):
    def _ _init_ _(self, what=42):
       self.new_attribute = what+100
    def _ _update_ _(self):
       # compute new attribute from old ones, then delete old ones
       self.new_attribute = self.old_attribute+100
       del self.old_attribute
    def meth(self, arg):
       # add a new method which wasn't in the old class
       print arg, self.new_attribute
if _ _name_ _ == '_ _main_ _':
    # now b is "upgraded" to the new Bar class, so we can call 'meth':
    b.meth(1)
    # emits: 1 123
    # subclass Baz is also upgraded, both for existing instances...:
    b2.meth(2)
    # emits: 2 123
    # ...and for new ones:
    Baz( ).meth(3)
    # emits: 3 142

Discussion

You’re probably familiar with the problem this recipe is meant to address. The scenario is that you’re editing a Python module with your favorite text editor. Let’s say at some point, your module mod.py looks like this:

class Foo(object):
    def meth1(self, arg):
        print arg

In another window, you have an interactive interpreter running to test your code:

>>> import mod
>>> f = mod.Foo( )
>>> f.meth1(1)
1

and it seems to be working. Now you edit mod.py to add another method:

class Foo(object):
    def meth1(self, arg):
        print arg
    def meth2(self, arg):
        print -arg

Head back to the test session:

>>> reload(mod)
<module 'mod' from 'mod.pyc'>
>>> f.meth2(2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'Foo' object has no attribute 'meth2'

Argh! You forgot that f was an instance of the old mod.Foo!

You can do two things about this situation. After reloading, either regenerate the instance:

>>> f = mod.Foo( )
>>> f.meth2(2)
-2

or manually assign to f._ _class_ _:

>>> f._ _class_ _ = mod.Foo
>>> f.meth2(2)
-2

Regenerating works well in simple situations but can become very tedious. Assigning to the class can be automated, which is what this recipe is all about.

Class MetaInstanceTracker is a metaclass that tracks instances of its instances. As metaclasses go, it isn’t too complicated. New classes of this metatype get an extra _ _instance_refs_ _ class variable (which is used to store weak references to instances) and an _ _instances_ _ class method (which strips out dead references from the _ _instance_refs_ _ list and returns real references to the still live instances). Each time a class whose metatype is MetaInstanceTracker gets instantiated, a weak reference to the instance is appended to the class’ _ _instance_refs_ _ list.

When the definition of a class of metatype MetaAutoReloader executes, the namespace of the definition is examined to determine whether a class of the same name already exists. If it does, then it is assumed that this is a class redefinition, instead of a class definition, and all instances of the old class are updated to the new class. (MetaAutoReloader inherits from MetaInstanceTracker, so such instances can easily be found). All direct subclasses, found through the old class’ intrinsic _ _subclasses_ _ class method, then get their _ _bases_ _ tuples rebuilt with the same change.

The new class definition can optionally include a method _ _update_ _, whose job is to update the state (meaning the set of attributes) of each instance, as the instance’s class transitions from the old version of the class to the new one. The usage example in this recipe’s Solution presents a case in which one attribute has changed name and is computed by different rules, as you can tell by observing the way the _ _init_ _ methods of the old and new versions are coded; in this case, the job of _ _update_ _ is to compute the new attribute based on the value of the old one, then del the old attribute for tidiness.

This recipe’s code should probably do more thorough error checking. Net of error-checking issues, this recipe can also supply some fundamental tools to start solving a problem that is substantially harder than the one in this recipe’s Problem statement: automatically upgrading classes in a long-running application, without needing to stop and restart that application.

Doing automatic upgrading in production code is more difficult than doing it during development because many more issues must be monitored. For example, you may need a form of locking to ensure the application is in a quiescent state while a number of classes get upgraded, since you probably don’t want to have the application answering requests in the middle of the upgrading procedure, with some classes or instances already upgraded and others still in their old versions. You also often encounter issues of persistent storage because the application probably needs to update whatever persistent storage it keeps from old to new versions when it upgrades classes. And those are just two examples. Nevertheless, the key component of such on-the-fly upgrading, which has to do with updating instances and subclasses of old classes to new ones, can be tackled with the tools shown in this recipe.

See Also

Docs for the built-in function reload in the Library Reference and Python in a Nutshell.

20.16. Binding Constants at Compile Time

Credit: Raymond Hettinger, Skip Montanaro

Problem

Runtime lookup of global and built-in names is slower than lookup of local names. So, you would like to bind constant global and built-in names into local constant names at compile time.

Solution

To perform this task, we must examine and rewrite bytecodes in the function’s code object. First, we get three names from the standard library module opcode, so we can operate symbolically on bytecodes, and define two auxiliary functions for bytecode operations:

from opcode import opmap, HAVE_ARGUMENT, EXTENDED_ARG
globals( ).update(opmap)
def _insert_constant(value, i, code, constants):
    ''' insert LOAD_CONST for value at code[i:i+3].  Reuse an existing
        constant if values coincide, otherwise append new value to the
        list of constants; return index of the value in constants. '''
    for pos, v in enumerate(constants):
        if v is value: break
    else:
        pos = len(constants)
        constants.append(value)
    code[i] = LOAD_CONST
    code[i+1] = pos & 0xFF
    code[i+2] = pos >> 8
    return pos
def _arg_at(i, code):
    ''' return argument number of the opcode at code[i] '''
    return code[i+1] | (code[i+2] << 8)

Next comes the workhorse, the internal function that does all the binding and folding work:

def _make_constants(f, builtin_only=False, stoplist=( ), verbose=False):
    # bail out at once, innocuously, if we're in Jython, IronPython, etc
    try: co = f.func_code
    except AttributeError: return f
    # we'll modify the bytecodes and consts, so make lists of them
    newcode = map(ord, co.co_code)
    codelen = len(newcode)
    newconsts = list(co.co_consts)
    names = co.co_names
    # Depending on whether we're binding only builtins, or ordinary globals
    # too, we build dictionary 'env' to look up name->value mappings, and we
    # build set 'stoplist' to selectively override and cancel such lookups
    import _ _builtin_ _
    env = vars(_ _builtin_ _).copy( )
    if builtin_only:
        stoplist = set(stoplist)
        stoplist.update(f.func_globals)
    else:
        env.update(f.func_globals)
    # First pass converts global lookups into lookups of constants
    i = 0
    while i < codelen:
        opcode = newcode[i]
        # bail out in difficult cases: optimize common cases only
        if opcode in (EXTENDED_ARG, STORE_GLOBAL):
            return f
        if opcode == LOAD_GLOBAL:
            oparg = _arg_at(i, newcode)
            name = names[oparg]
            if name in env and name not in stoplist:
                # get the constant index to use instead
                pos = _insert_constant(env[name], i, newcode, newconsts)
                if verbose: print '%r -> %r[%d]' % (name, newconsts[pos], pos)
        # move accurately to the next bytecode, skipping arg if any
        i += 1
        if opcode >= HAVE_ARGUMENT:
            i += 2
    # Second pass folds tuples of constants and constant attribute lookups
    i = 0
    while i < codelen:
        newtuple = [  ]
        while newcode[i] == LOAD_CONST:
            oparg = _arg_at(i, newcode)
            newtuple.append(newconsts[oparg])
            i += 3
        opcode = newcode[i]
        if not newtuple:
            i += 1
            if opcode >= HAVE_ARGUMENT:
                i += 2
            continue
        if opcode == LOAD_ATTR:
            obj = newtuple[-1]
            oparg = _arg_at(i, newcode)
            name = names[oparg]
            try:
                value = getattr(obj, name)
            except AttributeError:
                continue
            deletions = 1
        elif opcode == BUILD_TUPLE:
            oparg = _arg_at(i, newcode)
            if oparg != len(newtuple):
                continue
            deletions = len(newtuple)
            value = tuple(newtuple)
        else:
            continue
        reljump = deletions * 3
        newcode[i-reljump] = JUMP_FORWARD
        newcode[i-reljump+1] = (reljump-3) & 0xFF
        newcode[i-reljump+2] = (reljump-3) >> 8
        pos = _insert_constant(value, i, newcode, newconsts)
        if verbose: print "new folded constant: %r[%d]" % (value, pos)
        i += 3
    codestr = ''.join(map(chr, newcode))
    codeobj = type(co)(co.co_argcount, co.co_nlocals, co.co_stacksize,
                    co.co_flags, codestr, tuple(newconsts), co.co_names,
                    co.co_varnames, co.co_filename, co.co_name,
                    co.co_firstlineno, co.co_lnotab, co.co_freevars,
                    co.co_cellvars)
    return type(f)(codeobj, f.func_globals, f.func_name, f.func_defaults,
                    f.func_closure)

Finally, we use _make_constants to optimize itself and its auxiliary function, and define the functions that are meant to be called from outside this module to perform the optimizations that this module supplies:

# optimize thyself!
_insert_constant = _make_constants(_insert_constant)
_make_constants = _make_constants(_make_constants)
import types
@_make_constants
def bind_all(mc, builtin_only=False, stoplist=( ), verbose=False):
    """ Recursively apply constant binding to functions in a module or class.
    """
    try:
        d = vars(mc)
    except TypeError:
        return
    for k, v in d.items( ):
        if type(v) is types.FunctionType:
            newv = _make_constants(v, builtin_only, stoplist,  verbose)
            setattr(mc, k, newv)
        elif type(v) in (type, types.ClassType):
            bind_all(v, builtin_only, stoplist, verbose)
@_make_constants
def make_constants(builtin_only=False, stoplist=[  ], verbose=False):
    """ Call this metadecorator to obtain a decorator which optimizes
        global references by constant binding on a specific function.
    """
    if type(builtin_only) is types.FunctionType:
        raise ValueError, 'must CALL, not just MENTION, make_constants'
    return lambda f: _make_constants(f, builtin_only, stoplist, verbose)

Discussion

Assuming you have saved the code in this recipe’s Solution as module optimize.py somewhere on your Python sys.path, the following example demonstrates how to use the make_constants decorator with arguments (i.e., metadecorator) to optimize a function—in this case, a reimplementation of random.sample:

import random
import optimize
@optimize.make_constants(verbose=True)
def sample(population, k):
    " Choose `k' unique random elements from a `population' sequence. "
    if not isinstance(population, (list, tuple, str)):
        raise TypeError('Cannot handle type', type(population))
    n = len(population)
    if not 0 <= k <= n:
        raise ValueError, "sample larger than population"
    result = [None] * k
    pool = list(population)
    for i in xrange(k):         # invariant:  non-selected at [0,n-i)
        j = int(random.random( ) * (n-i))
        result[i] = pool[j]
        pool[j] = pool[n-i-1]   # move non-selected item into vacancy
    return result

Importing this module emits the following output. (Some details, such as the addresses and paths, will, of course, vary.)

'isinstance' -> <built-in function isinstance>[6]
'list' -> <type 'list'>[7]
'tuple' -> <type 'tuple'>[8]
'str' -> <type 'str'>[9]
'TypeError' -> <class exceptions.TypeError at 0x402952cc>[10]
'type' -> <type 'type'>[11]
'len' -> <built-in function len>[12]
'ValueError' -> <class exceptions.ValueError at 0x40295adc>[13]
'list' -> <type 'list'>[7]
'xrange' -> <type 'xrange'>[14]
'int' -> <type 'int'>[15]
'random' -> <module 'random' from '/usr/local/lib/python2.4/random.pyc'>[16]
new folded constant: (<type 'list'>, <type 'tuple'>, <type 'str'>)[17]
new folded constant: <built-in method random of Random object at 0x819853c>[18]

On my machine, with the decorator optimize.make_constants as shown in this snippet, sample(range(1000), 100) takes 287 microseconds; without the decorator (and thus with the usual bytecode that the Python 2.4 compiler produces), the same operation takes 333 microseconds. Thus, using the decorator improves performance by approximately 14% in this example—and it does so while allowing your own functions’ source code to remain pristine, without any optimization-induced obfuscation. On functions making use of more constant names within loops, the performance benefit of using this recipe’s decorator can be correspondingly greater.

A common and important technique for manual optimization of a Python function, once that function is shown by profiling to be a bottleneck of program performance, is to ensure that all global and built-in name lookups are turned into lookups of local names. In the source of functions that have been thus optimized, you see strange arguments with default values, such as _len=len, and the body of the function uses this local name _len to refer to the built-in function len. This kind of optimization is worthwhile because lookup of local names is much faster than lookup of global and built-in names. However, functions thus optimized can become cluttered and less readable. Moreover, optimizing by hand can be tedious and error prone.
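
For reference, here is what such a hand-optimized function typically looks like; the function itself is just a made-up example, only the _len=len idiom matters:

def count_nonempty(rows, _len=len):
    # _len is a local name, bound once, at def time, to the built-in len
    n = 0
    for row in rows:
        if _len(row):
            n += 1
    return n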

This recipe automates this important optimization technique: by just mentioning a decorator before the def statement, you get all the constant bindings and foldings, while leaving the function source uncluttered, readable, and maintainable. After binding globals to constants, the decorator makes a second pass and folds constant attribute lookups and tuples of constants. Constant attribute lookups often occur when you use a function or other attribute from an imported module, such as the use of random.random in the sample function in the example snippet. Tuples of constants commonly occur in for loops and conditionals using the in operator, such as for x in ('a', 'b', 'c'). The best way to appreciate the bytecode transformations performed by the decorator in this recipe is to run "dis.dis(sample)" and view the disassembly into bytecodes, both with and without the decorator.

If you want to optimize every function and method in a module, you can call optimize.bind_all(sys.modules[_ _name_ _]) as the last instruction in the module’s body, before the tests. To optimize every method in a class, you can call optimize.bind_all(theclass) just after the end of the body of the class theclass statement. Such wholesale optimization is handy (it does not require you to deal with any details) but generally not the best approach. It’s best to bind, selectively, only functions whose speed is important. Functions that particularly benefit from constant-binding optimizations are those that refer to many global and built-in names, particularly with references in loops.

To ensure that the constant-binding optimizations do not alter the behavior of your code, apply them only where dynamic updates of globals are not desired (i.e., the globals do not change). In more dynamic environments, a more conservative approach is to pass argument builtin_only as True, so that only the built-ins get optimized (built-ins include functions such as len, exceptions such as IndexError, and such constants as True or False). Alternatively, you can pass a sequence of names as the stoplist argument, to tell the binding optimization functions to leave unchanged any reference to those names.
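
Putting these options together, usage might look like the following sketch (the module name optimize is as in this recipe; the decorated function is hypothetical):

import sys
import optimize
# bind built-ins only, and leave 'len' alone so it can still be shadowed later:
@optimize.make_constants(builtin_only=True, stoplist=['len'])
def total_abs(data):
    if len(data) == 0:
        return 0
    return sum([abs(x) for x in data])
# or, wholesale: optimize every function and method defined in this module,
# as the last statement of the module body:
optimize.bind_all(sys.modules[__name__])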

While this recipe is meant for use with Python 2.4, you can also use this approach in Python 2.3, with a few obvious caveats. In particular, in version 2.3, you cannot use the new 2.4 @decorator syntax. Therefore, to use this recipe in Python 2.3, you’ll have to tweak the recipe’s code a little, to expose _make_constants directly, without a leading underscore, and use f=make_constants(f) in your code, right after the end of the body of the def f statement. However, if you are interested in optimization, you should consider moving to Python 2.4 anyway: Python 2.4 is very compatible with Python 2.3, with just a few useful additions, and version 2.4 is generally measurably faster than Python 2.3.

See Also

Library Reference and Python in a Nutshell docs on the opcode module.

20.17. Solving Metaclass Conflicts

Credit: Michele Simionato, David Mertz, Phillip J. Eby, Alex Martelli, Anna Martelli Ravenscroft

Problem

You need to multiply inherit from several classes that may come from several metaclasses, so you need to automatically generate a custom metaclass that resolves any possible metaclass conflicts.

Solution

First of all, given a sequence of metaclasses, we want to filter out “redundant” ones—those that are already implied by others, being duplicates or superclasses. This job nicely factors into a general-purpose generator yielding the unique, nonredundant items of an iterable, and a function using inspect.getmro to make the set of all superclasses of the given classes (since superclasses are redundant):

# support 2.3, too
try: set
except NameError: from sets import Set as set
# support classic classes, to some extent
import types
def uniques(sequence, skipset):
    for item in sequence:
        if item not in skipset:
            yield item
            skipset.add(item)
import inspect
def remove_redundant(classes):
    redundant = set([types.ClassType])   # turn old-style classes to new
    for c in classes:
        redundant.update(inspect.getmro(c)[1:])
    return tuple(uniques(classes, redundant))

Using the remove_redundant function, we can generate a metaclass that can resolve metatype conflicts (given a sequence of base classes, and other metaclasses to inject both before and after those implied by the base classes). It’s important to avoid generating more than one metaclass to solve the same potential conflicts, so we also keep a “memoization” mapping:

memoized_metaclasses_map = {  }
def _get_noconflict_metaclass(bases, left_metas, right_metas):
     # make tuple of needed metaclasses in specified order
     metas = left_metas + tuple(map(type, bases)) + right_metas
     needed_metas = remove_redundant(metas)
     # return an existing conflict-solving meta, if any
     try: return memoized_metaclasses_map[needed_metas]
     except KeyError: pass
     # compute, memoize and return needed conflict-solving meta
     if not needed_metas:         # whee, a trivial case, happy us
         meta = type
     elif len(needed_metas) == 1: # another trivial case
         meta = needed_metas[0]
     else:                        # nontrivial, darn, gotta work...
         # ward against non-type root metatypes
         for m in needed_metas:
             if not issubclass(m, type):
                 raise TypeError( 'Non-type root metatype %r' % m)
         metaname = '_' + ''.join([m._ _name_ _ for m in needed_metas])
         meta = classmaker( )(metaname, needed_metas, {  })
     memoized_metaclasses_map[needed_metas] = meta
     return meta
def classmaker(left_metas=( ), right_metas=( )):
     def make_class(name, bases, adict):
         metaclass = _get_noconflict_metaclass(bases, left_metas, right_metas)
         return metaclass(name, bases, adict)
     return make_class

The internal _get_noconflict_metaclass function, which returns (and, if needed, builds) the conflict-resolution metaclass, and the public classmaker closure must be mutually recursive for a rather subtle reason. If _get_noconflict_metaclass just built the metaclass with the reasonably common idiom:

         meta = type(metaname, needed_metas, {  })

it would work in all ordinary cases, but it might get into trouble when the metaclasses involved have custom metaclasses themselves! Just like “little fleas have lesser fleas,” so, potentially, metaclasses can have meta-metaclasses, and so on—fortunately not “ad infinitum,” pace Augustus De Morgan, so the mutual recursion does eventually terminate.

The recipe offers minimal support for old-style (i.e., classic) classes, with the simple expedient of initializing the set redundant to contain the metaclass of old-style classes, types.ClassType. In practice, this recipe imposes automatic conversion to new-style classes. Trying to offer more support than this for classic classes, which are after all a mere legacy feature, would be overkill, given the confused and confusing situation of metaclass use for old-style classes.

In all of our code outside of this noconflict.py module, we will only use noconflict.classmaker, optionally passing it metaclasses we want to inject, left and right, to obtain a callable that we can then use just like a metaclass to build new class objects given names, bases, and dictionary, but with the assurance that metatype conflicts cannot occur. Phew. Now that was worth it, wasn’t it?!

Discussion

Here is the simplest case in which a metatype conflict can occur: multiply inheriting from two classes with independent metaclasses. In a pedagogically simplified toy-level example, that could be, say:

>>> class Meta_A(type): pass
... 
>>> class Meta_B(type): pass
... 
>>> class A: _ _metaclass_ _ = Meta_A
... 
>>> class B: _ _metaclass_ _ = Meta_B
... 
>>> class C(A, B): pass
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: Error when calling the metaclass bases
    metaclass conflict: the metaclass of a derived class must be a
(non-strict) subclass of the metaclasses of all its bases
>>>

A class normally inherits its metaclass from its bases, but when the bases have distinct metaclasses, the metatype constraint that Python expresses so tersely in this error message applies. So, we need to build a new metaclass, say Meta_C, which inherits from both Meta_A and Meta_B. For a demonstration of this need, see the book that’s so aptly considered the bible of metaclasses: Ira R. Forman and Scott H. Danforth, Putting Metaclasses to Work: A New Dimension in Object-Oriented Programming (Addison-Wesley).

Python does not do magic: it does not automatically create the required Meta_C. Rather, Python raises a TypeError to ensure that the programmer is aware of the problem. In simple cases, the programmer can solve the metatype conflict by hand, as follows:

>>> class Meta_C(Meta_A, Meta_B): pass
>>> class C(A, B): _ _metaclass_ _ = Meta_C

In this case, everything works smoothly.

The key point of this recipe is to show an automatic way to resolve metatype conflicts, rather than having to do it by hand every time. Having saved all the code from this recipe’s Solution into noconflict.py somewhere along your Python sys.path, you can make class C with automatic conflict resolution, as follows:

>>> import noconflict
>>> class C(A, B): _ _metaclass_ _ = noconflict.classmaker( )

The call to the noconflict.classmaker closure returns a function that, when Python calls it with the (name, bases, dict) arguments, obtains the proper metaclass and uses it to build the class object. classmaker cannot return the needed metaclass directly, since that metaclass depends on the bases, which are supplied only when the class statement executes; but that’s OK: you can assign anything you want to the _ _metaclass_ _ attribute of your class, as long as it’s callable with the (name, bases, dict) arguments and nicely builds the class object. Once again, Python’s signature-based polymorphism serves us well and unobtrusively.
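
To see that last point in isolation, any callable with the right signature can play the role of _ _metaclass_ _; for example (a toy sketch, unrelated to noconflict.py):

def noisy_class_builder(name, bases, adict):
    print 'building class', name
    return type(name, bases, adict)
class Plain(object):
    __metaclass__ = noisy_class_builder
# the class statement emits: building class Plain
# and Plain is a perfectly ordinary class whose metaclass is just type
print type(Plain)                   # emits: <type 'type'>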

Automating the resolution of the metatype conflict has many pluses, even in simple cases. Thanks to the “memoizing” technique used in noconflict.py, the same conflict-resolving metaclass is used for any occurrence of a given sequence of conflicting metaclasses. Moreover, with this approach you may also explicitly inject other metaclasses, beyond those you get from your base classes, and again you can avoid conflicts. Consider:

>>> class D(A): _ _metaclass_ _ = Meta_B
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: Error when calling the metaclass bases
    metaclass conflict: the metaclass of a derived class must be a
(non-strict) subclass of the metaclasses of all its bases

This metatype conflict is resolved just as easily as the former one:

>>> class D(A): _ _metaclass_ _ = noconflict.classmaker((Meta_B,))

The code presented in this recipe’s Solution takes pains to avoid any subclassing that is not strictly necessary, and it also uses mutual recursion to avoid any meta-level of meta-meta-type conflicts. You might never meet higher-order-meta conflicts anyway, but if you adopt the code presented in this recipe, you need not even worry about them.

Thanks to David Mertz for help in polishing the original version of the code. This version has benefited immensely from discussions with Phillip J. Eby. Alex Martelli and Anna Martelli Ravenscroft did their best to make the recipe’s code and discussion as explicit and understandable as they could. The functionality in this recipe is not absolutely complete: for example, it supports old-style classes only in a rather backhanded way, and it does not really cover such exotica as nontype metatype roots (such as Zope 2’s old ExtensionClass). These limitations are there primarily to keep this recipe as understandable as possible. You may find a more complete implementation of metatype conflict resolution at Phillip J. Eby’s PEAK site, http://peak.telecommunity.com/, in the peak.util.Meta module of the PEAK framework.

See Also

Ira R. Forman and Scott H. Danforth, Putting Metaclasses to Work: A New Dimension in Object-Oriented Programming (Addison-Wesley); Michele Simionato’s essay, “Method Resolution Order,” http://www.python.org/2.3/mro.html.
