Chapter 37. Managed Attributes

This chapter expands on the attribute interception techniques introduced earlier, introduces another, and employs them in a handful of larger examples. Like everything in this part of the book, this chapter is classified as an advanced topic and optional reading, because most applications programmers don’t need to care about the material discussed here—they can fetch and set attributes on objects without concern for attribute implementations. Especially for tools builders, though, managing attribute access can be an important part of flexible APIs.

Why Manage Attributes?

Object attributes are central to most Python programs—they are where we often store information about the entities our scripts process. Normally, attributes are simply names for objects; a person’s name attribute, for example, might be a simple string, fetched and set with basic attribute syntax:

person.name                 # Fetch attribute value
person.name = value         # Change attribute value

In most cases, the attribute lives in the object itself, or is inherited from a class from which it derives. That basic model suffices for most programs you will write in your Python career.

Sometimes, though, more flexibility is required. Suppose you’ve written a program to use a name attribute directly, but then your requirements change—for example, you decide that names should be validated with logic when set or mutated in some way when fetched. It’s straightforward to code methods to manage access to the attribute’s value (valid and transform are abstract here):

class Person:
    def getName(self):
        if not valid():
            raise TypeError('cannot fetch name')
        else:
            return self.name.transform()
    def setName(self, value):
         if not valid(value):
            raise TypeError('cannot change name')
        else:
            self.name = transform(value)

person = Person()
person.getName()
person.setName('value')

However, this also requires changing all the places where names are used in the entire program—a possibly nontrivial task. Moreover, this approach requires the program to be aware of how values are exported: as simple names or called methods. If you begin with a method-based interface to data, clients are immune to changes; if you do not, they can become problematic.

This issue can crop up more often than you might expect. The value of a cell in a spreadsheet-like program, for instance, might begin its life as a simple discrete value, but later mutate into an arbitrary calculation. Since an object’s interface should be flexible enough to support such future changes without breaking existing code, switching to methods later is less than ideal.

Inserting Code to Run on Attribute Access

A better solution would allow you to run code automatically on attribute access, if needed. At various points in this book, we’ve met Python tools that allow our scripts to dynamically compute attribute values when fetching them and validate or change attribute values when storing them. In this chapter, were going to expand on the tools already introduced, explore other available tools, and study some larger use-case examples in this domain. Specifically, this chapter presents:

  • The __getattr__ and __setattr__ methods, for routing undefined attribute fetches and all attribute assignments to generic handler methods.

  • The __getattribute__ method, for routing all attribute fetches to a generic handler method in new-style classes in 2.6 and all classes in 3.0.

  • The property built-in, for routing specific attribute access to get and set handler functions, known as properties.

  • The descriptor protocol, for routing specific attribute accesses to instances of classes with arbitrary get and set handler methods.

The first and third of these were briefly introduced in Part VI; the others are new topics introduced and covered here.

As we’ll see, all four techniques share goals to some degree, and it’s usually possible to code a given problem using any one of them. They do differ in some important ways, though. For example, the last two techniques listed here apply to specific attributes, whereas the first two are generic enough to be used by delegation-based classes that must route arbitrary attributes to wrapped objects. As we’ll see, all four schemes also differ in both complexity and aesthetics, in ways you must see in action to judge for yourself.

Besides studying the specifics behind the four attribute interception techniques listed in this section, this chapter also presents an opportunity to explore larger programs than we’ve seen elsewhere in this book. The CardHolder case study at the end, for example, should serve as a self-study example of larger classes in action. We’ll also be using some of the techniques outlined here in the next chapter to code decorators, so be sure you have at least a general understanding of these topics before you move on.

Properties

The property protocol allows us to route a specific attribute’s get and set operations to functions or methods we provide, enabling us to insert code to be run automatically on attribute access, intercept attribute deletions, and provide documentation for the attributes if desired.

Properties are created with the property built-in and are assigned to class attributes, just like method functions. As such, they are inherited by subclasses and instances, like any other class attributes. Their access-interception functions are provided with the self instance argument, which grants access to state information and class attributes available on the subject instance.

A property manages a single, specific attribute; although it can’t catch all attribute accesses generically, it allows us to control both fetch and assignment accesses and enables us to change an attribute from simple data to a computation freely, without breaking existing code. As we’ll see, properties are strongly related to descriptors; they are essentially a restricted form of them.

The Basics

A property is created by assigning the result of a built-in function to a class attribute:

attribute = property(fget, fset, fdel, doc)

None of this built-in’s arguments are required, and all default to None if not passed; such operations are not supported, and attempting them will raise an exception. When using them, we pass fget a function for intercepting attribute fetches, fset a function for assignments, and fdel a function for attribute deletions; the doc argument receives a documentation string for the attribute, if desired (otherwise the property copies the docstring of fget, if provided, which defaults to None). fget returns the computed attribute value, and fset and fdel return nothing (really, None).

This built-in call returns a property object, which we assign to the name of the attribute to be managed in the class scope, where it will be inherited by every instance.

A First Example

To demonstrate how this translates to working code, the following class uses a property to trace access to an attribute named name; the actual stored data is named _name so it does not clash with the property:

class Person:                       # Use (object) in 2.6
    def __init__(self, name):
        self._name = name
    def getName(self):
        print('fetch...')
        return self._name
    def setName(self, value):
        print('change...')
        self._name = value
    def delName(self):
        print('remove...')
        del self._name
    name = property(getName, setName, delName, "name property docs")

bob = Person('Bob Smith')           # bob has a managed attribute
print(bob.name)                     # Runs getName
bob.name = 'Robert Smith'           # Runs setName
print(bob.name)
del bob.name                        # Runs delName

print('-'*20)
sue = Person('Sue Jones')           # sue inherits property too
print(sue.name)
print(Person.name.__doc__)          # Or help(Person.name)

Properties are available in both 2.6 and 3.0, but they require new-style object derivation in 2.6 to work correctly for assignments—add object as a superclass here to run this in 2.6 (you can list the superclass in 3.0 too, but it’s implied and not required).

This particular property doesn’t do much—it simply intercepts and traces an attribute—but it serves to demonstrate the protocol. When this code is run, two instances inherit the property, just as they would any other attribute attached to their class. However, their attribute accesses are caught:

fetch...
Bob Smith
change...
fetch...
Robert Smith
remove...
--------------------
fetch...
Sue Jones
name property docs

Like all class attributes, properties are inherited by both instances and lower subclasses. If we change our example as follows, for example:

class Super:
    ...the original Person class code...
    name = property(getName, setName, delName, 'name property docs')

class Person(Super):
    pass                            # Properties are inherited

bob = Person('Bob Smith')
...rest unchanged...

the output is the same—the Person subclass inherits the name property from Super, and the bob instance gets it from Person. In terms of inheritance, properties work the same as normal methods; because they have access to the self instance argument, they can access instance state information like methods, as the next section demonstrates.

Computed Attributes

The example in the prior section simply traces attribute accesses. Usually, though, properties do much more—computing the value of an attribute dynamically when fetched, for example. The following example illustrates:

class PropSquare:
    def __init__(self, start):
        self.value = start
    def getX(self):                         # On attr fetch
        return self.value ** 2
    def setX(self, value):                  # On attr assign
        self.value = value
    X = property(getX, setX)                # No delete or docs

P = PropSquare(3)       # 2 instances of class with property
Q = PropSquare(32)      # Each has different state information

print(P.X)              # 3 ** 2
P.X = 4
print(P.X)              # 4 ** 2
print(Q.X)              # 32 ** 2

This class defines an attribute X that is accessed as though it were static data, but really runs code to compute its value when fetched. The effect is much like an implicit method call. When the code is run, the value is stored in the instance as state information, but each time we fetch it via the managed attribute, its value is automatically squared:

9
16
1024

Notice that we’ve made two different instances—because property methods automatically receive a self argument, they have access to the state information stored in instances. In our case, this mean the fetch computes the square of the subject instance’s data.

Coding Properties with Decorators

Although we’re saving additional details until the next chapter, we introduced function decorator basics earlier, in Chapter 31. Recall that the function decorator syntax:

@decorator
def func(args): ...

is automatically translated to this equivalent by Python, to rebind the function name to the result of the decorator callable:

def func(args): ...
func = decorator(func)

Because of this mapping, it turns out that the property built-in can serve as a decorator, to define a function that will run automatically when an attribute is fetched:

class Person:
    @property
    def name(self): ...             # Rebinds: name = property(name)

When run, the decorated method is automatically passed to the first argument of the property built-in. This is really just alternative syntax for creating a property and rebinding the attribute name manually:

class Person:
    def name(self): ...
    name = property(name)

As of Python 2.6, property objects also have getter, setter, and deleter methods that assign the corresponding property accessor methods and return a copy of the property itself. We can use these to specify components of properties by decorating normal methods too, though the getter component is usually filled in automatically by the act of creating the property itself:

class Person:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):                 # name = property(name)
        "name property docs"
        print('fetch...')
        return self._name

    @name.setter
    def name(self, value):          # name = name.setter(name)
        print('change...')
        self._name = value

    @name.deleter
    def name(self):                 # name = name.deleter(name)
        print('remove...')
        del self._name

bob = Person('Bob Smith')           # bob has a managed attribute
print(bob.name)                     # Runs name getter (name 1)
bob.name = 'Robert Smith'           # Runs name setter (name 2)
print(bob.name)
del bob.name                        # Runs name deleter (name 3)

print('-'*20)
sue = Person('Sue Jones')           # sue inherits property too
print(sue.name)
print(Person.name.__doc__)          # Or help(Person.name)

In fact, this code is equivalent to the first example in this section—decoration is just an alternative way to code properties in this case. When it’s run, the results are the same:

fetch...
Bob Smith
change...
fetch...
Robert Smith
remove...
--------------------
fetch...
Sue Jones
name property docs

Compared to manual assignment of property results, in this case using decorators to code properties requires just three extra lines of code (a negligible difference). As is so often the case with alternative tools, the choice between the two techniques is largely subjective.

Descriptors

Descriptors provide an alternative way to intercept attribute access; they are strongly related to the properties discussed in the prior section. In fact, a property is a kind of descriptor—technically speaking, the property built-in is just a simplified way to create a specific type of descriptor that runs method functions on attribute accesses.

Functionally speaking, the descriptor protocol allows us to route a specific attribute’s get and set operations to methods of a separate class object that we provide: they provide a way to insert code to be run automatically on attribute access, and they allow us to intercept attribute deletions and provide documentation for the attributes if desired.

Descriptors are created as independent classes, and they are assigned to class attributes just like method functions. Like any other class attribute, they are inherited by subclasses and instances. Their access-interception methods are provided with both a self for the descriptor itself, and the instance of the client class. Because of this, they can retain and use state information of their own, as well as state information of the subject instance. For example, a descriptor may call methods available in the client class, as well as descriptor-specific methods it defines.

Like a property, a descriptor manages a single, specific attribute; although it can’t catch all attribute accesses generically, it provides control over both fetch and assignment accesses and allows us to change an attribute freely from simple data to a computation without breaking existing code. Properties really are just a convenient way to create a specific kind of descriptor, and as we shall see, they can be coded as descriptors directly.

Whereas properties are fairly narrow in scope, descriptors provide a more general solution. For instance, because they are coded as normal classes, descriptors have their own state, may participate in descriptor inheritance hierarchies, can use composition to aggregate objects, and provide a natural structure for coding internal methods and attribute documentation strings.

The Basics

As mentioned previously, descriptors are coded as separate classes and provide specially named accessor methods for the attribute access operations they wish to intercept—get, set, and deletion methods in the descriptor class are automatically run when the attribute assigned to the descriptor class instance is accessed in the corresponding way:

class Descriptor:
    "docstring goes here"
    def __get__(self, instance, owner): ...        # Return attr value
    def __set__(self, instance, value): ...        # Return nothing (None)
    def __delete__(self, instance): ...            # Return nothing (None)

Classes with any of these methods are considered descriptors, and their methods are special when one of their instances is assigned to another class’s attribute—when the attribute is accessed, they are automatically invoked. If any of these methods are absent, it generally means that the corresponding type of access is not supported. Unlike with properties, however, omitting a __set__ allows the name to be redefined in an instance, thereby hiding the descriptor—to make an attribute read-only, you must define __set__ to catch assignments and raise an exception.

Descriptor method arguments

Before we code anything realistic, let’s take a brief look at some fundamentals. All three descriptor methods outlined in the prior section are passed both the descriptor class instance (self) and the instance of the client class to which the descriptor instance is attached (instance).

The __get__ access method additionally receives an owner argument, specifying the class to which the descriptor instance is attached. Its instance argument is either the instance through which the attribute was accessed (for instance.attr), or None when the attribute is accessed through the owner class directly (for class.attr). The former of these generally computes a value for instance access, and the latter usually returns self if descriptor object access is supported.

For example, in the following, when X.attr is fetched, Python automatically runs the __get__ method of the Descriptor class to which the Subject.attr class attribute is assigned (as with properties, in Python 2.6 we must derive from object to use descriptors here; in 3.0 this is implied, but doesn’t hurt):

>>> class Descriptor(object):
...     def __get__(self, instance, owner):
...         print(self, instance, owner, sep='
')
...
>>> class Subject:
...     attr = Descriptor()             # Descriptor instance is class attr
...
>>> X = Subject()

>>> X.attr
<__main__.Descriptor object at 0x0281E690>
<__main__.Subject object at 0x028289B0>
<class '__main__.Subject'>

>>> Subject.attr
<__main__.Descriptor object at 0x0281E690>
None
<class '__main__.Subject'>

Notice the arguments automatically passed in to the __get__ method in the first attribute fetch—when X.attr is fetched, it’s as though the following translation occurs (though the Subject.attr here doesn’t invoke __get__ again):

X.attr  ->  Descriptor.__get__(Subject.attr, X, Subject)

The descriptor knows it is being accessed directly when its instance argument is None.

Read-only descriptors

As mentioned earlier, unlike with properties, with descriptors simply omitting the __set__ method isn’t enough to make an attribute read-only, because the descriptor name can be assigned to an instance. In the following, the attribute assignment to X.a stores a in the instance object X, thereby hiding the descriptor stored in class C:

>>> class D:
...     def __get__(*args): print('get')
...
>>> class C:
...     a = D()
...
>>> X = C()
>>> X.a                                 # Runs inherited descriptor __get__
get
>>> C.a
get
>>> X.a = 99                            # Stored on X, hiding C.a
>>> X.a
99
>>> list(X.__dict__.keys())
['a']
>>> Y = C()
>>> Y.a                                 # Y still inherits descriptor
get
>>> C.a
get

This is the way all instance attribute assignments work in Python, and it allows classes to selectively override class-level defaults in their instances. To make a descriptor-based attribute read-only, catch the assignment in the descriptor class and raise an exception to prevent attribute assignment—when assigning an attribute that is a descriptor, Python effectively bypasses the normal instance-level assignment behavior and routes the operation to the descriptor object:

>>> class D:
...     def __get__(*args): print('get')
...     def __set__(*args): raise AttributeError('cannot set')
...
>>> class C:
...     a = D()
...
>>> X = C()
>>> X.a                                 # Routed to C.a.__get__
get
>>> X.a = 99                            # Routed to C.a.__set__
AttributeError: cannot set

Note

Also be careful not to confuse the descriptor __delete__ method with the general __del__ method. The former is called on attempts to delete the managed attribute name on an instance of the owner class; the latter is the general instance destructor method, run when an instance of any kind of class is about to be garbage collected. __delete__ is more closely related to the __delattr__ generic attribute deletion method we’ll meet later in this chapter. See Chapter 29 for more on operator overloading methods.

A First Example

To see how this all comes together in more realistic code, let’s get started with the same first example we wrote for properties. The following defines a descriptor that intercepts access to an attribute named name in its clients. Its methods use their instance argument to access state information in the subject instance, where the name string is actually stored. Like properties, descriptors work properly only for new-style classes, so be sure to derive both classes in the following from object if you’re using 2.6:

class Name:                             # Use (object) in 2.6
    "name descriptor docs"
    def __get__(self, instance, owner):
        print('fetch...')
        return instance._name
    def __set__(self, instance, value):
        print('change...')
        instance._name = value
    def __delete__(self, instance):
        print('remove...')
        del instance._name

class Person:                           # Use (object) in 2.6
    def __init__(self, name):
        self._name = name
    name = Name()                       # Assign descriptor to attr

bob = Person('Bob Smith')               # bob has a managed attribute
print(bob.name)                         # Runs Name.__get__
bob.name = 'Robert Smith'               # Runs Name.__set__
print(bob.name)
del bob.name                            # Runs Name.__delete__

print('-'*20)
sue = Person('Sue Jones')               # sue inherits descriptor too
print(sue.name)
print(Name.__doc__)                     # Or help(Name)

Notice in this code how we assign an instance of our descriptor class to a class attribute in the client class; because of this, it is inherited by all instances of the class, just like a class’s methods. Really, we must assign the descriptor to a class attribute like this—it won’t work if assigned to a self instance attribute instead. When the descriptor’s __get__ method is run, it is passed three objects to define its context:

  • self is the Name class instance.

  • instance is the Person class instance.

  • owner is the Person class.

When this code is run the descriptor’s methods intercept accesses to the attribute, much like the property version. In fact, the output is the same again:

fetch...
Bob Smith
change...
fetch...
Robert Smith
remove...
--------------------
fetch...
Sue Jones
name descriptor docs

Also like in the property example, our descriptor class instance is a class attribute and thus is inherited by all instances of the client class and any subclasses. If we change the Person class in our example to the following, for instance, the output of our script is the same:

...
class Super:
    def __init__(self, name):
        self._name = name
    name = Name()

class Person(Super):                            # Descriptors are inherited
   pass
...

Also note that when a descriptor class is not useful outside the client class, it’s perfectly reasonable to embed the descriptor’s definition inside its client syntactically. Here’s what our example looks like if we use a nested class:

class Person:
    def __init__(self, name):
        self._name = name

    class Name:                                 # Using a nested class
        "name descriptor docs"
        def __get__(self, instance, owner):
            print('fetch...')
            return instance._name
        def __set__(self, instance, value):
            print('change...')
            instance._name = value
        def __delete__(self, instance):
            print('remove...')
            del instance._name
    name = Name()

When coded this way, Name becomes a local variable in the scope of the Person class statement, such that it won’t clash with any names outside the class. This version works the same as the original—we’ve simply moved the descriptor class definition into the client class’s scope—but the last line of the testing code must change to fetch the docstring from its new location:

...
print(Person.Name.__doc__)     # Differs: not Name.__doc__ outside class

Computed Attributes

As was the case when using properties, our first descriptor example of the prior section didn’t do much—it simply printed trace messages for attribute accesses. In practice, descriptors can also be used to compute attribute values each time they are fetched. The following illustrates—it’s a rehash of the same example we coded for properties, which uses a descriptor to automatically square an attribute’s value each time it is fetched:

class DescSquare:
    def __init__(self, start):                  # Each desc has own state
        self.value = start
    def __get__(self, instance, owner):         # On attr fetch
        return self.value ** 2
    def __set__(self, instance, value):         # On attr assign
        self.value = value                      # No delete or docs

class Client1:
    X = DescSquare(3)          # Assign descriptor instance to class attr

class Client2:
    X = DescSquare(32)         # Another instance in another client class
                               # Could also code 2 instances in same class
c1 = Client1()
c2 = Client2()

print(c1.X)                    # 3 ** 2
c1.X = 4
print(c1.X)                    # 4 ** 2
print(c2.X)                    # 32 ** 2

When run, the output of this example is the same as that of the original property-based version, but here a descriptor class object is intercepting the attribute accesses:

9
16
1024

Using State Information in Descriptors

If you study the two descriptor examples we’ve written so far, you might notice that they get their information from different places—the first (the name attribute example) uses data stored on the client instance, and the second (the attribute squaring example) uses data attached to the descriptor object itself. In fact, descriptors can use both instance state and descriptor state, or any combination thereof:

  • Descriptor state is used to manage data internal to the workings of the descriptor.

  • Instance state records information related to and possibly created by the client class.

Descriptor methods may use either, but descriptor state often makes it unnecessary to use special naming conventions to avoid name collisions for descriptor data stored on an instance. For example, the following descriptor attaches information to its own instance, so it doesn’t clash with that on the client class’s instance:

class DescState:                           # Use descriptor state
    def __init__(self, value):
        self.value = value
    def __get__(self, instance, owner):    # On attr fetch
        print('DescState get')
        return self.value * 10
    def __set__(self, instance, value):    # On attr assign
        print('DescState set')
        self.value = value

# Client class

class CalcAttrs:
    X = DescState(2)                       # Descriptor class attr
    Y = 3                                  # Class attr
    def __init__(self):
        self.Z = 4                         # Instance attr

obj = CalcAttrs()
print(obj.X, obj.Y, obj.Z)                 # X is computed, others are not
obj.X = 5                                  # X assignment is intercepted
obj.Y = 6
obj.Z = 7
print(obj.X, obj.Y, obj.Z)

This code’s value information lives only in the descriptor, so there won’t be a collision if the same name is used in the client’s instance. Notice that only the descriptor attribute is managed here—get and set accesses to X are intercepted, but accesses to Y and Z are not (Y is attached to the client class and Z to the instance). When this code is run, X is computed when fetched:

DescState get
20 3 4
DescState set
DescState get
50 6 7

It’s also feasible for a descriptor to store or use an attribute attached to the client class’s instance, instead of itself. Unlike data stored in the descriptor itself, this allows for data that can vary per client class instance. The descriptor in the following example assumes the instance has an attribute _Y attached by the client class, and uses it to compute the value of the attribute it represents:

class InstState:                           # Using instance state
    def __get__(self, instance, owner):
        print('InstState get')             # Assume set by client class
        return instance._Y * 100
    def __set__(self, instance, value):
        print('InstState set')
        instance._Y = value

# Client class

class CalcAttrs:
    X = DescState(2)                       # Descriptor class attr
    Y = InstState()                        # Descriptor class attr
    def __init__(self):
        self._Y = 3                        # Instance attr
        self.Z  = 4                        # Instance attr

obj = CalcAttrs()
print(obj.X, obj.Y, obj.Z)                 # X and Y are computed, Z is not
obj.X = 5                                  # X and Y assignments intercepted
obj.Y = 6
obj.Z = 7
print(obj.X, obj.Y, obj.Z)

This time, X and Y are both assigned to descriptors and computed when fetched (X is assigned the descriptor of the prior example). The new descriptor here has no information itself, but it uses an attribute assumed to exist in the instance—that attribute is named _Y, to avoid collisions with the name of the descriptor itself. When this version is run the results are similar, but a second attribute is managed, using state that lives in the instance instead of the descriptor:

DescState get
InstState get
20 300 4
DescState set
InstState set
DescState get
InstState get
50 600 7

Both descriptor and instance state have roles. In fact, this is a general advantage that descriptors have over properties—because they have state of their own, they can easily retain data internally, without adding it to the namespace of the client instance object.

How Properties and Descriptors Relate

As mentioned earlier, properties and descriptors are strongly related—the property built-in is just a convenient way to create a descriptor. Now that you know how both work, you should also be able to see that it’s possible to simulate the property built-in with a descriptor class like the following:

class Property:
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel                                  # Save unbound methods
        self.__doc__ = doc                                # or other callables

    def __get__(self, instance, instancetype=None):
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError("can't get attribute")
        return self.fget(instance)                        # Pass instance to self
                                                          # in property accessors
    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(instance, value)

    def __delete__(self, instance):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(instance)

class Person:
    def getName(self): ...
    def setName(self, value): ...
    name = Property(getName, setName)                     # Use like property()

This Property class catches attribute accesses with the descriptor protocol and routes requests to functions or methods passed in and saved in descriptor state when the class is created. Attribute fetches, for example, are routed from the Person class, to the Property class’s __get__ method, and back to the Person class’s getName. With descriptors, this “just works.”

Note that this descriptor class equivalent only handles basic property usage, though; to use @ decorator syntax to also specify set and delete operations, our Property class would also have to be extended with setter and deleter methods, which would save the decorated accessor function and return the property object (self should suffice). Since the property built-in already does this, we’ll omit a formal coding of this extension here.

Also note that descriptors are used to implement Python’s __slots__; instance attribute dictionaries are avoided by intercepting slot names with descriptors stored at the class level. See Chapter 31 for more on slots.

Note

In Chapter 38, we’ll also make use of descriptors to implement function decorators that apply to both functions and methods. As you’ll see there, because descriptors receive both descriptor and subject class instances they work well in this role, though nested functions are usually a simpler solution.

__getattr__ and __getattribute__

So far, we’ve studied properties and descriptors—tools for managing specific attributes. The __getattr__ and __getattribute__ operator overloading methods provide still other ways to intercept attribute fetches for class instances. Like properties and descriptors, they allow us to insert code to be run automatically when attributes are accessed; as we’ll see, though, these two methods can be used in more general ways.

Attribute fetch interception comes in two flavors, coded with two different methods:

  • __getattr__ is run for undefined attributes—that is, attributes not stored on an instance or inherited from one of its classes.

  • __getattribute__ is run for every attribute, so when using it you must be cautious to avoid recursive loops by passing attribute accesses to a superclass.

We met the former of these in Chapter 29; it’s available for all Python versions. The latter of these is available for new-style classes in 2.6, and for all (implicitly new-style) classes in 3.0. These two methods are representatives of a set of attribute interception methods that also includes __setattr__ and __delattr__. Because these methods have similar roles, we will generally treat them as a single topic here.

Unlike properties and descriptors, these methods are part of Python’s operator overloading protocol—specially named methods of a class, inherited by subclasses, and run automatically when instances are used in the implied built-in operation. Like all methods of a class, they each receive a first self argument when called, giving access to any required instance state information or other methods of the class.

The __getattr__ and __getattribute__ methods are also more generic than properties and descriptors—they can be used to intercept access to any (or even all) instance attribute fetches, not just the specific name to which they are assigned. Because of this, these two methods are well suited to general delegation-based coding patterns—they can be used to implement wrapper objects that manage all attribute accesses for an embedded object. By contrast, we must define one property or descriptor for every attribute we wish to intercept.

Finally, these two methods are more narrowly focused than the alternatives we considered earlier: they intercept attribute fetches only, not assignments. To also catch attribute changes by assignment, we must code a __setattr__ method—an operator overloading method run for every attribute fetch, which must take care to avoid recursive loops by routing attribute assignments through the instance namespace dictionary.

Although much less common, we can also code a __delattr__ overloading method (which must avoid looping in the same way) to intercept attribute deletions. By contrast, properties and descriptors catch get, set, and delete operations by design.

Most of these operator overloading methods were introduced earlier in the book; here, we’ll expand on their usage and study their roles in larger contexts.

The Basics

__getattr__ and __setattr__ were introduced in Chapters 29 and 31, and __getattribute__ was mentioned briefly in Chapter 31. In short, if a class defines or inherits the following methods, they will be run automatically when an instance is used in the context described by the comments to the right:

def __getattr__(self, name):        # On undefined attribute fetch [obj.name]
def __getattribute__(self, name):   # On all attribute fetch [obj.name]
def __setattr__(self, name, value): # On all attribute assignment [obj.name=value]
def __delattr__(self, name):        # On all attribute deletion [del obj.name]

In all of these, self is the subject instance object as usual, name is the string name of the attribute being accessed, and value is the object being assigned to the attribute. The two get methods normally return an attribute’s value, and the other two return nothing (None). For example, to catch every attribute fetch, we can use either of the first two methods above, and to catch every attribute assignment we can use the third:

class Catcher:
    def __getattr__(self, name):
        print('Get:', name)
    def __setattr__(self, name, value):
        print('Set:', name, value)

X = Catcher()
X.job                               # Prints "Get: job"
X.pay                               # Prints "Get: pay"
X.pay = 99                          # Prints "Set: pay 99"

Such a coding structure can be used to implement the delegation design pattern we met earlier, in Chapter 30. Because all attribute are routed to our interception methods generically, we can validate and pass them along to embedded, managed objects. The following class (borrowed from Chapter 30), for example, traces every attribute fetch made to another object passed to the wrapper class:

class Wrapper:
    def __init__(self, object):
        self.wrapped = object                    # Save object
    def __getattr__(self, attrname):
        print('Trace:', attrname)                # Trace fetch
        return getattr(self.wrapped, attrname)   # Delegate fetch

There is no such analog for properties and descriptors, short of coding accessors for every possible attribute in every possibly wrapped object.

Avoiding loops in attribute interception methods

These methods are generally straightforward to use; their only complex part is the potential for looping (a.k.a. recursing). Because __getattr__ is called for undefined attributes only, it can freely fetch other attributes within its own code. However, because __getattribute__ and __setattr__ are run for all attributes, their code needs to be careful when accessing other attributes to avoid calling themselves again and triggering a recursive loop.

For example, another attribute fetch run inside a __getattribute__ method’s code will trigger __getattribute__ again, and the code will loop until memory is exhausted:

    def __getattribute__(self, name):
        x = self.other                                # LOOPS!

To work around this, route the fetch through a higher superclass instead to skip this level’s version—the object class is always a superclass, and it serves well in this role:

    def __getattribute__(self, name):
        x = object.__getattribute__(self, 'other')    # Force higher to avoid me

For __setattr__, the situation is similar; assigning any attribute inside this method triggers __setattr__ again and creates a similar loop:

    def __setattr__(self, name, value):
        self.other = value                            # LOOPS!

To work around this problem, assign the attribute as a key in the instance’s __dict__ namespace dictionary instead. This avoids direct attribute assignment:

    def __setattr__(self, name, value):
        self.__dict__['other'] = value                # Use atttr dict to avoid me

Although it’s a less common approach, __setattr__ can also pass its own attribute assignments to a higher superclass to avoid looping, just like __getattribute__:

    def __setattr__(self, name, value):
        object.__setattr__(self, 'other', value)      # Force higher to avoid me

By contrast, though, we cannot use the __dict__ trick to avoid loops in __getattribute__:

    def __getattribute__(self, name):
        x = self.__dict__['other']                    # LOOPS!

Fetching the __dict__ attribute itself triggers __getattribute__ again, causing a recursive loop. Strange but true!

The __delattr__ method is rarely used in practice, but when it is, it is called for every attribute deletion (just as __setattr__ is called for every attribute assignment). Therefore, you must take care to avoid loops when deleting attributes, by using the same techniques: namespace dictionaries or superclass method calls.

A First Example

All this is not nearly as complicated as the prior section may have implied. To see how to put these ideas to work, here is the same first example we used for properties and descriptors in action again, this time implemented with attribute operator overloading methods. Because these methods are so generic, we test attribute names here to know when a managed attribute is being accessed; others are allowed to pass normally:

class Person:
    def __init__(self, name):               # On [Person()]
        self._name = name                   # Triggers __setattr__!

    def __getattr__(self, attr):            # On [obj.undefined]
        if attr == 'name':                  # Intercept name: not stored
            print('fetch...')
            return self._name               # Does not loop: real attr
        else:                               # Others are errors
            raise AttributeError(attr)

    def __setattr__(self, attr, value):     # On [obj.any = value]
        if attr == 'name':
            print('change...')
            attr = '_name'                  # Set internal name
        self.__dict__[attr] = value         # Avoid looping here

    def __delattr__(self, attr):            # On [del obj.any]
        if attr == 'name':
            print('remove...')
            attr = '_name'                  # Avoid looping here too
        del self.__dict__[attr]             # but much less common

bob = Person('Bob Smith')           # bob has a managed attribute
print(bob.name)                     # Runs __getattr__
bob.name = 'Robert Smith'           # Runs __setattr__
print(bob.name)
del bob.name                        # Runs __delattr__

print('-'*20)
sue = Person('Sue Jones')           # sue inherits property too
print(sue.name)
#print(Person.name.__doc__)         # No equivalent here

Notice that the attribute assignment in the __init__ constructor triggers __setattr__ too—this method catches every attribute assignment, even those within the class itself. When this code is run, the same output is produced, but this time it’s the result of Python’s normal operator overloading mechanism and our attribute interception methods:

fetch...
Bob Smith
change...
fetch...
Robert Smith
remove...
--------------------
fetch...
Sue Jones

Also note that, unlike with properties and descriptors, there’s no direct notion of specifying documentation for our attribute here; managed attributes exist within the code of our interception methods, not as distinct objects.

To achieve exactly the same results with __getattribute__, replace __getattr__ in the example with the following; because it catches all attribute fetches, this version must be careful to avoid looping by passing new fetches to a superclass, and it can’t generally assume unknown names are errors:

# Replace __getattr__ with this

    def __getattribute__(self, attr):                 # On [obj.any]
        if attr == 'name':                            # Intercept all names
            print('fetch...')
            attr = '_name'                            # Map to internal name
        return object.__getattribute__(self, attr)    # Avoid looping here

This example is equivalent to that coded for properties and descriptors, but it’s a bit artificial, and it doesn’t really highlight these tools in practice. Because they are generic, __getattr__ and __getattribute__ are probably more commonly used in delegation-base code (as sketched earlier), where attribute access is validated and routed to an embedded object. Where just a single attribute must be managed, properties and descriptors might do as well or better.

Computed Attributes

As before, our prior example doesn’t really do anything but trace attribute fetches; it’s not much more work to compute an attribute’s value when fetched. As for properties and descriptors, the following creates a virtual attribute X that runs a calculation when fetched:

class AttrSquare:
    def __init__(self, start):
        self.value = start                            # Triggers __setattr__!

    def __getattr__(self, attr):                      # On undefined attr fetch
        if attr == 'X':
            return self.value ** 2                    # value is not undefined
        else:
            raise AttributeError(attr)

    def __setattr__(self, attr, value):               # On all attr assignments
        if attr == 'X':
            attr = 'value'
        self.__dict__[attr] = value

A = AttrSquare(3)       # 2 instances of class with overloading
B = AttrSquare(32)      # Each has different state information

print(A.X)              # 3 ** 2
A.X = 4
print(A.X)              # 4 ** 2
print(B.X)              # 32 ** 2

Running this code results in the same output that we got earlier when using properties and descriptors, but this script’s mechanics are based on generic attribute interception methods:

9
16
1024

As before, we can achieve the same effect with __getattribute__ instead of __getattr__; the following replaces the fetch method with a __getattribute__ and changes the __setattr__ assignment method to avoid looping by using direct superclass method calls instead of __dict__ keys:

class AttrSquare:
    def __init__(self, start):
        self.value = start                  # Triggers __setattr__!

    def __getattribute__(self, attr):       # On all attr fetches
        if attr == 'X':
            return self.value ** 2          # Triggers __getattribute__ again!
        else:
            return object.__getattribute__(self, attr)

    def __setattr__(self, attr, value):     # On all attr assignments
        if attr == 'X':
            attr = 'value'
        object.__setattr__(self, attr, value)

When this version is run, the results are the same again. Notice the implicit routing going on in inside this class’s methods:

  • self.value=start inside the constructor triggers __setattr__

  • self.value inside __getattribute__ triggers __getattribute__ again

In fact, __getattribute__ is run twice each time we fetch attribute X. This doesn’t happen in the __getattr__ version, because the value attribute is not undefined. If you care about speed and want to avoid this, change __getattribute__ to use the superclass to fetch value as well:

    def __getattribute__(self, attr):
        if attr == 'X':
            return object.__getattribute__(self, 'value') ** 2

Of course, this still incurs a call to the superclass method, but not an additional recursive call before we get there. Add print calls to these methods to trace how and when they run.

__getattr__ and __getattribute__ Compared

To summarize the coding differences between __getattr__ and __getattribute__, the following example uses both to implement three attributes—attr1 is a class attribute, attr2 is an instance attribute, and attr3 is a virtual managed attribute computed when fetched:

class GetAttr:
    attr1 = 1
    def __init__(self):
        self.attr2 = 2
    def __getattr__(self, attr):            # On undefined attrs only
        print('get: ' + attr)               # Not attr1: inherited from class
        return 3                            # Not attr2: stored on instance

X = GetAttr()
print(X.attr1)
print(X.attr2)
print(X.attr3)

print('-'*40)

class GetAttribute(object):                 # (object) needed in 2.6 only
    attr1 = 1
    def __init__(self):
        self.attr2 = 2
    def __getattribute__(self, attr):       # On all attr fetches
        print('get: ' + attr)               # Use superclass to avoid looping here
        if attr == 'attr3':
            return 3
        else:
            return object.__getattribute__(self, attr)

X = GetAttribute()
print(X.attr1)
print(X.attr2)
print(X.attr3)

When run, the __getattr__ version intercepts only attr3 accesses, because it is undefined. The __getattribute__ version, on the other hand, intercepts all attribute fetches and must route those it does not manage to the superclass fetcher to avoid loops:

1
2
get: attr3
3
----------------------------------------
get: attr1
1
get: attr2
2
get: attr3
3

Although __getattribute__ can catch more attribute fetches than __getattr__, in practice they are often just variations on a theme—if attributes are not physically stored, the two have the same effect.

Management Techniques Compared

To summarize the coding differences in all four attribute management schemes we’ve seen in this chapter, let’s quickly step through a more comprehensive computed-attribute example using each technique. The following version uses properties to intercept and calculate attributes named square and cube. Notice how their base values are stored in names that begin with an underscore, so they don’t clash with the names of the properties themselves:

# 2 dynamically computed attributes with properties

class Powers:
    def __init__(self, square, cube):
        self._square = square                      # _square is the base value
        self._cube   = cube                        # square is the property name

    def getSquare(self):
        return self._square ** 2
    def setSquare(self, value):
        self._square = value
    square = property(getSquare, setSquare)

    def getCube(self):
        return self._cube ** 3
    cube = property(getCube)

X = Powers(3, 4)
print(X.square)      # 3 ** 2 = 9
print(X.cube)        # 4 ** 3 = 64
X.square = 5
print(X.square)      # 5 ** 2 = 25

To do the same with descriptors, we define the attributes with complete classes. Note that these descriptors store base values as instance state, so they must use leading underscores again so as not to clash with the names of descriptors (as we’ll see in the final example of this chapter, we could avoid this renaming requirement by storing base values as descriptor state instead):

# Same, but with descriptors

class DescSquare:
    def __get__(self, instance, owner):
        return instance._square ** 2
    def __set__(self, instance, value):
        instance._square = value

class DescCube:
    def __get__(self, instance, owner):
        return instance._cube ** 3

class Powers:                                  # Use (object) in 2.6
    square = DescSquare()
    cube   = DescCube()
    def __init__(self, square, cube):
        self._square = square                  # "self.square = square" works too,
        self._cube   = cube                    # because it triggers desc __set__!

X = Powers(3, 4)
print(X.square)      # 3 ** 2 = 9
print(X.cube)        # 4 ** 3 = 64
X.square = 5
print(X.square)      # 5 ** 2 = 25

To achieve the same result with __getattr__ fetch interception, we again store base values with underscore-prefixed names so that accesses to managed names are undefined and thus invoke our method; we also need to code a __setattrr__ to intercept assignments, and take care to avoid its potential for looping:

# Same, but with generic __getattr__ undefined attribute interception

class Powers:
    def __init__(self, square, cube):
        self._square = square
        self._cube   = cube

    def __getattr__(self, name):
        if name == 'square':
            return self._square ** 2
        elif name == 'cube':
            return self._cube ** 3
        else:
            raise TypeError('unknown attr:' + name)

    def __setattr__(self, name, value):
        if name == 'square':
            self.__dict__['_square'] = value
        else:
            self.__dict__[name] = value

X = Powers(3, 4)
print(X.square)      # 3 ** 2 = 9
print(X.cube)        # 4 ** 3 = 64
X.square = 5
print(X.square)      # 5 ** 2 = 25

The final option, coding this with __getattribute__, is similar to the prior version. Because we catch every attribute now, though, we must route base value fetches to a superclass to avoid looping:

# Same, but with generic __getattribute__ all attribute interception

class Powers:
    def __init__(self, square, cube):
        self._square = square
        self._cube   = cube
    def __getattribute__(self, name):
        if name == 'square':
            return object.__getattribute__(self, '_square') ** 2
        elif name == 'cube':
            return object.__getattribute__(self, '_cube') ** 3
        else:
            return object.__getattribute__(self, name)
    def __setattr__(self, name, value):
        if name == 'square':
            self.__dict__['_square'] = value
        else:
            self.__dict__[name] = value

X = Powers(3, 4)
print(X.square)      # 3 ** 2 = 9
print(X.cube)        # 4 ** 3 = 64
X.square = 5
print(X.square)      # 5 ** 2 = 25

As you can see, each technique takes a different form in code, but all four produce the same result when run:

9
64
25

For more on how these alternatives compare, and other coding options, stay tuned for a more realistic application of them in the attribute validation example in the section Example: Attribute Validations. First, though, we need to study a pitfall associated with two of these tools.

Intercepting Built-in Operation Attributes

When I introduced __getattr__ and __getattribute__, I stated that they intercept undefined and all attribute fetches, respectively, which makes them ideal for delegation-based coding patterns. While this is true for normally named attributes, their behavior needs some additional clarification: for method-name attributes implicitly fetched by built-in operations, these methods may not be run at all. This means that operator overloading method calls cannot be delegated to wrapped objects unless wrapper classes somehow redefine these methods themselves.

For example, attribute fetches for the __str__, __add__, and __getitem__ methods run implicitly by printing, + expressions, and indexing, respectively, are not routed to the generic attribute interception methods in 3.0. Specifically:

  • In Python 3.0, neither __getattr__ nor __getattribute__ is run for such attributes.

  • In Python 2.6, __getattr__ is run for such attributes if they are undefined in the class.

  • In Python 2.6, __getattribute__ is available for new-style classes only and works as it does in 3.0.

In other words, in Python 3.0 classes (and 2.6 new-style classes), there is no direct way to generically intercept built-in operations like printing and addition. In Python 2.X, the methods such operations invoke are looked up at runtime in instances, like all other attributes; in Python 3.0 such methods are looked up in classes instead.

This change makes delegation-based coding patterns more complex in 3.0, since they cannot generically intercept operator overloading method calls and route them to an embedded object. This is not a showstopper—wrapper classes can work around this constraint by redefining all relevant operator overloading methods in the wrapper itself, in order to delegate calls. These extra methods can be added either manually, with tools, or by definition in and inheritance from common superclasses. This does, however, make wrappers more work than they used to be when operator overloading methods are a part of a wrapped object’s interface.

Keep in mind that this issue applies only to __getattr__ and __getattribute__. Because properties and descriptors are defined for specific attributes only, they don’t really apply to delegation-based classes at all—a single property or descriptor cannot be used to intercept arbitrary attributes. Moreover, a class that defines both operator overloading methods and attribute interception will work correctly, regardless of the type of attribute interception defined. Our concern here is only with classes that do not have operator overloading methods defined, but try to intercept them generically.

Consider the following example, the file getattr.py, which tests various attribute types and built-in operations on instances of classes containing __getattr__ and __getattribute__ methods:

class GetAttr:
    eggs = 88             # eggs stored on class, spam on instance
    def __init__(self):
       self.spam = 77
    def __len__(self):    # len here, else __getattr__ called with __len__
        print('__len__: 42')
        return 42
    def __getattr__(self, attr):     # Provide __str__ if asked, else dummy func
        print('getattr: ' + attr)
        if attr == '__str__':
            return lambda *args: '[Getattr str]'
        else:
            return lambda *args: None

class GetAttribute(object):          # object required in 2.6, implied in 3.0
    eggs = 88                        # In 2.6 all are isinstance(object) auto
    def __init__(self):              # But must derive to get new-style tools,
        self.spam = 77               # incl __getattribute__, some __X__ defaults
    def __len__(self):
        print('__len__: 42')
        return 42
    def __getattribute__(self, attr):
        print('getattribute: ' + attr)
        if attr == '__str__':
            return lambda *args: '[GetAttribute str]'
        else:
            return lambda *args: None

for Class in GetAttr, GetAttribute:
    print('
' + Class.__name__.ljust(50, '='))

    X = Class()
    X.eggs               # Class attr
    X.spam               # Instance attr
    X.other              # Missing attr
    len(X)               # __len__ defined explicitly

    try:                 # New-styles must support [], +, call directly: redefine
        X[0]             # __getitem__?
    except:
        print('fail []')

    try:
        X + 99           # __add__?
    except:
        print('fail +')

    try:
        X()              # __call__?  (implicit via built-in)
    except:
        print('fail ()')
    X.__call__()         # __call__?  (explicit, not inherited)

    print(X.__str__())   # __str__?   (explicit, inherited from type)
    print(X)             # __str__?   (implicit via built-in)

When run under Python 2.6, __getattr__ does receive a variety of implicit attribute fetches for built-in operations, because Python looks up such attributes in instances normally. Conversely, __getattribute__ is not run for any of the operator overloading names, because such names are looked up in classes only:

C:misc> c:python26python getattr.py

GetAttr===========================================
getattr: other
__len__: 42
getattr: __getitem__
getattr: __coerce__
getattr: __add__
getattr: __call__
getattr: __call__
getattr: __str__
[Getattr str]
getattr: __str__
[Getattr str]

GetAttribute======================================
getattribute: eggs
getattribute: spam
getattribute: other
__len__: 42
fail []
fail +
fail ()
getattribute: __call__
getattribute: __str__
[GetAttribute str]
<__main__.GetAttribute object at 0x025EA1D0>

Note how __getattr__ intercepts both implicit and explicit fetches of __call__ and __str__ in 2.6 here. By contrast, __getattribute__ fails to catch implicit fetches of either attribute name for built-in operations.

Really, the __getattribute__ case is the same in 2.6 as it is in 3.0, because in 2.6 classes must be made new-style by deriving from object to use this method. This code’s object derivation is optional in 3.0 because all classes are new-style.

When run under Python 3.0, though, results for __getattr__ differ—none of the implicitly run operator overloading methods trigger either attribute interception method when their attributes are fetched by built-in operations. Python 3.0 skips the normal instance lookup mechanism when resolving such names:

C:misc> c:python30python getattr.py

GetAttr===========================================
getattr: other
__len__: 42
fail []
fail +
fail ()
getattr: __call__
<__main__.GetAttr object at 0x025D17F0>
<__main__.GetAttr object at 0x025D17F0>

GetAttribute======================================
getattribute: eggs
getattribute: spam
getattribute: other
__len__: 42
fail []
fail +
fail ()
getattribute: __call__
getattribute: __str__
[GetAttribute str]
<__main__.GetAttribute object at 0x025D1870>

We can trace these outputs back to prints in the script to see how this works:

  • __str__ access fails to be caught twice by __getattr__ in 3.0: once for the built-in print, and once for explicit fetches because a default is inherited from the class (really, from the built-in object, which is a superclass to every class).

  • __str__ fails to be caught only once by the __getattribute__ catchall, during the built-in print operation; explicit fetches bypass the inherited version.

  • __call__ fails to be caught in both schemes in 3.0 for built-in call expressions, but it is intercepted by both when fetched explicitly; unlike with __str__, there is no inherited __call__ default to defeat __getattr__.

  • __len__ is caught by both classes, simply because it is an explicitly defined method in the classes themselves—its name it is not routed to either __getattr__ or __getattribute__ in 3.0 if we delete the class’s __len__ methods.

  • All other built-in operations fail to be intercepted by both schemes in 3.0.

Again, the net effect is that operator overloading methods implicitly run by built-in operations are never routed through either attribute interception method in 3.0: Python 3.0 searches for such attributes in classes and skips instance lookup entirely.

This makes delegation-based wrapper classes more difficult to code in 3.0—if wrapped classes may contain operator overloading methods, those methods must be redefined redundantly in the wrapper class in order to delegate to the wrapped object. In general delegation tools, this can add many extra methods.

Of course, the addition of such methods can be partly automated by tools that augment classes with new methods (the class decorators and metaclasses of the next two chapters might help here). Moreover, a superclass might be able to define all these extra methods once, for inheritance in delegation-based classes. Still, delegation coding patterns require extra work in 3.0.

For a more realistic illustration of this phenomenon as well as its workaround, see the Private decorator example in the following chapter. There, we’ll see that it’s also possible to insert a __getattribute__ in the client class to retain its original type, although this method still won’t be called for operator overloading methods; printing still runs a __str__ defined in such a class directly, for example, instead of routing the request through __getattribute__.

As another example, the next section resurrects our class tutorial example. Now that you understand how attribute interception works, I’ll be able to explain one of its stranger bits.

Note

For an example of this 3.0 change at work in Python itself, see the discussion of the 3.0 os.popen object in Chapter 14. Because it is implemented with a wrapper that uses __getattr__ to delegate attribute fetches to an embedded object, it does not intercept the next(X) built-in iterator function in Python 3.0, which is defined to run __next__. It does, however, intercept and delegate explicit X.__next__() calls, because these are not routed through the built-in and are not inherited from a superclass like __str__ is.

This is equivalent to __call__ in our example—implicit calls for built-ins do not trigger __getattr__, but explicit calls to names not inherited from the class type do. In other words, this change impacts not only our delegators, but also those in the Python standard library! Given the scope of this change, it’s possible that this behavior may evolve in the future, so be sure to verify this issue in later releases.

Delegation-Based Managers Revisited

The object-oriented tutorial of Chapter 27 presented a Manager class that used object embedding and method delegation to customize its superclass, rather than inheritance. Here is the code again for reference, with some irrelevant testing removed:

class Person:
    def __init__(self, name, job=None, pay=0):
        self.name = name
        self.job  = job
        self.pay  = pay
    def lastName(self):
        return self.name.split()[-1]
    def giveRaise(self, percent):
        self.pay = int(self.pay * (1 + percent))
    def __str__(self):
        return '[Person: %s, %s]' % (self.name, self.pay)

class Manager:
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)      # Embed a Person object
    def giveRaise(self, percent, bonus=.10):
        self.person.giveRaise(percent + bonus)      # Intercept and delegate
    def __getattr__(self, attr):
        return getattr(self.person, attr)           # Delegate all other attrs
    def __str__(self):
        return str(self.person)                     # Must overload again (in 3.0)

if __name__ == '__main__':
    sue = Person('Sue Jones', job='dev', pay=100000)
    print(sue.lastName())
    sue.giveRaise(.10)
    print(sue)
    tom = Manager('Tom Jones', 50000)    # Manager.__init__
    print(tom.lastName())                # Manager.__getattr__ -> Person.lastName
    tom.giveRaise(.10)                   # Manager.giveRaise -> Person.giveRaise
    print(tom)                           # Manager.__str__ -> Person.__str__

Comments at the end of this file show which methods are invoked for a line’s operation. In particular, notice how lastName calls are undefined in Manager, and thus are routed into the generic __getattr__ and from there on to the embedded Person object. Here is the script’s output—Sue receives a 10% raise from Person, but Tom gets 20% because giveRaise is customized in Manager:

C:misc> c:python30python person.py
Jones
[Person: Sue Jones, 110000]
Jones
[Person: Tom Jones, 60000]

By contrast, though, notice what occurs when we print a Manager at the end of the script: the wrapper class’s __str__ is invoked, and it delegates to the embedded Person object’s __str__. With that in mind, watch what happens if we delete the Manager.__str__ method in this code:

# Delete the Manager __str__ method

class Manager:
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)      # Embed a Person object
    def giveRaise(self, percent, bonus=.10):
        self.person.giveRaise(percent + bonus)      # Intercept and delegate
    def __getattr__(self, attr):
        return getattr(self.person, attr)           # Delegate all other attrs

Now printing does not route its attribute fetch through the generic __getattr__ interceptor under Python 3.0 for Manager objects. Instead, a default __str__ display method inherited from the class’s implicit object superclass is looked up and run (sue still prints correctly, because Person has an explicit __str__):

C:misc> c:python30python person.py
Jones
[Person: Sue Jones, 110000]
Jones
<__main__.Manager object at 0x02A5AE30>

Curiously, running without a __str__ like this does trigger __getattr__ in Python 2.6, because operator overloading attributes are routed through this method, and classes do not inherit a default for __str__:

C:misc> c:python26python person.py
Jones
[Person: Sue Jones, 110000]
Jones
[Person: Tom Jones, 60000]

Switching to __getattribute__ won’t help 3.0 here either—like __getattr__, it is not run for operator overloading attributes implied by built-in operations in either Python 2.6 or 3.0:

# Replace __getattr_ with __getattribute__

class Manager:                                           # Use (object) in 2.6
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)           # Embed a Person object
    def giveRaise(self, percent, bonus=.10):
        self.person.giveRaise(percent + bonus)           # Intercept and delegate
    def __getattribute__(self, attr):
        print('**', attr)
        if attr in ['person', 'giveRaise']:
            return object.__getattribute__(self, attr)   # Fetch my attrs
        else:
            return getattr(self.person, attr)            # Delegate all others

Regardless of which attribute interception method is used in 3.0, we still must include a redefined __str__ in Manager (as shown above) in order to intercept printing operations and route them to the embedded Person object:

C:misc> c:python30python person.py
Jones
[Person: Sue Jones, 110000]
** lastName
** person
Jones
** giveRaise
** person
<__main__.Manager object at 0x028E0590>

Notice that __getattribute__ gets called twice here for methods—once for the method name, and again for the self.person embedded object fetch. We could avoid that with a different coding, but we would still have to redefine __str__ to catch printing, albeit differently here (self.person would cause this __getattribute__ to fail):

# Code __getattribute__ differently to minimize extra calls

class Manager:
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)
    def __getattribute__(self, attr):
        print('**', attr)
        person = object.__getattribute__(self, 'person')
        if attr == 'giveRaise':
            return lambda percent: person.giveRaise(percent+.10)
        else:
            return getattr(person, attr)
    def __str__(self):
        person = object.__getattribute__(self, 'person')
        return str(person)

When this alternative runs, our object prints properly, but only because we’ve added an explicit __str__ in the wrapper—this attribute is still not routed to our generic attribute interception method:

Jones
[Person: Sue Jones, 110000]
** lastName
Jones
** giveRaise
[Person: Tom Jones, 60000]

That short story here is that delegation-based classes like Manager must redefine some operator overloading methods (like __str__) to route them to embedded objects in Python 3.0, but not in Python 2.6 unless new-style classes are used. Our only direct options seem to be using __getattr__ and Python 2.6, or redefining operator overloading methods in wrapper classes redundantly in 3.0.

Again, this isn’t an impossible task; many wrappers can predict the set of operator overloading methods required, and tools and superclasses can automate part of this task. Moreover, not all classes use operator overloading methods (indeed, most application classes usually should not). It is, however, something to keep in mind for delegation coding models used in Python 3.0; when operator overloading methods are part of an object’s interface, wrappers must accommodate them portably by redefining them locally.

Example: Attribute Validations

To close out this chapter, let’s turn to a more realistic example, coded in all four of our attribute management schemes. The example we will use defines a CardHolder object with four attributes, three of which are managed. The managed attributes validate or transform values when fetched or stored. All four versions produce the same results for the same test code, but they implement their attributes in very different ways. The examples are included largely for self-study; although I won’t go through their code in detail, they all use concepts we’ve already explored in this chapter.

Using Properties to Validate

Our first coding uses properties to manage three attributes. As usual, we could use simple methods instead of managed attributes, but properties help if we have been using attributes in existing code already. Properties run code automatically on attribute access, but are focused on a specific set of attributes; they cannot be used to intercept all attributes generically.

To understand this code, it’s crucial to notice that the attribute assignments inside the __init__ constructor method trigger property setter methods too. When this method assigns to self.name, for example, it automatically invokes the setName method, which transforms the value and assigns it to an instance attribute called __name so it won’t clash with the property’s name.

This renaming (sometimes called name mangling) is necessary because properties use common instance state and have none of their own. Data is stored in an attribute called __name, and the attribute called name is always a property, not data.

In the end, this class manages attributes called name, age, and acct; allows the attribute addr to be accessed directly; and provides a read-only attribute called remain that is entirely virtual and computed on demand. For comparison purposes, this property-based coding weighs in at 39 lines of code:

class CardHolder:
    acctlen = 8                                # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                       # Instance data
        self.name = name                       # These trigger prop setters too
        self.age  = age                        # __X mangled to have class name
        self.addr = addr                       # addr is not managed
                                               # remain has no data
    def getName(self):
        return self.__name
    def setName(self, value):
        value = value.lower().replace(' ', '_')
        self.__name = value
    name = property(getName, setName)

    def getAge(self):
        return self.__age
    def setAge(self, value):
        if value < 0 or value > 150:
            raise ValueError('invalid age')
        else:
            self.__age = value
    age = property(getAge, setAge)

    def getAcct(self):
        return self.__acct[:-3] + '***'
    def setAcct(self, value):
        value = value.replace('-', '')
        if len(value) != self.acctlen:
            raise TypeError('invald acct number')
        else:
            self.__acct = value
    acct = property(getAcct, setAcct)

    def remainGet(self):                       # Could be a method, not attr
        return self.retireage - self.age       # Unless already using as attr
    remain = property(remainGet)

Self-test code

The following code tests our class; add this to the bottom of your file, or place the class in a module and import it first. We’ll use this same testing code for all four versions of this example. When it runs, we make two instances of our managed-attribute class and fetch and change its various attributes. Operations expected to fail are wrapped in try statements:

bob = CardHolder('1234-5678', 'Bob Smith', 40, '123 main st')
print(bob.acct, bob.name, bob.age, bob.remain, bob.addr, sep=' / ')
bob.name = 'Bob Q. Smith'
bob.age  = 50
bob.acct = '23-45-67-89'
print(bob.acct, bob.name, bob.age, bob.remain, bob.addr, sep=' / ')

sue = CardHolder('5678-12-34', 'Sue Jones', 35, '124 main st')
print(sue.acct, sue.name, sue.age, sue.remain, sue.addr, sep=' / ')
try:
    sue.age = 200
except:
    print('Bad age for Sue')

try:
    sue.remain = 5
except:
    print("Can't set sue.remain")

try:
    sue.acct = '1234567'
except:
    print('Bad acct for Sue')

Here is the output of our self-test code; again, this is the same for all versions of this example. Trace through this code to see how the class’s methods are invoked; accounts are displayed with some digits hidden, names are converted to a standard format, and time remaining until retirement is computed when fetched using a class attribute cutoff:

12345*** / bob_smith / 40 / 19.5 / 123 main st
23456*** / bob_q._smith / 50 / 9.5 / 123 main st
56781*** / sue_jones / 35 / 24.5 / 124 main st
Bad age for Sue
Can't set sue.remain
Bad acct for Sue

Using Descriptors to Validate

Now, let’s recode our example using descriptors instead of properties. As we’ve seen, descriptors are very similar to properties in terms of functionality and roles; in fact, properties are basically a restricted form of descriptor. Like properties, descriptors are designed to handle specific attributes, not generic attribute access. Unlike properties, descriptors have their own state, and they’re a more general scheme.

To understand this code, it’s again important to notice that the attribute assignments inside the __init__ constructor method trigger descriptor __set__ methods. When the constructor method assigns to self.name, for example, it automatically invokes the Name.__set__() method, which transforms the value and assigns it to a descriptor attribute called name.

Unlike in the prior property-based variant, though, in this case the actual name value is attached to the descriptor object, not the client class instance. Although we could store this value in either instance or descriptor state, the latter avoids the need to mangle names with underscores to avoid collisions. In the CardHolder client class, the attribute called name is always a descriptor object, not data. The downside of this scheme is that state stored inside a descriptor itself is class-level data which is effectively shared by all client class instances, and so cannot vary between them.

In the end, this class implements the same attributes as the prior version: it manages attributes called name, age, and acct; allows the attribute addr to be accessed directly; and provides a read-only attribute called remain that is entirely virtual and computed on demand. Notice how we must catch assignments to the remain name in its descriptor and raise an exception; as we learned earlier, if we did not do this, assigning to this attribute of an instance would silently create an instance attribute that hides the class attribute descriptor. For comparison purposes, this descriptor-based coding takes 45 lines of code:

class CardHolder:
    acctlen = 8                                  # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                         # Instance data
        self.name = name                         # These trigger __set__ calls too
        self.age  = age                          # __X not needed: in descriptor
        self.addr = addr                         # addr is not managed
                                                 # remain has no data
    class Name:
        def __get__(self, instance, owner):      # Class names: CardHolder locals
            return self.name
        def __set__(self, instance, value):
            value = value.lower().replace(' ', '_')
            self.name = value
    name = Name()

    class Age:
        def __get__(self, instance, owner):
            return self.age                             # Use descriptor data
        def __set__(self, instance, value):
            if value < 0 or value > 150:
                raise ValueError('invalid age')
            else:
                self.age = value
    age = Age()

    class Acct:
        def __get__(self, instance, owner):
            return self.acct[:-3] + '***'
        def __set__(self, instance, value):
            value = value.replace('-', '')
            if len(value) != instance.acctlen:          # Use instance class data
                raise TypeError('invald acct number')
            else:
                self.acct = value
    acct = Acct()

    class Remain:
        def __get__(self, instance, owner):
            return instance.retireage - instance.age    # Triggers Age.__get__
        def __set__(self, instance, value):
            raise TypeError('cannot set remain')        # Else set allowed here
    remain = Remain()

Using __getattr__ to Validate

As we’ve seen, the __getattr__ method intercepts all undefined attributes, so it can be more generic than using properties or descriptors. For our example, we simply test the attribute name to know when a managed attribute is being fetched; others are stored physically on the instance and so never reach __getattr__. Although this approach is more general than using properties or descriptors, extra work may be required to imitate the specific-attribute focus of other tools. We need to check names at runtime, and we must code a __setattr__ in order to intercept and validate attribute assignments.

As for the property and descriptor versions of this example, it’s critical to notice that the attribute assignments inside the __init__ constructor method trigger the class’s __setattr__ method too. When this method assigns to self.name, for example, it automatically invokes the __setattr__ method, which transforms the value and assigns it to an instance attribute called name. By storing name on the instance, it ensures that future accesses will not trigger __getattr__. In contrast, acct is stored as _acct, so that later accesses to acct do invoke __getattr__.

In the end, this class, like the prior two, manages attributes called name, age, and acct; allows the attribute addr to be accessed directly; and provides a read-only attribute called remain that is entirely virtual and is computed on demand.

For comparison purposes, this alternative comes in at 32 lines of code—7 fewer than the property-based version, and 13 fewer than the version using descriptors. Clarity matters more than code size, of course, but extra code can sometimes imply extra development and maintenance work. Probably more important here are roles: generic tools like __getattr__ may be better suited to generic delegation, while properties and descriptors are more directly designed to manage specific attributes.

Also note that the code here incurs extra calls when setting unmanaged attributes (e.g., addr), although no extra calls are incurred for fetching unmanaged attributes, since they are defined. Though this will likely result in negligible overhead for most programs, properties and descriptors incur an extra call only when managed attributes are accessed.

Here’s the __getattr__ version of our code:

class CardHolder:
    acctlen = 8                                  # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                         # Instance data
        self.name = name                         # These trigger __setattr__ too
        self.age  = age                          # _acct not mangled: name tested
        self.addr = addr                         # addr is not managed
                                                 # remain has no data
    def __getattr__(self, name):
        if name == 'acct':                           # On undefined attr fetches
            return self._acct[:-3] + '***'           # name, age, addr are defined
        elif name == 'remain':
            return self.retireage - self.age         # Doesn't trigger __getattr__
        else:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if name == 'name':                           # On all attr assignments
            value = value.lower().replace(' ', '_')  # addr stored directly
        elif name == 'age':                          # acct mangled to _acct
            if value < 0 or value > 150:
                raise ValueError('invalid age')
        elif name == 'acct':
            name  = '_acct'
            value = value.replace('-', '')
            if len(value) != self.acctlen:
                raise TypeError('invald acct number')
        elif name == 'remain':
            raise TypeError('cannot set remain')
        self.__dict__[name] = value                  # Avoid looping

Using __getattribute__ to Validate

Our final variant uses the __getattribute__ catchall to intercept attribute fetches and manage them as needed. Every attribute fetch is caught here, so we test the attribute names to detect managed attributes and route all others to the superclass for normal fetch processing. This version uses the same __setattr__ to catch assignments as the prior version.

The code works very much like the __getattr__ version, so I won’t repeat the full description here. Note, though, that because every attribute fetch is routed to __getattribute__, we don’t need to mangle names to intercept them here (acct is stored as acct). On the other hand, this code must take care to route nonmanaged attribute fetches to a superclass to avoid looping.

Also notice that this version incurs extra calls for both setting and fetching unmanaged attributes (e.g., addr); if speed is paramount, this alternative may be the slowest of the bunch. For comparison purposes, this version amounts to 32 lines of code, just like the prior version:

class CardHolder:
    acctlen = 8                                  # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                         # Instance data
        self.name = name                         # These trigger __setattr__ too
        self.age  = age                          # acct not mangled: name tested
        self.addr = addr                         # addr is not managed
                                                 # remain has no data
    def __getattribute__(self, name):
        superget = object.__getattribute__             # Don't loop: one level up
        if name == 'acct':                             # On all attr fetches
            return superget(self, 'acct')[:-3] + '***'
        elif name == 'remain':
            return superget(self, 'retireage') - superget(self, 'age')
        else:
            return superget(self, name)                # name, age, addr: stored

    def __setattr__(self, name, value):
        if name == 'name':                             # On all attr assignments
            value = value.lower().replace(' ', '_')    # addr stored directly
        elif name == 'age':
            if value < 0 or value > 150:
                raise ValueError('invalid age')
        elif name == 'acct':
            value = value.replace('-', '')
            if len(value) != self.acctlen:
                raise TypeError('invald acct number')
        elif name == 'remain':
            raise TypeError('cannot set remain')
        self.__dict__[name] = value                     # Avoid loops, orig names

Be sure to study and run this section’s code on your own for more pointers on managed attribute coding techniques.

Chapter Summary

This chapter covered the various techniques for managing access to attributes in Python, including the __getattr__ and __getattribute__ operator overloading methods, class properties, and attribute descriptors. Along the way, it compared and contrasted these tools and presented a handful of use cases to demonstrate their behavior.

Chapter 38 continues our tool-building survey with a look at decorators—code run automatically at function and class creation time, rather than on attribute access. Before we continue, though, let’s work through a set of questions to review what we’ve covered here.

Test Your Knowledge: Quiz

  1. How do __getattr__ and __getattribute__ differ?

  2. How do properties and descriptors differ?

  3. How are properties and decorators related?

  4. What are the main functional differences between __getattr__ and __getattribute__ and properties and descriptors?

  5. Isn’t all this feature comparison just a kind of argument?

Test Your Knowledge: Answers

  1. The __getattr__ method is run for fetches of undefined attributes only—i.e., those not present on an instance and not inherited from any of its classes. By contrast, the __getattribute__ method is called for every attribute fetch, whether the attribute is defined or not. Because of this, code inside a __getattr__ can freely fetch other attributes if they are defined, whereas __getattribute__ must use special code for all such attribute fetches to avoid looping (it must route fetches to a superclass to skip itself).

  2. Properties serve a specific role, while descriptors are more general. Properties define get, set, and delete functions for a specific attribute; descriptors provide a class with methods for these actions, too, but they provide extra flexibility to support more arbitrary actions. In fact, properties are really a simple way to create a specific kind of descriptor—one that runs functions on attribute accesses. Coding differs too: a property is created with a built-in function, and a descriptor is coded with a class; as such, descriptors can leverage all the usual OOP features of classes, such as inheritance. Moreover, in addition to the instance’s state information, descriptors have local state of their own, so they can avoid name collisions in the instance.

  3. Properties can be coded with decorator syntax. Because the property built-in accepts a single function argument, it can be used directly as a function decorator to define a fetch access property. Due to the name rebinding behavior of decorators, the name of the decorated function is assigned to a property whose get accessor is set to the original function decorated (name = property(name)). Property setter and deleter attributes allow us to further add set and delete accessors with decoration syntax—they set the accessor to the decorated function and return the augmented property.

  4. The __getattr__ and __getattribute__ methods are more generic: they can be used to catch arbitrarily many attributes. In contrast, each property or descriptor provides access interception for only one specific attribute—we can’t catch every attribute fetch with a single property or descriptor. On the other hand, properties and descriptors handle both attribute fetch and assignment by design: __getattr__ and __getattribute__ handle fetches only; to intercept assignments as well, __setattr__ must also be coded. The implementation is also different: __getattr__ and __getattribute__ are operator overloading methods, whereas properties and descriptors are objects manually assigned to class attributes.

  5. No it isn’t. To quote from Python namesake Monty Python’s Flying Circus:

    An argument is a connected series of statements intended to establish a
    proposition.
    No it isn't.
    Yes it is! It's not just contradiction.
    Look, if I argue with you, I must take up a contrary position.
    Yes, but that's not just saying "No it isn't."
    Yes it is!
    No it isn't!
    Yes it is!
    No it isn't. Argument is an intellectual process. Contradiction is just
    the automatic gainsaying of any statement the other person makes.
    (short pause)
    No it isn't.
    It is.
    Not at all.
    Now look...
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.220.163.88