© Jacob Zimmerman 2018
Jacob Zimmerman, Python Descriptors, https://doi.org/10.1007/978-1-4842-3727-4_5

5. Attribute Access and Descriptors

Jacob Zimmerman, New York, USA

It was stated earlier that attribute access calls are transformed into descriptor calls, but it was not stated how. The quick answer is that __getattribute__(), __setattr__(), and __delattr__() do it. That probably isn’t much of an answer for you, so I’ll dig into it more. These three methods exist on all normal objects, inherited via the object class (and classes inherit them from the type metaclass). As you might imagine, these methods are called when an attribute on an object is retrieved, set, or deleted, respectively, and it is these methods that decide whether to use a descriptor, __dict__, or __slots__, and whether to return/set something on the class or on the instance.

An explanation of this decision process comes shortly, but first I have to explain something that may be nagging you: Why do the set and delete methods end with attr, but the get method ends with attribute?

Part of the answer is that there actually is a __getattr__() method, but it’s not used quite the same way as the others. __getattribute__() handles all the normal attribute lookup logic, while __getattr__() is called as a last-ditch effort only when that normal lookup fails with an AttributeError. Python’s documentation recommends that you don’t override __getattribute__() except under extreme circumstances, and only if you really know what you’re doing. With some experience, I can concur with that recommendation.
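As a minimal sketch of that division of labor, the following class overrides only __getattr__(). The class name and attribute names are hypothetical, chosen just for illustration.

```python
class Fallback:
    """Hypothetical class: only __getattr__ is overridden."""

    def __init__(self):
        self.real = "found normally"

    def __getattr__(self, name):
        # Reached only after the normal lookup has raised AttributeError.
        return f"fallback for {name!r}"


obj = Fallback()
found = obj.real        # normal lookup succeeds; __getattr__ is not consulted
missing = obj.missing   # normal lookup fails, so __getattr__ supplies a value
print(found)
print(missing)
```

Because __getattr__() is only a fallback, overriding it leaves the ordinary lookup untouched, which is one reason it is the safer of the two hooks to customize.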

I don’t know why setting and deleting don’t have a similar setup, but I can theorize. It might have to do with the idea that a typical override of attribute lookup is as a failsafe if the usual ways don’t work, but if someone is overriding one or both of the others, there’s a decent chance that it may be a complete replacement or at least the first thing tried instead of the backup thing. Plus, there’s the fact that, under normal circumstances (doesn’t use __slots__, isn’t a named tuple, etc.), setting always works and deleting is pretty rare. But you may want to ask one of the core developers if you’re really that curious.

One last clarification: near the beginning of the book, I said that attribute access gets “transformed” into calls to the descriptor methods. This makes it sound like it’s a compile-time decision, but it’s not. Python is a dynamically typed language, and it isn’t supposed to know at compile time whether an attribute exists on an object and whether it needs to be accessed like a descriptor or just a normal attribute, especially since this can change at runtime. It can make certain guesses based on the code around it, but it can never be 100% sure.

No, attribute use effectively gets transformed into calls to the descriptor methods inside the three methods mentioned previously, which define how the language decides what to do. This is the really dynamic part. So let’s move on and see what this decision-making process looks like.
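To see that the decision really is made at runtime, consider this small sketch. The Point class is hypothetical; the same expression, p.x, behaves differently once the class attribute is swapped for a descriptor while the program is running.

```python
class Point:
    """Hypothetical class used only for illustration."""
    x = 1  # a plain class attribute


p = Point()
before = p.x  # the plain value found on the class

# Replace the class attribute with a descriptor at runtime.
Point.x = property(lambda self: 42)
after = p.x   # the same expression now goes through property.__get__()

print(before, after)
```

Nothing about the expression p.x changed between the two lookups; only the object found on the class did.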

Instance Access

Simply looking up attributes is the most complex of the three uses of attributes because there are multiple places to look for attributes: on the instance and on the class. Also, if it’s a descriptor on the class, you have two different behaviors for data and non-data descriptors.

__getattribute__() has an order of priority that describes where to look for attributes and how to deal with them. That priority is the main difference between data descriptors and non-data descriptors. Here is that list of priorities:
  • Data descriptors

  • Instance attributes

  • Non-data descriptors and class attributes

  • __getattr__ (might be called separately from __getattribute__)

The first thing __getattribute__() does is look in the class dictionary for the attribute. If it’s not found, it works its way through the method resolution order (MRO) of classes (the superclasses in a linear order) to continue looking for it. If it’s still not found, it’ll move to the next priority. If it is found, it is checked to see if it is a data descriptor. If it’s not, it moves on to the next priority. If it turns out to be a data descriptor, it’ll call __get__() and return the result, assuming it has a __get__() method. If it doesn’t have a __get__() method, then it moves on to the next priority.

That’s a lot of ifs, and that’s just within the first priority to determine whether a viable data descriptor is available to work with. Luckily, the next priority is simpler.

Next in the priority list is checking the instance dictionary (or slots, if that’s what the object is using). If it exists there, we simply return that. Otherwise, it moves to the next priority.
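The first two priorities can be seen side by side in a short sketch. DataDesc and Widget are hypothetical names; the instance dictionary is written to directly so that the descriptor’s __set__() is bypassed.

```python
class DataDesc:
    """Hypothetical data descriptor: defines both __get__ and __set__."""

    def __get__(self, instance, owner=None):
        return "from data descriptor"

    def __set__(self, instance, value):
        raise AttributeError("read-only")


class Widget:
    attr = DataDesc()


w = Widget()
# Write straight into the instance dictionary, bypassing __setattr__().
w.__dict__["attr"] = "from instance dict"
w.__dict__["other"] = "plain instance attribute"

winner = w.attr   # the data descriptor outranks the instance dictionary
plain = w.other   # no descriptor involved, so the instance value is returned
print(winner)
print(plain)
```

Even though the instance dictionary holds an entry under the same name, the data descriptor on the class is consulted first.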

In this priority, it checks through the class dictionaries again, working its way down the MRO list if needed. If nothing is found, it moves to the next priority. Otherwise, it checks the found object to see if it’s a descriptor (at this point, we only need to check if it’s a non-data descriptor because if we’ve made it this far, it’s definitely not a data descriptor). If so, it calls the descriptor’s __get__() method and returns the result. Otherwise, it simply returns the object. This time, it doesn’t have a backup of returning the descriptor object itself if it doesn’t have __get__() because it, being a non-data descriptor, guarantees that it has __get__().
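By contrast, a non-data descriptor sits below the instance dictionary, so an instance attribute of the same name shadows it. The sketch below uses hypothetical names (NonDataDesc, Gadget) to show the third priority at work.

```python
class NonDataDesc:
    """Hypothetical non-data descriptor: defines only __get__."""

    def __get__(self, instance, owner=None):
        return "from non-data descriptor"


class Gadget:
    attr = NonDataDesc()
    plain = "plain class attribute"


g = Gadget()
first = g.attr                 # nothing on the instance yet: __get__() is called
g.attr = "from instance dict"  # no __set__(), so this lands in g.__dict__
second = g.attr                # the instance attribute now shadows the descriptor
third = g.plain                # not a descriptor: the object itself is returned

print(first)
print(second)
print(third)
```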

If all else has failed up to this point, it checks with __getattr__() for any possible custom behavior regarding attribute access. If there’s nothing, an AttributeError is raised.
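The four priorities can be put together in a simplified pure-Python sketch. This is not CPython’s actual implementation, which is written in C and heavily optimized; the helper names (find_in_mro, lookup, _MISSING) and the Demo class are assumptions made for illustration, and the sketch ignores details such as a custom __getattribute__().

```python
_MISSING = object()  # sentinel meaning "not found anywhere in the MRO"


def find_in_mro(cls, name):
    """Return the first object stored under `name` in the MRO's class dicts."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def lookup(obj, name):
    """Simplified model of instance attribute lookup (a sketch, not CPython)."""
    cls = type(obj)
    cls_attr = find_in_mro(cls, name)
    attr_type = type(cls_attr)
    found = cls_attr is not _MISSING
    has_get = hasattr(attr_type, "__get__")
    is_data = hasattr(attr_type, "__set__") or hasattr(attr_type, "__delete__")

    # Priority 1: a data descriptor on the class (if it has __get__).
    if found and is_data and has_get:
        return attr_type.__get__(cls_attr, obj, cls)

    # Priority 2: the instance dictionary.
    inst_dict = getattr(obj, "__dict__", {})
    if name in inst_dict:
        return inst_dict[name]

    # Priority 3: a non-data descriptor or a plain class attribute.
    if found:
        if has_get:
            return attr_type.__get__(cls_attr, obj, cls)
        return cls_attr

    # Priority 4: __getattr__, if the class defines one.
    hook = find_in_mro(cls, "__getattr__")
    if hook is not _MISSING:
        return hook(obj, name)
    raise AttributeError(name)


class Demo:
    """Hypothetical class exercising every priority."""
    kind = "class attribute"

    def __init__(self):
        self.__dict__["prop"] = "shadowed instance value"
        self.own = "instance attribute"

    @property
    def prop(self):
        return "property"

    def method(self):
        return "method"

    def __getattr__(self, name):
        return "fallback"


d = Demo()
for attr_name in ("prop", "own", "kind", "nope"):
    print(attr_name, "->", lookup(d, attr_name))
print("method ->", lookup(d, "method")())
```

For each name tried here, lookup() returns the same value as the built-in getattr(), which is a useful sanity check when reading the flowchart.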

With this complicated definition, Python users should be grateful that a lot of work has been put into optimizing this access algorithm to the point that it’s remarkably fast. The flowchart in Figure 5-1 shows how descriptors are accessed, with blue bands denoting each priority.
Figure 5-1

Class Access

In the common case where the class’s metaclass is type, or there are no new attributes on the metaclass, class access can be viewed in a simplified way compared to instance access; it doesn’t even have a priority list. It still uses __getattribute__(), but it’s the one defined on its metaclass. It simply searches through the class dictionaries, progressing through the MRO as needed. If found, it checks to see if it’s a descriptor with the __get__() method. If so, it makes the proper call and returns the result. Otherwise, it just returns the object. At the class level, though, it doesn’t care if the descriptor is data or non-data; if the descriptor has a __get__() method, the method is used.

If nothing was found, an AttributeError is raised, as shown in Figure 5-2.
Figure 5-2

An AttributeError is raised
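The simplified class-access rules can be illustrated with a short sketch. Reporter, Base, and Child are hypothetical names; the descriptor simply reports whether it was reached through the class or through an instance.

```python
class Reporter:
    """Hypothetical descriptor that reports how it was reached."""

    def __get__(self, instance, owner=None):
        if instance is None:
            # Class access: no instance is available, only the owner class.
            return f"class access via {owner.__name__}"
        return "instance access"


class Base:
    attr = Reporter()
    plain = 10


class Child(Base):
    pass


via_class = Child.attr       # found on Base through the MRO; instance is None
via_instance = Child().attr
plain_value = Child.plain    # not a descriptor: the object itself is returned

try:
    Child.missing
    outcome = "found"
except AttributeError:
    outcome = "AttributeError"

print(via_class)
print(via_instance)
print(plain_value)
print(outcome)
```

Note that the owner passed to __get__() is the class the lookup started from (Child), not the class whose dictionary holds the descriptor (Base).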

Unfortunately, if there are new attributes on the metaclass, this simplification is unhelpful, since they might be used in the lookup. In fact, class access looks almost exactly like instance access (replacing “class” with “metaclass” and “instance” with “class”) with one big difference. Instead of checking just the current instance/class dictionary, it checks through the MRO of it as well. It also still treats descriptors on the class as descriptors, rather than automatically returning the descriptor object. Knowing this, Figure 5-3 shows the full class access diagram, with all the priority levels.
Figure 5-3

The full class access diagram
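When the metaclass does define attributes, the full priority list applies one level up. In the following sketch (Meta, Base, and Thing are hypothetical names), a property on the metaclass acts as a data descriptor for class access, while a plain metaclass attribute is outranked by an attribute found in the class’s own MRO.

```python
class Meta(type):
    @property
    def tag(cls):
        # A property is a data descriptor, here living on the metaclass.
        return "from metaclass property"

    label = "from metaclass"  # a plain (non-descriptor) metaclass attribute


class Base(metaclass=Meta):
    label = "from Base"


class Thing(Base):
    tag = "from class dict"


class_tag = Thing.tag         # the metaclass data descriptor wins
class_label = Thing.label     # found via Thing's MRO, ahead of Meta.label
instance_tag = Thing().tag    # instance access never consults the metaclass

print(class_tag)
print(class_label)
print(instance_tag)
```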

Set and Delete Calls

Setting and deleting are just a little bit different. If the attribute is a data descriptor but lacks the required __set__() or __delete__() method, an AttributeError is raised. The other difference is the fact that setting and deleting never get beyond the instance priority. If the attribute doesn’t exist on the instance, setting will add it and deleting will raise an AttributeError.

Figure 5-4 shows the last flowchart, depicting what happens for setting and deleting.
Figure 5-4

The setting and deleting processes
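The setting and deleting rules can be exercised with a small sketch. Stored and Account are hypothetical names; Stored is a data descriptor that defines __set__() but deliberately omits __delete__().

```python
class Stored:
    """Hypothetical data descriptor: has __set__ but no __delete__."""

    def __set_name__(self, owner, name):
        self.key = "_" + name  # where the value is kept on the instance

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.key)

    def __set__(self, instance, value):
        setattr(instance, self.key, value)


class Account:
    balance = Stored()


acct = Account()
acct.balance = 100   # routed to Stored.__set__(), stored as acct._balance
acct.note = "hello"  # no descriptor involved: added to the instance dict

errors = []
try:
    del acct.balance  # data descriptor without __delete__()
except AttributeError:
    errors.append("del balance")

del acct.note         # exists on the instance, so it is removed
try:
    del acct.note     # no longer on the instance
except AttributeError:
    errors.append("del note (second time)")

print(vars(acct))
print(errors)
```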

The Reasoning Behind Data versus Non-Data Descriptors

Now that the difference between data and non-data descriptors has been explained, it’s worth looking at why both kinds exist.

The first place to look at is the built-in use cases for each type within the language and standard library. The prime example of a data descriptor is property. As its name suggests, its purpose is to create properties for classes (replace getter and setter methods with a syntax that looks like simple attribute use). That means class-level access is not intended since properties represent fields on an instance.

Meanwhile, the primary use-case for non-data descriptors is decorating methods for different usages (classmethod, staticmethod, and especially the implicit descriptor used for normal methods). While these can be called from instances (and normal methods should be called from instances), they’re not meant to be set or deleted from instances. Methods are assigned on the class. A function can be assigned to an instance attribute, but it doesn’t make it a method, since self is not automatically provided as the first argument when called. Also, when it comes to the “magic” dunder methods (methods with two leading and two trailing underscores) being called through the normal, “magical” way, Python is optimized to look directly on the class, skipping over anything that may have been assigned to the instance.
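Both points about methods can be checked with a brief sketch. Greeter is a hypothetical class; the lambdas assigned to the instance take no arguments precisely because nothing binds self for them.

```python
class Greeter:
    def greet(self):
        return "hello from method"

    def __len__(self):
        return 3


g = Greeter()

# A function stored on the instance is returned as-is; self is not supplied.
g.shout = lambda: "no self passed"
shouted = g.shout()

# Functions are non-data descriptors: binding happens through __get__().
bound = Greeter.__dict__["greet"].__get__(g, Greeter)
greeted = bound()

# A dunder method assigned to the instance is skipped by the built-in.
g.__len__ = lambda: 99
implicit = len(g)        # looks directly on the class
explicit = g.__len__()   # ordinary lookup finds the instance attribute

print(shouted, greeted, implicit, explicit)
```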

Summary

Rarely is it useful to know the full depth of what is happening behind the scenes of attribute calls, and even knowing the basic priority list rarely comes into play, since descriptors generally do what is obvious, once you understand how they’re accessed. There are times, though, when the priority list, and possibly even the full depth, will help in understanding why a descriptor isn’t working as hoped or how to set up a descriptor to do a more complicated task.
