Private methods and name mangling

If you have any background with languages like Java, C#, or C++, then you know they allow the programmer to assign a privacy status to attributes (both data and methods). Each language has its own slightly different flavor for this, but the gist is that public attributes are accessible from any point in the code, while private ones are accessible only within the scope they are defined in.

In Python, there is no such thing. Everything is public; therefore, we rely on conventions and on a mechanism called name mangling.

The convention is as follows: if an attribute's name has no leading underscores, it is considered public. This means you can access it and modify it freely. When the name has one leading underscore, the attribute is considered private, which means it's probably meant to be used internally and you should not use it or modify it from the outside. A very common use case for private attributes are helper methods that are supposed to be used by public ones (possibly in call chains in conjunction with other methods), and internal data, such as scaling factors, or any other data that ideally we would put in a constant (a variable that cannot change, but, surprise, surprise, Python doesn't have those either).

This characteristic usually scares people from other backgrounds off; they feel threatened by the lack of privacy. To be honest, in my whole professional experience with Python, I've never heard anyone screaming "oh my God, we have a terrible bug because Python lacks private attributes!" Not once, I swear.

That said, the call for privacy actually makes sense because without it, you risk introducing bugs into your code for real. Let me show you what I mean:

# oop/private.attrs.py
class A:
def __init__(self, factor):
self._factor = factor

def op1(self):
print('Op1 with factor {}...'.format(self._factor))

class B(A):
def op2(self, factor):
self._factor = factor
print('Op2 with factor {}...'.format(self._factor))


obj = B(100)
obj.op1() # Op1 with factor 100...
obj.op2(42) # Op2 with factor 42...
obj.op1() # Op1 with factor 42... <- This is BAD

In the preceding code, we have an attribute called _factor, and let's pretend it's so important that it isn't modified at runtime after the instance is created, because op1 depends on it to function correctly. We've named it with a leading underscore, but the issue here is that when we call obj.op2(42), we modify it, and this is reflected in subsequent calls to op1.

Let's fix this undesired behavior by adding another leading underscore:

# oop/private.attrs.fixed.py
class A:
def __init__(self, factor):
self.__factor = factor

def op1(self):
print('Op1 with factor {}...'.format(self.__factor))

class B(A):
def op2(self, factor):
self.__factor = factor
print('Op2 with factor {}...'.format(self.__factor))


obj = B(100)
obj.op1() # Op1 with factor 100...
obj.op2(42) # Op2 with factor 42...
obj.op1() # Op1 with factor 100... <- Wohoo! Now it's GOOD!

Wow, look at that! Now it's working as desired. Python is kind of magic and in this case, what is happening is that the name-mangling mechanism has kicked in.

Name mangling means that any attribute name that has at least two leading underscores and at most one trailing underscore, such as __my_attr, is replaced with a name that includes an underscore and the class name before the actual name, such as _ClassName__my_attr.

This means that when you inherit from a class, the mangling mechanism gives your private attribute two different names in the base and child classes so that name collision is avoided. Every class and instance object stores references to their attributes in a special attribute called __dict__, so let's inspect obj.__dict__ to see name mangling in action:

# oop/private.attrs.py
print(obj.__dict__.keys()) # dict_keys(['_factor'])

This is the _factor attribute that we find in the problematic version of this example. But look at the one that is using __factor:

# oop/private.attrs.fixed.py
print(obj.__dict__.keys()) # dict_keys(['_A__factor', '_B__factor'])

See? obj has two attributes now, _A__factor (mangled within the A class), and _B__factor (mangled within the B class). This is the mechanism that ensures that when you do obj.__factor = 42, __factor in A isn't changed, because you're actually touching _B__factor, which leaves _A__factor safe and sound.

If you're designing a library with classes that are meant to be used and extended by other developers, you will need to keep this in mind in order to avoid the unintentional overriding of your attributes. Bugs like these can be pretty subtle and hard to spot.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.12.162.37