Who can access my data?

Most object-oriented programming languages have a concept of "access control". This is related to abstraction. Some attributes and methods on an object are marked "private", meaning only that object can access them. Others are marked "protected", meaning only that class and any subclasses have access. The rest are "public", meaning any other object is allowed to access them.

Python doesn't do that. Python doesn't really believe in enforcing laws that might someday get in your way. Instead, it provides unenforced guidelines and best practices. Technically, all methods and attributes on a class are publicly available. If we want to suggest that a method should not be used publicly, we can put a note in docstrings indicating if a method is meant for internal use only, (preferably with an explanation of how the public-facing API works!).

By convention, we can also prefix an attribute or method with an underscore character: _. Most Python programmers will interpret this as, "This is an internal variable, think three times before accessing it directly". But there is nothing stopping them from accessing it if they think it is in their best interest to do so. Yet, if they think so, why should we stop them? We may not have any idea what future uses our classes may be put to.

There's another thing you can do to strongly suggest that outside objects don't access a property or method. Prefix it with a double underscore: __. This will perform name mangling on the attribute in question. This basically means that the method can still be called by outside objects if they really want to do it, but it requires extra work and is a strong indicator that you think your attribute should remain private. For example:

	class SecretString:
		'''A not-at-all secure way to store a secret string.'''
		
		def __init__(self, plain_string, pass_phrase):
			self.__plain_string = plain_string
			self.__pass_phrase = pass_phrase	
			
		def decrypt(self, pass_phrase):
			'''Only show the string if the pass_phrase is correct.'''
			if pass_phrase == self.__pass_phrase:
				return self.__plain_string
			else:
				return ''

If we load this class and test it in the interactive interpreter, we can see that it hides the plaintext string from the outside world:


>>> secret_string = SecretString("ACME: Top Secret", "antwerp")
>>> print(secret_string.decrypt("antwerp"))
ACME: Top Secret
>>> print(secret_string.__plain_text)
Traceback (most recent call last):
	File "<stdin>", line 1, in <module>
AttributeError: 'SecretString' object has no attribute
'__plain_text'
>>>

It looks like it works. Nobody can access our plain_text attribute without the passphrase, so it must be safe. Before we get too excited, though, let's see how easy it can be to hack our security:


>>> print(secret_string._SecretString__plain_string)
ACME: Top Secret

Oh No! Somebody has hacked our secret string. Good thing we checked! This is Python name mangling at work. When we use a double underscore, the property is prefixed with _<classname>. When methods in the class internally access the variable, they are automatically unmangled. When external classes wish to access it, they have to do the name mangling themselves. So name mangling does not guarantee privacy, it only strongly recommends it. Most Python programmers will not touch a double-underscore variable on another object unless they have an extremely compelling reason to do so.

However, most Python programmers will not touch a single-underscore variable without a compelling reason either. For the most part, there is no good reason to use a name-mangled variable in Python, and doing so can cause grief. For example, a name-mangled variable may be useful to a subclass, and it would have to do the mangling itself. Let other objects access your hidden information if they want to, just let them know, using a single-underscore prefix or some clear docstrings that you think this is not a good idea.

Finally, we can import code directly from packages, as opposed to just modules inside packages. In our earlier example, we had an ecommerce package containing two modules named database.py and products.py. The database module contains a db variable that is accessed from a lot of places. Wouldn't it be convenient if this could be imported as import ecommerce.db instead of import ecommerce.database.db?

Remember the __init__.py file that defines a directory as a package? That file can contain any variables or class declarations we like, and they will be available as part of the package. In our example, if the ecommerce/__init__.py file contained this line:

	from .database import db

We could then access the db attribute from main.py or any other file using this import:

	from ecommerce import db

It might help to think of the __init__.py file as if it was an ecommerce.py file if that file were a module instead of a package. This can also be useful if you put all your code in a single module and later decide to break it up into a package of modules. The __init__.py file for the new package can still be the main point of contact for other modules talking to it, but the code can be internally organized into several different modules or subpackages.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.145.9.148