Let’s step back for a moment and consider how far we’ve come. At this point, we’ve created a database of records: the shelve and per-record pickle file approaches of the prior section both suffice for basic data storage tasks. As is, our records are represented as simple dictionaries, which provide easier-to-understand access to fields than do lists (by key, rather than by position). Dictionaries, however, still have some limitations that may become more critical as our program grows over time.
For one thing, there is no central place for us to collect record processing logic. Extracting last names and giving raises, for instance, can be accomplished with code like the following:
>>> import shelve
>>> db = shelve.open('people-shelve')
>>> bob = db['bob']
>>> bob['name'].split()[-1]             # get bob's last name
'Smith'
>>> sue = db['sue']
>>> sue['pay'] *= 1.25                  # give sue a raise
>>> sue['pay']
75000.0
>>> db['sue'] = sue
>>> db.close()
This works, and it might suffice for some short programs. But if we ever need to change the way last names and raises are implemented, we might have to update this kind of code in many places in our program. In fact, even finding all such magical code snippets could be a challenge; hardcoding or cutting and pasting bits of logic redundantly like this in more than one place will almost always come back to haunt you eventually.
It would be better to somehow hide—that is, encapsulate—such bits of code. Functions in a module would allow us to implement such operations in a single place and thus avoid code redundancy, but still wouldn’t naturally associate them with the records themselves. What we’d like is a way to bind processing logic with the data stored in the database in order to make it easier to understand, debug, and reuse.
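To make the module-function option concrete, here is a minimal sketch of what it might look like. The function names last_name and give_raise are hypothetical (they do not appear in the book's examples), and the code uses Python 3 print syntax rather than the Python 2 form of the listings in this section.

```python
# Hypothetical helper functions; in practice they might live in a
# shared module so that every script imports the same logic.

def last_name(record):
    """Return the last word of the record's 'name' field."""
    return record['name'].split()[-1]

def give_raise(record, percent):
    """Increase the record's 'pay' field in place by the given fraction."""
    record['pay'] *= (1.0 + percent)

# Records are still plain dictionaries, as in the prior section.
bob = {'name': 'Bob Smith', 'age': 42, 'pay': 30000, 'job': 'sweng'}
sue = {'name': 'Sue Jones', 'age': 45, 'pay': 40000, 'job': 'music'}

print(last_name(bob))       # Smith
give_raise(sue, .25)
print(sue['pay'])           # 50000.0
```

The logic now lives in one place, but nothing ties these functions to the records: a caller has to know that they exist and has to pass each record in explicitly.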
Another downside to using dictionaries for records is that they are difficult to expand over time. For example, suppose that the set of data fields or the procedure for giving raises is different for different kinds of people (perhaps some people get a bonus each year and some do not). If we ever need to extend our program, there is no natural way to customize simple dictionaries. For future growth, we’d also like our software to support extension and customization in a natural way.
This is where Python’s OOP support begins to become attractive:
With OOP, we can naturally associate processing logic with record data—classes provide both a program unit that combines logic and data in a single package and a hierarchy that allows code to be easily factored to avoid redundancy.
With OOP, we can also wrap up details such as name processing and pay increases behind method functions—i.e., we are free to change method implementations without breaking their users.
And with OOP, we have a natural growth path. Classes can be extended and customized by coding new subclasses, without changing or breaking already working code.
That is, under OOP, we program by customizing and reusing, not by rewriting. OOP is an option in Python and, frankly, is sometimes better suited for strategic than for tactical tasks. It tends to work best when you have time for upfront planning—something that might be a luxury if your users have already begun storming the gates.
But especially for larger systems that change over time, its code reuse and structuring advantages far outweigh its learning curve, and it can substantially cut development time. Even in our simple case, the customizability and reduced redundancy we gain from classes can be a decided advantage.
OOP is easy to use in Python, thanks largely to Python’s dynamic typing model. In fact, it’s so easy that we’ll jump right into an example: Example 2-14 implements our database records as class instances rather than as dictionaries.
Example 2-14. PP3E\Preview\person_start.py
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age  = age
        self.pay  = pay
        self.job  = job

if __name__ == '__main__':
    bob = Person('Bob Smith', 42, 30000, 'sweng')
    sue = Person('Sue Jones', 45, 40000, 'music')
    print bob.name, sue.pay

    print bob.name.split()[-1]
    sue.pay *= 1.10
    print sue.pay
There is not much to this class: just a constructor method that fills out the instance with data passed in as arguments to the class name. It’s sufficient to represent a database record, though, and it can already provide tools such as defaults for pay and job fields that dictionaries cannot. The self-test code at the bottom of this file creates two instances (records) and accesses their attributes (fields); here is this file being run under IDLE:
>>>
Bob Smith 40000
Smith
44000.0
This isn’t a database yet, but we could stuff these objects into a list or dictionary as before in order to collect them as a unit:
>>> from person_start import Person
>>> bob = Person('Bob Smith', 42)
>>> sue = Person('Sue Jones', 45, 40000)

>>> people = [bob, sue]                 # a "database" list
>>> for person in people:
...     print person.name, person.pay
...
Bob Smith 0
Sue Jones 40000

>>> x = [(person.name, person.pay) for person in people]
>>> x
[('Bob Smith', 0), ('Sue Jones', 40000)]
Notice that Bob’s pay defaulted to zero this time because we didn’t pass in a value for that argument (maybe Sue is supporting him now?). We might also implement a class that represents the database, perhaps as a subclass of the built-in list or dictionary types, with insert and delete methods that encapsulate the way the database is implemented. We’ll abandon this path for now, though, because it will be more useful to store these records persistently in a shelve, which already encapsulates stores and fetches behind an interface for us. Before we do, though, let’s add some logic.
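For reference, the database-class idea mentioned in passing above might be sketched as follows. This is only an illustration of the path not taken: the PeopleDB name and the exact behavior of its insert and delete methods are assumptions, and the code uses Python 3 print syntax.

```python
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age = age
        self.pay = pay
        self.job = job

class PeopleDB(dict):
    """Hypothetical in-memory database: a dict subclass keyed by short name."""
    def insert(self, key, person):
        # Refuse to overwrite an existing record silently.
        if key in self:
            raise KeyError('duplicate key: %r' % key)
        self[key] = person

    def delete(self, key):
        del self[key]               # raises KeyError if the key is absent

db = PeopleDB()
db.insert('bob', Person('Bob Smith', 42))
db.insert('sue', Person('Sue Jones', 45, 40000))
db.delete('bob')

print(sorted(db))                   # ['sue']
print(db['sue'].pay)                # 40000
```

Clients that go through insert and delete would not need to change if the storage behind the class were later replaced.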
So far, our class is just data: it replaces dictionary keys with object attributes, but it doesn’t add much to what we had before. To really leverage the power of classes, we need to add some behavior. By wrapping up bits of behavior in class method functions, we can insulate clients from changes. And by packaging methods in classes along with data, we provide a natural place for readers to look for code. In a sense, classes combine records and the programs that process those records; methods provide logic that interprets and updates the data.
For instance, Example 2-15 adds the last-name and raise logic as class methods; methods use the self argument to access or update the instance (record) being processed.
Example 2-15. PP3E\Preview\person.py
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age  = age
        self.pay  = pay
        self.job  = job
    def lastName(self):
        return self.name.split()[-1]
    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)

if __name__ == '__main__':
    bob = Person('Bob Smith', 42, 30000, 'sweng')
    sue = Person('Sue Jones', 45, 40000, 'music')
    print bob.name, sue.pay

    print bob.lastName()
    sue.giveRaise(.10)
    print sue.pay
The output of this script is the same as the last, but the results are being computed by methods now, not by hardcoded logic that appears redundantly wherever it is required:
>>>
Bob Smith 40000
Smith
44000.0
One last enhancement to our records before they become permanent: because they are implemented as classes now, they naturally support customization through the inheritance search mechanism in Python. Example 2-16, for instance, customizes the last section’s Person class in order to give a 10 percent bonus by default to managers whenever they receive a raise (any relation to practice in the real world is purely coincidental).
Example 2-16. PP3E\Preview\manager.py
from person import Person

class Manager(Person):
    def giveRaise(self, percent, bonus=0.1):
        self.pay *= (1.0 + percent + bonus)

if __name__ == '__main__':
    tom = Manager(name='Tom Doe', age=50, pay=50000)
    print tom.lastName()
    tom.giveRaise(.20)
    print tom.pay

>>>
Doe
65000.0
Here, the Manager class appears in a module of its own, but it could have been added to the person module instead (Python doesn’t require just one class per file). It inherits the constructor and last-name methods from its superclass, but it customizes just the raise method.
Because this change is being added as a new subclass, the original Person class, and any objects generated from it, will continue working unchanged. Bob and Sue, for example, inherit the original raise logic, but Tom gets the custom version because of the class from which he is created. In OOP, we program by customizing, not by changing.
In fact, code that uses our objects doesn’t need to be at all aware of what the raise method does; it’s up to the object to do the right thing based on the class from which it is created. As long as the object supports the expected interface (here, a method called giveRaise), it will be compatible with the calling code, regardless of its specific type, and even if its method works differently than others.
If you’ve already studied Python, you may know this behavior as polymorphism; it’s a core property of the language, and it accounts for much of your code’s flexibility. When the following code calls the giveRaise method, for example, what happens depends on the obj object being processed; Tom gets a 20 percent raise instead of 10 percent because of the Manager class’s customization:
>>> from person import Person
>>> from manager import Manager

>>> bob = Person(name='Bob Smith', age=42, pay=10000)
>>> sue = Person(name='Sue Jones', age=45, pay=20000)
>>> tom = Manager(name='Tom Doe', age=55, pay=30000)
>>> db = [bob, sue, tom]

>>> for obj in db:
...     obj.giveRaise(.10)              # default or custom
...
>>> for obj in db:
...     print obj.lastName(), '=>', obj.pay
...
Smith => 11000.0
Jones => 22000.0
Doe => 36000.0
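The point about interfaces holds even for objects that are not related to Person by inheritance at all. In the following sketch, Contractor is a hypothetical class invented for illustration; it merely happens to provide lastName and giveRaise methods, and that is enough for the same loop to process it (Python 3 print syntax).

```python
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age = age
        self.pay = pay
        self.job = job
    def lastName(self):
        return self.name.split()[-1]
    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)

class Contractor:
    """Hypothetical class: not a Person subclass, but the same interface."""
    def __init__(self, name, fee):
        self.name = name
        self.pay = fee
    def lastName(self):
        return self.name.split()[-1]
    def giveRaise(self, percent):
        self.pay += 100             # flat increase; ignores the percentage

bob = Person('Bob Smith', 42, 10000)
ann = Contractor('Ann Lee', 500)

for obj in (bob, ann):
    obj.giveRaise(.25)              # each object runs its own version
    print(obj.lastName(), '=>', obj.pay)
# Smith => 12500.0
# Lee => 600
```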
Before we move on, there are a few coding alternatives worth noting here. Most of these underscore the Python OOP model, and they serve as a quick review.
As a first alternative, notice that we have introduced some redundancy in Example 2-16: the raise calculation is now repeated in two places (in the two classes). We could also have implemented the customized Manager class by augmenting the inherited raise method instead of replacing it completely:
class Manager(Person):
    def giveRaise(self, percent, bonus=0.1):
        Person.giveRaise(self, percent + bonus)
The trick here is to call back the superclass’s version of the method directly, passing in the self argument explicitly. We still redefine the method, but we simply run the general version after adding 10 percent (by default) to the passed-in percentage. This coding pattern can help reduce code redundancy (the original raise method’s logic appears in only one place and so is easier to change) and is especially handy for kicking off superclass constructor methods in practice.
If you’ve already studied Python OOP, you know that this coding scheme works because we can always call methods through either an instance or the class name. In general, the following are equivalent, and both forms may be used explicitly:
instance.method(arg1, arg2)
class.method(instance, arg1, arg2)
In fact, the first form is mapped to the second: when calling through the instance, Python determines the class by searching the inheritance tree for the method name and passes in the instance automatically. Either way, within giveRaise, self refers to the instance that is the subject of the call.
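The equivalence of the two call forms is easy to check directly. The following sketch trims Person down to what the demonstration needs and uses Python 3 print syntax; the pay figures are arbitrary.

```python
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age = age
        self.pay = pay
        self.job = job
    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)

bob = Person('Bob Smith', 42, 10000)
sue = Person('Sue Jones', 45, 10000)

bob.giveRaise(.25)              # through the instance: self is passed implicitly
Person.giveRaise(sue, .25)      # through the class: the instance is passed explicitly

print(bob.pay, sue.pay)         # 12500.0 12500.0
```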
For more object-oriented fun, we could also add a few operator overloading methods to our people classes. For example, a __str__ method, shown here, could return a string to give the display format for our objects when they are printed as a whole, which is much better than the default display we get for an instance:
class Person:
    def __str__(self):
        return '<%s => %s>' % (self.__class__.__name__, self.name)

tom = Manager('Tom Jones', 50)
print tom                               # prints: <Manager => Tom Jones>
Here __class__ gives the lowest class from which self was made, even though __str__ may be inherited. The net effect is that __str__ allows us to print instances directly instead of having to print specific attributes. We could extend this __str__ to loop through the instance’s __dict__ attribute dictionary to display all attributes generically.
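A generic display of that kind might be coded as in the following sketch. The output format and the choice to sort attribute names are assumptions made here so that the result is predictable; the book does not prescribe either. Python 3 print syntax is used.

```python
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age = age
        self.pay = pay
        self.job = job
    def __str__(self):
        # Walk the instance's attribute dictionary instead of naming fields.
        fields = ', '.join('%s=%r' % (key, self.__dict__[key])
                           for key in sorted(self.__dict__))
        return '<%s => %s>' % (self.__class__.__name__, fields)

class Manager(Person):
    pass                            # inherits the generic __str__

tom = Manager('Tom Jones', 50)
print(tom)      # <Manager => age=50, job=None, name='Tom Jones', pay=0>
```

Because the method never names a field, attributes added to instances later show up in the display without any change to __str__.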
We might even code an __add__ method to make + expressions automatically call the giveRaise method. Whether we should is another question; the fact that a + expression gives a person a raise might seem more magical to the next person reading our code than it should.
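For the curious, such an __add__ might look like the sketch below. One design decision is assumed here: the method changes the object in place and returns it, so that an expression such as sue + .25 still yields a Person. Other designs, such as returning a modified copy, are equally plausible. Python 3 print syntax is used.

```python
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age = age
        self.pay = pay
        self.job = job
    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)
    def __add__(self, percent):
        # Assumption: + updates this object in place and returns it.
        self.giveRaise(percent)
        return self

sue = Person('Sue Jones', 45, 40000)
sue = sue + .25                 # runs __add__, which runs giveRaise
print(sue.pay)                  # 50000.0
```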
Finally, notice that we didn’t pass the job argument when making a manager in Example 2-16; if we had, it would look like this with keyword arguments:
tom = Manager(name='Tom Doe', age=50, pay=50000, job='manager')
The reason we didn’t include a job in the example is that it’s redundant with the class of the object: if someone is a manager, their class should imply their job title. Instead of leaving this field blank, though, it may make more sense to provide an explicit constructor for managers, which fills in this field automatically:
class Manager(Person):
    def __init__(self, name, age, pay):
        Person.__init__(self, name, age, pay, 'manager')
Now when a manager is created, its job is filled in automatically. The trick here is to call the superclass’s version of the method explicitly, just as we did for the giveRaise method earlier in this section; the only difference here is the unusual name for the constructor method.
We won’t use any of this section’s three extensions in later examples, but to demonstrate how they work, Example 2-17 collects these ideas in an alternative implementation of our Person classes.
Example 2-17. PP3E\Preview\people-alternative.py
"""
alternative implementation of person classes
data, behavior, and operator overloading
"""

class Person:
    """
    a general person: data+logic
    """
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age  = age
        self.pay  = pay
        self.job  = job
    def lastName(self):
        return self.name.split()[-1]
    def giveRaise(self, percent):
        self.pay *= (1.0 + percent)
    def __str__(self):
        return ('<%s => %s: %s, %s>' %
               (self.__class__.__name__, self.name, self.job, self.pay))

class Manager(Person):
    """
    a person with custom raise
    inherits general lastname, str
    """
    def __init__(self, name, age, pay):
        Person.__init__(self, name, age, pay, 'manager')
    def giveRaise(self, percent, bonus=0.1):
        Person.giveRaise(self, percent + bonus)

if __name__ == '__main__':
    bob = Person('Bob Smith', 44)
    sue = Person('Sue Jones', 47, 40000, 'music')
    tom = Manager(name='Tom Doe', age=50, pay=50000)
    print sue, sue.pay, sue.lastName()
    for obj in (bob, sue, tom):
        obj.giveRaise(.10)              # run this obj's giveRaise
        print obj                       # run common __str__ method
Notice the polymorphism in this module’s self-test loop: all three objects share the last-name and printing methods (and the manager’s constructor simply delegates to the general one), but the raise method called is dependent upon the class from which an instance is created. When run, Example 2-17 prints the following to standard output: the manager’s job is filled in at construction, we get the new custom display format for our objects, and the new version of the manager’s raise method works as before:
<Person => Sue Jones: music, 40000> 40000 Jones
<Person => Bob Smith: None, 0.0>
<Person => Sue Jones: music, 44000.0>
<Manager => Tom Doe: manager, 60000.0>
Such refactoring (restructuring) of code is common as class hierarchies grow and evolve. In fact, as is, we still can’t give someone a raise if his pay is zero (Bob is out of luck); we probably need a way to set pay, too, but we’ll leave such extensions for the next release. The good news is that Python’s flexibility and readability make refactoring easy—it’s simple and quick to restructure your code. If you haven’t used the language yet, you’ll find that Python development is largely an exercise in rapid, incremental, and interactive programming, which is well suited to the shifting needs of real-world projects.
It’s time for a status update. We now have customizable implementations of our records and their processing logic, encapsulated in the form of classes. Making our class-based records persistent is a minor last step. We could store them in per-record pickle files again, but a shelve-based storage medium will do just as well for our goals and is often easier to code. Example 2-18 shows how.
Example 2-18. PP3E\Preview\make_db_classes.py
import shelve
from person import Person
from manager import Manager

bob = Person('Bob Smith', 42, 30000, 'sweng')
sue = Person('Sue Jones', 45, 40000, 'music')
tom = Manager('Tom Doe', 50, 50000)

db = shelve.open('class-shelve')
db['bob'] = bob
db['sue'] = sue
db['tom'] = tom
db.close()
This file creates three class instances (two from the original class and one from its customization) and assigns them to keys in a newly created shelve file to store them permanently. In other words, it creates a shelve of class instances; to our code, the database looks just like a dictionary of class instances, but the top-level dictionary is mapped to a shelve file again. To check our work, Example 2-19 reads the shelve and prints fields of its records.
Example 2-19. PP3E\Preview\dump_db_class.py
import shelve
db = shelve.open('class-shelve')
for key in db:
    print key, '=> ', db[key].name, db[key].pay

bob = db['bob']
print bob.lastName()
print db['tom'].lastName()
Note that we don’t need to reimport the Person class here in order to fetch its instances from the shelve or run their methods. When instances are shelved or pickled, the underlying pickling system records both instance attributes and enough information to locate their classes automatically when they are later fetched (the class’s module simply has to be on the module search path when an instance is loaded). This is on purpose; because the class and its instances in the shelve are stored separately, you can change the class to modify the way stored instances are interpreted when loaded (more on this later in the book). Here is the shelve dump script running under IDLE just after creating the shelve:
>>>
tom =>  Tom Doe 50000
bob =>  Bob Smith 30000
sue =>  Sue Jones 40000
Smith
Doe
As shown in Example 2-20, database updates are as simple as before, but dictionary keys become object attributes and updates are implemented by method calls, not by hardcoded logic. Notice how we still fetch, update, and reassign to keys to update the shelve.
Example 2-20. PP3E\Preview\update_db_class.py
import shelve
db = shelve.open('class-shelve')

sue = db['sue']
sue.giveRaise(.25)
db['sue'] = sue

tom = db['tom']
tom.giveRaise(.20)
db['tom'] = tom
db.close()
And last but not least, here is the dump script again after running the update script; Tom and Sue have new pay values, because these objects are now persistent in the shelve. We could also open and inspect the shelve by typing code at Python’s interactive command line; despite its longevity, the shelve is just a Python object containing Python objects.
>>>
tom =>  Tom Doe 65000.0
bob =>  Bob Smith 30000
sue =>  Sue Jones 50000.0
Smith
Doe
The raises given to Tom and Sue are retained across program runs because both are persistent objects in the shelve database. Although shelves can store simpler object types such as lists and dictionaries, class instances allow us to combine both data and behavior for our stored items. In a sense, instance attributes and class methods take the place of records and processing programs in more traditional schemes.
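The fetch, update, and reassign steps in the update script are all required: with a shelve opened in its default mode, each fetch returns a fresh in-memory copy of the stored object, so changing that copy does not touch the file until it is assigned back to its key. The sketch below demonstrates this with a plain dictionary record and a throwaway file name (both assumptions, chosen so the code runs without the person module); class instances behave the same way. Python 3 syntax is used.

```python
import os
import shelve
import tempfile

# Work in a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'demo-shelve')      # hypothetical file name

    db = shelve.open(path)
    db['sue'] = {'name': 'Sue Jones', 'pay': 40000}
    db.close()

    db = shelve.open(path)
    db['sue']['pay'] = 99999        # changes a temporary copy only
    lost = db['sue']['pay']         # a new fetch still sees the old value

    sue = db['sue']                 # fetch
    sue['pay'] = 50000              # update in memory
    db['sue'] = sue                 # reassign to write the change back
    db.close()

    db = shelve.open(path)
    saved = db['sue']['pay']
    db.close()

print(lost, saved)                  # 40000 50000
```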
At this point, we have a full-fledged database system: our classes simultaneously implement record data and record processing, and they encapsulate the implementation of the behavior. And the Python pickle and shelve modules provide simple ways to store our database persistently between program executions. This is not a relational database (we store objects, not tables, and queries take the form of Python object processing code), but it is sufficient for many kinds of programs.
If we need more functionality, we could migrate this application to even more powerful tools. For example, should we ever need full-blown SQL query support, there are interfaces that allow Python scripts to communicate with relational databases such as MySQL, PostgreSQL, and Oracle in portable ways.
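As a rough idea of what the SQL route looks like, the following sketch stores the same three records in a relational table. It deliberately swaps in the standard library’s sqlite3 module with an in-memory database, rather than MySQL, PostgreSQL, or Oracle, so that it runs without a database server; sqlite3 follows the same portable database API style as the interfaces for those systems. The table and column names are assumptions, and Python 3 syntax is used.

```python
import sqlite3

conn = sqlite3.connect(':memory:')      # in-memory database, nothing on disk
curs = conn.cursor()

# Hypothetical schema mirroring the fields of our Person records.
curs.execute('create table people '
             '(name text, age integer, pay integer, job text)')

rows = [('Bob Smith', 42, 30000, 'sweng'),
        ('Sue Jones', 45, 40000, 'music'),
        ('Tom Doe',   50, 50000, 'manager')]
curs.executemany('insert into people values (?, ?, ?, ?)', rows)
conn.commit()

# A query is SQL text with parameters, not Python object-processing code.
curs.execute('select name, pay from people where pay >= ? order by pay',
             (40000,))
result = curs.fetchall()
print(result)           # [('Sue Jones', 40000), ('Tom Doe', 50000)]

conn.close()
```

Note the trade-off relative to shelves: rows carry data only, so logic such as giveRaise would have to live in separate code or be reattached when rows are loaded.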
Moreover, the open source ZODB system provides a more comprehensive object database for Python, with support for features missing in shelves, including concurrent updates, transaction commits and rollbacks, automatic updates on in-memory component changes, and more. We’ll explore these more advanced third-party tools in Chapter 19. For now, let’s move on to putting a good face on our system.