Multiple inheritance

Multiple inheritance is a touchy subject. In principle, it's very simple: a subclass that inherits from more than one parent class is able to access functionality from both of them. In practice, this is much less useful than it sounds and many expert programmers recommend against using it. So we'll start with a warning:

Note

As a rule of thumb, if you think you need multiple inheritance, you're probably wrong, but if you know you need it, you're probably right.

The simplest and most useful form of multiple inheritance is called a mixin. A mixin is generally a superclass that is not meant to exist on its own, but is meant to be inherited by some other class to provide extra functionality. For example, let's say we wanted to add functionality to our Contact class that allows sending an e-mail to self.email. Sending e-mail is a common task that we might want to use on many other classes. So we can write a simple mixin class to do the e-mailing for us:

	class MailSender:
		def send_mail(self, message):
			print("Sending mail to " + self.email)
			# Add e-mail logic here

For brevity, we won't include the actual e-mail logic here; if you're interested in studying how it's done, see the smtplib module in the Python standard library.
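For readers who want a rough idea of what that logic could look like, here is a minimal sketch built on smtplib and email.message from the standard library. The build_message helper, the sender address, the subject line, and the SMTP server on localhost port 25 are all assumptions made for illustration; a real deployment would likely need authentication and TLS as well.

```python
import smtplib
from email.message import EmailMessage


class MailSender:
    def build_message(self, message, sender="noreply@example.com"):
        # Hypothetical helper: wraps the text in a standard e-mail message.
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = self.email
        msg["Subject"] = "Message"
        msg.set_content(message)
        return msg

    def send_mail(self, message, host="localhost", port=25):
        print("Sending mail to " + self.email)
        # Assumes an SMTP server is listening on host:port.
        with smtplib.SMTP(host, port) as server:
            server.send_message(self.build_message(message))


class Demo(MailSender):
    # Tiny stand-in that supplies the email attribute the mixin relies on.
    email = "jsmith@example.net"


# Building the message needs no network access; send_mail is not called here.
msg = Demo().build_message("Hello, test e-mail here")
print(msg["To"])
```

Keeping message construction separate from delivery makes the mixin easier to try out without a mail server.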

This class doesn't do anything special (in fact, it can barely function as a stand-alone class), but it does allow us to define a new class that is both a Contact and a MailSender, using multiple inheritance:

	class EmailableContact(Contact, MailSender):
		pass

The syntax for multiple inheritance looks like a parameter list in the class definition. Instead of including one base class inside the parentheses, we include two (or more), separated by commas. We can test this new hybrid to see the mixin at work:


>>> e = EmailableContact("John Smith", "[email protected]")
>>> Contact.all_contacts
[<__main__.EmailableContact object at 0xb7205fac>]
>>> e.send_mail("Hello, test e-mail here")
Sending mail to [email protected]

The Contact initializer is still adding the new contact to the all_contacts list, and the mixin is able to send mail to self.email so we know everything is working.
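The point of a mixin is that the same behavior can be bolted onto unrelated classes. The following self-contained sketch repeats the example with a simplified stand-in for Contact and adds a hypothetical Company class (not part of the original example) that reuses MailSender unchanged:

```python
class Contact:
    """Simplified stand-in for the Contact class defined earlier."""
    all_contacts = []

    def __init__(self, name, email):
        self.name = name
        self.email = email
        Contact.all_contacts.append(self)


class MailSender:
    def send_mail(self, message):
        # The mixin only assumes that self.email exists.
        print("Sending mail to " + self.email)


class EmailableContact(Contact, MailSender):
    pass


class Company(MailSender):
    """Hypothetical second class that reuses the same mixin."""

    def __init__(self, email):
        self.email = email


e = EmailableContact("John Smith", "jsmith@example.net")
e.send_mail("Hello, test e-mail here")
Company("office@example.com").send_mail("Invoice attached")

# The hybrid object is an instance of both of its parents.
print(isinstance(e, Contact), isinstance(e, MailSender))
```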

That wasn't so hard, and you're probably wondering what the dire warnings about multiple inheritance are. We'll get into the complexities in a minute, but let's consider what options we had, other than using a mixin here:

  • We could have used single inheritance and added the send_mail function to the subclass. The disadvantage here is that the e-mail functionality then has to be duplicated for any other classes that need e-mail.
  • We could have created a stand-alone Python function for sending mail, and just called it, with the correct e-mail address supplied as a parameter, whenever e-mail needed to be sent.
  • We could monkey-patch (we'll briefly cover monkey-patching in Chapter 7) the Contact class to have a send_mail method after the class has been created. This is done by defining a function that accepts the self argument, and setting it as an attribute on an existing class.
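The last two alternatives can be sketched in a few lines. This is only an illustration: the Contact class is a simplified stand-in, and the names send_mail and _send_mail_method are chosen for this sketch.

```python
class Contact:
    """Simplified stand-in for the Contact class used in this chapter."""
    all_contacts = []

    def __init__(self, name, email):
        self.name = name
        self.email = email
        Contact.all_contacts.append(self)


# Alternative: a stand-alone function that receives the address as a parameter.
def send_mail(address, message):
    # Real e-mail logic would go here; we only report what would happen.
    return "Sending mail to " + address


# Alternative: monkey-patching. The function accepts self and is set as an
# attribute on the existing class after the class has been created.
def _send_mail_method(self, message):
    return send_mail(self.email, message)


Contact.send_mail = _send_mail_method

c = Contact("John Smith", "jsmith@example.net")
print(send_mail(c.email, "Hello"))
print(c.send_mail("Hello"))
```

Both calls produce the same result; the difference is only in where the behavior lives and how discoverable it is to someone reading the class definition.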

Multiple inheritance works all right when mixing methods from different classes, but it gets very messy when we have to work with calling methods on the superclass. Why? Because there are multiple superclasses. How do we know which one to call? How do we know what order to call them in?

Let's explore these questions by adding a home address to our Friend class. What are some ways we could do this? An address is a collection of strings representing the street, city, country, and other related details of the contact. We could pass each of these strings as parameters into the Friend class's __init__ method. We could also store these strings in a tuple or dictionary and pass them into __init__ as a single argument. This is probably the best course of action if there is no additional functionality that needs to be added to the address.

Another option would be to create a new Address class to hold those strings together, and then pass an instance of this class into the __init__ in our Friend class. The advantage of this solution is that we can add behavior (say, a method to give directions to that address or to print a map) to the data instead of just storing it statically. This would be utilizing composition, the "has a" relationship we discussed in Chapter 1. Composition is a perfectly viable solution to this problem and allows us to reuse Address classes in other entities such as buildings, businesses, or organizations.
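A minimal sketch of the composition approach might look like the following. The Contact class is a simplified stand-in, and the one_line method is a hypothetical example of behavior attached to the address data.

```python
class Contact:
    """Simplified stand-in for the Contact class used in this chapter."""
    all_contacts = []

    def __init__(self, name, email):
        self.name = name
        self.email = email
        Contact.all_contacts.append(self)


class Address:
    def __init__(self, street, city, state, code):
        self.street = street
        self.city = city
        self.state = state
        self.code = code

    def one_line(self):
        # Hypothetical behavior that lives with the data it uses.
        return "{}, {}, {} {}".format(
            self.street, self.city, self.state, self.code)


class Friend(Contact):
    def __init__(self, name, email, phone, address):
        super().__init__(name, email)
        self.phone = phone
        # Composition: a Friend "has an" Address.
        self.address = address


home = Address("123 Main St", "Springfield", "IL", "62701")
f = Friend("John Smith", "jsmith@example.net", "555-1234", home)
print(f.address.one_line())
```

The same Address object could be handed to any other entity that needs one, with no inheritance relationship involved.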

However, inheritance is also a viable solution, and that's what we want to explore, so let's add a new class that holds an address. We'll call this new class AddressHolder instead of Address, because inheritance defines an "is a" relationship. It is not correct to say a Friend is an Address, but since a friend can have an Address, we can argue that a Friend is an AddressHolder. Later, we could create other entities (companies, buildings) that also hold addresses. Here's our AddressHolder class:

	class AddressHolder:
		def __init__(self, street, city, state, code):
			self.street = street	
			self.city = city
			self.state = state
			self.code = code

Very simple; we just take all the data and toss it into instance variables upon initialization.

The diamond problem

But how can we use this in our existing Friend class, which is already inheriting from Contact? Multiple inheritance, of course. The tricky part is that we now have two parent __init__ methods that both need to be initialized. And they need to be initialized with different arguments. How do we do that? Well, we could start with the naïve approach:

	class Friend(Contact, AddressHolder):
		def __init__(self, name, email, phone,
				street, city, state, code):
			Contact.__init__(self, name, email)
			AddressHolder.__init__(
				self, street, city, state, code)
			self.phone = phone

In this example, we directly call the __init__ function on each of the superclasses and explicitly pass the self argument. This example technically works; we can access the different variables directly on the class. But there are a few problems.

First, it is possible for a superclass to go uninitialized if we neglect to explicitly call the initializer. That is not a serious problem in this example, but it could cause hard-to-diagnose crashes in common scenarios. Imagine, for example, trying to insert data into a database that has not been connected to.
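To make the first problem concrete, here is a small sketch in which one of the explicit initializer calls has been forgotten. The ForgetfulFriend name is made up for this illustration, and Contact is a simplified stand-in.

```python
class Contact:
    """Simplified stand-in for the Contact class used in this chapter."""
    all_contacts = []

    def __init__(self, name, email):
        self.name = name
        self.email = email
        Contact.all_contacts.append(self)


class AddressHolder:
    def __init__(self, street, city, state, code):
        self.street = street
        self.city = city
        self.state = state
        self.code = code


class ForgetfulFriend(Contact, AddressHolder):
    def __init__(self, name, email, phone,
                 street, city, state, code):
        Contact.__init__(self, name, email)
        # Oops: AddressHolder.__init__ is never called.
        self.phone = phone


f = ForgetfulFriend("John Smith", "jsmith@example.net", "555-1234",
                    "123 Main St", "Springfield", "IL", "62701")
print(hasattr(f, "email"))   # set by Contact.__init__
print(hasattr(f, "street"))  # never set; f.street raises AttributeError
```

Nothing complains when the object is created; the failure only shows up later, when some other code reaches for the missing attribute.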

Second, and more sinister, is the possibility of a superclass being called multiple times, because of the organization of the class hierarchy. Look at this inheritance diagram:

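[Figure: the diamond problem. Friend inherits from both Contact and AddressHolder, and each of those inherits from object.]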

The __init__ method from the Friend class first calls __init__ on Contact, which implicitly initializes the object superclass (remember, all classes derive from object). Friend then calls __init__ on AddressHolder, which implicitly initializes the object superclass... again. The parent class has been set up twice. In this case, that's relatively harmless, but in some situations, it could spell disaster. Imagine trying to connect to a database twice for every request! The base class should only be called once. Once, yes, but when? Do we call Friend, then Contact, then object, then AddressHolder? Or Friend, then Contact, then AddressHolder, then object?

Note

Technically, the order in which methods are called is determined by the __mro__ (Method Resolution Order) attribute on the class. The attribute itself is read-only, but the order can be customized by overriding the mro() method on a metaclass. This is beyond the scope of this book. If you think you need to understand it, I recommend Expert Python Programming, Tarek Ziadé, Packt Publishing, or the original documentation on the topic at: http://www.python.org/download/releases/2.3/mro/
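Although customizing the order is an advanced topic, simply inspecting it is easy and often enlightening. The sketch below uses empty stand-ins for the three classes, since only the inheritance structure matters here:

```python
class Contact:
    pass


class AddressHolder:
    pass


class Friend(Contact, AddressHolder):
    pass


# __mro__ is a tuple of classes in the order Python searches them.
order = [cls.__name__ for cls in Friend.__mro__]
print(order)
# ['Friend', 'Contact', 'AddressHolder', 'object']
```

This answers the ordering question posed earlier: for this hierarchy, object comes last, after both parent classes.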

Let's look at a second contrived example that illustrates this problem more clearly. Here we have a base class that has a method named call_me. Two subclasses override that method, and then another subclass extends both of these using multiple inheritance. This is called diamond inheritance because of the diamond shape of the class diagram:

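[Figure: diamond inheritance. Subclass inherits from LeftSubclass and RightSubclass, which both inherit from BaseClass.]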

Diamonds are what make multiple inheritance tricky. Technically, all multiple inheritance in Python 3 is diamond inheritance, because all classes inherit from object. The previous diagram, using object.__init__, is also such a diamond.

Converting this diagram to code, this example shows when the methods are called:

	class BaseClass:
		num_base_calls = 0
		def call_me(self):
			print("Calling method on Base Class")
			self.num_base_calls += 1 	
			
	class LeftSubclass(BaseClass):
		num_left_calls = 0
		def call_me(self):
			BaseClass.call_me(self)
			print("Calling method on Left Subclass")
			self.num_left_calls += 1
			
	class RightSubclass(BaseClass):
		num_right_calls = 0
		def call_me(self):
			BaseClass.call_me(self)
			print("Calling method on Right Subclass")
			self.num_right_calls += 1
			
	class Subclass(LeftSubclass, RightSubclass):
		num_sub_calls = 0
		def call_me(self):
			LeftSubclass.call_me(self)
			RightSubclass.call_me(self)
			print("Calling method on Subclass")
			self.num_sub_calls += 1

This example simply ensures each overridden call_me method directly calls the parent method with the same name. Each time it is called, it lets us know by printing the information to the screen, and it increments a counter to show how many times it has been called. (The counters start out as class attributes, but the first += on self creates an instance attribute of the same name, so the counts are tracked per object.) If we instantiate one Subclass object and call the method on it once, we get this output:


>>> s = Subclass()
>>> s.call_me()
Calling method on Base Class
Calling method on Left Subclass
Calling method on Base Class
Calling method on Right Subclass
Calling method on Subclass
>>> print(s.num_sub_calls, s.num_left_calls, s.num_right_calls,
s.num_base_calls)
1 1 1 2
>>>

The base class's call_me method has been called twice. This isn't expected behavior and can lead to some very difficult bugs if that method is doing actual work like depositing into a bank account twice.

The thing to keep in mind with multiple inheritance is that we only want to call the "next" method in the class hierarchy, not the "parent" method. In fact, that next method may not be on a parent or ancestor of the current class. The super function comes to our rescue once again. Indeed, super was originally developed to make complicated forms of multiple inheritance possible. Here is the same code written using super:

	class BaseClass:
		num_base_calls = 0
		def call_me(self):
			print("Calling method on Base Class")
			self.num_base_calls += 1
			
	class LeftSubclass(BaseClass):
		num_left_calls = 0
		def call_me(self):
			super().call_me()
			print("Calling method on Left Subclass")
			self.num_left_calls += 1	
			
	class RightSubclass(BaseClass):
		num_right_calls = 0
		def call_me(self):
			super().call_me()
			print("Calling method on Right Subclass")
			self.num_right_calls += 1
			
	class Subclass(LeftSubclass, RightSubclass):
		num_sub_calls = 0
		def call_me(self):
			super().call_me()
			print("Calling method on Subclass")
			self.num_sub_calls += 1

The change is pretty minor; we simply replaced the naïve direct calls with calls to super(). This is simple enough, but look at the difference when we execute it:


>>> s = Subclass()
>>> s.call_me()
Calling method on Base Class
Calling method on Right Subclass
Calling method on Left Subclass
Calling method on Subclass
>>> print(s.num_sub_calls, s.num_left_calls, s.num_right_calls,
s.num_base_calls)
1 1 1 1

Looks good; our base method is only being called once. But what is super() actually doing here? Since the print calls are executed after the super calls, the printed output is in the order in which each method actually does its work. Let's look at the output from back to front to see who is calling what.

First, call_me of Subclass calls super().call_me(), which happens to refer to LeftSubclass.call_me(). LeftSubclass.call_me() then calls super().call_me(), but in this case, super() is referring to RightSubclass.call_me(). Pay particular attention to this; the super call is not calling the method on the superclass of LeftSubclass (which is BaseClass), it is calling RightSubclass, even though it is not a parent of LeftSubclass! This is the next method, not the parent method. RightSubclass then calls BaseClass, and the super calls have ensured each method in the class hierarchy is executed once.
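The "next" class is determined by the type of the object the method was called on, not by the class in which the super call is written. The following sketch is a stripped-down version of the example that records calls in a list (the calls list is an addition made for this illustration) and compares a Subclass instance with a plain LeftSubclass instance:

```python
calls = []


class BaseClass:
    def call_me(self):
        calls.append("Base")


class LeftSubclass(BaseClass):
    def call_me(self):
        super().call_me()
        calls.append("Left")


class RightSubclass(BaseClass):
    def call_me(self):
        super().call_me()
        calls.append("Right")


class Subclass(LeftSubclass, RightSubclass):
    def call_me(self):
        super().call_me()
        calls.append("Sub")


mro_names = [cls.__name__ for cls in Subclass.__mro__]
print(mro_names)
# ['Subclass', 'LeftSubclass', 'RightSubclass', 'BaseClass', 'object']

Subclass().call_me()
sub_calls = list(calls)
print(sub_calls)   # ['Base', 'Right', 'Left', 'Sub']

calls.clear()
LeftSubclass().call_me()
left_calls = list(calls)
print(left_calls)  # ['Base', 'Left']
```

The very same super() call inside LeftSubclass.call_me reaches RightSubclass in the first case and BaseClass in the second, because each call follows the resolution order of the instance's own class.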

Different sets of arguments

Can you see how this is going to make things complicated when we return to our Friend multiple inheritance example? In the __init__ method for Friend, we were originally calling __init__ for both parent classes, with different sets of arguments:


	Contact.__init__(self, name, email)
	AddressHolder.__init__(self, street, city, state, code)

How can we convert this to using super? We don't necessarily know which class super is going to try to initialize first. Even if we did, we need a way to pass the "extra" arguments so that subsequent calls to super, on other subclasses, have the right arguments.

Specifically, if the first call to super passes the name and email arguments to Contact.__init__, and Contact.__init__ then calls super, it needs to be able to pass the address related arguments to the "next" method, which is AddressHolder.__init__.

This is a problem whenever we want to call superclass methods with the same name, but different sets of arguments. Most often, the only time you would want to call a superclass with a completely different set of arguments is in __init__, as we're doing here. Even with regular methods, though, we may want to add optional parameters that only make sense to one subclass or a set of subclasses.

Sadly, the only way to solve this problem is to plan for it from the beginning. We have to design our base class parameter lists so that they accept keyword arguments for any argument that is not required by every subclass implementation. We also have to ensure the method accepts arguments it doesn't expect and pass those on in its super call, in case they are necessary to later methods in the inheritance order.

Python's function parameter syntax provides all the tools we need to do this, but it makes the overall code cumbersome. Have a look at the proper version of the Friend multiple inheritance code:

	class Contact:
		all_contacts = []
	
		def __init__(self, name='', email='', **kwargs):
			super().__init__(**kwargs)
			self.name = name
			self.email = email		
			self.all_contacts.append(self)
			
	class AddressHolder:
		def __init__(self, street='', city='', state='', code='',
				**kwargs):
			super().__init__(**kwargs)
			self.street = street
			self.city = city
			self.state = state
			self.code = code
			
	class Friend(Contact, AddressHolder):
		def __init__(self, phone='', **kwargs):
			super().__init__(**kwargs)
			self.phone = phone

We've changed all arguments to keyword arguments by giving them an empty string as a default value. We've also ensured that a **kwargs parameter is included to capture any additional parameters that our particular method doesn't know what to do with. It passes these parameters up to the next class with the super call.
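As a quick check of how the keyword arguments travel along the chain, the sketch below repeats the three classes and constructs a Friend. The sample values and the bogus argument name are made up for illustration.

```python
class Contact:
    all_contacts = []

    def __init__(self, name='', email='', **kwargs):
        super().__init__(**kwargs)
        self.name = name
        self.email = email
        self.all_contacts.append(self)


class AddressHolder:
    def __init__(self, street='', city='', state='', code='',
                 **kwargs):
        super().__init__(**kwargs)
        self.street = street
        self.city = city
        self.state = state
        self.code = code


class Friend(Contact, AddressHolder):
    def __init__(self, phone='', **kwargs):
        super().__init__(**kwargs)
        self.phone = phone


# Each initializer picks out the arguments it knows and forwards the rest.
f = Friend(name="John Smith", email="jsmith@example.net",
           phone="555-1234", street="123 Main St",
           city="Springfield", state="IL", code="62701")
print(f.name, f.phone, f.city)

# A keyword that no class consumes ends up at object.__init__,
# which rejects it with a TypeError.
try:
    Friend(name="Jane", bogus=1)
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```

The TypeError for unknown keywords is a useful safety net: a misspelled argument name is reported rather than silently ignored.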

Note

If you aren't familiar with the **kwargs syntax, it basically collects any keyword arguments passed into the method that were not explicitly listed in the parameter list. These arguments are stored in a dictionary named kwargs (we can call the variable whatever we like, but convention suggests kw, or kwargs). When we call a different method (for example: super().__init__) with a **kwargs syntax, it unpacks the dictionary and passes the results to the method as normal keyword arguments. We'll cover this in detail in Chapter 7.
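A tiny sketch of both halves of the syntax, using two throwaway functions whose names are invented for this illustration:

```python
def describe(name='', **kwargs):
    # name is matched explicitly; everything else lands in kwargs.
    return name, kwargs


def forward(**kwargs):
    # Collect all keyword arguments, then unpack them into another call.
    return describe(**kwargs)


result = forward(name="John", city="Springfield", state="IL")
print(result)
# ('John', {'city': 'Springfield', 'state': 'IL'})
```

Note that name does not appear in the dictionary seen by describe, because it was matched against an explicit parameter.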

The previous example does what it is supposed to do. But it's starting to look messy, and it has become difficult to answer the question, "What arguments do we need to pass into Friend.__init__?" This is the foremost question for anyone planning to use the class, so a docstring should be added to the method to explain what is happening.

Further, even this implementation is insufficient if we want to "reuse" variables in parent classes. When we pass the **kwargs variable to super, the dictionary does not include any of the variables that were included as explicit keyword arguments. For example, in Friend.__init__, the call to super does not have phone in the kwargs dictionary. If any of the other classes need the phone parameter, we need to ensure it is in the dictionary that is passed. Worse, if we forget to do that, it will be tough to debug, because the superclass will not complain, but will simply assign the default value (in this case, an empty string) to the variable.
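The silent failure is easy to reproduce. In this sketch, Contact is modified to accept a phone parameter, and the contact_phone attribute is a made-up name used only to record what Contact actually received:

```python
class Contact:
    def __init__(self, name='', email='', phone='', **kwargs):
        super().__init__(**kwargs)
        self.name = name
        self.email = email
        # Hypothetical attribute recording the value Contact was given.
        self.contact_phone = phone


class Friend(Contact):
    def __init__(self, phone='', **kwargs):
        # phone was captured by the explicit parameter above,
        # so it is no longer in kwargs when we call super.
        super().__init__(**kwargs)
        self.phone = phone


f = Friend(name="John Smith", phone="555-1234")
print(repr(f.phone))          # '555-1234'
print(repr(f.contact_phone))  # '' (the default; no error was raised)
```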

There are a few ways to ensure that the variable is passed upwards. Assume the Contact class does, for some reason, need to be initialized with a phone parameter, and the Friend class will also need access to it. We can do any of the following:

  • Don't include phone as an explicit keyword argument. Instead, leave it in the kwargs dictionary. Friend can look it up using the syntax kwargs['phone']. When it passes **kwargs to the super call, phone will still be in the dictionary.
  • Make phone an explicit keyword argument but update the kwargs dictionary before passing it to super, using the standard dictionary syntax kwargs['phone'] = phone.
  • Make phone an explicit keyword argument, but update the kwargs dictionary using the kwargs.update method. This is useful if you have several arguments to update. You can create the dictionary passed into update using either the dict(phone=phone) constructor, or the dictionary syntax {'phone': phone}.
  • Make phone an explicit keyword argument, but pass it to the super call explicitly with the syntax super().__init__(phone=phone, **kwargs).
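The four options can be sketched side by side. The class names FriendLookup, FriendAssign, FriendUpdate, and FriendExplicit are invented for this comparison, as is the contact_phone attribute that records what Contact received:

```python
class Contact:
    def __init__(self, name='', email='', phone='', **kwargs):
        super().__init__(**kwargs)
        self.name = name
        self.email = email
        self.contact_phone = phone  # hypothetical record of the value


class FriendLookup(Contact):
    # Option 1: leave phone in kwargs and look it up there.
    def __init__(self, **kwargs):
        self.phone = kwargs['phone']  # raises KeyError if phone is omitted
        super().__init__(**kwargs)


class FriendAssign(Contact):
    # Option 2: put phone back with standard dictionary syntax.
    def __init__(self, phone='', **kwargs):
        kwargs['phone'] = phone
        super().__init__(**kwargs)
        self.phone = phone


class FriendUpdate(Contact):
    # Option 3: put phone back with kwargs.update.
    def __init__(self, phone='', **kwargs):
        kwargs.update({'phone': phone})
        super().__init__(**kwargs)
        self.phone = phone


class FriendExplicit(Contact):
    # Option 4: pass phone explicitly alongside **kwargs.
    def __init__(self, phone='', **kwargs):
        super().__init__(phone=phone, **kwargs)
        self.phone = phone


variants = [FriendLookup, FriendAssign, FriendUpdate, FriendExplicit]
results = [(cls(name="John", phone="555-1234").phone,
            cls(name="John", phone="555-1234").contact_phone)
           for cls in variants]
print(results)
```

All four variants deliver the value to both classes. One difference worth noting is that the first option makes phone effectively required unless the lookup is replaced with something like kwargs.get('phone', '').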

We have covered many of the caveats involved with multiple inheritance in Python. When we need to account for all the possible situations, we have to plan for them and our code will get messy. Basic multiple inheritance can be handy, but in many cases, we may want to choose a more transparent way of combining two disparate classes, usually using composition or one of the design patterns we'll be covering in Chapter 8 and Chapter 9.
