Polymorphism

A group of related objects can have common behaviour, which still can be very distinctive. For instance, pens, pencils, and markers all can be used to draw, yet they leave very different marks on paper. Most animals make a noise, but the precise sound is unique to the animal.

Things with common behaviour can be modelled by using inheritance; all classes that share a base class will inherit that base class's behaviour. A group of classes related by inheritance is called a class hierarchy.

If you have a number of graphical objects, they all will have at least one thing in common: They can draw themselves. That is, they will all have a draw() method. If these graphical objects all derive from some common base class Shape, which defines draw(), then they can all be accessed through Shape::draw().

Polymorphism, from the Greek words for many and forms, means that a common function name (such as draw()) can have different meanings for different objects. You have already seen polymorphism in action; operator overloading makes + mean very different operations (like integer or floating-point addition, or string concatenation) which nevertheless are all kinds of addition. This is sometimes called static polymorphism, as opposed to dynamic polymorphism, when the actual method to call is decided only at run-time. How this magic works and how it can work for you is discussed in the following sections.
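As a small illustration of static polymorphism, here is a sketch with three overloaded functions. The name add() is made up for this example; the point is only that the compiler chooses among the three at compile time, from the argument types.

```cpp
#include <string>

// Three unrelated functions that share one name. The compiler picks
// one at compile time from the argument types (static polymorphism).
int add(int a, int b) { return a + b; }

double add(double a, double b) { return a + b; }

std::string add(const std::string& a, const std::string& b) { return a + b; }
```

Calling add(2, 3) selects the integer version, and passing two std::string objects selects the concatenating version; nothing is decided at run-time.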

Class Hierarchies

Consider the family tree shown in Figure 8.2. This tree differs from real human ancestry in two ways: a person has two parents, not one, and a family tree like this generally follows the patrilineal line only. Similarly to the way humans inherit from their ancestors, class inheritance creates a class hierarchy. As shown in Figure 8.2, both Temp and Manager are derived from Employee. However, Figure 8.2 does not show a hierarchy of importance; it doesn't mean that Temp and Manager are equally important, and less important than Employee. Rather, a class hierarchy is like a classification of animals, which is based on the animals' evolutionary ancestors.

Figure 8.2. A family tree and a class hierarchy.


When a program is running, objects form dynamic relationships with each other (for example, Manager keeps a list of Employee objects) called the object hierarchy, in which a Manager object is indeed more important than a Temp object.

Class hierarchies depend on what you are trying to model. Figure 8.3 shows a fairly arbitrary classification of some domestic animals; the basic division is according to diet (that is, carnivore, herbivore, omnivore). This hierarchy might be more useful to a farmer or zookeeper than a rigorous genetic classification, which would put elephants and shrews next to each other.

Figure 8.3. A classification of animals by diet.


The Animals hierarchy also classifies a Person as an omnivorous animal. This is a completely different view of Person, which would not be useful for human resources or payroll applications.

NOTE

A C++ class can have more than one base class. Strictly speaking, a cat is both a mammal and a carnivore, and many nonmammals are carnivorous. It is equally true that an Employee object is a Person object and is also a TaxPayer object. It is derived from two parent classes, just as people are derived from two parents. However, not everyone thinks that multiple inheritance is good object-oriented design. It is certainly important to get single inheritance right first, so this book does not discuss multiple inheritance.


A Hierarchy of Animals

Consider the following example, which defines the classes NamedObject, Animal, and Dog:


class NamedObject {
private:
  string m_name;
public:
  NamedObject(string name)
  : m_name(name) { }
  string name()  {  return m_name; }

};

class Animal: public NamedObject {
public:
  Animal(string name)
  : NamedObject(name) { }

  static void say(std::string s) {
    cout << s << endl;   // prints the text as-is, matching the session output below
  }

  void call_out() {  say("<nada>"); }
};

class Dog: public Animal {
private:
  string m_breed;
public:
  Dog(string breed = "mongrel")
  : Animal("dog"), m_breed(breed) { }

  string breed()  {  return m_breed; }

  void call_out() {  say("woof!"); }
};

void exercise_animal(Animal* pa)
{
  pa->call_out();
  if (pa->name() == "dog") {
    cout << "breed: " << ((Dog *)pa)->breed() << endl;
  }
}
;> Animal a("?"); Dog d;
;> a.call_out();
<nada>
;> d.call_out();
woof!
;> exercise_animal(&d);
<nada>
breed: mongrel

An Animal object has a name; you can move this property of Animal into the base class NamedObject, which will simplify not only Animal but also any other classes that carry names. Doing this also means you could in future impose some policy on all names in your system; for instance, that names should not contain any special characters. Note that NamedObject has a single constructor taking a string argument, so Animal must call this constructor using an initialization list.
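To show how such a naming policy might look, here is a minimal sketch. The helper clean() and the rule it applies (keep only letters, digits, and spaces) are assumptions made for this illustration; they are not part of the chapter's NamedObject.

```cpp
#include <cctype>
#include <string>

class NamedObject {
private:
  std::string m_name;

  // Hypothetical policy: keep letters, digits, and spaces; drop the rest.
  static std::string clean(const std::string& raw) {
    std::string out;
    for (std::string::size_type i = 0; i < raw.size(); ++i) {
      unsigned char ch = static_cast<unsigned char>(raw[i]);
      if (std::isalnum(ch) || ch == ' ') out += raw[i];
    }
    return out;
  }
public:
  // Every name in the system passes through the policy exactly once.
  NamedObject(const std::string& name) : m_name(clean(name)) { }
  std::string name() const { return m_name; }
};

class Animal : public NamedObject {
public:
  // Animal gets the policy for free by calling the base constructor.
  Animal(const std::string& name) : NamedObject(name) { }
};
```

Because the rule lives in one place, every class derived from NamedObject picks it up without any change of its own.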

Animal has a method call_out() that you use to print out the Animal object's cry. Animal::call_out() does nothing specific, because this is a general class. Animal also supplies a function say() for speaking to the world, which is static in this case because it doesn't depend on the particular object.

The Dog class is derived from the Animal class, which means it inherits everything that is publicly defined in Animal; that is, it inherits name(), say(), and call_out(). You redefine call_out() because a Dog object is a definite kind of Animal object that makes a definite sound. The Dog class can also have a breed value, although it defaults to being a mongrel.

Both generic Animal objects and specific Dog objects have a call_out() method, but they are really different functions that have the same name. Usually C++ makes a final decision at compile time about what function to call; this is called early binding. The exercise_animal() function only knows about the plain generic Animal::call_out(), and so Dog::call_out() is not called.

Notice the expression ((Dog *)pa)->breed(); the parentheses are necessary because operator-> has a higher precedence than the typecast operator (Dog *). Besides being clumsy, this is a dangerous operation to apply to Animal objects that are not Dog objects, because they don't have an m_breed field. In this case, coercing (forcing) the type is likely to cause grief. (This is why you had to explicitly test for the Animal object's name.) Such typecasts are called static because they happen at compile time, with no check of the object's actual type. Because these typecasts are potentially unsafe, modern C++ spells them with the static_cast keyword. The principle here is that you make dangerous things obvious: it's much easier to search code for static_cast than for C-style typecasts like (Dog *). The following example shows exercise_animal() rewritten to call the appropriate call_out() method:

void exercise_animal(Animal *pa)
{
  if (pa->name() == "dog") {
    Dog *pd = static_cast<Dog *>(pa);
    pd->call_out();                   // Dog::call_out
    cout << "breed: " << pd->breed() << endl;
  }
  else pa->call_out();               // Animal::call_out
}
;> exercise_animal(&d);
woof!
breed: mongrel

This is not a very satisfying function. Sooner or later, somebody will need to keep track of Horse objects, and another if statement will have to go into exercise_animal(). The problem is worse than it seems in this simple example; there are likely to be many such functions, and they all have to be modified if extra animal classes are added. Unless you get paid for each line of code you write (and enjoy debugging bad code all night), you should not choose to go down this route. Yes, we could make the animal's sound a member field, but that would not solve the general problem of making animals with different behaviors.

It would be better if calling Animal::call_out() would automatically select the correct operation. How this is done and how it works is the subject of the next section.

Virtual Methods

Let's look at the classes Animal and Dog again, with a one-word change: adding the qualifier virtual to the first definition of call_out(). You then create an Animal object and a Dog object, and you can make two Animal pointers that refer to them:


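class Animal: public NamedObject {
public:
  Animal(std::string name)
  : NamedObject(name) { }

  // say() is unchanged from the earlier version
  static void say(std::string s) {
    cout << s << endl;
  }

// notice that call_out() has become virtual
  virtual void call_out()
  {  say("<nada>"); }
};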

class Dog: public Animal {
private:
  string m_breed;
public:
  Dog(string breed = "mongrel")
  : Animal("dog"), m_breed(breed) { }
  string breed()  {  return m_breed; }

  void call_out()  // override Animal::call_out
 {  say("woof!"); }
};

void exercise_animal(Animal *pa)
{
  cout << "name: " << pa->name() << endl;
  pa->call_out();
}
;> Animal a("?"); Dog d;
;> Animal *p1 = &a, *p2 = &d;
;> p1->call_out();
<nada>
;> p2->call_out();
woof!
;> exercise_animal(&a);
name: ?
<nada>
;> exercise_animal(&d);
name: dog
woof!

This example defines yet another version of exercise_animal(), which only calls that which is common to all animals, that is, their ability to call out. This time, calling call_out() calls the correct function!

The method call_out() is called a virtual method, and redeclaring it in any derived class is called overriding the method. (You can use the virtual qualifier on the overriding method, but it isn't necessary.) It is a good idea, however, to use a comment to indicate that a method is an override. It is important that the overriding method be declared with the same signature as the original; otherwise, you get an error (if you're lucky) or a warning that the new function is hiding a virtual function. That is, if the original declaration was void call_out(), then you should not declare it as int call_out() in some derived class, and so forth.
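If your compiler supports C++11 or later (an assumption; nothing else in this chapter requires it), the override specifier turns the signature rule into a compile-time check. The following sketch uses cut-down versions of Animal and Dog that return their sound as a string instead of printing it, purely to keep the example short.

```cpp
#include <string>

class Animal {
public:
  virtual ~Animal() { }
  virtual std::string call_out() { return "<nada>"; }
};

class Dog : public Animal {
public:
  // 'override' asks the compiler to verify that this declaration
  // really overrides a virtual method of a base class.
  std::string call_out() override { return "woof!"; }

  // std::string call_out(int times) override;  // would not compile:
  //                                            // no matching base method
};

// Calls whichever call_out() matches the object's actual type.
std::string exercise(Animal& a) { return a.call_out(); }
```

With override in place, a mismatched signature is reported as an error rather than silently hiding the base method.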

NOTE

If you have previously used Java, note that Java makes all methods virtual by default, unless you explicitly use the keyword final.


Here is another class derived from Animal:

class Horse: public Animal {
public:
  Horse()
  : Animal("horse") { }
  void call_out() // override Animal::call_out
  {
   say("neigh!");
  }
};
;> Horse h;
;> exercise_animal(&h);
name: horse
neigh!

You can add new animal objects, and any function that understands animal objects will call the correct method. Similarly, it can be useful to make say() virtual. What if the animals were expected to announce themselves in a window rather than to the console? In this case, you simply override say(), and all subsequent classes output to windows.
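Here is a rough sketch of that idea. There is no real windowing code here; the class BufferedAnimal is a hypothetical stand-in that collects output in a string buffer, which is enough to show that overriding say() once redirects every class derived from it.

```cpp
#include <iostream>
#include <sstream>
#include <string>

class Animal {
public:
  virtual ~Animal() { }
  // say() is now virtual, so derived classes can change the destination
  virtual void say(const std::string& s) { std::cout << s << std::endl; }
  virtual void call_out() { say("<nada>"); }
};

// Hypothetical stand-in for a window: collects text in a buffer
class BufferedAnimal : public Animal {
  std::ostringstream m_out;
public:
  void say(const std::string& s) { m_out << s << '\n'; }
  std::string text() const { return m_out.str(); }
};

// Inherits the redirected say() without mentioning it at all
class BufferedDog : public BufferedAnimal {
public:
  void call_out() { say("woof!"); }
};
```

A BufferedDog called through an Animal pointer writes "woof!" into its buffer rather than to the console, and the Dog-specific code never has to know where the text ends up.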

How does the system know at runtime which method to call? The secret to late binding (as opposed to early binding) is that any such object must carry runtime type information (RTTI). Virtual methods are not identified by an actual address, but by an integer index. The object contains a pointer to a table of function pointers, and when the time comes to execute the method, the program looks up its actual address in the table by using the index. (This is shown in Figure 8.4.) The table is called the virtual method table (VMT), or vtable, and every class derived from Animal has a different VMT, with at least one entry for call_out(). Objects of such classes are bigger than you would expect because there is always some space allocated for the hidden pointer. The C++ standard does not specify where this pointer is found; in some compilers (such as the Microsoft and Borland C++ compilers), it is the first field; in older versions of GNU C++ it is the last field; and in UnderC it is just before the first field. This is the main reason you should not depend on a particular layout of complex classes in memory; the original definition of Employee would not match Person if you added even one virtual method. Virtual methods are also slightly slower because of this lookup, but the main performance issue is that a call resolved at run time generally cannot be inlined (that is, the compiler cannot insert the function's code directly instead of calling it), so making everything virtual gives up that optimization.

Figure 8.4. Each Animal object has a hidden pointer to a VMT.
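The lookup can be imitated by hand. The sketch below is only an illustration of the idea, not what any particular compiler generates; the names Creature, Method, and CALL_OUT are invented for it. Each object carries an explicit pointer to its class's table, and a call goes through the table by index.

```cpp
#include <string>

struct Creature;                                  // forward declaration
typedef std::string (*Method)(const Creature*);  // one table slot

// Slot indices: a method is identified by its position in the table
enum { CALL_OUT = 0, SLOT_COUNT };

struct Creature {
  const Method* vmt;   // the "hidden" pointer, made explicit here
  std::string name;
};

std::string animal_call_out(const Creature*) { return "<nada>"; }
std::string dog_call_out(const Creature*)    { return "woof!"; }

// One table per class, shared by all objects of that class
const Method animal_vmt[SLOT_COUNT] = { animal_call_out };
const Method dog_vmt[SLOT_COUNT]    = { dog_call_out };

Creature make_animal(const std::string& n) {
  Creature c = { animal_vmt, n };
  return c;
}

Creature make_dog() {
  Creature c = { dog_vmt, "dog" };
  return c;
}

// What a virtual call amounts to: fetch the table, index it, call
std::string call_out(const Creature* c) { return c->vmt[CALL_OUT](c); }
```

The caller of call_out() never names animal_call_out or dog_call_out; the object's own table pointer decides, which is the essence of late binding.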


Classes that contain virtual methods are called polymorphic; this refers to how call_out(), for example, can be redefined in many different ways.

You often need to keep collections of polymorphic objects. First note how assignment works between such objects. This example assigns a Dog fido to an Animal cracker:

;> cracker = fido;    // Dog => Animal is cool...
;> cracker.call_out();
<nada>
;> Animal *pa = &fido;
;> pa->call_out();
woof!

The type conversion in this example works, but it throws away information because a Dog object is larger than an Animal object (that is, it contains extra breed information). More importantly, cracker remains an Animal object, with a hidden pointer to an Animal VMT. Any assignment of a Dog object to an Animal object will have this result. If there were a list of Animal objects, then adding Dog objects would involve such an assignment. Therefore, the following is not the way to make a collection of Animal objects:

;> list<Animal> la;
;> la.push_back(cracker);
;> la.push_back(fido);
;> la.back().call_out();
<nada>

To make a collection of Animal objects, you should keep a list of pointers to Animal objects. As Figure 8.5 shows, if you do this, it is no longer a problem that some Animal objects are larger objects than others. In the next example, various objects are created with new and added to the list with push_back(). I then define a function animal_call(Animal *pa) and use the standard algorithm for_each() to call this function for each element in the list:


;> list<Animal *> zoo;
;> zoo.push_back(new Animal("llama"));
;> zoo.push_back(new Dog("pointer"));
;> zoo.push_back(new Horse());
;> void animal_call(Animal *pa) {  pa->call_out(); }
;> for_each(zoo.begin(),zoo.end(),animal_call);
<nada>
woof!
neigh!

Figure 8.5. A list of pointers to Animal objects.
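Objects created with new must eventually be deleted, and a list of pointers does not do this for you. The following sketch shows one way to clean up; it assumes Animal is given a virtual destructor, which the classes in this chapter do not declare, so that deleting through an Animal pointer destroys the whole derived object. The counter live_animals and the function release() are hypothetical, added only to make the effect visible.

```cpp
#include <list>
#include <string>

static int live_animals = 0;  // hypothetical counter, for illustration only

class Animal {
public:
  Animal() { ++live_animals; }
  virtual ~Animal() { --live_animals; }  // virtual: delete via Animal* is safe
  virtual std::string call_out() { return "<nada>"; }
};

class Dog : public Animal {
  std::string m_breed;
public:
  Dog(const std::string& breed) : m_breed(breed) { }
  std::string call_out() { return "woof!"; }
};

// Delete every object the list points to, then empty the list
void release(std::list<Animal*>& zoo) {
  for (std::list<Animal*>::iterator it = zoo.begin(); it != zoo.end(); ++it)
    delete *it;
  zoo.clear();
}
```

After release(zoo) the list is empty and every Animal, Dog, or other derived object it pointed to has been destroyed.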


Any Animal pointer might be a pointer to a Dog object. A type field such as name() makes the identification easier, but this is irritating to set up. C++ provides an interesting typecast operator called dynamic_cast, which you can use for polymorphic classes. dynamic_cast<A *>(p) will use RTTI to decide at run-time whether p is in fact an A * pointer. If not, it will return NULL. Note that any class derived from A will also qualify. Here it is in action:


;> Animal *p1 = new Horse, *p2 = new Dog;
;> dynamic_cast<Dog *>(p1);
(Dog *) 0
;> dynamic_cast<Dog *>(p2);
(Dog *) 9343e1a0
;> bool is_dog(Animal *a)
    {  return dynamic_cast<Dog *>(a) != NULL; }
;> struct Spaniel: public Dog {  Spaniel() : Dog("spaniel") { }  };
;> is_dog(new Spaniel);  // A Spaniel is a kind of Dog...
(bool) true
;> // remove the dogs from the zoo!
;> // (std::remove_if() - see Appendix B, "Standard Algorithms")
;> // remove_if() only moves the survivors forward; erase() drops the rest
;> zoo.erase(remove_if(zoo.begin(),zoo.end(),is_dog), zoo.end());

Note that dynamic_cast does not work on just any type. It is specifically designed to exploit the fact that classes with virtual methods always carry extra type information. Without dynamic_cast, you need to put in the extra type field by hand. dynamic_cast is safer than static_cast, which does not check the actual type at run-time.
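dynamic_cast also works on references. A reference cannot be NULL, so a failed cast throws std::bad_cast instead. The sketch below uses cut-down Animal, Dog, and Horse classes, and the function breed_of() is a made-up name for this illustration.

```cpp
#include <string>
#include <typeinfo>   // std::bad_cast

class Animal {
public:
  virtual ~Animal() { }   // at least one virtual method makes the class polymorphic
};

class Dog : public Animal {
public:
  std::string breed() const { return "mongrel"; }
};

class Horse : public Animal { };

// Returns the breed if 'a' is really a Dog, otherwise "not a dog"
std::string breed_of(Animal& a) {
  try {
    Dog& d = dynamic_cast<Dog&>(a);   // throws std::bad_cast on failure
    return d.breed();
  } catch (const std::bad_cast&) {
    return "not a dog";
  }
}
```

The pointer form is usually more convenient when failure is an ordinary outcome; the reference form suits cases where a wrong type is a genuine error.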

Abstract Classes

In a way, there is no such thing as a plain Animal object. In the real world, any Animal is a specific species or at least has a well-defined family. It makes no sense to create an Animal object on its own, but only to create objects of specific derived classes like Dog, Horse, and so forth.

Animal can be used as an abstract base class; the method call_out() is initialized to zero, which means that this virtual method is not defined yet (that is, it is a pure virtual method). A class with one or more pure virtual methods cannot be instantiated directly; it can only be used as a base class. Any derived class that you want to create objects of must supply a proper definition of call_out(). As you see in this definition of Animal, call_out() is not given any definition:

class Animal: public NamedObject {
public:
  Animal(std::string name)
  : NamedObject(name) { }

  virtual void call_out() = 0;
};

;> Animal *pa = new Animal("sloth");
CON 32:class contains abstract methods
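For contrast, here is a sketch of a derived class that can be created because it supplies call_out(). The class Sloth and the function exercise() are invented for this example, and the classes return strings rather than printing so that the example stays short.

```cpp
#include <string>

class Animal {
public:
  virtual ~Animal() { }
  virtual std::string call_out() = 0;   // pure virtual: no definition here
};

class Sloth : public Animal {
public:
  // Supplying call_out() makes Sloth a concrete class
  std::string call_out() { return "zzz"; }
};

// Works with any concrete class derived from Animal
std::string exercise(Animal& a) { return a.call_out(); }

// Animal a;   // would not compile: Animal is abstract
```

Pointers and references to Animal remain perfectly legal; only creating an Animal object directly is ruled out.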
