Chapter 29

Are Virtual Functions for Real?

In This Chapter

  • Overriding between functions that are members of a class
  • Introducing virtual member functions
  • Binding early versus binding late
  • Declaring your destructor virtual: when and when not to do it

Inheritance gives users the ability to describe one class in terms of another. Just as important, it highlights the relationship between classes. I describe a duck as “a bird that …”, and that description points out the relationship between duck and bird. From a C++ standpoint, however, a piece of the puzzle is still missing.

You have probably noticed that a microwave oven looks nothing like a conventional oven, nor does it work the same way internally. Nevertheless, when I say “cook,” I don’t want to worry about the details of how each oven works. This chapter describes this problem in C++ terms and then goes on to describe the solution.

Overriding Member Functions

It has always been possible to overload a member function with another member function in the same class, as long as the arguments differ:

class Student
{
  public:
    double grade();    // return the student's gpa
    double grade(double); // set the student's gpa

    // ...other stuff...
};

You see this in spades in Chapters 26 and 27, where I overload the constructor with a number of different types of constructors. It’s also possible to overload a function in one class with a function in another class, even if the arguments are the same, because the class is not the same:

class Student
{
  public:
    double grade(double); // set the student's gpa
};

class Hill
{
  public:
    double grade(double); // set the slope of the hill
};

Inheritance offers yet another way to confuse things: A member function in a subclass can overload a member function in the base class.

Overloading a base-class member function is called overriding.

Early binding

Overriding is fairly straightforward. Consider, for example, the following EarlyBinding demonstration program:

//
//  EarlyBinding - demonstrates early binding in
//                 overriding one member function with
//                 another in a subclass.
//
#include <cstdio>
#include <cstdlib>
#include <iostream>
using namespace std;

class Student
{
  public:
    double calcTuition() { return 0.0; }
};

class GraduateStudent : public Student
{
  public:
    double calcTuition() { return 1.0; }
};

int main(int nNumberofArgs, char* pszArgs[])
{
    // the following calls Student::calcTuition()
    Student s;
    cout << "The value of s.calcTuition() is "
         << s.calcTuition()
         << endl;

    // the following calls GraduateStudent::calcTuition()
    GraduateStudent gs;
    cout << "The value of gs.calcTuition() is "
         << gs.calcTuition()
         << endl;

    // wait until user is ready before terminating program
    // to allow the user to see the program results
    cout << "Press Enter to continue..." << endl;
    cin.ignore(10, '\n');
    cin.get();
    return 0;
}

Here both the Student and GraduateStudent classes include a calcTuition() member function (and nothing else, just to keep the listings short). Presumably, the university calculates tuition for graduate and undergraduate students differently, but for this demonstration, determining which function is being called is the only important thing. Therefore Student::calcTuition() returns a 0, while GraduateStudent::calcTuition() returns a 1 — can’t get much simpler than that!

The main() function first creates a Student object s and then invokes s.calcTuition(). Not surprisingly, this call is passed to Student::calcTuition() as is clear from the output of the program as quoted here. The main() function then does the same for GraduateStudent, with predictable results:

The value of s.calcTuition() is 0
The value of gs.calcTuition() is 1
Press Enter to continue...

In this program, the C++ compiler can decide at compile time which member function to call, basing the decision on the declared type of s and gs.

Remember: Resolving calls to overridden member functions based on the declared type of the object is called compile-time binding or early binding.

This simple example is not too surprising so far, but let me put a wrinkle in this simple fabric.

Ambiguous case

The following AmbiguousBinding program is virtually identical to the earlier EarlyBinding program. The only difference is that instead of invoking calcTuition() directly, this version of the program calls the function through a pointer passed to a function:

//
//  AmbiguousBinding - demonstrates a case where it's not
//                    clear what should happen. In this
//                    case, C++ goes with early binding
//                    while languages like Java and C#
//                    use late binding.
//
#include <cstdio>
#include <cstdlib>
#include <iostream>
using namespace std;

class Student
{
  public:
    double calcTuition() { return 0.0; }
};

class GraduateStudent : public Student
{
  public:
    double calcTuition() { return 1.0; }
};

double someFn(Student* pS)
{
    return pS->calcTuition();
}

int main(int nNumberofArgs, char* pszArgs[])
{
    // the following calls Student::calcTuition()
    Student s;
    cout << "The value of someFn(&s) is "
         << someFn(&s)
         << endl;

    // the following passes a GraduateStudent to someFn()
    GraduateStudent gs;
    cout << "The value of someFn(&gs) is "
         << someFn(&gs)
         << endl;

    // wait until user is ready before terminating program
    // to allow the user to see the program results
    cout << "Press Enter to continue..." << endl;
    cin.ignore(10, '\n');
    cin.get();
    return 0;
}

Just as in the EarlyBinding example, this program starts by creating a Student object s. Rather than invoke s.calcTuition() directly, however, this version passes the address of the object s to someFn() and that function does the honors. The program repeats the process with a GraduateStudent object gs.

Now, without looking ahead, consider this question: Which calcTuition() will pS->calcTuition() call when main() passes the address of a GraduateStudent to someFn()?

You could argue that it will call Student::calcTuition() because the declared type of pS is Student*. On the other hand, you could argue that the same call will invoke GraduateStudent::calcTuition() because the “real type” is GraduateStudent*.

Remember: The “real type” of an object is known as its run-time type, as opposed to its declared type. These are also known as the dynamic type and the static type, respectively.

The output from this program appears as follows:

The value of someFn(&s) is 0
The value of someFn(&gs) is 0
Press Enter to continue...

You can see that, by default, C++ bases its decision on the declared type of the object. Therefore someFn() calls Student::calcTuition() because that’s the way the object is declared irrespective of the run-time type of the object provided in the call.

Tip: The alternative to early binding is to decide which member function to call based on the run-time type of the object. This is known as late binding.

Thus we say that C++ prefers early binding.

Enter late binding

Early binding does not capture the essence of object-oriented programming. Consider how I make nachos in Chapter 21. In a sense, I act as the late binder. The recipe says, “Heat the nachos in the oven.” It doesn’t say, “If the type of oven is microwave, do this; if the type is convection oven, do this; if the type is conventional oven, do this; if using a campfire, do this.” The recipe (the code) relies on me (the late binder) to decide what the action (member function) heat means when applied to the oven (the particular instance of class Oven) or any of its variations (subclasses), such as a microwave (MicrowaveOven). People think this way, and designing a language along these lines enables the software model to describe more accurately a real-world solution that a person might think up.

There are also mundane reasons of maintainability and reusability to justify late binding. Suppose I write a great program around the class Student. This program, cool as it is, does lots of things, and one of the things it does is calculate the student’s tuition for the upcoming year. After months of design, coding, and testing, I release the program to great acclaim and accolades from my peers.

Time passes and my boss asks me to change the rules for calculating the tuition on graduate students. I’m to leave the rules for students untouched, but I’m to give graduate students some type of break on their tuition so that the university can attract more and better postgraduate candidates. Deep within the program, someFunction() calls the calcTuition() member function as follows:

void someFunction(Student* pS)
{
    pS->calcTuition();

    // ...function continues on...
}

Tip: This should look familiar. If not, refer to the beginning of this chapter.

If C++ did not support late binding, I would need to edit someFunction() to do something similar to the following:

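void someFunction(Student* pS)
{
    if (pS->type() == STUDENT)
    {
        pS->Student::calcTuition();
    }
    else if (pS->type() == GRADUATESTUDENT)
    {
        // a Student* must be cast before it can reach
        // a GraduateStudent member
        static_cast<GraduateStudent*>(pS)->GraduateStudent::calcTuition();
    }

    // ...function continues on...
}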

Using the extended name of the function (as discussed in Chapter 11), including the class name, forces the compiler to use the specific version of calcTuition().

I would add a member type() to the class that would return some constant. I could establish the value of this constant in the constructor.

This change doesn’t seem so bad until you consider that calcTuition() isn’t called in just one place; it’s called throughout the program. The chances are not good that I will find all the places that it’s called.

And even if I do find them all, I’m editing (read “breaking”) previously debugged, tested, checked in, and certified code. Edits can be time-consuming and boring, and they introduce opportunities for error. Any one of my edits could be wrong. At the very least, I will have to retest and recertify every path involving calcTuition().

What happens when my boss wants another change? (My boss, like all bosses, is like that.) I get to repeat the entire process.

What I really want is for C++ to keep track of the run-time type of the object and to perform the call using late binding.

Tip: The ability to perform late binding is called polymorphism (“poly” meaning “many” and “morph” meaning “form”). Thus the same call may take different actions based on the run-time type of the object.

All I need to do is add the keyword virtual to the declaration of the member function in the base class as demonstrated in the following LateBinding example program:

//
//  LateBinding - addition of the keyword 'virtual'
//                changes C++ from early binding to late
//                binding.
//
#include <cstdio>
#include <cstdlib>
#include <iostream>
using namespace std;

class Student
{
  public:
    virtual double calcTuition() { return 0.0; }
};

class GraduateStudent : public Student
{
  public:
    virtual double calcTuition() { return 1.0; }
};

double someFn(Student* pS)
{
    return pS->calcTuition();
}

int main(int nNumberofArgs, char* pszArgs[])
{
    // the following calls Student::calcTuition()
    Student s;
    cout << "The value of someFn(&s) is "
         << someFn(&s)
         << endl;

    // the following calls GraduateStudent::calcTuition()
    GraduateStudent gs;
    cout << "The value of someFn(&gs) is "
         << someFn(&gs)
         << endl;

    // wait until user is ready before terminating program
    // to allow the user to see the program results
    cout << "Press Enter to continue..." << endl;
    cin.ignore(10, '\n');
    cin.get();
    return 0;
}

Remember: It’s not necessary to add the virtual keyword to the subclass as well, but doing so is common practice. A member function that is bound late is known as a virtual member function.

Other than the virtual keyword, there is no difference between the LateBinding program and its AmbiguousBinding predecessor, but the results are strikingly different:

The value of someFn(&s) is 0
The value of someFn(&gs) is 1
Press Enter to continue...

This is exactly what I want: C++ is now deciding which version of calcTuition() to call based on the run-time type of the object and not on the declared type of the pointer.

It may seem surprising that the default for C++ is early binding, but the reason is simple. Late binding adds a small amount of overhead to every call to virtual member functions. The inventors of C++ did not want to give critics any reasons to reject the language — so, by default, C++ does not include the overhead of late binding with functions that are not virtual.

When Is Virtual Not?

Beware: A particular function call is not necessarily bound late just because you think it is. The most critical thing to watch for is that all the member functions in question are declared identically, including the return type. If they aren’t declared with the same arguments in the subclass, the member functions aren’t overridden; without overriding, there can’t be late binding. Consider the following code snippet:

class Base
{
  public:
    virtual void fn(int x);
};

class Subclass : public Base
{
  public:
    virtual void fn(double x);
};

void test(Base* pB)
{
    pB->fn(1);

    pB->fn(2.0);
}

Subclass::fn(double) does not override Base::fn(int) because the arguments don’t match, so there is nothing to bind late. Not surprisingly, the first call to fn() within test() goes to Base::fn(int) even if test() is passed a pointer to an object of class Subclass. Somewhat surprisingly, the second call goes to Base::fn(int) as well, after converting the double to an int. Again, no overriding, no late binding.

The only exception to this rule is best explained by the following example:

class Base
{
  public:
    virtual Base* fn();
};

class Subclass : public Base
{
  public:
    virtual Subclass* fn();
};

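Here the function fn() is bound late, even though the return type doesn’t match exactly. The exception applies when the base-class function returns a pointer (or reference) to a class and the overriding function returns a pointer (or reference) to a subclass of that class. In practice, this is quite natural. If a function is dealing with Subclass objects, it seems natural that it should return a pointer to a Subclass object as well.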

The 2011 standard introduces a way to make sure that overriding is, in fact, occurring: It uses the newly introduced keyword override, as shown in the following snippet:

class Base
{
  public:
    virtual void fn(int x);
};

class Subclass : public Base
{
  public:
    virtual void fn(double x) override;
};

This generates a compiler error because Subclass::fn() does not, in fact, override a function in the base class — even though the override keyword says it does.

Correcting the argument solves the problem:

class Base
{
  public:
    virtual void fn(int x);
};

class Subclass : public Base
{
  public:
    virtual void fn(int x) override;
};

This code compiles properly because Subclass::fn(int) does override Base::fn(int).

Virtual Considerations

Specifying the class name in the call forces the call to bind early, whether the function is declared virtual or not. For example, the following call is to Base::fn() because that’s what the programmer indicated she intended:

void test(Base* pB)
{
    pB->Base::fn();  // this call is not bound late
}

Constructors cannot be declared virtual because there is no completed object at the time the constructor is invoked to use as the basis for late binding.

On the other hand, destructors should almost always be declared virtual. If they aren’t, you run the risk of not completely destructing the object, as demonstrated in the following snippet:

class MyObject {};

class Base
{
  public:
    ~Base() {}  // this should be declared virtual
};

class Subclass : public Base
{
  protected:
    MyObject* pMO;

  public:
    Subclass()
    {
        pMO = new MyObject;
    }
    ~Subclass()
    {
        delete pMO;
        pMO = nullptr;
    }
};

Base* someOtherFn()
{
    return new Subclass;
}

void someFn()
{
    Base* pB = someOtherFn();
    delete pB;
}

The program has a subtle but devastating bug. When someFn() is called, it immediately calls someOtherFn(), which creates an object of class Subclass. The constructor for Subclass allocates an object of class MyObject off the heap. Ostensibly, all is well because the destructor for Subclass returns MyObject to the heap when the Subclass object is destructed.

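However, when someFn() calls delete, it passes a pointer of type Base*. If this call is allowed to bind early, it will invoke the destructor for Base, which knows nothing about MyObject. The memory will not be returned to the heap. (Strictly speaking, the C++ standard leaves the behavior undefined in this case, so a leak is merely the most likely outcome.)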

Technical stuff: I realize that technically delete is a keyword and not a function call, but the semantics are the same.

Declaring the destructor for Base virtual solves the problem. Now the call to delete is bound late — realizing that the pointer passed to delete actually points to a Subclass object, delete invokes the Subclass destructor, and the memory is returned, as it’s supposed to be.

So is there a case in which you don’t want to declare the destructor virtual? Only one. Earlier I said that virtual functions introduce a “little” overhead. Let me be more specific. One thing they add is an additional hidden pointer to every object — not one pointer per virtual function, just one pointer, period. A class with no virtual functions does not have this pointer.

Now, one pointer doesn’t sound like much, and it isn’t, unless the following two conditions are true:

  • The class doesn’t have many data members (so that one pointer is a lot compared with what’s there already).
  • You create a lot of objects of this class (otherwise the overhead doesn’t matter).

Tip: If either of these two conditions is not true, always declare your destructors virtual.
