The Life of Objects

Everybody gets a little confused at first when learning about C++ copy semantics, so don't worry if your head hurts a bit. An educator noted at a recent American Academy of Sciences meeting that the average soap opera is much more complicated than most mathematics. The C++ object model is certainly more straightforward than The Bold and the Beautiful (but then I probably don't watch enough TV). There are basically two ways to deal with objects:

  • Dynamically creating and destroying objects

  • Automatically creating and destroying objects

Dynamic Creation of Objects

You can create objects dynamically and use references or pointers to refer to them. In this case, assignment results in another alias to the object, and if you want a genuine copy, you have to ask for it explicitly. Generally, this is how Delphi and Java prefer to do things, and of course you can do this as well in C++. The following shows a String class (which is different from the standard one), with the various operations expressed as methods:

;> String *ps1 = new String("hello");
;> String *ps2 = new String("polly");
;> String *ps3 = ps2;
;> *ps2 = "dolly";      // changes *ps3 as well!
;> ps3 = ps2->clone();  // now ps3 is independent
;> cout << ps1->append(ps2) << endl;
hellodolly

Initially ps2 and ps3 point to the same object, and only after you explicitly call String::clone() are they independent. This style seems awkward if you are used to normal C++ strings. In fact, both Delphi and Java regard strings as exceptional objects and overload the + operator to mean 'concatenate string.' The advantage of this style is that everything is out in the open: Construction and copying are explicit, and (except in Java) destruction is also explicit. So you have full control of the life of the objects. It is also efficient because you are passing references around and not copying excessively.
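
The aliasing described above can be reproduced with the standard string, as a rough sketch. Here std::string stands in for the String class, and the copy constructor stands in for String::clone(); the AliasResult struct and the function name are made up for this illustration.

```cpp
#include <string>

// Hypothetical helper type: records what each pointer saw.
struct AliasResult {
    std::string via_alias;   // value seen through the alias after the change
    std::string via_copy;    // value held by the independent copy after a later change
};

AliasResult alias_then_copy() {
    std::string *ps2 = new std::string("polly");
    std::string *ps3 = ps2;               // another alias, not a copy
    *ps2 = "dolly";                       // visible through ps3 as well
    AliasResult r;
    r.via_alias = *ps3;

    ps3 = new std::string(*ps2);          // explicit copy, standing in for clone()
    *ps2 = "molly";                       // no longer affects *ps3
    r.via_copy = *ps3;

    delete ps3;                           // destruction is explicit too
    delete ps2;
    return r;
}
```

Both fields end up holding "dolly": the first because ps3 was only an alias, the second because the copy was made before *ps2 changed again.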

The difficulty is twofold: First, the lack of “syntactic sugar” (that is, operator overloading) can make code look awkward, and second, there are always problems with object lifetime. If you delete an object prematurely, pointers scattered throughout the system might still refer to it. Any attempt to access these dead objects leads to trouble, usually access violations (it's also common to try to delete objects twice). Even worse, the system might have reallocated that space to another object of the same type, in which case you are really in trouble because the program is then subtly wrong. On the other hand, if you don't delete objects, the free memory is eventually exhausted.

Java's solution to this problem is similar to how modern society consumes things: It assumes that resources are infinite and hopes that recycling will save the day. Occasionally, the system runs the garbage collector, which identifies objects that are no longer in use. (Garbage collection, by the way, is not restricted to Java. Some large C++ programs rely on it as well—it is a technique for memory management and does not require direct language support.) The argument for using garbage collection is that people cannot be relied on to dispose of their own discarded data. It is indeed easy to mismanage the life of dynamic objects, and it is a major cause of unreliability in C++ programs. Several techniques can help, however, and we'll talk about them later in this chapter, in the case study “A Resizable Array.”

Interestingly, C++ allows you to change the meanings of new and delete. They are operators, after all, and most C++ operators can be overloaded. Of course, you need to do something sensible with them: They must manage memory allocation. You can pass the request to the usual operators by calling them directly. In the following example, you want to keep tabs on the number of dynamically allocated Blob objects that currently exist:

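class Blob {
  int m_arr[100];
public:
  static int kount;   // number of Blobs currently allocated with new

  int *ptr()  {  return m_arr; }

  void *operator new (size_t sz) {
     kount++;
     return ::operator new(sz);
  }
  void operator delete (void *ptr, size_t sz) {
     kount--;
     ::operator delete(ptr);   // hand the memory back to the global operator
  }
};
int Blob::kount = 0;   // static members need a definition outside the class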

Any dynamic allocation of Blob uses Blob::operator new, and any disposal of a Blob object with delete uses Blob::operator delete. So, you can have complete control over dynamic allocation; in this case, the overloaded operators call the system versions using the global scope operator (::). Dynamic allocation can be an expensive operation, and custom solutions can speed things up dramatically. You could also use this technique to switch off the disposal of memory completely for a particular class.
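
As a rough sketch of switching off disposal for one class, the following hands out slots from a fixed block and never takes them back. The class Token, the pool size kSlots, and the helper churn() are all invented for this illustration; a real program would need a policy for what happens when the pool fills up.

```cpp
#include <cstddef>
#include <new>

class Token {
    int m_id;
public:
    explicit Token(int id) : m_id(id) { }
    int id() const { return m_id; }

    void *operator new(std::size_t sz);
    void operator delete(void *) { }      // disposal switched off: nothing is returned
    static std::size_t allocated();       // how many slots have been handed out
};

namespace {
    const std::size_t kSlots = 64;        // hypothetical pool size
    alignas(Token) unsigned char g_pool[kSlots * sizeof(Token)];
    std::size_t g_used = 0;
}

void *Token::operator new(std::size_t sz) {
    if (sz != sizeof(Token) || g_used == kSlots) throw std::bad_alloc();
    return g_pool + sizeof(Token) * g_used++;   // bump allocation, no bookkeeping
}

std::size_t Token::allocated() { return g_used; }

// Creates and deletes n tokens; delete runs the destructor but frees no memory.
std::size_t churn(int n) {
    for (int i = 0; i < n; ++i) {
        Token *t = new Token(i);
        delete t;
    }
    return Token::allocated();
}
```

Allocation becomes a single addition, which is where the speed comes from, but the count of used slots only ever grows. That trade is only sensible for a class with a small, bounded number of instances.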

As with any advanced C++ technique, remember that you should only do this if you have a very good reason—and proving that you can overload operator new is not good enough. In years of working with C++, I've only had to really do this twice.

Automatic Destruction of Objects

The second approach to dealing with objects involves declaring objects directly and letting C++ automatically dispose of them at the end of their lives. C++ guarantees that local automatic (that is, non-static) variables are automatically destroyed, no matter how you left the function. This makes C++ exception handling well behaved. Ordinary global variables are destroyed when the program closes. They are also created before main() is called, so you could use the following code to run initialization code for a module:

struct InitMOD1 {
   InitMOD1() {
      puts("MOD1 initialized");
   }
   ~InitMOD1() {
      puts("MOD1 finalized");
   }
};
InitMOD1 initMOD1;   // constructed before main() runs, destroyed after it returns

This might look a bit ugly, but the preprocessor can make it quite elegant.
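
One possible shape for such a macro is sketched below. The macro name MODULE_INIT, its parameters, and the flag g_mod1_ready are assumptions made for this example, not something defined elsewhere in the book.

```cpp
#include <cstdio>

// Hypothetical helper macro: declares a file-local object whose constructor
// runs `init` before main() and whose destructor runs `fini` at program exit.
#define MODULE_INIT(name, init, fini)              \
    namespace {                                    \
        struct ModuleInit_##name {                 \
            ModuleInit_##name()  { init; }         \
            ~ModuleInit_##name() { fini; }         \
        } module_init_##name;                      \
    }

static bool g_mod1_ready = false;   // set by the initializer below

MODULE_INIT(MOD1,
            (g_mod1_ready = true, std::puts("MOD1 initialized")),
            std::puts("MOD1 finalized"))

// Reports whether the module's initialization code has run.
bool mod1_ready() { return g_mod1_ready; }
```

By the time main() starts, mod1_ready() returns true. Keep in mind that the order of initialization between different source files is not something you can rely on.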

A class member variable is destroyed when its object is destroyed. The compiler always generates a suitable destructor for any objects that themselves contain objects. Of course, this does not apply to pointers in objects: whatever they point to must be explicitly deleted in a destructor that you write yourself.
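
The following sketch shows both guarantees at once: locals are destroyed even when a function is left by an exception, and members are destroyed by the compiler-generated destructor. The types Noisy and Pair and the global g_calls are invented for the demonstration.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::string> g_calls;    // records destructor calls in order

struct Noisy {
    std::string name;
    explicit Noisy(const std::string& n) : name(n) { }
    ~Noisy() { g_calls.push_back("~" + name); }
};

struct Pair {
    Noisy first;
    Noisy second;
    Pair() : first("first"), second("second") { }
    // no destructor written: the compiler-generated one destroys the members
};

void leave_by_exception() {
    Noisy local("local");
    Pair  both;
    throw std::runtime_error("bail out");   // locals are still destroyed
}

// Runs the function above and returns the destructor calls it caused.
std::vector<std::string> destructor_calls() {
    g_calls.clear();
    try { leave_by_exception(); }
    catch (const std::runtime_error&) { }
    return g_calls;
}
```

Destruction runs in reverse order of construction, so the recorded calls are ~second, ~first, and then ~local.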

Temporary Objects

It is possible for local objects to be nameless. These nameless objects only exist for the duration of a statement. For example, the following code shows a shortcut for writing out to a file, together with an equivalent long-hand version:

;> ofstream("out.txt") << "count is " << n;
;> { ofstream tmp("out.txt"); tmp << "count is " << n; }

The first line of this example opens a file, writes a string and an integer to the file, and closes the file. The second line of this example is equivalent code that shows what actually happens when you run the first line: A temporary object is created, used, and destroyed. A temporary local context around the statement forces the temporary object to go out of scope.
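
To watch a temporary come and go, you can give a small class a constructor and destructor that record what happens. This is only a sketch; Tracer and g_events are names made up for the purpose.

```cpp
#include <string>
#include <vector>

std::vector<std::string> g_events;   // records what happened, in order

struct Tracer {
    Tracer()  { g_events.push_back("created"); }
    ~Tracer() { g_events.push_back("destroyed"); }
    Tracer& use() { g_events.push_back("used"); return *this; }
};

std::vector<std::string> temporary_lifetime() {
    g_events.clear();
    Tracer().use();                      // nameless object lives for this statement only
    g_events.push_back("next statement");
    return g_events;
}
```

The recorded order is created, used, destroyed, next statement: the temporary is gone before the following statement begins.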

Temporary objects are created all the time, and usually you don't need to know about them, but they are an essential part of the C++ object model. For example, consider these string operations:

;> string s3 = s.substr(0,3);
;> string s3s = s3 + s;
						

The method string::substr() constructs and returns a temporary string object, which is used to initialize s3. The temporary object is then destroyed. Likewise, s3+s returns a temporary object. The generation of unnecessary temporary objects can slow down operations considerably. In the following code, the first expression s += s3 is much faster than s = s + s3 (especially if the strings are particularly big) because no temporaries are created. I have written out the last expression in full, to show that it involves the creation of a temporary, the concatenation, and two copies.

;> s += s3;
(string) 'hellohel'
;> s = s + s3;
(string) 'hellohelhel'
;> { string tmp = s; tmp += s3; s = tmp; }   // what actually happened in 's = s + s3'
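
You cannot count copies inside std::string, but a small stand-in class makes the difference visible. The class Text below is hypothetical and deliberately naive; a modern std::string can move rather than copy in some of these steps, and the compiler may elide the copy on return, so treat the numbers as an illustration of the principle.

```cpp
#include <string>

struct Text {
    std::string data;
    static int copies;       // counts copy constructions and copy assignments

    Text(const char *s) : data(s) { }
    Text(const Text& t) : data(t.data) { ++copies; }
    Text& operator=(const Text& t) { data = t.data; ++copies; return *this; }
    Text& operator+=(const Text& t) { data += t.data; return *this; }
};
int Text::copies = 0;

// Builds the result in a temporary, as written out in the session above.
Text operator+(const Text& a, const Text& b) {
    Text tmp(a);
    tmp += b;
    return tmp;
}

int copies_for_append() {
    Text s("hello"), s3("hel");
    Text::copies = 0;
    s += s3;                 // works in place
    return Text::copies;
}

int copies_for_sum() {
    Text s("hello"), s3("hel");
    Text::copies = 0;
    s = s + s3;              // temporary, then assignment
    return Text::copies;
}
```

The in-place version makes no copies at all, while the second version makes at least two (one to build the temporary and one to assign it back).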
						

Beware of keeping references to temporary objects. Here are a few problem areas:

;> const char *p = (s1+s2).c_str();
;> cout << p << endl;   // can be utter garbage!
;> string& f() {  return s.substr(0,3); }
						

The pointer returned from c_str() should never be kept because it's generally valid only as long as the string lasts; in this case, (s1+s2) is a temporary object. This is an example of the trouble that comes with mixing high-level objects with low-level code. The function f() shows why you should be careful about returning references. Any references to a local variable—or a temporary variable, in this case—will be a source of trouble if that variable is out of scope.
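
Safer versions of both cases might look like the following sketch. The function names are invented; the point is simply to return by value and to keep a named string alive for as long as its character pointer is in use.

```cpp
#include <cstring>
#include <string>

// Returning by value hands the caller its own copy, so nothing dangles.
std::string first_three(const std::string& s) {
    return s.substr(0, 3);
}

// Keep the concatenated string in a named variable for as long as the
// pointer from c_str() is needed.
std::size_t joined_length(const std::string& s1, const std::string& s2) {
    std::string joined = s1 + s2;      // named object outlives the pointer below
    const char *p = joined.c_str();    // valid while 'joined' is alive and unmodified
    return std::strlen(p);
}
```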

You can use a constructor of a class as you use a function. Remember that you originally defined make_point() to create Point objects. It is possible to use Point(x,y) in expressions in a similar way to make_point(x,y):

;> Point p = Point(x,y);  // legal, but silly
;> int get_x(Point p) { return p.x; }
;> get_x(Point(100,200));
(int) 100

Using constructors in this way creates temporary objects as well, so you don't have to declare a Point variable explicitly and pass it as an argument.

By now you have probably seen a lot of functions that take std::string arguments, and you have often called them with string literals (for example, str2int("23")). However, you do not have to explicitly call the string constructor (that is, you do not have to use str2int(string("23"))) because C++ automatically uses the constructor for converting char * into std::string. This is not a special case; you can control the type conversion of any of your classes, which is the subject of the next section.

Type Conversion

Type conversion is an important part of the behavior of any type. The usual conversion rules related to numbers and pointers are called the standard conversions. For instance, any number will be converted to a double floating-point number, if required; any pointer will convert to void *, and so on.

C++ gives the designer of a class full control over how that type behaves in every situation, and so you can specify how other types can be converted into the type. This is set up for you if you have defined constructors taking one argument. For instance, the constructor string(char *) will be used by the compiler to convert char * into string. This saves a lot of typing because you can then pass string literals (that is, text in quotes) to functions expecting a string argument. If there were a constructor string(int) as well, then you could also pass integers to such functions. (This would probably not be a good idea; the more “silent” conversions that are possible, the more likely you are to be unpleasantly surprised.)
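
Here is a minimal sketch of a one-argument constructor acting as a conversion. The class Label and its counter are hypothetical; the counter is there only to show that a temporary really is built.

```cpp
#include <string>

class Label {
    std::string m_text;
public:
    static int constructed;    // how many Labels were built from character data

    Label(const char *s) : m_text(s) { ++constructed; }   // also serves as a conversion
    std::size_t length() const { return m_text.size(); }
};
int Label::constructed = 0;

std::size_t label_length(const Label& l) { return l.length(); }

// The literal is silently turned into a temporary Label for the call.
std::size_t length_of_fred() { return label_length("fred"); }
```

Calling length_of_fred() returns 4 and bumps Label::constructed by one, even though no Label appears anywhere in the calling code.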

But remember the cost: Every silent conversion using a constructor causes a temporary object to be created. If you need serious speed, you can always overload a function to achieve the same result—that is, define another function that takes a char * value directly. For instance, the following example is one way to make str2int() faster; the first function supplies another way to call the C library function atoi(), and the second function passes the string's character data to the first function:

inline int str2int(const char *s)
  {  return atoi(s); }
inline int str2int(const string& str)
  {  return str2int(str.c_str()); }

The inline function attribute explicitly asks the compiler to insert the function's code directly into the program wherever the function is called, so there is no extra cost in giving atoi() a new name and identity as str2int(). But you should test a program before going on a mad drive for maximum efficiency, and ask yourself if shaving off 50 milliseconds is going to affect your users' quality of life.

NOTE

const in the first function's argument is very important. I was bitten by this recently, so you should share my experience, if not my pain. If the character pointer is not const, then str2int(str.c_str()) cannot match the first signature because str.c_str() returns const char * and C++ will never violate “constness” by converting const char * to char *. Instead, it matches the other signature by type conversion to const string&. The snake proceeds to eat its own tail, and the recursion will end only when the program crashes.


The important thing to note with these two functions is that the compiler does not try to force a conversion of string literals because there is already a function that matches them perfectly well. User-defined conversion is only attempted if there is no other way to get a function match.
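
You can confirm which overload is chosen by adding counters to the two functions. The counters g_char_calls and g_string_calls are additions for this sketch only.

```cpp
#include <cstdlib>
#include <string>

int g_char_calls = 0;     // times the const char * overload ran
int g_string_calls = 0;   // times the const std::string& overload ran

inline int str2int(const char *s) {
    ++g_char_calls;
    return std::atoi(s);
}

inline int str2int(const std::string& str) {
    ++g_string_calls;
    return str2int(str.c_str());   // forwards to the overload above
}
```

A call such as str2int("23") goes straight to the first overload and never touches the second. A call with a std::string runs the second overload once, which in turn runs the first.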

Note that user-defined conversions also apply to operators, which after all are just a fancy form of function call. If you were using a string class that has only defined operator==(const string&, const string&), you would still be able to compare string literals, as in name=="fred" or "dog"==animal, because the string(char *) constructor is used to convert those literals. But the compiler will need to generate temporary string objects, and you might find that a fast string comparison becomes surprisingly slow. Again, the solution is to overload operator==, at least for the first case. (The second case isn't as eccentric as it seems; some people write comparisons like this so they won't be bitten by name="fred".) Remember that assignments are also operator calls, and C++ strives to convert the right-hand side into the left-hand side by means of user conversions.
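
The next sketch overloads operator== for both orderings so that comparisons with literals build no temporaries. Name is a hypothetical string-like class whose counter records each conversion from character data.

```cpp
#include <cstring>
#include <string>

class Name {
    std::string m_s;
public:
    static int conversions;    // counts constructions from character data

    Name(const char *s) : m_s(s) { ++conversions; }
    const char *c_str() const { return m_s.c_str(); }
};
int Name::conversions = 0;

bool operator==(const Name& a, const Name& b)
  { return std::strcmp(a.c_str(), b.c_str()) == 0; }

// Overloads that take the literal directly, so no temporary Name is built.
bool operator==(const Name& a, const char *b)
  { return std::strcmp(a.c_str(), b) == 0; }
bool operator==(const char *a, const Name& b)
  { return std::strcmp(a, b.c_str()) == 0; }

// Returns the number of conversions needed for the two comparisons,
// or -1 if either comparison unexpectedly failed.
int conversions_for_compare() {
    Name name("fred");
    Name::conversions = 0;
    bool same = (name == "fred") && ("fred" == name);
    return same ? Name::conversions : -1;
}
```

With the two extra overloads in place the function reports zero conversions. Remove them and each comparison falls back on the Name(const char *) constructor.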

Sometimes you simply don't want an automatic conversion. In the past, people would prevent automatic conversion by using clever class design. For example, there is no std::string constructor that takes a single integer argument. The idea would be that you could generate a blank string of n characters with string(n), but it would have strange consequences. The call str2int(23) would translate as str2int(string(23)), which ends up returning zero because atoi() is so tolerant of spaces and non-numeric characters. str2int(23) is definitely a misunderstanding or a typo and should cause an error. So the string constructor is designed so that you must use string s(8,' ') to get a string s of eight spaces.

Standard C++ introduces a new keyword, explicit, which tells the compiler not to use a constructor in such conversions. For example, the standard vector class has the following constructor:

explicit vector(size_type sz = 0);

explicit before the declaration of this vector constructor allows you to declare vectors such as vector<int> vi(20) without getting puzzling attempts to convert integers into vectors.
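
The effect of explicit can be checked at compile time. In this sketch LooseBuffer and StrictBuffer are made-up classes that differ only in that one keyword, and std::is_convertible (from the standard header type_traits, which is newer than the compilers this book was written for) reports whether an implicit conversion exists.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

struct LooseBuffer {
    std::size_t size;
    LooseBuffer(std::size_t n) : size(n) { }            // converting constructor
};

struct StrictBuffer {
    std::size_t size;
    explicit StrictBuffer(std::size_t n) : size(n) { }  // must be spelled out
};

std::size_t loose_size(const LooseBuffer& b)   { return b.size; }
std::size_t strict_size(const StrictBuffer& b) { return b.size; }

// loose_size(20) compiles: 20 is silently converted to a LooseBuffer.
// strict_size(20) would not compile; strict_size(StrictBuffer(20)) does.
static_assert(std::is_convertible<int, LooseBuffer>::value,
              "implicit conversion allowed");
static_assert(!std::is_convertible<int, StrictBuffer>::value,
              "explicit blocks the conversion");
static_assert(!std::is_convertible<int, std::vector<int> >::value,
              "std::vector behaves the same way");
```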

So far you have seen how to control how C++ converts other types to your class. The other kind of conversion operation involves converting your class into other types. For example, say you have a Date class that has an as_str() method for returning the date in some standard format. Then the following user-defined conversion operator in the class definition causes Date objects to freely convert themselves into strings:

operator string () {  return as_str(); }

If C++ tries to match any function argument with Date objects and can't find any obvious match, it leaps at the chance to convert the date into a string. It would probably not be a good idea for a Date object to want to convert itself into an integer, both because that would not be unique (does it refer to a Julian date, a star date, or the number of seconds since Halloween?) and because integers are too common as function arguments.
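
A bare-bones Date with such a conversion operator might look like this. The members, the year-month-day format produced by as_str(), and the function shout() are all assumptions made for the sketch.

```cpp
#include <sstream>
#include <string>

class Date {
    int m_year, m_month, m_day;
public:
    Date(int y, int m, int d) : m_year(y), m_month(m), m_day(d) { }

    std::string as_str() const {            // hypothetical format: year-month-day
        std::ostringstream out;
        out << m_year << '-' << m_month << '-' << m_day;
        return out.str();
    }

    operator std::string() const { return as_str(); }
};

std::string shout(const std::string& s) { return s + "!"; }

std::string shout_date() {
    Date d(2002, 3, 15);
    return shout(d);    // no overload takes a Date, so operator std::string() is used
}
```

Here shout_date() returns "2002-3-15!". Nothing in the call shout(d) hints that a conversion took place, which is exactly why such operators should be used sparingly.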

You can get surprising control over an object's behavior by using user-defined conversions. The following code uses YAWL, which was mentioned in the last chapter in the section “Class Frameworks.” (See Appendix B, “A Short Library Reference” for more on YAWL.) TWin is a type that describes a window, and it has a static member function get_active_window() that returns the window which is currently active. When using the UnderC interactive prompt, this is the console window itself. TWin has two methods get_text() and set_text() for accessing a window's caption (that is, text in the title bar):

;> #include <yawl.h>
;> TWin *w = TWin::get_active_window();
;> w->get_text();
(char *) "UnderC for Windows"
;> w->set_text("hello, world!");
						

This is a classic pair of get and set methods. Borland's C++ Builder has a non-standard C++ extension called a property. Properties look like simple variables, but they have different meanings depending on which side of an assignment they appear. The preceding YAWL example would be written like this in C++ Builder:

String s = w->Caption;         // same as get_text() above
w->Caption = "Hello, World!";  // same as set_text() above

w->Caption looks like a straightforward member variable access but is actually an action. w->Caption = "Hello, World!" not only sets a value, but updates the window's caption. User-defined conversions make this possible using standard C++ as well. Consider the following class:


class TextProperty {
  TWin *m_win;
public:
  TextProperty(TWin *win) : m_win(win) { }

  void operator=(const string& s)
    {  m_win->set_text(s.c_str()); }

  operator string ()
    {  return m_win->get_text(); }
};
;> TextProperty text(TWin::get_active_window());
;> string t = text;      //TextProperty::operator string()
;> text = "hello dolly"; //TextProperty::operator=(const string&)
						

In the initialization string t = text, the only way to match a TextProperty value to a string value is to convert the text object into a string using the user-defined conversion. This will also happen with any function expecting a string argument. The string conversion operator then actually gets the window text.

If the object text appears on the left-hand side of an assignment, then its meaning changes completely. It will then match operator=(const string& s), which has the effect of setting the window text. This is precisely how properties are meant to work; what looks like the simple use of a variable causes either get or set code to be executed. In the first case study, you will see how this technique can make an intelligent array possible.
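
If you don't have YAWL to hand, you can try the same idea against a stand-in window class. FakeWin below is invented for this sketch and mimics only the two caption methods; its updates counter shows that assignment really does run the set code.

```cpp
#include <string>

// Stand-in for YAWL's TWin, reduced to the two caption methods used above.
class FakeWin {
    std::string m_caption;
public:
    int updates;                                  // counts calls to set_text()
    FakeWin() : m_caption("UnderC for Windows"), updates(0) { }
    const char *get_text() const { return m_caption.c_str(); }
    void set_text(const char *s) { m_caption = s; ++updates; }
};

class TextProperty {
    FakeWin *m_win;
public:
    TextProperty(FakeWin *win) : m_win(win) { }

    void operator=(const std::string& s)          // "set" side of the property
      { m_win->set_text(s.c_str()); }

    operator std::string() const                  // "get" side of the property
      { return m_win->get_text(); }
};

// Sets the caption through the property, then reads it back.
std::string roundtrip(FakeWin& win, const std::string& caption) {
    TextProperty text(&win);
    text = caption;            // TextProperty::operator=(const std::string&)
    std::string now = text;    // TextProperty::operator std::string()
    return now;
}
```

After one call to roundtrip(), the window's updates counter is 1 and the returned string matches the caption that was passed in.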

Interestingly, it was once common for C++ string classes to automatically convert to const char *; the CString class in the Microsoft Foundation Classes (MFC) behaves like this. But it led to too many odd surprises, and so the standard string has the c_str() method to do this explicitly. The problem is that C++ gives you the power to design your own language, and you should use this power wisely or not at all. User-defined conversion operators are like the color purple with amateur Web designers: They are best avoided until you know what you're doing.
