2. C with Classes

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

2. C with Classes

Specialization is for insects.

– R.A.Heinlein

C++’s immediate predecessor, C with Classes — key design principles — classes — run-time and space efficiency — the linkage model — static (strong) type checking — why C? — syntax problems — derived classes — living without virtual functions and templates — access-control mechanisms — constructors and destructors — my work environment.

2.1 The Birth of C with Classes

The work on what eventually became C++ started with an attempt to analyze the UNIX kernel to determine how it could be distributed over a network of computers connected by a local area network. This work started in April 1979 in the Computing Science Research Center of Bell Laboratories in Murray Hill, New Jersey. Two sub-problems soon emerged: how to analyze the network traffic that would result from the kernel distribution and how to modularize the kernel. Both required a way to express the module structure of a complex system and the communication pattern of the modules. This was exactly the kind of problem that I had become determined never again to attack without proper tools. Consequently, I set about developing a proper tool according to the criteria I had formed in Cambridge.

In October 1979 I had a running preprocessor, called Cpre, which added Simula-like classes to C, and by March 1980 this preprocessor had been refined to the point where it supported one real project and several experiments. My records show the preprocessor was in use on 16 systems by then. The first key C++ library, the task system supporting a co-routine style of programming [Stroustrup, 1980b] [Stroustrup, 1987b] [Shopiro,1987], was crucial to these projects. The language accepted by the preprocessor was called “C with Classes.”

During the April to October period the transition from thinking about a tool to thinking about a language had occurred, but C with Classes was still thought of primarily as an extension to C for expressing modularity and concurrency. A crucial decision had been made, though. Even though support of concurrency and Simula-style simulations was a primary aim of C with Classes, the language contained no primitives for expressing concurrency; rather, a combination of inheritance (class hierarchies) and the ability to define class member functions with special meanings recognized by the preprocessor was used to write the library that supported the desired styles of concurrency. Please note that “styles” is plural. I considered it crucial – as I still do – that more than one notion of concurrency should be expressible in the language. There are many applications for which support for concurrency is essential, but there is no one dominant model for concurrency support; thus, when support is needed, it should be provided through a library or a special-purpose extension so that a particular form of concurrency support does not preclude other forms.

The language thus provided general mechanisms for organizing programs, rather than support for specific application areas. This was what made C with Classes – and later, C++ – a general-purpose language rather than a C variant with extensions to support specialized applications. Later, the choice between providing support for specialized applications or general abstraction mechanisms has come up repeatedly. Each time the decision has been to improve the abstraction mechanisms. Thus, C++ does not have built-in complex number, string, or matrix types, or direct support for concurrency, persistence, distributed computing, pattern matching, or file system manipulation, to mention a few of the most frequently suggested extensions. Instead, libraries supporting those needs exist.

An early description of C with Classes was published as a Bell Labs technical report in April 1980 [Stroustrup,1980] and in SIGPLAN Notices [Stroustrup,1982]. A more detailed Bell Labs technical report, Adding Classes to the C Language: An Exercise in Language Evolution [Stroustrup, 1982b] was published in Software: Practice and Experience. These papers set a good example by describing only features that were fully implemented and had been used. This was in accordance with Bell Labs Computing Science Research Center tradition. That policy was modified only when more openness about the future of C++ became needed to ensure a free and open debate over the evolution of C++ among its many non-AT&T users.

C with Classes was explicitly designed to allow better organization of programs; “computation” was considered a problem solved by C. I was very concerned that improved program structure was not achieved at the expense of run-time overhead compared to C. The explicit aim was to match C in terms of run-time, code compactness, and data compactness. To wit: Someone once demonstrated a 3% systematic decrease in overall run-time efficiency compared with C caused by the use of a spurious temporary introduced into the function return mechanism by the C with Classes preprocessor. This was deemed unacceptable and the overhead promptly removed. Similarly, to ensure layout compatibility with C and thereby avoid space overhead, no “housekeeping data” was placed in class objects.

Another major concern was to avoid restrictions on the domain where C with Classes could be used. The ideal – which was achieved – was that C with Classes could be used for whatever C could be used for. This implied that in addition to matching C in efficiency, C with Classes could not provide benefits at the expense of removing “dangerous” or “ugly” features of C. This observation/principle had to be repeated often to people (rarely C with Classes users) who wanted C with Classes made safer by increasing static type checking along the lines of early Pascal. The alternative way of providing “safety,” inserting run-time checks for all unsafe operations, was (and is) considered reasonable for debugging environments, but the language could not guarantee such checks without leaving C with a large advantage in run-time and space efficiency. Consequently, such checks were not provided for C with Classes, although some C++ environments do provide such checks for debugging. In addition, users can and do insert run-time checks (see §16.10 and [2nd]) where needed and affordable.

C allows low-level operations, such as bit manipulation and choosing between different sizes of integers. There are also facilities, such as explicit unchecked type conversions, for deliberately breaking the type system. C with Classes and later C++ follow this path by retaining the low-level and unsafe features of C. In contrast to C, C++ systematically eliminates the need to use such features except where they are essential and performs unsafe operations only at the explicit request of the programmer. I strongly felt then, as I still do, that there is no one right way of writing every program, and a language designer has no business trying to force programmers to use a particular style. The language designer does, on the other hand, have an obligation to encourage and support a variety of styles and practices that have proven effective and to provide language features and tools to help programmers avoid the well-known traps and pitfalls.

2.2 Feature overview

The features provided in the initial 1980 implementation can be summarized:

[1] Classes (§2.3)

[2] Derived classes (but no virtual functions yet, §2.9)

[3] Public/private access control (§2.10)

[4] Constructors and destructors (§2.11.1)

[5] Call and return functions (later removed, §2.11.3)

[6] friend classes (§2.10)

[7] Type checking and conversion of function arguments (§2.6)

During 1981, three more features were added:

[8] Inline functions (§2.4.1)

[9] Default arguments (§2.12.2)

[10] Overloading of the assignment operator (§2.12.1)

Since a preprocessor was used for the implementation of C with Classes, only new features (that is, features not present in C) had to be described and the full power of C was available to users. Both aspects were appreciated at the time. Having C as a subset dramatically reduced the support and documentation work needed. This was most important because for several years I did all of the C with Classes and C++ documentation and support in addition to doing the experimentation, design, and implementation. Having all C features available further ensured that no limitations introduced through prejudice or lack of foresight on my part would deprive a user of features already available in C. Naturally, portability to machines supporting C was ensured. Initially, C with Classes was implemented and used on a DEC PDP/11, but soon it was ported to machines such as the DEC VAX and Motorola 68000 based machines.

C with Classes was still seen as a dialect of C rather than as a separate language. Furthermore, classes were referred to as “an abstract data type facility” [Strous-trup,1980]. Support for object-oriented programming was not claimed until the provision of virtual functions in C++ in 1983 [Stroustrup,1984].

2.3 Classes

Clearly, the most important aspect of C with Classes – and later of C++ – was the class concept. Many aspects of the C with Classes class concept can be observed by examining a simple example from [Stroustrup,1980]f:

Click here to view code image

class stack {
  char   s[SIZE];  /* array of character  s  */
  char*  min;      /* pointer to bottom of stack */
  char*  top;      /* pointer to top of stack */
  char*  max;      /* pointer to top of allocate  d space */
  void   new();   /* initialize  function  (constructor)  */
public :
  void  push(char);
  char  pop();
};

A class is a user-defined data type. A class specifies the type of class members that define the representation of a variable of the type (an object of the class), the set of operations (functions) that manipulate such objects, and the access users have to these members. Member functions are typically defined “elsewhere:”

Click here to view code image

char stack.pop()
{
if (top <= min) error("stack underflow");
return *(--top);
}

Objects of class stack can now be defined and used:

Click here to view code image

class stack s1, s2;           /* two stack variables */
class stack * p1 = &s2;       /* p1 points to s2 */
class stack * p2 = new stack; /* p2 points to stack object
                                 allocated on free store */

s1.push('h');     /* use object directly */
p1->push('s');   /* use object through pointer */

Several key design decisions are reflected here:

[1] C with Classes follows Simula in letting the programmer specify types from which variables (objects) can be created, rather than, say, the Modula approach of specifying a module as a collection of objects and functions. In C with Classes (as in C++), a class is a type (§2.9). This is a key notion in C++. When class means user-defined type in C++, why didn’t I call it type? I chose class primarily because I dislike inventing new terminology and found Simula’s quite adequate in most cases.

[2] The representation of objects of the user-defined type is part of the class declaration. This has far-reaching implications (§2.4, §2.5). For example, it means that true local variables of a user-defined type can be implemented without the use of free store (also called heap store and dynamic store) or garbage collection. It also means that a function must be recompiled if the representation of an object it uses directly is changed. See §13.2 for C++ facilities for expressing interfaces that avoid such recompilation.

[3] Compile-time access control is used to restrict access to the representation. By default, only the functions mentioned in the class declaration can use names of class members (§2.10). Members (usually function members) specified in the public interface – the declarations after the public: label – can be used by other code.

[4] The full type (including both the return type and the argument types) of a function is specified for function members. Static (compile-time) type checking is based on this type specification (§2.6). This differed from C at the time, where function argument types were neither specified in interfaces nor checked in calls.

[5] Function definitions are typically specified “elsewhere” to make a class more like an interface specification than a lexical mechanism for organizing source code. This implies that separate compilation for class member functions and their users is easy and the linker technology traditionally used for C is sufficient to support C++ (§2.5).

[6] The function new() is a constructor, a function with a special meaning to the compiler. Such functions provided guarantees about classes (§2.11). In this case, the guarantee is that the constructor – known somewhat confusingly as a new-function at the time – is guaranteed to be called to initialize every object of its class before the first use of the object.

[7] Both pointers and non-pointer types are provided (as in both C and Simula). Like C and unlike Simula, pointers can point to objects of both user-defined and built-in types (§2.4).

[8] Like C, objects can be allocated in three ways: on the stack (automatic storage), at a fixed address (static storage), and on the free store (on the heap, dynamic storage). Unlike C, C with Classes provides specific operators new and delete for free store allocation and deallocation (§2.11.2).

Much of the further development of C with Classes and C++ can be seen as exploring the consequences of these design choices, exploiting their good sides, and compensating for the problems caused by their bad sides. Many, but by no means all, of the implications of these design choices were understood at the time; [Stroustrup,1980] is dated April 3, 1980. This section tries to explain what was understood at the time and gives pointers to sections that explain consequences and later realizations.

2.4 Run-Time Efficiency

In Simula, it is not possible to have local or global variables of class types; that is, every object of a class must be allocated on the free store using the new operator. Measurements of my Cambridge simulator had convinced me this was a major source of inefficiency. Later, Karel Babcisky from the Norwegian Computer Centre presented data on Simula run-time performance that confirmed my conjecture [Babcisky, 1984]. For that reason alone, I wanted global and local variables of class types.

In addition, having different rules for the creation and scope of built-in and user-defined types is inelegant, and I felt that on occasion my programming style had been cramped by the absence of local and global class variables from Simula. Similarly, I had missed the ability to have pointers to built-in types in Simula, so I wanted the C notion of pointers to apply uniformly over user-defined and built-in types. This is the origin of the notion that over the years grew into a rule of thumb for the design of C++: User-defined and built-in types should behave the same relative to the language rules and receive the same degree of support from the language and its associated tools. When the ideal was formulated built-in types received by far the best support, but C++ has overshot that target so that built-in types now receive slightly inferior support compared to user-defined types (§15.11.3).

The initial version of C with Classes did not provide inline functions to take further advantage of the availability of the representation. Inline functions were soon provided, though. The general reason for the introduction of inline functions was concern that the cost of crossing a protection barrier would cause people to refrain from using classes to hide representation. In particular, [Stroustrup, 1982b] observes that people had made data members public to avoid the function call overhead incurred by a constructor for simple classes where only one or two assignments are needed for initialization. The immediate cause for the inclusion of inline functions into C with Classes was a project that couldn’t afford function call overhead for some classes involved in real-time processing. For classes to be useful in that application, crossing the protection barrier had to be free. Only the combination of having the representation available in the class declaration and having the calls of the public (interface) functions inlined could deliver that.

Over the years, considerations along these lines grew into the C++ rule that it was not sufficient to provide a feature, it had to be provided in an affordable form. Most definitely, affordable was seen as meaning “affordable on hardware common among developers” as opposed to “affordable to researchers with high-end equipment” or “affordable in a couple of years when hardware will be cheaper.” C with Classes was always considered as something to be used now or next month rather than simply a research project to deliver something in a couple of years hence.

2.4.1 Inlining

Inlining was considered important for the utility of classes. Therefore, the issue was more how to provide it than whether to provide it. Two arguments won the day for the notion of having the programmer select which functions the compiler should try to inline. First, I had had poor experiences with languages that left the job of inlining to compilers “because clearly the compiler knows best.” The compiler only knows best if it has been programmed to inline and it has a notion of time/space optimization that agrees with mine. My experience with other languages was that only “the next release” would actually inline, and it would do so according to an internal logic that a programmer couldn’t effectively control. To make matters worse, C (and therefore C with Classes and later C++) has genuine separate compilation so that a compiler never has access to more than a small part of the program (§2.5). Inlining a function for which you don’t know the source appears feasible given advanced linker and optimizer technology, but such technology wasn’t available at the time (and still isn’t in most environments). Furthermore, techniques that require global analysis, such as automatic inlining without user support, tend not to scale well to very large programs. C with Classes was designed to deliver efficient code given a simple, portable implementation on traditional systems. Given that, the programmer had to help. Even today, the choice seems right.

In C with Classes, only member functions could be inlined and the only way to request a function to be inlined was to place its body within the class declaration. For example:

Click here to view code image

class  stack {
    /*   ... */
    char pop()
    {   if (top <= min) error("stack underflow");
        return *--top;
    }
};

The fact that this made class declarations messier was observed at the time and seen as a good thing in that it discourages overuse of inline functions. The inline keyword and the ability to inline nonmember functions came with C++. For example, in C++ one can write the example like this:

Click here to view code image

class stack {   // C++
    // ...
    char pop();
};

inline char stack::pop()  // C++
{
    if (top <= min) error("stack  underflow");
    return *--top;
}

An inline directive is only a hint that the compiler can and often does ignore. This is a logical necessity because one can write recursive inline functions that cannot at compile time be proven not to cause infinite recursions; trying to inline one of those would lead to infinite compilations. Leaving inline a hint is also a practical advantage because it allows the compiler writer to handle “pathological” cases by simply not inlining.

C with Classes required – as its successor still requires – that an inline function must have a unique definition in a program. Defining a function like pop() above differently in different compilation units would lead to chaos by subverting the type system. Given separate compilation, it is extremely hard to guarantee that such subversion hasn’t taken place in a large system. C with Classes didn’t check, and most C++ implementations still don’t try to guarantee that an inline function hasn’t been defined differently in separate compilation units. However, this theoretical problem has not surfaced as a real problem largely because inline functions tend to be defined in header files together with classes – and class declarations also need to be unique in a program.

2.5 The Linkage Model

The issue of how separately compiled program fragments are linked together is critical for any programming language and to some extent determines the features the language can provide. One of the critical influences on the development of C with Classes and C++ was the decision that

[1] Separate compilation should be possible with traditional C/Fortran UNIX/DOS style linkers.

[2] Linkage should be type safe.

[3] Linkage should not require any form of database (although one could be used to improve a given implementation).

[4] Linkage to program fragments written in other languages such as C, assembler, and Fortran should be easy and efficient.

C uses header files to ensure consistent separate compilation. Declarations of data structure layouts, functions, variables, and constants are placed in header files that are typically textually included into every source file that needs the declarations. Consistency is ensured by placing adequate information in the header files and ensuring that the header files are consistently included. C++ follows this model up to a point.

The reason that layout information can be present in a C++ class declaration (though it doesn’t have to be; see §13.2) is to ensure that the declaration and use of true local variables is easy and efficient. For example:

void f()
{
    class stack s;
    int c;
    s.push ('h');
    c = s.pop();
}

Using the stack declaration from §2.3 and §2.4.1, even a simple-minded C with Classes implementation can ensure that no use is made of free store for this example, that the call of pop() is inlined so that no function call overhead is incurred and that the non-inlined call of push() can invoke a separately compiled function. In this, C++ resembles Ada.

At the time, I felt there was a trade-off between having separate interface and implementation declarations (as in Modula-2) plus a tool (linker) to match them up, and having a single class declaration plus a tool (a dependency analyzer) that considered the interface part separately from the implementation details for the purposes of recompilation. It appears I underestimated the complexity of the latter and that the proponents of the former approach underestimate the cost (in terms of porting problems and run-time overhead) of the former.

I also made matters worse for the C++ community by not properly explaining the use of derived classes to achieve the separation of interface and implementation. I tried (see for example [Stroustrup,1986,§7.6.2]), but somehow I never got the message across. I think the reason for this failure was primarily that it never occurred to me that many (most?) C++ programmers and non-C++ programmers looking at C++ thought that because you could put the representation right in the class declaration that specified the interface, you had to.

I made no attempt to provide tools to enforce type-safe linkage for C with Classes; that had to wait for Release 2.0 of C++. However, I remember talking to Dennis Ritchie and Steve Johnson to establish that type safety across compilation boundaries was considered a part of C. We just lacked the means of enforcement for real programs and had to rely on tools such as Lint [Kernighan,1984].

In particular, Steve Johnson and Dennis Ritchie affirmed that C was intended to have name equivalence rather than structural equivalence. For example:

struct A { int x, y; };
struct B { int x, y; };

defines two incompatible types A and B. Further:

Click here to view code image

struct C { int x, y; }; // in file 1
struct C { int x, y; }; // in file 2

defines two different types, both called C, and a compiler that can do checking across compilation unit boundaries should give a “double definition” error. The reason for this rule is to minimize maintenance problems. Such identical declarations are unlikely to occur except through copying. Once copied into different source files, however, the declarations are unlikely to stay identical forever. The moment one declaration – and not the other – is changed, the program will mysteriously fail to work correctly.

As a practical matter, C and therefore C++ guarantees that similar structures such as A and B have similar layout so that it is possible to convert between them and use them in the obvious manner:

Click here to view code image

extern f(struct A*);

void g(struct A* pa, struct B* pb)
{

     f(pa);  /* fine */
     f(pb);  /* error: A* expected */

     pa = pb;             /* error:  A* expected */
     pa = (struct A*)pb;  /* ok: explicit conversion */

     pb-<x = 1;
     if (pa-<x != pb-<x) error("bad implementation");
}

Name equivalence is the bedrock of the C++ type system and the layout compatibility rules ensure that explicit conversions can be used to provide low-level services that in other languages have been supplied through structural equivalence. I prefer name equivalence over structural equivalence because I consider it the safest and cleanest model. I was therefore pleased to find that this decision didn’t get me into compatibility problems with C and didn’t complicate the provision of low-level services.

This later grew into the “one-definition rule:” every function, variable, type, constant, etc., in C++ has exactly one definition.

2.5.1 Simple-Minded Implementations

The concern for simple-minded implementations was partly a necessity caused by the lack of resources for developing C with Classes and partly a distrust of languages and mechanisms that required clever techniques. An early formulation of a design goal was that C with Classes “should be implementable without using an algorithm more complicated than a linear search.” Wherever that rule of thumb was violated – as in the case of function overloading (§11.2) – it led to semantics that were more complicated than I felt comfortable with. Frequently, it also led to implementation complications.

The aim – based on my Simula experience – was to design a language that would be easy enough to understand to attract users and easy enough to implement to attract implementers. A relatively simple implementation had to be able to deliver code that compared favorably with C code in correctness, run-time speed, and code size. A relatively novice user in a relatively unsupportive programming environment had to be able to use this implementation for real projects. Only when both of these criteria were met could C with Classes and later C++ expect to survive in competition with C. An early formulation of that principle was that “C with Classes has to be a weed like C or Fortran because we cannot afford to take care of a rose like Algol68 or Simula. If we deliver an implementation and go away for a year, we want to find several systems running when we come back. That will not happen if complicated maintenance is needed or if a simple port to a new machine takes more than a week.”

This was part of a philosophy of fostering self-sufficiency among users. The aim was always – and explicitly – to develop local expertise in all aspects of using C++. Most organizations must follow the opposite strategy. They keep users dependent on services that generate revenues for a central support organization, consultants, or both. In my opinion, this contrast is a fundamental difference between C++ and many other languages.

The decision to work in the relatively primitive – and almost universally available – framework of the C linking facilities caused the fundamental problem that a C++ compiler must always work with only partial information about a program. An assumption made about a program could possibly be violated by a program written tomorrow in some other language (such as C, Fortran, or assembler) and linked in -possibly after the program has started executing. This problem surfaces in many contexts. It is hard for an implementation to guarantee that

[1] Something is unique.

[2] Information is consistent (in particular, that type information is consistent).

[3] Something is initialized.

In addition, C provides only the feeblest support for the notion of separate name-spaces so that avoiding namespace pollution by separately written program segments becomes a problem. Over the years, C++ has tried to face all of these challenges without departing from the fundamental model and technology that gives portability and efficiency, but in the C with Classes days we simply relied on the C technique of header files.

Through the acceptance of the C linker came another rule of thumb for the development of C++: C++ is just one language in a system and not a complete system. In other words, C++ accepts the role of a traditional programming language with a fundamental distinction between the language, the operating system, and other important parts of the programmer’s world. This delimits the role of the language in a way that is hard to do for a language, such as Smalltalk or Lisp, that is conceived as a complete system or environment. It makes it essential that a C++ program fragment can call program fragments written in other languages and that a C++ program fragment can itself be called by program fragments written in other languages. Being “just a language” also allows C++ implementations to benefit directly from tools written for other languages.

The need for a programming language and the code written in it to be just a cog in a much larger machine is of utmost importance to most industrial users. Yet such coexistence with other languages and systems was apparently not a major concern to most theoreticians, would-be perfectionists, and academic users. I believe this to be one of the main reasons for C++’s success.

C with Classes was almost source-compatible with C. However, it was never 100% C compatible; for example, words such as class and new are perfectly good identifier names in C, but they are keywords in C with Classes and its successors. It was, however, link compatible. C functions could be called from C with Classes, C with Classes functions could be called from C, a struct had the same layout in both languages so that passing both simple and composite objects between the languages was simple and efficient. This link compatibility has been maintained for C++ (with a few simple and explicit exceptions that can be avoided by programmers when necessary (§3.5.1). Over the years, my experience and that of my colleagues has been that link compatibility is much more important than source compatibility. This, at least, is the case when identical source code gives the same results on both C and C++ or alternatively fails to compile or link in one of the languages.

2.5.2 The Object Layout Model

The basic model of an object was fundamental to the design of C with Classes in the sense that I always maintained a clear view of what an object looked like in memory and considered how language features affected operations on such objects. The evolution of the object model is fundamental to the evolution of C++.

A C with Classes object was simply a C structure. Thus, the layout of

class stack {
    char s[10];
    char* min;
    char* top;
    char* max;
    void new();
public :
    void push();
    char pop();
};

is the same as for

Click here to view code image

struct stack { /* generated C code */
    char s[10];
    char* min;
    char* top;
    char* max;
};

that is

A compiler may add some “padding” between and after the members for alignment, but otherwise the size of the object is the sum of the sizes of the members. Thus, memory usage is minimized.

Run-time overhead is similarly minimized by a direct mapping from a call of a member function

Click here to view code image

void stack.push(char c)
{
    if (top>max) error("stack overflow");
    *top++ = c;
}

void g(class stack* p)
{
    p->push ('c');
}

to the call of an equivalent C function in the generated code:

Click here to view code image

void stack__push(this,c) /* generated C code */
struct stack* this;
char c;
{
    if  ((this-<top)>(this-<max))    error("stack overflow");
    *(this-<top)+      + = c;

}

void g(p) struct stack* p; /* generated C code */
{
    stack__push(p, 'c');
}

In every member function, a pointer called this refers to the object for which the member function was called. Stu Feldman remembers that in the very first C with Classes implementation, the programmer couldn’t refer directly to this. After he pointed that out to me, I promptly remedied the problem. Without this or some equivalent mechanism, member functions cannot be used for linked list manipulation.

The this pointer is C++’s version of the Simula THIS reference. Sometimes, people ask why this is a pointer rather than a reference and why it is called this rather than self. When this was introduced into C with Classes, the language didn’t have references, and C++ borrows its terminology from Simula rather than from Smalltalk.

Had stack.push() been declared inline, the generated code would have looked like this:

Click here to view code image

void g(p) /* generated C code */
struct stack* p;
{
if ((p->top)>(p->max)) error("stack overflow");
*(p->top)++ = 'c';
}

This is of course exactly the code a programmer would have written in C.

2.6 Static Type Checking

I have no recollection of discussions, no design notes, and no recollection of any implementation problems concerning the introduction of static (“strong”) type checking into C with Classes. The C with Classes syntax and rules, the ones subsequently adopted for the ANSI C standard, simply appeared fully formed in the first C with Classes implementation. After that, a minor series of experiments led to the current (stricter) C++ rules. Static type checking was to me, after my experience with Simula and Algol68, a simple must and the only question was exactly how it was to be added.

To avoid breaking C code, I decided to allow the call of an undeclared function and not perform type checking on such undeclared functions. This was of course a major hole in the type system, and several attempts were made to decrease its importance as the major source of programming errors before finally – in C++ – the hole was closed by making a call of an undeclared function illegal. One simple observation defeated all attempts to compromise, and thus maintain a greater degree of C compatibility: As programmers learned C with Classes, they lost the ability to find run-time errors caused by simple type errors. Having come to rely on the type checking and type conversion provided by C with Classes, they lost the ability to quickly find the silly errors that creep into C programs through the lack of checking. Further, they failed to take the precautions against such silly errors that good C programmers take as a matter of course. After all, “such errors don’t happen in C with Classes.” Thus, as the frequency of run-time errors caused by uncaught argument type errors decreases, their seriousness and the time spent finding them increases. The result was seriously annoyed programmers demanding further tightening of the type system.

The most interesting experiment with “incomplete static checking” was the technique of allowing calls of undeclared functions, but noting the type of the arguments used so that a consistency check could be done when further calls were seen. When Walter Bright many years later independently discovered this trick he named it autoprototyping, using the ANSI C term prototype for a function declaration. The experience was that autoprototyping caught many errors and initially increased a programmer’s confidence in the type system. However, since consistent errors and errors in a function called only once in a compilation were not caught, autoprototyping ultimately destroyed programmer confidence in the type checker and induced a sense of paranoia even worse than I have seen in C or BCPL programmers.

C with Classes introduced the notation f(void) for a function f that takes no arguments as a contrast to f() that in C declares a function that can take any number of arguments of any type without any type check. My users soon convinced me, however, that the f(void) notation wasn’t elegant, and that having functions declared f() accept arguments wasn’t intuitive. Consequently, the result of the experiment was to have f() mean a function f that takes no arguments, as any novice would expect. It took support from both Doug McIlroy and Dennis Ritchie for me to build up the courage to make this break from C. Only after they used the word abomination about f(void) did I dare give f() the obvious meaning. However, to this day, C’s type rules are much more lax than C++’s, and ANSI C adopted “the abominable f(void)” from C with Classes.

2.6.1 Narrowing Conversions

Another early attempt to tighten C with Classes’ type rules was to disallow “information destroying” implicit conversions. Like others, I had been badly bitten by examples equivalent to (but naturally not as easy to spot in a real program) as these:

Click here to view code image

void f()

    long int lng = 65000;
    int  i1 = lng;   /* i1 becomes negative (-536)  */
                    /* on machines with 16 bit ints */
    int i2 = 257;
    char c = i2;    /* truncates: c becomes 1       */
                    /* on machines with 8 bit chars */
}

I decided to try to ban all conversions that were not value preserving, that is, to require an explicit conversion operator wherever a larger object was stored into a smaller:

Click here to view code image

void g(long lng, int i) /* experiment */
{
    int i1 = lng;   /* error: narrowing conversion */
    i1 = (int)lng;  /* truncates for 16 bit ints   */

    char c = i;     /* error: narrowing conversion */
    c = (char)i     /* truncates                   */
}

The experiment failed miserably. Every C program I looked at contained large numbers of assignments of ints to char variables. Naturally, since these were working programs, most of these assignments were perfectly safe. That is, either the value was small enough not to become truncated, or the truncation was expected or at least harmless in that particular context. There was no willingness in the C with Classes community to make such a break from C. I’m still looking for ways to compensate for these problems (§14.3.5.2).

2.6.2 Use of Warnings

I considered introducing run-time checks for the values assigned, but that would imply a high cost in time and code size, and also detect the problems far too late for my taste. Therefore, run-time checks for conversions – and more importantly, in general – were relegated to the category of “ideas for future debugging support.” Instead, I used a technique that was to become standard for dealing with what I considered deficiencies in the C language that were too serious to ignore, but too ingrained in the structure of C to remove. I made the C with Classes preprocessor (and later my C++ compiler) issue warnings:

Click here to view code image

void f(long lng, int i)
{
    int i1 = lng;   // implicit conversion: warning
    i1 = (int)lng;  // explicit conversion: no warning

    char c = i;     // too common to repair: no warning
}

Unconditional warnings were (and still are) issued for long->int and double->int conversions, because I really don’t see any excuse for having such conversions legal. They are simply a result of the historical accident that floating point arithmetic was introduced into C before explicit conversions were. I have had no complaints about these warnings, and I and others have been saved by them many times. The int->char conversion, however, I didn’t feel able to do anything about. To this day, such conversions pass the AT&T C++ compiler without even a warning.

The reason for this is that I decided to use unconditional warnings exclusively for things that “had a higher than 90% chance of actually catching an error.” This reflected the experience that C-compiler and Lint warnings more often than not are “wrong” in the sense that they warn against something that doesn’t in fact cause the program to misbehave. This leads programmers to ignore warnings from C compilers or to heed them only under protest. My intent was to ensure that ignoring a C++ warning would be seen as a dangerous folly; I feel I succeeded. Thus, warnings are used to compensate for problems that cannot be fixed through language changes because of C compatibility requirements and also as a way of easing the transition from C to C++. For example:

Click here to view code image

class X {
  // ...
}
g (int i, int x, int j)
         // warning: class X defined as return type for g()
         // (did you forget a ';' after '}' ?)
         // warning: j not used
{
   if (i = 7) { // warning: constant assignment
                // in condition
      // ...
   }
      // ...
   if (x&077 == 0) {  // warning: == expression
                      // as operand for &
      // ...
   }
}

Even the first Cfront release (§3.3) produced these warnings. They were the result of a design decision and not an afterthought.

Much later, the first of these warnings was made into an error by banning the definition of new types in return types and argument types.

2.7 Why C?

A common question at C with Classes presentations was “Why use C? Why didn’t you build on, say, Pascal?” One version of my answer can be found in [Stroustrup, 1986c]:

“C is clearly not the cleanest language ever designed nor the easiest to use so why do so many people use it?

[1] C is flexible: It is possible to apply C to most every application area and to use most every programming technique with C. The language has no inherent limitations that preclude particular kinds of programs from being written.

[2] C is efficient: The semantics of C are “low level” that is, the fundamental concepts of C mirror the fundamental concepts of a traditional computer. Consequently, it is relatively easy for a compiler and/or a programmer to efficiently utilize hardware resources for C programs.

[3] C is available: Given a computer, whether the tiniest micro or the largest super-computer, chances are that there is an acceptable quality C compiler available and that that C compiler supports an acceptably complete and standard C language and library. Libraries and support tools are also available, so that a programmer rarely needs to design a new system from scratch.

[4] C is portable: A C program is not automatically portable from one machine (and operating system) to another, nor is such a port necessarily easy to do. It is, however, usually possible and the level of difficulty is such that porting even major pieces of software with inherent machine dependencies is typically technically and economically feasible.

Compared with these first-order advantages, the second-order drawbacks like the curious C declarator syntax and the lack of safety of some language constructs become less important. Designing “a better C” implies compensating for the major problems involved in writing, debugging, and maintaining C programs without compromising the advantages of C. C++ preserves all these advantages and compatibility with C at the cost of abandoning claims to perfection and of some compiler and language complexity. However, designing a language from scratch does not ensure perfection and the C++ compilers compare favorably in run-time, have better error detection and reporting, and equal the C compilers in code quality.”

This formulation is more polished than I could have managed in the early C with Classes days, but it does capture the essence of what I considered important about C and that I did not want to lose in C with Classes. Pascal was considered a toy language [Kernighan,1981], so it seemed easier and safer to add type checking to C than to add the features considered necessary for systems programming to Pascal. At the time, I had a positive dread of making mistakes of the sort where the designer, out of misguided paternalism or plain ignorance, makes the language unusable for real work in important areas. The ten years that followed clearly showed that choosing C as a base left me in the mainstream of systems programming where I intended to be. The cost in language complexity has been considerable, but manageable.

At the time, I considered Modula-2, Ada, Smalltalk, Mesa [Mitchell, 1979], and Clu as alternatives to C and as sources for ideas for C++ [Stroustrup, 1984c] so there was no shortage of inspiration. However, only C, Simula, Algol68, and in one case BCPL left noticeable traces in C++ as released in 1985. Simula gave classes, Algol68 operator overloading (§3.6), references (§3.7), and the ability to declare variables anywhere in a block (§3.11.5), and BCPL gave // comments (§3.11.1).

There were several reasons for avoiding major departures from C style. I saw the merging of C’s strengths as a systems programming language with Simula’s strengths for organizing programs as a significant challenge in itself. Adding significant features from other sources could easily lead to a “shopping list” language and destroy the integrity of the resulting language. To quote from [Stroustrup, 1986]:

“A programming language serves two related purposes: it provides a vehicle for the programmer to specify actions to be executed and a set of concepts for the programmer to use when thinking about what can be done. The first aspect ideally requires a language that is “close to the machine,” so that all important aspects of a machine are handled simply and efficiently in a way that is reasonably obvious to the programmer. The C language was primarily designed with this in mind. The second aspect ideally requires a language that is “close to the problem to be solved” so that the concepts of a solution can be expressed directly and concisely. The facilities added to C to create C++ were primarily designed with this in mind.”

Again, this formulation is more polished than I could have managed during the early stages of the design of C with Classes, but the general idea was clear. Departures from the known and proven techniques of C and Simula would have to wait for further experience with C with Classes and C++ and for further experiments. I firmly believe – and believed then – that language design is not just design from first principles, but an art that requires experience, experiments, and sound engineering tradeoffs. Adding a major feature or concept to a language should not be a leap of faith, but a deliberate action based on experience and fitting into a framework of other features and ideas of how the resulting language can be used. The post-1985 evolution of C++ shows the influence of ideas from Ada (templates, §15; exceptions, §16; namespaces, §17), Clu (exceptions, §16), and ML (exceptions, §16).

2.8 Syntax Problems

Could I have “fixed” the most annoying deficiencies of the C syntax and semantics at some point before C++ was made generally available? Could I have done so without removing useful features (to C with Classes’ users in their environments – as opposed to an ideal world) or introducing incompatibilities that were unacceptable to C programmers wanting to migrate to C with Classes? I think not. In some cases, I tried, but I backed out my changes after receiving complaints from outraged users.

2.8.1 The C Declaration Syntax

The part of the C syntax I disliked most was the declaration syntax. Having both prefix and postfix declarator operators is the source of a fair amount of confusion. For example:

Click here to view code image

int *p[10]; /* array of 10 pointers to int, or */
/* pointer to array of 10 ints? */

Allowing the type specifier to be omitted (meaning int by default) also led to complications. For example:

Click here to view code image

            /* C style   (proposed banned):   */

static a;   /* implicit: type of x 'a' is int */
f();        /* implicit: returns int          */

           // proposed C with Classes style:
static int a;
int f();

The negative reaction to changes in this area from users was very strong. They considered the “terseness” allowed by C essential to the point of refusing to use a “fascist” language that required them to write redundant type specifiers. I backed out the change. I don’t think I had a choice. Allowing that implicit int is the source of many of the annoying problems with the C++ grammar today. Note that the pressure came from users, not management or arm-chair language experts. Finally, ten years later, the C++ ANSI/ISO standard committee (§6) has decided to deprecate implicit int. That means that we may get rid of it in another decade or so. With the help of tools and compiler warnings, individual users can now start protecting themselves from confusions caused by implicit int, such as

Click here to view code image

void f(const T); // const argument of type T, or
// const int argument named T?
// (it's a const argument of type T)

The function definition syntax with the argument types within the function parentheses was, however, used for C with Classes and C++, and later adopted for ANSI C:

Click here to view code image

f(a,b) char b;   /* K&R C style function definition */
{
    /* ... */
}

int f(int a, char b) // C++ style function definition
{
    // ...
}

Similarly, I considered the possibility of introducing a linear notation for declarators. The C trick of having the declaration of a name mimic its use leads to declarations that are hard to read and write, and maximizes the opportunity for humans and programs to confuse declarations and expressions. Many people had observed that the problem with C’s declarator syntax was that the declarator operator * (“pointer to”) is prefix, whereas the declarator operators [] (“array of”) and () (“function returning”) are postfix. This forces people to use parentheses to disambiguate cases such as:

Click here to view code image

                   /* C style:                   */
int* v[10];        /* array of pointers to ints */
int (*p) [10];     /* pointer to array of ints   */

Together with Doug McIlroy, Andrew Koenig, Jonathan Shopiro, and others I considered introducing postfix “pointer to” operator -> as an alternative to the prefix *:

Click here to view code image

                  // radical   alternative:
v: [10]->int;     // array of pointers to ints
p: ->[10]int;     // pointer to array of ints

                  // less radical alternative:
int v[10]->;      // array of pointers to ints
int p->[10];      // pointer to array of  ints

The less radical alternative has the advantage of allowing the postfix -> declarator to coexist with the prefix * declarator during a transition period. After a transition period the * declarator and the redundant parentheses could have been removed from the language. A noticeable benefit of this scheme is that parentheses are only needed to express “function” so that an opportunity for confusion and grammar subtleties could be removed (see also [Sethi, 1981]). Having all declarator operators postfix would ensure that declarations can be read from left to right. For example:

Click here to view code image

int f(char)->[10]->(double)->;

meaning a function f returning a pointer to an array of pointers to functions returning a pointer to int. Try to write that in straight C/C++! Unfortunately, I fumbled the idea and didn’t ever deliver a complete implementation. Instead, people build up complicated types incrementally using typedef:

Click here to view code image

typedef int* DtoI(double); // function taking a double and
                           // returning a pointer to int
typedef DtoI* V10[10];     // array of 10 pointers to DtoI
V10* f(char);              // f takes a char and returns
                           // a pointer to V10

My eventual rationale for leaving things as they were was that any new syntax would (temporarily at least) add complexity to a known mess. Also, even though the old style is a boon to teachers of trivia and to people who want to ridicule C, it is not a significant problem for C programmers. In this case, I’m not sure if I did the right thing, though. The agony to me and other C++ implementers, documenters, and tool builders caused by the perversities of syntax has been significant. Users can – and do – of course insulate themselves from such problems by writing in a small and easily understood subset of the C/C++ declaration syntax (§7.2).

2.8.2 Structure Tags vs. Type Names

A significant syntactic simplification for the benefit of users was introduced into C++ at the cost of some extra work to implementers and some C compatibility problems. In C, the name of a structure, a “structure tag,” must always be preceded by the keyword struct. For example

Click here to view code image

struct buffer a; /* 'struct' is necessary in C */

In the context of C with Classes, this had annoyed me for some time because it made user-defined types second-class citizens syntactically. Given my lack of success with other attempts to clean up the syntax, I was reluctant and only made the change – at the time C with Classes evolved into C++ – at the urging of Tom Cargill. The name of a struct, union, or class is a type name in C++ and requires no special syntactic identification:

buffer a; // C++

The resulting fights over C compatibility lasted for years (see also §3.12). For example, the following is legal C:

Click here to view code image

struct S { int a; };
int S;
void f(struct S x)
{
x.a = S; // S is an int variable
}

It is also legal C with Classes and legal C++, yet for years we struggled to find a formulation that would allow such (marginally crazy, but harmless) examples in C++ for compatibility. Allowing such examples implies that we must reject

Click here to view code image

void g(S x) // error: S is an int variable
{
x.a = S; // S is an int variable
}

The real need to address this particular issue came from the fact that some standard UNIX header files, notably, stat.h, rely on a struct and a variable or function having the same name. Such compatibility issues are important and a delight for language lawyers. Unfortunately, until a satisfactory – and usually trivially simple -solution is found, such problems absorb an undesirable amount of time and energy. Once a solution is found, a compatibility problem becomes indescribably boring because it has no inherent intellectual value, only practical importance. The C++ solution to the C multiple namespace problem is that a name can denote a class and also a function or a variable. If it does, the name denotes the non-class unless explicitly qualified by one of the keywords struct, class, and union.

Dealing with stubborn old-time C users, would-be C experts, and genuine C/C++ compatibility issues has been one of the most difficult and frustrating aspects of developing C++. It still is.

2.8.3 The Importance of Syntax

I am of the opinion that most people focus on syntax issues to the detriment of type issues. The critical issues in the design of C++ were always those of type, ambiguity, and access control, not those of syntax.

It is not that syntax isn’t important; it is immensely important because the syntax is quite literally what people see. A well-chosen syntax significantly helps programmers learn new concepts and avoids silly errors by making them harder to express than their correct alternatives. However, the syntax of a language should be designed to follow the semantic notions of the language, not the other way around. This implies that language discussions should focus on what can be expressed rather than how it is expressed. An answer to the what often yields an answer to the how, whereas a focus on syntax usually degenerates into an argument over personal taste.

A subtle aspect of C compatibility discussions is that old-time C programmers are comfortable with the old ways of doing things and can therefore be quite intolerant of the incompatibilities needed to support styles of programming that C wasn’t designed for. Conversely, non-C programmers usually underestimate the value that C programmers attribute to the C syntax.

2.9 Derived Classes

The derived class concept is C++’s version of Simula’s prefixed class notion and thus a sibling of Smalltalk’s subclass concept. The names derived and base were chosen because I never could remember what was sub and what was super and observed that I was not the only one with this particular problem. I also noted that many people found it counterintuitive that a subclass typically has more information than its superclass. In inventing the terms derived class and base class, I departed from my usual principle of not inventing new names where old ones exist. In my defense, I note that I have never observed any confusion about what is base and what is derived among C++ programmers, and that the terms are trivially easy to learn even for people who don’t have a grounding in mathematics.

The C with Classes concept was provided without any form of run-time support. In particular, the Simula (and later C++) concept of a virtual function was missing. The reason for this was that I – with good reason, I think – doubted my ability to teach people how to use them, and even more, my ability to convince people that a virtual function is as efficient in time and space as an ordinary function as typically used. Often, people with Simula and Smalltalk experience still don’t quite believe that until they have had the C++ implementation explained to them in detail – and many still harbor irrational doubts after that.

Even without virtual functions, derived classes in C with Classes were useful for building new data structures out of old ones and for associating operations with the resulting types. In particular, they allowed list and task classes to be defined [Stroustrup, 1980,1982b].

2.9.1 Polymorphism without Virtual Functions

In the absence of virtual functions, a user could use objects of a derived class and treat base classes as implementation details. For example, given a vector class with elements indexed from 0 and no range checking:

Click here to view code image

class vector {
/* ... */
int get_elem(int i);
};

one can build a range-checked vector with elements in a specified range:

Click here to view code image

class vec : vector {
    in t  hi, lo;
public  :
    /* ... */
    new(int lo, int hi);
    get_elem(int i);
};
int vec.get_elem(int i)
{
    if   (i<lo || hi<i) error("range error");
    return vector.get_elem(i-lo);
}

Alternatively, an explicit type field could be introduced in a base class and used together with explicit type casts. The former strategy was used where the user only sees specific derived classes and “the system” sees only the base classes. The latter strategy was used for various application classes where, in effect, a base class was used to implement a variant record for a set of derived classes.

For example, [Stroustrup, 1982b] presents this ugly code for retrieving an object from a table and using it based on a type field:

Click here to view code image

class elem { /* properties to be put into a table */ };
class table { /* table data and lookup functions */ };

class cl_name * cl;    /* cl_name is derived   from elem */
class po_name * po;    /* po_name is derived  from elem */
class hashed * table;  /* hashed is derived from   table */

elem * p = table->look (" carrot1 1);

if (P) {
   switch (p->type) { /* type field in elem objects */
     case PO_NAME:
       po   = (class po_name *) p; /* explicit type conversion */
       ...
       break;
     case CL_NAME:
       cl = (class cl_name *) p; /* explicit type conversion */
       ...
       break;
     default:
       error("unknown type of element");
  }
}
else
   error("carrot not defined");

Much of the effort in C with Classes and C++ has been to ensure that programmers needn’t write such code.

2.9.2 Container Classes Without Templates

Most important in my thinking at the time and in my own code was the combination of base classes, explicit type conversions, and (occasionally) macros to provide generic container classes. For example, [Stroustrup, 1982b] demonstrates how a list that holds objects of a single type can be built from a list of links:

Click here to view code image

class wordlink : link
{
    char word[SIZE];
public:
    void clear(void);
    class wordlink * get(void)
        { return  (class wordlink *) link.get(); };
    void put(class wordlink * p) { link.put(p); };
};

Because every link that is put() onto a list through a wordlink must be a wordlink, it is safe to cast every link that is taken off the list using get() back to a wordlink. Note the use of private inheritance (the default in the absence of the keyword public in the specification of the base class link; §2.10). Allowing a wordlink to be used as a plain link would have compromised type safety.

Macros were used to provide generic types. Quoting from [Stroustrup,1982bl: “The class stack example in the introduction explicitly defined the stack to be a stack of characters. That is sometimes too specific. What if a stack of long integers was also needed? What if a class stack was needed for a library so that the actual stack element type could not be known in advance? In these cases the class stack declaration and its associated function declarations should be written so that the element type can be provided as an argument when a stack is created in the same way as the size was.

There is no direct language support for this, but the effect can be achieved through the facilities of the standard C preprocessor. For example:

Click here to view code image

class ELEM_stack {
   ELEM * min, * top, * max;
   void new(int), delete(void);
public:
   void push(ELEM);
   ELEM pop(void);
};

This declaration can then be placed in a header file and macro-expanded once for each type ELEM for which it is used:

Click here to view code image

#define ELEM long
#define ELEM_stack long_stack
#include "stack.h"
#undef ELEM
#undef ELEM_stack

typedef class x X;
#define ELEM X
#define ELEM_stack X_stack
#include "stack.h"

#undef ELEM
#undef ELEM_stack

class long_stack ls (1024);
class long_stack ls2(512);
class X_stack xs(512);

This is certainly not perfect, but it is simple.”

This was one of the earliest and crudest techniques. It proved too error-prone for real use, so I soon defined a few “standard” token-pasting macros and recommended a stylized macro usage based on them for generic classes [Stroustrup,1986,§7.3.5]. Eventually, these techniques matured into C++’s template facility and the techniques for using templates together with base classes to express commonality among instantiated templates (§15.5).

2.9.3 The Object Layout Model

The implementation of derived classes was simply concatenation of the members of the base and derived classes. For example, given:

Click here to view code image

class A {
    int a;
public :
    /* member functions */
};

class B : public A {
    int b;
public :
    /* member functions */
};

an object of class B will be represented by a structure:

Click here to view code image

struct B { /* generated C code */
int a;
int b;

};

that is

Name clashes between base members and derived members are handled by the compiler internally assigning suitably unambiguous names to the members. Calls are handled exactly as when no derivation is used. No added overhead in time or space is imposed relative to C.

2.9.4 Retrospective

Was it reasonable to avoid virtual functions in C with Classes? Yes, the language was useful without them, and their absence postponed time-consuming debates about their utility, proper use, and efficiency. Their absence led to development of language mechanisms and techniques that have proven useful even in the presence of more powerful inheritance mechanisms and provided a counterweight to the tendency of some programmers to use inheritance to the exclusion of all other techniques (see §14.2.3). In particular, classes were used to implement concrete types, such as complex and string, and interface classes became popular. A class stack used as an interface to a more general class dequeue is an example of inheritance without virtual functions.

Were virtual functions needed for C with Classes to serve the needs it aimed to serve? Yes, and therefore they were added as the first major extension to make C++.

2.10 The Protection Model

Before starting work on C with Classes, I worked with operating systems. The notions of protection from the Cambridge CAP computer and similar systems – rather than any work in programming languages – inspired the C++ protection mechanisms. The class is the unit of protection and the fundamental rule is that you cannot grant yourself access to a class; only the declarations placed in the class declaration (supposedly by its owner) can grant access. By default, all information is private.

Access is granted by declaring a member in the public part of a class declaration, or by specifying a function or a class as a friend. For example:

Click here to view code image

class X {
    /* representation */
public:
    void f();          /* member function with access */
                       /* to representation           */

    friend void g();   /* global function with access */
                       /* to representation           */
};

Initially, only classes could be friends, thus granting access to all member functions of the friend class, but later it was found convenient to be able to grant access (friendship) to individual functions. In particular, it was found useful to be able to grant access to global functions; see also §3.6.1.

A friendship declaration was seen as a mechanism similar to that of one protection domain granting a read-write capability to another. It is an explicit and specific part of a class declaration. Consequently, I have never been able to see the recurring assertions that a friend declaration “Violates encapsulation” as anything but a combination of ignorance and confusion with non-C++ terminology.

Even in the first version of C with Classes, the protection model applied to base classes as well as members. Thus, a class could be either publicly or privately derived from another. The private/public distinction for base classes predates the debate on implementation inheritance vs. interface inheritance by about five years [Snyder, 1986] [Liskov,1987]. If you want to inherit an implementation only, you use private derivation in C++. Public derivation gives users of the derived class access to the interface provided by the base class. Private derivation leaves the base as an implementation detail; even the public members of the private base class are inaccessible except through the interface explicitly provided for the derived class.

To provide “semi-transparent scopes” a mechanism was provided to allow individual public names from a private base class to be made public [Stroustrup,1982b]:

Click here to view code image

class vector {
    /* ... */
public :
    /* ... */
    void print(void);
};

class hashed : vector /* vector is private base of hashed */
{
    /* ... */
public :
    vector.print;   /* semi-transparent   scope */
                    /* other vector functions cannot */
                    /* be applied to hashed objects */
    /* ... */
};

The syntax for making an otherwise inaccessible name accessible is simply naming it. This is an example of a perfectly logical, minimalistic, and unambiguous syntax. It is also unnecessarily obscure; almost any other syntax would have been an improvement. This syntax problem has now been solved by the introduction of using-declarations (see §17.5.2).

In the [ARM], the C++ notion of protection is summarized:

[1] Protection is provided by compile-time mechanisms against accident, not against fraud or explicit violation.

[2] Access is granted by a class, not unilaterally taken.

[3] Access control is done for names and does not depend on the type of what is named.

[4] The unit of protection is the class, not the individual object.

[5] Access is controlled, not visibility.

All of this was true in 1980, though some of the terminology was different then. The last point can be explained like this:

Click here to view code image

int a;       // global a

class X {
private:
    int a;   // member X::a
};

class XX : public X {
    void f() { a = 1; } // which a?
};

Had visibility been controlled, X::a would have been invisible, and XX::f() would have referred to the global a. In fact, C with Classes and C++ deem the global a hidden by the inaccessible X::a and thus XX::f() gets a compile-time error for trying to access an inaccessible variable X::a. Why did I define it that way, and was it the right choice? My recollection on this point is vague, and the stored records are of no use. One point I do remember from the discussion at the time is that given the example above, the rule adopted ensures that f()’s reference to a refers to the same a independently of what access is declared for X::a. Making public/private control visibility, rather than access, would have a change from public to private quietly change the meaning of the program from one legal interpretation (access X::a) to another (access the global a). I no longer consider this argument conclusive (if I ever did), but the decision made has proven useful in that it allows programmers to add and remove public and private specifications during debugging without quietly changing the meaning of programs. I do wonder if this aspect of the C++ definition is the result of a genuine design decision. It could simply be a default outcome of the preprocessor technology used to implement C with Classes that didn’t get reviewed when C++ was implemented with more appropriate compiler technology (§3.3).

Another aspect of C++’s protection mechanism that shows operating system influence is the attitude towards circumvention of the rules. I assume that any competent programmer can circumvent any rule that is not enforced by hardware so it is not worth even trying to protect against fraud [ARM]:

“The C++ access control mechanisms provide protection against accident – not against fraud. Any programming language that supports access to raw memory will leave data open to deliberate tampering in ways that violate the explicit type rules specified for a given data item.”

The task of the protection system is to make sure that any such violation of the type system is explicit and to minimize the need for such violations.

The operating system notion of read/write protection grew into C++’s notion of const (§3.8).

Over the years, there have been many proposals for providing access to a unit smaller than a whole class. For example:

Click here to view code image

grant X::f(int) access to Y::a, Y::b, and Y::g(char);

I have resisted such suggestions on the grounds that such finer-grain control gives no added protection: Any member function can modify any data member of a class, so a function granted access to a function member can indirectly modify every member. I considered, as I still do, the complications in specification, implementation, and use to outweigh the benefits of more explicit control.

2.11 Run-Time Guarantees

The access-control mechanisms described above simply prevent unauthorized access. A second kind of guarantee was provided by “special member functions,” such as constructors, that were recognized and implicitly invoked by the compiler. The idea was to allow the programmer to establish guarantees, sometimes called invariants, that other member functions could rely on (see also §16.10).

2.11.1 Constructors and Destructors

One way I often explained the concept at the time was that a “new function” (a constructor) created the environment in which the member functions would operate and the “delete function” (a destructor) would destroy that environment and release all resources acquired for it. For example:

Click here to view code image

class monitor : object {
    /* ... */
public:
    new()    { /* create the monitor's lock */ }
    delete() { /* release and delete lock */ }

};

2.11.2 Allocation and Constructors

Like in C, objects can be allocated in three ways: on the stack (automatic storage), at a fixed address (static storage), and on the free store (on the heap, dynamic storage). In each case, the constructor must be called for the created object. Allocating an object on the free store in C involves only a call of an allocation function. For example:

Click here to view code image

monitor* p = (monitor*)malloc(sizeof(monitor));

This was clearly insufficient for C with Classes because there was no way of guaranteeing that a constructor was called. Consequently, I introduced an operator to ensure that both allocation and initialization was done:

monitor* p = new monitor;

The operator was called new because that was the name of the corresponding Simula operator. The new operator invokes some allocation function to obtain memory and then invokes a constructor to initialize that memory. The combined operation is often called instantiation or simply object creation; it creates an object out of raw memory.

The notational convenience offered by operator new is significant (§3.9). However, combining allocation and initialization in a single operation without an explicit error-reporting mechanism led to some practical problems. Handling and reporting errors in constructors was rarely critical, though, and the introduction of exceptions (§16.5) provided a general solution.

To minimize recompilation, Cfront implemented a use of operator new for a class with a constructor as simply a call of the constructor. The constructor then did both the allocation and the initialization. This implied that if a translation unit allocates all objects of class X using new and calls no inline functions from X then that translation unit need not be recompiled if the size and representation of X changes. Translation unit is the ANSI C term for a source file after preprocessing; that is, for the information given to a compiler at one time for separate compilation. I found it very useful to organize my simulation programs to minimize recompilation. However, the importance of such minimizing wasn’t generally appreciated in the C with Classes and C++ community until much later (§13.2).

An operator delete was introduced to complement new in the same way as the deallocation function free() complements malloc() (§3.9, §10.7).

2.11.3 Call and Return Functions

Curiously enough, the initial implementation of C with Classes contained a feature that is not provided by C++, but is often requested. One could define a function that would implicitly be called before every call of every member function (except the constructor) and another that would be implicitly called before every return from every member function (except the destructor). They were called call and return functions. They were used to provide synchronization for the monitor class in the original task library [Stroustrup, 1980b]:

Click here to view code image

class monitor : object {
    /* ... */
    call()   { /* grab lock */ }
    return() { /* release lock */ }
    /* ... */
};

These are similar in intent to the CLOS :before and :after methods. Call and return functions were removed from the language because nobody (but me) used them and because I seemed to have completely failed to convince people that call() and return() had important uses. In 1987 Mike Tiemann suggested an alternative solution called “wrappers” [Tiemann, 1987], but at the USENIX implementers’ workshop in Estes Park this idea was determined to have too many problems to be accepted into C++.

2.12 Minor Features

Two very minor features, overloading of assignment and default arguments were introduced into C with Classes. They were the precursors of C++’s overloading mechanisms (§3.6).

2.12.1 Overloading of Assignment

It was soon noticed that classes with a nontrivial representation such as string and vector couldn’t be copied successfully because C’s semantics of assignment (bitwise copy) wasn’t right for such types. This default copy semantics led to shared representations rather than true copies. My response was to allow the programmer to specify the meaning of assignment [Stroustrup,1980]:

“Unfortunately, this standard struct-like assignment is not always ideal. Typically a class object is only the root of a tree of information and a simple copy of that root without any notice taken of the branches is undesirable. Similarly, simply overwriting a class object can create chaos.

Changing the meaning of assignment for objects of a class provides a way of handling these problems. This is done by declaring a class member function called operator=. For example:

Click here to view code image

class x {
public:
   int    a;
   class y * p;
   void operator = (class x *);
};

void x.operator = (class x * from)
{
    a = from->a;
    delete p;
    p = from->p;
    from->p = 0;
}

This defines a destructive read for objects of class x, as opposed to the copy operation implied by the standard semantics.”

The [Stroustrup,1982] version uses an example that checks for this==from to handle self-assignment correctly. Apparently, I learned that technique the hard way.

Where defined, an assignment operator was used to implement all explicit and implicit copy operations. Initialization was handled by first initializing to a default value using a new-function (constructor) taking no arguments and then assigning. This was found to be inefficient and led to the introduction of copy constructors in C++ (§11.4.1).

2.12.2 Default Arguments

The heavy use of default constructors implied by the user-defined assignment operators naturally led to the introduction of default arguments [Stroustrup,1980]:

“The default argument list was a very late addition to the class mechanism. It was added to curb the proliferation of identical “standard argument lists” for class objects passed as function arguments, for class objects that were members of other classes, and for base class arguments. Providing argument lists in these cases proved enough of a nuisance to overcome the aversion to include yet another “feature,” and they can be used to make class object declarations less verbose and more similar to struct declarations.”

Here is an example:

“It is possible to declare a default argument list for a new() function. This list is then used whenever an object of the class is declared without an argument. For example, the declaration:

class char_stack
{
void new(int=512);
...
};

makes the declaration

class char_stack s3;

legal, and initializes s3 by the call s3.new(512).”

Given general function overloading (§3.6, §11), default arguments are logically redundant and at best a minor notational convenience. However, C with Classes had default argument lists for years before general overloading became available in C++.

2.13 Features Considered, but not Provided

In the early days many features were considered that later appeared in C++ or are still discussed. These included virtual functions, static members, templates, and multiple inheritance. However,

“All of these generalizations have their uses, but every “feature” of a language takes time and effort to design, implement, document, and learn. ... The base class concept is an engineering compromise, like the C class concept [Stroustrup, 1982b].”

I just wish I had explicitly mentioned the need for experience. With that, the case against featurism and for a pragmatic approach would have been complete.

The possibility of automatic garbage collection was considered on several occasions before 1985 and deemed unsuitable for a language already in use for real-time processing and hard-core systems tasks such as device drivers. In those days, garbage collectors were less sophisticated than they are today, and the processing power and memory capacity of the average computer were small fractions of what today’s systems offer. My personal experience with Simula and reports on other garbage-collection-based systems convinced me that garbage collection was unaffordable by me and my colleagues for the kind of applications we were writing. Had C with Classes (or even C++) been defined to require automatic garbage collection, it would have been more elegant, but stillborn.

Direct support for concurrency was also considered, but I rejected that in favor of a library-based approach (§2.1).

2.14 Work Environment

C with Classes was designed and implemented by me as a research project in the Computing Science Research Center of Bell Labs. This center provided – and still provides – a possibly unique environment for such work. When I joined, I was basically told to “do something interesting,” given suitable computer resources, encouraged to talk to interesting and competent people, and given a year before having to formally present my work for evaluation.

There was a cultural bias against “grand projects” that required more than a couple of people, against “grand plans” like untested paper designs for others to implement, and against a class distinction between designers and implementers. If you liked such things, Bell Labs and others have many places where you could indulge such preferences. However, in the Computing Science Research Center it was almost a requirement that you – if you were not into theory – personally implemented something embodying your ideas and found users that could benefit from what you built. The environment was very supportive for such work and the Labs provided a large pool of people with ideas and problems to challenge and test anything built. Thus, I could write in [Stroustrup, 1986]:

“There never was a C++ paper design; design, documentation, and implementation went on simultaneously. Naturally, the C++ front-end is written in C++. There never was a “C++ project” either, or a “C++ design committee”. Throughout, C++ evolved, and continues to evolve, to cope with problems encountered by users, and through discussions between the author and his friends and colleagues.”

Only after C++ was an established language did more conventional organizational structures emerge and even then I was officially in charge of the reference manual and had the final say over what went into it until that task was handed over to the ANSI C++ committee in early 1990. As the standards committee’s chairman of the working group for extensions, I’m still directly responsible for every feature that enters C++ (§6.4). On the other hand, after the first few months I never had the freedom to design just for the sake of designing something beautiful or to make arbitrary changes in the language as it stood at any given time. Every language feature required an implementation to make it real, and any change or extension required the concurrence and usually enthusiasm of key C with Classes and later C++ users.

Since there was no guaranteed user population, the language and its implementations could only survive by serving the needs of its users well enough to counteract the organizational pull of established languages and the marketing hype of newer languages. In particular, introducing even a minor incompatibility required delivering some much larger benefit to the users, so major incompatibilities were not often introduced even in the early days. Since to a user almost any incompatibility can seem major, incompatibilities were as rare as I could manage. Only in the move from C with Classes to C++ were many programs deliberately broken.

The absence of a formal organizational structure, of larger-scale support in terms of money, people, “captive” users, and marketing was more than compensated for by the informal help and insights of my peers in the Computing Science Research Center and the protection from nontechnical demands from development organizations offered by the center management. Had it not been for the insights of members of the center and the insulation from political nonsense, the design of C++ would have been compromised by fashions and special interests and its implementation bogged down in a bureaucratic quagmire. It was also most important that Bell Labs provided an environment where there was no need to hoard ideas for personal advancement. Instead, discussion could, did, and still does flow freely, allowing people to benefit from the ideas and opinions of others. Unfortunately, the Computing Science Research Center is not typical even within Bell Labs.

C with Classes grew through discussions with people in the Computing Science Research Center and early users there and elsewhere in the Labs. Most of C with Classes and later C++ was designed on somebody else’s blackboard and the rest on mine. Most such ideas were rejected as being too elaborate, too limited in usefulness, too hard to implement, too hard to teach for use in real projects, not efficient enough in time or space, too incompatible with C, or simply too weird. The few ideas that made it through this filter – invariably involving discussions with at least two people – I then implemented. Typically, the idea mutated through the effort of implementation, testing, and early use by me and a few others. The resulting version was tried on a larger audience and would often mutate a bit further before finding its way into the “official” version of C with Classes as shipped by me. Usually, a tutorial was written somewhere along the way. Writing a tutorial was considered an essential design tool, because if a feature cannot be explained simply, the burden of supporting it will be too great. This point was never far from my mind because during the early years I was the support organization.

In the early days Sandy Fraser, my department head at the time, was very influential. For example, I believe he was the one to encourage me to break from the Simula style of class definition where the complete function definition is included and adopt the style where function definitions are typically elsewhere thus emphasizing the class declaration’s role as an interface. Much of C with Classes was designed to allow simulators to be built that could be used in Sandy Fraser’s work in network design. The first real application of C with Classes was such network simulators. Sudhir Agrawal was another early user who influenced the development of C with Classes through his work with network simulations. Jonathan Shopiro provided much feedback on the C with Classes design and implementation based on his simulation of a “dataflow database machine.”

For more general discussions on programming language issues, as opposed to looking at applications to determine which problems needed to be solved, I turned to Dennis Ritchie, Steve Johnson, and in particular Doug McIlroy. Doug’s influence on the development of both C and C++ cannot be overestimated. I cannot remember a single critical design decision in C++ that I have not discussed at length with Doug. Naturally, we didn’t always agree, but I still have a strong reluctance to make a decision that goes against Doug’s opinion. He has a knack for being right and an apparently infinite amount of experience and patience.

Since the main design work for C with Classes and C++ was done on blackboards, the thinking tended to focus on solutions to “archetypical” problems: small examples that are considered characteristic for a large class of problems. Thus, a good solution to the small example will provide significant help in writing programs dealing with real problems of that class. Many of these problems have entered the C++ literature and folklore through my use of them as examples in my papers, books, and talks. For C with Classes, the example considered most critical was the task class that was the basis of the task-library supporting Simula-style simulation. Other key classes were queue, list, and histogram classes. The queue and list classes were based on the idea – borrowed from Simula – of providing a link class from which users derived their own classes.

The danger inherent in this approach is to create a language and tools that provide elegant solutions to small selected examples, yet don’t scale to building complete systems or large programs. This was counteracted by the simple fact that C with Classes (and later C++) had to pay for itself during its early years. This ensured that C with Classes couldn’t evolve into something that was elegant but useless.

Being an individual working closely with users also gave me the freedom to promise only what I could actually deliver, rather than having to inflate my promises to the point where it would appear to make economic sense for an organization to allocate significant resources to the development, support, and marketing of “a product.” Like all languages that have worked for a living during childhood, C++ matured with a distinct practical and pragmatic bent and a number of scars. The simulations of networks, board layouts, chips, network protocols, etc., based on the task library were my bread and butter in those early years.

† I have retained the original C with Classes syntax and style in the examples. The differences from C++ and modern style should not cause problems with comprehension and may be of interest to some readers. I have, however, fixed obvious bugs and added comments to compensate for the lack of the original context.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 2. C with Classes

Create new playlist

Sign In

Sign Up

2. C with Classes

2.1 The Birth of C with Classes

2.2 Feature overview

2.3 Classes

2.4 Run-Time Efficiency

2.4.1 Inlining

2.5 The Linkage Model

2.5.1 Simple-Minded Implementations

2.5.2 The Object Layout Model

2.6 Static Type Checking

2.6.1 Narrowing Conversions

2.6.2 Use of Warnings

2.7 Why C?

2.8 Syntax Problems

2.8.1 The C Declaration Syntax

2.8.2 Structure Tags vs. Type Names

2.8.3 The Importance of Syntax

2.9 Derived Classes

2.9.1 Polymorphism without Virtual Functions

2.9.2 Container Classes Without Templates

2.9.3 The Object Layout Model

2.9.4 Retrospective

2.10 The Protection Model

2.11 Run-Time Guarantees

2.11.1 Constructors and Destructors

2.11.2 Allocation and Constructors

2.11.3 Call and Return Functions

2.12 Minor Features

2.12.1 Overloading of Assignment

2.12.2 Default Arguments

2.13 Features Considered, but not Provided

2.14 Work Environment

Table of Contents for
2. C with Classes