2

User-Defined Types

Don’t Panic!

– Douglas Adams

2.1 Introduction

We call the types that can be built from the fundamental types (§1.4), the const modifier (§1.6), and the declarator operators (§1.7) built-in types. C++’s set of built-in types and operations is rich, but deliberately low-level. They directly and efficiently reflect the capabilities of conventional computer hardware. However, they don’t provide the programmer with high-level facilities to conveniently write advanced applications. Instead, C++ augments the built-in types and operations with a sophisticated set of abstraction mechanisms out of which programmers can build such high-level facilities.

The C++ abstraction mechanisms are primarily designed to let programmers design and implement their own types, with suitable representations and operations, and for programmers to simply and elegantly use such types. Types built out of other types using C++’s abstraction mechanisms are called user-defined types. They are referred to as classes and enumerations. User-defined types can be built out of both built-in types and other user-defined types. Most of this book is devoted to the design, implementation, and use of user-defined types. User-defined types are often preferred over built-in types because they are easier to use, less error-prone, and typically as efficient for what they do as direct use of built-in types, or even more efficient.

The rest of this chapter presents the simplest and most fundamental facilities for defining and using types. Chapters 4–8 are a more complete description of the abstraction mechanisms and the programming styles they support. User-defined types provide the backbone of the standard library, so the standard-library chapters, 9–17, provide examples of what can be built using the language facilities and programming techniques presented in Chapters 1–8.

2.2 Structures

The first step in building a new type is often to organize the elements it needs into a data structure, a struct:

struct Vector {
       double* elem;  // pointer to elements
       int sz;               // number of elements
};

This first version of Vector consists of an int and a double*.

A variable of type Vector can be defined like this:

Vector v;

However, by itself that is not of much use because v’s elem pointer doesn’t point to anything. For it to be useful, we must give v some elements to point to. For example:

void vector_init(Vector& v, int s)    // initialize a Vector
{
        v.elem = new double[s];  // allocate an array of s doubles
        v.sz = s;
}

That is, v’s elem member gets a pointer produced by the new operator and v’s sz member gets the number of elements. The & in Vector& indicates that we pass v by non-const reference (§1.7); that way, vector_init() can modify the vector passed to it.

The new operator allocates memory from an area called the free store (also known as dynamic memory and heap). Objects allocated on the free store are independent of the scope from which they are created and “live” until they are destroyed using the delete operator (§5.2.2).
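As a minimal sketch of that lifetime rule, the array obtained by vector_init() can be given back explicitly. The helper vector_destroy() is a hypothetical name introduced here for illustration only; it is not part of the text, which handles release with a destructor in §5.2.2.

```cpp
// The Vector struct and vector_init() as defined above.
struct Vector {
    double* elem;   // pointer to elements
    int sz;         // number of elements
};

void vector_init(Vector& v, int s)
{
    v.elem = new double[s];   // allocate an array of s doubles on the free store
    v.sz = s;
}

// Hypothetical helper (an assumption, not from the text): release the elements.
void vector_destroy(Vector& v)
{
    delete[] v.elem;    // an array allocated with new[] is released with delete[]
    v.elem = nullptr;   // don't leave a dangling pointer behind
    v.sz = 0;
}
```

Every vector_init() then needs a matching vector_destroy(); forgetting the call leaks the array, which is one reason the later chapters automate the release.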

A simple use of Vector looks like this:

double read_and_sum(int s)
        // read s integers from cin and return their sum; s is assumed to be positive
{
        Vector v;
        vector_init(v,s);                        // allocate s elements for v

        for (int i=0; i!=s; ++i)
                cin>>v.elem[i];                 // read into elements

        double sum = 0;
        for (int i=0; i!=s; ++i)
                sum+=v.elem[i];              // compute the sum of the elements
        return sum;
}

There is a long way to go before our Vector is as elegant and flexible as the standard-library vector. In particular, a user of Vector has to know every detail of Vector’s representation. The rest of this chapter and the next two gradually improve Vector as an example of language features and techniques. Chapter 12 presents the standard-library vector, which contains many nice improvements.

I use vector and other standard-library components as examples

  • to illustrate language features and design techniques, and

  • to help you learn and use the standard-library components.

Don’t reinvent standard-library components such as vector and string; use them. The standard-library types have lower-case names, so to distinguish names of types used to illustrate design and implementation techniques (e.g., Vector and String), I capitalize them.

We use . (dot) to access struct members through a name (and through a reference) and -> to access struct members through a pointer. For example:

void f(Vector v, Vector& rv, Vector* pv)
{
        int i1 = v.sz;             // access through name
        int i2 = rv.sz;            // access through reference
        int i3 = pv->sz;         // access through pointer
}

2.3 Classes

Having data specified separately from the operations on it has advantages, such as the ability to use the data in arbitrary ways. However, a tighter connection between the representation and the operations is needed for a user-defined type to have all the properties expected of a “real type.” In particular, we often want to keep the representation inaccessible to users so as to simplify use, guarantee consistent use of the data, and allow us to later improve the representation. To do that, we have to distinguish between the interface to a type (to be used by all) and its implementation (which has access to the otherwise inaccessible data). The language mechanism for that is called a class. A class has a set of members, which can be data, function, or type members.

The interface of a class is defined by its public members, and its private members are accessible only through that interface. The public and private parts of a class declaration can appear in any order, but conventionally we place the public declarations first and the private declarations later, except when we want to emphasize the representation. For example:

class Vector {
public:
        Vector(int s) :elem{new double[s]}, sz{s} { }    // construct a Vector
        double& operator[](int i) { return elem[i]; }      // element access: subscripting
        int size() { return sz; }
private:
        double* elem;  // pointer to the elements
        int sz;               // the number of elements
};

Given that, we can define a variable of our new type Vector:

Vector v(6);      // a Vector with 6 elements

We can illustrate a Vector object graphically:

[Figure: a Vector object holding elem, a pointer to its six elements, and sz, with the value 6]

Basically, the Vector object is a “handle” containing a pointer to the elements (elem) and the number of elements (sz). The number of elements (6 in the example) can vary from Vector object to Vector object, and a Vector object can have a different number of elements at different times (§5.2.3). However, the Vector object itself is always the same size. This is the basic technique for handling varying amounts of information in C++: a fixed-size handle referring to a variable amount of data “elsewhere” (e.g., on the free store allocated by new; §5.2.2). How to design and use such objects is the main topic of Chapter 5.

Here, the representation of a Vector (the members elem and sz) is accessible only through the interface provided by the public members: Vector(), operator[](), and size(). The read_and_sum() example from §2.2 simplifies to:

double read_and_sum(int s)
{
        Vector v(s);                                    // make a vector of s elements
        for (int i=0; i!=v.size(); ++i)
                cin>>v[i];                               // read into elements

        double sum = 0;
        for (int i=0; i!=v.size(); ++i)
                sum+=v[i];                             // take the sum of the elements
        return sum;
}

A member function with the same name as its class is called a constructor, that is, a function used to construct objects of a class. So, the constructor, Vector(), replaces vector_init() from §2.2. Unlike an ordinary function, a constructor is guaranteed to be used to initialize objects of its class. Thus, defining a constructor eliminates the problem of uninitialized variables for a class.

Vector(int) defines how objects of type Vector are constructed. In particular, it states that it needs an integer to do that. That integer is used as the number of elements. The constructor initializes the Vector members using a member initializer list:

:elem{new double[s]}, sz{s}

That is, we first initialize elem with a pointer to s elements of type double obtained from the free store. Then, we initialize sz to s.

Access to elements is provided by a subscript function, called operator[]. It returns a reference to the appropriate element (a double& allowing both reading and writing).

The size() function is supplied to give users the number of elements.

Obviously, error handling is completely missing, but we’ll return to that in Chapter 4. Similarly, we did not provide a mechanism to “give back” the array of doubles acquired by new; §5.2.2 shows how to define a destructor to elegantly do that.
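As a preview only, here is a sketch of what “giving back” the array might look like. The destructor ~Vector() and the deleted copy operations are my assumptions about a safe minimal version, not the design developed in §5.2.2; sum() is a hypothetical helper added so that the class can be exercised.

```cpp
class Vector {
public:
    Vector(int s) :elem{new double[s]}, sz{s} { }      // acquire s doubles
    ~Vector() { delete[] elem; }                       // assumed destructor: release them

    // Assumption: forbid copying so that two Vectors never delete the same array.
    Vector(const Vector&) = delete;
    Vector& operator=(const Vector&) = delete;

    double& operator[](int i) { return elem[i]; }      // element access: subscripting
    int size() { return sz; }
private:
    double* elem;   // pointer to the elements
    int sz;         // the number of elements
};

// Hypothetical helper: add up the elements using only the public interface.
double sum(Vector& v)
{
    double s = 0;
    for (int i=0; i!=v.size(); ++i)
        s += v[i];
    return s;
}
```

With this in place, a Vector defined in a function releases its elements automatically when it goes out of scope.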

There is no fundamental difference between a struct and a class; a struct is simply a class with members public by default. For example, you can define constructors and other member functions for a struct.
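A small sketch of that equivalence follows. Point and Hidden_point are hypothetical types invented for this illustration; they do not appear elsewhere in the text.

```cpp
// A struct can have a constructor and member functions, just like a class.
struct Point {
    Point(double xx, double yy) :x{xx}, y{yy} { }            // constructor
    double squared_length() const { return x*x + y*y; }      // member function

    double x;   // public by default in a struct
    double y;
};

// The same keyword swap the other way: members of a class are private by default.
class Hidden_point {
    double x = 0;   // private: no access specifier was given
public:
    double get_x() const { return x; }
};
```

Here, p.x is accessible for a Point p, whereas a Hidden_point’s x can be reached only through get_x().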

2.4 Enumerations

In addition to classes, C++ supports a simple form of user-defined type for which we can enumerate the values:

enum class Color { red, blue, green };
enum class Traffic_light { green, yellow, red };

Color col = Color::red;
Traffic_light light = Traffic_light::red;

Note that enumerators (e.g., red) are in the scope of their enum class, so that they can be used repeatedly in different enum classes without confusion. For example, Color::red is Color’s red which is different from Traffic_light::red.

Enumerations are used to represent small sets of integer values. They are used to make code more readable and less error-prone than it would have been had the symbolic (and mnemonic) enumerator names not been used.

The class after the enum specifies that an enumeration is strongly typed and that its enumerators are scoped. Being separate types, enum classes help prevent accidental misuses of constants. In particular, we cannot mix Traffic_light and Color values:

Color x1 = red;                          // error: which red?
Color y2 = Traffic_light::red;   // error: that red is not a Color
Color z3 = Color::red;              // OK
auto x4 = Color::red;               // OK: Color::red is a Color

Similarly, we cannot implicitly mix Color and integer values:

int i = Color::red;           // error: Color::red is not an int
Color c = 2;                    // initialization error: 2 is not a Color

Catching attempted conversions to an enum is a good defense against errors, but often we want to initialize an enum with a value from its underlying type (by default, that’s int), so that’s allowed, as is explicit conversion from the underlying type:

Color x = Color{5};  // OK, but verbose
Color y {6};              // also OK

Similarly, we can explicitly convert an enum value to its underlying type:

int x = int(Color::red);
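The conversion rules above can be checked at compile time. The following is a sketch that assumes C++17 or later; to_int() and to_color() are hypothetical helper names introduced for illustration.

```cpp
#include <type_traits>

enum class Color { red, blue, green };

// No implicit conversion in either direction between Color and int.
static_assert(!std::is_convertible_v<Color, int>);
static_assert(!std::is_convertible_v<int, Color>);

// By default, the underlying type of an enum class is int.
static_assert(std::is_same_v<std::underlying_type_t<Color>, int>);

// Hypothetical helpers: the conversions must be spelled out explicitly.
int to_int(Color c) { return static_cast<int>(c); }
Color to_color(int i) { return static_cast<Color>(i); }
```

The static_casts say the same thing as int(Color::red) and Color{5} above, just in a form that is easier to search for in a large program.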

By default, an enum class has assignment, initialization, and comparisons (e.g., == and <; §1.4) defined, and only those. However, an enumeration is a user-defined type, so we can define operators for it (§6.4):

Traffic_light& operator++(Traffic_light& t)                // prefix increment: ++
{
        switch (t) {
        case Traffic_light::green:         return t=Traffic_light::yellow;
        case Traffic_light::yellow:        return t=Traffic_light::red;
        case Traffic_light::red:             return t=Traffic_light::green;
        }
}

auto signal = Traffic_light::red;
Traffic_light next = ++signal;           // next becomes Traffic_light::green

If the repetition of the enumeration name, Traffic_light, becomes too tedious, we can abbreviate it in a scope:

Traffic_light& operator++(Traffic_light& t)               // prefix increment: ++
{
        using enum Traffic_light;         // here, we are using Traffic_light

        switch (t) {
        case green:      return t=yellow;
        case yellow:     return t=red;
        case red:          return t=green;
        }
}

If you don’t ever want to explicitly qualify enumerator names and want enumerator values to be ints (without the need for an explicit conversion), you can remove the class from enum class to get a “plain” enum. The enumerators from a “plain” enum are entered into the same scope as the name of their enum and implicitly convert to their integer values. For example:

enum Color { red, green, blue };
int col = green;

Here col gets the value 1. By default, the integer values of enumerators start with 0 and increase by one for each additional enumerator. The “plain” enums have been in C++ (and C) since the earliest days, so even though they are less well behaved, they are common in current code.
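The default numbering, and the effect of giving an enumerator an explicit value, can be sketched like this. Priority and as_int() are hypothetical names used only for this illustration, and the static_asserts without a message assume C++17 or later.

```cpp
enum Color { red, green, blue };              // 0, 1, 2 by default

// Hypothetical enum: an explicit value restarts the count from there.
enum Priority { low = 10, medium, high };     // 10, 11, 12

static_assert(red == 0 && green == 1 && blue == 2);
static_assert(medium == 11 && high == 12);

// A "plain" enumerator converts to its integer value without a cast.
int as_int(Color c) { return c; }
```

Note that red, low, and the other enumerators are now names in the enclosing scope, so two “plain” enums in the same scope cannot both have an enumerator called red.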

2.5 Unions

A union is a struct in which all members are allocated at the same address so that the union occupies only as much space as its largest member. Naturally, a union can hold a value for only one member at a time. For example, consider a symbol table entry that holds a name and a value. The value can either be a Node* or an int:

enum class Type { ptr, num }; // a Type can hold values ptr and num (§2.4)

struct Entry {
       string name;    // string is a standard-library type
       Type t;
       Node* p;  // use p if t==Type::ptr
       int i;         // use i if t==Type::num
};

void f(Entry* pe)
{
        if (pe->t == Type::num)
                cout << pe->i;
        // ...
}

The members p and i are never used at the same time, so space is wasted. It can be easily recovered by specifying that both should be members of a union, like this:

union Value {
        Node* p;
        int i;
};

Now Value::p and Value::i are placed at the same address in the memory of each Value object.

This kind of space optimization can be important for applications that hold large amounts of memory so that compact representation is critical.

The language doesn’t keep track of which kind of value is held by a union, so the programmer must do that:

struct Entry {
       string name;
       Type t;
       Value v;   // use v.p if t==Type::ptr; use v.i if t==Type::num
};

void f(Entry* pe)
{
        if (pe->t == Type::num)
               cout << pe->v.i;
        // ...
}

Maintaining the correspondence between a type field (here, t), sometimes called a discriminant or a tag, and the type held in a union is error-prone. To avoid errors, we can enforce that correspondence by encapsulating the union and the type field in a class and offer access only through member functions that use the union correctly. At the application level, abstractions relying on such tagged unions are common and useful. The use of “naked” unions is best minimized.
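One way such an encapsulation might look is sketched below. The class name Tagged_value, its member functions, and the choice to throw std::logic_error on a mismatched access are all assumptions made for this illustration; Node is a placeholder because the text does not define it.

```cpp
#include <stdexcept>

struct Node { int data; };        // placeholder: the text leaves Node undefined

enum class Type { ptr, num };

// Hypothetical wrapper: the union and its tag can only change together.
class Tagged_value {
public:
    explicit Tagged_value(Node* pp) :t{Type::ptr} { v.p = pp; }
    explicit Tagged_value(int ii) :t{Type::num} { v.i = ii; }

    Type type() const { return t; }

    Node* pointer() const         // checked read of the Node* alternative
    {
        if (t != Type::ptr) throw std::logic_error{"Tagged_value does not hold a Node*"};
        return v.p;
    }

    int number() const            // checked read of the int alternative
    {
        if (t != Type::num) throw std::logic_error{"Tagged_value does not hold an int"};
        return v.i;
    }

    void set(Node* pp) { v.p = pp; t = Type::ptr; }   // value and tag updated together
    void set(int ii) { v.i = ii; t = Type::num; }

private:
    union Value { Node* p; int i; };
    Type t;
    Value v;
};
```

Because t and v are private, no user code can set one without the other, and a read through the wrong alternative is reported instead of silently misinterpreting the bits.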

The standard library type, variant, can be used to eliminate most direct uses of unions. A variant stores a value of one of a set of alternative types (§15.4.1). For example, a variant<Node*,int> can hold either a Node* or an int. Using variant, the Entry example could be written as:

struct Entry {
       string name;
       variant<Node*,int> v;
};

void f(Entry* pe)
{
        if (holds_alternative<int>(pe->v))   // does *pe hold an int? (see §15.4.1)
                cout << get<int>(pe->v);         // get the int
        // ...
}

For many uses, a variant is simpler and safer to use than a union.
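As a further sketch, std::get_if combines the test and the access in one step: it returns a pointer to the held value, or nullptr if the variant holds a different alternative. The function describe() and the placeholder Node are assumptions introduced for this illustration.

```cpp
#include <string>
#include <variant>

struct Node { int data; };        // placeholder: the text leaves Node undefined

struct Entry {
    std::string name;
    std::variant<Node*,int> v;
};

// Hypothetical helper: render an Entry as text according to what it holds.
std::string describe(const Entry& e)
{
    if (const int* pi = std::get_if<int>(&e.v))     // non-null only if e.v holds an int
        return e.name + " = " + std::to_string(*pi);
    return e.name + " -> node";                      // otherwise e.v holds a Node*
}
```

Unlike with a union, asking for the wrong alternative is detected: get_if yields nullptr, and get<int>() on a variant holding a Node* throws std::bad_variant_access.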

2.6 Advice

[1] Prefer well-defined user-defined types over built-in types when the built-in types are too low-level; §2.1.

[2] Organize related data into structures (structs or classes); §2.2; [CG: C.1].

[3] Represent the distinction between an interface and an implementation using a class; §2.3; [CG: C.3].

[4] A struct is simply a class with its members public by default; §2.3.

[5] Define constructors to guarantee and simplify initialization of classes; §2.3; [CG: C.2].

[6] Use enumerations to represent sets of named constants; §2.4; [CG: Enum.2].

[7] Prefer class enums over “plain” enums to minimize surprises; §2.4; [CG: Enum.3].

[8] Define operations on enumerations for safe and simple use; §2.4; [CG: Enum.4].

[9] Avoid “naked” unions; wrap them in a class together with a type field; §2.5; [CG: C.181].

[10] Prefer std::variant to “naked” unions; §2.5.
