15

Pointers and Containers

Education is what, when, and why to do things.

Training is how to do it.

– Richard Hamming

15.1 Introduction

C++ offers simple built-in low-level types to hold and refer to data: objects and arrays hold data; pointers and arrays refer to such data. However, we need to support both more specialized and more general ways for holding and using data. For example, the standard-library containers (Chapter 12) and iterators (§13.3) are designed to support general algorithms.

The main commonality among the container and pointer abstractions is that their correct and efficient use requires encapsulation of data together with a set of functions to access and manipulate them. For example, pointers are very general and efficient abstractions of machine addresses, but using them correctly to represent ownership of resources has proven excessively difficult. So, the standard library offers resource-management pointers; that is, classes that encapsulate pointers and provide operations that simplify their correct use.

These standard-library abstractions encapsulate built-in language types and are required to perform as well in time and space as correct uses of those types.

There is nothing “magic” about these types. We can design and implement our own “smart pointers” and specialized containers as needed using the same techniques as are used for the standard-library ones.

15.2 Pointers

The general notion of a pointer is something that allows us to refer to an object and to access it according to its type. A built-in pointer, such as int*, is an example but there are many more.

Pointers

T*                  A built-in pointer type: points to an object of type T or to a contiguously-allocated sequence of elements of type T
T&                  A built-in reference type: refers to an object of type T; a pointer with implicit dereference (§1.7)
unique_ptr<T>       An owning pointer to a T
shared_ptr<T>       A pointer to an object of type T; ownership is shared among all shared_ptr’s to that T
weak_ptr<T>         A pointer to an object owned by a shared_ptr; must be converted to a shared_ptr to access the object
span<T>             A pointer to a contiguous sequence of Ts (§15.2.2)
string_view<T>      A pointer to a const sub-string (§10.3)
X_iterator<C>       A sequence of elements from C; The X in the name indicates the kind of iterator (§13.3)

There can be more than one pointer pointing to an object. An owning pointer is one that is responsible for eventually deleting the object it refers to. A non-owning pointer (e.g., a T* or a span) can dangle; that is, point to a location where an object has been deleted or gone out of scope.

Reading or writing through a dangling pointer is one of the nastiest kinds of bugs. The result of doing so is technically undefined. In practice, that often means accessing an object that happens to occupy the location. Then, a read means getting an arbitrary value, and a write scrambles an unrelated data structure. The best we can hope for is a crash; that’s usually preferable to a wrong result.

The C++ Core Guidelines [CG] offers rules for avoiding this and advice for statically checking that it never happens. However, here are a few approaches for avoiding pointer problems:

  • Don’t retain a pointer to a local object after the object goes out of scope. In particular, never return a pointer to a local object from a function or store a pointer of uncertain provenance in a long-lived data structure. Systematic use of containers and algorithms (Chapter 12, Chapter 13) often saves us from employing programming techniques that make it hard to avoid pointer problems.

  • Use owning pointers to objects allocated on the free store.

  • Pointers to static objects (e.g., global variables) can’t dangle.

  • Leave pointer arithmetic to the implementation of resource handles (such as vectors and unordered_maps).

  • Remember that string_views and spans are kinds of non-owning pointers.
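The table above notes that a weak_ptr must be converted to a shared_ptr before the object can be accessed. A minimal sketch of why that helps against dangling follows; the names read_or and weak_demo are hypothetical and chosen only for illustration:

```cpp
#include <cassert>
#include <memory>

// Read through a weak_ptr; return fallback if the object no longer exists.
// (read_or is a hypothetical helper, not a standard-library function.)
int read_or(const std::weak_ptr<int>& wp, int fallback)
{
        if (std::shared_ptr<int> sp = wp.lock())        // lock() yields an empty shared_ptr if the object is gone
                return *sp;
        return fallback;
}

void weak_demo()
{
        auto owner = std::make_shared<int>(42);
        std::weak_ptr<int> observer = owner;            // observes, but does not keep the int alive
        assert(read_or(observer,-1)==42);

        owner.reset();                                  // the last owner lets go: the int is destroyed
        assert(observer.expired());
        assert(read_or(observer,-1)==-1);               // no dangling access, just the fallback
}
```

Unlike a T*, the weak_ptr can be asked whether its object still exists, so the "object has been deleted" case becomes a testable condition rather than undefined behavior.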

15.2.1 unique_ptr and shared_ptr

One of the key tasks of any nontrivial program is to manage resources. A resource is something that must be acquired and later (explicitly or implicitly) released. Examples are memory, locks, sockets, thread handles, and file handles. For a long-running program, failing to release a resource in a timely manner (“a leak”) can cause serious performance degradation (§12.7) and possibly even a miserable crash. Even for short programs, a leak can become an embarrassment, say by causing a resource shortage increasing the run time by orders of magnitude.

The standard-library components are designed not to leak resources. To do this, they rely on the basic language support for resource management using constructor/destructor pairs to ensure that a resource doesn’t outlive an object responsible for it. The use of a constructor/destructor pair in Vector to manage the lifetime of its elements is an example (§5.2.2) and all standard-library containers are implemented in similar ways. Importantly, this approach interacts correctly with error handling using exceptions. For example, this technique is used for the standard-library lock classes:

mutex m; // used to protect access to shared data

void f()
{
       scoped_lock lck {m};       // acquire the mutex m
       // ... manipulate shared data ...
}

A thread will not proceed until lck’s constructor has acquired the mutex (§18.3). The corresponding destructor releases the mutex. So, in this example, scoped_lock’s destructor releases the mutex when the thread of control leaves f() (through a return, by “falling off the end of the function,” or through an exception throw).

This is an application of RAII (the “Resource Acquisition Is Initialization” technique; §5.2.2). RAII is fundamental to the idiomatic handling of resources in C++. Containers (such as vector and map), string, and iostream manage their resources (such as file handles and buffers) similarly.
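The same constructor/destructor pairing can be written for a resource of our own. The following is only a sketch: Handle, open_handles, and work are hypothetical names, and the "resource" is just a counter standing in for something scarce such as a file handle:

```cpp
#include <cassert>
#include <stdexcept>

int open_handles = 0;                   // hypothetical count of resources currently held

class Handle {                          // hypothetical RAII type, not a standard-library class
public:
        Handle() { ++open_handles; }                            // acquire in the constructor
        ~Handle() { assert(open_handles>0); --open_handles; }   // release in the destructor
        Handle(const Handle&) = delete;                         // one owner per resource
        Handle& operator=(const Handle&) = delete;
};

void work(bool fail)
{
        Handle h;                                               // resource acquired here
        if (fail)
                throw std::runtime_error{"work failed"};        // h is released during stack unwinding
        // ... use the resource ...
}                                                               // h is also released on normal exit
```

Whichever way work() is left, the count returns to its previous value once the caller has handled the exception, which is the property the lock example relies on.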

The examples so far take care of objects defined in a scope, releasing the resources they acquire at the exit from the scope, but what about objects allocated on the free store? In <memory>, the standard library provides two “smart pointers” to help manage objects on the free store:

  • unique_ptr represents unique ownership (its destructor destroys its object)

  • shared_ptr represents shared ownership (the last shared pointer’s destructor destroys the object)

The most basic use of these “smart pointers” is to prevent memory leaks caused by careless programming. For example:

void f(int i, int j)         // X* vs. unique_ptr<X>
{
        X* p = new X;                            // allocate a new X
        unique_ptr<X> sp {new X};     // allocate a new X and give its pointer to unique_ptr
        // ...

        if (i<99) throw Z{};                   // may throw an exception
        if (j<77) return;                        // may return "early"
        // ... use p and sp ..
        delete p;                                  // destroy *p
}

Here, we “forgot” to delete p if i<99 or if j<77. On the other hand, unique_ptr ensures that its object is properly destroyed whichever way we exit f() (by throwing an exception, by executing return, or by “falling off the end”). Ironically, we could have solved the problem simply by not using a pointer and not using new:

void f(int i, int j)      // use a local variable
{
        X x;
        // ...
}

Unfortunately, overuse of new (and of pointers and references) seems to be an increasing problem.

However, when you really need the semantics of pointers, unique_ptr is a lightweight mechanism with no space or time overhead compared to correct use of a built-in pointer. Its further uses include passing free-store allocated objects in and out of functions:

unique_ptr<X> make_X(int i)
        // make an X and immediately give it to a unique_ptr
{
        //... check i, etc. ...
        return unique_ptr<X>{new X{i}};
}

A unique_ptr is a handle to an individual object (or an array) in much the same way that a vector is a handle to a sequence of objects. Both control the lifetime of other objects (using RAII) and both rely on elimination of copying or on move semantics to make return simple and efficient (§6.2.2).
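To illustrate passing ownership both out of and into functions, here is a small sketch. Widget, make_widget, and store are hypothetical names introduced for this example only:

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Widget {                         // hypothetical type
        explicit Widget(int i) : id{i} {}
        int id;
};

std::unique_ptr<Widget> make_widget(int id)     // ownership passes out through the return value
{
        return std::make_unique<Widget>(id);
}

// A "sink": taking the unique_ptr by value means the caller must give up ownership.
void store(std::vector<std::unique_ptr<Widget>>& v, std::unique_ptr<Widget> w)
{
        v.push_back(std::move(w));
}
```

A caller writes store(v,std::move(w)); after that call w is empty, so the transfer of ownership is visible at the call site.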

The shared_ptr is similar to unique_ptr except that shared_ptrs are copied rather than moved. The shared_ptrs for an object share ownership of an object; that object is destroyed when the last of its shared_ptrs is destroyed. For example:

void f(shared_ptr<fstream>);
void g(shared_ptr<fstream>);

void user(const string& name, ios_base::openmode mode)
{

        shared_ptr<fstream> fp {new fstream(name,mode)};
        if (!*fp)                               // make sure the file was properly opened
                throw No_file{};

         f(fp);
         g(fp);
         // ...
}

Now, the file opened by fp’s constructor will be closed by the last function to (explicitly or implicitly) destroy a copy of fp. Note that f() or g() may spawn a task holding a copy of fp or in some other way store a copy that outlives user(). Thus, shared_ptr provides a form of garbage collection that respects the destructor-based resource management of the memory-managed objects. This is neither cost free nor exorbitantly expensive, but it does make the lifetime of the shared object hard to predict. Use shared_ptr only if you actually need shared ownership.
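The way a stored copy can outlive its creator can be observed through use_count(). This is a sketch under simple assumptions: registry and remember are hypothetical, and an int stands in for the shared resource:

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

std::vector<std::shared_ptr<int>> registry;     // hypothetical long-lived store

void remember(std::shared_ptr<int> p)           // by value: the parameter is a copy that shares ownership
{
        registry.push_back(std::move(p));
}

void share_demo()
{
        auto sp = std::make_shared<int>(5);
        assert(sp.use_count()==1);
        remember(sp);                           // registry now holds a second owner
        assert(sp.use_count()==2);

        std::weak_ptr<int> watch = sp;          // observes without owning
        sp.reset();                             // the creator lets go ...
        assert(!watch.expired());               // ... but the object lives on in registry
        registry.clear();                       // the last owner goes away
        assert(watch.expired());
}
```

The object's lifetime ends at whichever point the last copy happens to be destroyed, which is exactly what makes it hard to predict in larger programs.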

Creating an object on the free store and then passing the pointer to it to a smart pointer is a bit verbose. It also allows for mistakes, such as forgetting to pass a pointer to a unique_ptr or giving a pointer to something that is not on the free store to a shared_ptr. To avoid such problems, the standard library (in <memory>) provides functions for constructing an object and returning an appropriate smart pointer, make_shared() and make_unique(). For example:

struct S {
        int i;
        string s;
        double d;
        // ...
};

auto p1 = make_shared<S>(1,"Ankh Morpork",4.65);     // p1 is a shared_ptr<S>
auto p2 = make_unique<S>(2,"Oz",7.62);                        // p2 is a unique_ptr<S>

Now, p2 is a unique_ptr<S> pointing to a free-store-allocated object of type S with the value {2,"Oz"s,7.62}.

Using make_shared() is not just more convenient than separately making an object using new and then passing it to a shared_ptr – it is also notably more efficient because it does not need a separate allocation for the use count that is essential in the implementation of a shared_ptr.

Given unique_ptr and shared_ptr, we can implement a complete “no naked new” policy (§5.2.2) for many programs. However, these “smart pointers” are still conceptually pointers and therefore only my second choice for resource management – after containers and other types that manage their resources at a higher conceptual level. In particular, shared_ptrs do not in themselves provide any rules for which of their owners can read and/or write the shared object. Data races (§18.5) and other forms of confusion are not addressed simply by eliminating the resource management issues.

When do we use “smart pointers” (such as unique_ptr) rather than resource handles with operations designed specifically for the resource (such as vector or thread)? Unsurprisingly, the answer is “when we need pointer semantics.”

  • When we share an object, we need pointers (or references) to refer to the shared object, so a shared_ptr becomes the obvious choice (unless there is an obvious single owner).

  • When we refer to a polymorphic object in classical object-oriented code (§5.5), we need a pointer (or a reference) because we don’t know the exact type of the object referred to (or even its size), so a unique_ptr becomes the obvious choice.

  • A shared polymorphic object typically requires shared_ptrs.

We do not need to use a pointer to return a collection of objects from a function; a container that is a resource handle will do that simply and efficiently by relying on copy elision (§3.4.2) and move semantics (§6.2.2).
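The polymorphic case and the "return a container" case often appear together. The sketch below assumes a hypothetical Shape hierarchy (Shape, Square, and Rect are not the classes from §5.5) and returns a vector of unique_ptrs by value:

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Shape {                                  // hypothetical interface
        virtual ~Shape() = default;             // virtual destructor: deletion through Shape* is safe
        virtual double area() const = 0;
};

struct Square : Shape {
        explicit Square(double s) : side{s} {}
        double area() const override { return side*side; }
        double side;
};

struct Rect : Shape {
        Rect(double w, double h) : width{w}, height{h} {}
        double area() const override { return width*height; }
        double width, height;
};

std::vector<std::unique_ptr<Shape>> make_shapes()
{
        std::vector<std::unique_ptr<Shape>> v;
        v.push_back(std::make_unique<Square>(2.0));
        v.push_back(std::make_unique<Rect>(2.0,3.0));
        return v;                               // moved or elided; no element is copied
}

double total_area(const std::vector<std::unique_ptr<Shape>>& v)
{
        double sum = 0;
        for (const auto& p : v)
                sum += p->area();               // virtual call through the owning pointer
        return sum;
}
```

The vector owns the unique_ptrs and each unique_ptr owns one shape, so nothing in this code needs an explicit delete.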

15.2.2 span

Traditionally, range errors have been a major source of serious errors in C and C++ programs, leading to wrong results, crashes, and security problems. The use of containers (Chapter 12), algorithms (Chapter 13), and range-for has significantly reduced this problem, but more can be done. A key source of range errors is that people pass pointers (raw or smart) and then rely on convention to know the number of elements pointed to. The best advice for code outside resource handles is to assume that at most one object is pointed to [CG: F.22], but without support that advice is unmanageable. The standard-library string_view (§10.3) can help, but that is read-only and for characters only. Most programmers need more. For example, when writing into and reading out of buffers in lower-level software, it is notoriously difficult to maintain high performance while still avoiding range errors (“buffer overruns”). A span from <span> is basically a (pointer,length) pair denoting a sequence of elements:

[Figure: a span as a (pointer,length) pair referring to a sequence of elements]

A span gives access to a contiguous sequence of elements. The elements can be stored in many ways, including in vectors and built-in arrays. Like a pointer, a span does not own the elements to which it points. In that, it resembles a string_view (§10.3) and an STL pair of iterators (§13.3).

Consider a common interface style:

void fpn(int* p, int n)
{
        for (int i = 0; i<n; ++i)
                p[i] = 0;
}

We assume that p points to n integers. Unfortunately, this assumption is simply a convention, so we can’t use it to write a range-for loop and the compiler cannot implement cheap and effective range checking. Also, our assumption can be wrong:

void use(int x)
{
        int a[100];
        fpn(a,100);               // OK
        fpn(a,1000);             // oops, my finger slipped! (range error in fpn)
        fpn(a+10,100);         // range error in fpn
        fpn(a,x);                   // suspect, but looks innocent
}

We can do better using a span:

void fs(span<int> p)
{
        for (int& x : p)
                x = 0;
}

We can use fs like this:

void use(int x)
{
        int a[100];
        fs(a);                          // implicitly creates a span<int>{a,100}
        fs(a,1000);                 // error: span expected
        fs({a+10,100});          // a range error in fs
        fs({a,x});                    // obviously suspect
}

That is, the common case, creating a span directly from an array, is now safe (the compiler computes the element count) and notationally simple. In other cases, the probability of mistakes is lowered and error-detection is made easier because the programmer has to explicitly compose a span.

The common case where a span is passed along from function to function is simpler than for (pointer,count) interfaces and obviously doesn’t require extra checking:

void f1(span<int> p);

void f2(span<int> p)
{
        // ...
        f1(p);
}

As for containers, when span is used for subscripting (e.g., r[i]), range checking is not done and an out-of-range access is undefined behavior. Naturally, an implementation can implement that undefined behavior as range checking, but sadly few do. The original gsl::span from the Core Guidelines support library [CG] does range checking.

15.3 Containers

The standard provides several containers that don’t fit perfectly into the STL framework (Chapter 12, Chapter 13). Examples are built-in arrays, array, and string. I sometimes refer to those as “almost containers,” but that is not quite fair: they hold elements, so they are containers, but each has restrictions or added facilities that make them awkward in the context of the STL. Describing them separately also simplifies the description of the STL.

Containers

T[N]                Built-in array: a fixed-size contiguously allocated sequence of N elements of type T; implicitly converts to a T*
array<T,N>          A fixed-size contiguously allocated sequence of N elements of type T; like the built-in array, but with most problems solved
bitset<N>           A fixed-size sequence of N bits
vector<bool>        A sequence of bits compactly stored in a specialization of vector
pair<T,U>           Two elements of types T and U
tuple<T...>         A sequence of an arbitrary number of elements of arbitrary types
basic_string<C>     A sequence of characters of type C; provides string operations
valarray<T>         An array of numeric values of type T; provides numeric operations

Why does the standard provide so many containers? They serve common but different (often overlapping) needs. If the standard library didn’t provide them, many people would have to design and implement their own. For example:

  • pair and tuple are heterogeneous; all other containers are homogeneous (all elements are of the same type).

  • array and tuple elements are contiguously allocated; list and map are linked structures.

  • bitset and vector<bool> hold bits and access them through proxy objects; all other standard-library containers can hold a variety of types and access elements directly.

  • basic_string requires its elements to be some form of character and to provide string manipulation, such as concatenation and locale-sensitive operations.

  • valarray requires its elements to be numbers and to provide numerical operations.

All of these containers can be seen as providing specialized services needed by large communities of programmers. No single container could serve all of these needs because some needs are contradictory, for example, “ability to grow” vs. “guaranteed to be allocated in a fixed location,” and “elements do not move when elements are added” vs. “contiguously allocated.”

15.3.1 array

An array, defined in <array>, is a fixed-size sequence of elements of a given type where the number of elements is specified at compile time. Thus, an array can be allocated with its elements on the stack, in an object, or in static storage. The elements are allocated in the scope where the array is defined. An array is best understood as a built-in array with its size firmly attached, without implicit, potentially surprising conversions to pointer types, and with a few convenience functions provided. There is no overhead (time or space) involved in using an array compared to using a built-in array. An array does not follow the “handle to elements” model of STL containers. Instead, an array directly contains its elements. It is nothing more or less than a safer version of a built-in array.

This implies that an array can and must be initialized by an initializer list:

array<int,3> a1 = {1,2,3};

The number of elements in the initializer must be equal to or less than the number of elements specified for the array.

The element count is not optional, the element count must be a constant expression, and the element type must be explicitly stated. A count of zero is allowed, though rarely useful:

void f(int n)
{
        array<int> a0 = {1,2,3};                                      // error: size not specified
        array<string,n> a1 = {"John's", "Queens' "};  // error: size not a constant expression
        array<string,0> a2;                                            // OK, but unusual: an array with no elements
        array<2> a3 = {"John's", "Queens' "};             // error: element type not stated
        // ...
}

If you need the element count to be a variable, use vector.

When necessary, an array can be explicitly passed to a C-style function that expects a pointer. For example:

void f(int* p, int sz);         // C-style interface

void g()
{
        array<int,10> a;

         f(a,a.size());                    // error: no conversion
         f(a.data(),a.size());         // C-style use

         auto p = find(a,777);      // C++/STL-style use (a range is passed)
         // ...
}

Why would we use an array when vector is so much more flexible? An array is less flexible so it is simpler. Occasionally, there is a significant performance advantage to be had by directly accessing elements allocated on the stack rather than allocating elements on the free store, accessing them indirectly through the vector (a handle), and then deallocating them. On the other hand, the stack is a limited resource (especially on some embedded systems), and stack overflow is nasty. Also, there are application areas, such as safety-critical real-time control, where free store allocation is banned. For example, use of delete may lead to fragmentation (§12.7) or memory exhaustion (§4.3).

Why would we use an array when we could use a built-in array? An array knows its size, so it is easy to use with standard-library algorithms, and it can be copied using =. For example:

array<int,3> a1 = {1, 2, 3 };
auto a2 = a1;     // copy
a2[1] = 5;
a1 = a2;              // assign
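Because an array knows its size and copies by value, it can be handed to an algorithm without a separate length argument. A brief sketch (sorted_copy is a hypothetical name):

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Take the array by value, so the caller's array is left untouched.
std::array<int,5> sorted_copy(std::array<int,5> a)
{
        std::sort(a.begin(),a.end());           // begin() and end() carry the size; no count parameter needed
        return a;
}
```

After auto b = sorted_copy(a); the original a keeps its element order, and a==b compares the two arrays element by element.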

However, my main reason to prefer array is that it saves me from surprising and nasty conversions to pointers. Consider an example involving a class hierarchy:

void h()
{
        Circle a1[10];
        array<Circle,10> a2;
        // ...
       Shape* p1 = a1;       // OK: disaster waiting to happen
       Shape* p2 = a2;       // error: no conversion of array<Circle,10> to Shape* (Good!)
       p1[3].draw();            // disaster
}

The “disaster” comment assumes that sizeof(Shape)<sizeof(Circle), so subscripting a Circle[] through a Shape* gives a wrong offset. All standard containers provide this advantage over built-in arrays.

15.3.2 bitset

Aspects of a system, such as the state of an input stream, are often represented as a set of flags indicating binary conditions such as good/bad, true/false, and on/off. C++ supports the notion of small sets of flags efficiently through bitwise operations on integers (§1.4). Class bitset<N> generalizes this notion by providing operations on a sequence of N bits [0:N), where N is known at compile time. For sets of bits that don’t fit into a long long int (often 64 bits), using a bitset is much more convenient than using integers directly. For smaller sets, bitset is usually optimized. If you want to name the bits, rather than numbering them, you can use a set (§12.5) or an enumeration (§2.4).

A bitset can be initialized with an integer or a string:

bitset<9> bs1 {"110001111"};
bitset<9> bs2 {0b1'1000'1111};       // binary literal using digit separators (§1.4)

The usual bitwise operators (§1.4) and the left- and right-shift operators (<< and >>) can be applied:

bitset<9> bs3 = ~bs1;               // complement: bs3=="001110000"
bitset<9> bs4 = bs1&bs3;        // all zeros
bitset<9> bs5 = bs1<<2;          // shift left: bs5 = "000111100"

The shift operators (here, <<) “shift in” zeros.

The operations to_ullong() and to_string() provide the inverse operations to the constructors. For example, we could write out the binary representation of an int:

void binary(int i)
{
         bitset<8*sizeof(int)> b = i;             // assume 8-bit byte (see also §17.7)
         cout << b.to_string() << '\n';          // write out the bits of i
}

This prints the bits represented as 1s and 0s from left to right, with the most significant bit leftmost, so that argument 123 would give the output

00000000000000000000000001111011

For this example, it is simpler to directly use the bitset output operator:

void binary2(int i)
{
        bitset<8*sizeof(int)> b = i;      // assume 8-bit byte (see also §17.7)
        cout << b << '\n';                      // write out the bits of i
}

A bitset offers many functions for using and manipulating sets of bits, such as all(), any(), none(), count(), flip().
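A short sketch of those member functions in use follows. The Flag enumeration and the grant function are hypothetical; the enumerators simply name bit positions:

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

enum Flag : std::size_t { readable = 0, writable = 1, executable = 2 };   // hypothetical bit positions

std::bitset<3> grant(std::bitset<3> perms, Flag f)
{
        perms.set(f);                           // turn on the bit at position f
        return perms;
}

void flag_demo()
{
        std::bitset<3> p;                       // default: all bits zero
        assert(p.none());
        p = grant(grant(p,readable),executable);
        assert(p.any() && !p.all());
        assert(p.count()==2);                   // two bits are set
        assert(p.to_string()=="101");           // most significant bit leftmost
        p.flip();                               // invert every bit
        assert(p.to_string()=="010");
        assert(p.to_ullong()==2);
}
```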

15.3.3 pair

It is fairly common for a function to return two values. There are many ways of doing that, the simplest and often the best is to define a struct for the purpose. For example, we can return a value and a success indicator:

struct My_res {
         Entry* ptr;
         Error_code err;
};

My_res complex_search(vector<Entry>& v, const string& s)
{
         Entry* found = nullptr;
         Error_code err = Error_code::found;
         // ... search for s in v ...
         return {found,err};
}

void user(const string& s)
{
         My_res r = complex_search(entry_table,s);      // search entry_table
         if (r.err != Error_code::good) {
                  // ... handle error ...
         }
         //... use r.ptr ....
}

We could argue that encoding failure as the end iterator or a nullptr is more elegant, but that can express just one kind of failure. Often, we would like to return two separate values. Defining a specific named struct for each pair of values often works well and is quite readable if the names of the “pair of values” structs and their members are well chosen. However, for large code bases it can lead to a proliferation of names and conventions, and it doesn’t work well for generic code where consistent naming is essential. Consequently, the standard library provides pair as a general support for the “pair of values” use cases. Using pair, our simple example becomes:

pair<Entry*,Error_code> complex_search(vector<Entry>& v, const string& s)
{
        Entry* found = nullptr;
        Error_code err = Error_code::found;
        // ... search for s in v ...
        return {found,err};
}

void user(const string& s)
{
        auto r = complex_search(entry_table,s);           // search entry_table
        if (r.second != Error_code::good) {
                // ... handle error ...
        }
        // ... use r.first ....
}

The members of pair are named first and second. That makes sense from an implementer’s point of view, but in application code we may want to use our own names. Structured binding (§3.4.5) can be used to deal with that:

void user(const string& s)
{
         auto [ptr,success] = complex_search(entry_table,s);      // search entry_table
         if (success != Error_code::good) {
                 // ... handle error ...
         }
         // ... use ptr ....
}

The standard-library pair (from <utility>) is quite frequently used for “pair of values” use cases in the standard library and elsewhere. For example, the standard-library algorithm equal_range returns a pair of iterators specifying a subsequence meeting a predicate:

template<typename Forward_iterator, typename T, typename Compare>
         pair<Forward_iterator,Forward_iterator>
         equal_range(Forward_iterator first, Forward_iterator last, const T& val, Compare cmp);

Given a sorted sequence [first:last), equal_range() will return the pair representing the subsequence that matches the predicate cmp. We can use that to search in a sorted sequence of Records:

auto less = [](const Record& r1, const Record& r2) { return r1.name<r2.name;};          // compare names

void f(const vector<Record>& v)            // assume that v is sorted on its "name" field
{
        auto [first,last] = equal_range(v.begin(),v.end(),Record{"Reg"},less);

        for (auto p = first; p!=last; ++p)                // print all equal records
                 cout << *p;                                         // assume that << is defined for Record
}

A pair provides operators, such as =, ==, and <, if its elements do. Type deduction makes it easy to create a pair without explicitly mentioning its type. For example:

void f(vector<string>& v)
{
        pair p1 {v.begin(),2};                         // one way
        auto p2 = make_pair(v.begin(),2);    // another way
        // ...
}

Both p1 and p2 are of type pair<vector<string>::iterator,int>.
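The comparison operators mentioned above compare first, and then second only if the first members are equal. A minimal sketch, in which the Score alias and the better function are hypothetical:

```cpp
#include <cassert>
#include <string>
#include <utility>

using Score = std::pair<int,std::string>;       // hypothetical: (points, name)

// Return the larger of two scores; < on pair is lexicographic.
const Score& better(const Score& a, const Score& b)
{
        return (a<b) ? b : a;
}
```

For example, Score{10,"Ann"} compares less than Score{10,"Bob"} because the points tie and the names decide, whereas Score{9,"Zed"} is less than both because the points alone settle the comparison.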

When code doesn’t need to be generic, a simple struct with named members often leads to more maintainable code.

15.3.4 tuple

Like arrays, the standard-library containers are homogeneous; that is, all their elements are of a single type. However, sometimes we want to treat a sequence of elements of different types as a single object; that is, we want a heterogeneous container; pair is an example, but not all such heterogeneous sequences have just two elements. The standard library provides tuple as a generalization of pair with zero or more elements:

tuple t0 {};                                                                      // empty
tuple<string,int,double> t1 {"Shark",123,3.14};         // the type is explicitly specified
auto t2 = make_tuple(string{"Herring"},10,1.23);      // the type is deduced to tuple<string,int,double>
tuple t3 {"Cod"s,20,9.99};                                            // the type is deduced to tuple<string,int,double>

The elements (members) of a tuple are independent; there is no invariant (§4.3) maintained among them. If we want an invariant, we must encapsulate the tuple in a class that enforces it.

For a single, specific use, a simple struct is often ideal, but there are many generic uses where the flexibility of tuple saves us from having to define many structs at the cost of not having mnemonic names for the members. Members of a tuple are accessed through a get function template. For example:

string fish = get<0>(t1);              // get the first element: "Shark"
int count = get<1>(t1);                // get the second element: 123
double price = get<2>(t1);         // get the third element: 3.14

The elements of a tuple are numbered (starting with zero) and the index argument to get() must be a constant. The function get is a template function taking the index as a template value argument (§7.2.2).

Accessing members of a tuple by their index is general, ugly, and somewhat error-prone. Fortunately, an element of a tuple with a unique type in that tuple can be “named” by its type:

auto fish = get<string>(t1);                // get the string: "Shark"
auto count = get<int>(t1);                  // get the int: 123
auto price = get<double>(t1);           // get the double: 3.14

We can use get<> for writing also:

get<string>(t1) = "Tuna";       // write to the string
get<int>(t1) = 7;                      // write to the int
get<double>(t1) = 312;          // write to the double

Most uses of tuples are hidden in implementations of higher-level constructs. For example, we could access the members of t1 using structured binding (§3.4.5):

auto [fish, count, price] = t1;
cout << fish << ' ' << count << ' ' << price << '\n';      // read
fish = "Sea Bass";                                                       // write

Typically, such binding and its underlying use of a tuple is used for a function call:

auto [fish, count, price] = todays_catch();
cout << fish << ' ' << count << ' ' << price << '\n';

The real strength of tuple is when you have to store or pass around an unknown number of elements of unknown types as an object.

Explicitly iterating over the elements of a tuple is a bit messy, requiring recursion and compile-time evaluation of the function body:

template <size_t N = 0, typename... Ts>
constexpr void print(tuple<Ts...> tup)
{
        if constexpr (N<sizeof...(Ts)) {         // not yet at the end?
               cout << get<N>(tup) << ' ';         // print the Nth element
               print<N+1>(tup);                        // print the next element
        }
}

Here, sizeof...(Ts) gives the number of elements in Ts.

Using print() is straightforward:

print(t0);           // no output
print(t2);           // Herring 10 1.23
print(tuple{ "Norah", 17, "Gavin", 14, "Anya", 9, "Courtney", 9, "Ada", 0 });

Like pair, tuple provides operators, such as =, ==, and <, if its elements do. There are also conversions between a pair and a tuple with two members.
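As an alternative to the recursive print(), the elements of a tuple can be visited with std::apply and a fold expression. This is a sketch of a different technique, not a rewrite of print(): to_text is a hypothetical function, and it returns a string rather than writing to cout so that the result is easy to inspect:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <tuple>
#include <utility>

template<typename... Ts>
std::string to_text(const std::tuple<Ts...>& tup)
{
        std::ostringstream os;
        // apply() calls the lambda with the tuple's elements as separate arguments;
        // the fold over the comma operator writes each one followed by a space.
        std::apply([&os](const auto&... elems) { ((os << elems << ' '), ...); }, tup);
        return os.str();
}

void tuple_demo()
{
        assert(to_text(std::tuple{std::string{"Herring"},10,1.23})=="Herring 10 1.23 ");
        assert(to_text(std::tuple<>{}).empty());        // an empty tuple produces no output

        std::pair<int,char> p {1,'x'};
        std::tuple<int,char> t = p;                     // a pair converts to a tuple with two members
        assert(std::get<1>(t)=='x');
}
```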

15.4 Alternatives

In addition to the built-in union, the standard library offers three types to express alternatives:

Alternatives

union               A built-in type that holds one of a set of alternatives (§2.5)
variant<T...>       One of a specified set of alternatives (in <variant>)
optional<T>         A value of type T or no value (in <optional>)
any                 A value of one of an unbounded set of alternative types (in <any>)

These types offer related functionality to a user. Unfortunately, they don’t offer a unified interface.

15.4.1 variant

A variant<A,B,C> is often a safer and more convenient alternative to explicitly using a union (§2.5). Possibly the simplest example is to return either a value or an error code:

variant<string,Error_code> compose_message(istream& s)
{
        string mess;
        // ... read from s and compose message ...
        if (no_problems)
                return mess;                                               // return a string
        else
                return Error_code{some_problem};        // return an Error_code
}

When you assign or initialize a variant with a value, it remembers the type of that value. Later, we can inquire what type the variant holds and extract the value. For example:

auto m = compose_message(cin);

if (holds_alternative<string>(m)) {
       cout << get<string>(m);
}
else {
       auto err = get<Error_code>(m);
       // ... handle error ...
}

This style appeals to some people who dislike exceptions (see §4.4), but there are more interesting uses. For example, a simple compiler may need to distinguish between different kinds of nodes with different representations:

using Node = variant<Expression,Statement,Declaration,Type>;

void check(Node* p)
{
        if (holds_alternative<Expression>(*p)) {
                Expression& e = get<Expression>(*p);
                // ...
        }
        else if (holds_alternative<Statement>(*p)) {
                Statement& s = get<Statement>(*p);
                // ...
        }
        // ... Declaration and Type ...
}

This pattern of checking alternatives to decide on the appropriate action is so common and relatively inefficient that it deserves direct support:

void check(Node* p)
{
        visit(overloaded {
                 [](Expression& e) { /* ... */ },
                 [](Statement& s) { /* ... */ },
                 // ... Declaration and Type ...
        }, *p);
}

This is basically equivalent to a virtual function call, but potentially faster. As with all claims of performance, this “potentially faster” should be verified by measurements when performance is critical. For most uses, the difference in performance is insignificant.

The overloaded class is necessary and, strangely enough, not standard. It’s a “piece of magic” that builds an overload set from a set of arguments (usually lambdas):

template<class... Ts>
struct overloaded : Ts... {            // variadic template (§8.4)
        using Ts::operator()...;
};

template<class... Ts>
        overloaded(Ts...) -> overloaded<Ts...>;    // deduction guide

The “visitor” visit then applies () to the overload object, which selects the most appropriate lambda to call according to the overload rules.

A deduction guide is a mechanism for resolving subtle ambiguities, primarily for constructors of class templates in foundation libraries (§7.2.3).
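To see the pieces working together, here is a self-contained sketch. Expression and Statement are reduced to hypothetical stand-in structs, and describe() is an illustrative name, not part of the compiler example:

```cpp
#include <string>
#include <variant>

// The overloaded helper: inherits the call operators of all its arguments.
template<class... Ts>
struct overloaded : Ts... {
    using Ts::operator()...;
};

template<class... Ts>
overloaded(Ts...) -> overloaded<Ts...>;    // deduction guide

// Hypothetical stand-ins for two of the node types.
struct Expression { int value; };
struct Statement  { std::string text; };

using Node = std::variant<Expression, Statement>;

// Return a description of whichever alternative n currently holds.
// Both lambdas return std::string, as visit() requires a single result type.
std::string describe(const Node& n)
{
    return std::visit(overloaded {
        [](const Expression& e) { return "expression " + std::to_string(e.value); },
        [](const Statement& s)  { return "statement " + s.text; }
    }, n);
}
```

If a lambda for one of the alternatives were missing, the call to visit() would fail to compile, which is one advantage over a chain of holds_alternative tests.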

If we try to access a variant holding a different type from the expected one, bad_variant_access is thrown.
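The following sketch shows both the throwing access and a non-throwing alternative, get_if(), which returns a pointer that is nullptr when the variant holds a different type. The alias Value and the function names are hypothetical:

```cpp
#include <string>
#include <variant>

using Value = std::variant<std::string, int>;   // hypothetical alias

// get<int>() throws std::bad_variant_access if v does not hold an int.
bool throws_on_wrong_type(const Value& v)
{
    try {
        (void)std::get<int>(v);
        return false;
    }
    catch (const std::bad_variant_access&) {
        return true;
    }
}

// get_if<int>() returns nullptr instead of throwing.
int int_or(const Value& v, int fallback)
{
    if (const int* p = std::get_if<int>(&v))
        return *p;
    return fallback;
}
```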

15.4.2 optional

An optional<A> can be seen as a special kind of variant (like a variant<A,nothing>) or as a generalization of the idea of an A* either pointing to an object or being nullptr.

An optional can be useful for functions that may or may not return an object:

optional<string> compose_message(istream& s)
{
        string mess;

        // ... read from s and compose message ...

        if (no_problems)
                return mess;
        return {};          // the empty optional
}

Given that, we can write

if (auto m = compose_message(cin))
        cout << *m;               // note the dereference (*)
else {
        // ... handle error ...
}

This appeals to some people who dislike exceptions (see §4.4). Note the curious use of *. An optional is treated as a pointer to its object rather than the object itself.
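A runnable variation on this idea might look as follows. The functions first_word() and first_word_length() are hypothetical and take a string rather than an istream to keep the sketch self-contained:

```cpp
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>

// Return the first whitespace-delimited word of text, or the empty optional.
std::optional<std::string> first_word(const std::string& text)
{
    std::istringstream in{text};
    std::string word;
    if (in >> word)
        return word;
    return {};          // the empty optional
}

// Like a pointer, an optional supports -> as well as *.
std::size_t first_word_length(const std::string& text)
{
    if (auto w = first_word(text))
        return w->size();
    return 0;
}
```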

The optional equivalent to nullptr is the empty object, {}. For example:

int sum(optional<int> a, optional<int> b)
{
        int res = 0;
        if (a) res+=*a;
        if (b) res+=*b;
        return res;
}

int x = sum(17,19);         // 36
int y = sum(17,{});          // 17
int z = sum({},{});            // 0

If we try to access an optional that does not hold a value using * or ->, the result is undefined; an exception is not thrown. Thus, optional is not guaranteed type safe. Don’t try:

int sum2(optional<int> a, optional<int> b)
{
         return *a+*b;     // asking for trouble
}
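When a check before every dereference is inconvenient, the member functions value_or() and value() offer checked access. A minimal sketch, with hypothetical function names:

```cpp
#include <optional>

// value_or() returns the held value, or the given fallback if the optional is empty.
int sum_or_zero(std::optional<int> a, std::optional<int> b)
{
    return a.value_or(0) + b.value_or(0);
}

// Unlike *, value() throws std::bad_optional_access if the optional is empty.
bool value_throws(std::optional<int> a)
{
    try {
        (void)a.value();
        return false;
    }
    catch (const std::bad_optional_access&) {
        return true;
    }
}
```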

15.4.3 any

An any can hold a value of an arbitrary type and knows which type (if any) it holds. It is basically an unconstrained version of variant:

any compose_message(istream& s)
{
        string mess;

        // ... read from s and compose message ...

        if (no_problems)
                return mess;                        // return a string
        else
                return error_number;          // return an int
}

When you assign or initialize an any with a value, it remembers the type of that value. Later, we can extract the value held by the any by asserting the value’s expected type. For example:

auto m = compose_message(cin);
string& s = any_cast<string&>(m);
cout << s;

If we try to access an any holding a different type than the expected one, bad_any_cast is thrown.
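The exception type defined in <any> is std::bad_any_cast. To avoid the exception, any_cast() can instead be given a pointer to the any; it then returns nullptr on a type mismatch. A short sketch with hypothetical function names:

```cpp
#include <any>
#include <string>

// The pointer form of any_cast() returns nullptr if a does not hold a string.
std::string text_or(const std::any& a, const std::string& fallback)
{
    if (const std::string* p = std::any_cast<std::string>(&a))
        return *p;
    return fallback;
}

// The value form throws std::bad_any_cast if a does not hold an int.
bool cast_throws(const std::any& a)
{
    try {
        (void)std::any_cast<int>(a);
        return false;
    }
    catch (const std::bad_any_cast&) {
        return true;
    }
}
```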

15.5 Advice

[1] A library doesn’t have to be large or complicated to be useful; §16.1.

[2] A resource is anything that has to be acquired and (explicitly or implicitly) released; §15.2.1.

[3] Use resource handles to manage resources (RAII); §15.2.1; [CG: R.1].

[4] The problem with a T* is that it can be used to represent anything, so we cannot easily determine a “raw” pointer’s purpose; §15.2.1.

[5] Use unique_ptr to refer to objects of polymorphic type; §15.2.1; [CG: R.20].

[6] Use shared_ptr to refer to shared objects (only); §15.2.1; [CG: R.20].

[7] Prefer resource handles with specific semantics to smart pointers; §15.2.1.

[8] Don’t use a smart pointer where a local variable will do; §15.2.1.

[9] Prefer unique_ptr to shared_ptr; §6.3, §15.2.1.

[10] Use unique_ptr or shared_ptr as arguments or return values only to transfer ownership responsibilities; §15.2.1; [CG: F.26] [CG: F.27].

[11] Use make_unique() to construct unique_ptrs; §15.2.1; [CG: R.22].

[12] Use make_shared() to construct shared_ptrs; §15.2.1; [CG: R.23].

[13] Prefer smart pointers to garbage collection; §6.3, §15.2.1.

[14] Prefer spans to pointer-plus-count interfaces; §15.2.2; [CG: F.24].

[15] span supports range-for; §15.2.2.

[16] Use array where you need a sequence with a constexpr size; §15.3.1.

[17] Prefer array over built-in arrays; §15.3.1; [CG: SL.con.2].

[18] Use bitset if you need N bits and N is not necessarily the number of bits in a built-in integer type; §15.3.2.

[19] Don’t overuse pair and tuple; named structs often lead to more readable code; §15.3.3.

[20] When using pair, use template argument deduction or make_pair() to avoid redundant type specification; §15.3.3.

[21] When using tuple, use template argument deduction or make_tuple() to avoid redundant type specification; §15.3.3; [CG: T.44].

[22] Prefer variant to explicit use of unions; §15.4.1; [CG: C.181].

[23] When selecting among a set of alternatives using a variant, consider using visit() and overloaded(); §15.4.1.

[24] If more than one alternative is possible for a variant, optional, or any, check the tag before access; §15.4.
