Chapter 1. Containers

Sure, the STL has iterators, algorithms, and function objects, but for most C++ programmers, it’s the containers that stand out. More powerful and flexible than arrays, they grow (and often shrink) dynamically, manage their own memory, keep track of how many objects they hold, bound the algorithmic complexity of the operations they support, and much, much more. Their popularity is easy to understand. They’re simply better than their competition, regardless of whether that competition comes from containers in other libraries or is a container type you’d write yourself. STL containers aren’t just good. They’re really good.

This chapter is devoted to guidelines applicable to all the STL containers. Later chapters focus on specific container types. The topics addressed here include selecting the appropriate container given the constraints you face; avoiding the delusion that code written for one container type is likely to work with other container types; the significance of copying operations for objects in containers; difficulties that arise when pointers or auto_ptrs are stored in containers; the ins and outs of erasing; what you can and cannot accomplish with custom allocators; tips on how to maximize efficiency; and considerations for using containers in a threaded environment.

That’s a lot of ground to cover, but don’t worry. The Items break it down into bite-sized chunks, and along the way, you’re almost sure to pick up several ideas you can apply to your code now.

Item 1: Choose your containers with care

You know that C++ puts a variety of containers at your disposal, but do you realize just how varied that variety is? To make sure you haven’t overlooked any of your options, here’s a quick review.

• The standard STL sequence containers, vector, string, deque, and list.

• The standard STL associative containers, set, multiset, map, and multimap.

• The nonstandard sequence containers slist and rope. slist is a singly linked list, and rope is essentially a heavy-duty string. (A “rope” is a heavy-duty “string.” Get it?) You’ll find a brief overview of these nonstandard (but commonly available) containers in Item 50.

• The nonstandard associative containers hash_set, hash_multiset, hash_map, and hash_multimap. I examine these widely available hash-table-based variants on the standard associative containers in Item 25.

• vector<char> as a replacement for string. Item 13 describes the conditions under which such a replacement might make sense.

• vector as a replacement for the standard associative containers. As Item 23 makes clear, there are times when vector can outperform the standard associative containers in both time and space.

• Several standard non-STL containers, including arrays, bitset, valarray, stack, queue, and priority_queue. Because these are non-STL containers, I have little to say about them in this book, though Item 16 mentions a case where arrays are preferable to STL containers and Item 18 explains why bitset may be better than vector<bool>. It’s also worth bearing in mind that arrays can be used with STL algorithms, because pointers can be used as array iterators.

That’s a panoply of options, and it’s matched in richness by the range of considerations that should go into choosing among them. Unfortunately, most discussions of the STL take a fairly narrow view of the world of containers, ignoring many issues relevant to selecting the one that is most appropriate. Even the Standard gets into this act, offering the following guidance for choosing among vector, deque, and list:

vector, list, and deque offer the programmer different complexity trade-offs and should be used accordingly. vector is the type of sequence that should be used by default. list should be used when there are frequent insertions and deletions from the middle of the sequence. deque is the data structure of choice when most insertions and deletions take place at the beginning or at the end of the sequence.

If your primary concern is algorithmic complexity, I suppose this constitutes reasonable advice, but there is so much more to be concerned with.

In a moment, we’ll examine some of the important container-related issues that complement algorithmic complexity, but first I need to introduce a way of categorizing the STL containers that isn’t discussed as often as it should be. That is the distinction between contiguous-memory containers and node-based containers.

Contiguous-memory containers (also known as array-based containers) store their elements in one or more (dynamically allocated) chunks of memory, each chunk holding more than one container element. If a new element is inserted or an existing element is erased, other elements in the same memory chunk have to be shifted up or down to make room for the new element or to fill the space formerly occupied by the erased element. This kind of movement affects both performance (see Items 5 and 14) and exception safety (as we’ll soon see). The standard contiguous-memory containers are vector, string, and deque. The nonstandard rope is also a contiguous-memory container.

Node-based containers store only a single element per chunk of (dynamically allocated) memory. Insertion or erasure of a container element affects only pointers to nodes, not the contents of the nodes themselves, so element values need not be moved when something is inserted or erased. Containers representing linked lists, such as list and slist, are node-based, as are all the standard associative containers. (They’re typically implemented as balanced trees.) The nonstandard hashed containers use varying node-based implementations, as you’ll see in Item 25.
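A minimal sketch makes the practical difference visible. Watch what happens to a pointer into each kind of container when something is inserted at the front:

list<int> l;                    // a node-based container
l.push_back(1); l.push_back(2);
int* pl = &l.front();           // address of an existing element
l.insert(l.begin(), 0);         // no elements move, so pl still points
                                // at the same 1

vector<int> v;                  // a contiguous-memory container
v.push_back(1); v.push_back(2);
int* pv = &v[0];                // address of an existing element
v.insert(v.begin(), 0);         // existing elements shift (and memory may
                                // be reallocated), so pv must be assumed
                                // invalid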

With this terminology out of the way, we’re ready to sketch some of the questions most relevant when choosing among containers. In this discussion, I omit consideration of non-STL-like containers (e.g., arrays, bitsets, etc.), because this is, after all, a book on the STL.

• Do you need to be able to insert a new element at an arbitrary position in the container? If so, you need a sequence container; associative containers won’t do.

• Do you care how elements are ordered in the container? If not, a hashed container becomes a viable choice. Otherwise, you’ll want to avoid hashed containers.

• Must the container be part of standard C++? If so, that eliminates hashed containers, slist, and rope.

• What category of iterators do you require? If they must be random access iterators, you’re technically limited to vector, deque, and string, but you’d probably want to consider rope, too. (See Item 50 for information on rope.) If bidirectional iterators are required, you must avoid slist (see Item 50) as well as one common implementation of the hashed containers (see Item 25).

• Is it important to avoid movement of existing container elements when insertions or erasures take place? If so, you’ll need to stay away from contiguous-memory containers (see Item 5).

• Does the data in the container need to be layout-compatible with C? If so, you’re limited to vectors (see Item 16).

• Is lookup speed a critical consideration? If so, you’ll want to look at hashed containers (see Item 25), sorted vectors (see Item 23), and the standard associative containers—probably in that order.

• Do you mind if the underlying container uses reference counting? If so, you’ll want to steer clear of string, because many string implementations are reference-counted (see Item 13). You’ll need to avoid rope, too, because the definitive rope implementation is based on reference counting (see Item 50). You have to represent your strings somehow, of course, so you’ll want to consider vector<char>.

• Do you need transactional semantics for insertions and erasures? That is, do you require the ability to reliably roll back insertions and erasures? If so, you’ll want to use a node-based container. If you need transactional semantics for multiple-element insertions (e.g., the range form—see Item 5), you’ll want to choose list, because list is the only standard container that offers transactional semantics for multiple-element insertions. Transactional semantics are particularly important for programmers interested in writing exception-safe code. (Transactional semantics can be achieved with contiguous-memory containers, too, but there is a performance cost, and the code is not as straightforward. To learn more about this, consult Item 17 of Sutter’s Exceptional C++ [8].)

• Do you need to minimize iterator, pointer, and reference invalidation? If so, you’ll want to use node-based containers, because insertions and erasures on such containers never invalidate iterators, pointers, or references (unless they point to an element you are erasing). In general, insertions or erasures on contiguous-memory containers may invalidate all iterators, pointers, and references into the container.

• Do you care if using swap on containers invalidates iterators, pointers, or references? If so, you’ll need to avoid string, because string is alone in the STL in invalidating iterators, pointers, and references during swaps.

• Would it be helpful to have a sequence container with random access iterators where pointers and references to the data are not invalidated as long as nothing is erased and insertions take place only at the ends of the container? This is a very special case, but if it’s your case, deque is the container of your dreams. (Interestingly, deque’s iterators may be invalidated when insertions are made only at the ends of the container. deque is the only standard STL container whose iterators may be invalidated without also invalidating its pointers and references.)
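Here’s a minimal sketch of that peculiar guarantee in action; the behavior shown for insertions at deque’s ends is what the Standard specifies:

deque<int> d(10, 5);            // a deque of ten 5s

int& r = d[3];                  // reference to an existing element
int* p = &d[3];                 // pointer to the same element

d.push_front(0);                // insertions at either end may invalidate
d.push_back(99);                // all of d's iterators, but r and p
                                // remain valid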

These questions are hardly the end of the matter. For example, they don’t take into account the varying memory allocation strategies employed by the different container types. (Items 10 and 14 discuss some aspects of such strategies.) Still, they should be enough to convince you that, unless you have no interest in element ordering, standards conformance, iterator capabilities, layout compatibility with C, lookup speed, behavioral anomalies due to reference counting, the ease of implementing transactional semantics, or the conditions under which iterators are invalidated, you have more to think about than simply the algorithmic complexity of container operations. Such complexity is important, of course, but it’s far from the entire story.

The STL gives you lots of options when it comes to containers. If you look beyond the bounds of the STL, there are even more options. Before choosing a container, be sure to consider all your options. A “default container”? I don’t think so.

Item 2: Beware the illusion of container-independent code

The STL is based on generalization. Arrays are generalized into containers and parameterized on the types of objects they contain. Functions are generalized into algorithms and parameterized on the types of iterators they use. Pointers are generalized into iterators and parameterized on the type of objects they point to.

That’s just the beginning. Individual container types are generalized into sequence and associative containers, and similar containers are given similar functionality. Standard contiguous-memory containers (see Item 1) offer random-access iterators, while standard node-based containers (again, see Item 1) provide bidirectional iterators. Sequence containers support push_front and/or push_back, while associative containers don’t. Associative containers offer logarithmic-time lower_bound, upper_bound, and equal_range member functions, but sequence containers don’t.

With all this generalization going on, it’s natural to want to join the movement. This sentiment is laudable, and when you write your own containers, iterators, and algorithms, you’ll certainly want to pursue it. Alas, many programmers try to pursue it in a different manner. Instead of committing to particular types of containers in their software, they try to generalize the notion of a container so that they can use, say, a vector, but still preserve the option of replacing it with something like a deque or a list later—all without changing the code that uses it. That is, they strive to write container-independent code. This kind of generalization, well-intentioned though it is, is almost always misguided.

Even the most ardent advocate of container-independent code soon realizes that it makes little sense to try to write software that will work with both sequence and associative containers. Many member functions exist for only one category of container, e.g., only sequence containers support push_front or push_back, and only associative containers support count and lower_bound, etc. Even such basics as insert and erase have signatures and semantics that vary from category to category. For example, when you insert an object into a sequence container, it stays where you put it, but if you insert an object into an associative container, the container moves the object to where it belongs in the container’s sort order. For another example, the form of erase taking an iterator returns a new iterator when invoked on a sequence container, but it returns nothing when invoked on an associative container. (Item 9 gives an example of how this can affect the code you write.)
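To make the erase asymmetry concrete, here’s a sketch of the pattern Item 9 examines: erasing, during iteration, every element that satisfies some condition. (badValue is a hypothetical predicate.)

vector<int> v;
...
for (vector<int>::iterator i = v.begin(); i != v.end(); ) {
  if (badValue(*i)) i = v.erase(i);     // sequence erase returns the next
  else ++i;                             // valid iterator
}

set<int> s;
...
for (set<int>::iterator i = s.begin(); i != s.end(); ) {
  if (badValue(*i)) s.erase(i++);       // associative erase returns nothing,
  else ++i;                             // so advance i before it's invalidated
}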

Suppose, then, you aspire to write code that can be used with the most common sequence containers: vector, deque, and list. Clearly, you must program to the intersection of their capabilities, and that means no uses of reserve or capacity (see Item 14), because deque and list don’t offer them. The presence of list also means you give up operator[], and you limit yourself to the capabilities of bidirectional iterators. That, in turn, means you must stay away from algorithms that demand random access iterators, including sort, stable_sort, partial_sort, and nth_element (see Item 31).

On the other hand, your desire to support vector rules out use of push_front and pop_front, and both vector and deque put the kibosh on splice and the member form of sort. In conjunction with the constraints above, this latter prohibition means that there is no form of sort you can call on your “generalized sequence container.”

That’s the obvious stuff. If you violate any of those restrictions, your code will fail to compile with at least one of the containers you want to be able to use. The code that will compile is more insidious.

The main culprit is the different rules for invalidation of iterators, pointers, and references that apply to different sequence containers. To write code that will work correctly with vector, deque, and list, you must assume that any operation invalidating iterators, pointers, or references in any of those containers invalidates them in the container you’re using. Thus, you must assume that every call to insert invalidates everything, because deque::insert invalidates all iterators and, lacking the ability to call capacity, vector::insert must be assumed to invalidate all pointers and references. (Item 1 explains that deque is unique in sometimes invalidating its iterators without invalidating its pointers and references.) Similar reasoning leads to the conclusion that, unless you’re erasing the last element of a container, calls to erase must also be assumed to invalidate everything.

Want more? You can’t pass the data in the container to a C interface, because only vector supports that (see Item 16). You can’t instantiate your container with bool as the type of objects to be stored, because, as Item 18 explains, vector<bool> doesn’t always behave like a vector, and it never actually stores bools. You can’t assume list’s constant-time insertions and erasures, because vector and deque take linear time to perform those operations.

When all is said and done, you’re left with a “generalized sequence container” where you can’t call reserve, capacity, operator[], push_front, pop_front, splice, or any algorithm requiring random access iterators; a container where every call to insert and erase takes linear time and invalidates all iterators, pointers, and references; and a container incompatible with C where bools can’t be stored. Is that really the kind of container you want to use in your applications? I suspect not.

If you rein in your ambition and decide you’re willing to drop support for list, you still give up reserve, capacity, push_front, and pop_front; you still must assume that all calls to insert and erase take linear time and invalidate everything; you still lose layout compatibility with C; and you still can’t store bools.

If you abandon the sequence containers and shoot instead for code that can work with different associative containers, the situation isn’t much better. Writing for both set and map is close to impossible, because sets store single objects while maps store pairs of objects. Even writing for both set and multiset (or map and multimap) is tough. The insert member function taking only a value has different return types for sets/maps than for their multi cousins, and you must religiously avoid making any assumptions about how many copies of a value are stored in a container. With map and multimap, you must avoid using operator[], because that member function exists only for map.
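The insert asymmetry alone is enough to sink “generalized” code. A quick sketch:

set<int> s;
multiset<int> ms;

pair<set<int>::iterator, bool>          // set's insert reports whether
  result = s.insert(10);                // the insertion took place

multiset<int>::iterator                 // multiset's insert always succeeds,
  where = ms.insert(10);                // so it returns only an iterator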

Face the truth: it’s not worth it. The different containers are different, and they have strengths and weaknesses that vary in significant ways. They’re not designed to be interchangeable, and there’s little you can do to paper that over. If you try, you’re merely tempting fate, and fate doesn’t like to be tempted.

Still, the day will dawn when you’ll realize that a container choice you made was, er, suboptimal, and you’ll need to use a different container type. You now know that when you change container types, you’ll not only need to fix whatever problems your compilers diagnose, you’ll also need to examine all the code using the container to see what needs to be changed in light of the new container’s performance characteristics and rules for invalidation of iterators, pointers, and references. If you switch from a vector to something else, you’ll also have to make sure you’re no longer relying on vector’s C-compatible memory layout, and if you switch to a vector, you’ll have to ensure that you’re not using it to store bools.

Given the inevitability of having to change container types from time to time, you can facilitate such changes in the usual manner: by encapsulating, encapsulating, encapsulating. One of the easiest ways to do this is through the liberal use of typedefs for container types. Hence, instead of writing this,

class Widget { ... };

vector<Widget> vw;

Widget bestWidget;

...                                            // give bestWidget a value

vector<Widget>::iterator i =                   // find a Widget with the
  find(vw.begin(), vw.end(), bestWidget);      // same value as bestWidget

write this:

class Widget { ... };

typedef vector<Widget> WidgetContainer;

WidgetContainer cw;

Widget bestWidget;

...

WidgetContainer::iterator i = find(cw.begin(), cw.end(), bestWidget);

This makes it a lot easier to change container types, something that’s especially convenient if the change in question is simply to add a custom allocator. (Such a change doesn’t affect the rules for iterator/pointer/reference invalidation.)

class Widget { ... };

template<typename T>                        // see Item 10 for why this
class SpecialAllocator { ... };             // needs to be a template

typedef vector<Widget, SpecialAllocator<Widget> > WidgetContainer;

WidgetContainer cw;                         // still works

Widget bestWidget;

...

WidgetContainer::iterator i =
  find(cw.begin(), cw.end(), bestWidget);  // still works

If the encapsulating aspects of typedefs mean nothing to you, you’re still likely to appreciate the work they can save, especially for iterator types. For example, if you have an object of type

map<string,
    vector<Widget>::iterator,
    CIStringCompare>                // CIStringCompare is "case-
                                    // insensitive string compare;"
                                    // Item 19 describes it

and you want to walk through the map using const_iterators, do you really want to spell out

map<string, vector<Widget>::iterator, CIStringCompare>::const_iterator

more than once? Once you’ve used the STL a little while, you’ll realize that typedefs are your friends.

A typedef is just a synonym for some other type, so the encapsulation it affords is purely lexical. A typedef doesn’t prevent a client from doing (or depending on) anything they couldn’t already do (or depend on). You need bigger ammunition if you want to limit client exposure to the container choices you’ve made. You need classes.

To limit the code that may require modification if you replace one container type with another, hide the container in a class, and limit the amount of container-specific information visible through the class interface. For example, if you need to create a customer list, don’t use a list directly. Instead, create a CustomerList class, and hide a list in its private section:

class CustomerList {
private:
  typedef list<Customer> CustomerContainer;
  typedef CustomerContainer::iterator CCIterator;

  CustomerContainer customers;

public:                               // limit the amount of list-specific
  ...                                 // information visible through
};                                    // this interface

At first, this may seem silly. After all, a customer list is a list, right? Well, maybe. Later you may discover that you don’t need to insert or erase customers from the middle of the list as often as you’d anticipated, but you do need to quickly identify the top 20% of your customers—a task tailor-made for the nth_element algorithm (see Item 31). But nth_element requires random access iterators. It won’t work with a list. In that case, your customer “list” might be better implemented as a vector or a deque.
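For example, identifying the top 20% might then look something like this. (This is only a sketch; worthinessCompare is a hypothetical comparison function that orders better customers first.)

vector<Customer> customers;
...
vector<Customer>::iterator cutoff =
  customers.begin() + customers.size() / 5;      // boundary of the top 20%

nth_element(customers.begin(), cutoff,           // put the top 20% (in no
            customers.end(),                     // particular order) in front
            worthinessCompare);                  // of cutoff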

When you consider this kind of change, you still have to check every CustomerList member function and every friend to see how they’ll be affected (in terms of performance and iterator/pointer/reference invalidation, etc.), but if you’ve done a good job of encapsulating CustomerList’s implementation details, the impact on CustomerList clients should be small. You can’t write container-independent code, but they might be able to.

Item 3: Make copying cheap and correct for objects in containers

Containers hold objects, but not the ones you give them. Instead, when you add an object to a container (via, e.g., insert or push_back, etc.), what goes into the container is a copy of the object you specify.

Once an object is in a container, it’s not uncommon for it to be copied further. If you insert something into or erase something from a vector, string, or deque, existing container elements are typically moved (copied) around (see Items 5 and 14). If you use any of the sorting algorithms (see Item 31); next_permutation or prev_permutation; remove, unique, or their ilk (see Item 32); rotate or reverse, etc., objects will be moved (copied) around. Yes, copying objects is the STL way.

It may interest you to know how all this copying is accomplished. That’s easy. An object is copied by using its copying member functions, in particular, its copy constructor and its copy assignment operator. (Clever names, no?) For a user-defined class like Widget, these functions are traditionally declared like this:

class Widget {
public:
  ...
  Widget(const Widget&);               // copy constructor
  Widget& operator=(const Widget&);    // copy assignment operator
  ...
};

As always, if you don’t declare these functions yourself, your compilers will declare them for you. Also as always, the copying of built-in types (e.g., ints, pointers, etc.) is accomplished by simply copying the underlying bits. (For details on copy constructors and assignment operators, consult any introductory book on C++. In Effective C++, Items 11 and 27 focus on the behavior of these functions.)

With all this copying taking place, the motivation for this Item should now be clear. If you fill a container with objects where copying is expensive, the simple act of putting the objects into the container could prove to be a performance bottleneck. The more things get moved around in the container, the more memory and cycles you’ll blow on making copies. Furthermore, if you have objects where “copying” has an unconventional meaning, putting such objects into a container will invariably lead to grief. (For an example of the kind of grief it can lead to, see Item 8.)

In the presence of inheritance, of course, copying leads to slicing. That is, if you create a container of base class objects and you try to insert derived class objects into it, the derivedness of the objects will be removed as the objects are copied (via the base class copy constructor) into the container:

vector<Widget> vw;

class SpecialWidget:                    // SpecialWidget inherits from
  public Widget { ... };                // Widget above

SpecialWidget sw;

vw.push_back(sw);                       // sw is copied as a base class
                                        // object into vw. Its specialness
                                        // is lost during the copying

The slicing problem suggests that inserting a derived class object into a container of base class objects is almost always an error. If you want the resulting object to act like a derived class object, e.g., invoke derived class virtual functions, etc., it is always an error. (For more background on the slicing problem, consult Effective C++, Item 22. For another example of where it arises in the STL, see Item 38.)

An easy way to make copying efficient, correct, and immune to the slicing problem is to create containers of pointers instead of containers of objects. That is, instead of creating a container of Widget, create a container of Widget*. Copying pointers is fast, it always does exactly what you expect (it copies the bits making up the pointer), and nothing gets sliced when a pointer is copied. Unfortunately, containers of pointers have their own STL-related headaches. You can read about them in Items 7 and 33. As you seek to avoid those headaches while still dodging efficiency, correctness, and slicing concerns, you’ll probably discover that containers of smart pointers are an attractive option. To learn more about this option, turn to Item 7.
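To give you a taste of that option, here’s a sketch using a reference-counted smart pointer. I show Boost’s shared_ptr here (the assumption being that Boost, or a comparable smart pointer, is available to you), and draw is a hypothetical virtual function:

typedef boost::shared_ptr<Widget> SPW;   // SPW = "shared_ptr to Widget"

vector<SPW> vsp;

vsp.push_back(SPW(new SpecialWidget));   // no slicing: only the (cheap)
                                         // smart pointer is copied

vsp[0]->draw();                          // virtual calls bind dynamically,
                                         // as expected

...                                      // when vsp is destroyed, the last
                                         // SPWs go away, and the Widgets
                                         // they point to are automatically
                                         // deleted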

If all this makes it sound like the STL is copy-crazy, think again. Yes, the STL makes lots of copies, but it’s generally designed to avoid copying objects unnecessarily. In fact, it’s generally designed to avoid creating objects unnecessarily. Contrast this with the behavior of C’s and C++’s only built-in container, the lowly array:

Widget w[maxNumWidgets];  // create an array of maxNumWidgets
                          // Widgets, default-constructing each one

This constructs maxNumWidgets Widget objects, even if we normally expect to use only a few of them or we expect to immediately overwrite each default-constructed value with values we get from someplace else (e.g., a file). Using the STL instead of an array, we can use a vector that grows when it needs to:

vector<Widget> vw;            // create a vector with zero Widget
                              // objects that will expand as needed

We can also create an empty vector that contains enough space for maxNumWidgets Widgets, but where zero Widgets have been constructed:

vector<Widget> vw;

vw.reserve(maxNumWidgets);    // see Item 14 for details on reserve

Compared to arrays, STL containers are much more civilized. They create (by copying) only as many objects as you ask for, they do it only when you direct them to, and they use a default constructor only when you say they should. Yes, STL containers make copies, and yes, you need to understand that, but don’t lose sight of the fact that they’re still a big step up from arrays.

Item 4: Call empty instead of checking size() against zero

For any container c, writing

if (c.size() == 0) ...

is essentially equivalent to writing

if (c.empty()) ...

That being the case, you might wonder why one construct should be preferred to the other, especially in view of the fact that empty is typically implemented as an inline function that simply returns whether size returns 0.

You should prefer the construct using empty, and the reason is simple: empty is a constant-time operation for all standard containers, but for some list implementations, size may take linear time.

But what makes list so troublesome? Why can’t it, too, offer a constant-time size? The answer has much to do with the range form of list’s unique splicing functions. Consider this code:

list<int> list1;
list<int> list2;

...

list1.splice(                                     // move all nodes in list2
   list1.end(), list2,                            // from the first occurrence
   find(list2.begin(), list2.end(), 5),           // of 5 through the last
   find(list2.rbegin(), list2.rend(), 10).base()  // occurrence of 10 to the
);                                                // end of list1. See Item 28
                                                  // for info on the "base()" call

This code won’t work unless list2 contains a 10 somewhere beyond a 5, but let’s assume that’s not a problem. Instead, let’s focus on this question: how many elements are in list1 after the splice? Clearly, list1 after the splice has as many elements as it did before the splice plus however many elements were spliced into it. But how many elements were spliced into it? As many as were in the range defined by find(list2.begin(), list2.end(), 5) and find(list2.rbegin(), list2.rend(), 10).base(). Okay, how many is that? Without traversing the range and counting them, there’s no way to know. And therein lies the problem.

Suppose you’re responsible for implementing list. list isn’t just any container, it’s a standard container, so you know your class will be widely used. You naturally want your implementation to be as efficient as possible. You figure that clients will commonly want to find out how many elements are in a list, so you’d like to make size a constant-time operation. You’d thus like to design list so it always knows how many elements it contains.

At the same time, you know that of all the standard containers, only list offers the ability to splice elements from one place to another without copying any data. You reason that many list clients will choose list specifically because it offers high-efficiency splicing. They know that splicing a range from one list to another can be accomplished in constant time, and you know that they know it, so you certainly want to meet their expectation that splice is a constant-time member function.

This puts you in a quandary. If size is to be a constant-time operation, each list member function must update the sizes of the lists on which it operates. That includes splice. But the only way for the range version of splice to update the sizes of the lists it modifies is for it to count the number of elements being spliced, and doing that would prevent it from achieving the constant-time performance you want for it. If you eliminate the requirement that the range form of splice update the sizes of the lists it’s modifying, splice can be made constant-time, but then size becomes a linear-time operation. In general, it will have to traverse its entire data structure to see how many elements it contains. No matter how you look at it, something—size or the range form of splice—has to give. One or the other can be a constant-time operation, but not both.
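To make the quandary concrete, here’s a sketch of a hypothetical list-like wrapper (emphatically not std::list) that opts for a constant-time size. Note what its range splice is then forced to do:

template<typename T>
class SizedList {
public:
  typedef typename list<T>::iterator iterator;

  SizedList(): theSize(0) {}

  size_t size() const { return theSize; }        // constant-time: just return
                                                 // the cached count

  void splice(iterator position, SizedList& other,
              iterator first, iterator last)
  {
    size_t n = distance(first, last);            // linear-time walk: the price
                                                 // of caching the size
    theList.splice(position, other.theList,      // the splice itself is still
                   first, last);                 // constant-time
    theSize += n;
    other.theSize -= n;
  }

private:
  list<T> theList;
  size_t theSize;
};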

Different list implementations resolve this conflict in different ways, depending on whether their authors choose to maximize the efficiency of size or the range form of splice. If you happen to be using a list implementation where a constant-time range form of splice was given higher priority than a constant-time size, you’ll be better off calling empty than size, because empty is always a constant-time operation. Even if you’re not using such an implementation, you might find yourself using such an implementation in the future. For example, you might port your code to a different platform where a different implementation of the STL is available, or you might just decide to switch to a different STL implementation for your current platform.

No matter what happens, you can’t go wrong if you call empty instead of checking to see if size() == 0. So call empty whenever you need to know whether a container has zero elements.

Item 5: Prefer range member functions to their single-element counterparts

Quick! Given two vectors, v1 and v2, what’s the easiest way to make v1’s contents be the same as the second half of v2’s? Don’t agonize over the definition of “half” when v2 has an odd number of elements, just do something reasonable.

Time’s up! If your answer was

v1.assign(v2.begin() + v2.size() / 2, v2.end());

or something quite similar, you get full credit and a gold star. If your answer involved more than one statement, but didn’t use any kind of loop, you get nearly full credit, but no gold star. If your answer involved a loop, you’ve got some room for improvement, and if your answer involved multiple loops, well, let’s just say that you really need this book.

By the way, if your response to the answer to the question included “Huh?”, pay close attention, because you’re going to learn something really useful.

This quiz is designed to do two things. First, it affords me an opportunity to remind you of the existence of the assign member function, a convenient beast that too many programmers overlook. It’s available for all the standard sequence containers (vector, string, deque, and list). Whenever you have to completely replace the contents of a container, you should think of assignment. If you’re just copying one container to another of the same type, operator= is the assignment function of choice, but as this example demonstrates, assign is available for the times when you want to give a container a completely new set of values, but operator= won’t do what you want.

The second reason for the quiz is to demonstrate why range member functions are superior to their single-element alternatives. A range member function is a member function that, like STL algorithms, uses two iterator parameters to specify a range of elements over which something should be done. Without using a range member function to solve this Item’s opening problem, you’d have to write an explicit loop, probably something like this:

vector<Widget> v1, v2;                 // assume v1 and v2 are vectors
                                       // of Widgets
...

v1.clear();
for (vector<Widget>::const_iterator ci = v2.begin() + v2.size() / 2;
     ci != v2.end();
     ++ci)
  v1.push_back(*ci);

Item 43 examines in detail why you should try to avoid writing explicit loops, but you don’t need to read that Item to recognize that writing this code is a lot more work than is writing the call to assign. As we’ll see shortly, the loop also happens to impose an efficiency penalty, but we’ll deal with that in a moment.

One way to avoid the loop is to follow the advice of Item 43 and employ an algorithm instead:

v1.clear();
copy(v2.begin() + v2.size() / 2, v2.end(), back_inserter(v1));

Writing this is still more work than writing the call to assign. Furthermore, though no loop is present in this code, one certainly exists inside copy (see Item 43). As a result, the efficiency penalty remains. Again, I’ll discuss that below. At this point, I want to digress briefly to observe that almost all uses of copy where the destination range is specified using an insert iterator (i.e., via inserter, back_inserter, or front_inserter) can be—should be—replaced with calls to range member functions. Here, for example, the call to copy can be replaced with a range version of insert:

v1.insert(v1.end(), v2.begin() + v2.size() / 2, v2.end());

This involves slightly less typing than the call to copy, but it also says more directly what is happening: data is being inserted into v1. The call to copy expresses that, too, but less directly. It puts the emphasis in the wrong place. The interesting aspect of what is happening is not that elements are being copied, it’s that v1 is having new data added to it. The insert member function makes that clear. The use of copy obscures it. There’s nothing interesting about the fact that things are being copied, because the STL is built on the assumption that things will be copied. Copying is so fundamental to the STL, it’s the topic of Item 3 in this book!

Too many STL programmers overuse copy, so the advice I just gave bears repeating: Almost all uses of copy where the destination range is specified using an insert iterator should be replaced with calls to range member functions.

Returning to our assign example, we’ve already identified two reasons to prefer range member functions to their single-element counterparts:

• It’s generally less work to write the code using the range member functions.

• Range member functions tend to lead to code that is clearer and more straightforward.

In short, range member functions yield code that is easier to write and easier to understand. What’s not to like?

Alas, some will dismiss these arguments as matters of programming style, and developers enjoy arguing about style issues almost as much as they enjoy arguing about which is the One True Editor. (As if there’s any doubt. It’s Emacs.) It would be helpful to have a more universally agreed-upon criterion for establishing the superiority of range member functions to their single-element counterparts. For the standard sequence containers, we have one: efficiency. When dealing with the standard sequence containers, application of single-element member functions makes more demands on memory allocators, copies objects more frequently, and/or performs redundant operations compared to range member functions that achieve the same end.

For example, suppose you’d like to copy an array of ints into the front of a vector. (The data might be in an array instead of a vector in the first place, because the data came from a legacy C API. For a discussion of the issues that arise when mixing STL containers and C APIs, see Item 16.) Using the vector range insert function, it’s honestly trivial:

int data[numValues];                             // assume numValues is
                                                 // defined elsewhere
vector<int> v;

...

v.insert(v.begin(), data, data + numValues);     // insert the ints in data
                                                 // into v at the front

Using iterative calls to insert in an explicit loop, it would probably look more or less like this:

vector<int>::iterator insertLoc(v.begin());
for (int i = 0; i < numValues; ++i) {
  insertLoc = v.insert(insertLoc, data[i]);
  ++insertLoc;
}

Notice how we have to be careful to save the return value of insert for the next loop iteration, then increment the returned iterator. If we didn’t update insertLoc after each insertion, we’d have two problems. First, all loop iterations after the first would yield undefined behavior, because each insert call would invalidate insertLoc. Second, even if insertLoc remained valid, we’d always insert at the front of the vector (i.e., at v.begin()), and the result would be that the ints copied into v would end up in reverse order.

If we follow the lead of Item 43 and replace the loop with a call to copy, we get something like this:

copy(data, data + numValues, inserter(v, v.begin()));

By the time the copy template has been instantiated, the code based on copy and the code using the explicit loop will be almost identical, so for purposes of an efficiency analysis, we’ll focus on the explicit loop, keeping in mind that the analysis is equally valid for the code employing copy. Looking at the explicit loop just makes it easier to understand where the efficiency hits come from. Yes, that’s “hits,” plural, because the code using the single-element version of insert levies up to three different performance taxes on you, none of which you pay if you use the range version of insert.

The first tax consists of unnecessary function calls. Inserting numValues elements into v one at a time naturally costs you numValues calls to insert. Using the range form of insert, you pay for only one function call, a savings of numValues-1 calls. Of course, it’s possible that inlining will save you from this tax, but then again, it’s possible that it won’t. Only one thing is sure. With the range form of insert, you definitely won’t pay it.

Inlining won’t save you from the second tax, which is the cost of inefficiently moving the existing elements in v to their final post-insertion positions. Each time insert is called to add a new value to v, every element above the insertion point must be moved up one position to make room for the new element. So the element at position p must be moved up to position p+1, etc. In our example, we’re inserting numValues elements at the front of v. That means that each element in v prior to the insertions will have to be shifted up a total of numValues positions. But each will be shifted up only one position each time insert is called, so each element will be moved a total of numValues times. If v has n elements prior to the insertions, a total of n*numValues moves will take place. In this example, v holds ints, so each move will probably boil down to an invocation of memmove, but if v held a user-defined type like Widget, each move would incur a call to that type’s assignment operator or copy constructor. (Most calls would be to the assignment operator, but each time the last element in the vector was moved, that move would be accomplished by calling the element’s copy constructor.) In the general case, then, inserting numValues new objects one at a time into the front of a vector<Widget> holding n elements exacts a cost of n*numValues function calls: (n-1)*numValues calls to the Widget assignment operator and numValues calls to the Widget copy constructor. Even if these calls are inlined, you’re still doing the work to move the elements in v numValues times.

In contrast, the Standard requires that range insert functions move existing container elements directly into their final positions, i.e., at a cost of one move per element. The total cost is n moves, numValues to the copy constructor for the type of objects in the container, the remainder to that type’s assignment operator. Compared to the single-element insert strategy, the range insert performs n*(numValues-1) fewer moves. Think about that for a minute. It means that if numValues is 100, the range form of insert would do 99% fewer moves than the code making repeated calls to the single-element form of insert!

Before I move on to the third efficiency cost of single-element member functions vis-à-vis their range counterparts, I have a minor correction. What I wrote in the previous paragraph is the truth and nothing but the truth, but it’s not quite the whole truth. A range insert function can move an element into its final position in a single move only if it can determine the distance between two iterators without losing its place. This is almost always possible, because all forward iterators offer this functionality, and forward iterators are nearly ubiquitous. All iterators for the standard containers offer forward iterator functionality. So do the iterators for the nonstandard hashed containers (see Item 25). Pointers acting as iterators into arrays offer such functionality, too. In fact, the only standard iterators that don’t offer forward iterator capabilities are input and output iterators. Thus, everything I wrote above is true except when the iterators passed to the range form of insert are input iterators (e.g., istream_iterators—see Item 6). In that case only, range insert is allowed to move elements into their final positions one place at a time, and if it’s implemented that way, its advantage as regards number of element movements ceases to exist. (For output iterators, this issue fails to arise, because output iterators can’t be used to specify a range for insert.)

The final performance tax levied on those so foolish as to use repeated single-element insertions instead of a single range insertion has to do with memory allocation, though it has a nasty copying side to it, too. As Item 14 explains, when you try to insert an element into a vector whose memory is full, the vector allocates new memory with more capacity, copies its elements from the old memory to the new memory, destroys the elements in the old memory, and deallocates the old memory. Then it adds the element that is being inserted. Item 14 also explains that vector implementations increase their capacity by some multiplicative factor each time they run out of memory, so inserting numValues new elements often results in new memory being allocated a number of times proportional to the logarithm of numValues. Inserting 1000 elements one at a time into a vector with a growth rate of 1.5 could thus result in 18 new allocations (including their incumbent copying of elements). In contrast (and, by now, predictably), a range insertion can figure out how much new memory it needs before it starts inserting things (assuming it is given forward iterators), so it need not reallocate a vector’s underlying memory more than once. As you can imagine, the savings can be considerable.

Because log base 1.5 of 1000 is approximately 17.03, which rounds up to 18.
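If you’re curious how your implementation behaves, a sketch like this counts the reallocations by watching for capacity changes during single-element insertions:

vector<int> v;
size_t reallocations = 0;
size_t oldCapacity = v.capacity();

for (int i = 0; i < 1000; ++i) {
  v.push_back(i);                        // single-element insertion
  if (v.capacity() != oldCapacity) {     // a capacity change means v
    ++reallocations;                     // allocated new memory and copied
    oldCapacity = v.capacity();          // all its elements into it
  }
}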

The analysis I’ve just performed is for vectors, but the same reasoning applies to strings, too. For deques, the reasoning is similar, but deques manage their memory differently from vectors and strings, so the argument about repeated memory allocations doesn’t apply. The argument about moving elements an unnecessarily large number of times, however, generally does apply (though the details are different), as does the observation about the number of function calls.

Among the standard sequence containers, that leaves only list, but here, too, there is a performance advantage to using a range form of insert instead of a single-element form. The argument about repeated function calls continues to be valid, of course, but, because of the way linked lists work, the copying and memory allocation issues fail to arise. Instead, there is a new problem: repeated superfluous assignments to the next and prev pointers of some nodes in the list.

Each time an element is added to a linked list, the list node holding that element must have its next and prev pointers set, and of course the node preceding the new node (let’s call it B, for “before”) must set its next pointer and the node following the new node (we’ll call it A, for “after”) must set its prev pointer:

[Figure: a new node is inserted between nodes B and A. The new node’s next pointer is set to point to A and its prev pointer to B; B’s next pointer and A’s prev pointer are updated to point to the new node.]

When a series of new nodes is added one by one by calling list’s single-element insert, all but the last new node will set its next pointer twice, once to point to A, a second time to point to the element inserted after it. A will set its prev pointer to point to a new node each time one is inserted in front of it. If numValues nodes are inserted in front of A, numValues-1 superfluous assignments will be made to the inserted nodes’ next pointers, and numValues-1 superfluous assignments will be made to A’s prev pointer. All told, that’s 2*(numValues-1) unnecessary pointer assignments. Pointer assignments are cheap, of course, but why pay for them if you don’t have to?

By now it should be clear that you don’t have to, and the key to evading the cost is to use list’s range form of insert. Because that function knows how many nodes will ultimately be inserted, it can avoid the superfluous pointer assignments, using only a single assignment to each pointer to set it to its proper post-insertion value.

For the standard sequence containers, then, a lot more than programming style is on the line when choosing between single-element insertions and range insertions. For the associative containers, the efficiency case is harder to make, though the issue of extra function call overhead for repeated calls to single-element insert continues to apply. Furthermore, certain special kinds of range insertions may lead to optimization possibilities in associative containers, too, but as far as I can tell, such optimizations currently exist only in theory. By the time you read this, of course, theory may have become practice, so range insertions into associative containers may indeed be more efficient than repeated single-element insertions. Certainly they are never less efficient, so you have nothing to lose by preferring them.

Even without the efficiency argument, the fact remains that using range member functions requires less typing as you write the code, and it also yields code that is easier to understand, thus enhancing your software’s long-term maintainability. Those two characteristics alone should convince you to prefer range member functions. The efficiency edge is really just a bonus.

Having droned on this long about the wonder of range member functions, it seems only appropriate that I summarize them for you. Knowing which member functions support ranges makes it a lot easier to recognize opportunities to use them. In the signatures below, the parameter type iterator literally means the iterator type for the container, i.e., container::iterator. The parameter type InputIterator, on the other hand, means that any input iterator is acceptable.

• Range construction. All standard containers offer a constructor of this form:

container::container(InputIterator begin,   // beginning of range
                     InputIterator end);    // end of range

When the iterators passed to this constructor are istream_iterators or istreambuf_iterators (see Item 29), you may encounter C++’s most vexing parse, one that causes your compilers to interpret this construct as a function declaration instead of as the definition of a new container object. Item 6 tells you everything you need to know about that parse, including how to defeat it.

• Range insertion. All standard sequence containers offer this form of insert:

void container::insert(iterator position,     // where to insert the range
                       InputIterator begin,   // start of range to insert
                       InputIterator end);    // end of range to insert

Associative containers use their comparison function to determine where elements go, so they offer a signature that omits the position parameter:

void container::insert(InputIterator begin, InputIterator end);

When looking for ways to replace single-element inserts with range versions, don’t forget that some single-element variants camouflage themselves by adopting different function names. For example, push_front and push_back both insert single elements into containers, even though they’re not called insert. If you see a loop calling push_front or push_back, or if you see an algorithm such as copy being passed front_inserter or back_inserter as a parameter, you’ve discovered a place where a range form of insert is likely to be a superior strategy.
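The transformation is usually mechanical. Using the data array and vector v from the earlier example:

// instead of this loop of camouflaged single-element inserts...
for (int i = 0; i < numValues; ++i) {
  v.push_back(data[i]);
}

// ...prefer the equivalent range insert:
v.insert(v.end(), data, data + numValues);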

• Range erasure. Every standard container offers a range form of erase, but the return types differ for sequence and associative containers. Sequence containers provide this,

iterator container::erase(iterator begin, iterator end);

while associative containers offer this:

void container::erase(iterator begin, iterator end);

Why the difference? The claim is that having the associative container version of erase return an iterator (to the element following the one that was erased) would incur an unacceptable performance penalty. I’m one of many who find this claim specious, but the Standard says what the Standard says, and what the Standard says is that sequence and associative container versions of erase have different return types.

Most of this Item’s efficiency analysis for insert has analogues for erase. The number of function calls is still greater for repeated calls to single-element erase than for a single call to range erase. Element values must still be shifted one position at a time towards their final destination when using single-element erase, while range erase can move them into their final positions in a single move.

One argument about vector’s and string’s insert that fails to apply to erase has to do with repeated allocations. (For erase, of course, it would concern repeated deallocations.) That’s because the memory for vectors and strings automatically grows to accommodate new elements, but it doesn’t automatically shrink when the number of elements is reduced. (Item 17 describes how you may reduce the unnecessary memory held by a vector or string.)

One particularly important manifestation of range erase is the erase-remove idiom. You can read all about it in Item 32.
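In brief, the idiom looks like this; Item 32 explains why both calls are needed:

v.erase(remove(v.begin(), v.end(), 99),   // remove shuffles the values to be
        v.end());                         // kept to the front and returns the
                                          // new logical end; range erase then
                                          // lops off the leftovers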

• Range assignment. As I noted at the beginning of this Item, all standard sequence containers offer a range form of assign:

void container::assign(InputIterator begin, InputIterator end);

So there you have it, three solid arguments for preferring range member functions to their single-element counterparts. Range member functions are easier to write, they express your intent more clearly, and they exhibit higher performance. That’s a troika that’s hard to beat.

Item 6: Be alert for C++’s most vexing parse

Suppose you have a file of ints and you’d like to copy those ints into a list. This seems like a reasonable way to do it:

ifstream dataFile("ints.dat");

list<int> data(istream_iterator<int>(dataFile),  // warning! this doesn't do
               istream_iterator<int>());         // what you think it does

The idea here is to pass a pair of istream_iterators to list’s range constructor (see Item 5), thus copying the ints in the file into the list.

This code will compile, but at runtime, it won’t do anything. It won’t read any data out of a file. It won’t even create a list. That’s because the second statement doesn’t declare a list and it doesn’t call a constructor. What it does is ... well, what it does is so strange, I dare not tell you straight out, because you won’t believe me. Instead, I have to develop the explanation, bit by bit. Are you sitting down? If not, you might want to look around for a chair...

We’ll start with the basics. This line declares a function f taking a double and returning an int:

int f(double d);

This next line does the same thing. The parentheses around the parameter name d are superfluous and are ignored:

int f(double (d));        // same as above; parens around d are ignored

The line below declares the same function. It simply omits the parameter name:

int f(double);            // same as above; parameter name is omitted

Those three declaration forms should be familiar to you, though the ability to put parentheses around a parameter name may have been new. (It wasn’t long ago that it was new to me.)

Let’s now look at three more function declarations. The first one declares a function g taking a parameter that’s a pointer to a function taking nothing and returning a double:

int g(double (*pf)());    // g takes a pointer to a function as a parameter

Here’s another way to say the same thing. The only difference is that pf is declared using non-pointer syntax (a syntax that’s valid in both C and C++):

int g(double pf());       // same as above; pf is implicitly a pointer

As usual, parameter names may be omitted, so here’s a third declaration for g, one where the name pf has been eliminated:

int g(double ());         // same as above; parameter name is omitted

Notice the difference between parentheses around a parameter name (such as d in the second declaration for f) and parentheses standing by themselves (as in this example). Parentheses around a parameter name are ignored, but parentheses standing by themselves indicate the existence of a parameter list; they announce the presence of a parameter that is itself a pointer to a function.

Having warmed ourselves up with these declarations for f and g, we are ready to examine the code that began this Item. Here it is again:

list<int> data(istream_iterator<int>(dataFile),
               istream_iterator<int>());

Brace yourself. This declares a function, data, whose return type is list<int>. The function data takes two parameters:

• The first parameter is named dataFile. Its type is istream_iterator<int>. The parentheses around dataFile are superfluous and are ignored.

• The second parameter has no name. Its type is pointer to function taking nothing and returning an istream_iterator<int>.

Amazing, huh? But it’s consistent with a universal rule in C++, which says that pretty much anything that can be parsed as a function declaration will be. If you’ve been programming in C++ for a while, you’ve almost certainly encountered another manifestation of this rule. How many times have you seen this mistake?

class Widget { ... };            // assume Widget has a default constructor

Widget w();                      // uh oh...

This doesn’t declare a Widget named w, it declares a function named w that takes nothing and returns a Widget. Learning to recognize this faux pas is a veritable rite of passage for C++ programmers.

All of which is interesting (in its own twisted way), but it doesn’t help us say what we want to say, which is that a list<int> object should be initialized with the contents of a file. Now that we know what parse we have to defeat, that’s easy to express. It’s not legal to surround a formal parameter declaration with parentheses, but it is legal to surround an argument to a function call with parentheses, so by adding a pair of parentheses, we force compilers to see things our way:

list<int> data((istream_iterator<int>(dataFile)),  // note new parens
               istream_iterator<int>());           // around first argument
                                                   // to list's constructor

This is the proper way to declare data, and given the utility of istream_iterators and range constructors (again, see Item 5), it’s worth knowing how to do it.

Unfortunately, not all compilers currently know it themselves. Of the several I tested, almost half refused to accept data’s declaration unless it was incorrectly declared without the additional parentheses! To placate such compilers, you could roll your eyes and use the declaration for data that I’ve painstakingly explained is incorrect, but that would be both unportable and short-sighted. After all, compilers that currently get the parse wrong will surely correct it in the future, right? (Surely!)

A better solution is to step back from the trendy use of anonymous istream_iterator objects in data’s declaration and simply give those iterators names. The following code should work everywhere:

ifstream dataFile("ints.dat");

istream_iterator<int> dataBegin(dataFile);
istream_iterator<int> dataEnd;

list<int> data(dataBegin, dataEnd);

This use of named iterator objects runs contrary to common STL programming style, but you may decide that’s a price worth paying for code that’s unambiguous to both compilers and the humans who have to work with them.

Item 7: When using containers of newed pointers, remember to delete the pointers before the container is destroyed

Containers in the STL are remarkably smart. They serve up iterators for both forward and reverse traversals (via begin, end, rbegin, etc.); they tell you what type of objects they contain (via their value_type typedef); during insertions and erasures, they take care of any necessary memory management; they report both how many objects they hold and the most they may contain (via size and max_size, respectively); and of course they automatically destroy each object they hold when they (the containers) are themselves destroyed.

Given such brainy containers, many programmers stop worrying about cleaning up after themselves. Heck, they figure, their containers will do the worrying for them. In many cases, they’re right, but when the containers hold pointers to objects allocated with new, they’re not right enough. Sure, a container of pointers will destroy each element it contains when it (the container) is destroyed, but the “destructor” for a pointer is a no-op! It certainly doesn’t call delete.

As a result, the following code leads straight to a resource leak:

void doSomething()
{
  vector<Widget*> vwp;
  for (int i = 0; i < SOME_MAGIC_NUMBER; ++i)
    vwp.push_back(new Widget);
  ...                                // use vwp
}                                    // Widgets are leaked here!

Each of vwp’s elements is destroyed when vwp goes out of scope, but that doesn’t change the fact that delete was never used for the objects conjured up with new. Such deletion is your responsibility, not that of your vector. This is a feature. Only you know whether the pointers should be deleted.

Usually, you want them to be. When that’s the case, making it happen seems easy enough:

void doSomething()
{
  vector<Widget*> vwp;

  ...                                 // as before

  for(vector<Widget*>::iterator i = vwp.begin();
      i != vwp.end();
      ++i)
    delete *i;
}

This works, but only if you’re not terribly picky about what you mean by “works”. One problem is that the new for loop does pretty much what for_each does, but it’s not as clear as using for_each (see Item 43). Another is that the code isn’t exception safe. If an exception is thrown between the time vwp is filled with pointers and the time you get around to deleteing them, you’ve leaked resources again. Fortunately, both problems can be overcome.

To turn your for_each-like loop into an actual use of for_each, you need to turn delete into a function object. That’s child’s play, assuming you have a child who likes to play with the STL:

template<typename T>
struct DeleteObject:                             // Item 40 describes why
  public unary_function<const T*, void> {        // this inheritance is here

  void operator()(const T* ptr) const
  {
    delete ptr;
  }
};

Now you can do this:

void doSomething()
{
  ...                                      // as before

  for_each(vwp.begin(), vwp.end(), DeleteObject<Widget>());
}

Unfortunately, this makes you specify the type of objects that DeleteObject will be deleting (in this case, Widget). That’s annoying. vwp is a vector<Widget*>, so of course DeleteObject will be deleting Widget* pointers! Duh! This kind of redundancy is more than just annoying, because it can lead to bugs that are difficult to track down. Suppose, for example, somebody ill-advisedly decides to inherit from string:

class SpecialString: public string { ... };

This is risky from the get-go, because string, like all the standard STL containers, lacks a virtual destructor, and publicly inheriting from classes without virtual destructors is a major C++ no-no. (For details, consult any good book on C++. In Effective C++, the place to look is Item 14.) Still, some people do this kind of thing, so let’s consider how the following code would behave:

void doSomething()
{
  deque<SpecialString*> dssp;

  ...

  for_each(dssp.begin(), dssp.end(),   // undefined behavior! Deletion
           DeleteObject<string>());    // of a derived object via a base
}                                      // class pointer where there is
                                       // no virtual destructor

Note how dssp is declared to hold SpecialString* pointers, but the author of the for_each loop has told DeleteObject that it will be deleting string* pointers. It’s easy to understand how such an error could arise. SpecialString undoubtedly acts a lot like a string, so one can forgive its clients if they occasionally forget that they are using SpecialStrings instead of strings.

We can eliminate the error (as well as reduce the number of keystrokes required of DeleteObject’s clients) by having compilers deduce the type of pointer being passed to DeleteObject::operator(). All we need to do is move the templatization from DeleteObject to its operator():

struct DeleteObject {                     // templatization and base
                                          // class removed here

  template<typename T>                    // templatization added here
  void operator()(const T* ptr) const
  {
    delete ptr;
  }
};

Compilers know the type of pointer being passed to DeleteObject::operator(), so we have them automatically instantiate an operator() taking that type of pointer. The downside to this type deduction is that we give up the ability to make DeleteObject adaptable (see Item 40). Considering how DeleteObject is designed to be used, it’s difficult to imagine how that could be a problem.

With this new version of DeleteObject, the code for SpecialString clients looks like this:

void doSomething()
{
  deque<SpecialString*> dssp;
  ...

  for_each(dssp.begin(), dssp.end(),
           DeleteObject());               // ah! well-defined behavior!
}

Straightforward and type-safe, just the way we like it.

But still not exception-safe. If an exception is thrown after the SpecialStrings are newed but before the invocation of for_each, it’s Leakapalooza. That problem can be addressed in a variety of ways, but the simplest is probably to replace the container of pointers with a container of smart pointers, typically reference-counted pointers. (If you’re unfamiliar with the notion of smart pointers, you should be able to find a description in any intermediate or advanced C++ book. In More Effective C++, the material is in Item 28.)

The STL itself contains no reference-counting smart pointer, and writing a good one—one that works correctly all the time—is tricky enough that you don’t want to do it unless you have to. I published the code for a reference-counting smart pointer in More Effective C++ in 1996, and despite basing it on established smart pointer implementations and submitting it to extensive pre-publication reviewing by experienced developers, a small parade of valid bug reports has trickled in for years. The number of subtle ways in which reference-counting smart pointers can fail is remarkable. (For details, consult the More Effective C++ errata list [28].)

Fortunately, there’s rarely a need to write your own, because proven implementations are not difficult to find. One such smart pointer is shared_ptr in the Boost library (see Item 50). With Boost’s shared_ptr, this Item’s original example can be rewritten as follows:

void doSomething()
{
  typedef boost::shared_ptr<Widget> SPW;      // SPW = "shared_ptr
                                              //        to Widget"
  vector<SPW> vwp;
  for (int i = 0; i < SOME_MAGIC_NUMBER; ++i)
    vwp.push_back(SPW(new Widget));           // create an SPW from a
                                              // Widget*, then do a
                                              // push_back on it

  ...                                         // use vwp

}                                   // no Widgets are leaked here, not
                                    // even if an exception is thrown
                                    // in the code above

One thing you must never be fooled into thinking is that you can arrange for pointers to be deleted automatically by creating containers of auto_ptrs. That’s a horrible thought, one so perilous, I’ve devoted Item 8 to why you should avoid it.

All you really need to remember is that STL containers are smart, but they’re not smart enough to know whether to delete the pointers they contain. To avoid resource leaks when you have containers of pointers that should be deleted, you must either replace the pointers with smart reference-counting pointer objects (such as Boost’s shared_ptr) or you must manually delete each pointer in the container before the container is destroyed.

Finally, it may have crossed your mind that if a struct like DeleteObject can make it easier to avoid resource leaks for containers holding pointers to objects, it should be possible to create a similar DeleteArray struct to make it easier to avoid resource leaks for containers holding pointers to arrays. Certainly it is possible, but whether it is advisable is a different matter. Item 13 explains why dynamically allocated arrays are almost always inferior to vector and string objects, so before you sit down to write DeleteArray, please review Item 13 first. With luck, you’ll decide that DeleteArray is a struct whose time will never come.
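
For the curious, a DeleteArray modeled on the final form of DeleteObject might look something like the sketch below. I show it only so you know what you’d be getting into, not as a recommendation:

struct DeleteArray {                      // sketch only; review Item 13
  template<typename T>                    // before writing anything
  void operator()(const T* ptr) const     // like this
  {
    delete [] ptr;                        // note delete [], not delete
  }
};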

Item 8: Never create containers of auto_ptrs

Frankly, this Item shouldn’t need to be in Effective STL. Containers of auto_ptr (COAPs) are prohibited. Code attempting to use them shouldn’t compile. The C++ Standardization Committee expended untold effort to arrange for that to be the case. I shouldn’t have to say anything about COAPs, because your compilers should have plenty to say about such containers, and all of it should be uncomplimentary.

If you’re interested in the tortured history of auto_ptr standardization, point your web browser to the auto_ptr Update page [29] at the More Effective C++ web site.

Alas, many programmers use STL platforms that fail to reject COAPs. Alas even more, many programmers see in COAPs the chimera of a simple, straightforward, efficient solution to the resource leaks that often accompany containers of pointers (see Items 7 and 33). As a result, many programmers are tempted to use COAPs, even though it’s not supposed to be possible to create them.

I’ll explain in a moment why the spectre of COAPs was so alarming that the Standardization Committee took specific steps to make them illegal. Right now, I want to focus on a disadvantage that requires no knowledge of auto_ptr, or even of containers: COAPs aren’t portable. How could they be? The Standard for C++ forbids them, and better STL platforms already enforce this. It’s reasonable to assume that as time goes by, STL platforms that currently fail to enforce this aspect of the Standard will become more compliant, and when that happens, code that uses COAPs will be even less portable than it is now. If you value portability (and you should), you’ll reject COAPs simply because they fail the portability test.

But maybe you’re not of a portability mind-set. If that’s the case, kindly allow me to remind you of the unique—some would say bizarre—definition of what it means to copy an auto_ptr.

When you copy an auto_ptr, ownership of the object pointed to by the auto_ptr is transferred to the copying auto_ptr, and the copied auto_ptr is set to NULL. You read that right: to copy an auto_ptr is to change its value:

auto_ptr<Widget> pw1(new Widget);  // pw1 points to a Widget

auto_ptr<Widget> pw2(pw1);         // pw2 points to pw1's Widget;
                                   // pw1 is set to NULL. (Ownership
                                   // of the Widget is transferred
                                   // from pw1 to pw2.)

pw1 = pw2;                         // pw1 now points to the Widget
                                   // again; pw2 is set to NULL

This is certainly unusual, and perhaps it’s interesting, but the reason you (as a user of the STL) care is that it leads to some very surprising behavior. For example, consider this innocent-looking code, which creates a vector of auto_ptr<Widget> and then sorts it using a function that compares the values of the pointed-to Widgets:

bool widgetAPCompare(const auto_ptr<Widget>& lhs,
                     const auto_ptr<Widget>& rhs)
{
  return *lhs < *rhs;                   // for this example, assume that
}                                       // operator< exists for Widgets

vector<auto_ptr<Widget> > widgets;      // create a vector and then fill it
...                                     // with auto_ptrs to Widgets;
                                        // remember that this should
                                        // not compile!

sort(widgets.begin(), widgets.end(),    // sort the vector
     widgetAPCompare);

Everything here looks reasonable, and conceptually, everything is reasonable, but the results need not be reasonable at all. For example, one or more of the auto_ptrs in widgets may have been set to NULL during the sort. The act of sorting the vector may have changed its contents! It’s worth understanding how this can happen.

It can be because one approach to implementing sort—a common approach, as it turns out—is to use some variation on the quicksort algorithm. The fine points of quicksort need not concern us, but the basic idea is that to sort a container, some element of the container is chosen as the “pivot element,” then a recursive sort is done on the values greater than and less than or equal to the pivot element. Within sort, such an approach could look something like this:

template<class RandomAccessIterator,       // this declaration for
         class Compare>                    // sort is copied straight
void sort(RandomAccessIterator first,      // out of the Standard
          RandomAccessIterator last,
          Compare comp)
{
  // this typedef is described below
  typedef typename iterator_traits<RandomAccessIterator>::value_type
    ElementType;

  RandomAccessIterator i;

  ...                              // make i point to the pivot element

  ElementType pivotValue(*i);      // copy the pivot element into a
                                   // local temporary variable; see
                                   // discussion below

  ...                              // do the rest of the sorting work
}

Unless you’re an experienced reader of STL source code, this may look intimidating, but it’s really not that bad. The only tricky part is the reference to iterator_traits<RandomAccessIterator>::value_type, and that’s just the fancy STL way of referring to the type of object pointed to by the iterators passed to sort. (When we refer to iterator_traits<RandomAccessIterator>::value_type, we must precede it by typename, because it’s the name of a type that’s dependent on a template parameter, in this case, RandomAccessIterator. For more information about this use of typename, turn to page 7.)

The troublesome statement in the code above is this one,

ElementType pivotValue(*i);

because it copies an element from the range being sorted into a local temporary object. In our case, the element is an auto_ptr<Widget>, so this act of copying silently sets the copied auto_ptr—the one in the vector—to NULL. Furthermore, when pivotValue goes out of scope, it will automatically delete the Widget it points to. By the time the call to sort returns, the contents of the vector will have changed, and at least one Widget will have been deleted. It’s possible that several vector elements will have been set to NULL and several Widgets will have been deleted, because quicksort is a recursive algorithm, so it could well have copied a pivot element at each level of recursion.

This is a nasty trap to fall into, and that’s why the Standardization Committee worked so hard to make sure you’re not supposed to be able to fall into it. Honor its work on your behalf, then, by never creating containers of auto_ptrs, even if your STL platforms allow it.

If your goal is a container of smart pointers, this doesn’t mean you’re out of luck. Containers of smart pointers are fine, and Item 50 describes where you can find smart pointers that mesh well with STL containers. It’s just that auto_ptr is not such a smart pointer. Not at all.

Item 9: Choose carefully among erasing options

Suppose you have a standard STL container, c, that holds ints,

Container<int> c;

and you’d like to get rid of all the objects in c with the value 1963. Surprisingly, the way to accomplish this task varies from container type to container type; no single approach works for all of them.

If you have a contiguous-memory container (vector, deque, or string—see Item 1), the best approach is the erase-remove idiom (see Item 32):

c.erase(remove(c.begin(), c.end(), 1963),   // the erase-remove idiom is
        c.end());                           // the best way to get rid of
                                            // elements with a specific
                                            // value when c is a vector,
                                            // string, or deque

This approach works for lists, too, but, as Item 44 explains, the list member function remove is more efficient:

c.remove(1963);              // the remove member function is the
                             // best way to get rid of elements with
                             // a specific value when c is a list

When c is a standard associative container (i.e., a set, multiset, map, or multimap), the use of anything named remove is completely wrong. Such containers have no member function named remove, and using the remove algorithm might overwrite container values (see Item 32), potentially corrupting the container. (For details on such corruption, consult Item 22, which also explains why trying to use remove on maps and multimaps won’t compile, and trying to use it on sets and multisets may not compile.)

No, for associative containers, the proper way to approach the problem is to call erase:

c.erase(1963);                 // the erase member function is the
                               // best way to get rid of elements with
                               // a specific value when c is a
                               // standard associative container

Not only does this do the right thing, it takes only logarithmic time for set and map, and, for multiset and multimap, time linear in the number of elements with the specified value. (The remove-based techniques for sequence containers require time linear in the number of elements in the container.) Furthermore, the associative container erase member function has the advantage of being based on equivalence instead of equality, a distinction whose importance is explained in Item 19.

Let’s now revise the problem slightly. Instead of getting rid of every object in c that has a particular value, let’s eliminate every object for which the following predicate (see Item 39) returns true:

bool badValue(int x);            // returns whether x is "bad"

For the sequence containers (vector, string, deque, and list), all we need to do is replace each use of remove with remove_if, and we’re done:

c.erase(remove_if(c.begin(), c.end(), badValue),     // this is the best way to
        c.end());                                    // get rid of objects
                                                     // where badValue
                                                     // returns true when c is
                                                     // a vector, string, or
                                                     // deque

c.remove_if(badValue);                  // this is the best way to get rid of
                                        // objects where badValue returns
                                        // true when c is a list

For the standard associative containers, it’s not quite so straightforward. There are two ways to approach the problem, one easier to code, one more efficient. The easier-but-less-efficient solution uses remove_copy_if to copy the values we want into a new container, then swaps the contents of the original container with those of the new one:

AssocContainer<int> c;                        // c is now one of the
...                                           // standard associative
                                              // containers

AssocContainer<int> goodValues;               // temporary container
                                              // to hold unremoved
                                              // values

remove_copy_if(c.begin(), c.end(),            // copy desired
               inserter(goodValues,           // values from c to
                        goodValues.end()),    // goodValues
               badValue);

c.swap(goodValues);                           // swap the contents of
                                              // c and goodValues

The drawback to this approach is that it involves copying all the elements that aren’t being removed, and such copying might cost us more than we’re interested in paying.

We can dodge that bill by removing the elements from the original container directly. However, because associative containers offer no member function akin to remove_if, we must write a loop to iterate over the elements in c, erasing elements as we go.

Conceptually, the task is simple, and in fact, the code is simple, too. Unfortunately, the code that does the job correctly is rarely the code that springs to mind. For example, this is what many programmers come up with first:

AssocContainer<int> c;
...

for (AssocContainer<int>::iterator i = c.begin();    // clear, straightforward,
     i != c.end();                                   // and buggy code to
     ++i) {                                          // erase every element
  if (badValue(*i)) c.erase(i);                      // in c where badValue
}                                                    // returns true; don't
                                                     // do this!

Alas, this has undefined behavior. When an element of a container is erased, all iterators that point to that element are invalidated. Once c.erase(i) returns, i has been invalidated. That’s bad news for this loop, because after erase returns, i is incremented via the ++i part of the for loop.

To avoid this problem, we have to make sure we have an iterator to the next element of c before we call erase. The easiest way to do that is to use postfix increment on i when we make the call:

AssocContainer<int> c;
...

for (AssocContainer<int>::iterator i = c.begin();    // the 3rd part of the for
     i != c.end();                                   // loop is empty; i is now
     /* nothing */) {                                // incremented below

    if (badValue(*i)) c.erase(i++);                  // for bad values, pass the
    else ++i;                                        // current i to erase and
}                                                    // increment i as a side
                                                     // effect; for good values,
                                                     // just increment i

This approach to calling erase works, because the value of the expression i++ is i’s old value, but as a side effect, i is incremented. Hence, we pass i’s old (unincremented) value to erase, but we also increment i itself before erase begins executing. That’s exactly what we want. As I said, the code is simple, it’s just not what most programmers come up with the first time they try.

Let’s now revise the problem further. Instead of merely erasing each element for which badValue returns true, we also want to write a message to a log file each time an element is erased.

For the associative containers, this is as easy as easy can be, because it requires only a trivial modification to the loop we just developed:

ofstream logFile;                                  // log file to write to

AssocContainer<int> c;
...

for (AssocContainer<int>::iterator i = c.begin();  // loop conditions are the
     i != c.end(); ) {                             // same as before

  if (badValue(*i)) {
     logFile << "Erasing " << *i << ' ';          // write log file
     c.erase(i++);                                 // erase element
  }
  else ++i;
}

It’s vector, string, and deque that now give us trouble. We can’t use the erase-remove idiom any longer, because there’s no way to get erase or remove to write the log file. Furthermore, we can’t use the loop we just developed for associative containers, because it yields undefined behavior for vectors, strings, and deques! Recall that for such containers, invoking erase not only invalidates all iterators pointing to the erased element, it also invalidates all iterators beyond the erased element. In our case, that includes all iterators beyond i. It doesn’t matter if we write i++, ++i, or anything else you can think of, because none of the resulting iterators is valid.

We must take a different tack with vector, string, and deque. In particular, we must take advantage of erase’s return value. That return value is exactly what we need: it’s a valid iterator pointing to the element following the erased element once the erase has been accomplished. In other words, we write this:

for (SeqContainer<int>::iterator i = c.begin();
     i != c.end(); ) {

  if (badValue(*i)) {
     logFile << "Erasing " << *i << ' ';
     i = c.erase(i);                           // keep i valid by assigning
  }                                            // erase's return value to it
  else ++i;
}

This works wonderfully, but only for the standard sequence containers. For reasons one might reasonably question (Item 5 does), erase’s return type for the standard associative containers is void. For those containers, you have to use the postincrement-the-iterator-you-pass-to-erase technique. (Incidentally, this kind of difference between coding for sequence containers and coding for associative containers is an example of why it’s generally ill-advised to try to write container-independent code—see Item 2.)

This is true only for the forms of erase that take iterator arguments. Associative containers also offer a form of erase taking a value argument, and that form returns the number of elements erased. Here, however, we’re concerned only with erasing things via iterators.
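
For completeness, here’s a quick sketch of the value-taking form. Unlike the iterator-taking forms, it tells you how much work it did:

AssocContainer<int> c;
...

AssocContainer<int>::size_type numErased =    // erase all elements with
  c.erase(1963);                              // the value 1963; numErased
                                              // is how many were removed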

Lest you be left wondering what the appropriate approach for list is, it turns out that for purposes of iterating and erasing, you can treat list like a vector/string/deque or you can treat it like an associative container; both approaches work for list.
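
To make that concrete, here’s a sketch of the erase-plus-logging loop for a list, reusing badValue and logFile from above. The marked call could just as well be written c.erase(i++); for list, both styles are valid:

list<int> c;
...

for (list<int>::iterator i = c.begin(); i != c.end(); ) {
  if (badValue(*i)) {
    logFile << "Erasing " << *i << '\n';
    i = c.erase(i);                      // fine for list; so is
  }                                      // c.erase(i++)
  else ++i;
}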

If we take stock of everything we’ve covered in this Item, we come to the following conclusions:

To eliminate all objects in a container that have a particular value:

• If the container is a vector, string, or deque, use the erase-remove idiom.

• If the container is a list, use list::remove.

• If the container is a standard associative container, use its erase member function.

To eliminate all objects in a container that satisfy a particular predicate:

• If the container is a vector, string, or deque, use the erase-remove_if idiom.

• If the container is a list, use list::remove_if.

• If the container is a standard associative container, use remove_copy_if and swap, or write a loop to walk the container elements, being sure to postincrement your iterator when you pass it to erase.

To do something inside the loop (in addition to erasing objects):

• If the container is a standard sequence container, write a loop to walk the container elements, being sure to update your iterator with erase’s return value each time you call it.

• If the container is a standard associative container, write a loop to walk the container elements, being sure to postincrement your iterator when you pass it to erase.

As you can see, there’s more to erasing container elements effectively than just calling erase. The best way to approach the matter depends on how you identify which objects to erase, the type of container they’re stored in, and what (if anything) you want to do while you’re erasing them. As long as you’re careful and heed the advice in this Item, you’ll have no trouble. If you’re not careful, you run the risk of producing code that’s needlessly inefficient or that yields undefined behavior.

Item 10: Be aware of allocator conventions and restrictions

Allocators are weird. They were originally developed as an abstraction for memory models that would allow library developers to ignore the distinction between near and far pointers in certain 16-bit operating systems (i.e., DOS and its pernicious spawn), but that effort failed. Allocators were also designed to facilitate the development of memory managers that are full-fledged objects, but it turned out that that approach led to efficiency degradations in some parts of the STL. To avoid the efficiency hits, the C++ Standardization Committee added wording to the Standard that emasculated allocators as objects, yet simultaneously expressed the hope that they would suffer no loss of potency from the operation.

There’s more. Like operator new and operator new[], STL allocators are responsible for allocating (and deallocating) raw memory, but an allocator’s client interface bears little resemblance to that of operator new, operator new[], or even malloc. Finally (and perhaps most remarkable), most of the standard containers never ask their associated allocator for memory. Never. The end result is that allocators are, well, allocators are weird.

That’s not their fault, of course, and at any rate, it doesn’t mean they’re useless. However, before I explain what allocators are good for (that’s the topic of Item 11), I need to explain what they’re not good for. There are a number of things that allocators seem to be able to do, but can’t, and it’s important that you know the boundaries of the field before you try to start playing. If you don’t, you’ll get injured for sure. Besides, the truth about allocators is so peculiar, the mere act of summarizing it is both enlightening and entertaining. At least I hope it is.

The list of restrictions on allocators begins with their vestigial typedefs for pointers and references. As I mentioned, allocators were originally conceived of as abstractions for memory models, and as such it made sense for allocators to provide typedefs for pointers and references in the memory model they defined. In the C++ standard, the default allocator for objects of type T (cunningly known as allocator<T>) offers the typedefs allocator<T>::pointer and allocator<T>::reference, and it is expected that user-defined allocators will provide these typedefs, too.

Old C++ hands immediately recognize that this is suspect, because there’s no way to fake a reference in C++. Doing so would require the ability to overload operator. (“operator dot”), and that’s not permitted. In addition, creating objects that act like references is an example of the use of proxy objects, and proxy objects lead to a number of problems. (One such problem motivates Item 18. For a comprehensive discussion of proxy objects, turn to Item 30 of More Effective C++, where you can read about when they work as well as when they do not.)

In the case of allocators in the STL, it’s not any technical shortcomings of proxy objects that render the pointer and reference typedefs impotent, it’s the fact that the Standard explicitly allows library implementers to assume that every allocator’s pointer typedef is a synonym for T* and every allocator’s reference typedef is the same as T&. That’s right, library implementers may ignore the typedefs and use raw pointers and references directly! So even if you could somehow find a way to write an allocator that successfully provided new pointer and reference types, it wouldn’t do any good, because the STL implementations you were using would be free to ignore your typedefs. Neat, huh?

While you’re admiring that quirk of standardization, I’ll introduce another. Allocators are objects, and that means they may have member functions, nested types and typedefs (such as pointer and reference), etc., but the Standard says that an implementation of the STL is permitted to assume that all allocator objects of the same type are equivalent and always compare equal. Offhand, that doesn’t sound so awful, and there’s certainly good motivation for it. Consider this code:

template<typename T>                         // a user-defined allocator
class SpecialAllocator { ... };              // template

typedef SpecialAllocator<Widget> SAW;        // SAW = "SpecialAllocator
                                             //        for Widgets"
list<Widget, SAW> L1;
list<Widget, SAW> L2;

...

L1.splice(L1.begin(), L2);                   // move L2's nodes to the
                                             // front of L1

Recall that when list elements are spliced from one list to another, nothing is copied. Instead, a few pointers are adjusted, and the list nodes that used to be in one list find themselves in another. This makes splicing operations both fast and exception-safe. In the example above, the nodes that were in L2 prior to the splice are in L1 after the splice.

When L1 is destroyed, of course, it must destroy all its nodes (and deallocate their memory), and because it now contains nodes that were originally part of L2, L1’s allocator must deallocate the nodes that were originally allocated by L2’s allocator. Now it should be clear why the Standard permits implementers of the STL to assume that allocators of the same type are equivalent. It’s so memory allocated by one allocator object (such as L2’s) may be safely deallocated by another allocator object (such as L1’s). Without being able to make such an assumption, splicing operations would be more difficult to implement. Certainly they wouldn’t be as efficient as they can be now. (The existence of splicing operations affects other parts of the STL, too. For another example, see Item 4.)

That’s all well and good, but the more you think about it, the more you’ll realize just how draconian a restriction it is that STL implementations may assume that allocators of the same type are equivalent. It means that portable allocator objects—allocators that will function correctly under different STL implementations—may not have state. Let’s be explicit about this: it means that portable allocators may not have any nonstatic data members, at least not any that affect their behavior. None. Nada. That means, for example, you can’t have one SpecialAllocator<int> that allocates from one heap and a different SpecialAllocator<int> that allocates from a different heap. Such allocators wouldn’t be equivalent, and STL implementations exist where attempts to use both allocators could lead to corrupt runtime data structures.

Notice that this is a runtime issue. Allocators with state will compile just fine. They just may not run the way you expect them to. The responsibility for ensuring that all allocators of a given type are equivalent is yours. Don’t expect compilers to issue a warning if you violate this constraint.

In fairness to the Standardization Committee, I should point out that it included the following statement immediately after the text that permits STL implementers to assume that allocators of the same type are equivalent:

Implementors are encouraged to supply libraries that ... support non-equal instances. In such implementations, ... the semantics of containers and algorithms when allocator instances compare non-equal are implementation-defined.

This is a lovely sentiment, but as a user of the STL who is considering the development of a custom allocator with state, it offers you next to nothing. You can take advantage of this statement only if (1) you know that the STL implementations you are using support inequivalent allocators, (2) you are willing to delve into their documentation to determine whether the implementation-defined behavior of “non-equal” allocators is acceptable to you, and (3) you’re not concerned about porting your code to STL implementations that may take advantage of the latitude expressly extended to them by the Standard. In short, this paragraph—paragraph 5 of section 20.1.5, for those who insist on knowing—is the Standard’s “I have a dream” speech for allocators. Until that dream becomes common reality, programmers concerned about portability will limit themselves to custom allocators with no state.
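
In practice, a portable allocator advertises its statelessness by making its equality operators trivially true. A minimal sketch, using the SpecialAllocator template from above:

template<typename T>
bool operator==(const SpecialAllocator<T>&,      // stateless allocators
                const SpecialAllocator<T>&)      // of the same type are
{ return true; }                                 // always equivalent

template<typename T>
bool operator!=(const SpecialAllocator<T>&,
                const SpecialAllocator<T>&)
{ return false; }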

I remarked earlier that allocators are like operator new in that they allocate raw memory, but their interface is different. This becomes apparent if you look at the declaration of the most common forms of operator new and allocator<T>::allocate:

void* operator new(size_t bytes);

pointer allocator<T>::allocate(size_type numObjects);
                                       // recall that "pointer" is a typedef
                                       // that's virtually always T*

Both take a parameter specifying how much memory to allocate, but in the case of operator new, this parameter specifies a certain number of bytes, while in the case of allocator<T>::allocate, it specifies how many T objects are to fit in the memory. On a platform where sizeof(int) == 4, for example, you’d pass 4 to operator new if you wanted enough memory to hold an int, but you’d pass 1 to allocator<int>::allocate. (The type of this parameter is size_t in the case of operator new, while it’s allocator<T>::size_type in the case of allocate. In both cases, it’s an unsigned integral type, and typically allocator<T>::size_type is a typedef for size_t, anyway.) There’s nothing “wrong” about this discrepancy, but the inconsistent conventions between operator new and allocator<T>::allocate complicate the process of applying experience with custom versions of operator new to the development of custom allocators.
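
A quick sketch makes the difference in units (and, as we’re about to see, in return types) concrete:

void* pBytes = operator new(sizeof(int));     // ask for sizeof(int)
                                              // bytes of raw memory

allocator<int> a;
int* pInt = a.allocate(1);                    // ask for room for 1 int,
                                              // not for 1 byte

a.deallocate(pInt, 1);                        // give both allocations
operator delete(pBytes);                      // back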

operator new and allocator<T>::allocate differ in return types, too. operator new returns a void*, which is the traditional C++ way of representing a pointer to uninitialized memory. allocator<T>::allocate returns a T* (via the pointer typedef), which is not only untraditional, it’s premeditated fraud. The pointer returned from allocator<T>::allocate doesn’t point to a T object, because no T has yet been constructed! Implicit in the STL is the expectation that allocator<T>::allocate’s caller will eventually construct one or more T objects in the memory it returns (possibly via allocator<T>::construct, via uninitialized_fill, or via some application of raw_storage_iterators), though in the case of vector::reserve or string::reserve, that may never happen (see Item 14). The difference in return type between operator new and allocator<T>::allocate indicates a change in the conceptual model for uninitialized memory, and it again makes it harder to apply knowledge about implementing operator new to the development of custom allocators.
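
Here’s a sketch of the protocol the STL expects: allocate raw memory, construct objects in it, destroy them, then deallocate:

allocator<Widget> a;

Widget* pw = a.allocate(1);        // raw memory for one Widget; no
                                   // Widget exists yet, despite the
                                   // Widget* return type

a.construct(pw, Widget());         // copy-construct a Widget in that
                                   // memory; now the pointer is honest

a.destroy(pw);                     // destroy the Widget, then release
a.deallocate(pw, 1);               // the raw memory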

That brings us to the final curiosity of STL allocators, that most of the standard containers never make a single call to the allocators with which they are instantiated. Here are two examples:

list<int> L;                      // same as list<int, allocator<int> >;
                                  // allocator<int> is never asked to
                                  // allocate memory!

set<Widget, SAW> s;               // recall that SAW is a typedef for
                                  // SpecialAllocator<Widget>; no
                                  // SAW will ever allocate memory!

This oddity is true for list and all the standard associative containers (set, multiset, map, and multimap). That’s because these are node-based containers, i.e., containers based on data structures in which a new node is dynamically allocated each time a value is to be stored. In the case of list, the nodes are list nodes. In the case of the standard associative containers, the nodes are usually tree nodes, because the standard associative containers are typically implemented as balanced binary search trees.

Think for a moment about how a list<T> is likely to be implemented. The list itself will be made up of nodes, each of which holds a T object as well as pointers to the next and previous nodes in the list:

template<typename T,                                   // possible list
         typename Allocator = allocator<T> >           // implementation
class list {
private:

  Allocator alloc;                   // allocator for objects of type T

  struct ListNode {                  // nodes in the linked list
    T data;
    ListNode *prev;
    ListNode *next;
  };

  ...

};

When a new node is added to the list, we need to get memory for it from an allocator, but we don’t need memory for a T, we need memory for a ListNode that contains a T. That makes our Allocator object all but useless, because it doesn’t allocate memory for ListNodes, it allocates memory for Ts. Now you understand why list never asks its Allocator to do any allocation: the allocator can’t provide what list needs.

What list needs is a way to get from the allocator type it has to the corresponding allocator for ListNodes. This would be tough were it not that, by convention, allocators provide a typedef that does the job. The typedef is called other, but it’s not quite that simple, because other is a typedef nested inside a struct called rebind, which itself is a template nested inside the allocator—which itself is a template!

Please don’t try to think about that last sentence. Instead, look at the code below, then proceed directly to the explanation that follows.

template<typename T>              // the standard allocator is declared
class allocator {                 // like this, but this could be a user-
public:                           // written allocator template, too

  template<typename U>
  struct rebind {
    typedef allocator<U> other;
  };

  ...
};

In the code implementing list<T>, there is a need to determine the type of the allocator for ListNodes that corresponds to the allocator we have for Ts. The type of the allocator we have for Ts is the template parameter Allocator. That being the case, the type of the corresponding allocator for ListNodes is this:

                         Allocator::rebind<ListNode>::other

Stay with me here. Every allocator template A (e.g., std::allocator, SpecialAllocator, etc.) is expected to have a nested struct template called rebind. rebind takes a single type parameter, U, and defines nothing but a typedef, other. other is simply a name for A<U>. As a result, list<T> can get from its allocator for T objects (called Allocator) to the corresponding allocator for ListNode objects by referring to Allocator::rebind<ListNode>::other.
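
One wrinkle the sketch below makes explicit: because Allocator is a template parameter, real code must add the typename and template keywords for the expression to parse. A list implementation might therefore spell things like this (NodeAllocator is a name I’ve invented for the sketch):

template<typename T,
         typename Allocator = allocator<T> >
class list {
private:
  struct ListNode { ... };                         // as before

  typedef typename Allocator::template rebind<ListNode>::other
    NodeAllocator;                                 // the allocator type
                                                   // for ListNodes

  NodeAllocator nodeAlloc;                         // used to allocate and
                                                   // deallocate ListNodes
  ...
};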

Maybe this makes sense to you, maybe it doesn’t. (If you stare at it long enough, it will, but you may have to stare a while. I know I had to.) As a user of the STL who may want to write a custom allocator, you don’t really need to know how it works. What you do need to know is that if you choose to write allocators and use them with the standard containers, your allocators must provide the rebind template, because standard containers assume it will be there. (For debugging purposes, it’s also helpful to know why node-based containers of T objects never ask for memory from the allocators for T objects.)

Hallelujah! We are finally done examining the idiosyncrasies of allocators. Let us therefore summarize the things you need to remember if you ever want to write a custom allocator.

• Make your allocator a template, with the template parameter T representing the type of objects for which you are allocating memory.

• Provide the typedefs pointer and reference, but always have pointer be T* and reference be T&.

• Never give your allocators per-object state. In general, allocators should have no nonstatic data members.

• Remember that an allocator’s allocate member functions are passed the number of objects for which memory is required, not the number of bytes needed. Also remember that these functions return T* pointers (via the pointer typedef), even though no T objects have yet been constructed.

• Be sure to provide the nested rebind template on which standard containers depend.

Most of what you have to do to write your own allocator is reproduce a fair amount of boilerplate code, then tinker with a few member functions, notably allocate and deallocate. Rather than writing the boilerplate from scratch, I suggest you begin with the code at Josuttis’ sample allocator web page [23] or in Austern’s article, “What Are Allocators Good For?” [24].
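
To give you a feel for the shape of that boilerplate, here’s an abbreviated sketch of what a conforming allocator might look like. Members such as the constructors, construct, destroy, max_size, and address are omitted (the references above show them in full), and MyAllocator is just a placeholder name:

template<typename T>
class MyAllocator {
public:
  typedef T          value_type;
  typedef T*         pointer;                 // always T* (see above)
  typedef const T*   const_pointer;
  typedef T&         reference;               // always T& (see above)
  typedef const T&   const_reference;
  typedef size_t     size_type;
  typedef ptrdiff_t  difference_type;

  template<typename U>                        // the rebind template on
  struct rebind {                             // which standard containers
    typedef MyAllocator<U> other;             // depend
  };

  pointer allocate(size_type numObjects, const void* = 0)
  {                                           // memory for numObjects
    return static_cast<pointer>(              // Ts, not numObjects bytes
      operator new(numObjects * sizeof(T)));
  }

  void deallocate(pointer ptrToMemory, size_type)
  {
    operator delete(ptrToMemory);
  }

  ...                                         // constructors, construct,
};                                            // destroy, max_size, etc.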

Once you’ve digested the information in this Item, you’ll know a lot about what allocators cannot do, but that’s probably not what you want to know. Instead, you’d probably like to know what allocators can do. That’s a rich topic in its own right, a topic I call “Item 11.”

Item 11: Understand the legitimate uses of custom allocators

So you’ve benchmarked, profiled, and experimented your way to the conclusion that the default STL memory manager (i.e., allocator<T>) is too slow, wastes memory, or suffers excessive fragmentation for your STL needs, and you’re certain you can do a better job yourself. Or you discover that allocator<T> takes precautions to be thread-safe, but you’re interested only in single-threaded execution and you don’t want to pay for the synchronization overhead you don’t need. Or you know that objects in certain containers are typically used together, so you’d like to place them near one another in a special heap to maximize locality of reference. Or you’d like to set up a unique heap that corresponds to shared memory, then put one or more containers in that memory so they can be shared by other processes. Congratulations! Each of these scenarios corresponds to a situation where custom allocators are well suited to the problem.

For example, suppose you have special routines modeled after malloc and free for managing a heap of shared memory,

void* mallocShared(size_t bytesNeeded);

void freeShared(void *ptr);

and you’d like to make it possible to put the contents of STL containers in that shared memory. No problem:

template<typename T>
class SharedMemoryAllocator {
public:
  ...

  pointer allocate(size_type numObjects, const void *localityHint = 0)
  {
    return static_cast<pointer>(mallocShared(numObjects * sizeof(T)));
  }

  void deallocate(pointer ptrToMemory, size_type numObjects)
  {
    freeShared(ptrToMemory);
  }

  ...
};

For information on the pointer type as well as the cast and the multiplication inside allocate, see Item 10.

You could use SharedMemoryAllocator like this:

// convenience typedef
typedef
  vector<double, SharedMemoryAllocator<double> > SharedDoubleVec;
...

{                                  // begin some block
  ...

  SharedDoubleVec v;               // create a vector whose elements
                                   // are in shared memory
  ...
}                                  // end the block

The wording in the comment next to v’s definition is important. v is using a SharedMemoryAllocator, so the memory v allocates to hold its elements will come from shared memory. v itself, however—including all its data members—will almost certainly not be placed in shared memory. v is just a normal stack-based object, so it will be located in whatever memory the runtime system uses for all normal stack-based objects. That’s almost never shared memory. To put both v’s contents and v itself into shared memory, you’d have to do something like this:

void *pVectorMemory =                     // allocate enough shared
  mallocShared(sizeof(SharedDoubleVec));  // memory to hold a
                                          // SharedDoubleVec object

SharedDoubleVec *pv =                     // use "placement new" to
  new (pVectorMemory) SharedDoubleVec;    // create a SharedDoubleVec
                                          // object in the memory;
                                          // see below

...                                       // use the object (via pv)

pv->~SharedDoubleVec();                   // destroy the object in the
                                          // shared memory

freeShared(pVectorMemory);                // deallocate the initial
                                          // chunk of shared memory

I hope the comments make clear how this works. Fundamentally, you acquire some shared memory, then construct a vector in it that uses shared memory for its own internal allocations. When you’re done with the vector, you invoke its destructor, then release the memory the vector occupied. The code isn’t terribly complicated, but it’s a lot more demanding than just declaring a local variable as we did above. Unless you really need a container (as opposed to its elements) to be in shared memory, I encourage you to avoid this manual four-step allocate/construct/destroy/deallocate process.

In this example, you’ve doubtless noticed that the code ignores the possibility that mallocShared might return a null pointer. Obviously, production code would have to take such a possibility into account. Also, construction of the vector in the shared memory is accomplished by “placement new.” If you’re unfamiliar with placement new, your favorite C++ text should be able to introduce you. If that text happens to be More Effective C++, you’ll find that the pleasantries are exchanged in Item 8.

As a second example of the utility of allocators, suppose you have two heaps, identified by the classes Heap1 and Heap2. Each heap class has static member functions for performing allocation and deallocation:

class Heap1 {
public:
  ...
  static void* alloc(size_t numBytes, const void *memoryBlockToBeNear);
  static void dealloc(void *ptr);
  ...
};

class Heap2 { ... };                  // has the same alloc/dealloc interface

Further suppose you’d like to co-locate the contents of some STL containers in different heaps. Again, no problem. First you write an allocator designed to use classes like Heap1 and Heap2 for the actual memory management:

template<typename T, typename Heap>
class SpecificHeapAllocator {
public:
  ...

  pointer allocate(size_type numObjects, const void *localityHint = 0)
  {
    return static_cast<pointer>(Heap::alloc(numObjects * sizeof(T),
                                            localityHint));
  }

  void deallocate(pointer ptrToMemory, size_type numObjects)
  {
    Heap::dealloc(ptrToMemory);
  }

  ...
};

Then you use SpecificHeapAllocator to cluster containers’ elements together:

vector<int, SpecificHeapAllocator<int, Heap1> > v;        // put both v's and
set<int, SpecificHeapAllocator<int, Heap1> > s;           // s's elements in
                                                          // Heap1

list<Widget,
     SpecificHeapAllocator<Widget, Heap2> > L;            // put both L's and
map<int, string, less<int>,                               // m's elements in
    SpecificHeapAllocator<pair<const int, string>,        // Heap2
                          Heap2> > m;

In this example, it’s quite important that Heap1 and Heap2 be types and not objects. The STL offers a syntax for initializing different STL containers with different allocator objects of the same type, but I’m not going to show you what it is. That’s because if Heap1 and Heap2 were objects instead of types, they’d be inequivalent allocators, and that would violate the equivalence constraint on allocators that is detailed in Item 10.

As these examples demonstrate, allocators are useful in a number of contexts. As long as you obey the constraint that all allocators of the same type must be equivalent, you’ll have no trouble employing custom allocators to control general memory management strategies, clustering relationships, and use of shared memory and other special heaps.

Item 12: Have realistic expectations about the thread safety of STL containers

The world of standard C++ is rather sheltered and old-fashioned. In this rarefied world, all executables are statically linked. Neither memory-mapped files nor shared memory exist. There are no window systems, no networks, no databases, no other processes. That being the case, you should not be surprised to learn that the Standard says not a word about threading. The first expectation you should have about the thread safety of the STL, then, is that it will vary from implementation to implementation.

Of course, multithreaded programs are common, so most STL vendors strive to make their implementations work well in a threaded environment. Even when they do a good job, however, much of the burden remains on your shoulders, and it’s important to understand why. There’s only so much STL vendors can do to ease your multithreading pain, and you need to know what it is.

The gold standard in support for multithreading in STL containers (and the aspiration of most vendors) has been defined by SGI and is published at their STL Web Site [21]. In essence, it says that the most you can hope for from an implementation is the following.

• Multiple readers are safe. Multiple threads may simultaneously read the contents of a single container, and this will work correctly. Naturally, there must not be any writers acting on the container during the reads.

• Multiple writers to different containers are safe. Multiple threads may simultaneously write to different containers.

That’s all, and let me make clear that this is what you can hope for, not what you can expect. Some implementations offer these guarantees, but some do not.

Writing multithreaded code is hard, and many programmers wish that STL implementations were completely thread safe out of the box. Were that the case, programmers might hope to be relieved of the need to attend to concurrency control themselves. There’s no doubt this would be a convenient state of affairs, but it would also be very difficult to achieve. Consider the following ways a library might try to implement such comprehensive container thread safety:

• Lock a container for the duration of each call to its member functions.

• Lock a container for the lifetime of each iterator it returns (via, e.g., calls to begin or end).

• Lock a container for the duration of each algorithm invoked on that container. (This actually makes no sense, because, as Item 32 explains, algorithms have no way to identify the container on which they are operating. Nevertheless, we’ll examine this option here, because it’s instructive to see why it wouldn’t work even if it were possible.)

Now consider the following code. It searches a vector<int> for the first occurrence of the value 5, and, if it finds one, changes that value to 0.

vector<int> v;

...

vector<int>::iterator first5(find(v.begin(), v.end(), 5));      // Line 1

if (first5 != v.end()) {                                        // Line 2
    *first5 = 0;                                                // Line 3
}

In a multithreaded environment, it’s possible that a different thread will modify the data in v immediately after completion of Line 1. If that were to happen, the test of first5 against v.end on Line 2 would be meaningless, because v’s values would be different from what they were at the end of Line 1. In fact, such a test could yield undefined results, because another thread could have intervened between Lines 1 and 2 and invalidated first5, perhaps by performing an insertion that caused the vector to reallocate its underlying memory. (That would invalidate all the vector’s iterators. For details on this reallocation behavior, turn to Item 14.) Similarly, the assignment to *first5 on Line 3 is unsafe, because another thread might execute between Lines 2 and 3 in such a way as to invalidate first5, perhaps by erasing the element it points to (or at least used to point to).
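
For example, a single statement like this one, executed by a second thread between Lines 1 and 2, is all it takes:

// meanwhile, in some other thread...
v.push_back(22);                  // if this triggers reallocation, every
                                  // iterator into v is invalidated,
                                  // including first5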

None of the approaches to locking listed above would prevent these problems. The calls to begin and end in Line 1 both return too quickly to offer any help; the iterators they generate last only until the end of that line; and find also returns at the end of that line.

For the code above to be thread safe, v must remain locked from Line 1 through Line 3, and it’s difficult to imagine how an STL implementation could deduce that automatically. Bearing in mind the typically high cost of synchronization primitives (semaphores, mutexes, etc.), it’s even more difficult to imagine how an implementation could do it without imposing a significant performance penalty on programs designed so that no more than one thread has access to v during the execution of Lines 1-3.

Such considerations explain why you can’t expect any STL implementation to make your threading woes disappear. Instead, you’ll have to manually take charge of synchronization control in these kinds of scenarios. In this example, you might do it like this:

vector<int> v;

...

getMutexFor(v);

vector<int>::iterator first5(find(v.begin(), v.end(), 5));
if (first5 != v.end()) {                                     // this is now safe
    *first5 = 0;                                             // so is this
}

releaseMutexFor(v);
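
Bear in mind that getMutexFor and releaseMutexFor are not standard functions; you have to supply them yourself. Here’s a minimal sketch, assuming POSIX threads and a single mutex dedicated to the one container we care about. (A real version would need some way to map arbitrary containers to mutexes.)

#include <pthread.h>
#include <vector>

pthread_mutex_t vMutex = PTHREAD_MUTEX_INITIALIZER;   // one mutex, dedicated
                                                      // to the container v

void getMutexFor(const std::vector<int>&)   // the parameters exist only to
{                                           // mirror the calls shown above;
    pthread_mutex_lock(&vMutex);            // this sketch knows about exactly
}                                           // one container

void releaseMutexFor(const std::vector<int>&)
{
    pthread_mutex_unlock(&vMutex);
}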

A more object-oriented solution is to create a Lock class that acquires a mutex in its constructor and releases it in its destructor, thus minimizing the chances that a call to getMutexFor will go unmatched by a call to releaseMutexFor. The essence of such a class (really a class template) is this:

template<typename Container>       // skeletal template for classes
class Lock {                       // that acquire and release mutexes
public:                            // for containers; many details
                                   // have been omitted
  Lock(const Container& container)
  : c(container)
  {
      getMutexFor(c);              // acquire mutex in the constructor
  }

  ~Lock()
  {
      releaseMutexFor(c);          // release it in the destructor
  }
private:
  const Container& c;
};

The idea of using a class (like Lock) to manage the lifetime of resources (such as mutexes) is generally known as resource acquisition is initialization, and you should be able to read about it in any comprehensive C++ textbook. A good place to start is Stroustrup’s The C++ Programming Language [7], because Stroustrup popularized the idiom, but you can also turn to Item 9 of More Effective C++. No matter what source you consult, bear in mind that the above Lock is stripped to the bare essentials. An industrial-strength version would require a number of enhancements, but such a fleshing-out would have nothing to do with the STL. Furthermore, this minimalist Lock is enough to see how we could apply it to the example we’ve been considering:

vector<int> v;

...

{                                                 // create new block
  Lock<vector<int> > lock(v);                     // acquire mutex

  vector<int>::iterator first5(find(v.begin(), v.end(), 5));

  if (first5 != v.end()) {
      *first5 = 0;
  }
}                                                 // close block, automatically
                                                  // releasing the mutex

Because a Lock object releases the container’s mutex in the Lock’s destructor, it’s important that the Lock be destroyed as soon as the mutex should be released. To make that happen, we create a new block in which to define the Lock, and we close that block as soon as we no longer need the mutex. This sounds like we’re just trading the need to call releaseMutexFor for the need to close a new block, but that’s not an accurate assessment. If we forget to create a new block for the Lock, the mutex will still be released, but it may happen later than it should: when control reaches the end of the enclosing block. If we forget to call releaseMutexFor, we never release the mutex at all.
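
To see the difference, consider this sketch, where doLengthyUnrelatedWork is a hypothetical stand-in for code that never touches v:

void f(vector<int>& v)
{
  Lock<vector<int> > lock(v);                    // no dedicated block, so the
                                                 // mutex is acquired here...
  vector<int>::iterator first5(find(v.begin(), v.end(), 5));
  if (first5 != v.end()) *first5 = 0;

  doLengthyUnrelatedWork();                      // ...and is still held during
}                                                // all of this; it isn’t
                                                 // released until f returns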

Furthermore, the Lock-based approach is robust in the presence of exceptions. C++ guarantees that local objects are destroyed if an exception is thrown, so Lock will release its mutex even if an exception is thrown while we’re using the Lock object. If we relied on manual calls to getMutexFor and releaseMutexFor, we’d never relinquish the mutex if an exception was thrown after calling getMutexFor but before calling releaseMutexFor.

There’s a loophole in this guarantee: if an exception is never caught, the program will terminate, and in that case local objects (such as lock) may not have their destructors called. Some compilers destroy them anyway, some do not; both behaviors are valid.
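
To give you a taste of the fleshing-out I have in mind, here’s one enhancement an industrial-strength Lock would almost certainly include. If a Lock could be copied, each copy’s destructor would release the mutex, so a single acquisition could be released several times. The conventional C++ way to prevent that is to declare the copy constructor and copy assignment operator private and never define them; while we’re at it, declaring the constructor explicit rules out accidental conversions. This is still a sketch, not a finished implementation:

template<typename Container>
class Lock {
public:
  explicit Lock(const Container& container)
  : c(container)
  {
      getMutexFor(c);
  }

  ~Lock()
  {
      releaseMutexFor(c);
  }

private:
  const Container& c;

  Lock(const Lock&);                 // copying a Lock would release the
  Lock& operator=(const Lock&);      // mutex once per copy, so these are
                                     // declared private and never defined
};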

Exceptions and resource management are important, but they’re not the subject of this Item. This Item is about thread safety in the STL. When it comes to thread safety and STL containers, you can hope for a library implementation that allows multiple readers on one container and multiple writers on separate containers. You can’t hope for the library to eliminate the need for manual concurrency control, and you can’t rely on any thread support at all.
