Appendix C. Multithreading

Neither the C++ standard nor the Library Technical Report says much about multithreaded applications, in part because most programs do not need to be multithreaded, and most programmers should not be writing multithreaded code. For those uncommon cases in which multithreading is appropriate, however, it’s important to understand what you need to do to use the standard C++ library and the TR1 library components safely in a multithreaded application (Section C.1). Remember, though, that none of this is required by either the C++ standard or the Library Technical Report but instead reflects the general consensus on how to design libraries for use in multithreaded applications (Section C.2). Always check your library’s documentation.

C.1. Problems

When writing a multithreaded application, you have to think about two problems: avoiding conflicting changes to shared data and making sure that changes to shared data are visible to other threads. Most programmers are familiar with the first of these; the second is becoming more important as we move to hardware systems with multiple CPUs.

If two threads are changing the same data object at the same time, the result will likely be a nonviable hybrid, with some parts from one thread and other parts from the other thread. To prevent this from happening, you have to make sure that all the changes made by one thread have been written to the data object before the other thread makes any changes. You do this by locking a mutex object; when a thread tries to lock a mutex object that is already locked, the execution of that thread is suspended until the thread that locked the mutex object unlocks it. This serializes access to the shared data object, giving each of the threads a coherent view of the contents of the object.

In a hardware system with multiple CPUs, all of them usually share most or all of the system’s main memory, and each CPU has its own private cache memory. Each CPU reads data from and writes data to its cache; this speeds up the CPU’s work because the CPU doesn’t have to contend with other CPUs for access to main memory. However, because the CPU is working with its cache instead of main memory, it might not see changes that other CPUs have made to the contents of main memory, and changes it has made might not yet have reached main memory, so they won’t be seen by other CPUs. So from time to time, each CPU has to synchronize the data in its cache with the corresponding data in main memory: Any data that has been written to the cache has to be flushed to main memory, and any data that another CPU has changed in main memory has to be refreshed in the cache.

On most systems, changes made to a shared object before creation of a thread are visible to code in that thread. At the other end, when a thread terminates, any changes made to a shared object by that thread are visible to any thread that checks the status of the terminated thread—by, for example, calling pthread_join on POSIX systems. In between, after a thread unlocks a mutex, all changes to global data made by that thread are visible to any other thread that subsequently locks the mutex:

int g1 = 0;
int g2 = 0;

// thread 1                        // thread 2
g1 = 1;                            // …
start_thread2 ();                  assert (g1 == 1);     // 1
lock_mutex ();                     lock_mutex ();
unlock_mutex ();                   g1 = 2;
assert (g1 == 2);     // 2         unlock_mutex ();
wait_for_thread2 ();               g2 = 3;
assert (g2 == 3);     // 3         exit_thread ();

In the preceding code, assert number 1 will always succeed. The change made to g1 by thread 1 is visible to thread 2 because thread 1 created thread 2 after it changed g1. Also, assert number 3 will always succeed. Thread 2 set the value of g2 to 3 before it terminated, and thread 1 waited until thread 2 terminated before taking its final look at the value of g2.

The second assert is more complicated. If thread 2 locked the mutex before thread 1 did, the assert will succeed. In that case, thread 2 made the change before unlocking the mutex, so the change will be visible to threads that subsequently lock the mutex. On the other hand, if thread 1 locked the mutex before thread 2 did, there’s no way to know what will happen. There is no guarantee that the change made by thread 2 will be visible, but there is also no guarantee that it won’t. So thread 1 could see either 1 or 2 in g1. But it’s worse than that: It’s possible that a thread switch could occur while thread 2 is writing the new value to g1, creating a nonviable hybrid. This is a race condition: The validity and the result of the code depend, in unpredictable ways, on the timing of the execution of the code in the two threads.

C.2. Libraries and Multithreading

If you think about it, it should become clear that a library cannot solve these multithreading problems. Ensuring that accesses to shared data are properly synchronized requires that an application be designed with thread safety in mind. For example:

void show ()
  {
  std::cout << "hello, " << "world\n";
  }

The two insertion operations are separate functions. The implementation of the stream inserter can protect cout’s internal data from conflicting writes, but if show is called from multiple threads, there is nothing the implementation can reasonably do to prevent a thread switch between the two insertions. The output of such a program could have multiple hellos and multiple worlds instead of the orderly sequence of lines that the programmer might have expected. To maintain the proper order, the application must guard all stream insertions:

void show ()
  {
  mutex lock; // constructor locks, destructor unlocks
  std::cout << "hello, " << "world\n";
  }

However, library writers should avoid making things more difficult for application designers concerned with writing fast and robust multithreaded applications. As we saw earlier, applying visibility rules requires knowing when the contents of a shared data object have been changed. A good library implementation will change the contents of a shared data object only when a program explicitly changes that object.

That may seem obvious, but sneak paths often modify objects behind the scenes. For example, at one time or another, you’ve probably written code to count the number of objects of a particular type that are in existence: You add a static data member to hold the count, and then each constructor increments the count, and each destructor decrements it. This seems innocuous, but if you create two of these objects in two separate threads, you may find that neither one has the correct count; the static counter is shared data, and without synchronization, there is no guarantee that changes made by either thread will be visible to the other.[1]

Most library implementations make the following promises.

• Multiple threads can read data from the same object without interference.

• Changes to one object do not affect other objects of the same type.

Keeping these promises in mind, you can write robust multithreaded applications by applying the visibility and locking rules that we just talked about. Using objects from the standard C++ library and the TR1 library won’t introduce additional problems.
