CountStrings: Using concurrent_hash_map

The container concurrent_hash_map (Chapter 5) is similar to the associative containers of STL, but it permits concurrent accesses to its elements. The hash table provides a way to use associative arrays that allows you to store data indexed by keys of any type you desire. For this example, we will start with a program that uses a standard STL map (hash map) to count the occurrences of distinct strings in an array and uses the parallel_for template to run in parallel. Because the STL map is not thread-safe (STL containers are not thread-safe in general), synchronization is required to avoid corruption of the map when more than one thread tries to access it concurrently.

Example 11-27 shows our initial hybrid implementation. Step by step, we’ll examine how to replace the uses of the STL map with a Threading Building Blocks concurrent_hash_map. Because concurrent_hash_map is thread-safe, this will allow us to remove the coarse-grained synchronization using a native lock, which STL required.

Example 11-27. CountStrings using STL map with a coarse-grained lock

 1 #include "my_native_lock_wrappers.h"
 2 #include <map>
 3 #include "tbb/blocked_range.h"
 4 #include "tbb/parallel_for.h"
 5 #include "tbb/tick_count.h"
 6 #include "tbb/task_scheduler_init.h"
 7 #include <string>
 8 #include <cctype>
 9
10 using namespace tbb;
11 using namespace std;
12
13 LOCK_TYPE my_lock;
14
15 //! maps strings to ints.
16 typedef map<string,int> StringTable;
17
18 //! Function object for counting occurrences of strings.
19 struct Tally {
20     StringTable& table;
21     Tally( StringTable& table_ ) : table(table_) {}
22     void operator( )( const blocked_range<string*> range ) const {
23         for( string* p=range.begin(); p!=range.end( ); ++p ) {
24             LOCK_WRAPPER(&my_lock);
25             table[*p] += 1;
26             UNLOCK_WRAPPER(&my_lock);
27         }
28     }
29 };
30
31 const size_t N = 1000000;
32
33 static string Data[N];
34
35 static void CountOccurrences( ) {
36     StringTable table;
37     LOCK_INIT_WRAPPER(&my_lock);
38     tick_count t0 = tick_count::now( );
39     parallel_for( blocked_range<string*>( Data, Data+N, 1000 ),
40                   Tally(table) );
41     tick_count t1 = tick_count::now( );
42
43     int n = 0;
44     for( StringTable::iterator i=table.begin(); i!=table.end( ); ++i ) {
45         n+=i->second;
46     }
47     printf("total=%d time = %g
",n,(t1-t0).seconds( ));
48     LOCK_DESTROY_WRAPPER(&my_lock);
49 }
50
51 static const string Adjective[] =
52     { "sour", "sweet", "bitter", "salty", "big", "small" };
53
54 static const string Noun[] =
55     { "apple", "banana", "cherry", "date", "eggplant",
56       "fig", "grape", "honeydew", "icao", "jujube" };
57
58 static void CreateData( ) {
59     size_t n_adjective = sizeof(Adjective)/sizeof(Adjective[0]);
60     size_t n_noun = sizeof(Noun)/sizeof(Noun[0]);
61     for( int i=0; i<N; ++i ) {
62         Data[i] = Adjective[rand( )%n_adjective];
63         Data[i] += " ";
64         Data[i] += Noun[rand( )%n_noun];
65     }
66 }
67
68 int main( int argc, char* argv[] ) {
69     srand(2);
70     task_scheduler_init init;
71     CreateData( );
72     CountOccurrences( );
73 }

The example code uses an STL map protected by a coarse-grained native lock because the STL map is not safe for concurrent use. This coarse-grained lock is necessary to ensure that the STL map is not corrupted, but limits concurrency.

On line 39, the CountOccurrences method uses a parallel_for template with a Tally object as the body and a blocked_range<string *> object to describe the range, including a grain size of 1000.

The operator() in Tally, lines 22–28, performs the actual calculation in which each thread iterates through its assigned portion of data and increments the corresponding element in the STL map table. The single lock that controls access to the entire map is acquired at line 24 and then released at line 26.

Line 47 prints the total time used to tally the occurrences as well as the total number of strings that were inspected (this total should always equal the data set size of N).

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.216.230.107