Start a Large Task in Parallel with the Main Program

Instead of having all threads execute portions of a problem, it is possible to start a task in parallel with the main application. We’ve seen a number of requests for how to do this. The trick is to use a nonexecuting dummy task as the parent on which to synchronize, as shown in Example 11-37. Something very close to this trick is already used in tbb/parallel_while.h and tbb/parallel_scan.h, shown earlier.

One of the beautiful things about this approach is that each half of the program is free to invoke as much parallelism as it desires. The task-based approach of Threading Building Blocks does the load balancing and manages the assignment of tasks to threads without causing oversubscription.

Example 11-37. Using a dummy task for synchronization

// The technique is similar to one used in tbb/parallel_while.h

#include "tbb/task.h"
#include "tbb/task_scheduler_init.h"
#include <stdio.h>
#include <stdlib.h>

//! Some busywork
void TwiddleThumbs( const char * message, int n ) {
    for( int i=0; i<n; ++i ) {
        printf(" %s: i=%d\n",message,i);
        static volatile int x;
        for( int j=0; j<20000000; ++j )
            ++x;
    }
}

//! SideShow task
class SideShow: public tbb::task {
    tbb::task* execute( ){
        TwiddleThumbs("Sideshow task",4);
        return NULL;
    }
};

//! Start up a SideShow task.
//! Return pointer to dummy task that acts as parent of the SideShow.
tbb::empty_task* StartSideShow( ) {
    tbb::empty_task* parent = new( tbb::task::allocate_root( ) ) tbb::empty_task;
    // 2 = 1 for SideShow and 1 for the wait
    parent->set_ref_count(2);
    SideShow* s = new( parent->allocate_child( ) ) SideShow;
    parent->spawn(*s);
    return parent;
}

//! Wait for SideShow task. Argument is dummy parent of the SideShow.
void WaitForSideShow( tbb::empty_task* parent ) {
    parent->wait_for_all( );
    // parent not actually run, so we need to destroy it explicitly.
    // (If you forget this line, the debug version of tbb reports a task leak.)
    parent->destroy(*parent);
}

//! Optional command-line argument is number of threads to use. Default is 2.
int main( int argc, char* argv[] ) {
    tbb::task_scheduler_init init( argc>1 ? strtol(argv[1],0,0) : 2 );
    // Looping over n tests the various cases where SideShow or main finishes twiddling first.
    for( int n=3; n<=5; ++n ) {
        printf("\ntest with n=%d\n",n);

        // Start up a Sideshow task
        tbb::empty_task* e = StartSideShow( );

        // Do some useful work
        TwiddleThumbs("master",n);

        // Wait for Sideshow task to complete
        WaitForSideShow(e);
    }
    return 0;
}

In the example, the main program starts up an additional task called SideShow as the child of a dummy parent task. The parent task is never executed, which makes it well suited to serve as a synchronization point: waiting on it tells the main program whether and when the SideShow has completed. The example in the following section builds on this one to solve a common problem in parallel programs.
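
The reference count of 2 that StartSideShow gives the dummy parent can be read as one reference for the SideShow child plus one held on behalf of the later wait_for_all call. The following is a minimal model of that counting in standard C++, not TBB code. DummyParent, child_done, and run_one_child are hypothetical names invented for this sketch, and the model simply blocks on a condition variable, whereas the TBB scheduler lets the waiting thread run tasks in the meantime.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the dummy parent: it never does any work itself.
// It only counts outstanding children plus one extra reference for the waiter.
class DummyParent {
public:
    explicit DummyParent(int ref_count) : ref_count_(ref_count) {
        assert(ref_count >= 1);  // at least the waiter's own reference
    }

    // Called by a child when it has finished its work.
    void child_done() {
        std::lock_guard<std::mutex> lock(mutex_);
        --ref_count_;
        changed_.notify_all();
    }

    // Blocks until only the waiter's reference is left.
    void wait_for_all() {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [this] { return ref_count_ == 1; });
    }

    int ref_count() {
        std::lock_guard<std::mutex> lock(mutex_);
        return ref_count_;
    }

private:
    std::mutex mutex_;
    std::condition_variable changed_;
    int ref_count_;
};

// Runs one child on its own thread and waits for it through the parent.
// Returns the value the child wrote, which is only safe to read after the wait.
int run_one_child() {
    DummyParent parent(2);   // 2 = 1 for the child + 1 for the wait
    int result = 0;
    std::thread child([&] {
        result = 42;         // the child's "work"
        parent.child_done();
    });
    parent.wait_for_all();   // returns only after the child has finished
    child.join();
    return result;
}
```

Calling run_one_child() starts the child, waits through the parent, and returns the child's value. Had the count been set to 1 instead of 2 in this model, wait_for_all could return before the child had finished, which is the mistake the extra reference guards against.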

The main and SideShow tasks are free to create more tasks by using parallel algorithms from Threading Building Blocks or the task scheduler. There is no danger of oversubscription, so the SideShow developer and the developer of the main program need not coordinate their decisions on parallelism unless they share some data. If SideShow and the main program do share data, the developers need only agree on safe concurrent access to that data. There is still no need to discuss load balancing because it is automatic when you use Threading Building Blocks to manage your parallelism.
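
As an illustration of what agreeing on safe concurrent access can look like, here is a small sketch in which a side task and its caller both update one shared counter. It uses std::async and std::atomic from the C++ standard library rather than TBB tasks, so it shows only the data-sharing point and none of the scheduler's load balancing. The function name count_from_both_sides is an assumption made for this example.

```cpp
#include <atomic>
#include <future>

// Hypothetical shared state: a counter that both halves of the program update.
// Returns the final count after the side task and the caller each add n.
long count_from_both_sides(long n) {
    std::atomic<long> shared(0);

    // Start the side work; std::launch::async requests a separate thread.
    std::future<void> side = std::async(std::launch::async, [&shared, n] {
        for (long i = 0; i < n; ++i)
            shared.fetch_add(1, std::memory_order_relaxed);
    });

    // Meanwhile, the caller does its own share of the work.
    for (long i = 0; i < n; ++i)
        shared.fetch_add(1, std::memory_order_relaxed);

    side.get();  // wait for the side work before reading the result
    return shared.load();
}
```

Because every update goes through the atomic counter, count_from_both_sides(n) returns 2*n however the two loops interleave. With a plain long in place of the atomic, the two loops would race and the result would be unpredictable.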

The program is instrumented with simple printf calls to show the cases where the main program completes before and after the SideShow task. Table 11-1 shows the output from a dual-core system running Windows Vista and using Intel Threading Building Blocks 1.1 for Windows. For illustration, the program was run three times, with the command-line argument setting the number of threads to one, two, and four. With only one thread there is no parallelism: the master finishes all of its iterations before the SideShow task begins.

Table 11-1. Output from TwiddleThumbs in the example

All runs were made on a two-core system.

One thread              Two threads             Four threads
----------------------  ----------------------  ----------------------
test with n=3           test with n=3           test with n=3
 master: i=0             master: i=0             master: i=0
 master: i=1             Sideshow task: i=0      Sideshow task: i=0
 master: i=2             master: i=1             Sideshow task: i=1
 Sideshow task: i=0      Sideshow task: i=1      master: i=1
 Sideshow task: i=1      master: i=2             Sideshow task: i=2
 Sideshow task: i=2      Sideshow task: i=2      master: i=2
 Sideshow task: i=3      Sideshow task: i=3      Sideshow task: i=3

test with n=4           test with n=4           test with n=4
 master: i=0             master: i=0             master: i=0
 master: i=1             Sideshow task: i=0      Sideshow task: i=0
 master: i=2             Sideshow task: i=1      master: i=1
 master: i=3             master: i=1             Sideshow task: i=1
 Sideshow task: i=0      master: i=2             master: i=2
 Sideshow task: i=1      Sideshow task: i=2      Sideshow task: i=2
 Sideshow task: i=2      master: i=3             master: i=3
 Sideshow task: i=3      Sideshow task: i=3      Sideshow task: i=3

test with n=5           test with n=5           test with n=5
 master: i=0             master: i=0             master: i=0
 master: i=1             Sideshow task: i=0      Sideshow task: i=0
 master: i=2             master: i=1             Sideshow task: i=1
 master: i=3             Sideshow task: i=1      Sideshow task: i=2
 master: i=4             master: i=2             master: i=1
 Sideshow task: i=0      master: i=3             master: i=2
 Sideshow task: i=1      Sideshow task: i=2      Sideshow task: i=3
 Sideshow task: i=2      master: i=4             master: i=3
 Sideshow task: i=3      Sideshow task: i=3      master: i=4
