Modularity

With Cooperative Groups, programmers can modularize their collective-operation kernel code around the barrier target. This helps to avoid oversights that cause deadlocks and race conditions, such as assuming that all threads are running concurrently. The following is an example of a deadlock and of normal operation with CUDA thread synchronization:

In the left-hand example, the kernel code intends to synchronize only a part of the threads in a CUDA thread block. The code tries to minimize synchronization overhead by narrowing the barrier target. However, it introduces a deadlock, because __syncthreads() invokes a barrier that waits for all CUDA threads in the block to reach it. The threads that do call __syncthreads() can never meet the others, so they wait indefinitely. The right-hand example works correctly: it has no deadlock point, because all the threads in the thread block reach __syncthreads().
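The two cases described above can be sketched roughly as follows. This is a minimal illustration, not the book's original figure: the kernel names partial_sync_bad and full_sync_ok, the threshold of 16 threads, and the launch configuration are all assumptions made for the example.

```cuda
#include <cuda_runtime.h>

// Deadlock-prone (hypothetical name): only threads 0..15 reach the barrier,
// but __syncthreads() waits for every thread in the block. Calling it in
// divergent code like this is likely to hang or misbehave.
__global__ void partial_sync_bad(float *data) {
    if (threadIdx.x < 16) {
        data[threadIdx.x] *= 2.0f;
        __syncthreads();  // not reached by threads 16 and above
    }
}

// Sound (hypothetical name): the barrier sits outside the conditional,
// so every thread in the block reaches it.
__global__ void full_sync_ok(float *data) {
    if (threadIdx.x < 16) {
        data[threadIdx.x] *= 2.0f;
    }
    __syncthreads();
}

int main() {
    float *data;
    cudaMallocManaged(&data, 32 * sizeof(float));
    for (int i = 0; i < 32; ++i) data[i] = 1.0f;

    // Only the sound kernel is launched; partial_sync_bad is shown for
    // comparison and is intentionally never run.
    full_sync_ok<<<1, 32>>>(data);
    cudaDeviceSynchronize();

    cudaFree(data);
    return 0;
}
```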

With the Cooperative Groups API, on the other hand, CUDA programmers specify which thread group to synchronize. Cooperative Groups make the synchronization target explicit, so that programmers can state exactly which CUDA threads synchronize. The group can also be treated as an object, so we can pass it to device functions.
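As a rough sketch of what an explicit synchronization target can look like, the kernel below synchronizes only the first 16 threads of a block by partitioning the block into 16-thread tiles. The kernel name partial_sync_cg, the tile size, and the buffer layout are assumptions chosen for illustration.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
namespace cg = cooperative_groups;

// Hypothetical kernel: rotates the first 16 elements of data by one position,
// synchronizing only the 16 threads that take part.
__global__ void partial_sync_cg(float *data) {
    __shared__ float buf[16];
    cg::thread_block block = cg::this_thread_block();

    // The partition is created by all threads of the block, outside the branch.
    cg::thread_block_tile<16> tile = cg::tiled_partition<16>(block);

    if (block.thread_rank() < 16) {
        unsigned int r = tile.thread_rank();
        buf[r] = data[r];
        tile.sync();                  // barrier over this 16-thread tile only
        data[r] = buf[(r + 1) % 16];  // read a neighbor's value safely
    }
}

int main() {
    float *data;
    cudaMallocManaged(&data, 16 * sizeof(float));
    for (int i = 0; i < 16; ++i) data[i] = static_cast<float>(i);

    partial_sync_cg<<<1, 64>>>(data);
    cudaDeviceSynchronize();

    cudaFree(data);
    return 0;
}
```

Because every member of the first tile takes the branch, all threads that belong to the group reach tile.sync(), and the other threads in the block are not involved in the barrier.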

The following code shows how Cooperative Groups provide explicit synchronization objects that can be handled as instances:

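// The group to synchronize is received as an explicit parameter
__device__ void bar(thread_group block, float *x) {
    ...
    block.sync();  // synchronizes the threads of the group passed in
    ...
}

__global__ void foo(float *x) {
    // Pass the current thread block as the synchronization target
    bar(this_thread_block(), x);
}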

As the preceding example shows, kernel code can specify a synchronization group and pass it to a subroutine as a thread_group parameter. This lets us specify the synchronization target inside subroutines. Programmers can therefore prevent inadvertent deadlocks by using Cooperative Groups. We can also pass different kinds of groups as a thread_group and reuse the same synchronization code.
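To illustrate the reuse point, here is a minimal sketch in which one device function takes a thread_group and is called first with a whole thread block and then with a 32-thread tile. The helper reduce_sum, the kernel sum_kernel, and the sizes used are hypothetical choices for this example, and the helper assumes the group size is a power of two.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <cstdio>
namespace cg = cooperative_groups;

// Hypothetical helper: tree reduction over any thread_group.
// temp must hold at least g.size() ints; g.size() is assumed a power of two.
__device__ int reduce_sum(cg::thread_group g, int *temp, int val) {
    int lane = static_cast<int>(g.thread_rank());
    for (int i = static_cast<int>(g.size()) / 2; i > 0; i /= 2) {
        temp[lane] = val;
        g.sync();                              // wait for all stores
        if (lane < i) val += temp[lane + i];
        g.sync();                              // wait for all loads
    }
    return val;  // only the thread with rank 0 holds the full sum
}

__global__ void sum_kernel(int *out) {
    __shared__ int temp[64];
    cg::thread_block block = cg::this_thread_block();

    // Same routine, synchronizing the whole block ...
    int block_sum = reduce_sum(block, temp, 1);
    block.sync();  // make sure temp is free before it is reused

    // ... and synchronizing only a 32-thread tile of that block.
    cg::thread_group tile = cg::tiled_partition(block, 32);
    int *tile_temp = temp + 32 * (block.thread_rank() / 32);
    int tile_sum = reduce_sum(tile, tile_temp, 1);

    if (block.thread_rank() == 0) {
        out[0] = block_sum;
        out[1] = tile_sum;
    }
}

int main() {
    int *out;
    cudaMallocManaged(&out, 2 * sizeof(int));
    sum_kernel<<<1, 64>>>(out);
    cudaDeviceSynchronize();
    printf("block sum = %d, tile sum = %d\n", out[0], out[1]);
    cudaFree(out);
    return 0;
}
```

The reduction code never mentions the block or the tile directly; it only synchronizes the group it was given, which is what allows the same subroutine to serve both callers.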
