Warp-level primitive programming

CUDA 9.0 introduced a new approach to warp-synchronous programming. This major change aims to stop CUDA programs from relying on implicit warp synchronization and to have them specify their synchronization targets explicitly. This helps to prevent inadvertent race conditions and deadlocks in warp-wide synchronous operations.

Historically, CUDA provided only one explicit synchronization API, __syncthreads(), which synchronizes the CUDA threads in a thread block; within a warp, it relied on implicit synchronization. The following figure shows these two levels of synchronization in a CUDA thread block's operation:

However, the latest GPU architectures (Volta and Turing) have an enhanced thread control model in which each thread can execute a different instruction, while keeping the SIMT programming model. The following diagram shows how it has changed:

Up to the Pascal architecture (left), threads were scheduled at the warp level, so the CUDA threads in a warp were synchronized implicitly. However, this carried the potential for unintended deadlocks.

The Volta architecture reworked this and introduced independent thread scheduling. This control model gives each CUDA thread its own program counter and allows arbitrary sets of participating threads within a warp. In this model, we have to use an explicit synchronization API to specify which CUDA threads take part in each warp-level operation.
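To make the need for explicit synchronization concrete, the following is a minimal sketch of a single-warp sum staged in shared memory, with __syncwarp() separating the read and write phases of each step. The kernel name warp_sum_shared, the helper run_warp_sum_shared, the input values 1 to 32, and the use of managed memory are assumptions made for illustration; error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: must be launched with exactly one warp (32 threads).
__global__ void warp_sum_shared(const int *in, int *out)
{
    __shared__ int buf[32];
    unsigned lane = threadIdx.x;

    buf[lane] = in[lane];
    __syncwarp();                       // all lanes have stored their input

    for (int offset = 16; offset > 0; offset >>= 1) {
        int v = buf[lane];
        if (lane < offset)
            v += buf[lane + offset];    // read phase
        __syncwarp();                   // every read finishes before any write
        buf[lane] = v;                  // write phase
        __syncwarp();                   // every write is visible before the next read
    }

    if (lane == 0)
        *out = buf[0];
}

// Hypothetical host helper: fills the input with 1..32 and returns the warp sum.
int run_warp_sum_shared()
{
    int *in = nullptr, *out = nullptr;
    cudaMallocManaged(&in, 32 * sizeof(int));
    cudaMallocManaged(&out, sizeof(int));
    for (int i = 0; i < 32; ++i)
        in[i] = i + 1;
    *out = 0;

    warp_sum_shared<<<1, 32>>>(in, out);
    cudaDeviceSynchronize();

    int result = *out;
    cudaFree(in);
    cudaFree(out);
    return result;
}

int main()
{
    printf("warp sum = %d\n", run_warp_sum_shared());
    return 0;
}
```

Without the two __syncwarp() calls inside the loop, a lane could overwrite buf[lane] while another lane is still reading it, which is exactly the kind of race that implicit warp synchronization used to hide.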

As a result, CUDA 9 introduced explicit warp-level primitive functions:

Warp-level primitive functions

Identifying active threads: __activemask()

Masking active threads: __all_sync(), __any_sync(), __uni_sync(), __ballot_sync(), __match_any_sync(), __match_all_sync()

Synchronized data exchange: __shfl_sync(), __shfl_up_sync(), __shfl_down_sync(), __shfl_xor_sync()

Threads synchronization: __syncwarp()

Warp-level primitive functions fall into three categories: identifying the active threads in a warp, warp operations (masking and data exchange), and synchronization. The functions with the _sync suffix, as well as __syncwarp(), take a mask argument that names the participating threads explicitly, which avoids unintended race conditions.
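As an illustration of how the mask argument is used, here is a minimal sketch that combines a vote (__ballot_sync()) with a shuffle-based reduction (__shfl_down_sync()) over one full warp. The names warp_primitives_demo, run_warp_primitives_demo, and FULL_MASK, the input values 1 to 32, and the use of managed memory are assumptions for this example; error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define FULL_MASK 0xffffffffu   // all 32 lanes of the warp participate

// Hypothetical kernel: must be launched with exactly one warp (32 threads).
__global__ void warp_primitives_demo(const int *in, int *sum, unsigned *even_lanes)
{
    unsigned lane = threadIdx.x;
    int v = in[lane];

    // Vote: bit i of the result is set when lane i holds an even value.
    unsigned ballot = __ballot_sync(FULL_MASK, (v % 2) == 0);

    // Data exchange: tree reduction, each step adds the value held by the
    // lane `offset` positions higher. Only lane 0 ends up with the full sum.
    for (int offset = 16; offset > 0; offset >>= 1)
        v += __shfl_down_sync(FULL_MASK, v, offset);

    if (lane == 0) {
        *sum = v;
        *even_lanes = ballot;
    }
}

// Hypothetical host helper: fills the input with 1..32 and runs the kernel.
void run_warp_primitives_demo(int *sum, unsigned *even_lanes)
{
    int *in = nullptr, *d_sum = nullptr;
    unsigned *d_mask = nullptr;
    cudaMallocManaged(&in, 32 * sizeof(int));
    cudaMallocManaged(&d_sum, sizeof(int));
    cudaMallocManaged(&d_mask, sizeof(unsigned));
    for (int i = 0; i < 32; ++i)
        in[i] = i + 1;

    warp_primitives_demo<<<1, 32>>>(in, d_sum, d_mask);
    cudaDeviceSynchronize();

    *sum = *d_sum;
    *even_lanes = *d_mask;
    cudaFree(in);
    cudaFree(d_sum);
    cudaFree(d_mask);
}

int main()
{
    int sum = 0;
    unsigned even_lanes = 0;
    run_warp_primitives_demo(&sum, &even_lanes);
    printf("sum = %d, even-value lanes = 0x%08x\n", sum, even_lanes);
    return 0;
}
```

Because every lane passes the same FULL_MASK, the participating set is stated in the code rather than assumed from how the hardware schedules the warp. In divergent code, the mask would have to be narrowed to the lanes that actually reach the call.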
