There's more...

The key feature of CUDA that makes this programming model substantially different from the parallel models normally used on CPUs is that, to be efficient, it requires thousands of threads to be active at the same time. This is made possible by the typical structure of GPUs, which use lightweight threads and allow execution contexts to be created and switched very quickly and cheaply.
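
As a rough illustration of this scale, the following sketch (the kernel name, sizes, and launch parameters are illustrative, not taken from the recipe) launches tens of thousands of lightweight threads, one per array element:

    #include <cuda_runtime.h>
    #include <cstdio>

    // Each thread handles exactly one array element; the GPU keeps thousands
    // of such lightweight threads in flight to hide memory latency.
    __global__ void scale(float *data, float factor, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)                      // guard the last, partially filled block
            data[i] *= factor;
    }

    int main()
    {
        const int n = 1 << 16;          // 65,536 elements -> 65,536 threads
        float *d_data;
        cudaMalloc(&d_data, n * sizeof(float));
        cudaMemset(d_data, 0, n * sizeof(float));

        const int threadsPerBlock = 256;
        const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
        scale<<<blocks, threadsPerBlock>>>(d_data, 2.0f, n);
        cudaDeviceSynchronize();

        printf("launched %d blocks of %d threads\n", blocks, threadsPerBlock);
        cudaFree(d_data);
        return 0;
    }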

Note that the scheduling of threads is directly linked to the GPU architecture and its intrinsic parallelism. A block of threads is assigned to a single SM, where its threads are further divided into groups of 32, called warps. The threads belonging to the same warp are managed by the warp scheduler. To take full advantage of the inherent parallelism of the SM, all the threads of a warp must execute the same instruction; when this condition does not hold, we speak of thread divergence.
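
The following minimal kernel (illustrative only, not from the recipe) shows how divergence arises: threads of the same warp take different branches depending on their lane, so the warp has to execute both paths one after the other, masking out the inactive threads each time:

    // Threads within one 32-thread warp disagree on the branch condition,
    // so the two paths below are serialized inside the warp.
    __global__ void divergent(int *out)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;

        if (threadIdx.x % 2 == 0)
            out[i] = i * 2;     // even lanes
        else
            out[i] = i * 3;     // odd lanes
    }

    // A divergence-free alternative is to make the branch depend on the warp
    // index (threadIdx.x / 32), so that every thread of a warp takes the same path.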
