Implementing memory management with PyCUDA

PyCUDA programs must respect the rules dictated by the structure and internal organization of the SM (Streaming Multiprocessor), which impose constraints on thread performance. In fact, knowing and correctly using the various types of memory that the GPU makes available is fundamental to achieving maximum efficiency. In CUDA-enabled GPU cards, there are four types of memory, which are as follows:

  • Registers: Each thread is assigned a set of registers that only that thread can access; no other thread can read them, even one belonging to the same block. Registers are the fastest form of memory.
  • Shared memory: Each block has its own memory, shared among the threads that belong to it. This memory is also extremely fast.
  • Constant memory: All the threads in a grid can access this memory, but only in read mode. The data present in it persists for the entire duration of the application.
  • Global memory: All the threads of the grid, and therefore all the kernels, have access to the global memory. Moreover, its data persistence is the same as that of constant memory:

GPU memory model
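As a minimal sketch of how three of these memory spaces interact, the following PyCUDA kernel stages data from global memory into shared memory before writing a result back; the kernel name `scale`, the tile size of 128, and the helper functions are illustrative choices, not from the source. A plain NumPy reference is included so the result can be checked on the CPU.

```python
import numpy as np

# Hypothetical kernel: each block stages a tile of global memory
# into shared memory, then each thread doubles its element.
kernel_src = r"""
__global__ void scale(float *out, const float *in)
{
    __shared__ float tile[128];          // shared memory: one copy per block
    int i = threadIdx.x + blockIdx.x * blockDim.x;  // i lives in a register
    tile[threadIdx.x] = in[i];           // global memory -> shared memory
    __syncthreads();                     // wait for the whole block
    out[i] = tile[threadIdx.x] * 2.0f;   // shared memory -> global memory
}
"""

def gpu_scale(a):
    """Run the kernel; a.size is assumed to be a multiple of 128."""
    import pycuda.autoinit                      # noqa: F401 (creates context)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    mod = SourceModule(kernel_src)
    scale = mod.get_function("scale")
    out = np.empty_like(a)
    scale(drv.Out(out), drv.In(a),
          block=(128, 1, 1), grid=(a.size // 128, 1))
    return out

def cpu_scale(a):
    """CPU reference for the kernel above."""
    return a * 2.0
```

On a machine with a CUDA-capable GPU, `gpu_scale(np.ones(128, dtype=np.float32))` should agree with `cpu_scale` on the same input.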