OpenMP and CUDA calls

OpenMP uses a fork-join model of parallelism to target multi-core CPUs. The master thread forks a team of worker threads when it enters a parallel region. The host threads carry out their own work in parallel and join back into the master thread when they finish.

With OpenMP, multiple host threads can issue CUDA kernel calls in parallel. Instead of maintaining each kernel call individually, the programmer writes the launch once and lets each kernel execution depend on the index of the host thread that issues it.

We will use the following OpenMP APIs in this section:

  • omp_set_num_threads() sets the number of threads that subsequent parallel regions will use.
  • omp_get_thread_num() returns the calling thread's index within its team, so that each thread can identify its task.
  • #pragma omp parallel {} marks a parallel region, a block of code that every thread in the team executes.

Now, let's write some code in which OpenMP calls a CUDA kernel function.
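
One way this could look is sketched below. The kernel tag_kernel, the helper launch_from_openmp, and the thread and chunk sizes are hypothetical names and values chosen for illustration. The sketch assumes that the OpenMP runtime grants the requested number of threads and omits CUDA error checking for brevity. Each host thread launches the same kernel on its own slice of a device buffer, using its thread index to pick the slice.

```cuda
#include <cstdio>
#include <vector>
#include <omp.h>
#include <cuda_runtime.h>

// Hypothetical kernel: stamps every element of a slice with the index
// of the host thread that launched it.
__global__ void tag_kernel(int *data, int n, int host_tid)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] = host_tid;
}

// Hypothetical helper: one kernel launch per OpenMP thread, each on its own slice.
// Assumes the runtime grants num_threads threads; if it grants fewer,
// the slices of the missing threads keep their initial value of -1.
std::vector<int> launch_from_openmp(int num_threads, int chunk)
{
    const int total = num_threads * chunk;
    int *d_data = nullptr;
    cudaMalloc(&d_data, total * sizeof(int));
    cudaMemset(d_data, 0xFF, total * sizeof(int)); // every int becomes -1

    const int threads_per_block = 256;
    const int blocks = (chunk + threads_per_block - 1) / threads_per_block;

    omp_set_num_threads(num_threads);
    #pragma omp parallel
    {
        // The host thread index selects the slice and is passed to the kernel.
        int tid = omp_get_thread_num();
        tag_kernel<<<blocks, threads_per_block>>>(d_data + tid * chunk, chunk, tid);
    }

    // The host threads have joined; wait for all kernels to finish.
    cudaDeviceSynchronize();

    std::vector<int> h_data(total);
    cudaMemcpy(h_data.data(), d_data, total * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    return h_data;
}

int main()
{
    const int num_threads = 4, chunk = 8;
    std::vector<int> result = launch_from_openmp(num_threads, chunk);

    // Print the first element of each slice.
    for (int t = 0; t < num_threads; ++t)
        printf("slice %d tagged by host thread %d\n", t, result[t * chunk]);
    return 0;
}
```

With nvcc and GCC as the host compiler, a file like this can typically be built with nvcc -Xcompiler -fopenmp. Note that by default all host threads launch into the same default stream, so the kernels run one after another on the GPU; compiling with nvcc's --default-stream per-thread option gives each host thread its own default stream, which allows the kernels to overlap.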
