Parallel Programming Patterns in CUDA

In this chapter, we will cover parallel programming patterns that will help you understand how to parallelize different algorithms and optimize them in CUDA. The techniques in this chapter apply to a variety of problems. For example, the parallel reduction we looked at in Chapter 3, CUDA Thread Programming, can be used to design an efficient softmax layer for neural network operations.

This chapter covers the following topics:

  • Matrix multiplication optimization
  • Image convolution
  • Prefix sum
  • Pack and split
  • N-body operation
  • QuickSort in CUDA using dynamic parallelism
  • Radix sort
  • Histogram calculation