Basic linear algebra with cuBLAS

We will start this chapter by learning how to use Scikit-CUDA's cuBLAS wrappers. Let's spend a moment discussing BLAS first. BLAS (Basic Linear Algebra Subprograms) is a specification for a basic linear algebra library that was first standardized in the 1970s. BLAS functions are broken down into several categories, which are referred to as levels.

Level 1 BLAS functions consist of operations purely on vectors: vector-vector addition and scaling (also known as ax + y, or AXPY, operations), dot products, and norms. Level 2 BLAS functions consist of general matrix-vector (GEMV) operations, such as matrix-vector multiplication, while level 3 BLAS functions consist of general matrix-matrix (GEMM) operations, such as matrix-matrix multiplication. These libraries were originally written entirely in FORTRAN, so there are some archaic holdovers in usage and naming that may seem cumbersome to new users today.

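As a quick reference, the following sketch shows what each level's signature operation computes, written in plain NumPy on the CPU; the array shapes and values here are arbitrary illustrations, not part of the BLAS specification:

```python
import numpy as np

a = np.float32(2.0)
x = np.array([1, 2, 3], dtype=np.float32)
y = np.array([4, 5, 6], dtype=np.float32)

# Level 1 (AXPY): a vector-vector operation, y <- a*x + y
axpy = a * x + y

# Level 2 (GEMV): a matrix-vector operation, y <- alpha*(A @ x) + beta*y
A = np.ones((4, 3), dtype=np.float32)
gemv = A @ x

# Level 3 (GEMM): a matrix-matrix operation, C <- alpha*(A @ B) + beta*C
B = np.ones((3, 5), dtype=np.float32)
gemm = A @ B
```
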
cuBLAS is NVIDIA's own implementation of the BLAS specification, optimized to make full use of the GPU's parallelism. Scikit-CUDA provides wrappers for cuBLAS that are compatible with PyCUDA gpuarray objects, as well as with PyCUDA streams. This means that we can interface these functions with our own custom CUDA-C kernels by way of PyCUDA, and also synchronize these operations over multiple streams.

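To make this concrete, here is a minimal sketch of a single level-1 call made through these wrappers, assuming Scikit-CUDA is importable as skcuda. It performs an in-place SAXPY (single-precision AXPY) on gpuarray objects; the S prefix is the FORTRAN-era naming convention for single precision:

```python
import numpy as np
import pycuda.autoinit  # initializes a CUDA context
from pycuda import gpuarray
from skcuda import cublas

# a scalar and two single-precision vectors on the GPU
a = np.float32(2.0)
x = gpuarray.to_gpu(np.array([1, 2, 3], dtype=np.float32))
y = gpuarray.to_gpu(np.array([4, 5, 6], dtype=np.float32))

# create a cuBLAS context, compute y <- a*x + y in place, then tear down
handle = cublas.cublasCreate()
cublas.cublasSaxpy(handle, x.size, a, x.gpudata, 1, y.gpudata, 1)
cublas.cublasDestroy(handle)

print(y.get())  # expected: [ 6.  9. 12.]
```

The two 1 arguments are the strides (increments) used to step through x and y, another holdover from the original FORTRAN interface.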