Other level-1 cuBLAS functions

Let's look at a few other level-1 functions. We won't go over their operation in depth, but the steps are the same as the ones we just covered: create a cuBLAS context, call the function with the appropriate array pointers (which are accessed through the gpudata attribute of a PyCUDA gpuarray), and set the strides accordingly. Also keep in mind that if the output of a function is a single value rather than an array (for example, a dot product function), the function returns this value directly to the host instead of writing it into an array in GPU memory that has to be copied back. (We will only cover the single-precision real versions here, but the corresponding versions for other datatypes can be used by replacing the S with the appropriate letter, for example D for double precision.)

We can perform a dot product between two single-precision real gpuarrays, v_gpu and w_gpu. As before, the 1s ensure that we use stride 1 in this calculation. Recall that a dot product is the sum of the point-wise products of two vectors:

dot_output = cublas.cublasSdot(cublas_context_h, v_gpu.size, v_gpu.gpudata, 1, w_gpu.gpudata, 1)
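To make the role of the size and stride arguments concrete, here is a CPU-only NumPy sketch of the quantity that cublasSdot returns. The helper name reference_sdot is hypothetical and is not part of cuBLAS or Scikit-CUDA; it only mimics the (n, x, incx, y, incy) argument convention, and it assumes positive strides.

```python
import numpy as np

def reference_sdot(n, x, incx, y, incy):
    """CPU reference for a BLAS-style dot product.

    Hypothetical helper for illustration only; handles positive strides.
    """
    # Take n elements from each array, stepping by the given stride.
    xs = x[: n * incx : incx]
    ys = y[: n * incy : incy]
    # Sum of the point-wise products, accumulated in single precision.
    return np.float32(np.sum(xs * ys, dtype=np.float32))

v = np.array([1, 2, 3, 4], dtype=np.float32)
w = np.array([5, 6, 7, 8], dtype=np.float32)

# Stride 1 on both inputs: 1*5 + 2*6 + 3*7 + 4*8
dot_stride1 = reference_sdot(v.size, v, 1, w, 1)

# Stride 2 on both inputs, two elements each: 1*5 + 3*7
dot_stride2 = reference_sdot(2, v, 2, w, 2)

print(dot_stride1, dot_stride2)
```

A reference like this can serve as a sanity check: copy the same host arrays to the GPU, call cublasSdot, and compare the returned value against the NumPy result with np.allclose.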

We can also compute the L2-norm of a vector like so (recall that for a vector, x, its L2-norm, or length, is calculated with the formula $\|x\|_2 = \sqrt{\sum_i x_i^2}$):

l2_output = cublas.cublasSnrm2(cublas_context_h, v_gpu.size, v_gpu.gpudata, 1)
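As a rough illustration of what this call computes, the following CPU-only NumPy sketch evaluates the same square root of the sum of squares. The helper name reference_snrm2 is hypothetical (it is not a cuBLAS or Scikit-CUDA function) and it assumes a positive stride.

```python
import numpy as np

def reference_snrm2(n, x, incx):
    """CPU reference for a BLAS-style L2-norm.

    Hypothetical helper for illustration only; handles positive strides.
    """
    # Take n elements of x, stepping by the stride incx.
    xs = x[: n * incx : incx]
    # Square root of the sum of squares, accumulated in single precision.
    return np.float32(np.sqrt(np.sum(xs * xs, dtype=np.float32)))

v = np.array([3, 4], dtype=np.float32)

# sqrt(3^2 + 4^2)
l2_reference = reference_snrm2(v.size, v, 1)

print(l2_reference)
```

Note that the L2-norm is the square root of the dot product of a vector with itself, so the value returned by cublasSnrm2 for v_gpu should agree, up to floating-point rounding, with the square root of cublasSdot applied to v_gpu and v_gpu.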