Questions

1. Suppose you get a job translating some old legacy FORTRAN BLAS code to CUDA. You open a file and see a function called SBLAH, and another called ZBLEH. Can you tell what datatypes these two functions use without looking them up?
2. Can you alter the cuBLAS level-2 GEMV example to work by copying the matrix A directly to the GPU, without taking the transpose on the host to set it column-wise? (Sketch below.)
3. Use the cuBLAS 32-bit real dot product (cublasSdot) to implement matrix-vector multiplication, using a row-wise matrix and a stride-1 vector. (Sketch below.)
4. Implement matrix-matrix multiplication using cublasSdot. (Sketch below.)
5. Can you implement a method to precisely measure the GEMM operations in the performance measurement example? (Sketch below.)
6. In the 1D FFT example, try casting x to a complex64 array, and then switch the FFT and inverse FFT plans to be complex64-valued in both directions. Then check whether np.allclose(x, x_gpu.get()) is true, this time without restricting the check to the first half of the array. Why do you think this works now? (Sketch below.)
7. Notice that there is a dark edge around the blurred image in the convolution example. Why does this appear in the blurred image but not in the original? Can you think of a method to mitigate this? (Sketch below.)
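
Sketch for question 2. One possible approach (not the book's official solution): a row-major (m, n) array occupies exactly the same memory as a column-major (n, m) array holding its transpose, so we can copy A to the GPU as-is and ask cuBLAS to apply the transpose itself via the 't' operation flag. Note that lda becomes n and the dimension arguments are swapped:

    import numpy as np
    import pycuda.autoinit
    from pycuda import gpuarray
    from skcuda import cublas

    m, n = 10, 100
    A = np.random.rand(m, n).astype(np.float32)  # row-major on the host
    x = np.random.rand(n).astype(np.float32)

    A_gpu = gpuarray.to_gpu(A)                   # copied directly: no A.T.copy()
    x_gpu = gpuarray.to_gpu(x)
    y_gpu = gpuarray.zeros(m, dtype=np.float32)

    handle = cublas.cublasCreate()
    # cuBLAS sees the buffer as a column-major (n, m) matrix equal to A^T;
    # requesting the transpose computes y = (A^T)^T x = Ax.
    cublas.cublasSgemv(handle, 't', n, m, 1.0, A_gpu.gpudata, n,
                       x_gpu.gpudata, 1, 0.0, y_gpu.gpudata, 1)
    cublas.cublasDestroy(handle)
    print(np.allclose(A.dot(x), y_gpu.get()))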
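Sketch for question 3. A minimal version, assuming the chapter's PyCUDA/scikit-cuda setup: each row of a row-major matrix is contiguous on the GPU, so we can hand cublasSdot a raw pointer to the start of row i and dot it against the stride-1 vector x:

    import numpy as np
    import pycuda.autoinit
    from pycuda import gpuarray
    from skcuda import cublas

    m, n = 8, 16
    A = np.random.rand(m, n).astype(np.float32)  # row-wise (row-major) matrix
    x = np.random.rand(n).astype(np.float32)     # stride-1 vector

    A_gpu = gpuarray.to_gpu(A)
    x_gpu = gpuarray.to_gpu(x)
    y = np.empty(m, dtype=np.float32)

    handle = cublas.cublasCreate()
    row_bytes = n * A.dtype.itemsize
    for i in range(m):
        row_ptr = int(A_gpu.gpudata) + i * row_bytes  # start of row i
        y[i] = cublas.cublasSdot(handle, n, row_ptr, 1, x_gpu.gpudata, 1)
    cublas.cublasDestroy(handle)
    print(np.allclose(A.dot(x), y))

Each cublasSdot call copies a single scalar back to the host, so this is far slower than one GEMV call; the point is only to exercise pointers and strides.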
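Sketch for question 4. The same idea extends to matrix-matrix multiplication: C[i, j] is the dot product of row i of A (contiguous, stride 1) with column j of B, which in row-major storage starts at element j and is visited with stride p, the number of columns of B:

    import numpy as np
    import pycuda.autoinit
    from pycuda import gpuarray
    from skcuda import cublas

    m, k, p = 4, 8, 5
    A = np.random.rand(m, k).astype(np.float32)
    B = np.random.rand(k, p).astype(np.float32)
    C = np.empty((m, p), dtype=np.float32)

    A_gpu = gpuarray.to_gpu(A)
    B_gpu = gpuarray.to_gpu(B)

    handle = cublas.cublasCreate()
    itemsize = A.dtype.itemsize
    for i in range(m):
        a_ptr = int(A_gpu.gpudata) + i * k * itemsize  # row i of A, stride 1
        for j in range(p):
            b_ptr = int(B_gpu.gpudata) + j * itemsize  # column j of B, stride p
            C[i, j] = cublas.cublasSdot(handle, k, a_ptr, 1, b_ptr, p)
    cublas.cublasDestroy(handle)
    print(np.allclose(A.dot(B), C))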
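Sketch for question 5. One way to time the GEMM more precisely than host-side wall-clock timing is with CUDA events, which are timestamped on the GPU's own stream, so they exclude Python call overhead; the matrix sizes here are placeholders, not the chapter's:

    import numpy as np
    import pycuda.autoinit
    import pycuda.driver as drv
    from pycuda import gpuarray
    from skcuda import cublas

    m = n = k = 1024
    # Column-major (m, k), (k, n), and (m, n) buffers for C = A.B
    A_gpu = gpuarray.to_gpu(np.random.rand(k, m).astype(np.float32))
    B_gpu = gpuarray.to_gpu(np.random.rand(n, k).astype(np.float32))
    C_gpu = gpuarray.zeros((n, m), dtype=np.float32)

    handle = cublas.cublasCreate()
    start, end = drv.Event(), drv.Event()
    start.record()                      # GPU-side timestamp, not host time
    cublas.cublasSgemm(handle, 'n', 'n', m, n, k, 1.0,
                       A_gpu.gpudata, m, B_gpu.gpudata, k,
                       0.0, C_gpu.gpudata, m)
    end.record()
    end.synchronize()                   # block until the GEMM has finished
    cublas.cublasDestroy(handle)

    seconds = start.time_till(end) * 1e-3    # time_till reports milliseconds
    flops = 2.0 * m * n * k                  # one multiply and one add per term
    print('GFLOPS: %.2f' % (flops / seconds / 1e9))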
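Sketch for question 6. The modified round trip might look like the following. With complex64 plans in both directions, cuFFT performs a complex-to-complex transform that produces all N output values, rather than storing only the non-redundant half of a real-to-complex result, so the entire array (not just its first half) survives the round trip:

    import numpy as np
    import pycuda.autoinit
    from pycuda import gpuarray
    from skcuda import fft

    x = np.asarray(np.random.rand(1000), dtype=np.complex64)  # x cast to complex64
    x_gpu = gpuarray.to_gpu(x)
    x_hat = gpuarray.empty_like(x_gpu)

    plan = fft.Plan(x_gpu.shape, np.complex64, np.complex64)          # forward
    inverse_plan = fft.Plan(x_gpu.shape, np.complex64, np.complex64)  # inverse
    fft.fft(x_gpu, x_hat, plan)
    fft.ifft(x_hat, x_gpu, inverse_plan, scale=True)  # scale by 1/N on the way back

    print(np.allclose(x, x_gpu.get(), atol=1e-6))     # true over the full array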
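Sketch for question 7. The dark edge appears because an FFT-based convolution implicitly treats everything outside the image as zero, so pixels near the border get averaged with black; the original image has no such implicit black frame. One mitigation is to pad the image by replicating its edge pixels before blurring and then crop the result. Here conv_2d and blur_filter are hypothetical stand-ins for the chapter's convolution function and Gaussian kernel:

    import numpy as np

    def blur_without_dark_edge(image, blur_filter, conv_2d, radius):
        # Replicate border pixels so the convolution's implicit zero padding
        # never mixes black into the edges of the image.
        padded = np.pad(image, radius, mode='edge')
        blurred = conv_2d(blur_filter, padded)
        return blurred[radius:-radius, radius:-radius]  # crop to original size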