Listings

Listing 2.1 HelloWorld OpenCL Kernel and Main Function

Listing 2.2 Choosing a Platform and Creating a Context

Listing 2.3 Choosing the First Available Device and Creating a Command-Queue

Listing 2.4 Loading a Kernel Source File from Disk and Creating and Building a Program Object

Listing 2.5 Creating a Kernel

Listing 2.6 Creating Memory Objects

Listing 2.7 Setting the Kernel Arguments, Executing the Kernel, and Reading Back the Results

Listing 3.1 Enumerating the List of Platforms

Listing 3.2 Querying and Displaying Platform-Specific Information

Listing 3.3 Example of Querying and Displaying Platform-Specific Information

Listing 3.4 Using Platform, Devices, and Contexts—Simple Convolution Kernel

Listing 3.5 Example of Using Platform, Devices, and Contexts—Simple Convolution

Listing 6.1 Creating and Building a Program Object

Listing 6.2 Caching the Program Binary on First Run

Listing 6.3 Querying for and Storing the Program Binary

Listing 6.4 Example Program Binary for HelloWorld.cl (NVIDIA)

Listing 6.5 Creating a Program from Binary

Listing 7.1 Creating, Writing, and Reading Buffers and Sub-Buffers Example Kernel Code

Listing 7.2 Creating, Writing, and Reading Buffers and Sub-Buffers Example Host Code

Listing 8.1 Creating a 2D Image Object from a File

Listing 8.2 Creating a 2D Image Object for Output

Listing 8.3 Query for Device Image Support

Listing 8.4 Creating a Sampler Object

Listing 8.5 Gaussian Filter Kernel

Listing 8.6 Queue Gaussian Kernel for Execution

Listing 8.7 Read Image Back to Host Memory

Listing 8.8 Mapping Image Results to a Host Memory Pointer

Listing 12.1 Vector Add Example Program Using the C++ Wrapper API

Listing 13.1 Querying Platform and Device Profiles

Listing 14.1 Sequential Implementation of RGB Histogram

Listing 14.2 A Parallel Version of the RGB Histogram—Compute Partial Histograms

Listing 14.3 A Parallel Version of the RGB Histogram—Sum Partial Histograms

Listing 14.4 Host Code of CL API Calls to Enqueue Histogram Kernels

Listing 14.5 A Parallel Version of the RGB Histogram—Optimized Version

Listing 14.6 A Parallel Version of the RGB Histogram for Half-Float and Float Channels

Listing 15.1 An OpenCL Sobel Filter

Listing 15.2 An OpenCL Sobel Filter Producing a Grayscale Image

Listing 16.1 Data Structure and Interface for Dijkstra’s Algorithm

Listing 16.2 Pseudo Code for High-Level Loop That Executes Dijkstra’s Algorithm

Listing 16.3 Kernel to Initialize Buffers before Each Run of Dijkstra’s Algorithm

Listing 16.4 Two Kernel Phases That Compute Dijkstra’s Algorithm

Listing 20.1 ImageFilter2D.py

Listing 20.2 Creating a Context

Listing 20.3 Loading an Image

Listing 20.4 Creating and Building a Program

Listing 20.5 Executing the Kernel

Listing 20.6 Reading the Image into a Numpy Array

Listing 21.1 A C Function Implementing Sequential Matrix Multiplication

Listing 21.2 A kernel to compute the matrix product of A and B, summing the result into a third matrix, C. Each work-item is responsible for a single element of the C matrix. The matrices are stored in global memory.

Listing 21.3 The Host Program for the Matrix Multiplication Program

Listing 21.4 Each work-item updates a full row of C. The kernel code is shown, as well as changes to the host code from the base host program in Listing 21.3. The only change required in the host code was to the dimensions of the NDRange.

Listing 21.5 Each work-item manages the update to a full row of C, but before doing so the relevant row of the A matrix is copied from global memory into private memory.

Listing 21.6 Each work-item manages the update to a full row of C. Private memory is used for the row of A, and local memory (Bwrk) is used by all work-items in a work-group to hold a column of B. The host code is the same as before, other than the addition of a new argument for the B-column local memory.

Listing 21.7 Different Versions of the Matrix Multiplication Functions Showing the Permutations of the Loop Orderings

Listing 22.1 Sparse Matrix-Vector Multiplication OpenCL Kernels
