Table of Contents

Cover image

Title page

Copyright

List of Figures

List of Tables

Foreword

Acknowledgments

Chapter 1: Introduction

Abstract

1.1 Introduction to Heterogeneous Computing

1.2 The Goals of This Book

1.3 Thinking Parallel

1.4 Concurrency and Parallel Programming Models

1.5 Threads and Shared Memory

1.6 Message-Passing Communication

1.7 Different Grains of Parallelism

1.8 Heterogeneous Computing with OpenCL

1.9 Book Structure

Chapter 2: Device architectures

Abstract

2.1 Introduction

2.2 Hardware Trade-offs

2.3 The Architectural Design Space

2.4 Summary

Chapter 3: Introduction to OpenCL

Abstract

3.1 Introduction

3.2 The OpenCL Platform Model

3.3 The OpenCL Execution Model

3.4 Kernels and the OpenCL Programming Model

3.5 OpenCL Memory Model

3.6 The OpenCL Runtime with an Example

3.7 Vector Addition Using an OpenCL C++ Wrapper

3.8 OpenCL for CUDA Programmers

3.9 Summary

Chapter 4: Examples

Abstract

4.1 OpenCL Examples

4.2 Histogram

4.3 Image Rotation

4.4 Image Convolution

4.5 Producer-Consumer

4.6 Utility Functions

4.7 Summary

Chapter 5: OpenCL runtime and concurrency model

Abstract

5.1 Commands and the Queuing Model

5.2 Multiple Command-Queues

5.3 The Kernel Execution Domain: Work-Items, Work-Groups, and NDRanges

5.4 Native and Built-In Kernels

5.5 Device-Side Queuing

5.6 Summary

Chapter 6: OpenCL host-side memory model

Abstract

6.1 Memory Objects

6.2 Memory Management

6.3 Shared Virtual Memory

6.4 Summary

Chapter 7: OpenCL device-side memory model

Abstract

7.1 Synchronization and Communication

7.2 Global Memory

7.3 Constant Memory

7.4 Local Memory

7.5 Private Memory

7.6 Generic Address Space

7.7 Memory Ordering

7.8 Summary

Chapter 8: Dissecting OpenCL on a heterogeneous system

Abstract

8.1 OpenCL on an AMD FX-8350 CPU

8.2 OpenCL on the AMD Radeon R9 290X GPU

8.3 Memory Performance Considerations in OpenCL

8.4 Summary

Chapter 9: Case study: Image clustering

Abstract

9.1 Introduction

9.2 The Feature Histogram on the CPU

9.3 OpenCL Implementation

9.4 Performance Analysis

9.5 Conclusion

Chapter 10: OpenCL profiling and debugging

Abstract

10.1 Introduction

10.2 Profiling OpenCL Code Using Events

10.3 AMD CodeXL

10.4 Profiling Using CodeXL

10.5 Analyzing Kernels Using CodeXL

10.6 Debugging OpenCL Kernels Using CodeXL

10.7 Debugging Using printf

10.8 Summary

Chapter 11: Mapping high-level programming languages to OpenCL 2.0: A compiler writer’s perspective

Abstract

11.1 Introduction

11.2 A Brief Introduction to C++ AMP

11.3 OpenCL 2.0 as a Compiler Target

11.4 Mapping Key C++ AMP Constructs to OpenCL

11.5 C++ AMP Compilation Flow

11.6 Compiled C++ AMP Code

11.7 How Shared Virtual Memory in OpenCL 2.0 Fits in

11.8 Compiler Support for Tiling in C++ AMP

11.9 Address Space Deduction

11.10 Data Movement Optimization

11.11 Binomial Options: A Full Example

11.12 Preliminary Results

11.13 Conclusion

Chapter 12: WebCL: Enabling OpenCL acceleration of Web applications

Abstract

12.1 Introduction

12.2 Programming with WebCL

12.3 Synchronization

12.4 Interoperability with WebGL

12.5 Example Application

12.6 Security Enhancement

12.7 WebCL on the Server

12.8 Status and Future of WebCL

Works Cited

Chapter 13: Foreign lands: Plugging OpenCL in

Abstract

13.1 Introduction

13.2 Beyond C and C++

13.3 Haskell OpenCL

13.4 Summary

Index
