Copyright © 2011 NVIDIA Corporation and Wen-mei W. Hwu. All rights reserved.
Introduction
The State of GPU Computing in Signal and Audio Processing
The inexorable growth in the volume of available digital data and the stunning advance of parallel computational power in GPUs are driving the application of GPU computing to signal processing in areas such as telecommunications, networking, multimedia, man-machine interfaces, signal intelligence, and data analytics.
Many of the computations involved in these domains lend themselves naturally to parallel computing, but others present challenges inherent in the organization, scale, or distribution of the data. Developers and authors such as those featured in this section find innovative approaches to address these and other issues to achieve unprecedented performance. Based on these results and the breadth of their applicability, extensive use of GPUs for computation in signal and audio processing should be expected in the future.
In This Section
Chapter 37, written by Jike Chong, Ekaterina Gonina, and Kurt Keutzer, discusses GPU-accelerated automatic speech recognition (ASR), the process of transcribing acoustic waveforms to word sequences in text form. Particular challenges addressed are handling irregular graph structures, eliminating redundant work, performing conflict-free reduction in graph traversal, and constructing a global queue in parallel. The effective techniques presented to address these bode well for applications such as automatic meeting transcription, news broadcast transcription, and voice-activated multimedia in home entertainment systems.
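To give a flavor of the last of these techniques, the following is a minimal CPU-side sketch, in NumPy rather than the chapter's actual CUDA code, of how a global queue can be built without write conflicts: an exclusive prefix sum over per-vertex flags assigns each active vertex a unique output slot, so every "thread" can scatter its vertex ID independently. The function name and data are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def build_global_queue(active_flags, vertex_ids):
    """Collect the IDs of active vertices into a dense queue without
    write conflicts. An exclusive prefix sum over the per-vertex flags
    gives each active vertex a unique slot in the output array; on a
    GPU, each vertex would be handled by its own thread."""
    flags = np.asarray(active_flags, dtype=np.int64)
    # Phase 1: exclusive prefix sum assigns a unique output offset per vertex.
    offsets = np.cumsum(flags) - flags
    total = int(flags.sum())
    # Phase 2: each "thread" scatters its vertex ID into its private slot.
    queue = np.empty(total, dtype=np.int64)
    mask = flags == 1
    queue[offsets[mask]] = np.asarray(vertex_ids)[mask]
    return queue

# Example: vertices 11, 13, and 14 are active after a traversal step.
print(build_global_queue([0, 1, 0, 1, 1], [10, 11, 12, 13, 14]))
# -> [11 13 14]
```

Because the prefix sum makes every destination index distinct, no two writers ever touch the same slot, which is what removes the need for locks or atomics in the scatter phase.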
In Chapter 38, Gabriel Falcao, Vitor Silva, and Leonel Sousa present GPU-accelerated decoding of low-density parity-check (LDPC) codes for error correction. Using techniques such as compact data-structure representations for efficient data access and optimal thread coarsening to balance computation and memory access, the authors achieve impressive throughputs previously attainable only by fixed-purpose VLSI-based systems.
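The compact-representation idea can be sketched as follows. This is a hedged NumPy illustration, not the chapter's implementation: the sparse parity-check matrix H is replaced by a CSR-like pair of arrays (packed bit-node indices plus per-check offsets), so each check node reads only the entries it needs from contiguous memory. The small example matrix and the `syndrome` helper are illustrative assumptions.

```python
import numpy as np

# A toy parity-check matrix H (3 checks over 6 bits); real LDPC codes
# are far larger and much sparser, which is what makes a compact,
# contiguous representation pay off for memory access.
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 0, 0, 1, 1]])

# Build the compact structure once: for each check node, the indices of
# the bit nodes it connects to, packed contiguously (CSR-style).
nz_rows, nz_cols = np.nonzero(H)
row_ptr = np.searchsorted(nz_rows, np.arange(H.shape[0] + 1))
bit_index = nz_cols

def syndrome(hard_bits):
    """Parity of each check, computed from the compact structure alone,
    without ever touching the dense matrix H."""
    return np.array([hard_bits[bit_index[row_ptr[c]:row_ptr[c + 1]]].sum() % 2
                     for c in range(H.shape[0])])

codeword = np.array([1, 0, 1, 1, 1, 0])
print(syndrome(codeword))  # all-zero syndrome means every check passes
# -> [0 0 0]
```

A message-passing decoder iterates over exactly these packed index ranges, which is why the layout of `bit_index` and `row_ptr` largely determines the memory-access pattern on the GPU.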
In Chapter 39, Yifeng Chen, Xiang Cui, and Hong Mei detail the acceleration of large-scale FFTs without data locality on GPU clusters as an example of a class of processing problems. Such tasks are harder to accelerate given the bottlenecks imposed by the PCIe bus between main memory and GPU device memory and by the communication network between workstation nodes. The authors mitigate these hurdles and achieve significant speedups using techniques such as manipulating array dimensions during data transfer.
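The intuition behind manipulating array dimensions during transfer can be seen in a small NumPy sketch, offered here as an illustrative analogy rather than the chapter's code. A strided slice of a row-major array decomposes into many small contiguous runs, each of which would be a separate DMA transfer over PCIe; reorganizing the dimensions first turns the same data into one contiguous block that moves in a single large transfer. The array shape and names are assumptions made for the example.

```python
import numpy as np

# A row-major 2-D array standing in for a buffer staged for transfer.
a = np.arange(8 * 1024).reshape(8, 1024)

# Sending the column block a[:, :4] from this layout needs one transfer
# per row: 8 scattered runs of 4 elements each.
col_block = a[:, :4]
print(col_block.flags['C_CONTIGUOUS'])   # False: the data is strided

# Swapping the dimensions first (a transpose, done during packing)
# makes the same elements one contiguous buffer, so a single large
# transfer suffices.
packed = np.ascontiguousarray(a.T[:4, :])
print(packed.flags['C_CONTIGUOUS'])      # True: one contiguous block

# Same data, different layout.
assert np.array_equal(packed, col_block.T)
```

Since PCIe throughput for many small transfers is far below that of one large transfer of the same total size, this kind of layout change can dominate end-to-end performance even though it moves no additional data.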