Listings

1.1 Add two vectors in C, with implied serial ordering

1.2 Overlapping (aliased) arguments in C

1.3 Add two vectors using Cilk Plus array notation

1.4 An ordered sum creates a dependence in C

1.5 A parallel sum, expressed as a reduction operation in Cilk Plus

1.6 Function calls with step-by-step ordering specified in C

1.7 Function calls with no required ordering in Cilk Plus

1.8 Serial vector addition coded as a loop in C

1.9 Parallel vector addition using Cilk Plus

1.10 Parallel vector addition using ArBB

1.11 Scalar function for addition in C

1.12 Vectorized function for addition in Cilk Plus

1.13 Serial Fibonacci computation in C

1.14 Parallel Cilk Plus variant of Listing 1.13

1.15 Vector computation in ArBB

1.16 Elemental function computation in ArBB

3.1 Serial sequence in pseudocode

3.2 Serial sequence, second example, in pseudocode

3.3 Serial selection in pseudocode

3.4 Iteration using a while loop in pseudocode

3.5 Iteration using a for loop in pseudocode

3.6 Demonstration of while/for equivalence in pseudocode

3.7 A difficult example in C

3.8 Another difficult example in C

3.9 Serial implementation of reduction

3.10 Serial implementation of scan

3.11 Superscalar sequence in pseudocode

4.1 Serial implementation of SAXPY in C

4.2 Tiled implementation of SAXPY in TBB

4.3 SAXPY in Cilk Plus using cilk_for

4.4 SAXPY in Cilk Plus using cilk_for and array notation for explicit vectorization

4.5 SAXPY in OpenMP

4.6 SAXPY in ArBB, using a vector expression

4.7 SAXPY in ArBB, using binding code for vector expression implementation

4.8 SAXPY in ArBB, using an elemental function

4.9 SAXPY in ArBB, call operation

4.10 SAXPY in OpenCL kernel language

4.11 Serial implementation of Mandelbrot in C

4.12 Tiled implementation of Mandelbrot in TBB

4.13 Mandelbrot using cilk_for in Cilk Plus

4.14 Mandelbrot in Cilk Plus using cilk_for and array notation for explicit vectorization

4.15 Mandelbrot in OpenMP

4.16 Mandelbrot elemental function for ArBB map operation

4.17 Mandelbrot call code for ArBB implementation

4.18 Mandelbrot binding code for ArBB implementation

4.19 Mandelbrot kernel code for OpenCL implementation

5.1 Serial reduction in C++ for 0 or more elements

5.2 Serial reduction in C++ for 1 or more elements

5.3 Serial implementation of dot product in C++

5.4 Vectorized dot product implemented using SSE intrinsics

5.5 Dot product implemented in TBB

5.6 Modification of Listing 5.5 with double-precision operations

5.7 Dot product implemented in Cilk Plus array notation

5.8 Dot product implementation in Cilk Plus using explicit tiling

5.9 Modification of Listing 5.8 with double-precision operations for multiplication and accumulation

5.10 Dot product implemented in OpenMP

5.11 Dot product implemented in ArBB

5.12 Dot product implementation in ArBB, wrapper code

5.13 High-precision dot product implemented in ArBB

5.14 Serial implementation of inclusive scan in C++

5.15 Serial implementation of exclusive scan in C++

5.16 Three-phase tiled implementation of a scan in OpenMP

5.17 Serial integrated table preparation in C++

5.18 Generic test function for integration

5.19 Concrete instantiation of test function for integration

5.20 Serial implementation of integrated table lookup in C++

5.21 Integrated table preparation in Cilk Plus

5.22 Integrated table preparation in TBB

5.23 Integrated table preparation in ArBB

5.24 Integrated table lookup in ArBB

6.1 Serial implementation of gather in pseudocode

6.2 Serial implementation of scatter in pseudocode

6.3 Array of structures (AoS) data organization

6.4 Structure of arrays (SoA) data organization

7.1 Serial implementation of stencil

7.2 Serial 2D recurrence

8.1 Recursive implementation of the map pattern in Cilk Plus

8.2 Modification of Listing 8.1 that changes tail call into a goto

8.3 Cleaned-up semi-recursive map in Cilk Plus

8.4 Three loop forms illustrating steal-continuation versus steal-child

8.5 Flat algorithm for polynomial multiplication using Cilk Plus array notation

8.6 Karatsuba multiplication in Cilk Plus

8.7 Type for scratch space

8.8 Pseudocode for recursive matrix multiplication

8.9 Code shared by Quicksort implementations

8.10 Fully recursive parallel Quicksort using Cilk Plus

8.11 Semi-recursive parallel Quicksort using Cilk Plus

8.12 Semi-iterative parallel Quicksort using TBB

8.13 Quicksort in TBB that achieves Cilk Plus space guarantee

8.14 Recursive implementation of parallel reduction in Cilk Plus

8.15 Using a hyperobject to avoid a race in Cilk Plus

8.16 Using a local reducer in Cilk Plus

8.17 Top-level code for tiled parallel scan

8.18 Upsweep phase for tiled parallel scan in Cilk Plus

8.19 Downsweep phase for tiled parallel scan in Cilk Plus

8.20 Implementing pack pattern with cilk_scan from Listing 8.17

8.21 Base case for evaluating a diamond of lattice points

8.22 Code for parallel recursive evaluation of binomial lattice in Cilk Plus

8.23 Marching over diamonds in Cilk Plus

9.1 Serial implementation of a pipeline

9.2 Pipeline in TBB

9.3 Pipeline in Cilk Plus equivalent to the serial pipeline in Listing 9.1

9.4 Defining a reducer for serializing consumption of items in Cilk Plus

10.1 Serial code for simulating wavefield

10.2 Code for one-dimensional iterated stencil

10.3 Base case for applying stencil to space–time trapezoid

10.4 Parallel cache-oblivious trapezoid decomposition in Cilk Plus

10.5 ArBB code for simulating a wavefield

11.1 K-means clustering in Cilk Plus

11.2 Type sum_and_count for computing mean of points in a cluster

11.3 Defining a hyperobject for summing an array elementwise in Cilk Plus

11.4 Declaring a type tls_type for thread-local views in TBB

11.5 Walking local views to detect changes

11.6 Walking local views to accumulate a global sum

11.7 Routine for finding index of centroid closest to a given point

11.8 K-means clustering in TBB

12.1 Declarations for bzip2 pipeline

12.2 Use of TBB parallel_pipeline to coordinate bzip2 actions

12.3 Sketch of bzip2 pipeline in Cilk Plus using a consumer reducer

13.1 Serial merge

13.2 Parallel merge in Cilk Plus

13.3 Converting parallel merge from Cilk Plus to TBB

13.4 Parallel merge sort in Cilk Plus

14.1 Top-level code for parallel sample sort

14.2 Code for mapping keys to bins

14.3 Parallel binning of keys using Cilk Plus

14.4 Repacking and subsorting using Cilk Plus

14.5 Using Cilk Plus to move and destroy a sequence, without an explicit loop!

15.1 Recursive Cholesky decomposition

15.2 Parallel triangular solve in Cilk Plus

15.3 Parallel symmetric rank update in Cilk Plus

15.4 Converting parallel symmetric rank update from Cilk Plus to TBB

B.1 Simple example use of cilk_for

B.2 Examples of using cilk_spawn and cilk_sync

B.3 Serial reduction in C++ and equivalent Cilk Plus code

B.4 Serial reduction in C and equivalent Cilk Plus code

B.5 Example of using __sec_reduce to reduce over string concatenation

B.6 Defining an elemental function

B.7 Calling an elemental function from a vectorizable loop

C.1 Example of affinity_partitioner

C.2 Using task_group

C.3 Example of using atomic<int> as a counter

C.4 Using atomic operations on a list

D.1 Using a manually written functor comparator

D.2 Using a lambda expression lets Listing D.1 be rewritten more concisely

D.3 Mixed capture with handwritten functor

D.4 Mixed capture modes
