A work-efficient parallel prefix algorithm

Before we continue with our new algorithm, we'll look at the naive algorithm from two perspectives. In an ideal case, the computational time complexity is O(log n), but this is only when we have a sufficient number of processors for our data set; when the cardinality (number of elements) of our dataset, n, is much larger than the number of processors, we have that this becomes an O(n log n) time algorithm.

Let's define a new concept with relation to our binary operator —the work performed by a parallel algorithm here is the number of invocations of this operator across all threads for the duration of the execution. Similarly, the span is the number of invocations a thread makes in the duration of execution of the kernel; while the span of the whole algorithm is the same as the longest span among each individual thread, which will tell us the total execution time.

We seek to specifically reduce the amount of work performed by the algorithm across all threads, rather than focus merely span. In the case of the naive prefix, the additional work that is required costs a more time when the number of available processors falls short; this extra work will just spill over into the limited number of processors available.

We'll present a new algorithm that is work efficient, and hence more suitable for a limited number of processors. This consists of two separate two distinct partsthe up-sweep (or reduce) phase and the down-sweep phase. We should also note the algorithm we'll see is an exclusive prefix algorithm.

The up-sweep phase is similar to a single reduce operation to produce the value that is given by the reduce algorithm, that is  ; in this case we retain the partial sums () that are required the achieve the end result. The down-sweep phase will then operate on these partial sums and give us the final result. Let's look at some pseudocode, starting with the up-sweep phase. (The next subsection will then dive into the implementation from the pseudocode immediately.)

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.90.182