Predication

Predication is a technique for managing the branches in a program, and it is a good example of the architecture's ability to do multiple things in parallel. A typical out-of-order execution engine maintains two program counters at the same time. One tracks the instruction currently being executed, while the other looks ahead at instructions that will be executed later. The engine builds a pipeline, or queue, of those upcoming instructions so that they are ready when their turn comes. Figure 8-1 illustrates the process.

Figure 8-1. Predication Process Illustrated


This whole process makes the chip extremely efficient, since it focuses on the task at hand while at the same time plotting out what it will handle next. The objective is to have the instructions and data immediately available to the processor(s) so that the high-speed computing components never have to wait for anything. Any such wait is a 'stall' in the system. To better understand predication, however, we first have to look at how a typical RISC machine handles branches today in an effort to keep the processor busy.

Branch Prediction Tables

Programs contain many branches, or if–then–else conditions, which change the values fed to the processor depending on which branch is selected. A decision construct in code typically takes an if–then–else form, such as

if a = b, then
   x = a – b
else
   y = a + b
end

In assembly code, this becomes a compare and a conditional branch, followed by either the subtraction or the addition, and then a jump back to the code that follows the construct.

In a RISC machine (and in an EPIC machine), a branch prediction table resides on the chip itself. If the table notes that a branch goes to A 78 percent of the time and to B only 22 percent of the time, the processor builds the pipeline for direction A. Figure 8-2 shows a visual example of how this works.

Figure 8-2. Prediction Process Code Sample
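
The table lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea: the table layout, the address 0x1000, and the function names are assumptions made for the example, and real hardware typically keeps far more compact state than explicit counts.

```python
# Hypothetical prediction table: branch address -> how often each
# direction has been observed. The 78/22 split mirrors the text.
prediction_table = {
    0x1000: {"A": 78, "B": 22},
}


def predict_direction(table, branch_address):
    """Return the direction this branch has taken most often."""
    counts = table[branch_address]
    return max(counts, key=counts.get)


def share_of_predicted_direction(table, branch_address):
    """Return the fraction of past outcomes that match the prediction."""
    counts = table[branch_address]
    return max(counts.values()) / sum(counts.values())


# The pipeline would be built for this direction.
chosen = predict_direction(prediction_table, 0x1000)
```

Here `chosen` is "A", so the look-ahead work is done for direction A, and it pays off only for the share of executions that actually go that way.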


There are two forms of branch prediction: static and dynamic. Static branch prediction takes place at compile time: the compiler gathers information from the code that lets it determine which branch direction is selected the majority of the time. This information is used to build the initial branch prediction table.

Dynamic branch prediction takes place on the chip in real time. As the chip executes the program, it notes inaccuracies in the static branch table (for example, where the table predicts that the result is A 75 percent of the time when in practice it is usually B). The chip can dynamically update the table to reflect the accurate information and restore the higher level of efficiency needed to run at the best speed. This is especially important given that programs themselves are not static. Depending on the conditions under which the program is running, the assumptions built into the branch prediction table could differ significantly from what is actually needed.
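
One common way to implement this kind of run-time correction is a two-bit saturating counter per branch. The text does not say which scheme a particular chip uses, so the sketch below is an assumption chosen for illustration; the class name, the `static_hints` parameter, and the address 0x40 are all hypothetical.

```python
class TwoBitPredictor:
    """Toy dynamic predictor: one 2-bit saturating counter per branch.

    Counter values 0 and 1 predict "not taken"; 2 and 3 predict "taken".
    """

    def __init__(self, static_hints=None):
        # static_hints stands in for the compiler's initial predictions:
        # a mapping of branch address -> True (taken) / False (not taken).
        self.counters = {}
        for address, taken in (static_hints or {}).items():
            self.counters[address] = 2 if taken else 1

    def predict(self, address):
        # Branches never seen before default to "not taken".
        return self.counters.get(address, 1) >= 2

    def update(self, address, taken):
        # Move the counter toward the observed outcome, staying in 0..3.
        value = self.counters.get(address, 1)
        value = min(value + 1, 3) if taken else max(value - 1, 0)
        self.counters[address] = value


# The compiler guessed "taken", but at run time the branch is mostly not taken.
predictor = TwoBitPredictor(static_hints={0x40: True})
outcomes = [False, False, False, True, False]

history = []
for taken in outcomes:
    history.append(predictor.predict(0x40))  # prediction made before the outcome is known
    predictor.update(0x40, taken)            # table corrected afterward
```

In this run only the first prediction follows the static hint. After a single miss the counter flips to "not taken", and the one isolated taken outcome is not enough to flip it back.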

The problem occurs when the branch prediction is wrong, because all of the work the processor did by looking ahead and executing instructions beyond the branch is lost. If a pipeline has been built for the path that is not taken, it is dumped and the program must start executing the other path, which wastes time and cancels the advantage of processing items in parallel.
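
The cost of a wrong guess can be made concrete with a toy model. The numbers here are invented for illustration: one cycle per branch and a flush penalty of 10 cycles are assumptions, not figures for any real processor, and `cycles_spent` is a hypothetical helper.

```python
def cycles_spent(predictions, outcomes, flush_penalty):
    """Toy cost model: 1 cycle per branch, plus a penalty for each miss."""
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have the same length")
    total = 0
    for predicted, actual in zip(predictions, outcomes):
        total += 1
        if predicted != actual:
            # The look-ahead work for the wrong path is dumped.
            total += flush_penalty
    return total


outcomes = ["A"] * 8 + ["B"] * 2   # the branch really goes to A 8 times out of 10
always_a = ["A"] * 10              # predictor that always builds the pipeline for A
perfect = list(outcomes)           # idealized predictor that is never wrong

cost_always_a = cycles_spent(always_a, outcomes, flush_penalty=10)
cost_perfect = cycles_spent(perfect, outcomes, flush_penalty=10)
```

Under these assumed numbers, two misses out of ten branches triple the cost (30 cycles instead of 10), which is why even a fairly accurate predictor leaves performance on the table.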

Predication allows the elimination of this type of branch. In the example above, suppose that both the if and else statements (x = a – b; y = a + b) can be executed in parallel. In that case, a predicate register determines which of the two results is used as the program proceeds to the next step.
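
The semantics of that example can be modeled in Python as follows. This is a sketch of the behavior only, not of the hardware: the function and variable names are hypothetical, and Python's conditional expression merely stands in for the predicate-controlled commit that a predicated machine performs without a branch.

```python
def branching_if_else(a, b, x, y):
    """Conventional form: only one arm runs, chosen by a branch."""
    if a == b:
        x = a - b
    else:
        y = a + b
    return x, y


def predicated_if_else(a, b, x, y):
    """Predicated form: both arms run; predicates decide what is kept."""
    # The compare produces a predicate and its complement.
    p_true = (a == b)
    p_false = not p_true

    # Both arms are computed unconditionally, so they can run in parallel.
    x_candidate = a - b
    y_candidate = a + b

    # Each result is committed only if its predicate is true;
    # otherwise the old value survives and the computed one is discarded.
    x = x_candidate if p_true else x
    y = y_candidate if p_false else y
    return x, y


equal_case = predicated_if_else(3, 3, x=100, y=200)
unequal_case = predicated_if_else(5, 2, x=100, y=200)
```

Both functions return the same values for the same inputs. The difference is that the predicated form has no point at which the processor must guess a direction, so there is nothing to mispredict.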

The key benefit of predication is that it eliminates branches, thereby reducing the performance penalties associated with mispredicted branches. Predication provides a mechanism for removing, wherever possible, conditional branches that would otherwise interrupt instruction-level parallelism.

Branch predication removes branches by executing both paths and discarding the result of the path that turns out not to be needed. It is an important technique for maximizing parallelism, because mispredictions cause many lost opportunities to issue new instructions. Predication can take different forms depending on the function being performed.
