10.6 Super-optimizers

Although code optimization certainly improves the code, it does not deliver optimal final code in the sense of the minimum possible time and space requirements.

Super-optimizers are an attempt to develop optimization strategies that reach this theoretical limit. Since, in the final analysis, the fastest or smallest program can be written only with reference to a particular processor, super-optimizers are in general processor specific. The general approach is to find the fastest or shortest sequence of instructions for a particular processor which is equivalent – in terms of its semantics – to the given raw target code. As the example given below will show, the instruction sequence which emerges may look very different from the original target code.

10.6.1 Massalin's Super-optimizer

One of the best known super-optimizers was developed by Henry Massalin. It does an exhaustive search over instruction sequences, starting with single instructions, then sequences of two instructions, and so on, until an optimum sequence equivalent to the given target code is found. Here, equivalence of code sequences means that for a given set of values in the processor registers and a given initial processor state, both sequences leave the processor with the same values in its registers and the same final state.
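The flavour of this search can be conveyed by a small, self-contained C sketch. It is not Massalin's program – the candidate "instructions" here are just unary integer operations, and the goal function, test values, and all names are invented for the illustration – but it shows the shortest-first enumeration and the test-vector notion of equivalence:

/* Toy exhaustive search: candidate "instructions" are simple unary
   integer operations; sequences of increasing length are tested for
   equivalence with a goal function on a handful of sample inputs.
   The goal here is lts0: return 1 if x < 0, else 0. */
#include <stdio.h>

#define NOPS   5
#define MAXLEN 4
#define NTESTS 7

static int goal(int x) { return x < 0; }   /* the goal function */

/* The tiny instruction set: each op transforms the register r. */
static int apply(int op, int r)
{
    switch (op) {
    case 0:  return -r;        /* neg */
    case 1:  return ~r;        /* not */
    case 2:  return r >> 31;   /* arithmetic shift (implementation-
                                  defined for r < 0 in standard C, but
                                  arithmetic on mainstream compilers) */
    case 3:  return r | 1;     /* or 1 */
    default: return r + 1;     /* inc */
    }
}

static const char *names[NOPS] = { "neg", "not", "sar31", "or1", "inc" };

/* Sample inputs playing the role of Massalin's carefully chosen
   initial register values. */
static const int tests[NTESTS] = { 0, 1, -1, 2, -2, 12345, -12345 };

/* Does the candidate sequence agree with goal() on every test value? */
static int equivalent(const int *seq, int len)
{
    for (int t = 0; t < NTESTS; t++) {
        int r = tests[t];
        for (int i = 0; i < len; i++)
            r = apply(seq[i], r);
        if (r != goal(tests[t]))
            return 0;
    }
    return 1;
}

/* Depth-first enumeration of all op sequences of the given length. */
static int search(int *seq, int pos, int len)
{
    if (pos == len)
        return equivalent(seq, len);
    for (int op = 0; op < NOPS; op++) {
        seq[pos] = op;
        if (search(seq, pos + 1, len))
            return 1;
    }
    return 0;
}

int main(void)
{
    int seq[MAXLEN];
    for (int len = 1; len <= MAXLEN; len++)      /* shortest first */
        if (search(seq, 0, len)) {
            printf("found, length %d:", len);
            for (int i = 0; i < len; i++)
                printf(" %s", names[seq[i]]);
            printf("\n");                        /* prints: sar31 neg */
            return 0;
        }
    printf("no equivalent sequence up to length %d\n", MAXLEN);
    return 0;
}

On this toy instruction set the search discovers the two-operation sequence sar31, neg for the goal x < 0; a real super-optimizer enumerates actual machine instructions and machine state in exactly this shortest-first fashion.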

A little thought will tell you that this approach can quickly run into combinatorial explosion, because the search space grows exponentially with the length of the candidate instruction sequences. Some form of tree pruning has to be used.

Massalin adopted an approach based on statistical sampling: by carefully choosing the initial values in the processor registers, equivalence or otherwise can be determined quickly. He gave the following function as an example:

int signum(int x) {
  if (x > 0) return 1;
  else if (x < 0) return -1;
  else return 0;
}

A typical C compiler generates assembly code of about 8 instructions, including 4 jump instructions. A good optimizing compiler would generate about 6 instructions with 2 jumps. The jump instructions are the concern, because on modern pipelined processors with instruction look-ahead, a jump can degrade performance considerably.

Though Massalin originally worked with a Motorola processor, here is an equivalent x86 instruction sequence from the super-optimizer, with x assumed to be in eax:

cdq                  # sign-extend eax into edx: edx = -1 if x < 0, else 0
negl %eax            # eax = -x; sets the carry flag exactly when x != 0
adcl %edx, %edx      # edx = edx + edx + carry

The result is in edx. The most remarkable thing about this optimum code is that it contains no jumps! Massalin reported that the code generated by a super-optimizer makes clever use of status flags and tends not to use branches, which is a significant advantage on modern pipelined processors.
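The working of the sequence can be mirrored in C, which also makes its correctness easy to check. The following is a sketch under the assumption of a 32-bit two's-complement int with an arithmetic right shift (implementation-defined in standard C, but universal in practice); the function name is ours:

/* C rendering of the three-instruction sequence above. */
int signum_branchfree(int x)
{
    int sign  = x >> 31;         /* cdq:  -1 if x < 0, else 0        */
    int carry = (x != 0);        /* negl sets CF exactly when x != 0 */
    return sign + sign + carry;  /* adcl %edx, %edx                  */
}

For x > 0 this gives 0 + 0 + 1 = 1; for x < 0, (-1) + (-1) + 1 = -1; and for x = 0, 0 + 0 + 0 = 0.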

Though Massalin's super-optimizer gives significantly better code, it has, at least at present, three major limitations – first, the approach is viable only for short sequences of raw code, say 8 to 12 instructions; second, the generated code has to be checked for correctness by hand, as the equivalence testing is statistical; and third, the candidate instruction sequences have to be worked out afresh for each processor architecture.

10.6.2 GNU GCC Super-optimizer – GSO

The GNU GCC super-optimizer – GSO – is an extension of the GNU Compiler Collection. Many short sequences of code occur quite frequently in compiled code, for example the absolute value or the minimum of two values. The GNU super-optimizer works by precomputing optimum target code for such common sequences, called goal functions.

Goal functions are specified in C as expressions relating arguments named v0, v1, etc., to a result in a variable called r. For example, the goal function eq0, which returns 1 if its single argument is 0 and 0 if the argument is non-zero, is defined as:

r = v0 == 0;
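In the same notation, other goal functions are single C expressions over v0, v1, and r. For instance, mins and abs could plausibly be written as follows (illustrative renderings of the stated convention, not necessarily the exact definitions in GSO's sources):

r = (signed) v0 < (signed) v1 ? v0 : v1;   /* mins */
r = (signed) v0 < 0 ? -v0 : v0;            /* abs  */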

The following is a list of some of the goal functions currently supported in GSO:

eq and ne: Two-operand equality comparisons.

les, ges, lts, and gts: Two-operand signed inequality comparisons.

leu, geu, ltu, and gtu: Two-operand unsigned inequality comparisons.

eq0, ne0, les0, ges0, lts0, and gts0: Single-operand comparisons against zero. As discussed above, the unsigned comparisons against zero are degenerate.

neq, nne, nles, nges, nlts, ngts, nleu, ngeu, nltu, ngtu, neq0, nne0, nles0, nges0, nlts0, ngts0: Same as above except that they return negative one for true and zero for false.

naeq, nane, nales, nages, nalts, nagts, naleu, nageu, naltu, nagtu, naeq0, nane0, nales0, nages0, nalts0, and nagts0: Three-operand functions that compute the logical and of one operand with the negated result of the comparison of the other two operands. An example was shown at the start of this section.

peq, pne, ples, pges, plts, pgts, pleu, pgeu, pltu, pgtu, peq0, pne0, ples0, pges0, plts0, pgts0: These functions add one operand to the result of comparing the other two operands. For example, plts(a, b, c) would be written ((signed) a < (signed) b) + c in C.

mins, maxs, minu, and maxu: Signed and unsigned min and max operations.

sgn: Returns positive one if the single operand is positive, zero if the operand is zero, and negative one if the operand is negative.

abs: Absolute value function.

nabs: Negated absolute value function.
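To make the pay-off concrete, here are C sketches of the kind of branch-free code a super-optimizer typically discovers for two of these goal functions, assuming a 32-bit two's-complement int with an arithmetic right shift (these are illustrations, not GSO's actual output):

/* eq0: (x | -x) has its sign bit set exactly when x != 0, so the
   arithmetic shift gives -1 for non-zero x and 0 for zero.  (Strictly,
   -x overflows for the most negative int; a careful version would use
   unsigned arithmetic.) */
int eq0(int x) { return ((x | -x) >> 31) + 1; }

/* sgn: compilers lower the two comparisons to flag-setting
   instructions (setcc on x86) rather than branches. */
int sgn(int x) { return (x > 0) - (x < 0); }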
