10.4 Structural Optimization

In this category of optimization methods, the overall structure of the code is analyzed. Generally, such optimization results in reduction of both Time and Space. For the best optimization, a full IR in the form of AST is required.

10.4.1 Redundant Sub-expressions

Also called Elimination of common sub-expressions, this method does a pattern analysis of the IR and searches for expressions which are used repeatedly. They need to be computed only once and then the result is utilized wherever they are used later. For example, a * b + a * b. The 4-tuple matrix is:

i: LD   a   T1  −−
   MUL  b   T1  T2
   LD   a   T3	−−
   MUL  b   T3	T4
   ADD  T2  T4	T5

and can be reduced to:

i: LD   a   T1  −−
   MUL  b   T1  T2
   ADD  T2  T2  T3

Here, the common sub-expression was in the same main expression, but a good structural optimizer would detect and eliminate them across different expressions. For example, consider

c = a * b;
…  …  …
d = a * b + a * b

This can be optimized to:

c = a * b;
…  …  …
d = c << 1;

If the common sub-expression invokes a function call, then the speed gain is even higher; consider

c = sqrt(1 − sin(x) * sin(x));

which can be rewritten as

t = sin(x);
c = sqrt(1 − t * t);

10.4.2 Loop Unwinding

Replace a loop over a small number of iterations by equivalent linear code. For example,

for(i = 0; i < 5; i++) a[i] = i;

is replaced by

a[0] = 0;
a[1] = 1;
a[2] = 2;
etc.

Here the compiler can pre-compute the addresses a[0], a[1], etc. and avoid generation of run-time array addressing code. Also, the loop set-up and loop-control are eliminated as well.

10.4.3 Replace Index by Pointers

In this method, we replace code segments like a[i] by *(a + i), b[i][j] by *(b + jmax * i + j), which is much faster, because the compiler can take advantage of constant propagation, folding, etc. The advantage is more evident if a complete array is being traversed, because the run-time indexing code can be replaced by incrementation of the pointer value. However on processors like x86, where indexing by more than one register is available, this method does not hold much significant advantage.

10.4.4 Code Motion

Moving a code portion not dependent on the iterations of a loop, called loop-invariant code, out of the loop can result in improved efficiency. For example,

for(i = 0; i < 100; i++) a[i] = numer/denom;

Here, the RHS of the assignment is a loop-invariant, so it can be moved out of the loop:

temp = numer/denom;
for(i = 0; i < 100; i++) a[i] = temp;

10.4.5 Variable Folding

Instructions of the form a = b; will become redundant if we replace all subsequent a by b, till b is changed or a is processed further.

10.4.6 In-line Function

A function call can be replaced by the body of the function, if it is short enough. The deciding factor is the cost of the code in the calling convention. It is quite a common practice in C programming, for example,

#define min(x,y) ((x) < (y)?(x):(y))

where we have defined a macro which will be reproduced everywhere min(a,b) is called.

10.4.7 Register Allocation

Though this method is really a structural optimization method, it is machine-dependent optimization. The idea is:

  • Put temporary values which are being used frequently in processor registers.
  • Save registers on a subroutine call and restore then on return from subroutine.
  • Life-time analysis is required.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.15.26.221