Chapter 10

Code Optimization

Code optimization is an important phase that improves the time and space requirements of the generated target code. Given code in an intermediate form, it applies optimizing transformations, reduces the code size, and returns the improved code, still in intermediate form.

CHAPTER OUTLINE 

The code optimization phase is an optional phase of a compiler; it can be placed either before or after the code generation phase. This chapter focuses on the types of optimizers and the techniques available for optimization. It also covers global flow analysis, a technique for gathering information about how data and control flow through the program, which is needed to apply global optimization.

10.1  Introduction

It is possible for a programmer to outperform an optimizing compiler by using his or her knowledge to choose a better algorithm and better data structures, but this is rarely practical. In such an environment, there is a need for an optimizing compiler. Optimization is the process of transforming code into an efficient form, where efficiency is measured in terms of space requirement and execution time, without changing the meaning of the given code. Optimization is considered an important phase for the following practical reasons:

  • Inefficient programming, which forces many invisible instructions to be performed for the actual computation, e.g., a = a + 0.
  • Programming constructs used for ease of programming, e.g., iterative loops.
  • Compiler-generated temporary variables and instructions.

The following constraints are to be considered while applying the techniques for code improvement:

  • The transformation must preserve the meaning of the program, that is, the target code should ensure semantic equivalence with source program.
  • Program efficiency must be improved by a measurable amount without changing the algorithm used in the program.
  • When a technique applies only to code in a special format, it is worth transforming the code to that format only when the benefit is large enough.
  • Some transformations are applied only after a detailed analysis, which is time consuming. Such analysis may not be worthwhile if the program is run only a few times.

The optimization can be classified depending on

  • Level of code.
    • Design level—efficiency of code can be improved by making the best use of available resources and selection of suitable algorithm.
    • Source code level—the user can modify the program and change the algorithm to enhance the performance of the object code.
    • Compile level—the compiler can enhance the program by improving loops and by optimizing procedure calls and address calculations. This is possible when the representation is in three-address code.
    • Assembly level—the compiler optimizes the code based on the machine architecture, making use of the available registers and suitable addressing modes.
  • Programming language
    • Machine Independent—the code-improvement techniques that do not consider the features of the target machine are machine-independent techniques. Constant folding, dead code elimination, and constant propagation are examples of machine-independent techniques. These are applied on either high-level language or intermediate representation.
    • Machine dependent—these techniques require specific information relating to target machine. Register allocation, strength reduction, and use of machine idioms are examples of machine-dependent techniques.
  • Scope
    • Local—optimizations performed within a single basic block are termed local optimizations. These techniques are simple to implement and do not require any analysis, since no information is needed about how data and control flow across blocks.
    • Global—optimizations performed across basic blocks are called global optimizations. These techniques are more complex, as they require additional analysis across basic blocks. This analysis is called data-flow analysis.

Optimization is the field where most compiler research is done today. High-quality optimization is more of an art than a science. Compilers for mature languages are judged based upon the quality of the object code generated and not based on how well they parse or analyze the code.

Example 1:

Consider the following example, which sets every element of an array to 1.

a) int array_ele[10000];

   void Binky() {
       int i;
       for (i = 0; i < 10000; i++)
           array_ele[i] = 1;
   }

b) int array_ele[10000];

   void Winky() {
       register int *p;
       for (p = array_ele; p < array_ele + 10000; p++)
           *p = 1;
   }

Looking at the two versions, one may think the second is faster than the first. This may be true with a compiler that does not optimize, but many modern compilers emit the same object code for both by using techniques such as “loop-induction variable elimination.”

10.2  Where and How to Optimize

Optimization techniques can be applied to intermediate code or to the final target code. Optimization is a complex and time-consuming task that involves multiple sub-phases, sometimes applied more than once. Most compilers allow optimization to be turned off to speed up the compilation process; for example, gcc provides specific flags that turn individual optimizations on or off. To apply optimization, it is important to perform control-flow analysis and data-flow analysis, followed by transformations, as shown in Figure 10.1.

Control Flow Analysis: It determines the control structure of a program and builds a control flow graph.

Data Flow Analysis: It determines the flow of scalar values and builds data flow graphs. The solution to flow analysis propagates data flow information along a flow graph.

Transformation: Transformations help in improving the code without changing the meaning or functionality.

Figure 10.1 Code Optimization Model

Flow Graph

A graphical representation of three-address code is called a flow graph. Each node in the flow graph represents a single basic block and the edges represent the flow of control. Flow graphs are useful in performing control-flow and data-flow analysis, in applying local or global optimization, and in code generation.

Basic Block

A basic block is a sequence of consecutive statements that are executed sequentially: once control enters the block, every statement in it is executed one after the other before control leaves the block.

Example 2: For the statement a = b + c * d/e the corresponding set of three address code is

 

      t1 = c * d

      t2 = t1 / e

      t3 = b + t2

      a = t3

 

All these statements correspond to a single basic block.

10.3  Procedure to Identify the Basic Blocks

Given three-address code, first identify the leader statements; each basic block then consists of a leader together with the statements up to, but not including, the next leader. To identify the leaders, use the following rules:

  1. First statement in the program is a leader.
  2. Any statement that is the target of a conditional or unconditional jump is a leader statement.
  3. Any statement that immediately follows a conditional or unconditional jump is a leader statement.
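These rules translate directly into code. The following is a minimal C sketch assuming the three-address program is stored as an array of quadruples with 1-based indexing; the Quad type and its fields are assumptions made only for this illustration.

#include <stdbool.h>

/* Hypothetical quad: for jump statements, target holds the 1-based
   index of the statement jumped to.                                  */
typedef struct {
    bool is_jump;   /* true for conditional or unconditional jumps    */
    int  target;    /* meaningful only when is_jump is true           */
    /* ... operator and operand fields omitted ...                    */
} Quad;

/* Mark leaders[i] = true for every leader of code[1..n],
   following rules 1-3 of Section 10.3.                               */
void find_leaders(const Quad *code, int n, bool *leaders)
{
    for (int i = 1; i <= n; i++)
        leaders[i] = false;

    if (n >= 1)
        leaders[1] = true;                    /* rule 1               */

    for (int i = 1; i <= n; i++) {
        if (code[i].is_jump) {
            leaders[code[i].target] = true;   /* rule 2: jump target  */
            if (i + 1 <= n)
                leaders[i + 1] = true;        /* rule 3: next statement */
        }
    }
}

Running this sketch on the three-address code of Example 3 below would mark statements 1, 3, 5, and 12 as leaders.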

Example 3: Identify the basic blocks for the following code fragment.

main()
{
    int i = 0, n = 10;
    int a[n];

    while (i <= (n - 1))
    {
        a[i] = i * i;
        i = i + 1;
    }
    return;
}

The three-address code for the above code fragment is as follows:

  1.  i: = 0
  2. n: = 10
  3. t1: = n – 1
  4. If i > t1 goto (12)
  5.  t2: = i * i
  6.  t3: = 4 * i
  7. t4: = a[ t3]
  8. t4: = t2
  9. t5: = i + 1
  10. i: = t5
  11. goto (3)
  12. return

Identifying leader statements in the above three address code

      Statement (1) is a leader by rule 1

      Statements (3) and (12) are leaders by rule 2

      Statements (5) and (12) are leaders by rule 3

  1. i: = 0                 → Leader 1
  2. n: = 10
  3. t1 : = n – 1        → Leader 2
  4. If i > t1 go to (12)
  5. t2: = i * i          → Leader 3
  6. t3: = 4 * i
  7. t4: = a[ t3 ]
  8. t4: = t2
  9. t5: = i + 1
  10. i: = t5
  11. go to (3)
  12.  Return   → Leader 4

Basic block 1 includes statements (1) and (2)

Basic block 2 includes statements (3) and (4)

Basic block 3 includes statements (5)–(11)

Basic block 4 includes statement (12)

Basic blocks are shown in Figure 10.2.

10.4  Flow Graph

A flow graph shows the relation between a basic block and its predecessor and successor blocks. The block containing the first statement is B1. An edge is placed from block B1 to block B2 if B2 can immediately follow B1 during execution, that is, if one of the following conditions holds:

  • the last statement of B1 is a conditional or unconditional jump to the first statement of B2, or
  • B2 immediately follows B1 in the order of the program and B1 does not end in an unconditional jump.

Flow graph for Figure 10.2 is shown in Figure 10.3.

Figure 10.2 Basic Blocks

Figure 10.3 Flow Graph for Example 3

10.5  DAG Representation of Basic Block

A DAG is a useful data structure for implementing transformations within a basic block. It gives a pictorial representation of how values computed by one statement are used in computing the values of other variables, and it is useful in identifying common sub-expressions within a basic block. The nodes of a DAG are labeled as follows:

  • Leaf nodes are labeled by identifiers or constants; for arithmetic operators the leaves represent the r-values of the operands.
  • Interior nodes are labeled by operator symbols and represent computed values.
  • A node may also carry a list of attached identifiers, namely the identifiers whose current values are held by that node.

Note: The DAG is not the same as a flow graph. Each node in a flow graph is a basic block, which has a set of statements that can be represented using DAG.

10.6  Construction of DAG

To construct DAG, we process each statement of the block.

If the statement is a copy statement, that is, a statement of the form a = b, then we do not create a new node, but append label a to the node with label b.

If the statement is of the form a = b op c, we first check whether there is already a node computing b op c with the same operands; if so, we append the label a to that node. If no such node exists, we make sure there are nodes for b and c (leaf nodes, or interior nodes if they were recently computed), then create a new node for op with node(b) as its left child and node(c) as its right child, and label this new node with a. This node now represents the current value of a for subsequent statements; hence, if a was attached to an earlier node, that earlier attachment is marked as a0.

DAG creation is easier if we maintain, for each identifier, the node most recently created for it, together with the facility to keep a linked list of attached identifiers for each node.

We also define a function that returns the most recently created node associated with an identifier.

10.6.1  Algorithm for Construction of DAG

Given a basic block we need to construct a DAG, which has the following information:

  • A label (identifier/operator) for each node.
  • For each node list of identifiers attached to it.

In the construction process, we process each statement and categorize it into one of the following cases:

  (i) a = b op c
  (ii) a = op b
  (iii) a = b

Relational statements are treated as case (i); for example, if i ≤ 20 goto L1 is handled like case (i) with a undefined.

Initially there are no nodes and node() is undefined for all arguments. For each statement we perform the following steps:

  1. For case (i)
    • If node(b) and node(c) are undefined, create leaves labeled b and c respectively; let node(b) and node(c) be these nodes.
    • Find whether there is a node labeled op with left child node(b) and right child node(c); if found, let this node be n; otherwise, create such a node and let it be n.
    • Delete a from the list of attached identifiers for node(a). Append a to the list of attached identifiers for the node n found in (b) and set node(a) to n.
  2. For case (ii)
    • If node(b) is undefined, create a leaf labeled b; let node(b) be this node.
    • Find whether there is a node labeled op whose lone child is node(b); if found, let this node be n; otherwise, create such a node and let it be n.
    • Delete a from the list of attached identifiers for node(a). Append a to the list of attached identifiers for n and set node(a) to n.
  3. For case (iii)
    • node(b) is an existing node; let it be n.
    • Delete a from the list of attached identifiers for node(a). Append a to the list of attached identifiers for n and set node(a) to n.
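The heart of case (i) is the search for an existing node with the same operator and children. The C sketch below shows one way to implement that step; the DagNode and IdList types, the global node list, and the omission of the "delete a from its old node" step are simplifications assumed for this illustration only.

#include <stdlib.h>
#include <string.h>

typedef struct IdList { const char *id; struct IdList *next; } IdList;

typedef struct DagNode {
    const char *label;            /* operator, identifier or constant */
    struct DagNode *left, *right; /* children (NULL for leaves)       */
    IdList *ids;                  /* identifiers attached to the node */
    struct DagNode *next;         /* chain of all nodes created so far*/
} DagNode;

static DagNode *all_nodes = NULL;

static DagNode *new_node(const char *label, DagNode *l, DagNode *r)
{
    DagNode *n = calloc(1, sizeof *n);
    n->label = label; n->left = l; n->right = r;
    n->next = all_nodes; all_nodes = n;
    return n;
}

/* Case (i): a = b op c, with node(b) and node(c) already created.
   Returns the node that the identifier a is now attached to.        */
DagNode *process_binary(const char *a, const char *op,
                        DagNode *node_b, DagNode *node_c)
{
    DagNode *n = NULL;
    for (DagNode *p = all_nodes; p; p = p->next)      /* step (b)    */
        if (p->left == node_b && p->right == node_c &&
            strcmp(p->label, op) == 0) { n = p; break; }
    if (!n)
        n = new_node(op, node_b, node_c);

    IdList *e = malloc(sizeof *e);                     /* step (c)    */
    e->id = a; e->next = n->ids; n->ids = e;
    return n;                                          /* node(a) = n */
}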

Example 4: Let us consider Figure 10.4 and construct DAG for each block step by step.

 

For block B1

 

For the first statement, which satisfies case (iii), we create a leaf labeled 0 and attach the identifier i to it (similarly, a leaf labeled 10 gets the identifier n), as shown in Figure 10.5.

 

For block B2

 

For the first statement, which satisfies case (i), we first create leaves for n and 1, then create a node for the operator (–) and attach the label t1 to it. Figure 10.6(a) shows the DAG after this single statement of B2.

 

For the second statement, which is a conditional jump, we create a node for the operator > and label it (12), since control transfers to statement 12 when the condition is satisfied. Figure 10.6(b) shows the DAG for all statements of block B2.

 

Block B3

 

For the first statement, which satisfies case (i), we create a leaf for i and use it as both the left and the right child, then create a node for the operator (*) and attach the label t2 to it, as shown in Figure 10.7(a).

Figure 10.4 Example 3 Program in Basic Blocks

Figure 10.5 DAG for Block B1

Figure 10.6(a) DAG for First Statement in Block B2

Figure 10.6(b) DAG for Block B2

Figure 10.7(a) DAG for One Statement Block B3

For the second statement, we first create a leaf for 4 and reuse node(i), then create a node for the operator (*) and attach the label t3 to it, as in Figure 10.7(b).

For the third statement, we first create a leaf for a and reuse node(t3), then create a node for the operator [] and attach the label t4 to it, as in Figure 10.7(c).

For the fourth statement, which is a copy statement, we remove the label t4 from its earlier node, marking it there as t40, and attach the label t4 to the node labeled t2, as in Figure 10.7(d).

After construction for the entire block, the DAG would be as shown in the Figure 10.7(e).

10.6.2  Application of DAG

  • By looking at the DAG we can determine
    • which identifiers have their values used in the block, and
    • which statements compute values that could be used outside the block.
  • DAGs are useful for redundancy elimination.

Figure 10.7(b) DAG for Two Statements in Block B3

Figure 10.7(c) DAG for Three Statements in Block B3

Figure 10.7(d) DAG for Four Statements in Block B3

Figure 10.7(e) DAG for Complete Block B3

10.7  Principal Sources of Optimization

A transformation of a program is called local if it is applied within a basic block and global if applied across basic blocks. There are different types of transformations to improve the code and these transformations depend on the kind of optimization required.

  • Function-preserving transformations are those that are performed without changing the function the program computes. These are primarily used when global optimizations are performed.
  • Structure-preserving transformations are those that are performed without changing the set of expressions computed by the block. Many of these transformations are applied locally.
  • Algebraic transformations are used to simplify the computation of expression set using algebraic identities. These can replace expensive operations by cheaper ones, for instance, multiplication by 2 can be replaced by left shift.

10.8  Function-Preserving Transformations

The following techniques are function-preserving transformations, where common sub expression elimination can be applied locally or globally. The other techniques are applied globally.

10.8.1  Common Sub-expression Elimination

An expression E is said to be a common sub-expression if E has been computed earlier and the variables in the expression have not been modified since that computation. If such an expression is present, the re-computation can be avoided by using the result of the previous computation. This technique can be applied both locally and globally. We maintain a table storing the expressions evaluated so far and use this information to identify and eliminate re-computation of the same expression. Common sub-expression elimination can be done within a basic block by analyzing each statement and keeping an expression in the table until one of its operands is redefined; if any operand of an expression is redefined, the expression is removed from the table. The algorithmic approach is given below.

 

Function subexpr_elimination (Block B)

{

      For each statement that evaluates an expression within the basic block

      {

        Maintain in the table the set of expressions evaluated so far

        If any operand of an expression in the table is redefined, then remove that expression from the table

        Modify the instructions accordingly as you go

        Generate a temporary variable to hold the value of the expression, and use that variable the next time the same expression is encountered

      }

}
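The algorithmic approach above can be realized with a simple table of available expressions. The C sketch below is one possible rendering; the Quad layout, the linear search, and the fixed-size table are assumptions made only for this example, not the book's data structures.

#include <string.h>

/* Hypothetical quad: result = arg1 op arg2 ("=" with empty arg2 is a copy). */
typedef struct { char op[4], arg1[16], arg2[16], result[16]; } Quad;

#define MAX_EXPR 256
static Quad avail[MAX_EXPR];     /* expressions evaluated so far       */
static int  n_avail = 0;

static void invalidate(const char *name)   /* an operand was redefined */
{
    for (int i = 0; i < n_avail; )
        if (strcmp(avail[i].arg1, name) == 0 || strcmp(avail[i].arg2, name) == 0)
            avail[i] = avail[--n_avail];
        else
            i++;
}

/* Local common sub-expression elimination within one basic block:
   a statement recomputing an available expression becomes a copy of
   the temporary that already holds its value.                        */
void cse_block(Quad *code, int n)
{
    n_avail = 0;
    for (int i = 0; i < n; i++) {
        int found = -1;
        for (int j = 0; j < n_avail; j++)
            if (strcmp(avail[j].op,   code[i].op)   == 0 &&
                strcmp(avail[j].arg1, code[i].arg1) == 0 &&
                strcmp(avail[j].arg2, code[i].arg2) == 0) { found = j; break; }

        if (found >= 0) {                      /* reuse earlier result  */
            strcpy(code[i].op, "=");
            strcpy(code[i].arg1, avail[found].result);
            code[i].arg2[0] = '\0';
        } else if (n_avail < MAX_EXPR) {       /* record new expression */
            avail[n_avail++] = code[i];
        }
        invalidate(code[i].result);            /* result is redefined   */
    }
}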

Figure 10.8 is an example that shows the optimized code on applying this technique locally and globally.

Global common sub-expression elimination uses extra information collected by flow-graph analysis: at every block entry and exit, the set of available expressions must be maintained to identify expressions that have already been evaluated.

Example 5: The following example in Figure 10.9 shows common sub-expression elimination applied globally by storing the evaluated expressions in new temporary variables t1 and t2, which are then used wherever the same expression occurs. This generates copy statements. If these copy statements are unnecessary, they are eliminated by the technique explained in the next section.

Example 6: The quick sort program is given below with its corresponding three-address code represented as a flow graph.

 

void quicksort(int LI, int HI)
{
    int i, j;
    int pivot, z;

    if (HI <= LI) return;
    i = LI - 1;
    j = HI;
    pivot = a[HI];
    while (1)
    {
        do i = i + 1; while (a[i] < pivot);
        do j = j - 1; while (a[j] > pivot);
        if (i >= j) break;
        z = a[i];
        a[i] = a[j];
        a[j] = z;
    }
    z = a[i];
    a[i] = a[HI];
    a[HI] = z;
    quicksort(LI, j);
    quicksort(j + 1, HI);
}

Figure 10.8 Program Code Before and After Common Sub Expression Elimination

Figure 10.9 Example for Global Common Sub Expression Elimination

Its corresponding three address code is given below and corresponding flow graph is shown in Figure 10.10.

 

 1. i: = LI - 1

 2. j: = HI

 3. temp1: = 4 * HI

 4. pivot: = a[ temp1 ]

 5. i: = i + 1

 6. temp2: = 4 * i

 7. temp3: = a[ temp2 ]

 8. if temp3 < pivot go to 5

 9. j: = j - 1

10. temp4: =4 * j

11. temp5: = a[ temp4 ]

12. if temp5 > pivot go to 9

13. if i >= j go to 23

14. temp6: = 4 * i

15. z: = a[ temp6 ]

16. temp7: = 4 * i

17. temp8: = 4 * j

18. temp9: = a[ temp8 ]

19. a[ temp7 ]: = temp9

20. temp10: =4 * j

21. a[ temp10 ]: = z

22. go to 5

23. temp11: = 4 * i

24. z: =a[ temp11 ]

25. temp12: = 4 * i

26. temp13: = 4 * HI

27. temp14: = a[ temp13 ]

28. a [ temp12 ]: = temp14

29. temp15: = 4 * HI

30. a[ temp15 ]: = z

On applying local common sub-expression elimination, blocks B5 and B6 get improved code, since expressions such as 4 * i and 4 * j are computed more than once without any change in value. On optimizing locally, we get the resultant code shown in Figure 10.11.

On applying this technique globally, the code is further improved and the resultant code is shown below in Figure 10.12. It is clear that the computation of a[ i ] in block B2 can be used in B5 and B6. Similarly, a[ j ] computed in B3 can be used in B5 and B6.

10.8.2  Copy Propagation

A copy statement is a statement of the form a = b. The copy propagation technique replaces later uses of a with uses of b, provided the value of a does not change after this assignment. The technique can be applied locally and globally, and it reduces the number of copies executed at run time and improves execution speed. To implement it, we maintain a separate table, called the copy table, which holds all active copy statements. While traversing the basic block, whenever a copy statement is found, it is added to the copy table. While processing any statement, we check the copy table: a use of a can be replaced by b if the pair (a, b) is present. If a statement assigns a new value to a, the corresponding entry is removed from the copy table. Copy propagation helps in identifying dead code.

Figure 10.10 Flow Graph for Quick Sort Program

Let us assume there is a sequence of statements s1, s2, s3, ..., sn, where each statement is either a copy statement, a statement that uses the right-hand side of a copy statement, or a statement that computes the value of a variable.

Figure 10.11 After Applying Common Sub Expression Elimination for Figure 10.10

Algorithm for copy propagation

Let INSERT(a, b) be a function that inserts the pair (a, b) into the copy table for a copy statement a = b, indicating that a can be replaced by b.

Let GET(a, table) be a function that looks up the variable a in the table; if a pair (a, b) is found, it returns b; otherwise, it returns a.

Figure 10.12 After Applying Global Common Sub Expression Elimination for Figure 10.11

GET (a, table)

{

if you find (a, b) in table

      return b

else return a

}

OPT_CP_PROP(B)

{

for each statement from s1 to sn

      if statement is of the form “a = b op c”

        b = GET(b, table)

        c = GET(c, table)

      else if statement is of the form “a = x”

        x = GET(x, table)

      if statement has a left hand side a,

        REMOVE from table all pairs involving a.

      if statement is of the form “a = x”

        insert {(a, GET(x, table))} in the table

endfor

}
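A compact way to picture OPT_CP_PROP is a small copy table consulted and updated while walking the block. The C sketch below follows the same steps; the Stmt layout and the fixed-size table are assumptions made for this illustration, not the book's data structures.

#include <string.h>

/* Hypothetical statement: result = arg1 op arg2; op "=" with an empty
   arg2 marks a copy statement.                                        */
typedef struct { char op[4], arg1[16], arg2[16], result[16]; } Stmt;

#define MAX_COPIES 128
static struct { char a[16], b[16]; } copies[MAX_COPIES];
static int n_copies = 0;

static const char *get_copy(const char *a)        /* GET(a, table)     */
{
    for (int i = 0; i < n_copies; i++)
        if (strcmp(copies[i].a, a) == 0)
            return copies[i].b;
    return a;
}

static void kill_copies(const char *name)  /* variable just redefined  */
{
    for (int i = 0; i < n_copies; )
        if (strcmp(copies[i].a, name) == 0 || strcmp(copies[i].b, name) == 0)
            copies[i] = copies[--n_copies];
        else
            i++;
}

/* Local copy propagation over one basic block.                        */
void copy_propagate(Stmt *code, int n)
{
    n_copies = 0;
    for (int i = 0; i < n; i++) {
        const char *r1 = get_copy(code[i].arg1);   /* replace uses      */
        if (r1 != code[i].arg1) strcpy(code[i].arg1, r1);
        if (code[i].arg2[0]) {
            const char *r2 = get_copy(code[i].arg2);
            if (r2 != code[i].arg2) strcpy(code[i].arg2, r2);
        }

        kill_copies(code[i].result);               /* result redefined  */

        if (strcmp(code[i].op, "=") == 0 && n_copies < MAX_COPIES) {
            strcpy(copies[n_copies].a, code[i].result);   /* record a = b */
            strcpy(copies[n_copies].b, code[i].arg1);
            n_copies++;
        }
    }
}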

Example 7: Let the basic block contain the following set of statements:

   

      Y = X

      Z = Y + 1

      W = Y

      Y = W + Z

      Y = W

Since statement 1 is a copy statement, the pair (Y, X) is added to the copy table. Statement 2 uses the value of X indirectly through Y; hence GET(Y, table) returns X and the statement is modified by replacing Y with X. The third statement is a copy statement, and since Y is to be replaced by X, we insert into the copy table that W is to be replaced by X. When statement 4 is processed, Y is redefined; hence, the entries for Y are removed from the copy table. The fifth statement is a copy statement and is inserted into the copy table.

To apply the technique globally, we perform flow analysis. Given a copy statement X = Y and a statement W = X in a later block, we can replace W = X with W = Y only if the following conditions are met:

  • X = Y must be the only definition of X reaching W = X. This can be determined through ud-chains.
  • The value of Y is not redefined in any path from the statement X = Y to W = X. Use iterative data flow analysis to gather this information.

Example 8: The following example in Figure 10.13 shows the copy propagation technique applied globally across different blocks. After the copy propagation we can see that the copy statements d = c and g = e are dead code, as the values assigned to d and g are not used; hence they can be eliminated.

Example 9: Copy propagation is applied to the quick sort code of Figure 10.12, obtained after common sub-expression elimination; the resultant code is shown in Figure 10.14.

10.8.3  Dead Code Elimination

Code that is unreachable or that does not affect the program output is said to be dead code. Such code wastes CPU time and can be identified and eliminated using this technique. The following program gives an example in a high-level language where the value assigned to i is never used, the value assigned to var1 in statement 6 is a dead store, and statement 9 is unreachable. These statements can be eliminated as they do not affect the program execution.

Figure 10.13 Example for Global Copy Propagation Technique

Figure 10.14 Quick Sort Program After Applying Copy Propagation

 1. int var1;

 2. void sample()

 3. {

 4. int i;

 5. i = 1; /* dead store */

 6. var1 = 1; /* dead store */

 7.  var1 = 2;

 8. return;

 9. var1 = 3; /* unreachable */

10. }

Below is the code fragment after dead code elimination.

 

1. int var1;

2. void sample()

3. {

4. var1 = 2;

5. return;

6. }

Example 10: In blocks B5 and B6 of the flow graph corresponding to quick sort, there is a dead store to the variable z, which can be eliminated by dead code elimination. The resultant code is shown in Figure 10.15.

Figure 10.15 Quick Sort Program After Dead Code Elimination

10.8.4  Constant Propagation

Constant propagation is an approach that propagates constant values assigned to a variable to the places where the variable is used. For example, given an assignment x = c, where c is a constant, replace later uses of x with uses of c, provided there are no intervening assignments to x. This approach is similar to copy propagation and is applied in the first stage of optimization. By propagating constant values into conditional statements, the method can also determine whether a branch will ever be executed, that is, it identifies dead code.

Example 11: Let us consider the following example:

  1. pi = 22/7
  2. void area_per(int r)
  3. {
  4.    float area, perimeter;
  5.    area = pi * r * r;
  6.    perimeter = 2 * pi * r;
  7.    print area, perimeter;
  8. }

In this short example, we can notice some simple constant propagation results, which are as follows:

  • In line 1 the variable pi is constant and its value is the result of 22/7, which can be computed at compile time and is approximately 3.143.
  • In line 5 the variable pi can be replaced with the constant value 3.143.
  • In line 6, since the value of pi is constant, a partial result of the statement can be computed at compile time and the statement can be modified as
perimeter = 6.286 * r;

 

Note: Common sub-expression elimination and constant propagation often require rearranging expressions using the associative and commutative properties of the algebraic operators.
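At the level of three-address code, the same idea can be sketched as a table mapping variables to known constant values; fully constant expressions are folded and the results remembered. The statement layout, the integer-only arithmetic, and the simplistic treatment of division are assumptions of this sketch.

#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical statement: result = arg1 op arg2 ('=' marks a copy).   */
typedef struct { char op; char arg1[16], arg2[16], result[16]; } Stmt;

#define MAX_VARS 64
static struct { char name[16]; long value; } consts[MAX_VARS];
static int n_consts = 0;

static bool known(const char *s, long *out)     /* literal or tracked  */
{
    if (s[0] && (isdigit((unsigned char)s[0]) || s[0] == '-')) {
        *out = atol(s); return true;
    }
    for (int i = 0; i < n_consts; i++)
        if (strcmp(consts[i].name, s) == 0) { *out = consts[i].value; return true; }
    return false;
}

static void forget(const char *name)            /* variable redefined  */
{
    for (int i = 0; i < n_consts; i++)
        if (strcmp(consts[i].name, name) == 0) { consts[i] = consts[--n_consts]; return; }
}

/* Constant propagation and folding within one basic block (a sketch). */
void fold_block(Stmt *code, int n)
{
    n_consts = 0;
    for (int i = 0; i < n; i++) {
        long a, b, v = 0;
        bool has_value = false;

        if (code[i].op == '=' && known(code[i].arg1, &a)) {
            v = a; has_value = true;                        /* x = c    */
        } else if (code[i].op != '=' &&
                   known(code[i].arg1, &a) && known(code[i].arg2, &b)) {
            v = code[i].op == '+' ? a + b :
                code[i].op == '-' ? a - b :
                code[i].op == '*' ? a * b : (b ? a / b : 0); /* fold    */
            sprintf(code[i].arg1, "%ld", v);
            code[i].op = '='; code[i].arg2[0] = '\0';
            has_value = true;
        }

        forget(code[i].result);                  /* result is redefined */
        if (has_value && n_consts < MAX_VARS) {  /* remember its value  */
            strcpy(consts[n_consts].name, code[i].result);
            consts[n_consts].value = v;
            n_consts++;
        }
    }
}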

10.9  Loop Optimization

In most programs, about 90% of the execution time is spent inside loops; hence, loops are good candidates for improvement, and loop optimization is the most valuable machine-independent optimization. Loops are written for convenience of programming, but the code they produce has to be transformed using loop transformation techniques. Let us consider the example of a program that has a loop statement.

 

for (int i=0; i<1000; i++)

x = x + y/z;

print x;

The optimized code of this program is represented in flow graph as shown in Figure 10.16.

Figure 10.16 Flow Graph After Code Optimization

For this code, the total number of statements executed is as follows:

Statement Number    Frequency of Execution    Total Number of Statements
1                   1                         1
2 – 5               1000                      4000
6                   1                         1

In all, 4002 statements are executed. Instead, if the code is written as follows, with the loop fully unrolled, then only 1002 statements are executed.

  1. t1 = y / z
  2. x = x + t1
  3. x = x + t1
  4. x = x + t1
  5.  …

                     ….

    1001.  x = x + t1

    1002.  print x

In terms of statements executed, the latter form is about four times better than the first, but its space requirement is larger. Inefficient code is generated in loops for various reasons, such as

  • Induction variable usage to keep track of iteration
  • Unnecessary computations made inside the iterative loops that are not effective
  • Use of high strength operators inside the loop

The important loop optimizations are elimination of loop invariant computations, strength reduction, code motion and elimination of induction variables.

10.9.1  A Loop Invariant Computation

A loop invariant computation is one that evaluates to the same value every time the loop is executed. Moving such computations outside the loop reduces execution time.

Algorithm for elimination of loop invariant code

Step 1: Identify loop-invariant code.

Step 2: An instruction is loop-invariant if, for each operand:

  • the operand is a constant, OR
  • all definitions of the operand that reach the instruction are made outside the loop, OR
  • exactly one definition of the operand reaches the instruction, and that definition is a loop-invariant statement inside the loop.

Step 3: Move it outside the loop.

  • For each loop-invariant definition statement i: x = y + z in basic block B, verify that it satisfies the conditions:
    • B dominates all exits of the loop
    • x is not redefined anywhere else in the loop, that is, it is defined only once, in block B
    • block B dominates all uses of x within the loop

      (i.e., all uses of x within the loop can be reached only by statement i.)

  • Move each such instruction i to a newly created preheader if and only if it satisfies these requirements, making certain that any non-constant operands (like y, z) have already had their definitions moved to the preheader.

Note: It is important that while applying loop-invariant code motion in nested loops, start at innermost loop and work outwards.

Example 12:

 

for (int i=0; i<1000; i++)

x = x + y/z;

print x;

In the example, the value of y and z remains unchanged as there is no definition for these variables in the loop. Hence, the computation of y/z remains unchanged, which can be moved above the loop and the same code can be rewritten as follows:

 

t1= y/z

for (int i=0; i<1000; i++)

x = x + t1;

print x;

10.9.2  Induction Variables

Induction variables are variables used in a loop whose values change in lock-step with the loop; hence, it may be possible to eliminate all except one of them. There are two types of induction variables—basic and derived. Basic induction variables are those of the form

 

I = I ± C

 

where I is loop variable and C is some constant.

Derived induction variables are those that are defined only once in the loop and their value is a linear function of the basic induction variable.

For example,

 

J = A * I + B

 

Here J is a variable that is dependent on the basic induction variable I and the constants A and B. This is represented as a triplet (I, A, B)

Induction variable elimination involves three steps.

  1. Detecting induction variable
  2. Reducing the strength of induction variable
  3. Eliminating induction variable

10.9.2.1  Detecting Induction Variable

Given a loop L with reaching-definition information, this algorithm computes the set of induction variables and their families. For each induction variable J that belongs to the family of a basic induction variable I, there is an associated triplet (I, A, B), where A and B are constants, such that the value of J is computed as A * I + B.

Algorithm for detecting the induction variable

For all basic blocks of the loop:

  • Scan the statements of loop L for basic induction variables; each basic induction variable I is associated with the triplet (I, 1, 0).
  • Search for a variable J within loop L that is defined by a single assignment and is in the family of a basic induction variable I:

    J = I + B  ⇒  the triplet is (I, 1, B)

    J = I – B  ⇒  the triplet is (I, 1, –B)

    J = A * I  ⇒  the triplet is (I, A, 0)

    J = I / A  ⇒  the triplet is (I, 1/A, 0)

  • If the induction variable J is defined in terms of some K, where K is not a basic induction variable but is in the family of I, then the additional requirements are
    • there is no assignment to I between the point of assignment to K in L and the assignment to J, and
    • no definition of K from outside L reaches J.

Suppose the induction variable J is defined as J = C * K, where the triplet for K is (I, A, B); then the triplet for J is (I, A * C, B * C). Once the families of induction variables are found, we modify the instructions computing induction variables to use additions or subtractions rather than multiplications. This is done by the strength reduction process shown in the next algorithm.
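The first step, recognizing statements of the form I = I ± C, can be expressed directly in code. The sketch below checks a single statement; a complete detector would additionally confirm that this is the only assignment to I inside the loop, as the algorithm requires. The Stmt layout is an assumption of this example.

#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical statement: result = arg1 op arg2.                      */
typedef struct { char op; char arg1[16], arg2[16], result[16]; } Stmt;

static bool is_const(const char *s)
{
    return isdigit((unsigned char)s[0]) ||
           (s[0] == '-' && isdigit((unsigned char)s[1]));
}

/* True when the statement has the form I = I + C or I = I - C,
   i.e., it is a candidate definition of a basic induction variable.   */
bool defines_basic_iv(const Stmt *s)
{
    if (s->op != '+' && s->op != '-')
        return false;
    if (strcmp(s->result, s->arg1) == 0 && is_const(s->arg2))
        return true;                       /* I = I + C  or  I = I - C */
    if (s->op == '+' && strcmp(s->result, s->arg2) == 0 && is_const(s->arg1))
        return true;                       /* I = C + I                */
    return false;
}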

Example 13: Let us consider block B2 of the optimized quick sort code, shown in Figure 10.17. In this block, the variable i is a basic induction variable, as there is a lone assignment to i in the loop and it increments i by 1. temp2 is in the family of i and its triplet is (i, 4, 0). Similarly, j is the only basic induction variable in block B3, with temp4 in the family of j with triplet (j, 4, 0).

 

5. i : = i + 1

6. temp2 : = 4 * i

7. temp3 : = a [ temp2]

8. if temp3 < pivot go to B2

Figure 10.17 Block B2 from Quick Sort Program

10.9.2.2  Strength Reduction Applied to the Induction Variable

For a loop L with reaching-definition information and families of induction variables, we revise the loop so that high-strength operations are replaced by low-strength operations.

Algorithm

For each basic induction variable I, consider each variable J in the family of I with triplet (I, A, B):

  1. Create a new variable S (if two variables J1 and J2 have the same triplet, a single new variable serves for both).
  2. Replace the assignment to J by J = S.
  3. Add a new statement S = S + A * n immediately after every assignment I = I + n in L, and place S in the family of I.
  4. Ensure that S is initialized to A * I + B on entry to the loop. This initialization may be placed at the end of the preheader and consists of

     

    S = A * I
    S = S + B

Example 14: On applying the previous algorithm, we found that in block B3 the basic induction variable is j and temp4 is in the family of j with triplet (j, 4, 0). Applying the above algorithm, the statement temp4 = 4 * j is replaced by temp4 = s4, and the assignment s4 = s4 – 4 is inserted after the assignment j = j – 1, where –4 is obtained by multiplying the –1 in the assignment to j by the 4 in the triplet (j, 4, 0) for temp4. The initialization of s4 is placed at the end of block B1, which contains the definition of j. After strength reduction, if the only remaining use of an induction variable is in tests, we can replace a test of that induction variable by a test of another, as shown in Figure 10.18.
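The transformation of Example 14 can be pictured at the source level with the small, self-contained C program below. The array contents, its size and the printed output are made up purely for this demonstration; only the shape of the two loops mirrors the three-address code of block B3.

#include <stdio.h>

int main(void)
{
    int a[8] = {1, 7, 3, 9, 5, 2, 8, 6};
    int pivot = 5, HI = 7;
    int j, temp4, temp5, s4;

    /* Before: temp4 = 4 * j is recomputed with a multiplication
       on every iteration of the loop.                              */
    j = HI;
    do {
        j = j - 1;
        temp4 = 4 * j;             /* high-strength operation       */
        temp5 = a[temp4 / 4];      /* byte offset back to an index  */
    } while (temp5 > pivot);
    printf("before strength reduction: j = %d\n", j);

    /* After: s4 tracks 4 * j; the multiplication is replaced by a
       subtraction, with the initialization in the preheader.       */
    j = HI;
    s4 = 4 * j;                    /* preheader: s4 = 4 * j          */
    do {
        j = j - 1;
        s4 = s4 - 4;               /* replaces temp4 = 4 * j         */
        temp4 = s4;
        temp5 = a[temp4 / 4];
    } while (temp5 > pivot);
    printf("after strength reduction:  j = %d\n", j);
    return 0;
}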

10.9.2.3  Elimination of Induction Variable

An induction variable can be eliminated if the reaching-definition information, loop-invariant computation information, and live-variable information are available.

Figure 10.18 Quick Sort Program After Applying Strength Reduction

  1. Consider a basic induction variable I that is used only to compute other induction variables in its family and in conditional branches.

    For example, let J be in the family of I with triplet (I, A , B) where A is positive.

     

          => J = A * I + B.

     

    Let there be a test of the form

     

          if I relop X goto B

     

    where X is not an induction variable; then this statement can be replaced as follows:

     

          C = A *X

          C = C + B

          If J relop C goto B

     

    where C is a new temporary variable. This modification makes the code independent of induction variable I since comparing I with X is the same as comparing J with A*X+B

    If there are two variables I1 and I2 then the test statement is

     

          if I1 relop I2 goto B,

     

    then we check for variables J1 and J2 in the families of I1 and I2, with triplets (I1, A1, B1) and (I2, A2, B2) respectively, such that A1 = A2 and B1 = B2; then the statement

     

          if I1 relop I2 goto B, can be replaced as

          if J1 relop J2 goto B.

     

    Once these tests are replaced, the induction variables that are no longer used can be deleted.

On applying induction variable elimination to the quick sort program, the resultant code is shown in Figure 10.19.

10.10  Global Flow Analysis

To apply optimization globally and obtain well-optimized code, it is necessary to collect information about variables and expressions over the program as a whole and distribute this information to each block. Data-flow information can be collected by setting up equations and solving them at various points. These equations are framed in terms of the four sets listed below.

 

IN(S): the set of definitions that reach the statement S.

OUT(S): the set of definitions that leave the statement S.

GEN(S): the set of definitions generated at the statement S.

KILL(S): the set of definitions killed at the statement S.

 

A typical equation has a form

 

OUT(S) = GEN(S) ∪ (IN(S) – KILL(S))

 

This indicates that the information at the end of a statement is either generated within the statement, or enters at the beginning of the statement and is not killed as control flows through the statement.

The factors that influence the framing and solving of the equations are the following:

  1. The notions of generating and killing depend on the desired information. Sometimes the flow must be analyzed backwards rather than in the normal direction.
  2. Since data flows along control paths, data-flow analysis is affected by the control structures in the program.
  3. Some constructs complicate the analysis, for example, procedure calls and assignments through pointers or array variables.

Before understanding the method for framing and solving the equations, let us first look at reaching definitions, use-definition chains, definition-use chains, and live variable analysis.

Figure 10.19 Code After Induction Variable Elimination

10.10.1  Points and Paths

A point lies between two adjacent statements of a block, as well as before the first statement and after the last. If a block has four statements, then there are five points in the block, as shown in Figure 10.20.

A path is a sequence of points p1, p2, ..., pn that gives a global view of how control flows; it satisfies the condition that for each i between 1 and n – 1, either

  1. pi is a point immediately preceding a statement and pi+1 is the point immediately following the statement in the same block or
    Figure 10.20 Basic Block indicating Different Points

  2. pi is a point at the end of some block and pi+1 is a point which is at the beginning of the succeeding block.

10.10.2  Reaching Definition

A definition of a variable a is a statement that assigns, or may assign, a value to a. The common forms of definition are an input statement or an assignment statement that assigns a value to a; these forms are unambiguous. The ambiguous definitions are:

  1. A call to a procedure with variable a as a parameter using call by reference.
  2. An assignment through a pointer that could refer to variable a.

A definition d reaches a point p if and only if there exists a path from the point immediately following d to p, such that d is not killed along that path.

10.10.3  Use Definition Chains

Use-definition chains (ud-chains) store reaching-definition information: the ud-chain for a use of a variable is the list of all definitions that reach that use.

  • If the use of a in block B is not preceded within B by an unambiguous definition of a, then the ud-chain for that use is the set of definitions of a in IN[B]. If there is an unambiguous definition of a within the block before the use, then the ud-chain contains only the last such definition.
  • If a also has ambiguous definitions, then all of those for which no unambiguous definition of a lies between them and the use are also on the ud-chain for this use of a.

10.10.4  Live Variable Analysis

Some code optimization techniques depend on information computed in the direction opposite to the flow of control. For example, in dead code elimination we may eliminate statements that assign values to variables that are never used; this requires analysis in the opposite direction, identifying which values are used later. In live variable analysis, we wish to know, for a variable x and a point p, whether the value of x at p could be used along some path in the flow graph starting at p. If so, x is live at p; otherwise, x is dead at p.

10.10.5  Definition Use Chains

A variable is used at a statement s if the r-value of the variable is required in the computation in statement s. The du-chain represents for a point p the set of uses s of a variable x, such that there is a path from p to s that does not redefine x.

10.10.6  Data Flow Analysis of Structured Programs

A program includes different control-flow constructs, such as if statements and iterative statements, which determine how the data-flow equations are framed. This section gives the structure-dependent rules for framing the equations.

Figure 10.21 gives the list of rules for framing equations where Da indicates the set of all definitions corresponding to variable a.

10.10.7  Representation of Sets

The sets of definitions for GEN and KILL can be represented using a bit representation, which is compact and easy to compute. We assign a number to each definition and then use a bit vector whose length equals the number of definitions. To compute the union we perform a logical OR, for intersection a logical AND, and for the difference A – B we compute A AND (NOT B).
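With at most 64 definitions, each of these sets fits in a single machine word, and the set operations reduce to one bitwise instruction each. The following is a minimal C sketch under that size assumption.

#include <stdint.h>

typedef uint64_t DefSet;          /* bit (i - 1) represents definition d_i */

DefSet set_union(DefSet a, DefSet b)        { return a | b;  }  /* A ∪ B */
DefSet set_intersection(DefSet a, DefSet b) { return a & b;  }  /* A ∩ B */
DefSet set_difference(DefSet a, DefSet b)   { return a & ~b; }  /* A – B */

/* The transfer function of a statement or block:
   OUT = GEN ∪ (IN – KILL).                                              */
DefSet transfer(DefSet gen, DefSet kill, DefSet in)
{
    return set_union(gen, set_difference(in, kill));
}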

Figure 10.21 Rules for Framing Equations

Example 15: Consider the program that has eight definitions as shown below

d1 I = n - 1

d2 J = m

d3 a = x1

      do

d4 I = I + 1

d5 J = J - 1

      if E1 then

d6    a = x2

      else

d7    I = x2

d8    a = x3

      while E2

In the above partial program, there are eight definitions, d1 through d8, so eight bits are required for the bit representation. At any given point, KILL, GEN, IN, and OUT are each represented as 8 bits; the bits that are set to 1 indicate the definitions that are killed, generated, reaching, or leaving the point, respectively. The sets KILL and GEN are computed by applying the data-flow equations to the statements. The statements can be represented as a syntax tree using the following grammar.

 

S → id := E | S ; S | if E then S else S |
     do S while E | if E then S

E → id + id | id – id | id * id | id / id | id

 

We can represent the entire program in the form of syntax tree using the rules. For each node in the parse tree, we can compute the values for GEN and KILL as explained below.

GEN(d1) = 1000 0000   (only the definition d1 is generated here)

KILL(d1) = 0001 0010   (d1 defines I, so it kills the other definitions of I, namely d4 and d7)

Similarly, the calculations are done for all the definitions d2, d3, ..., d8. At the point covering the sequence of d1 and d2, GEN and KILL are computed by applying the rule for statements in sequence:

 

  GEN[S] = GEN[S2] ∪ (GEN[S1] – KILL[S2])

      ⇒ 0100 0000 ∪ (1000 0000– 0000 1000)

      ⇒ 0100 0000 ∪ (1000 0000 ∩ 1111 0111)

      ⇒ 0100 0000 ∪ (1000 0000)

      ⇒ 1100 0000

  KILL[S] = KILL[S2] ∪ (KILL[S1] – GEN[S2])

      ⇒ 0000 1000 ∪ (0001 0010 – 0100 0000)

      ⇒ 0000 1000 ∪ (0001 0010 ∩ 1011 1111)

      ⇒ 0000 1000 ∪ 0001 0010

      ⇒ 0001 1010

Similarly, calculations are done for each node and the values are shown in the annotated parse tree of Figure 10.22.

10.10.8  Iterative Algorithm for Reaching Definition

Once the KILL and GEN for each block are computed, we can compute IN and OUT for each block using the formulas

IN[B] = ∪ OUT[P], where the union is taken over all predecessor blocks P of B
OUT[B] = GEN[B] ∪ (IN[B] − KILL[B])

 

We use an iterative approach, starting with the estimate IN[B] = Ø for all blocks B and converging to the desired values of IN and OUT. A sketch of the algorithm is given below.

Figure 10.22 Annotated Parse Tree with GEN and KILL Values

  1. for each block B do: IN[B] = Ø and OUT[B] = GEN[B]
  2. change = true
  3. while change do begin
  4.     change = false
  5.     for each block B do begin
  6.         IN[B] = ∪ OUT[P], over all predecessors P of B
  7.         OLDOUT = OUT[B]
  8.         OUT[B] = GEN[B] ∪ (IN[B] – KILL[B])
  9.         if OUT[B] ≠ OLDOUT then change = true
  10.    end
  11. end
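The same iteration can be written over the bit-vector representation of Section 10.10.7. The C sketch below assumes the flow graph is already available as predecessor lists and that there are at most 64 definitions; the array bounds and parameter layout are choices made only for this example.

#include <stdbool.h>
#include <stdint.h>

#define MAX_BLOCKS 16
typedef uint64_t DefSet;                     /* one bit per definition */

void reaching_definitions(int nblocks,
                          const DefSet gen[], const DefSet kill[],
                          const int pred[][MAX_BLOCKS], const int n_pred[],
                          DefSet in[], DefSet out[])
{
    for (int b = 0; b < nblocks; b++) {
        in[b]  = 0;                          /* IN[B] = Ø               */
        out[b] = gen[b];                     /* initial OUT[B] = GEN[B] */
    }

    bool change = true;
    while (change) {
        change = false;
        for (int b = 0; b < nblocks; b++) {
            DefSet new_in = 0;               /* IN[B] = ∪ OUT[P]        */
            for (int p = 0; p < n_pred[b]; p++)
                new_in |= out[pred[b][p]];
            in[b] = new_in;

            DefSet new_out = gen[b] | (in[b] & ~kill[b]);
            if (new_out != out[b]) {         /* OUT changed: iterate    */
                out[b] = new_out;
                change = true;
            }
        }
    }
}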

Example 16: When the above algorithm is applied to the flow graph of Example 15, the GEN and KILL values for each block are as shown in Figure 10.23.

According to the algorithm, we first initialize IN[B] to all zeros and, based on the GEN information, set the initial OUT value for each block.

PASS 1:

IN[B1] remains the same as there is no predecessor block.

 

OUT[B1] = GEN[B1] ∪ (IN[B1] – KILL[B1])

      = 1110 0000 ∪ (0000 0000 – 0001 1111)

      = 1110 0000 ∪ (0000 0000 ∩ 1110 0000)

      = 1110 0000 ∪ 0000 0000

      = 1110 0000

Figure 10.23 GEN and KILL Values for Each Block

 

IN[B2] = OUT[B1] ∪ OUT[B3] ∪ OUT[B4]

      = 1110 0000 ∪ 0000 0100 ∪ 0000 0011

      = 1110 0111

 

OUT[B2] = GEN[B2] ∪ (IN[B2] – KILL[B2])

      = 0001 1000 ∪ (1110 0111– 1100 0010)

      = 0001 1000 ∪ (1110 0111 ∩ 0011 1101)

      = 0001 1000 ∪ 0010 0101

      = 0011 1101

 

IN[B3] = OUT[B2] = 0011 1101

 

OUT[B3] = GEN[B3] ∪ (IN[B3] – KILL[B3])

      = 0000 0100 ∪ (0011 1101 – 0010 0001)

      = 0000 0100 ∪ (0011 1101 ∩ 1101 1110)

      = 0000 0100 ∪ 0001 1100

      = 0001 1100

 

IN[B4] = OUT[B2] = 0011 1101

 

OUT[B4] = GEN[B4] ∪ (IN[B4] – KILL[B4])

      = 0000 0011 ∪ (0011 1101 – 0010 0001)

      = 0000 0011 ∪ (0011 1101 ∩ 1101 1110)

      = 0000 0011 ∪ 0001 1100

      = 0001 1111

 

PASS 2:

IN[B1] remains same as there is no predecessor block.

OUT[B1] = GEN[B1] ∪ (IN[B1] – KILL[B1])

      = 1110 0000 ∪ (0000 0000– 0001 1111)

      = 1110 0000 ∪ (0000 0000 ∩ 1110 0000)

      = 1110 0000 ∪ 0000 0000

      = 1110 0000

 

IN[B2] = OUT[B1] ∪ OUT[B3] ∪ OUT[B4]

      = 1110 0000 ∪ 0001 1100 ∪ 0001 1111

      = 1111 1111

 

OUT[B2] = GEN[B2] ∪ (IN[B2] – KILL[B2])

      = 0001 1000 ∪ (1111 1111– 1100 0010)

      = 0001 1000 ∪ (1111 1111 ∩ 0011 1101)

      = 0001 1000 ∪ 0011 1101

      = 0011 1101

 

IN[B3] = OUT[B2] = 0011 1101

 

OUT[B3] = GEN[B3] ∪ (IN[B3] – KILL[B3])

      = 0000 0100 ∪ (0011 1101– 0010 0001)

      = 0000 0100 ∪ (0011 1101 ∩ 1101 1110)

      = 0000 0100 ∪ 0001 1100

      = 0001 1100

 

IN[B4] = OUT[B2] = 0011 1101

 

OUT[B4] = GEN[B4] ∪ (IN[B4] – KILL[B4])

      = 0000 0011 ∪ (0011 1101– 0010 0001)

      = 0000 0011 ∪ (0011 1101 ∩ 1101 1110)

      = 0000 0011 ∪ 0001 1100

      = 0001 1111

 

Since the OUT of each block in pass 2 is the same as in pass 1, the solution has converged and the iteration can be stopped.

The same technique can also be applied to common sub-expressions (available expressions), and that information can be used for global optimization.

10.11  Machine-Dependent Optimization

This optimization is applied to the target machine instructions. It includes register allocation, use of addressing modes, and peephole optimization. Instructions involving register operands are faster and shorter (the instruction length is small); hence, if we make use of more registers during target code generation, more efficient code will be generated. Thus register allocation and the use of addressing modes also contribute to optimization. The most popular optimization applied on the target machine is peephole optimization.

Peephole Optimization

Generally, code generation algorithms produce code statement by statement. This may contain redundant instructions and suboptimal constructs. The efficiency of such code can be improved by applying peephole optimization, which is a simple but effective optimization of the target code. The peephole is a small moving window on the target code, and the code in the peephole need not be contiguous. Peephole optimization improves the performance of the target program by examining and transforming a short sequence of target instructions. The advantage of peephole optimization is that each improvement applied may expose opportunities for additional improvements. It may need repeated passes over the target code to get the maximum benefit. It can also be applied directly after intermediate code generation.

In this section, we describe the following examples of program transformations that are characteristic of peephole optimization.

10.11.1  Redundant Loads and Stores

The code generation algorithm produces target code whose instructions have one, two, or three operands. Let us assume two-operand instructions. The following example gives the assembly code for the statement x = y + z.

  1. MOV y, R0
  2. ADD z, R0
  3. MOV R0, x

Instruction 1 moves the value of y into register R0; the second instruction adds the value of z to the register contents, leaving the result in the register; and the third instruction copies the register contents into the location x. At this point the value of x is available both in location x and in register R0.

If the above algorithm is applied on the code a = b + c, d = a + e then it generates the code given below:

  1. MOV b, R0
  2. ADD c, R0
  3. MOV R0, a
  4. MOV a, R0
  5. ADD e, R0
  6. MOV R0, d

Here instructions 3 and 4 form a redundant store/load pair (assuming the value of a is not needed later). These instructions do not affect the values before or after their execution, so they can be eliminated, and the resultant code is as follows:

  1. MOV b, R0
  2. ADD c, R0
  3. ADD e, R0
  4. MOV R0, d
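A peephole pass for this pattern only needs to look at pairs of adjacent instructions. The C sketch below removes a MOV whose source and destination mirror those of the immediately preceding MOV; the Insn layout is an assumption, and a real pass would also have to check that the removed instruction is not a jump target.

#include <string.h>

/* Hypothetical two-operand instruction: op src, dst (e.g. MOV y, R0). */
typedef struct { char op[8], src[8], dst[8]; } Insn;

/* Remove a MOV x, R that immediately follows MOV R, x (or vice versa):
   the second instruction only recreates a value that is already in
   place.  Returns the new number of instructions.                     */
int peephole_redundant_moves(Insn *code, int n)
{
    int out = 0;
    for (int i = 0; i < n; i++) {
        if (out > 0 &&
            strcmp(code[i].op, "MOV") == 0 &&
            strcmp(code[out - 1].op, "MOV") == 0 &&
            strcmp(code[i].src, code[out - 1].dst) == 0 &&
            strcmp(code[i].dst, code[out - 1].src) == 0)
            continue;                        /* drop the redundant MOV  */
        code[out++] = code[i];
    }
    return out;
}

Applied to the six-instruction sequence above, this sketch removes instruction 4 (MOV a, R0); instruction 3 can then be removed by dead-store elimination if a is not live afterwards.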

10.11.2  Algebraic Simplification

There are a few algebraic identities that occur frequently enough to be worth considering.

Look at the following statements.

 

      x: = x + 0

      x: = x * 1

 

They do not alter the value of x. If we keep them as they are, the code generation algorithm applied later may produce six instructions that serve no purpose. Hence, such statements, whether in three-address code or target code, can be removed.

10.11.3  Dead Code Elimination

Removal of unreachable code is an opportunity for peephole optimization. A statement immediately after an unconditional jump, or a statement that never gets a chance to execute, can be identified and eliminated. Such code is called dead code.

For example consider a statement in high level language code

#define x 0

…….

if (x)

{ ----print value

-----

}

If this is translated to target code as

 

if x = 1 goto L1

      goto L2

L1: print value

L2 : ……..

Here the value will never be printed, since x is 0; so the code inside the body of if (x) is dead code and can be removed.

10.11.4  Flow-of-Control Optimization

Sometimes, when we apply code generation algorithms mechanically, we get jumps to jumps, as follows:

goto L1

L1: goto L2

…….

L2: goto L3

….

L3: if a < b goto L4

L4:

This can be optimized as

goto L3

……….

L3: if a < b goto L4

L4:

10.11.5  Reduction in Strength

This optimization mainly deals with replacing expensive operations by cheaper ones. For example

  • x² ⇒ x * x
  • fixed-point multiplication and division by a power of 2 ⇒ shift
  • floating-point division by a constant ⇒ floating-point multiplication by a constant
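The following C equivalents show the cheaper forms; the shift versions are shown for unsigned operands, since signed division by a power of two is not a plain shift.

int      square(int x)       { return x * x; }    /* x² as x * x, no pow()   */
unsigned times8(unsigned x)  { return x << 3; }   /* x * 8  as a left shift  */
unsigned div16(unsigned x)   { return x >> 4; }   /* x / 16 as a right shift */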

10.11.6  Use of Machine Idioms

While generating the target code, it is better to make use of the rich instruction set supported by the target machine instead of blindly applying the available code-generation algorithms; this may produce more efficient code. Features provided by the machine architecture can be identified and used wherever applicable to reduce the overall execution time significantly.

For example, consider the statement x = x + 1; if we apply the code-generation algorithm mentioned under redundant loads and stores, we get several instructions, but many machines provide a single hardware instruction for such specific operations, such as auto-increment and auto-decrement addressing modes or an increment instruction like INR x.

Solved Problems

  1. Create DAG for the following code:

     

    t1 = 4 * i

    t2 = a[t1]

    t3 = b[t1]

    t4 = t2 * t3

    pr = pr + t4

    i = i + 1

    if i <=20 goto (1)

    Solution: The DAG for a given code is shown in Figure 10.24

    Figure 10.24 DAG for 1

  2. Divide the following code, which is for factorial function, into basic blocks and draw flow graph
    1. f = 1;
    2. i = 2;
    3. If (i>x) goto (8)
    4. f = f * i;
    5. t1 = i + 1;
    6.  i = t1;
    7. goto (3)
    8. goto calling program

    Solution: By using the algorithm for leaders, we identify leaders as 1, 3, 4, 8.

    The statements from a leader up to, but not including, the next leader form a basic block.

    Figure 10.25 shows the basic blocks for the factorial program.

    The flow graph is shown in Figure 10.26.

    Figure 10.25 Basic Blocks for Factorial Program

    Figure 10.26 Flow Graph for Factorial Program

  3. Consider the following part of code.

    for (i=1; i<n; i++)

    for (j=1; j<n; j++)

          c [i,j] = a[i,j] * b[i,j];

    print(“done”);

    • Write its equivalent three address code.
    • Identify the basic blocks.

     

    Solution: a. The three address code for the above program is as follows:

        1. i = 1

        2. j = 1

        3. t1 = 4 * i

        4. t2 =i - 1

        5. t3 =t2 + j

        6. t4 = t3 * 4

        7. t5 = a[t4]

        8. t6 = i - 1

        9. t7 = t6 * n

        10. t8 = t7 + j

        11. t9 = t8 * 4

        12. t10 = b[t9]

        13. t11 = t5 * t10

        14. t12 = i - 1

        15. t13 = t12 * n

        16. t14 = t13 + j

        17. t15 = t14 * 4

        18. t16 = c[t15]

        19. t16 = t11

        20. t17 = j + 1

        21. j = t17

        22. if j < n go to 2

        23. t18 = i + 1

        24. i = t18

        25. if i < n go to 1

        26. print(“done”);

    b. To identify the basic blocks, first identify the leader statements. In the above three address code, the leader statements are statement 1, 2, 23, and 26.

        1. i = 1       leader 1

        2. j = 1;    leader 2

        3. t1 = 4 * i

        4. t2 = i - 1

        5. t3 = t2 + j

        6. t4 = t3 * 4

        7. t5 = a[t4]

        8. t6 = i - 1

        9. t7 = t6 * n

        10. t8 = t7 + j

        11. t9 = t8 * 4

        12. t10 = b[t9]

        13. t11 = t5 * t10

        14. t12 = i - 1

        15. t13 = t12 * n

        16. tl4 = t13 + j

        17. t15 = t14 * 4

        18. t16 = c[t15]

        19.t16 = t11

        20. t17 = j + 1

        21. j = t17

        22. if j < n go to 2

        23. t18 = i + 1    leader 3

        24. i = t18

        25. if i <n go to 1

        26. print(“done”);   leader 4

    Basic block 1 has statement 1

    Basic block 2 has statements 2 – 22

    Basic block 3 has statements 23 – 25

    Basic block 4 has statement 26.

     

    The flow graph is shown in Figure 10.27.

    Figure 10.27 Flow Graph for 3

  4. Optimize the code applying suitable optimization techniques on the resultant three address code for question 3.

    Solution: The three address code is given in the previous problem. On applying redundant sub expression elimination we get it as

        1. i = 1

        2. j = 1

        3. t1 = 4 * i

        4. t2 = i - 1

        5. t3 = t2 + j

        6. t4 = t3 * 4

        7. t5 = a[t4]

        8. t6 = b[t4]

        9. t7 = t5 * t6

        10. t8 = c[t4]

        11. t8 = t7

        12. t9 = j + 1

        13. j = t9

        14. if j < n go to 2

        15. t10 = i + 1

        16. i = t10

        17. if i <n go to 1

        18. print(“done”);

  5. For the following statements, draw the DAG.

          t1 = a + b;

          t2 = a * b;

          t3 = t1 * b;

          c = t3 + t2;

    Solution:

    The DAG for the above sequence of statements is in Figure 10.28

    Figure 10.28 DAG for 5

  6. Write the procedure to construct the DAG for a statement. Represent the following statement using DAG.
    a = (b + c) * (d – c) + a / (d – c) + (b + c)

    Solution: Procedure for constructing DAG is explained in Section 10.6. The DAG for the given expression is in Figure 10.29.

  7. Represent the statement x = a / (b + c) – d * (e + f) using DAG.

Solution:

Figure 10.29 DAG for 6

Figure 10.30 DAG for 7

Summary

  • Optimization improves the code in terms of space and time.
  • Machine-independent optimization techniques are applied on intermediate representation.
  • Machine-dependent optimization techniques are applied on the target code or machine code.
  • Flow graphs are used to represent the three address code where nodes are basic blocks and the edges represent the control flow.
  • A basic block is a sequence of statements in which control enters only at the first statement and leaves only at the last, so the statements are either all executed or all skipped, depending on the control flow.
  • A leader is the first statement of a basic block.
  • DAG representation is useful for identifying redundant subexpressions.
  • Function-preserving transformations are performed without changing the function it computes.
  • Structure-preserving transformations are performed without changing the set of expressions computed in the block.
  • Algebraic transformations change the set of expressions computed by a basic block into an algebraically equivalent set.

Fill in the Blanks

1. __________ are the principal sources of optimization.

2. __________ analysis is used to detect loops in intermediate code.

3. Applying optimization guarantees that the generated code is optimal. (Y/N) __________

4. Replicating the body of a loop to reduce the number of tests, when the number of iterations is constant, is called __________.

5. Merging the bodies of two loops, when the loops have the same number of iterations and use the same indices, is called __________.

6. Moving a computation from a high-frequency region to a low-frequency region is called __________.

7. __________ is code that is never used.

8. __________ are useful data structures for redundancy elimination.

9. Replacing a costlier operation by a cheaper one is called __________.

10. The usefulness of a variable is listed using __________.

11. The possible definitions that are reachable at a particular point are listed using __________.

12. In a path p1, p2, …, pn, if pi is the last point in a block, then pi+1 is __________.

13. To apply global redundant subexpression elimination, __________ gives the required information.

Objective Question Bank

  1. Folding is a __________ optimization with regard to the basic block.
    • global 
    • local
    • universal 
    • none
  2. Which of the following is machine-dependent optimization?
    • Strength reduction 
    • Redundancy elimination
    • Flow of control optimization 
    • None
  3. Which of the following is machine-independent optimization?
    • Strength reduction
    • redundancy elimination
    • loop optimization
    • all
  4.  __________ optimization should not introduce additional errors in the program.
    • Folding 
    • Constant propagation
    • Any 
    • None
  5. For finding leaders of basic blocks, we use the rule that the __________ statement is a leader.
    • first
    • last
    • statement that is before a conditional jump
    • statement that is before an unconditional jump
  6. The statements starting from a leader up to, but not including, the next leader are called __________.
    • leader 
    • DAG
    • basic block 
    • none
  7. Which of the following is peephole optimization?
    • Common sub expression elimination 
    • Flow of control optimization
    • Loop jamming. 
    • None
  8. An advantage of copy propagation is that __________.
    •  it often reduces code
    •  it often turns the copy statement into dead code
    •  it often makes code to run faster
    •  None
  9. Code motion is__________.
    •  moving part of computation outside loop.
    •  moving loop invariant computation before loop
    •  Either (a) or (b)
    • None
  10. Folding can be applied to __________.
    • subscripted variables
    • floating point numbers
    • unsubscripted variables
    •  None

Exercises

1. Describe the importance of code optimization, and explain the strength-reduction technique in loops with an example.

2. Explain with example the various techniques in loop optimization.

3. What is local and global optimization? Explain with example any three local optimization techniques.

4. Consider the following part of code.

#include <stdio.h>

int main()

{

int n, i, k = 0;

scanf("%d", &n);

for (i = 2; i < n; i++)

{

      if ((n % i) == 0) break;

}

k = 1;

if (i == n)

      printf("number is prime");

else

      printf("number is not prime");

}

  •  Identify the basic blocks in the given program.
  • Draw the domination tree for the program.
  • Using dead code elimination identify the statements that are eliminated.

5. Explain the following loop optimization techniques with examples.

  •  Frequency reduction
  •  Strength reduction
  •  Code motion

6. Explain about (a) loop unrolling (b) loop jamming (c) code motion.

7. Explain about (a) folding (b) strength reduction (c) redundant code elimination (d) dead code elimination.

8. Write the procedure to construct the DAG for a statement. Represent the following statement using DAG.

 

a = (x + y) * (x – y) + x / (x – z) + (x + y)

 

9. Explain DAG and its use. Write the procedure to construct the DAG for a statement.

10. What is a basic block? With an example explain the procedure to identify the basic blocks in a given program. Draw the domination tree for the following procedure.


read a,b

c=a+b

If (c>=30) go to 10

      c=c*2

      go to 20

10 c=c*3

20 print a,b,c

11.

  • Explain DAG and its use.
  • For the following statements draw the DAG.

     

    t8 = d + e

    t6 = a + b

    t5 = t6 – c

    t4 = t5 * t8

    t3 = t4 – e

    t2 = t6 + t4

    t1 = t2 * t3

 

12.

  • What is local and global optimization?

    read a,b

    c=a+b

    If (c>=30) go to 10

          c=c+0

          c=c*2

          go to 20

    10 c=c*3

    20 print a,b,c

  •  Identify the basic blocks in the given program.
  •  Identify the optimization technique that can be applied and write the optimized code.

14. Define the following terms.

  • Reaching definition.
  • Live variables.
  • Use definition chains.
  • Definition use chains.

15. Explain the formulation of data flow equations for reaching definition in structured programs. Describe the procedure to compute in and out values.

16. Explain about global data flow analysis. Explain the factors that affect the data flow equations.

17. Write the iterative algorithm for reaching definition. Compute in and out for the following flow graph.

(flow graph for this exercise not reproduced here)

18. Explain about global data flow analysis. List the data flow equations for reaching definitions for structured programs.

19. 

  • Explain the use of algebraic identities in optimization.
  • Explain strength reduction with example.

20. 

  • Explain live variable analysis with example.
  • Explain redundant sub expression elimination with example.

Key for Fill in the Blanks

  1. dead code elimination and constant propagation
  2. global flow
  3. No
  4. loop unrolling
  5. loop jamming
  6. code motion
  7. dead code
  8. DAG
  9. strength reduction
  10. use-definition chains
  11. definition-use chains
  12. first point in the successor block
  13. data flow analysis

Key for Objective Question Bank

  1. b.
  2. a.
  3. d.
  4. b.
  5. a.
  6. c.
  7. b.
  8. b.
  9. c.
  10. c.