Chapter 8

Intermediate Code Generation

Intermediate code generator makes target code generation easier. It takes the hierarchical representation of the source program as parse tree and prepares linear representation that is simple and easy to analyze.

CHAPTER OUTLINE

8.1  Introduction

The intermediate code is useful representation when compilers are designed as two pass system, i.e. as front end and back end. The source program is made source language independent by representing it in intermediate form, so that the back end is filtered from source language dependence. The intermediate code can be generated by modifying the syntax-directed translation rules to represent the program in intermediate form. This phase of intermediate code generation comes after semantic analysis and before code optimization as shown in Figure 8.1.

Benefits of intermediate code

  • Intermediate code makes target code generation easier
  • It helps in retargeting, that is, creating more and more compilers for the same source language but for different machines.
  • As intermediate code is machine independent, it helps in machine-independent code optimization.
Figure 8.1

Figure 8.1 Role of Intermediate Code Generator

8.2  Intermediate Languages

Intermediate code can be represented in the following four ways.

  1. Syntax trees
  2. Directed acyclic graph(DAG)
  3. Postfix notation
  4. Three address code

8.2.1  Syntax Trees

A syntax tree is a graphical representation of the source program. Here the node represents an operator and children of the node represent operands. It is a hierarchical structure that can be constructed by syntax rules. The target code can be generated by traversing the tree in post order form. For instance, consider an assignment statement a = b* – (c – d) + b* – (c – d) when represented using the syntax tree it appears as follows shown in Figure 8.2.

The rules for constructing the syntax tree for assignment statements are produced by the syntax-directed definition as shown in Figure 8.3

The tree for the statement a = b* – (c – d) + b* – (c – d) is constructed by creating the nodes in the following order.

      p1 = mkleaf(id, c)

      p2 = mkleaf(id, d)

      p3 = mknode('–', p1, p2)

      p4 = mknode('U', p3, NULL)

      p5 = mkleaf(id, b)

Figure 8.2

Figure 8.2 Syntax tree for a = b* – (c – d) + b* – (c – d)

Production
Semantic Rule

S → id := E

S.nptr := mknode( 'assign', mkleaf(id, id.place), E.nptr)

E → E1 + E2

E.nptr := mknode('+', E1.nptr ,E2.nptr)

E → E1 * E2

E.nptr := mknode('* ', E1.nptr ,E2.nptr)

E → − E1

E.nptr := mkunode('uminus', E1.nptr)

E → ( E1 )

E.nptr := E1.nptr

E → id

E.nptr := mkleaf(id, id.place)

Figure 8.3 SDT for Creating the Syntax tree

      p6 = mknode('*', p5, p4)

      p7 = mkleaf(id, c)

      p8 = mkleaf(id, d)

      p9 = mknode('–', p7, p8)

      p10 = mknode('U', p9, NULL)

      p11 = mkleaf(id, b)

      p12 = mknode('*', p11, p10)

      p13 = mknode('+', p6, p12)

      p14 = mkleaf(id, a)

      p15 = mknode('=', p14, p13)

The tree that is constructed using the above procedures is shown in Figure 8.4.

Figure 8.4

Figure 8.4 Tree Construction for a = b* – (c – d) + b* – (c – d)

8.2.2  Directed Acyclic Graph (DAG)

The tree that shows the same information with identified common sub-expression is called Directed Acyclic Graph (DAG). On examining the above example, it is observed that there are some nodes that are unnecessarily created. To avoid extra nodes these functions can be modified to check the existence of similar node before creating it. If a node exists then the pointer to it is returned instead of creating a new node. This creates a DAG, which reduces the space and time requirement. We can use the same SDT (Figure 8.3) to create a DAG without any modification but mknode() is redefined as above. The list of nodes created in DAG is as follows:

      p1 = mkleaf(id, c)

      p2 = mkleaf(id, d)

      p3 = mknode('–', p1, p2)

      p4 = mknode('U', p3, NULL)

      p5 = mkleaf(id, b)

      p6 = mknode('*', p5, p4)

      p7 = mkleaf(id, c) = p1

      p8 = mkleaf(id, d) = p2

      p9 = mknode('–', p1, p2) = p3

      p10 = mknode('U', p3, NULL) = p4

      p11 = mkleaf(id, b) = p5

      p12 = mknode('*', p5, p4) = p6

      p13 = mknode('+', p6, p6)

      p14 = mkleaf(id, a)

      p15 = mknode('=', p14, p13)

The Figure 8.5 shows the constructed DAG for the given expression.

In this tree the number of nodes is reduced and also helps in identifying redundant sub expression. All the nodes in the syntax tree can be visited by following the pointers starting from the root and the instructions can be generated by traversing the tree in post order form.

8.2.3  Postfix Notation

Postfix notation is a linear representation of a syntax tree. This can be written by traversing the tree in the post order form. The edges in a syntax tree do not appear explicitly in postfix notation; only the nodes are listed. The order is followed by listing the parent node immediately after listing its left sub tree and its right sub tree. In postfix notation, the operators are placed after the operands. The postfix notation for the statement is as follows:

 

a = b* – (c d) + b* – (c – d) is a b c d – U * b c d – U * + =
Figure 8.5

Figure 8.5 DAG Construction for a = b* – (c – d) + b* – (c – d)

8.2.4  Three Address Code

Three address code is a linear representation of a syntax tree or a DAG in which explicit names correspond to the interior nodes of the graph. Three address code is a sequence of statements of the form A = B OP C where A, B and C are the names of variables, constants or the temporary variables generated by the compiler. OP is any arithmetic operation or logical operation applied on the operands B and C. The name reflects that there are at most three variables where two are operands and one is for the result. In three address statement, only one operator is permitted; if the expression is large, then break it into a sequence of sub expressions using the BODMAS rules of arithmetic and store the intermediate results in newly created temporary variables. For example, consider the expression a + b * c; this expression is expressed as follows:

      T1 = b * c

      T2 = a + T1

Here T1 and T2 are compiler-generated temporary names. This simple representation of a complex expression in three address code makes the task of optimizer and code generator simple. It is also easy to rearrange the sequence for efficient code generation. Three address code for the statement a = b* – (c – d) + b* – (c – d) is as follows:

      T1 = c – d

      T2 = –T1

      T3 = b * T2

      T4 = c – d

      T5 = –T4

      T6 = b * T5

      T7 = T3 + T6

        a = T7

The code can also be written for DAG as follows:

      T1 = c – d

      T2 = –T1

      T3 = b * T2

      T4 = T3 + T3

        a = T4

Example 1: Consider the assignment a: = b* – c + b* – c. Draw the syntax tree and the DAG.

Solution: The tree and DAG for the expression a: = b* – c + b* – c is shown in Figure 8.6.

Figure 8.6

Figure 8.6 Syntax tree and DAG for a = b* – c + b* – c

8.3  Types of Three Address Statements

For expressing the different programming constructs, the three address statements can be written in different standard formats and these formats are used based on the expression. Some of them are as follows:

  • Assignment statements with binary operator. They are of the form A := B op C where op is a binary arithmetic or logical operation.
  • Assignment statements with unary operator. They are of the form A: = op B where op is a unary operation like unary plus, unary minus, shift, etc.
  • Copy statements. They are of the form A: = B where the value of B is assigned to variable A.
  • Unconditional Jumps such as goto L: The label L with three address statement is the next statement number to be executed.
  • Conditional Jumps such as if X relop Y goto L. If the condition is satisfied, then this instruction applies a relational operator (<=,>=,<,>) to X and Y and executes the statement with label L else the statement following if X relop Y goto L is executed.
  • Functional calls: The functional calls are written as a sequence of param A, call fun,n, and return B statements, where A indicates one of the input argument in n arguments to be passed to the function fun that returns B. The return statement is optional. For example, if the statement is
  • B fun(Al, A2, A3,.... An), then three address statements for it are as follows:

     

    param Al

    param A2

    param A3

       .

       .

    param An

    call fun, n

    return B

  • Indexed assignments The statements of the form A: = B[ i ] and A [ i ]: = B are indexed assignments.
    • In the A: = B[ i ] statement, A is set to the value in the location i memory units beyond location B.
    • A [i]:=B sets the contents of the location i units beyond A to the value of B. In both these instructions, A, B, and i refer to data objects.
  • Address assignments Statement of the form A:= &B, which sets A to the location B.
  • Pointer assignment Statements of the form A:= *B and *A: = B are included. For instance,
    • A:= *B sets the value of A to the value pointed to by B.
    • *A:=B changes the location of the value in A to the address pointed by B.

For any intermediate code generator, the choice of allowable operator is an important issue as this will simplify the optimization and code generation task.

8.4  Representation of Three Address Code

Three address codes can be represented in special structures known as quadruple, triple and indirect triple.

8.4.1  Quadruple

A quadruple is a record structure with four fields. The first field is to store the operator, the second and third fields are for the operands used in the operation, and the fourth field is for the result. For example, the three address statement A = B op C is written by placing op in the first field, that is, operator, B, and C are placed in the second and third fields, that is, operand 1 and operand 2 respectively, and A in fourth field, that is, result. If the operator is unary, then the third field is not used. In case of conditional and unconditional jump statements, the target label is placed in the result field. The quadruple for the statement a = b* – (c – d) + b* – (c – d) is shown below.

We use U for unary minus and – as it is for binary minus.

image

8.4.2  Triple

In quadruples there is an overhead for managing the temporary variables created. This problem can be reduced by referring the position of the statement that computes the value of the sub expression. In triples there are only three fields, one for the operation and the remaining two for operands that may have a variable or a constant or the statement position number that computes the value of the operand. Since there are only three fields, it is called "triples."

Parenthesized numbers represent the position number of the triple structure, while symbol-table pointers are represented by the names themselves. Since the operands represent two different entries, the fields can be encoded in a proper manner. For example, in case of copy statement a:= tl, where tl is computed at some position (i) then the triplet entries are "=" is placed in the first field, a in the second field and (i) in the third field. For a ternary operation like a[i]:=b, it requires two entries in the structure. The first entry finds the position of i with reference to a and then the actual copy statement as given below.

image

The triple for the statement a = b* – (c – d) + b* – (c – d) is

image

8.4.3  Indirect Triples

Indirect triples is another implementation of three address code where the listing of pointer to triples is given separately as shown in the following example:

image
Sequence
Statement No

(0)

(20)

(1)

(21)

(2)

(22)

(3)

(23)

(4)

(24)

(5)

(25)

(6)

(26)

(7)

(27)

8.4.4  Comparison of Representations

The main difference between the three forms of structures is that in case of quadruples, temporary variables are created, which need entries in the symbol table. This leads to waste of space. This problem is eliminated in triples and indirect triples. The second difference is regarding the extent of indirection that is present in the representation. For instance, when the target code is produced, each variable or compiler generated temporary variable is assigned some run-time memory location. There would be an entry regarding these addresses in the symbol table. In case of quadruples, the three address statement that uses these temporary variables can be accessed immediately from the symbol table. The advantage of these entries in the symbol table is when the optimizer rearranges the statements for generating efficient code, it requires no changes for the variables. In case of triples moving a single statement that defines a temporary value, many changes have to be made in all the statements that refer to this computation either in argl or in arg2. Hence, it is difficult for the optimizer to optimize the code.

In case of indirect triples, this problem does not exist. A statement can be moved by reordering the statement list. Since pointers to temporary values refer to the statements in the actual list of statements, which are not modified, the reference pointers can be reordered easily. The indirect triples are almost similar to quadruples in terms of space and utility. However, indirect triples can save some space compared with quadruples if the same temporary value is used more than once. This is because two or more entries in the statement array can point to the same line in the actual triple structure.

For example, lines (20), (21), and (22) could be combined with (23), (24), and (25).

8.5  Syntax-Directed Translation into Three Address Code

Syntax-directed translation rules can be defined to generate the three address code while parsing the input. It may be required to generate temporary names for interior nodes which are assigned to nonterminal E on the left side of the production E→E1 op E2. While processing, the variable names and code generated so far are tracked till they reach the starting nonterminal. For this purpose, we associate two attributes place and code associated with each nonterminal.

  • E.place, the name that will hold the value of E.
  • E.code, the sequence of three-address statements evaluating E.

8.5.1  Assignment Statement

To generate intermediate code for assignment statement, first searching is applied to get the information of the identifier from the symbol table. These identifiers are simple or multidimensional array or a constant value that is stored in a literal table. After searching, the three address code is generated for the program statement. Function lookup will search the symbol table for the lexeme and store it in id.place. Function newtemp is defined to return a new temporary variable when invoked and gen function generates the three address statement in one of the above standard forms depending on the arguments passed to it.

Let us consider the example arithmetic statement a = – (b – c). When it is represented as three address code, the statements are

      T1 = b – c

      T2 = – T1

        a = T2

Let us consider the order in which the statements are generated. First the statement relating to c-d is generated by using a new temporary variable. While parsing the possible rules used for this expression are

      S → id = E

      E → – E1

      E → E1 – E2

      E → id

Figure 8.8

Figure 8.7 Syntax tree for a = –(b – c)

Consider the syntax tree for the expression a = –(b – c) shown in Figure 8.7.

  1. Using the production E → id, that is, E → b and E → c. Check for entries b and c in the symbol table; if these entries are not present an error message is displayed. If these entries are present then E1.place = b and E2.place = c (pointer to symbol table for the entry b and c).
  2. At node 1 in the figure using the production E → E1 – E2 it creates a temporary variable T1 using newtemp(). Three address code E.place = E1.place + E2.place is generated.
  3. At node 2 in the figure using the production E → – E1, it creates a temporary variable T2 using newtemp(). Three address code E.place = – E1.place is generated.
  4. At node 3 using the production S → id = E searches for a in the symbol table, assuming it the code produced is a = E.place

    The syntax-directed translation for arithmetic statements can thus be written as follows:

     

    S → id = E

    {id.place = lookup(id.name);

    if id.place ≠ null then S.code = E.code | | gen( id.place ":=" E.place) else S.code = type_error}

    E → – E1

    {E.place = newtemp();

    E.code = E1.code | | gen(E.place ":=" "–" E1.place)}

    E → E1 + E2

    {E.place = newtemp();

    E.code = E1.code | | E2.code | | gen(E.place ":=" E1.place "+" E2.place)}

    E → E1 – E2

    {E.place = newtemp();

    E.code = E1.code | | E2.code | | gen(E.place ":=" E1.place "–" E2.place)}

    E → E1 * E2

    {E.place = newtemp();

    E.code = E1.code | | E2.code | | gen(E.place ":=" E1.place "*" E2.place)}

    E → E1 / E2

    {E.place = newtemp();

    E.code = E1.code | | E2.code | | gen(E.place ":=" E1.place "/" E2.place)}

    E → id

    {E.place = lookup(id.name), E.code = " "}

8.5.2  Addressing Array Elements

Elements of array are stored in consecutive memory location. If the array is of size n and the size of each element is s, then the ith element of the array can be accessed at the base address + (i – low) * s when the array is single dimension.

Here base is the base address of the array or the address of the first element of array and low is the lower bound of the array or the index of the first element of the array.

Example 2: Let A[5] be an array of 5 elements. Let the size of each element be 2, that is, s = 2 and the array is stored from memory location 100, that is, base address = 100.

image

To refer to the third element of the array, the address is calculated as 100 + (3 – 1) * 2 = 104.

In case of the two-dimensional array, the expression is written as i * s + base – low * s where the first sub expression is (i * s) and second sub expression is (base – low * s). In the second expression, all the components are known before compilation; hence, they can be pre-computed and stored. This reduces the time taken to generate the address of the ith element. In case of multi-dimension arrays like matrix, elements are either stored as row major or column major order.

Example 3: Consider Array A[3,3] with elements stored in row major order is shown below in Figure 8.8.

image

Figure 8.8 Array A[3,3]

The address of element A[i,j] in row major order is computed with the expression base + (( i – low_i )* n2 + j – low_j ) * s, where base is the starting address of the array, low_i is the lower bound of i, low_j is the lower bound of j, n2 is the number of columns, and s is the size of each element.

This expression can be written as ((i * n2) + j) * s + ( base – (( low_i * n2) + low_j) * s The second part of the expression can be pre-computed by knowing the value of base, low_i, low_j, and s. This helps in faster generation of address of A[i,j]. The generalized rules may be as follows:

      S → A = E

      E → E + E

      E → E * E

      E → A

      A → Alist ]

      A → id

      Alist → Alist, E | id [ E

A can be a simple name (has only one base address and no offset) or an indexed name (has base address and object) assignment to location.

 

 

S → A = E

{ If A.offset = null then gen(A.place '=' E.place

else

gen(A.place '[' A.offset']''='E.place )}

 

E → E1 + E2

{E.place = newtemp();

E.code = E1.code || E2.code || gen(E.place ":=" E1.place"+"E2.place)}

 

E → E1 + E2

{E.place = newtemp();

E.code = E1.code | | E2.code | | gen(E.place ":=" E1.place "*" E2.place)}

 

E → A

{ If A.offset = null then E.place = A.place

else begin

E.place = newtemp()

gen(E.place '=' A.place '[' A.offset']')

end}

 

A → Alist

{ A.place = newtemp()

A.offset = newtemp()

gen(A.place '=' c(Alist.array)

gen(A.offset '=' Alist.place '*' width(Alist.array))}

 

A → id

{ A.place = id.place

A.offset = null}

 

Alist → Alist1, E

{ t = newtemp()

m = Alist1.dim +1

gen(t '=' Alist.place '*' lmt(Alist.array,m))

gen(t '=' t'+' E.place)

Alist.array = Alist1.array

Alist.place = t

Alist.ndim=m}

 

Alist → id [ E

{ Alist.array = id.place

Alist.place = E.place

Alist.ndim = 1}

If A A.offset and A.value are the two attributes associated with A as L-values. If A is an array, A.offset is a temporary variable to store the first part of the expression and A.value stores the second part of the expression. If A is a simple variable, then A.value points to the symbol table and A.offset is set to null. Here we define other functions like lmt() for maximum number of elements present in the jth dimension, width() for the size of the array, m to denote the dimension of the array, and c for the second component of the expression.

8.5.3  Logical Expression

An expression that contains operators like +, –, *, / are simple arithmetic expressions, whereas the expression that contains relational operators like <, >, ≥, ≤, or, and, not, etc., are logical expressions. The logical expressions are mainly used for a set of statements that are to be executed based on the condition that is satisfied, that is, to control the flow of path. The use of logical expression always results in either true or false, which is considered 0/1. 0 indicates false and 1 or a positive number indicates true. The rules for writing the logical expressions are as follows:

E → E1 or E2

E → E1 and E2

E → not E1

E → id1 relop id2

E → (E1)

E → true

E → false

To convert the expression with logical operator requires two things to be added after the evaluation of the expression to true or false.

  1. Where the control should go when the condition is true and
  2. Where to go when the condition is false.

This is done by first evaluating the expression and then adding the "if" statement that checks the result and sets the value to 0 or 1. While generating three address statement, we use variable next that keeps track of current statement number that is used to generate the next required statement number for true or false condition. The translation rules to convert to three address code are as follows:

 

 

E → E1 or E2

{E.value = newtemp();

gen(E.value "=" E1.value "or" E2.value)}

 

E → E1 and E2

{E.value = newtemp();

gen(E.value "=" E1.value "and" E2.value)}

 

E → not E1

{E.value = newtemp();

gen(E.value "=" "not" E1.value)}

 

E → (E1)

{E.value = E1.value}

 

E → id1 relop id2

{E.value = newtemp();

gen( "if" id1.value relop.op id2.value "goto" nextstat + 3)

gen(E.value "=" "0")

gen("goto" nextstat + 2)

gen(E.value "=" "1")}

 

E → true

{E.value = newtemp();

gen(E.value "=" "1")}

 

E → false

{E.value = newtemp();

gen(E.value "=" "0")}

Example 4: Write three address statement for the x or y and not z

Solution: Three address code for the above expression is

t1 = not z

t2 = y and t1

t3 = x or t2

Example 5: Write three address statement for the if x < y then 1 else 0.

Solution: The address code for the above expression is

10. if x < y go to 13

11. t1 = 0

12. go to 14

13. t1 = 1

14.

8.5.4  Control Statements

Flow of control statements can be shown pictorially as in the Figures 8.9 and 8.10 below.

Figure8.9

Figure8.9 Control low for if-then-else statement

Figure 8.10

Figure 8.10 Control low for while statement

While generating the code we need to generate the label that will execute the segment of code depending on the condition whether it is true or false. The rules for writing different constructs are as follows:

      S →. if E then S1

      S → if E then S1 else S2

      S → while E do S1

While writing the translation rules for the first rule, we need to generate a new label for the true segment and the next statement is the statement that follows S. Hence, the translation rules are as follows:

 

 

S → if E then S1

{E.true = newlabel();

E.false = S.next;

S1.next = S.next;

S.code = E.code | | gen(E.true,":") | | S1.code}

 

S → if E then S1 else S2

{E.true = newlabel();

E.false = newlabel();

S1next = S.next;

S2.next = S.next;

S.code = E.code | | gen(E.true,":") | | S1.code | | gen(" GOTO ", S.next) | | gen(E.false, " :") | | S2.code}

 

S → while E do S1

{S.begin = newlabel();

E.true = newlabel();

E.false = S.next;

S1.next = S.next;

S.code = gen(S.begin":") | | E.code | | gen(E.true,":") | | S1..code | | gen("GOTO",S.begin)}

Example 6: Give three address code for the following:

While (a < 5) do a: = b + 2

Solution:

 

L1:

If a < 5 goto L2

goto last

 

L2:

t1 = b + 2

a = t1

goto L1

 

last:

 

Similarly, we can translate for the "do...while" and the "for" loop.

Translate "do Swhile (E)."

The flow of control for the "do while" loop is as shown in Figure 8.11

Figure 8.11

Figure 8.11 Control flow of repeat... statement do... while

Example 7: Translate do x = y + z while a < 10

Solution: LI:

      t1= y + z

      x = t1

      If a < 10 goto L1

Flow of control for the "for" loop is shown in Figure 8.12 for(El;E2;E3) S;

Figure 8.12

Figure 8.12 Control flow of for loop

Translate the following switch statement.

switch(E)

{ case v1: s1;

case v2:s2;

:

:

case vn : sn

}

The translation is based on the way the flow of control executes the statement and is given as follows:

 

 

      t1=E

 

 

      goto test

 

L1:

      s1

 

 

      goto last

 

L2:

      s2

 

 

      goto last

 

Ln:

      sn

 

 

      goto last

 

test:

if (t1==v1) goto L1

 

 

if (t2==v2) goto L2

 

 

      :

 

 

if (tn==vn) goto Ln

 

last:

 

Solved Problems

  1. Translate the following IR forms

    –(a + b) * (c + d) + (a + b + c)

    Solution:

    Postfix form = ab + – cd + * ab + c + +

    Three address code:

     

          t1 = a + b

          t2 = –tl

          t3 = c + d

          t4 = t2 * t3

          t5 = a + b

          t6 = t5 + c

          t7 = t4 + t6

     

    Syntax tree: The syntax tree for the given expression is shown in Figure 8.13.

    Figure 8.13

    Figure 8.13 Syntax tree

    DAG: The DAG for the given expression is shown in Figure 8.14.

    Figure 8.14

    Figure 8.14 DAG:

  2. Convert into postfix notation and write the three address code.

    a * (b + c) / (d + e)

    Solution: The postfix form step by step is given by

    a* (bc+) / (de+)

    (abc+*) / (de+)

    abc+*de+/

    Three address code is

     

    t1 = b + c

    t2 = a * t1

    t3 = d + e

    t4 = t2/t3

     

  3. Convert into postfix notation and write the three address code.

    –a + b / c ↑ d ↑ e*f / g

    Solution: postfix form step by step is given by

    (a–) + b / c ↑ d ↑ e*f/g

    (a–) + b / c ↑ (de ↑)*f/g

    (a–) + b / (cde↑↑)*f/g

    (a–) + (bcde↑↑/)*f/g

    (a–) + (bcde↑↑/f*g/)

    a–bcde↑↑/f*g/+

    Postfix form is:       a-bcde↑↑/f*g/+

    Three address code:

          t1 = –a

          t2 = d ↑ e

          t3 = c↑t2

          t4 = b / t3

          t5 = t4 *f

          t6 = t5 / g

          t7 = tl + t6

     

  4. Let A be the two-dimension matrix with the size 10 * 10; generate the address of x = A[y,z].

    Solution: The lower bound of both is equal to 1, that is, lowl = low2 = 1 and higher bound is equal to 10, that is, nl = n2 = 10 and let the size of each element be 4, that is, s = 4. The annotated parse tree generates the address of x = A[y, z] as shown in Figure 8.15.

    Figure 8.15

    Figure 8.15 Parse tree for x = A[y, z]

  5. Write the three-address statement for x < y or z < xl and yl < zl.

    Solution: Three address code for the above expression will be as follows:

    10. if x < y go to 13

    11. t1= 0

    12. go to 15

    13. t1 = 1

    14. go to 22

    15. if z < X1 go to 18

    16. t2 = 0

    17. go to 23

    18. if y1 < Z1 go to 21

    19. t3 = 0

    20. go to 23

    21.t3 = l

    22. go to true

    23. false cond

  6. Generate the three address code for the following statement:

     

    while a<b

    do

    if c < d then

          x = y + z

    else

          x = y - z

    done

    Solution: The three address code will be as follows:

     

    L1. if a<b then GOTO L2

    GOTO LNEXT

    L2. if c<d then GOTO L3

    GOTO L4

    L3. tl = y + z

    x = tl

    GOTO L1

    L4. tl= y - z

    x = tl

    GOTO L1

    LNEXT.

  7. Generate the three address code for the following C program:

     

    main (0

    {

    int i=l;

    int a[10];

    while(i<=10)

    A[i] = 10;

    }

    Solution: Three address code is as follows:

     

    i = 1

    L: if i<=10 goto L1

    goto last

    L1:tl = i * w

    t2 = addr(A)

    t2[tl]= 10

    Goto L

    Last:

    where w is the width of each array location.

  8. Translate the following switch statement.

     

    switch (i+j)

    { case 1: x = y + z;

    case 2: u = v + w;

    default: p = q + r

    }

    Solution:

    The translation as follows:

     

     

    t=i + j;

    goto test

     

    L1:

    tl = y + z;

    x = tl;

    goto last

     

    L2:

    t2 = v + w

    u = t2 ;

    goto last

     

    L3:

    t3 = q + r;

    p = t3;

    goto last

     

    test:

    if (t==l) goto L1

    if (t==2) goto L2

    goto L3

     

    last:

Summary

  • The intermediate code is useful representation when the compilers are designed as front end and back end.
  • A syntax tree is graphical representation of the source program.
  • DAG representation is useful for common sub expression elimination.
  • Postfix notation is a linear representation of a syntax tree.
  • Three address code is the most commonly used intermediate representation.
  • Quadruples use temporary variables to store intermediate results.
  • Syntax-directed translation rules can be defined for translating the source program to intermediate code.

Fill in the Blanks

  1. __________ rules are defined to generate the three address code.
  2. __________ representation makes the program source language independent.
  3. __________ is the simplest form of intermediate representation.
  4. The postfix notation for the expression x + –y*c is__________ .
  5. The three address code for the expression x + –y*c is__________ .
  6. __________ is a special graph that eliminates the nodes for common expression.
  7. Three address code for the statement a= add(int x, int y)__________ .
  8. __________ representation of three address code uses temporary variables.
  9. __________ representation uses references within the table that has three address code.
  10. __________ is a graphical representation of the source program.

Objective Question Bank

  1. Which of the following is the postfix representation for the expression x + – y * (– y + z)?
    1. xy + –y – z* +
    2. xy – y –z +* +
    3. xy – +y –z +*
    4. xy –y –z +* +
  2. __________ is an invalid three address code.
    1. Quadruple.
    2. Threaded code
    3. Triple
    4. Indirect Triple
  3. DAG stands for__________.
    1. directed adjacent graph
    2. double adjacent graph
    3. directed acyclic graph
    4. double acyclic graph
  4. Program is independent of the source language and the target language when it is represented as__________.
    1. Machine code
    2. Intermediate code
    3. Assembly code
    4. Relocatable code
  5. Machine-independent optimization is applied on the code when it is in__________.
    1. Intermediate form
    2. Assembly code
    3. Target code
    4. Machine Code
  6. The three address code for the statement x + – y * (– y + z) is
    1. t1 = –y, t2 = t1 + z, t3 = t1 * t2, t4 = x + t3
    2. t1 = –y, t2 = x + t1, t3 = t1 + z, t4 = t2 * t3
    3. Both a and b are valid
    4. None
  7. Temporary variables are generated in__________.
    1. quadruple
    2. threaded code
    3. triple
    4. indirect triple
  8. Statements can be moved in and around in__________.
    1. quadruple
    2. threaded code
    3. triple
    4. indirect triple
  9. Intermediate code can be represented as__________.
    1. graph
    2. prefix expression
    3. tree
    4. DAG
  10. Postfix expression can be evaluated using the__________ data structure.
    1. tree
    2. stack
    3. graph
    4. queue

Exercises

  1. Translate the arithmetic expression x* – (y–z) into a syntax tree, postfix notation and three address code.
  2. Translate the executable statement of the following C program into a syntax tree, three address code.

     

    main ()

    {

    int i ;

    int a [10] ;

    i = l;

    while (i < 10)

    {

          a[i] = 0.0; i = i + 1;

    }

    }

  3. Write a program to implement the syntax-directed definition for translating Boolean expression into three address code.
  4. Write a program to implement the syntax-directed definition for translating flow- of-control statements into three address code.
  5. Translate the following assignment statement into three address code using the translation scheme.

    A[i , j] : = B[i , j] + C[A[k , 1]] + D[i + j]

Key for Fill in the Blanks

  1. Syntax direction translation
  2. Intermediate code
  3. Postfix
  4. xy – c*+
  5. t1 = – y, t2 = t1 * c, t3 = x + t2.
  6. Directed Acyclic Graph
  7. param x, param y, call add 2, return a
  8. Quadruples
  9. Indirect triples
  10. Syntax trees

Key for Objective Question Bank

  1. d.
  2. b.
  3. c.
  4. b.
  5. a.
  6. a.
  7. a.
  8. a.
  9. b.
  10. b.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.138.36.72