Chapter 11

Code Generation

Code generation is the final phase in a compiler. Given a code in intermediate form, it uses code generation algorithm and register allocation strategies to generate the final target code.

CHAPTER OUTLINE

Code generation phase is responsible for generating the target code. This chapter focuses on the issues in code generation phase and the register allocation strategies. Code generation algorithm for three address code and DAG is explained with an example.

11.1  Introduction

Code generator is the last phase in the design of a compiler. It takes three address code or DAG representation of the source program as input and produces an equivalent target program as output. Figure 11.1 shows the position of the location of code generation phase.

The code generator has many limitations regarding the generation of code that is of high quality, accurate, and efficient. In addition to this, the code generator should run efficiently. There are many issues that are to be considered while designing a code generator.

11.2  Issues in the Design of a Code Generator

The code generated is target language dependent and operating system dependent as memory management, instruction selection, register allocation, order of evaluation would affect the efficiency of the code generated. Even the input to the code generation phase is an issue because there are many forms of intermediate codes that are generated by the front end.

Figure 11.1

Figure 11.1 Code Generator

11.2.1  Input to the Code Generator

The output of front end is the input to the code generator along with the information in the symbol table that is used to determine the run time address of the data objects denoted by the names in the intermediate representation.

Intermediate code is either in the linear form or in the hierarchical form. Linear representation that includes postfix notation, three address representation (quadruples and triples), and hierarchical representation includes syntax trees and DAGs. The input is intermediate representation and is error free. The values of variable names present in the intermediate code can be represented by target machine in a directly manipulatable form. Since semantic analyzer has already performed the type checking, type conversion operators have been inserted wherever necessary and obvious semantic errors have already been detected.

11.2.2  Target Programs

The output of the code generator is the program in target language. Various possible outputs can be generated.

 

Absolute machine code—this code is static and is always placed in the same location in memory. This can be loaded and run immediately. Fast in execution but requires same memory locations. This may be suitable for programs like VI editor. Compilers like PL/C produce absolute code.

 

Relocatable machine code—this allows the subprograms to be compiled separately resulting in a set of relocatable object modules. Loader and linker programmes are needed to link the modules and load the programs into the memory for execution. Although the procedure is expensive, there is flexibility in being able to compile subroutines separately and to call other previously compiled programs from an object module. If relocation is not carried out automatically by the target machine then the compiler must provide explicit relocation information to the loader. This is used to link the separately compiled object modules.

 

Assembly code—When we have Assembly language as the output, it makes the process of code generation easier. We can generate mnemonic instructions and use the macro facilities of the assembler to generate code. The cost involved after code generation is less as the other forms require an assembler. Particularly for a machine with a low memory, this choice is reasonable, where a compiler must use more passes.

11.2.3  Memory Management

The role of code generator in coordination with front end is to map the names of variables in the source program to the addresses of the data objects in run time memory. The name in a three address statement refers to a symbol table entry for that name, which is used by code generator.

Each label in the three address code has to be converted to actual memory addresses of instructions and this process is called “back patching.” In case of quadruples, if the numbers are referred by labels, then each quadruple is read and the address is computed by maintaining a counter for the words used for the instructions generated so far. The quadruple array within an additional field is used to store the count.

For example, if a reference such as j: goto i is encountered, where i could be less than or greater than j,

  • if i is less than j, it is a backward jump, we may simply generate a jump instruction with the target address equal to the machine location of the first instruction in the code for quadruple i.
  • The jump is forward jump when i is greater than j. Hence, i exceeds j. Here we have store on quadruple list i the location of the first instruction generated for quadruple j. Then the quadruple i is processed. Then for all forward jumps to i, proper machine locations are filled.

11.2.4  Instruction Selection

The instruction set of the target machine is an important factor that controls the uniformity of the execution. If the target machine does not support each data type in a uniform manner, then each exception to the general rule requires special handling.

Machine idioms and speed of instruction are other important factors to consider in code generation. Instruction selection is straightforward provided the efficiency of the target program is not an important task then, but this results in more computation time.

For example, let us consider a simple statement a: = b + c, where a, b, and c are statically allocated and can be translated into the machine code as follows:

 

MOV b, R0 /* load b into register R0 */

ADD c, R0 /* add c to R0 */

MOV R0, a /* store R0 into a */

If the code generator translates statement-by-statement, it often produced a very poor code as given below.

Let the statements in three address code be

 

a := b + c

d := a + e

 

would be translated into

 

MOV b, R0

ADD c, R0

MOV R0, a

MOV a, R0

ADD e, R0

MOV R0, d

The fourth statement in the code sequence is redundant. If the value of a is not subsequently used, then the third statement is also redundant.

Let us consider another statement, a = a + 1; if this statement is translated into machine code it results in three machine instructions. If the target machine has an increment instruction INC, using this instruction is more efficient as it would perform the same task as that of three instructions generated in normal code generation.

Instruction speeds are needed to design good code sequence but knowing the accurate timing information is often a difficult task. Deciding which machine code sequence is best for a given three address construct may also require knowledge about the context in which that construct appears.

11.2.5  Register Allocation

Efficient use of registers is particularly important in code generation as the instructions involving the register operands are short and fast than those instructions involving the operands in memory. The use of registers is often subdivided into two sub problems:

  • During the first phase, that is, register allocation phase, we select the set of names that reside in registers at a point in the program.
  • During the later phase of register assignment, we pick the register where a variable will reside in

Register assignment to variables is a difficult task as the registers available are to be used by the operating system and also by the programs that are currently running on the system. Mathematically this problem is NP-complete problem and is also complicated because the hardware and/or the operating system of the target machine may require that certain register usage.

Few target machines require register pairs for some operations and to store results. For example, the integer division and multiplication requires register pairs in the IBM System OS 370 machine. The multiplication instruction is of the form

 

M x, y

 

The even register of an even/odd register pair will have x, which is the multiplicand. The multiplier y is a single register and the multiplicand value is taken from the odd register pair. The result occupies the complete even/odd register pair. Similarly, the division instruction is of the form

 

D x, y

 

In the even/odd register pair, the 64-bit dividend occupies even register x and y represents the divisor. After performing the division operation, the remainder is stored in the even register and the quotient is stored in the odd register.

11.2.6  Choice of Evaluation Order

The efficiency of the target code depends on the evaluation order. The computation order also effects the register requirements to hold intermediate results. Choosing the best order is another difficult task. If the input is in the form of three address code, it may require reordering of the input for efficient code generation. If the input is in the form of DAG, then the best code can be generated by traversing the tree in post order form.

11.3  Approach to Code Generation

An important criterion for a code generator is to generate the code that is correct in terms of meaning and efficiency. Correctness is another important factor because of the number of different cases that code generator must face. Because of prominence in correctness, designing a good target code generator that can be implemented, tested, and maintained easily is an important design goal.

The input for code generator is a sequence of three-address statements partitioned into basic blocks. A simple code generation involves generating code for each three-address statement, taking the advantage of the operands that are in the registers and storing the result in registers as long as possible. The output in register is stored until it is needed for the next computation or just before a procedure call, jump/labeled statement, or end of the basic block. The reason for this is that after leaving a basic block, we may go to several different blocks, or we may go to one particular block that can be reached from several others. This is done to avoid possible error in using the data that reaches that point.

The code generator should keep track of what is currently available in the registers and where the data of a variable is available which will cost less. For this reason it uses two data structures called register descriptor and address descriptor.

 

Register descriptor: It is a structure that maintains a pointer to the list that contains information about what is currently available in each of the registers. Initially all the registers are set to empty. Whenever a code generator makes a request for register to be allocated, this list is verified, if it already holds the operand it is returned otherwise a free register is allocated.

 

Address descriptor: It is a structure that keeps track of the locations for each variable name—where the current value of the name can be found at run time. This information can be stored in the symbol table.

 

The code generator first invokes a function getreg(), that returns a location specifying, where the computation has to be performed by three-address statement. If there is a statement a = b op c, the getreg() function returns a location L where the computation of b op c should be performed. This could be the memory location or a register; the decision depends on the required efficiency.

11.3.1  Algorithm for Code Generation Using Three Address Code

The code generator reads every three-address statement, which is of the form a = b op c it first invokes the getreg() function, generates the target code and appropriately updates the register and address descriptors as follows:

For every three-address statement of the form a = b op c in the basic block do

{

  1. First invoke the function getreg() to return the register or memory location L in which the operation b op c should be performed. The input for this function is three-address statement a = b op c as a parameter, which can be done by passing the index of this statement in the quadruple array.
  2. Check the current location of the operand b by consulting its address descriptor, and if the value of b is currently present in both memory location and in register, then prefer the register reference. If the value of b is currently not available in L, then generate a statement MOV b, L (where b as assumed to represent the current location of b).
  3. Generate the statement OP c, L, and modify the address descriptor of a to state that value of a is now available in L, and if L is in a register, then modify its descriptor to indicate that it will contain the run-time value of a.
  4. If the current values of b and /or c are in the register, and there is no next use of these variables, that is, they are not live at the end of the block, then alter the register descriptor to indicate that after the execution of the statement a = b op c, these registers will no longer contain value of b and /or c.

}

The procedure getreg(), when invoked it returns a location where the computation relating to the given three-address statement a = b op c should be performed. The location to be used to be returned is based on the following conditions:

  1. Prefer the register location if one of the register already contains the variable name b. If b has no next use after the execution of a = b op c, and if b is not live at the end of the block and this register does not hold the value of any other variable, then return the register for L.
  2. If such a register is not available then getreg() search is made to find an empty register; and if an empty register is available, then it returns it for L.
  3. In the absence of an empty register, and if a has next use in the block, or op is an operator, such as indexing, which requires a register for computation, then getreg() searches for a suitable register. If this register is not empty then a statement is generated to empty the register by generating store instruction to store its value. In the appropriate memory location M, the address descriptor is modified, and the register that is freed is returned for L. It uses the different strategies and one such strategy is least recently used to empty one of the occupied register.
  4. Finally, the getreg() procedure returns the memory location L if es that have next use within the block.

Example: Let us consider the sequence of statements as given below.

 

      t = a – b;

      u := a – c;

      v := t + u;

      d := v + u;

 

When these statements are given as input for a code generator, then the operations are performed as follows:

First the address descriptor and register descriptor are set to empty. Let us assume that R0 and R1 registers are available for performing these operations.

For the first statement t = a – b, since the registers are empty, the get register would return register R0 for the computation and hence, the instruction MOV a, R0 is generated and the register descriptor indicates that R0 holds the value of a. Then it generates the instruction SUB b, R0 and the address descriptor for t is set to R0. The last instruction generated is MOV R0 t and updates the address descriptor indicating that the value of t is available in both register R0 and memory location t. Similarly for each statement in three address code is processed and the resultant code generated is shown in Table 11.1.

 

Table 11.1 Contents of register and address descriptor during code generation

Table 11.1

If the code generator is efficient, then the target code that is generated would have the statements as follows:

The algorithm makes use of the next-use information of each name for efficient use of the registers. Therefore it is required to compute the next-use information. If:

  • A statement at the index i in a block assigns a value to name a,
  • And if a statement at the index j in the same block uses a as an operand,
  • If the value assigned to a variable a is not modified in any path from any statement at index i to the statement at index j then

we say that the value of a computed by the statement at index i is used in the statement at index j. In such cases the next use of the variable a in the statement i is statement j. For each three address statement i, compute the information relating to variables that appear in statement i, which has next use in the same block. Backward scanning of basic block allows to attach every statement i under consideration with information of those statements that have next use of each variable name in statement i. The algorithm is as follows:

 

For each statement i of the form a = b op c do

{

  • compute information relating to the next uses of a, b, and c to statement i
  • update the next-use for a as no next-use /* This information can be kept into the symbol table */
  • update the information for b and c to be the next use in statement i

}

Once we gathered the information regarding the next use, use this information for selecting register allocation strategies.

The register usage reduces the cost of computation. It may not always be true, it depends on the factor how the instruction is written, what addressing modes are used, the number of registers available, number of variables used in the program, and their frequency.

11.4  Instruction Costs

The instruction cost depends on the operation and the addressing modes of the operands involved in the instruction. Addressing modes involving the register have cost zero, while those with a memory location or literal in them have one as the operands have to be stored with the instruction.

  1. The instruction MOV R0, R1 has cost one, since it occupies only one word of memory. Both the operands in the instruction are registers which costs zero.
  2. The instruction MOV R0, M has cost two, since the address of memory location M is in the word following the instruction.
  3. The instruction ADD #1, R0 has cost two, since the constant 1 is in the next word following the instruction.
  4. The instruction SUB 4(R0), *12(R1) has the cost three as the constants 4 and 12 are stored in the next two words following the instruction.

The following table gives the details of the added cost based on the addressing mode.

image
Note: For calculating, the cost of instruction is one plus the added cost of source and, destination, i.e., Cost of instruction = 1 + cost of source address + cost of destination address.

Example:

  1. To move register contents to memory (® M)

    MOV R0, M

    cost = 1 + 0 + 1 = 2.

  2. To move the contents from register in indirect indexed mode to memory

    MOV * 4 (R0), M

    cost = 1 + indirect index + instruction word = 1 + 1 + 1 = 3

  3. To move the contents from register in indexed mode to memory

    MOV 4(R0), M

    cost = 1 + indirect mode + instruction word = 1 + 1 + 1 = 3

  4. To move constant or litetral to register

    MOV #1, R0

    cost = 1 + 1 + 0 = 2

  5. To move from memory to memory

    MOV M1, M2

    cost = 1 + 1 + 1 = 3

11.5  Register Allocation and Assignment

From the examples in the previous section, it is clear that the register operands are faster and shorter than the memory operands. A good code generator should consider the availability of registers and the use of variables for generating an efficient code. There are different strategies for identifying the registers and their assignment.

11.5.1  Fixed Registers

The most common and simple strategy is to assign the specific register to specific values. For example, use a separate set of registers for storing the base address, set of registers for storing the stack pointers, set of registers for arithmetic computations and few are reserved by compiler for suitable operations. The advantage of this approach is that the register allocation task is simplified. But the disadvantage is that all the registers are not utilized properly.

11.5.2  Global Register Allocation

When the code is generated for a single block, the variables that are consistently used are stored in the registers. At the end of the block only those variables that are live are stored in the registers. The allocation of variables across the block boundaries is known as global register allocation and this must be consistent. The strategies available for global register allocation are listed below:

  • Registers can be fixed to store values for most frequently used variables throughout the loop.
  • Number of registers can be fixed to hold the most active values in each inner loop.
  • Preference may be given to the free registers if available to one block.

11.5.3  Usage Count

The best way of register allocation is to count the number of times the variable is used in the basic block and then assigning the register to the variable that has the highest usage count. This usage count gives the idea about the reduction in cost of code generated depending on selection of register allocation for specific variable.

image

Here use(x,B) gives the number of times the variable x is used in block B, prior to the definition of and live(x,B) is 1 if x is live on exit from B, otherwise it is 0.

Let us consider the following program fragment in the Figure 11.2, which has four blocks B1, B2, B3, and B4.

The variables that are used are I, J, a, n, m, x1, x2, and x3. The cost of I is computed across every block as follows:

image
Figure 11.2

Figure 11.2 Flowgraph

Similarly, the use count for all variables is computed.

image

Suppose, there are only two registers then it would be better to reserve them for variable I and a as their use count is more. If there are four registers, then three can be assigned for I, J, a and the fourth register can be used for other variables.

11.5.4  Register Assignment for Outer Loop

If the program has nested loops, that is, loop L2 inside the loop L1, then while assigning the registers the following criteria may be chosen.

  • If x is allocated in inner loop L2 then it should not be allocated in outer loop L1-L2.
  • If x is allocated in L1 and is not used in L2 then store the variable x in memory at the entrance to L2 and load while leaving L2.
  • If x is allocated in L2 and not used in L1 then load z on the entrance of L2 and store z on the exit of L2.

11.5.5  Graph Coloring for Register Assignment

Register allocation can be done using graph coloring. Each variable is represented as a node in the graph. If the variables are interfering, then we place an edge between them and this indicates that the same register cannot be assigned for both of them. If there is no edge between two nodes, that is, they are not adjacent, then we can use the same register for both variables, which will reduce the number of registers.

Let us consider n registers r1, r2...., rn with n different colors. Now it is required to assign a color so that no two adjacent nodes get the same color. This resembles the graph coloring problem. The solution to this problem will result in minimum number of registers to be used for execution of the program which is cost effective.

11.6  Code Generation Using DAG

Code generation using DAG would result in efficient code as it eliminates redundant instructions and uses minimum number of registers. Target code is generated in two phases—numbering and code generation.

Numbering phase assigns a number to each node in the tree that indicates how many registers are needed to evaluate the subtree of this node. Let us assume that for a node T it requires 1 register to evaluate its left subtree and r registers to evaluate the right subtree. If one of the numbers is large, say 1 > r, then we can evaluate the left subtree first and store its result into one of the registers Rk. Now the same registers can be used to evaluate the right subtree excluding the register Rk. If l = r we need an extra register Rl+1 to remember the result of the left subtree. If node T is a leaf then the number of registers to evaluate T is either 1 or 0 depending whether it is a left or a right subtree. For example, the instruction ADD R1, A it is better to handle the right operand A directly without storing it into register. Usually the numbering algorithm starts from the leaf nodes of the tree and assigns 1 or 0 as explained. Then for each node whose children are labeled l and r, if they are equal then the node number is l + 1 otherwise it is maximum of l and r.

For example, for the expression (A – B) + ((C + D) + ( E * F)), which corresponds to the AST/DAG as shown in the following Figure 11.3.

Numbering phase would process this and assign the register number to each node as shown in Figure 11.4.

Code generation phase generates the code based on the numbering assigned to each node T. All the registers available are arranged as a stack to maintain the order of the lower register at the top. This makes an assumption that the required number of registers cannot exceed the number of available registers. In some cases, we may need to spill the intermediate result of a node to memory. This algorithm also does not take advantage of the commutative and associative properties of operators to rearrange the expression tree.

It first checks whether the node T is a leaf node; if yes, it generates a load instruction corresponding to it as load top(), T. If the node T is an internal node, then it checks the left l and right r subtree for the number assigned. There are three possible values, the number on the right is 0 or greater than or less than the number on the left. If it is 0 then call the generate() function with left subtree l and then generate instruction op top(), r. If the numbering on the left is greater than or equal to right, then call generate() with left subtree, get new register by popping the top, call generate() with right subtree, generate new instruction for OP R, top(), and push back the used register on to the stack.

Figure 11

Figure 11.3 AST for (A–B) + ((C+D) + (E * F))

Figure 11.4

Figure 11.4 Register Allocation

If the number on the left is smaller than the number on the right, then first swap the top two elements on the stack, call the generate() function with the right subtree, get new register by popping the top, call generate() with left subtree, generate new instruction for OP R, top() push back the used register on to the stack, and finally swap the top two elements of the stack.

Algorithm to generate the target code using the numbering information.

 

function CodeGen_DAG(T)

{

If T is a leaf node then generate Load top(), T

If T is some internal node then identify the l and r

     children then

          {

        CodeGen_DAG(l)

        Generate statement op top(), r

           }

If register (l) > register( r ) then

          {

        CodeGen_DAG(l)

        R=pop ()

        CodeGen_DAG(r)

        Generate statement op R, top()

        Push(R)

          }

If(register(l) < register (r) then

          {

        Exchange the top two elements of the stack CodeGen_DAG(r)

        R=pop ()

        CodeGen_DAG(l)

        Generate statement op R, top()

        Push(R)

        Exchange the top two elements of the stack

          }

}

It is clear from the numbering phase that this evaluation requires two registers R1 and R2. Let the registers be arranged with R1 on top of the stack. The code generator first calls the generate function with the root node. Since the right child has the number 2, which is greater than the left, it first swaps the top two register numbers and then calls the generate function with the right sub tree. This procedure is repeated until the leaf is reached and the resultant code generated is as follows:

 

load R2, C

add R2, D

load R1, E

mult R1, F

add R2, R1

load R1, A

sub R1, B

add R1, R2

Solved Problems

  1. Generate the target code for the following three address code.

     

    D := B – C

    E := A + B

    B := B + C

    A := E – D

     

    Solution: Let us assume there are two registers. The code generated is displayed step by step in the following table.

    image
  2. Draw DAG for the statement a/(b + c) – d * (e + f) and generate the target code.

     

    Solution: The above expression when represented as a DAG then we get Figure 11.5. After the numbering phase, the number of registers required at each node is computed and displayed in Figure 11.6.

    Figure 11.5

    Figure 11.5 AST for a/(b + c) – d * (e +imagef)

    Figure 11.6

    Figure 11.6 Register Allocation

    MOV A, R1

    MOV B R2

    ADD C R2

    DIV R2 R1

    MOV D R2

    MOV E R3

    ADD F R3

    MUL R3 R2

    SUB R2 R1

    Assuming there are three registers available, the target code generated is listed below.

  3. If there are three registers r1, r2, and r3, how are these registers allocated for the following program fragment?

     

    Solution:

    The variables that are used are temp1, temp2, temp3, temp4, temp5, and temp6. The cost of these variables is computed across every block as follows:

    image
    Variable
    image
    Total Use count

    temp1

    1 + 2 * 1
    3

    temp2

    1 + 2 * 3
    7

    temp3

    2 + 2 * 1
    4

    temp4

    1 + 2 * 2
    5

    temp5

    1 + 2 * 1
    3

    temp6

    0 + 2 * 1
    2

    Similarly, the use count for all variables is computed as shown in the table below.

    If there are three registers, then these registers are to be assigned to the variables temp2, temp4, temp3 as their use count is high.

Summary

  • Code generator is dependent on the target language.
  • Different forms that the input code generator can have are polish notation, three address code, and syntax trees.
  • There are three forms of target code—absolute machine code, relocatable machine code, and assembly code.
  • Space and time of the object code also depends on the type of instructions used.
  • DAG’s representation is useful in generating efficient code.
  • Register descriptors are maintained to keep track of the contents of the registers.
  • Address descriptors are maintained to keep track of the availability of the value of variable.
  • The getreg() function returns the register that can be used for the computation of the instruction.
  • The cost of instruction is dependent on the register allocation.
  • Use count and graph coloring approaches will improve the cost of program execution with minimum number of registers being used.
  • Code generation using DAG helps in generating the code that uses minimum number of registers.

Fill in the Blanks

  1.  _________ phase is responsible for generating the target code.
  2.  The intermediate codes that are in linear representation are _________ and _________.
  3. The intermediate codes that are in hierarchical representation are _________ and _________.
  4.  _________ code is static and is always placed in the same location in memory.
  5.  _________ and _________ programmes are needed to link the modules and to load the programs into the memory for execution.
  6.  The conversion of all labels in three address statements to addresses of instructions is known as _________.
  7. Target code for the instruction a = b * c is _________.
  8.  Use of_________ reduces the cost of instruction.
  9.  _________ maintains a pointer to the list that contains the information about what is currently available in the registers.
  10.  _________ keeps track of locations of each variable.
  11.  What is the cost of the instruction mov r5, r3?
  12.  _________ form of writing the instruction would minimize the instruction cost.
  13. C(R) indicates that the instruction is in the _________ mode.
  14. The cost of instruction is dependent on the _________ of the operands.
  15.  _________ gives the number of times the variable x is used in block B.

Objective Question Bank

  1. The input of code generation phase is.
    1. Polish notation.
    2. Three address code
    3. Abstract syntax tree
    4. Any one of the above
  2. Name the programs that are needed to run the code that is in relocatable form.
    1. Assemblers and loaders
    2. Loaders and linkers
    3. Assemblers and linkers 
    4. None
  3. Back patching is a process that comprises which of the following tasks?
    1. The labels in three address statements are converted to the addresses of instructions.
    2. The address of instruction is converted to labels in three address statements.
    3. Both are valid
    4. None
  4. Which of the following machine idioms perform the task equivalent to a = a + 1?
    1. INC
    2. SFT
    3. Both are valid
    4. None
  5. Register assignment to variables is _________ problem.
    1. P – hard problem
    2. P – complete problem
    3. NP – hard problem
    4. NP – complete problem
  6. For integer multiplication in IBM System/370 _________ registers are used.
    1. single
    2. pair
    3. two pairs
    4. Any of the above
  7. Name the descriptor required to keep track of content of registers.
    1. Address
    2. Register
    3. Both
    4. Any of the above
  8. Name the descriptor used to keep track of the availability of value for the variables.
    1. Address
    2. Register
    3. Both
    4. Any of the above
  9. Which of the following intermediate code helps in generating the efficient target code?
    1. Post fix notation
    2. Three address code
    3. Quadruples
    4. DAG
  10. Code generator is dependent on
    1. Type of input
    2. Type of output
    3. Register allocation
    4. All

Exercises

  1. Explain the code generation algorithm, function getreg() with an example.
  2. Write about the use of DAG in code generation. Explain the procedure for constructing DAG with an example.
    1. Write about the issues in the design of code generator.
    2. Write about target code forms. Explain how the instruction forms effect the computation time.
  3. Write the code generated for the following statements.
    1. A = B[i] 
    2. A[i] = B
    3. A = *p
    4. A = B + C
    1. Write about global register allocation strategy for loops.
    2. Explain code generation from DAG. For the following instructions construct DAG.

    t1: = a + b

    t2: = a + b

    t3: = e – t2

    t4: = t1 – t3

  4. For the following instructions, construct DAG and generate code with and without optimization.

    t1: = a + b

    t2: = c + d

    t3: = e – t2

    t4: = t1 – t3

  5. Generate the code for the following C statements using its equivalent three address code.
    1. a = b + c 
    2. x = a/(b + c) – d * (e + f)
    3. * A = p
    4. A = B + C
  6. Generate the code for the following C statements using its equivalent three address code.
    1. a = b + 1
    2. x = y + 3
    3. y = a/b
    4. a = b + c
  7. Explain the following terms.
    1. Register descriptor.
    2. Address descriptor,
    3. Instruction costs.
    1. What is DAG? Construct the DAG for the following basic block.

    D: = B – C

    E: = A + B

    B: = B + C

    A: = E – D

    1. Explain the importance of register allocation with respect to optimization?
    2. Explain the importance of addressing modes with respect to optimization?
    1. Write about global register allocation strategy for loops.
    2. Explain code generation from DAG. For the following instructions construct DAG.

    t1: = a/b

    t2: = a/b

    t3: = e – t2

    t4: = t1– t3

    t5: = e – t2

    t6: = t4 * t5

    1. Write the three-address code for the following code.

      fact (x)

      { int f=1;

      for (i=2, i >=x, i++)

      {

      f=f*i;

      return f;

      }

    2. Represent the above code using DAG.
    1. Explain the importance of register allocation with respect to optimization.
    2. Explain the procedure for constructing DAG with an example. Write the applications of DAG.

Key for Fill in the Blanks

  1. Code generation
  2. Postfix notation, three address representation.
  3.  Syntax trees and DAGs
  4.  Absolute machine code
  5.  Loader and linker
  6. Back patching.
  7. mov b R1, mul c R1, mov R1 a
  8. Registers
  9. Register descriptor.
  10. Address descriptor
  11. One
  12. Register and indirect register
  13. Indexed
  14. Addressing mode
  15. use(x,B)

Key for Objective Question Bank

  1. d
  2. d
  3. a
  4. d
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.117.185.169