1.5 Phases of a Compiler

Most of the modern compilers have the following phases:

  • Pre-processing: Usually, this phase is implemented as a macro processor, with its own macro language. It provides for file inclusion, conditional compilation control of segments of the source code, definition of literals, “pragma” commands to the compiler, etc.
  • Lexical analysis: (Scanner) – checks for correct use of the input character set, identifies language atoms and tags them as the type of atom, e.g. NUMBER, IDENTIFIER, OPERATION.
  • Syntax analysis: (Parser) – processes output of the Scanner, detects syntactic constructs and types of statements, generates a parse tree.
  • Semantic analysis: (Mapper) – processes the parse trees, detects the “meaning” of statements and generates an intermediate code.
  • Code Generation: Processes the intermediate code to generate the “machine code” for the target machine. Or alternately, works as pseudo-code interpreter to execute the intermediate code directly (e.g. Perl, Java, Python). In that case it may be called a Virtual Machine (VM).
  • Error checking: (Spread throughout the compiler) – various kinds of error checking, appropriate for a particular phase, are done in each phase.
  • Optimization: (Spread among several phases) – two kinds of optimizations are done – machine-dependent and machine-independent.
  • A compiler can also be thought of as divided into two logical parts (see Fig. 1.11):

 

Front- and back-end of a compiler

 

Fig. 1.11 Front- and back-end of a compiler

 

Front-end of compiler: Consisting of pre-processing, lexical, syntax and semantic phases. This is the part which does the analysis of the input code.

Back-end of compiler: Code generation, Optimization phases. This is the part which does the synthesis of the output or target code.

On the basis of this concept of two parts of a compiler, we can envision several situations for a compiler development project. We may be required to develop compilers for:

  • single source language, single target language (1 × 1);
  • multi-source language, single target language (m × 1);
  • single source language, multi-target language (1 × n);
  • multi-source language, multi-target language (m × n).

In the 1 × 1 case, we need to develop only one compiler, with one front-end and one back-end. For cases m × 1 and 1 × n, we have to develop m and n compilers, respectively. Instead, we really need to develop only m front-ends and one back-end for m × 1 case. Similarly, we really need to develop one front-end and only n back-end for 1 × n case (see Fig. 1.12).

 

Multiple source, single target (m × 1) and single source, multi-target (1 × n) cases. L Source language, C Compiler, P Processor

 

Fig. 1.12 Multiple source, single target (m × 1) and single source, multi-target (1 × n) cases. L Source language, C Compiler, P Processor

 

The real impact of this idea will be realized in the case of m × n, as shown in Fig. 1.13. Instead of developing m × n compilers, we need to develop only m front-ends and n back-ends, to fabricate a total of m × n compilers.

 

Multi-source, multi-target case (m × n). CF Compiler front-end, CB Compiler back-end

 

Fig. 1.13 Multi-source, multi-target case (m × n). CF Compiler front-end, CB Compiler back-end

 

An example of implementation of this idea is the GNU C compiler, some preliminary details of which are given in Section 1.7.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.142.12.207