1.3 Translation Schemes

1.3.1 T-diagram

Most translation schemes accept input code in a specified source language, translate it using a program written in its own implementation language, and generate output in a specified target language, as shown in Fig. 1.2. The translation program itself runs on some specified system, a combination of hardware and operating system software. To depict this arrangement compactly, so-called T-diagrams are traditionally used. A general T-diagram is shown in Fig. 1.3. The arrow over the middle upper box indicates the direction of information processing. Sometimes we may have to extend this symbolism by adding more components, for example the hardware platform on which the translator will work. In that case, one more box is added below the middle lower box.

 


Fig. 1.3 A T-diagram

1.3.2 Assembler

An assembler translates assembly language source code into executable or almost executable (object) code (see Fig. 1.4).

 


Fig. 1.4 T-diagram of an assembler

 

The basic operation of an assembler is the replacement of symbolic information by numeric (binary or hex) information. It does this with the help of two tables: a fixed op-code table, and a symbol table that is built up as the source is read and analyzed. The symbol table keeps track of the identifiers used in the source code.
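The role of the two tables can be illustrated with a minimal sketch in Python. The instruction set, the op-code values and the function name `assemble` are all hypothetical, invented for this illustration; a real assembler also handles directives, addressing modes and instruction sizes. The sketch assumes two passes over the source so that a label may be used before it is defined.

```python
# Hypothetical op-code table for an imaginary machine (fixed, known in advance).
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "JMP": 0x04, "HALT": 0xFF}


def assemble(source):
    """Translate toy assembly text into a list of (opcode, operand) pairs."""
    # Pass 1: build the symbol table by recording the address of each label.
    symbols = {}
    instructions = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()       # drop comments and blank lines
        if not line:
            continue
        if ":" in line:                          # "label: instruction"
            label, line = line.split(":", 1)
            symbols[label.strip()] = len(instructions)
            line = line.strip()
            if not line:                         # a label on a line of its own
                continue
        instructions.append(line.split())

    # Pass 2: replace mnemonics and symbolic operands by numbers.
    code = []
    for parts in instructions:
        opcode = OPCODES[parts[0]]               # look up the fixed table
        operand = 0
        if len(parts) > 1:
            arg = parts[1]
            # A known label becomes its address; anything else is a literal.
            operand = symbols[arg] if arg in symbols else int(arg, 0)
        code.append((opcode, operand))
    return code, symbols


program = """
start: LOAD 10      ; symbolic mnemonic, numeric operand
       ADD 0x20
       JMP end      ; forward reference, resolved in pass 2
end:   HALT
"""
code, symbols = assemble(program)
print(symbols)
print(code)
```

Here the op-code table never changes, while the symbol table is filled in from the source text itself, which is the distinction drawn above.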

An assembler is generally used as one of the steps in a typical modern compiler. For further details, see Appendix B, “Assemblers and Macro Processors”.

1.3.3 Macro Assembler

Though a macro assembler is also loosely called a macro processor, there is some difference in meaning. A macro processor is any program which accepts macro definitions in some macro language and provides for macro expansions or substitutions. It could be a stand-alone system, like the M4 macro processor used heavily in UNIX/Linux systems, or it could be a part, as an additional layer of facility, of other software such as text editors, spreadsheets and compilers.

A macro assembler, on the other hand, is a macro processor integrated into an assembler. It is an assembler with a built-in macro facility. For further details, see Appendix B, “Assemblers and Macro Processors”.

1.3.4 Interpreter

An interpreter accepts source code as input and immediately executes it. It analyzes each statement in the source code to find its meaning and performs the specified operations, using the operating system and hardware platform on which it is based (Fig. 1.5).

 


Fig. 1.5 A simplified view of an interpreter

 

The source code is processed statement by statement; there is no one-time translation of the whole source code. Each time a statement needs execution, it has to be processed afresh. This takes time, of course, and shows up as slower execution compared with a compiled program. How much slower? Execution could be anywhere from 5 to 100 times slower, depending upon the source language and implementation details.

For example, consider a small segment of a shell script, written in the interpreted scripting language of the UNIX/Linux shell:

for f in *.c
do
  grep -n "function" "$f"
done

This script segment considers, one by one, all files in the current directory whose names end in ‘.c’, and prints, with line numbers, every line in each of them that contains the string “function”. Every time the for statement is executed, it is analyzed afresh to find its “meaning”, and the same holds for the grep statement.
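The cost of this repeated analysis can be made visible with a toy interpreter, sketched here in Python. The miniature language (statements of the form `name = a + b` or `name = a * b`), the function name `run` and the rule that undefined variables read as 0 are all assumptions made for this illustration; the sketch does not describe how any real shell works internally.

```python
def run(statements, repeat):
    """Execute a list of 'name = a op b' statements `repeat` times."""
    env = {}          # current values of the variables
    analyses = 0      # how many times a statement had to be analyzed

    def value(token):
        # Undefined variables read as 0 in this toy language.
        return env.get(token, 0) if token.isidentifier() else int(token)

    for _ in range(repeat):
        for text in statements:
            # Analysis happens inside the loop: the same text is split
            # into its parts again on every single execution.
            target, equals, left, op, right = text.split()
            analyses += 1
            if equals != "=" or op not in ("+", "*"):
                raise ValueError("cannot interpret: " + text)
            a, b = value(left), value(right)
            env[target] = a + b if op == "+" else a * b
    return env, analyses


env, analyses = run(["n = n + 1", "sq = n * n"], repeat=3)
print(env, analyses)
```

Two statements executed three times are analyzed six times. A compiler would analyze each statement once, however many times it is later executed.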

1.3.5 Load-and-Go Scheme

A substantial use of an HLL is in “one-time” programs, which must be written and checked out quickly because the need for them is immediate but short-term. Likewise, students’ programs are written to learn the HLL and programming techniques rather than for long-term or commercial use. In such situations, a Load-and-Go scheme is desirable. In this arrangement, the text editor in which a program is developed and the compiler (or interpreter) are integrated in one package. The user types in the program using the built-in editor, and the program is immediately translated and executed. The compiler and linker are considerably simplified because the user program is always loaded at a fixed location in memory. The required function library is also in a fixed place, which makes the linking operation easy (see Fig. 1.6).

 


Fig. 1.6 Load-and-Go scheme

 

Advantages: Simpler compiler and linker, fast turn-around on experimental programs, easier to debug.

Disadvantages: Useful only for small programs, usually limited to a single source file. Library extension is almost impossible. Code optimization is absent because the compiler must be kept simple and small.

A similar scheme is implemented by the Perl and Python script processors, as shown in Fig. 1.7.

 


Fig. 1.7 Perl and Python Load-and-Go
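The translate-then-execute sequence can be observed from within Python itself, using its built-in functions `compile` and `exec`. The sketch below is only an illustration under the assumption that the standard CPython implementation is used; the sample program and the file name `<user program>` are made up for the example.

```python
# A small "user program", held as text exactly as it would be typed in.
source = """
total = 0
for i in range(1, 5):
    total += i
"""

# Step 1 (load): translate the whole source text into a code object.
# In CPython this code object holds bytecode for the Python virtual machine.
code_object = compile(source, "<user program>", "exec")

# Step 2 (go): execute the translated program immediately,
# in a fresh namespace of its own.
namespace = {}
exec(code_object, namespace)

print(namespace["total"])   # 1 + 2 + 3 + 4 = 10
```

No separate object file or linking step is visible to the user: translation and execution follow each other directly, as in the Load-and-Go scheme.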

1.3.6 Compiler

A compiler takes source code in an HLL as input and generates either a machine code executable or object code for subsequent linking and execution (Fig. 1.8).

 


Fig. 1.8 A simplified view of a compiler

 

There are several types of compilers.

One-pass compilers: The compiler completes all its processing while scanning the source code only once. This makes the compiler simpler and faster, but it cannot perform some of the more sophisticated optimizations. In the old days, main memories were relatively small (a few hundred kilobytes was considered a luxury) and the intermediate outputs of translation were stored on magnetic tapes. In modern times, high-speed hard-disk drives and much larger memories are the norm, and the distinction between one-pass and multi-pass compilers is blurring.

Multi-pass compilers: The compiler scans the source code several times to complete the translation. This allows for much better optimization. It also takes care of some quirks of the HLL being handled. For example, consider the following three statements in FORTRAN:

DO 11 I = 1, 10
DO11I = 1,10
DO 11 I = 1. 10

The first and the second statements mean exactly the same thing, because in FORTRAN spaces are simply ignored; they play no role in separating the atoms of the language. Now look at the third statement: it differs from the first two by only one character, but its meaning is very different. While the first and second statements are headers of a DO-loop, the third is an assignment statement. Thus, only after the FORTRAN compiler detects a ‘,’ between ‘1’ and ‘10’ can it go back and re-scan from the beginning to separate out the atoms ‘DO’, ‘11’, ‘I’ and ‘=’, which it need not do if the character between ‘1’ and ‘10’ is ‘.’. In that case, the atoms are ‘DO11I’ (a valid identifier), ‘=’ and ‘1.10’.

Load-and-Go compiler: We have already discussed this in Section 1.3.5.
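The look-ahead involved can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the method of any actual FORTRAN compiler: the function name `classify` is hypothetical, and the sketch assumes that any comma after the ‘=’ marks a DO-loop, so a comma inside parentheses (as in a subscript) would mislead it.

```python
import re


def classify(statement):
    """Classify a FORTRAN-like statement as a DO-loop header or an assignment."""
    text = statement.replace(" ", "").upper()    # blanks carry no meaning

    # Tentative reading: DO <label> <variable> = <rest>
    match = re.fullmatch(r"DO(\d+)([A-Z][A-Z0-9]*)=(.+)", text)

    # Only a comma after '=' confirms the DO-loop reading; having seen it,
    # we return to the start and split off 'DO', the label and the variable.
    if match and "," in match.group(3):
        label, variable, bounds = match.groups()
        return ("DO", label, variable, bounds.split(","))

    # No comma: everything to the left of '=' is a single identifier.
    name, value = text.split("=", 1)
    return ("ASSIGN", name, value)


for line in ["DO 11 I = 1, 10", "DO11I = 1,10", "DO 11 I = 1. 10"]:
    print(line, "->", classify(line))
```

The first two statements are classified as the same DO-loop header, while the third becomes an assignment to the identifier DO11I. The decision cannot be made until the character between ‘1’ and ‘10’ has been examined.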

Optimizing compilers: They contain provisions for target code optimization, so that the generated code is efficient in terms of execution speed and memory usage. Almost all modern compilers have such a facility, generally offered as a number of options.

Just-in-time compiler: Used by Java and by Microsoft .NET's Common Intermediate Language (CIL). Here, an application is supplied as bytecode, which is compiled to the machine code of the platform just prior to execution.

1.3.7 What Does a Compiler Do?

To summarize what we have discussed so far about a compiler, we note that a compiler (see Fig. 1.9):

  • Translates a user program in one language, L1, into a program in another language, L2.
  • Is a large set of programs, with several modules.
  • L1 (source) is usually a high-level language like C, C++ or Java.
  • L2 (target) is usually a form of binary machine language.
  • L2 is not pure machine language, because two further operations, linking and loading, are needed before the program is in executable form. For a detailed discussion of linking and loading operations, see Appendix C, “Linkers and Loaders”.
  • Consists of several steps or phases.

 


Fig. 1.9 A compiler in action
