4.2 Interpretation Vis-à-Vis Compilation

An interpreter, given the program input, traverses the abstract-syntax tree to evaluate and directly execute the program (see the right side of Figure 4.1 labeled “Interpreter”). There is no translation to object/bytecode involved in interpretation. “The interpreter for a computer language is just another program” (Friedman, Wand, and Haynes 2001, p. xi, Foreword, Hal Abelson). This observation is described as the most fundamental idea in computer programming (Friedman, Wand, and Haynes 2001). The input to an interpreter is (1) the source program to be executed and (2) the input of that source program. We say the input of the interpreter is the source program because to the programmer of the source program, the entire language implementation (i.e., Figure 4.1) is the interpreter rather than just the last component of it which accepts an abstract-syntax tree as input labeled “Interpreter” (see the bottom component in Figure 4.1). The output of an interpreter is the output of the source program.

A flow diagram of execution by interpretation.

Figure 4.1 Execution by interpretation.

Data from Friedman, Daniel P., Mitchell Wand, and Christopher T. Haynes. 2001. Essentials of Programming Languages. 2nd ed. Cambridge,MA: MIT Press.

Description

In contrast, a compiler translates the abstract-syntax tree (which is already an intermediate representation of the original source program) into another intermediate representation of the program (often assembly code), which is typically closer in similarity to the instruction set architecture (ISA) of the target processor intended to execute the program1 (see the center of Figure 4.2 labeled “Compiler”). A compiler typically involves two subcomponents: the semantic analyzer and the code generator (neither of which is discussed here). Notice how the first three components used in the process of compilation (i.e., scanner, parser, semantic analyzer) in Figure 4.2 correspond to the three progressive types of sentence validity in Table 2.1.

A flow diagram of execution by compilation.

Figure 4.2 Execution by compilation.

Data from Friedman, Daniel P., Mitchell Wand, and Christopher T. Haynes. 2001. Essentials of Programming Languages. 2nd ed. Cambridge,MA: MIT Press.

Description

Abstraction is the general concept referring to the idea that primitive details of an entity can be hidden (i.e., abstracted away) by adding a layer to that entity; this layer provides higher-level interfaces to those details such that the entity can be accessed and used without knowledge of its primitive details. Abstraction is a fundamental concept in computer science and recurs in many different contexts in the study of computer science. Progressively abstracting away from the details of the instruction set understood by the target processor has resulted in a series of programming languages, each at a higher level of abstraction than the prior:

Fourth-generation language, example lex and y a c c, leads to high-level language, example Python, Java, and Scheme. The high-level language leads to assembly language, example M I P S, which in turn leads to machine language, example x 86.

Assembly languages (e.g., MIPS) replaced the binary digits of machine language with mnemonics—short English-like words that represent commands or data. High-level languages (e.g., Python) extend this abstraction with respect to control, procedure, and data. C is sometimes referred to as the lowest high-level language because it provides facilities for manipulating machine addresses and memory, and inlining assembly language into C sources. Fourth-generation languages are referred to as such because they follow three prior levels. Note that machine language is not the end of abstraction. The 0s and 1s in object code are simply abstractions for electrical signals, and so on.

Compilation is typically structured as a series of transformations from the source program to an intermediate representation to another intermediate representation and so on, morphing the original source program so that it becomes closer, at each step, to the instruction set understood by the target processor, often until an assembly language program is produced. An assembler—not shown as a component in Figure 4.2—translates an assembly language program into machine code (i.e., object code). A compiler is simply a translator; it does not execute the source program—or the final translated representation of the program it produces—at all. Furthermore, its translation need not bring the source program any closer to the instruction set of the targeted platform. For instance, a system that translates a C program to a Java program is no less a compiler than a system that translates C code into assembly code. Another example is a LATEX compiler from LATEX source code—a high-level language for describing and typesetting documents—to PostScript—a language interpreted by printers. A PostScript document generated by a compiler can be printed by a printer, which is a hardware interpreter for PostScript (see approach ① in Figure 4.6 later in this chapter), or rendered on a display using a software interpreter for PostScript such as Ghostscript (see approach ② in Figure 4.6).

Web browsers are software interpreters (compiled into object code) that directly interpret HTML—a markup language describing the presentation of a webpage— as well as JavaScript and a variety of other high-level programming languages such as Dart.2 (One can think of the front end of a language implementation as a compiler as well. The front end translates a source program—a string of characters—into an abstract-syntax tree—an intermediate representation.) Therefore, a more appropriate term for a compiler is translator. The term compiler derives from the use of the word to describe a program that compiled subroutines, which is now called a linker. Later in the 1950s the term compiler, shortened from “algebraic compiler,” was used—or misused—to describe a source-to-source translator conveying its present-day meaning (Bauer and Eickel 1975).

Sometimes students, coming from the perspective of an introductory course in computer programming in which they may have exclusively programmed using a compiled language, find it challenging to understand how the individual instructions in an interpreted program execute without being translated into object code. Perhaps this is because they know that in a computer system everything must be reduced to zeros and ones (i.e., object code) to execute. The following example demonstrates that an interpreter does not translate its source program into object code. Consider an interpreter that evaluates and runs a program written in the language simple with the following grammar:

A list of two grammar rules of an interpreter that evaluates and runs a program.
Description

The following is an interpreter, written in C, for the language simple:

A set of code lines for an interpreter written in C.
Description

A session with the simple interpreter follows:

A set of 32 code lines for an interpreter.
Description

The simple program 2 + 3 is never translated prior to execution. Instead, that program is read as input to the interpreter, which has been compiled into object code (i.e, the executable simple). It is currently executing on the processor and, therefore, has become part and parcel of the image of the simple interpreter process in memory (see Figure 4.3 and line 7 in the example session). In that sense, the simple program 2 + 3 has become part of the interpreter. An interpreter typically does not translate its source program into any representation other than an abstract-syntax tree or a similar data structure to facilitate subsequent evaluation.

An illustration of a simple program fed into a simple interpreter to produce a program output.

Figure 4.3 Interpreter for the language simple, illustrating that the simple program becomes part of the running interpreter process.

Description

In summary, an interpreter and a compiler each involve two major components. The first of these—the front end—is the same (see the top of Figures 4.1 and 4.2). The differences in the various approaches to implementation lie beyond the front end.

1. This is not always true. For instance, the Java compiler javac outputs Java bytecode.

2. https://dart.dev

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.221.163.13