3.3 Parsing

Parsing (or syntactic analysis) is the process of determining whether a string is a sentence (in some language) and, if so, (typically) converting the concrete representation of it into an abstract representation, which generally facilitates the intended subsequent processing of it. A concrete-syntax representation of a program is typically a string (or a parse tree as shown in Chapter 2, where the terminals along the fringe of the tree from left-to-right constitute the input string). Since a program in concrete syntax is not readily processable, it must be parsed into an abstract representation, where the details of the concrete-syntax representation that are irrelevant to the subsequent processing are abstracted away. A parse tree and abstract-syntax tree are the syntactic analogs of a lexeme and token from lexics, respectively (Table 3.3). (See Section 9.5 for more details on abstract-syntax representations.) A parser (or syntactic analyzer) is the component of an interpreter or compiler that also typically converts the source program, once syntactically validated, into an abstract, or more easily manipulable, representation.

Table 3.3 (Concrete) Lexemes and Parse Trees Vis-à-Vis (Abstract) Tokens and Abstract-Syntax Trees, Respectively

A table of different conversions and relations.
Description

Often lexical and syntactic analysis are combined into a single phase (and referred to jointly as syntactic analysis) to obviate making multiple passes through the string representing the program. Furthermore, the syntactic validation of a program and the construction of an abstract-syntax tree for it can proceed in parallel. Note that parsing is independent of the subsequent processing planned on the tree: interpretation or compilation (i.e., translation) into another, typically, lower-level representation (e.g., x86 assembly code).

Parsers can be generally classified as one of two types: top-down or bottom-up. A top-down parser develops a parse tree starting at the root (or start symbol of the grammar), while a bottom-up parser starts from the leaves. (In Section 2.7, we implicitly conducted top-down parsing when we intuitively proved the validity of a string by building a parse tree for it beginning with the start symbol of the grammar.) There are two types of top-down parsers: table-driven and recursive descent. A table-driven, top-down parser uses a two-dimensional parsing table and a programmer-defined stack data structure to parse the input string. The parsing table is used to determine which move to apply given the non-terminal on the top of the stack and the next terminal in the input string. Thus, use of a table requires looking one token ahead in the input string without consuming it. The moves in the table are derived from production rules of the grammar. The other type of top-down parsing, known as recursive-descent parsing, lends itself well to implementation.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.15.15.100