lexical analysis, Scanner, lex, flex
In this chapter, we discuss the first phase of a compiler – the lexical analyzer or Scanner.
Parsing is a general term meaning analyzing a sentence to arrive at its syntactic structure. From theoretical viewpoint, lexical analysis is also parsing, only the atoms of the language involved are at finer granularity.
Table 3.1 Parsing: Scanner and Parser
Item | Scanner | Parser |
---|---|---|
Language atoms | Characters | Tokens |
Structure sought | Tokens for identifiers, numbers, operators | Statements, expressions,control constructs |
Usual grammar | Regular | CFG |
Usual acceptor | FSM | PDA |
Although a typical programming language contains several context-sensitive rules, e.g. identifier types, goto labels, number of arguments in a functions, etc., there is no known parsing algorithm for context-sensitive grammars. We therefore define the programming languages in terms of regular (Chomsky type 3) and context-free (Chomsky type 2) grammars (Appendix A) and use context-sensitive rules separately while performing the semantic analysis.
We shall assume that the reader has some background in formal languages and Automata, but it is strongly suggested that the material in Appendix A be reviewed. We have defined and explained some of the terms – sentential form, phrase, simple phrase and handle – used in the discussion of parsing in Section A.5.2.
We shall see the full details of parsing in Chapter 4 on Parser.
18.191.176.99