In general, the Scanner passes all errors on to the parser and prints nothing itself. It communicates an error by returning a special error token called ERROR. Note that this is not the token called error (in lowercase), which is used by the parser; you should ignore that one here. There are several requirements for reporting and recovering from lexical errors:
We shall now see the basic steps for including error detection in the Scanner. We shall add error detection to the typical Scanner for a programming language given in Section 3.3.1.
%{
#include <stdio.h>
#define VARIABLE 257
#define INTEGER 258
#define TEXT 259
#define ERROR 511
%}
comment "//".*
… … …
text \"({ascii})*\"
%%
{whitespace} {}
… … …
{text} {mktext(); return TEXT;}
. {return ERROR;}
%%
int main(){
    int i;
    /* yylex() returns 0 at end of input */
    while ((i = yylex()) != 0) {
        if (i == ERROR) {
            printf("Error! %s\n", yytext);
        } else {
            printf("%d\n", i);
            /* Here parser will take over instead of printf() */
        }
    }
    return 0;
}
If you generate the Scanner from the above lex code, then compile and run it, it gives the following responses for a valid integer, a valid variable, a valid string, an invalid variable and an invalid integer (or variable), respectively. Each input line is followed by the line(s) the Scanner prints for it:
123
258
asd
257
"this is good."
259
ASD
Error! A
Error! S
Error! D
456wer
258
257
The last trial shows that although, from a typical programming-language viewpoint, the input string “456wer” is neither an integer nor a variable, our Scanner has detected it as an integer immediately followed by a variable. From the viewpoint of syntax (i.e. the Parser), this is a wrong construct, and it should be detected as such by the Parser. When the Scanner itself detects an error, it normally resumes with the next character, whereas the Parser has to recover at its own recovery point. This is why we said that Scanner-detected errors should be passed on to the Parser and reported by the Parser.