8.1 Building a Parse Tree

Suppose we want to obtain the parse tree for a program in the language generated by the above grammar. In anticipation of further processing phases, we would like to have an internal form most suitable for reduction to an Intermediate representation. However, here for illustration purpose, first we represent the parse tree in a form suitable for actually drawing the tree, which is done with the help of a TEXpackage known as pst-tree.

In the pst-tree package, a tree is written as pstree{root}{node1 node2 … noden}, where each node can be written in one of several different ways, but we use TR{node-label} to write a node. For example, pstree{TR{A}}{ TR{B} TR{C}} gives the tree.

 

image

 

How do we generate the parse tree? The idea is rather simple – each Non-Terminal in the grammar should correspond to a sub-tree, with that Non-Terminal as the root-node. Each Terminal should be a leaf-node. We can put printf() statements in our RDP at the beginning of each function printing out the relevant portion of the total tree. For example, the following code segments show how a Non-Terminal and a Terminal output their contribution to the parse tree.

pT(){
     enter(“\pstree{\TR{T}}{”);
     pF();
     while(nextsymbol() == ‘*’) {
	    ppast();
	    pF();
     }
     out();
}
pplus(){
    enter(“\TR{+}”);
    if(symbol() != ‘+’)
      error(“+ expected”);
}

The enter() simply prints out its argument. Function out() simply prints a ‘}’. In fact the parse tree shown in Fig. 8.5 (and others too) was obtained from an RDP in this manner. A sample RDP program parser_tree.c is available. For example, with an input PLa = a * (a + a), a parse tree shown in Fig. 8.3 is obtained. Before we close the discussion of generation of a parse tree, in anticipation of later need, let us consider how a parse tree data structure can be generated in the computer memory.

 

Parse tree for input string PLa = a * (a + a)

 

Fig. 8.3 Parse tree for input string PLa = a * (a + a)

8.1.1 Generating Parse Tree in Memory

The basic idea is still the same as displaying of the picture of the tree as given above. However, while drawing the tree, conversion from the linear representation, i.e. in a serial text form, to the two-dimensional picture was done by the pst-tree package. To generate the tree in the memory, we have to write code for some of that work. A C program to generate the parse tree parser_tree_ tb.c is available. We show here its essential segments.

A tree-node is defined in a header file tb_tree_1.h as follows:

typedef union{
  struct node_struct*N;
  struct tree_struct*T;
  char*S;
  long I;
  float F;
  void *P;
}util;
typedef struct node_struct{
struct node_struct *link[2]; // left = 0, right = 1
util u,v,w,x,y,z;
}Node;

Though this tree-node definition is optimized for a binary search tree, it is flexible enough to deal with very complex tree structure. We use pointer u.N as the first child pointer and v.N as a pointer to the next sibling. Also, we use w.S as a pointer to the node label string. For example, the tree

 

image

 

is represented in memory as shown in Fig. 8.4.

 

A sample tree representation in memory

 

Fig. 8.4 A sample tree representation in memory

 

With this arrangement in mind, we write two essential functions:

#include “tb_tree_1.h”
Node * tree_root(char *label){
  Node *np;
  np = (Node *)malloc(sizeof(Node));
  if(np == NULL) exit(-1);
  np–>u.N = NULL;	// ptr first child (if any)
  np–>v.N = NULL;	// ptr next sibling (if any)
  np–>w.S = (char *)malloc(strlen(label)+1);
  strcpy(np–>w.S, label); // node label
  return np;
}
void add_child(Node *root, Node *child){
  Node *np;
  np = root–>u.N; // children list start
  child–>v.N = NULL;
  if(np == NULL){
    root–>u.N = child;
    return;
  }
  while(np–>v.N != NULL){
    np = np–>v.N; // next in child list
  }
  np–>v.N = child;
}

We would like to have a look at the parse tree that we created in memory, for which we write a tree traversal function:

void traverse_parse_tree(Node *root){
  Node *np;
  printf(“[%s" ,root–>w.S);
  np = root–>u.N;
  while(np != NULL){
    traverse_parse_tree(np);
    np = np–>v.N;
  }
  printf(“]”);
}

The following code segments illustrate modifications to the basic RDP code for generating an internal parse tree. A typical function for a Non-Terminal symbol in the grammar looks like:

Node *pE(){
  Node *np;
  np = tree_root(“E”);
  add_child(np,pT());
  while(nextsymbol() == ‘+’) {
    add_child(np,pplus());
    add_child(np,pT());
  }
  return np;
}

A function for a Terminal symbol in the grammar looks like:

Node *psemi(){
  if(symbol() != ‘;’) error(“; expected”);
  return(tree_root(“;”));
}

With PLa = a * (a + a). as input the parse tree is internally built and main() displays it as:

[P[p][R[S[L[1][a][=][E[T[F[a]][*][F[(][E[T[F[a]]][+][T[F[a]]]][)]]]]]]] [.]]

Compare this output with that shown in Fig. 8.3.

With this introduction to how to build a parse tree in memory we now go on to discuss Intermediate representations.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.117.158.165