Compilation Phase

Input Buffer

The input buffer is a logical area in memory into which SAS reads each record of a raw data file when SAS executes an INPUT statement. The buffer is created only when the DATA step reads raw data. When the DATA step reads a SAS data set, SAS reads the data directly into the PDV.
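The difference between the two kinds of input can be sketched with a pair of small DATA steps. This is an illustration only: the data set names work.demo and work.copy and the two in-stream records are made up for the example. The first step reads raw data, so an input buffer is created, and the _INFILE_ automatic variable is used to write the current buffer contents to the log. The second step reads a SAS data set, so no input buffer is involved.

```sas
/* Reads raw in-stream records: an input buffer is created. */
data work.demo;                      /* hypothetical data set name */
   input Item $ 1-13 InStock 15-16;
   put _infile_;                     /* writes the current input buffer to the log */
   datalines;
Bird Feeder   12
Glass Mugs     6
;
run;

/* Reads a SAS data set: values go straight into the PDV. */
data work.copy;                      /* hypothetical data set name */
   set work.demo;
run;
```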

Program Data Vector (PDV)

After the input buffer is created, the PDV is created. The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values from the input buffer or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the program data vector. From here, SAS writes the values to a SAS data set as a single observation.
Along with data set variables and computed variables, the PDV contains these automatic variables:
  • the _N_ variable, which counts the number of times the DATA step iterates.
  • the _ERROR_ variable, which signals the occurrence of an error caused by the data during execution. The value of _ERROR_ is either 0 (no error exists) or 1 (one or more errors occurred). The default value is 0.
Note: SAS does not write these variables to the output data set.
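As a minimal sketch of how the two automatic variables behave, the step below writes them to the log on every iteration. The data set name work.stock and the in-stream records are invented for illustration; the second record deliberately contains a non-numeric value for a numeric variable.

```sas
data work.stock;                     /* hypothetical data set name */
   input Item $ InStock;
   /* _N_ counts iterations; _ERROR_ is set to 1 when a data error occurs */
   put _N_= _ERROR_= Item= InStock=;
   datalines;
Mugs 6
Bowls abc
Plates 4
;
run;

/* The listing shows only Item and InStock; _N_ and _ERROR_ are not stored. */
proc print data=work.stock;
run;
```

On the second iteration the log should show _ERROR_=1 and a missing value for InStock, and _ERROR_ should be back to 0 on the third iteration.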

Syntax Checking

During the compilation phase, SAS scans each statement in the DATA step, looking for syntax errors. Common syntax errors include the following:
  • missing or misspelled keywords
  • invalid variable names
  • missing or invalid punctuation
  • invalid options
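One way to see compilation separately from execution is the CANCEL option in the RUN statement, which ends the step without executing it. The sketch below assumes the SASHELP.CLASS sample data set is available and that the step contains no DATALINES statement (CANCEL does not prevent execution in that case); the names work.check and Ratio are made up for the example.

```sas
data work.check;                     /* hypothetical data set name */
   set sashelp.class;                /* sample data set shipped with SAS */
   Ratio = Weight / Height;          /* hypothetical computed variable */
run cancel;                          /* the step is scanned but not executed */
```

If a keyword were misspelled or a semicolon were missing, the error would be expected to appear in the log even though no data is processed.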

Data Set Variables

As the INPUT statement is compiled, a slot is added to the program data vector for each variable in the new data set. Generally, variable attributes such as length and type are determined the first time a variable is encountered.
filename invent 'Z:sasuserinvent.dat';
data work.update;
   infile invent;
   input Item $1-13 IDnum $15-19
         InStock 21-22 BackOrd 24-25;
   Total=instock+backord;
run;
[Figure: Program Data Vector]
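Because attributes are fixed the first time a variable is encountered, a later statement cannot widen a character variable. The two steps below are a small sketch of that rule; the variable name Status and its values are invented for illustration.

```sas
/* First encounter is the assignment: Status becomes character, length 2. */
data _null_;
   Status = 'No';
   Status = 'Maybe';                 /* truncated to the length already set */
   put Status=;                      /* log shows Status=Ma */
run;

/* First encounter is the LENGTH statement: Status has length 5. */
data _null_;
   length Status $ 5;
   Status = 'No';
   Status = 'Maybe';
   put Status=;                      /* log shows Status=Maybe */
run;
```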
Any variables that are created with an assignment statement in the DATA step are also added to the program data vector. For example, the assignment statement below creates the variable Total. As the statement is compiled, the variable is added to the program data vector. The attributes of the variable are determined by the expression in the statement. Because the expression contains an arithmetic operator and produces a numeric value, Total is defined as a numeric variable and is assigned the default length of 8.
filename invent 'Z:sasuserinvent.dat';
data work.update;
   infile invent;
   input Item $1-13 IDnum $15-19
         InStock 21-22 BackOrd 24-25;
   Total=instock+backord;
run;
[Figure: Program Data Vector]
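The attributes that the compiler assigns to a variable created by assignment can be checked with the VTYPE and VLENGTH functions. The step below is a sketch with made-up values; Label and the four reporting variables are hypothetical names added only for the demonstration.

```sas
data _null_;
   InStock = 12;
   BackOrd = 3;
   Total = InStock + BackOrd;        /* arithmetic expression: numeric, length 8 */
   Label = 'low';                    /* character literal: character, length 3 */

   TotalType = vtype(Total);         /* 'N' for numeric */
   TotalLen  = vlength(Total);
   LabelType = vtype(Label);         /* 'C' for character */
   LabelLen  = vlength(Label);

   put TotalType= TotalLen= LabelType= LabelLen=;
   /* log shows TotalType=N TotalLen=8 LabelType=C LabelLen=3 */
run;
```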

Descriptor Portion of the SAS Data Set

The descriptor portion is information that SAS creates and maintains about each SAS data set, including data set attributes and variable attributes. Here are examples:
  • the name of the data set and its member type
  • the date and time that the data set was created
  • the names, data types (character or numeric), and lengths of the variables
The descriptor information also contains information about extended attributes (if defined on a data set). Extended attribute descriptor information includes the name of the attribute, the name of the variable, and the value of the attribute.
Figure 7.3 CONTENTS Procedure Output: Data Set Descriptor Specifics
[Figure: CONTENTS procedure output showing the data set specifics]
At this point, the data set contains the five variables that are defined in the input data set and in the assignment statement. _N_ and _ERROR_ are not written to the data set. There are no observations because the DATA step has not yet executed. During execution, each raw data record is processed and is then written to the data set as an observation.
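The state described here, a descriptor portion with five variables and no observations, can be approximated in a runnable form. This is only an emulation, not how SAS itself pauses between phases: the STOP statement ends execution before any observation is written, while the statements after it are still compiled. The data set name work.empty is hypothetical, and the lengths are taken from the column ranges in the INPUT statement shown earlier.

```sas
data work.empty;                     /* hypothetical data set name */
   length Item $ 13 IDnum $ 5 InStock BackOrd 8;
   stop;                             /* ends execution before any output */
   Total = InStock + BackOrd;        /* still compiled, so Total joins the PDV */
run;

/* Displays the descriptor portion: five variables, zero observations. */
proc contents data=work.empty;
run;
```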
Last updated: January 10, 2018