Compilation Phase

Program Data Vector (PDV)

The PDV is a logical area in memory where SAS builds a data set, one observation at a time. When a program executes, SAS reads data values or creates them by executing SAS language statements. The data values are assigned to the appropriate variables in the PDV. From here, SAS writes the values to a SAS data set as a single observation.
Along with data set variables and computed variables, the PDV contains these automatic variables:
  • the _N_ variable, which counts the number of times the DATA step iterates.
  • the _ERROR_ variable, which signals the occurrence of an error caused by the data during execution. The default value of _ERROR_ is 0, which means that no error has occurred. When one or more errors occur, the value is set to 1.
Note: SAS does not write these variables to the output data set.
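As a minimal sketch of how the automatic variables can be inspected, the step below writes _N_ and _ERROR_ to the log on every iteration. It assumes the cert.invent data set used later in this section is available. DATA _NULL_ is used here only so that no output data set is created.

```sas
/* Sketch: display the automatic variables on each iteration. */
data _null_;             /* _NULL_: execute the step without creating a data set */
  set cert.invent;       /* assumes cert.invent exists, as in the examples below */
  put _N_= _ERROR_=;     /* writes name=value pairs for both variables to the log */
run;
```

If no data errors occur, _ERROR_ should remain 0 on every line of the log, while _N_ increases by 1 with each iteration.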

Syntax Checking

During the compilation phase, SAS scans each statement in the DATA step, looking for syntax errors. Here are examples:
  • missing or misspelled keywords
  • invalid variable names
  • missing or invalid punctuation
  • invalid options
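As an illustration of missing punctuation, the sketch below omits the semicolon at the end of the SET statement. It assumes the cert.invent data set from this section. The exact wording of the message in the log can vary by SAS release, so none is quoted here.

```sas
/* Sketch: a DATA step with a deliberate syntax error. */
data work.update;
  set cert.invent            /* semicolon deliberately omitted here */
  Total=instock+backord;     /* SAS now reads this text as part of the SET statement */
run;
```

Because the error is detected during compilation, SAS is expected to report it in the log and not execute the step.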

Data Set Variables

As the SET statement compiles, a slot is added to the PDV for each variable that the statement reads from the input data set (cert.invent in the program below). Generally, variable attributes such as length and type are determined the first time a variable is encountered.
data work.update;
  set cert.invent;
  Total=instock+backord;
  SalePrice=(CostPerUnit*0.65)+CostPerUnit;
  format CostPerUnit SalePrice dollar6.2;
run;
Figure 7.3 Program Data Vector
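The rule that attributes are set the first time a variable is encountered matters most for character variables. The sketch below is an illustration only: the Status variable, the threshold of 10, and the data set name work.lengthdemo are hypothetical and do not come from this section.

```sas
/* Sketch: the length of Status is fixed by the first assignment the compiler sees. */
data work.lengthdemo;                   /* hypothetical data set name */
  set cert.invent;
  if instock < 10 then Status='Low';    /* first reference: character, length 3 */
  else Status='Sufficient';             /* stored as 'Suf' because the length is already 3 */
run;
```

A LENGTH statement placed before the first assignment is the usual way to avoid this kind of truncation.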
Any variables that are created with an assignment statement in the DATA step are also added to the PDV. For example, the two assignment statements below create two variables, Total and SalePrice. As each statement is compiled, its variable is added to the PDV. The attributes of each variable are determined by the expression in its statement. Because each expression contains arithmetic operators and produces a numeric value, Total and SalePrice are defined as numeric variables and are assigned the default length of 8.
data work.update;
  set cert.invent;
  Total=instock+backord;
  SalePrice=(CostPerUnit*0.65)+CostPerUnit;
  format CostPerUnit SalePrice dollar6.2;
run;
Figure 7.4 Program Data Vector
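One way to see the contents of the PDV for yourself is to add a PUT _ALL_ statement to the program. The sketch below is the same DATA step with one extra line. The IF condition is an addition that limits the log output to the first iteration.

```sas
/* Sketch: dump every variable in the PDV to the log on the first iteration. */
data work.update;
  set cert.invent;
  Total=instock+backord;
  SalePrice=(CostPerUnit*0.65)+CostPerUnit;
  format CostPerUnit SalePrice dollar6.2;
  if _N_=1 then put _all_;   /* lists data set, computed, and automatic variables */
run;
```

The log line should include the variables read from cert.invent, the computed variables Total and SalePrice, and the automatic variables _ERROR_ and _N_.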

Descriptor Portion of the SAS Data Set

The descriptor portion is information that SAS creates and maintains about each SAS data set, including data set attributes and variable attributes. Here are examples:
  • the name of the data set and its member type
  • the date and time that the data set was created
  • the names, data types (character or numeric), and lengths of the variables
The descriptor portion also contains information about extended attributes, if any are defined in the data set. Extended attributes are defined by the user, and each one records the name of the attribute, the name of the variable, and the value of the attribute. You can use the CONTENTS procedure to display descriptor information.
proc contents data=work.update;
run;
Figure 7.5 CONTENTS Procedure Output: Data Set Descriptor Specifics
At this point, the data set contains the six variables that are defined in the input data set and in the assignment statements. _N_ and _ERROR_ are not written to the data set. There are no observations because the DATA step has not yet executed. During execution, each observation that is read from the input data set is processed and is then written to the new data set as an observation.
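The state of the data set at the end of compilation can be approximated with a STOP statement. The sketch below is an illustration rather than part of the original program. The name work.shell is hypothetical. All statements are still compiled, so the descriptor portion is complete, but execution halts before any observation is read or written.

```sas
/* Sketch: build the descriptor portion only, with zero observations. */
data work.shell;             /* hypothetical data set name */
  stop;                      /* at execution, end the step immediately */
  set cert.invent;           /* still compiled: adds the input variables to the PDV */
  Total=instock+backord;
  SalePrice=(CostPerUnit*0.65)+CostPerUnit;
  format CostPerUnit SalePrice dollar6.2;
run;

proc contents data=work.shell;   /* should list the variables and report 0 observations */
run;
```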
Last updated: August 23, 2018