Working with Lookup Values Outside of SAS Data Sets

Overview

Remember that it is not necessary for your lookup table to be a SAS data set. Suppose you want to combine the data from your base table with lookup values that are not stored in a SAS data set. You can use the following techniques to hardcode lookup values into your program:
  • the IF-THEN/ELSE statement
  • SAS arrays
  • user-defined SAS formats
  • hash objects.

The IF-THEN/ELSE Statement

You should be familiar with the syntax and use of the IF-THEN/ELSE statement. Overall, this technique is easy to use and easy to understand. Because of its simplicity and because you can use other DATA step syntax with it, the IF-THEN/ELSE statement can be quite versatile as a technique for performing lookup operations. You can use this technique if your lookup values are not stored in a data set, and you can use it to handle any of the possible relationships between your base table and your lookup table. You can use it to retrieve single or multiple values. For example, you can use DO groups to provide multiple values based on a condition.
Keep in mind that this technique requires maintenance. If you expect your lookup values to change, or you have a large number of lookup values, or if you use the lookup values in multiple programs, the resources required for maintaining the IF-THEN/ELSE statements in your programs might make this technique inappropriate. Also, this technique might result in a prohibitively long program or even in a program that will not execute because it times out.

Example: Using the IF-THEN/ELSE Statement to Combine Data

Suppose you have a data set, Mylib.Employees, that contains information about employees. Mylib.Employees contains a variable named IDnum that records each employee's unique identification number. If you want to combine the data from Mylib.Employees with a list of employees' birthdates that is not stored in a data set, you can use the IF-THEN/ELSE statement to do so.
data mylib.employees_new;
   set mylib.employees;
   if IDnum=1001 then Birthdate='01JAN1963'd;
   else if IDnum=1002 then Birthdate='08AUG1946'd;
   else if IDnum=1003 then Birthdate='23MAR1950'd;
   else if IDnum=1004 then Birthdate='17JUN1973'd;
run;

SAS Arrays

You should be familiar with the syntax and use of the ARRAY statement. With the ARRAY statement, you can either hardcode your lookup values into the program, or you can read them into the array from a data set. Elements of a SAS array are referenced positionally. That is, you use a numeric value as a pointer to the array element, so you must be able to identify elements of the array either by position or according to another numeric value. You can use multiple values or numeric mathematical expressions to determine the array element to be returned.
This technique is capable of returning only a single value from the lookup operation. The dimensions of the array must be supplied at compile time either by hardcoding or through the use of macro variables.

Example: Using the ARRAY Statement to Combine Data

We will consider our example of combining the data from Mylib.Employees with a list of lookup values. Remember that Mylib.Employees contains data about employees, which includes their identification numbers (IDnum) but does not include their birthdates. You can use the ARRAY statement to hardcode the birthdates into a temporary array named Birthdates, and then use the array to combine the birthdates with the data in Mylib.Employees.
In the following DATA step, the values that are specified as subscripts for the array correspond to values of the variable IDnum in the base table, Mylib.Employees. The assignment statement for the new variable Birthdate retrieves a value from the Birthdates array according to the current value of IDnum.
data mylib.employees_new;
   array birthdates{1001:1004} _temporary_ ('01JAN1963'd
         '08AUG1946'd '23MAR1950'd '17JUN1973'd);
   set mylib.employees;
   Birthdate=birthdates{IDnum};
run;

User-Defined SAS Formats

You should be familiar with the syntax and use of the FORMAT procedure with the VALUE statement. Formats can be referenced in FORMAT statements, PUT statements, and PUT functions in assignment WHERE or IF statements.
The FORMAT procedure uses a binary search (a rapid search technique) through the lookup table. Another benefit of using this technique is that maintenance is centralized; if a lookup value changes, you have to change it in only one place (in the format), and every program that uses the format will use the new value.

Example: Using the FORMAT Procedure to Combine Data

Once again, suppose the data set Mylib.Employees contains information about employees according to their employee identification numbers (IDnum), but does not contain employees' birthdates. You can use a format to combine employees' birthdates with the data that is stored in Mylib.Employees.
The following PROC FORMAT step uses a VALUE statement to hardcode the lookup values in the BIRTHDATE format. The DATA step uses the PUT function to associate the lookup values from the format with the values of IDnum. The INPUT function associates the lookup value with the DATE9. informat, and assigns the formatted values to a new variable named Birthdate.
proc format;
   value birthdate 1001 = '01JAN1963'
                   1002 = '08AUG1946'
                   1003 = '23MAR1950'
                   1004 = '17JUN1973';
   run;

   data mylib.employees_new;
   set mylib.employees;
   Birthdate=input(put(IDnum,birthdate.),date9.);
run;
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.117.142.144