Searching for a Specific Numeric Value

Specific values such as 999 or 9999 are sometimes used to denote missing values. For example, numeric values that are left blank in Dbase files are stored as zeros. If 0 is a valid value for some of your variables, this can lead to problems. So, values such as 999 or 9999 are sometimes used instead of zeros or blanks. Program 3-9 searches a SAS data set for all numeric variables set to a specific value and produces a report, which shows the variable name and the observation where the specific value was found.

The “trick” in this program is the relatively unknown routine, VNAME (see online help for SAS Release 6.12 or later for more details). A call to this routine returns the variable name of an array element. The first program (Program 3-9) searches a SAS data set for a specific value. The program is then generalized by making the data set name and the specific value, calling arguments in a macro. Here is the first program.

Program 3-9. Identifying All Numeric Variables Equal to a Fixed Value (Such as 999)
*---------------------------------------------------------------- *
| Program Name: FIND_X.SAS  in C:CLEANING                        |
| Purpose: Identifies any specified value for all numeric vars    |
*-----------------------------------------------------------------*;
***Create test data set;
DATA TEST;
   INPUT X Y A $ X1-X3 Z $;
DATALINES;
1 2 X 3 4 5 Y
2 999 Y 999 1 999 J
999 999 R 999 999 999 X
1 2 3 4 5 6 7
;
***Program to detect the specified values;
DATA _NULL_;
   SET TEST;
   FILE PRINT;
   ARRAY NUMS[*] _NUMERIC_;  1
   LENGTH VARNAME $ 8;
   DO __I = 1 TO DIM(NUMS);  2
      IF NUMS[__I] = 999 THEN DO;
         CALL VNAME(NUMS[__I],VARNAME);  3
         PUT "Value of 999 found for variable " VARNAME
             "in observation " _N_;
      END;
   END;
   DROP __I;
RUN;

Key to this program is the use of _NUMERIC_ in the ARRAY statement . Because this ARRAY statement follows the SET statement, the array NUMS will contain all the numeric variables in the data set TEST. The next step is to examine each of the elements in the NUMS array, determine if a value of 999 is found, and then determine the variable name associated with that array element. The DO loop uses the index variable__I in the hopes that there will not be any variables in the data set to be tested with that name.

Now for the “trick.” As you search for values of 999 for each of the numeric variables, you can use the CALL VNAME routine (see the online help for Release 6.12 or later) to return the variable name that corresponds to the array element. In this program, the variable name is stored in the variable VARNAME (the first argument) by the VNAME routine. All that is left to do is write out the variable names and observation numbers. Next is the macro version of the same program, followed by the output.

Program 3-10. Creating a Macro Version of Program 3-9
*----------------------------------------------------------------*
| Macro Name: FIND_X.SAS  in C:CLEANING                         |
| Purpose: Identifies any specified value for all numeric vars   |
| Calling Arguments: DSN   SAS Data Set Name                     |
|                    NUM   Numeric value to search for           |
| Example:  To find variable values of 999 in data set TEST, use |
|           %FIND_X(TEST,999)                                    |
*----------------------------------------------------------------*;
%MACRO FIND_X(DSN,NUM);
   TITLE "Variables with 999 as Missing Values";
   DATA _NULL_;
      SET &DSN;
      FILE PRINT;
      LENGTH VARNAME $ 8; ***Or LENGTH 32 for V7 and Later;
      ARRAY NUMS[*] _NUMERIC_;
      DO __I = 1 TO DIM(NUMS);
         IF NUMS[__I] = &NUM THEN DO;
            CALL VNAME(NUMS[__I],VARNAME);
            PUT "Value of &NUM found for variable " VARNAME
                "in observation " _N_;
         END;
      END;
      DROP __I;
   RUN;
%MEND FIND_X;

You call this macro by using the following statement:

%FIND_X(TEST,999)

The resulting output is shown next.

Variables with 999 as Missing Values

Value of 999 found for variable Y in observation 2
Value of 999 found for variable X1 in observation 2
Value of 999 found for variable X3 in observation 2
Value of 999 found for variable X in observation 3
Value of 999 found for variable Y in observation 3
Value of 999 found for variable X1 in observation 3
Value of 999 found for variable X2 in observation 3
Value of 999 found for variable X3 in observation 3


If you would prefer just to see a summary of variables that have the value of a number, such as 999, for one or more observations, you can modify Program 3-9 to create a data set, and use PROC FREQ to count the number of times a specified value is detected as shown in Program 3-11.

Program 3-11. Identifying Variables with Specified Numeric Values and Counting the Number of Times the Value Appears
DATA NUM_999;
   SET TEST;
   FILE PRINT;
   ARRAY NUMS[*] _NUMERIC_;
   LENGTH VARNAME $ 8;
   DO __I = 1 TO DIM(NUMS);
      IF NUMS[__I] = 999 THEN DO;
         CALL VNAME(NUMS[__I],VARNAME);
         OUTPUT;
      END;


   END;
   KEEP VARNAME;
RUN;


PROC FREQ DATA=NUM_999;
   TABLES VARNAME / NOCUM NOPERCENT;
RUN;

Each time a numeric variable (in the array NUMS) is equal to a value of 999, a call to VNAME places the variable name into the variable VARNAME. An observation is then written to the data set NUM_999. Because you want to count the number of times the specific numeric value (such as 999) occurred, use PROC FREQ to print out the frequencies.

Running Program 3-11 with the value of 999 on the data set TEST generated the following output.

Variables with 999 as Missing Values

The FREQ Procedure

VARNAME    Frequency
--------------------
X                 1
X1                2
X2                1
X3                2
Y                 2


You can convert this last program into a SAS macro as an exercise.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.139.238.226