Excluding Unmatched Observations

Overview

By default, DATA step match-merging combines all observations in all input data sets.
To exclude unmatched observations from your output data set, use the following in your DATA step:
  • Use the IN= data set option to create and name a variable that indicates whether the data set contributed data to the current observation.
  • Use the subsetting IF statement to check the IN= values and write to the merged data set only matching observations

Identifying Observation in Both Data Sets

To match-merge the data sets Clinic.Demog and Clinic.Visit and select only observations that appear in both data sets, use IN= to create two temporary variables, Indemog and Invisit. The IN= variable is a temporary variable that is available to program statements during the DATA step, but it is not included in the SAS data set that is being created.
Syntax, IN= data set option:
(IN= variable)
  • The IN= option, in parentheses, follows the data set name.
  • variable names the variable to be created.
Within the DATA step, the value of the variable is 1 if the data set contributed data to the current observation. Otherwise, its value is 0.
The DATA step that contains the IN= options appears below. The first IN= creates the temporary variable indemog, which is set to 1 when an observation from Clinic.Demog contributes to the current observation. Otherwise, it is set to 0. Likewise, the value of Invisit depends on whether Clinic.Visit contributes to an observation or not.
data clinic.merged; 
   merge clinic.demog(in=indemog)  
         clinic.visit(in=invisit
                     rename=(date=BirthDate)); 
   by id; 
run;
Tip
To specify multiple data set options for a given data set, enclose the options in a single set of parentheses.

Selecting Matching Observations

To select only observations that appear in both Clinic.Demog and Clinic.Visit, specify a subsetting IF statement in the DATA step.
The subsetting IF statement checks the values of Indemog and Invisit and continues processing only those observations that meet the condition of the expression. The condition is that both Clinic.Demog and Clinic.Visit contribute to the observation. If the condition is met, the new observation is written to Clinic.Merged. Otherwise, the observation is deleted.
data clinic.merged; 
   merge clinic.demog(in=indemog  
                     rename=(date=BirthDate)) 
         clinic.visit(in=invisit 
                      rename=(date=VisitDate));  
   by id; 
   if indemog=1 and invisit=1; 
run; 
proc print data=clinic.merged; 
run;
In previous examples, Clinic.Merged contained 12 observations. In the output below, notice that only 10 observations met the condition in the IF expression.
Figure 13.20 Selecting Matching Observations
Selecting Matching Observations
SAS evaluates the expression within an IF statement to produce a result that is either nonzero, zero, or missing. A nonzero and nonmissing result causes the expression to be true; a zero or missing result causes the expression to be false.
It is possible to specify the subsetting IF statement from the previous example in either of the following ways. The first IF statement checks specifically for a value of 1. The second IF statement checks for a value that is neither missing nor 0 (which for IN= variables is always 1).
if indemog=1 and invisit=1;  

if indemog and invisit;
Last updated: January 10, 2018
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.226.177.86