Match-Merging: Details

Match-Merging Syntax

Match-merging combines observations from two or more data sets into a single observation in a new data set according to the values of a common variable.
When match-merging, use the MERGE statement rather than the SET statement to combine data sets.
Syntax, DATA step for match-merging:
DATA output-SAS-data-set;
MERGE SAS-data-set-1 SAS-data-set-2;
BY <DESCENDING> variable(s);
RUN;
  • output-SAS-data-set names the data set to be created.
  • SAS-data-set-1 and SAS-data-set-2 specify the data sets to be read.
  • variable(s) in the BY statement specifies one or more variables whose values are used to match observations.
  • DESCENDING indicates that the input data sets are sorted in descending order (largest to smallest numerically, or reverse alphabetical for character variables) by the variable that is specified. If you have more that one variable in the BY statement, DESCENDING applies only to the variable that immediately follows it. The default sort order is ASCENDING.
Tip
Each input data set in the MERGE statement must be sorted in order of the values of the BY variable or variables, or it must have an appropriate index. Each BY variable must have the same type in all data sets to be merged.

How Match-Merging Selects Data

During match-merging SAS sequentially checks each observation of each data set to see whether the BY values match and then writes the combined observation to the new data set.
data merged; 
  merge a b; 
  by num; 
run;
Figure 10.6 How Match-Merging Selects Data
How Match-Merging Selects Data
Basic DATA step match-merging produces an output data set that contains values from all observations in all input data sets. You can add statements and options to select only matching observations.
If your input data set does not have any observations for a value of the BY variable, then the observations in the output data set will contain missing values. The missing values are for the variables that are unique to the input data set.
Tip
In match-merging, often one data set contains unique values for the BY variable and other data sets contain multiple values for the BY variable.

Example: Using Match-Merging to Combine Data Sets

The data sets Cert.Demog and Cert.Visit have been sorted as follows:
proc sort data=cert.demog; 
  by id; 
run; 
proc print data=cert.demog; 
run;
Figure 10.7 HTML Output: Sorting Cert.Demog
Example: Sorting Cert.Demog
proc sort data=cert.visit; 
  by id; 
run; 
proc print data=cert.visit; 
run;
Figure 10.8 HTML Output: Sorting Cert.Visit
Example: Sorting Cert.Visit
You can then submit this DATA step to create Work.Merged by merging Cert.Demog and Cert.Visit according to values of the variable ID.
data work.merged; 
  merge cert.demog cert.visit; 
  by id; 
run; 
proc print data=work.merged; 
run;
Note: All observations, including unmatched observations and observations that have missing data, are written to the output data set.
Figure 10.9 HTML Output: Match-Merging Output
All observations including observations with missing data are written to the output data set shown in the output graphic.

Example: Merge in Descending Order

The example above illustrates merging two data sets that are sorted in ascending order of the BY variable ID. To sort the data sets in descending order and then merge them, you can submit the following program.
proc sort data=cert.demog; 
  by descending id; 
run; 
proc sort data=cert.visit; 
  by descending id; 
run; 
data work.merged; 
  merge cert.demog cert.visit; 
  by descending id; 
run; 
proc print data=work.merged; 
run;
Note: Specify the DESCENDING option in the BY statements in both the PROC SORT steps and the DATA step. If you omit the DESCENDING option in the DATA step, you generate error messages about improperly sorted BY variables.
Figure 10.10 HTML Output: Merge in Descending Order
The data sets are merged in descending order of the BY variable ID.
Last updated: August 23, 2018
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
52.14.39.59