• conditionally merging observations
• reading from the same data set twice
One-to-One Merging
Definition
One-to-one merging combines observations from two or more SAS data sets into a
single observation in a new data set. To perform a one-to-one merge, use the MERGE
statement without a BY statement. SAS combines the first observation from all data sets
in the MERGE statement into the first observation in the new data set, the second
observation from all data sets into the second observation in the new data set, and so on.
In a one-to-one merge, the number of observations in the new data set equals the number
of observations in the largest data set that was named in the MERGE statement.
Use the MERGENOBY= SAS system option to control whether SAS issues a message when
MERGE processing occurs without an associated BY statement.
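As a brief illustration, the option can be set in an OPTIONS statement before the DATA step that performs the merge. The sketch below assumes that a warning is wanted. The other values that the option accepts are NOWARN (the default) and ERROR.

```sas
/* Ask SAS to write a warning to the log whenever a MERGE  */
/* statement is processed without an associated BY statement. */
options mergenoby=warn;
```

Setting the option to ERROR instead causes SAS to write an error message to the log for such a merge.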
Syntax
Use this form of the MERGE statement to merge SAS data sets:
MERGE data-set(s);
where
data-set
names at least two existing SAS data sets.
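The following is a minimal sketch of a one-to-one merge. The data set names (CLASS, ROOM, COMBINED) and their variables are hypothetical and serve only to illustrate the form of the statement.

```sas
/* Hypothetical input data set with three observations */
data class;
   input name $;
   datalines;
Ann
Bob
Cara
;
run;

/* Hypothetical input data set with two observations */
data room;
   input room $;
   datalines;
101
102
;
run;

/* One-to-one merge: MERGE without a BY statement pairs   */
/* observations by position (first with first, and so on). */
data combined;
   merge class room;
run;

/* COMBINED has three observations, matching the larger input. */
/* In the third observation, ROOM is missing because the ROOM  */
/* data set has no third observation.                          */
proc print data=combined;
run;
```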
CAUTION:
Avoid using duplicate values or different values of common variables. One-to-one
merging with data sets that contain duplicate or different values of common variables
can produce undesirable results. The variables are combined exactly as they are read
from each data set: if a variable exists in more than one data set, the value from the
last data set that is read is the one that is written to the new data set, even if that
value is missing. Once SAS has processed all observations in a data set, all
subsequent observations in the new data set have missing values for the variables
that are unique to that data set.
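The effect of a common variable with different values can be seen in a small sketch. The data sets FIRST and SECOND and the variables ID, SCORE, and GRADE are hypothetical. ID is the common variable, and its values do not line up by position.

```sas
/* Hypothetical data set: ID values 1 and 2 */
data first;
   input id score;
   datalines;
1 90
2 85
;
run;

/* Hypothetical data set: ID values 1 and 3 */
data second;
   input id grade $;
   datalines;
1 A
3 B
;
run;

/* Without a BY statement, observations are paired by position, */
/* and ID takes its value from SECOND, the last data set read.  */
data both;
   merge first second;
run;

/* Observation 1: id=1 score=90 grade=A                        */
/* Observation 2: id=3 score=85 grade=B                        */
/* The score of 85 belonged to ID 2 in FIRST but now appears   */
/* with ID 3.                                                  */
proc print data=both;
run;
```

Because the score for ID 2 ends up attached to ID 3, a one-to-one merge is safe only when the observations in each data set are known to correspond by position.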
For a complete description of the MERGE statement, see the MERGE statement in SAS
Statements: Reference.
DATA Step Processing during One-to-One Merging
Compilation phase
SAS reads the descriptor information of each data set that is named in the MERGE
statement. Then, SAS creates a program data vector that contains all the variables
from all data sets as well as variables created by the DATA step.
Execution — Step 1
SAS reads the first observation from each data set into the program data vector,
reading the data sets in the order in which they appear in the MERGE statement. If
two data sets contain the same variables, the values from the second data set replace
the values from the first data set. After reading the first observation from the last data
set and executing any other statements in the DATA step, SAS writes the contents of