Combining SAS Data Sets: Methods (3/6)


OBS  Common  Animal      OBS  Common  Plant
1    a       Ant         1    a       Apple
2    b       Bird        2    b       Banana
3    c       Cat         3    c       Coconut
4    d       Dog         4    d       Dewberry
5    e       Eagle       5    e       Eggplant
6    f       Frog        6    g       Fig

The following program uses two SET statements to combine observations from Animal and Plant, and prints the results:

data twosets;
   set animal;
   set plant;
run;

proc print data=twosets;
   title 'Data Set TWOSETS - Equal Number of Observations';
run;

Output 21.6 Data Set Created from Two Data Sets That Have Equal Observations

Each observation in the new data set contains all the variables from all the data sets. Note, however, that the value of Common in observation 6 is "g". The value of Common from observation 6 of the Animal data set was overwritten by the value from Plant, which was the data set that SAS read last.

Comments and Comparisons

• The results that are obtained by reading observations using two or more SET statements are similar to those that are obtained by using the MERGE statement with no BY statement. However, with one-to-one reading, SAS stops processing before all observations are read from all data sets if the number of observations in the data sets is not equal.

• Using multiple SET statements with other DATA step statements makes the following applications possible:

   • merging one observation with many
   • conditionally merging observations
   • reading from the same data set twice

488 Chapter 21 • Reading, Combining, and Modifying SAS Data Sets
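The first point in the list above can be checked with a small experiment. The following is a minimal sketch, not part of the original text: the input data sets reuse the names and values of Animal1 and Plant1 from Example 2 later in this section, and the output names TWOSETS1 and MERGED1 are made up for illustration.

```sas
/* Illustrative input data: six animals, three plants. */
data animal1;
   input Common $ Animal $;
   datalines;
a Ant
b Bird
c Cat
d Dog
e Eagle
f Frog
;

data plant1;
   input Common $ Plant $;
   datalines;
a Apple
b Banana
c Coconut
;

/* One-to-one reading: the step ends as soon as either SET */
/* statement reaches the end of its data set.              */
data twosets1;
   set animal1;
   set plant1;
run;

/* One-to-one merging: the step continues until every data */
/* set named in MERGE has been read completely.            */
data merged1;
   merge animal1 plant1;
run;

proc print data=twosets1;
   title 'Two SET Statements - Unequal Number of Observations';
run;

proc print data=merged1;
   title 'MERGE without BY - Unequal Number of Observations';
run;
```

Under these assumptions, TWOSETS1 should contain three observations and MERGED1 six, with Plant missing in observations 4 through 6 of MERGED1.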

One-to-One Merging

Definition

One-to-one merging combines observations from two or more SAS data sets into a single observation in a new data set. To perform a one-to-one merge, use the MERGE statement without a BY statement. SAS combines the first observation from all data sets in the MERGE statement into the first observation in the new data set, the second observation from all data sets into the second observation in the new data set, and so on. In a one-to-one merge, the number of observations in the new data set equals the number of observations in the largest data set that was named in the MERGE statement.

If you use the MERGENOBY= SAS system option, you can control whether SAS issues a message when MERGE processing occurs without an associated BY statement.
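As a brief sketch of how the option might be set (the choice of WARN here is only an illustration; the option also accepts NOWARN, which is the default, and ERROR):

```sas
/* Ask SAS to write a warning to the log whenever a MERGE */
/* statement is processed without a BY statement.         */
options mergenoby=warn;

/* Display the current setting in the log. */
proc options option=mergenoby;
run;

/* Restore the default, which issues no message. */
options mergenoby=nowarn;
```

Because every one-to-one merge in this section deliberately omits the BY statement, leaving the option at WARN or ERROR while running these examples would produce a message for each of them.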

Syntax

Use this form of the MERGE statement to merge SAS data sets:

MERGE data-set(s);

where

data-set
   names at least two existing SAS data sets.

CAUTION:

Avoid using duplicate values or different values of common variables. One-to-one merging with data sets that contain duplicate values of common variables can produce undesirable results. If a variable exists in more than one data set, the value from the last data set that is read is the one that is written to the new data set. The variables are combined exactly as they are read from each data set. Using a one-to-one merge to combine data sets with different values of common variables can also produce undesirable results, because the value from the last data set read is written to the new data set even if that value is missing. Once SAS has processed all observations in a data set, all subsequent observations in the new data set have missing values for the variables that are unique to that data set.

For a complete description of the MERGE statement, see the MERGE statement in SAS Statements: Reference.

DATA Step Processing during One-to-One Merging

Compilation phase

SAS reads the descriptor information of each data set that is named in the MERGE statement. Then, SAS creates a program data vector that contains all the variables from all data sets as well as variables created by the DATA step.

Execution — Step 1

SAS reads the first observation from each data set into the program data vector, reading the data sets in the order in which they appear in the MERGE statement. If two data sets contain the same variables, the values from the second data set replace the values from the first data set. After reading the first observation from the last data set and executing any other statements in the DATA step, SAS writes the contents of the program data vector to the new data set. Before the next iteration, only those variables that are created or assigned values during the DATA step are set to missing.

Execution — Step 2

SAS continues until it has read all observations from all data sets.
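One way to watch these steps is to write the program data vector to the log on every iteration. The sketch below is an illustration under stated assumptions: the data sets ONE and TWO, their variables, and the variable flag are all hypothetical and do not appear in the original examples.

```sas
/* Hypothetical input data sets with one shared variable (id). */
data one;
   input id x;
   datalines;
1 10
2 20
3 30
;

data two;
   input id y;
   datalines;
1 100
2 200
;

data both;
   merge one two;
   /* flag is assigned by the DATA step, so it is set to    */
   /* missing again before each new iteration.              */
   if _n_ = 1 then flag = 'first';
   /* Write the program data vector to the log just before  */
   /* the implicit output at the end of the iteration.      */
   put _all_;
run;
```

With this input, flag should be "first" only in observation 1, and y should be missing in observation 3 because TWO has no third observation.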

Example 1: One-to-One Merging with an Equal Number of Observations

The SAS data sets Animal and Plant both contain the variable Common, and the observations are arranged by the values of Common. The following shows the Animal and the Plant input data sets:

Animal                   Plant
OBS  Common  Animal      OBS  Common  Plant
1    a       Ant         1    a       Apple
2    b       Bird        2    b       Banana
3    c       Cat         3    c       Coconut
4    d       Dog         4    d       Dewberry
5    e       Eagle       5    e       Eggplant
6    f       Frog        6    g       Fig

The following program merges these data sets and prints the results:

data combined;
   merge animal plant;
run;

proc print data=combined;
   title 'Data Set Combined';
run;

Output 21.7 Merged Data Sets That Have an Equal Number of Observations

Each observation in the new data set contains all variables from all data sets. If two data sets contain the same variables, the values from the second data set replace the values from the first data set, as shown in observation 6.
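The example assumes that Animal and Plant already exist. The following is one possible way to create them from the values in the table above; the use of list input with DATALINES is an assumption made here for illustration, not something the original text prescribes.

```sas
/* Create the Animal input data set from the table values. */
data animal;
   input Common $ Animal $;
   datalines;
a Ant
b Bird
c Cat
d Dog
e Eagle
f Frog
;

/* Create the Plant input data set. Note that the last     */
/* value of Common is g, not f.                            */
data plant;
   input Common $ Plant $;
   datalines;
a Apple
b Banana
c Coconut
d Dewberry
e Eggplant
g Fig
;
```

After these steps, running the merge program from this example should give an observation 6 in which Common is g, Animal is Frog, and Plant is Fig.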


Example 2: One-to-One Merging with an Unequal Number of Observations

The SAS data sets Animal1 and Plant1 both contain the variable Common, and the observations are arranged by the values of Common. The Plant1 data set has fewer observations than the Animal1 data set. The following shows the Animal1 and the Plant1 input data sets:

Animal1                  Plant1
OBS  Common  Animal      OBS  Common  Plant
1    a       Ant         1    a       Apple
2    b       Bird        2    b       Banana
3    c       Cat         3    c       Coconut
4    d       Dog
5    e       Eagle
6    f       Frog

The following program merges these unequal data sets and prints the results:

data combined1;
   merge animal1 plant1;
run;

proc print data=combined1;
   title 'Data Set Combined1';
run;

Output 21.8 Merged Data Sets That Have an Unequal Number of Observations

Note that observations 4 through 6 contain missing values for the variable Plant.

Example 3: One-to-One Merging with Duplicate Values of Common Variables

The following example shows the undesirable results that you can obtain by using one-to-one merging with data sets that contain duplicate values of common variables. The value from the last data set that is read is the one that is written to the new data set. The variables are combined exactly as they are read from each data set. In the following example, the data sets Animal1 and Plant1 contain the variable Common, and each data set contains observations with duplicate values of Common. The following shows the Animal1 and the Plant1 input data sets:

Animal1                  Plant1
OBS  Common  Animal      OBS  Common  Plant
1    a       Ant         1    a       Apple
2    a       Ape         2    b       Banana
3    b       Bird        3    c       Coconut
4    c       Cat         4    c       Celery
5    d       Dog         5    d       Dewberry
6    e       Eagle       6    e       Eggplant

The following program produces the MERGE1 data set and prints the results:

/* This program illustrates undesirable results. */
data merge1;
   merge animal1 plant1;
run;

proc print data=merge1;
   title 'Data Set MERGE1';
run;

Output 21.9 Undesirable Results with Duplicate Values of Common Variables

The number of observations in the new data set is six. Note that observations 2 and 3 contain undesirable values. SAS reads the second observation from data set Animal1. It then reads the second observation from data set Plant1, which replaces the value of Common and supplies the value of Plant. The third observation is created in the same way.
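To make the description of observations 2 and 3 concrete, the following sketch rebuilds the two input data sets from the table above and repeats the merge. The DATALINES input steps are an assumption added here for illustration; only the merge and print steps come from the original example.

```sas
/* Input data with duplicate values of Common (a in Animal1, */
/* c in Plant1), taken from the table in this example.       */
data animal1;
   input Common $ Animal $;
   datalines;
a Ant
a Ape
b Bird
c Cat
d Dog
e Eagle
;

data plant1;
   input Common $ Plant $;
   datalines;
a Apple
b Banana
c Coconut
c Celery
d Dewberry
e Eggplant
;

/* Observations are paired by position, not by the value of  */
/* Common, so Common always ends up with the Plant1 value.   */
data merge1;
   merge animal1 plant1;
run;

proc print data=merge1;
   title 'Data Set MERGE1';
run;
```

With this input, observation 2 should pair Ape with Banana under Common=b, and observation 3 should pair Bird with Coconut under Common=c, even though the Animal1 values of Common for those observations were a and b.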

Example 4: One-to-One Merging with Different Values of Common Variables

The following example shows the undesirable results obtained from using the one-to-one merge to combine data sets with different values of common variables. If a variable exists in more than one data set, the value from the last data set that is read is the one that is written to the new data set even if the value is missing. Once SAS processes all observations in a data set, all subsequent observations in the new data set have missing values for the variables that are unique to that data set.
