Combining Selected Observations from Multiple Data Sets

To create a data set that contains only the observations that are selected according to a particular criterion, you can use the subsetting IF statement and a SET statement that specifies multiple data sets. The following DATA step reads two input data sets to create a combined data set that lists only the winning teams:
data champions(drop=result);             /* 1 */
   set southamerican(in=S) european;     /* 2 */
   by Year;
   if result='won';                      /* 3 */
   if S then Continent='South America';  /* 4 */
   else Continent='Europe';
run;

proc print data=champions;
   title 'World Cup Champions from 1954 to 1998';
   title2 'including Countries'' Continent';
run;
The following list corresponds to the numbered items in the preceding program:
1 The DROP= data set option drops the variable Result from the new data set CHAMPIONS because, after subsetting, every remaining observation has the same value (won) for this variable.
2 The SET statement reads observations from two data sets: SOUTHAMERICAN and EUROPEAN. The IN= data set option creates the temporary variable S, which is set to 1 each time an observation is contributed by the SOUTHAMERICAN data set.
3 A subsetting IF statement writes the observation to the output data set CHAMPIONS only if the value of the Result variable is won.
4 When the current observation comes from the data set SOUTHAMERICAN, the value of S is 1. Otherwise, the value is 0. The IF-THEN/ELSE statements execute one of two assignment statements, depending on the value of S. If the observation comes from the data set SOUTHAMERICAN, then the value assigned to Continent is South America. If the observation comes from the data set EUROPEAN, then the value assigned to Continent is Europe.
The following output displays the CHAMPIONS data set:
Display 22.5 Combining Selected Observations
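To try the technique without the original input data, the following minimal sketch builds two small stand-in data sets and then runs the same DATA step. The variable names Year and Country, the informat, and the handful of rows are assumptions made for illustration only; they are not the SOUTHAMERICAN and EUROPEAN data sets used in the example above. Both inputs are entered in Year order because the BY statement requires sorted input.

```sas
/* Hypothetical stand-in inputs: layout and rows are illustrative only */
data southamerican;
   input Year Country :$12. Result $;
   datalines;
1958 Brazil won
1970 Brazil won
;
run;

data european;
   input Year Country :$12. Result $;
   datalines;
1958 Sweden lost
1966 England won
1970 Italy lost
;
run;

/* Same logic as the program above, applied to the stand-in inputs */
data champions(drop=result);
   set southamerican(in=S) european;   /* S=1 when the row comes from SOUTHAMERICAN */
   by Year;                            /* interleave the two inputs by Year */
   if result='won';                    /* subsetting IF: keep only winners */
   if S then Continent='South America';
   else Continent='Europe';
run;

proc print data=champions;
run;
```

With these stand-in rows, the subsetting IF statement discards the two observations whose Result is lost, so CHAMPIONS should contain three observations (1958 Brazil, 1966 England, 1970 Brazil), each labeled with the continent of the data set that contributed it.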