Using the UNION Set Operator

Overview

The set operator UNION does both of the following:
  • selects unique rows from both tables
  • overlays columns.
overlays columns
The following example demonstrates how UNION works when used alone and with the keywords ALL and CORR.

Using the UNION Operator Alone

To display all rows from the tables One and Two that are unique in the combined set of rows from both tables, use a PROC SQL set operation that includes the UNION operator:
proc sql;
   select *
      from one
   union
   select *
      from two;
Tables One, Two, and Output
With the UNION operator, PROC SQL first concatenates and sorts the rows from the two tables, and eliminates any duplicate rows. In this example, two rows are eliminated: the second row in table One is a duplicate of the first row, and the fourth row in table Two matches the fifth row in table One. All remaining rows, the unique rows, are included in the output. The columns are overlaid by position.

Using the Keyword ALL with the UNION Operator

When the keyword ALL is added to the UNION operator, the output displays all rows from both tables, both unique and duplicate. The modified PROC SQL set operation, the tables One and Two, and the new output are shown below:
proc sql;
   select *
      from one
   union all
   select *
      from two;
Tables One, Two, and Output
When the ALL keyword is used, PROC SQL does not remove duplicates or sort the rows. The output now includes the two duplicate rows that were eliminated in the previous example: the second row in table One and the fourth row in table Two. Note that the rows are in a different order in this output than they were in the output from the previous set operation.

Using the Keyword CORR with the UNION Operator

To display all rows from the tables One and Two that are unique in the combined set of rows from both tables, based on columns that have the same name rather than the same position, add the keyword CORR after the set operator. The modified query, the tables One and Two, and the output are shown below:
proc sql;
   select *
      from one
   union corr
   select *
      from two;
Tables One, Two, and Output
X is the only column name that is common to both tables, so X is the only column that PROC SQL examines and displays in the output. In the combined set of rows from the two tables, there are duplicates of the values 1, 2, and 3, and these duplicate rows are eliminated from the output. The output displays the six unique values of X.

Using the Keywords ALL and CORR with the UNION Operator

If the keywords ALL and CORR are used together, the UNION operator displays all rows in the two tables both unique and duplicate, based on the columns that have the same name. In this example, the output displays all 12 values for X, the one column that has the same name in both tables.
proc sql;
   select *
      from one
   union all corr
   select *
      from two;
Tables One, Two, and Output

Example: UNION Operator

The UNION operator can be used to solve a realistic business problem. Suppose you are generating a report based on data from a health clinic. You want to display the results of individual patient stress tests taken in 1998, followed by the results from stress tests taken in 1999. To do this, you use the UNION operator to combine the tables Sasuser.Stress98 and Sasuser.Stress99. These two tables are similar in structure:
  • both tables contain nine columns that have the same names
  • each row contains data for an individual patient.
You are not sure whether the tables contain duplicate records, but you do not want duplicates in your output. Because the tables have the same column structure, you can overlay the columns by position, and the CORR keyword is not necessary. The PROC SQL set operation and output are shown below (the rows are ordered by IDs.):
proc sql;
  select *
    from sasuser.stress98
  union 
  select *
    from sasuser.stress99;
Output from PROC SQL with UNION Operator
Tip
If you can determine that these tables have no duplicate records, you could add the keyword ALL to speed up processing by avoiding an extra pass through the data.

Example: UNION Operator and Summary Functions

We can demonstrate another realistic business problem, to see how summary functions can be used with a set operator (in this case, UNION). Suppose you want to display the following summarized data for members of a frequent-flyer program: total points earned, total points used, and total miles traveled. All three values can be calculated from columns in the table Sasuser.Frequentflyers by using summary functions.
You might wonder why set operations are needed when only one table is involved. If you wanted to display the three summarized values horizontally, in three separate columns, you could solve the problem without a set operation, using the following simple SELECT statement:
proc sql;
   select sum(pointsearned) format=comma12.
          label='Total Points Earned',
          sum(pointsused) format=comma12.
          label='Total Points Used',
          sum(milestraveled) format=comma12.
          label='Total Miles Traveled'
      from sasuser.frequentflyers;
UNION Operator and Summary Functions
Assume, however, that you want the three values to be displayed vertically in a single column. To generate this output, you create three different queries on the same table, and then use two UNION set operators to combine the three query results:
proc sql;
title 'Points and Miles Traveled';
title2 'by Frequent Flyers'; 
   select 'Total Points Traveled:',
          sum(MilesTraveled) format=comma12.
      from sasuser.frequentflyers
   union
   select 'Total Points Earned:',
          sum(PointsEarned) format=comma12.
      from sasuser.frequentflyers
   union
   select 'Total Points Used:',
          sum(PointsUsed) format=comma12.
      from sasuser.frequentflyers;
Each SELECT clause defines two columns: a character constant as a label and the summarized value. The output is shown below.
UNION Operator and Summary Functions
Note: The preceding program reads the same table three times, so it is not the most efficient way to solve this problem.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.147.60.63