Example 2.3 Appending One Data Set to the End of Another Data Set

Goal

Add the observations from one data set to the end of another data set.

Example Features

Featured Step: PROC APPEND
Featured Step Options and Statements: FORCE option
Related Technique 1: PROC DATASETS, APPEND statement with FORCE option
A Closer Look: Determining When to Use PROC APPEND; Using PROC APPEND When a Variable Exists in Only One of the Data Sets

Input Data Sets

Data set YTDPRICES contains 103 observations that record daily closing mutual fund prices. The first ten and the last five observations in YTDPRICES are shown.

   YTDPRICES (first 10 observations)

Obs     tradedate      price    boardmtg
  1    01/04/2010     $80.29
  2    01/05/2010     $85.36
  3    01/06/2010     $81.43
  4    01/07/2010     $79.51
  5    01/08/2010     $83.58
  6    01/11/2010     $79.80
  7    01/12/2010     $80.87       X
  8    01/13/2010     $78.95
  9    01/14/2010     $84.02
 10    01/15/2010     $85.09
         . . .

 99    05/24/2010     $90.84
100    05/25/2010     $90.93
101    05/26/2010     $94.02
102    05/27/2010     $94.12
103    05/28/2010     $93.21

Data set WEEK22 contains four observations that record daily closing mutual fund prices for the twenty-second week of 2010, June 1 through June 4.

                 WEEK22

Obs       tradedate      price    week
 1       06/01/2010     $95.59     22
 2       06/02/2010     $89.68     22
 3       06/03/2010     $93.78     22
 4       06/04/2010     $91.87     22

Resulting Data Set

Output 2.3 YTDPRICES Data Set (last nine observations)

                Example 2.3 YTDPRICES Data Set (last 9 observations) Created with PROC APPEND

                                      Obs     tradedate      price    boardmtg

                                      99     05/24/2010     $90.84
                                     100     05/25/2010     $90.93
                                     101     05/26/2010     $94.02
                                     102     05/27/2010     $94.12
                                     103     05/28/2010     $93.21
                                     104     06/01/2010     $95.59
                                     105     06/02/2010     $89.68
                                     106     06/03/2010     $93.78
                                     107     06/04/2010     $91.87


Example Overview

This program uses PROC APPEND to append the observations from a smaller data set, WEEK22, to the end of a larger data set, YTDPRICES. Base data set YTDPRICES contains the closing prices for a mutual fund for each trading day from January 4, 2010, through May 28, 2010. Data set WEEK22 contains the closing prices for the twenty-second week of 2010, which are the trading days June 1, 2010, through June 4, 2010. May 31, 2010, is Memorial Day and not a trading day.

Variable BOARDMTG is present only in base data set YTDPRICES, and variable WEEK is present only in data set WEEK22. A value of "X" for BOARDMTG indicates that a board of directors meeting was held that day. In this example, board meetings were held the second Tuesday of each month.

At the conclusion of the program, YTDPRICES contains the same three variables that it had at the beginning of the program: TRADEDATE, PRICE, and BOARDMTG. PROC APPEND did not copy variable WEEK, which was found only in data set WEEK22, to data set YTDPRICES.

The FORCE option in the PROC APPEND statement does not add WEEK, but it does allow the step to complete. For more information, see the "A Closer Look" section "Using PROC APPEND When a Variable Exists in Only One of the Data Sets."

Program

Specify the data set to which you want to append observations. Specify the data set whose observations you want to append to the data set named by the BASE= option. Force the procedure to concatenate the DATA= data set to the BASE= data set even though the DATA= data set might contain a variable not found in the BASE= data set.

proc append base=ytdprices
            data=week22
            force;
run;
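
To check the result of the append, you can list the end of the base data set. The following step is a minimal sketch, not part of the original program. It assumes that YTDPRICES now holds 107 observations, as shown in Output 2.3, and uses the FIRSTOBS= data set option to start the listing at observation 99. The title text is taken from Output 2.3.

```sas
/* Sketch: list the last nine observations of YTDPRICES after the append. */
/* Assumes YTDPRICES has 107 observations (103 original + 4 from WEEK22). */
title "Example 2.3 YTDPRICES Data Set (last 9 observations) Created with PROC APPEND";
proc print data=ytdprices(firstobs=99);
run;
title;  /* clear the title so it does not carry over to later steps */
```

If the step is run before the append, the listing ends at observation 103, so it can also serve as a quick before-and-after comparison.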

Related Technique

The APPEND statement in the DATASETS procedure, used with the FORCE option, creates the same data set as the PROC APPEND step in the main example.

Invoke PROC DATASETS and process data sets in the WORK library. Do not list the names of the members of the WORK library in the output window. Specify the data set in the WORK library to which you want to append observations. Specify the data set whose observations you want to append to the data set named by the BASE= option. Force the procedure to concatenate the DATA= data set to the BASE= data set even though the DATA= data set contains a variable not found in the BASE= data set.

proc datasets library=work nolist;
  append base=ytdprices
         data=week22
         force;
run;
quit;

A Closer Look

Determining When to Use PROC APPEND

PROC APPEND is most advantageous when you want to add observations to the end of another data set and you do not need to add or delete any variables in the process. Either the DATA step or PROC SQL can append data sets, as demonstrated in Example 2.1. However, for efficiency reasons, you might want to choose PROC APPEND when the data set to which you want to append observations is much larger than the data set that contains the observations to be appended. PROC APPEND does not recopy the base data set, as the DATA step and PROC SQL do.

The following DATA step produces the YTDPRICES data set equivalent to that created by PROC APPEND in the main example. Note the DROP=WEEK data set option that is applied to data set WEEK22. This ensures that the output data set YTDPRICES is the same as the data set that is produced by PROC APPEND.

data ytdprices;
  set ytdprices week22(drop=week);
run;

The following PROC SQL step produces a table YTDPRICES equivalent to the data set that was created by PROC APPEND in the main example.

proc sql;
  create table ytdprices as
    select * from ytdprices
      outer union corr
    select tradedate, price from week22;
quit;

The CREATE TABLE statement generates a WARNING because the name of the table that is being created is the same as one of the table names in the SELECT statement. However, in this example it does execute correctly.

 WARNING: This CREATE TABLE statement recursively references the
          target table. A consequence of this is a possible data
          integrity problem.
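
If you prefer to stay in PROC SQL but want to avoid re-creating the table, one possible alternative is the INSERT INTO statement with a query. This is a sketch under the assumption that YTDPRICES already exists with the variables TRADEDATE, PRICE, and BOARDMTG. It is not the technique shown in the main example. Because the query reads only WEEK22, the statement does not reference the target table recursively, and BOARDMTG should be set to missing for the inserted rows.

```sas
/* Sketch: add the WEEK22 rows to the existing table YTDPRICES in place. */
/* Only TRADEDATE and PRICE are supplied; BOARDMTG is left missing.      */
proc sql;
  insert into ytdprices (tradedate, price)
    select tradedate, price
      from week22;
quit;
```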

Using PROC APPEND When a Variable Exists in Only One of the Data Sets

By default, PROC APPEND does not execute when a variable found in the data set that is being appended does not exist in the base data set to which it is being appended. The PROC APPEND statement option FORCE overrides this default and allows such a data set to be appended to the base data set. However, the variable unique to the data set that is being appended is not copied to the resulting data set.

To execute, this example requires the FORCE option because variable WEEK is present in WEEK22 and it is not present in YTDPRICES. The program generates the following warning about this condition when option FORCE is in effect. The warning indicates that PROC APPEND did not add variable WEEK found in data set WEEK22 to base data set YTDPRICES.

WARNING: Variable week was not found on BASE file. The variable
         will not be added to the BASE file.
NOTE: FORCE is specified, so dropping/truncating will occur.
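
As an alternative to FORCE, the extra variable can be excluded before the append. The following is a minimal sketch, assuming that the DROP= data set option is applied to the DATA= data set. Because WEEK is dropped as WEEK22 is read, the DATA= data set no longer contains a variable that is missing from the base data set, so the step should run without FORCE and without the warning about WEEK.

```sas
/* Sketch: drop WEEK as WEEK22 is read, so FORCE is not needed. */
proc append base=ytdprices
            data=week22(drop=week);
run;
```

This form makes the decision to discard WEEK explicit in the program instead of leaving it to the FORCE option.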

Conversely, if a variable exists in the base data set, but not in the data set that is being appended, you do not need to add the FORCE option to the PROC APPEND statement for it to execute. In this situation, PROC APPEND assigns missing values to the variable for the observations added from the data set that is being appended. The procedure generates a warning when this occurs. In this example, missing values are assigned to BOARDMTG for the four observations that were contributed from WEEK22 because BOARDMTG is not in WEEK22.

WARNING: Variable boardmtg was not found on DATA file.
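
The behavior described in this section can be tried on a small scale. The following sketch builds miniature stand-ins for the two data sets. The names YTD_SMALL and WEEK_SMALL and the choice of rows are illustrative assumptions, with values taken from the listings above. The first PROC APPEND step omits FORCE and is expected to stop with an error without appending anything. The second step adds FORCE and should append the two observations with BOARDMTG missing and WEEK omitted.

```sas
/* Hypothetical miniature base data set with TRADEDATE, PRICE, BOARDMTG. */
data ytd_small;
  input tradedate :mmddyy10. price boardmtg $;
  format tradedate mmddyy10. price dollar8.2;
  datalines;
01/12/2010 80.87 X
05/27/2010 94.12 .
05/28/2010 93.21 .
;
run;

/* Hypothetical miniature data set to append, with WEEK instead of BOARDMTG. */
data week_small;
  input tradedate :mmddyy10. price week;
  format tradedate mmddyy10. price dollar8.2;
  datalines;
06/01/2010 95.59 22
06/02/2010 89.68 22
;
run;

/* Without FORCE: WEEK exists only in DATA=, so no observations are appended. */
proc append base=ytd_small data=week_small;
run;

/* With FORCE: observations are appended, WEEK is not added to the base. */
proc append base=ytd_small data=week_small force;
run;

/* Inspect the result: five observations, three variables. */
proc print data=ytd_small;
run;
```

The error from the first step might affect how later steps run in a batch session, so it is probably easiest to submit these steps one at a time in an interactive session and read the log after each one.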
