Appending SAS Data Sets

Overview

Now that you have seen several methods for concatenating raw data files, use the APPEND procedure to concatenate two SAS data sets.
General form, PROC APPEND:
PROC APPEND BASE=SAS-data-set DATA=SAS-data-set;
RUN;
Here is an explanation of the syntax:
BASE=SAS-data-set
names the data set to which you want to add observations.
DATA=SAS-data-set
names the SAS data set containing observations that you want to append to the end of the BASE= data set.
PROC APPEND reads only the data in the DATA= SAS data set, not the BASE= SAS data set. Therefore, this action is very efficient. PROC APPEND concatenates data sets even though there might be variables in the BASE= data set that do not exist in the DATA= data set.
When the BASE= data set contains more variables than the DATA= data set, missing values for the additional variables are assigned to the observations that are read in from the DATA= data set, and a warning message is written to the SAS log.

Example

The SAS data sets Work.Cap2001 and Work.Capacity both contain the following variables: Cap1st, CapBusiness, CapEcon, Dest, FlightID, Origin, and RouteID. However, the BASE= data set (Work.Cap2001) contains an additional variable, Date, that is not included in the DATA= data set (Work.Capacity).
When the following program is submitted, the SAS log indicates that the variable Date was not found in the DATA= file.
proc append base=work.cap2001 
            data=work.capacity;
run;
Table 14.8 SAS Log
NOTE: Appending WORK.CAPACITY to WORK.CAP2001.
WARNING: Variable Date was not found on DATA file.
NOTE: There were 50 observations read from the data set WORK.CAPACITY.
NOTE: 50 observations added.
NOTE: The data set WORK.CAP2001 has 100 observations and 8 variables.
NOTE: PROCEDURE APPEND used (Total process time):
The PROC PRINT output of the appended version of Work.Cap2001 shows that missing values have been assigned to Date in the observations that were read in from the DATA= data set.
proc print data=work.cap2001 
           (firstobs=45 obs=55);
run;
Work.Cap2001 Data Set
Note: You can also use the DATA step SET statement to combine SAS data vertically. If multiple data set names appear in the SET statement, the resulting output data set is a concatenation of all the data sets listed. Unlike the APPEND procedure, the SET statement in the DATA step reads all observations in all input data sets in order to concatenate them. Therefore, the APPEND procedure is more efficient than the SET statement in the DATA step for concatenating data sets because it reads only the data in the DATA= data set.
In the following program, SAS reads all of the observations from Work.Cap2001, and it then reads all of the observations from Work.Capacity.
data work.new;
   set work.cap2001 work.capacity;
run;
Note: You can also use the SQL procedure to combine SAS data vertically. For information about using the SQL procedure to combine data vertically, see Combining Tables Vertically Using PROC SQL.

Using the FORCE Option

In the previous example, the DATA= data set (Work.Capacity) contained fewer variables than the BASE= data set (Work.Cap2001). However, you might need to append data sets when the DATA= data set contains more variables than the BASE= data set.
You must use the FORCE option with the APPEND procedure to concatenate data sets when the DATA= data set contains variables that are not in the BASE= data set.
General form, PROC APPEND with the FORCE option:
PROC APPEND BASE=SAS-data-set DATA=SAS-data-set <FORCE>;
CAUTION:
The FORCE option can cause loss of data due to truncation or dropping of variables.
The structure of the BASE= data set is used for the appended data set.

Example

Remember that the SAS data sets Work.Cap2001 and Work.Capacity both contain the following variables: Cap1st, CapBusiness, CapEcon, Dest, FlightID, Origin, and RouteID. In this case, the DATA= data set (Work.Cap2001) contains an additional variable, Date, that is not included in the BASE= data set (Work.Capacity).
When the following program is submitted, the SAS log indicates that the data sets were not appended because the variable Date was not found in the BASE= file.
proc append base=work.capacity 
            data=work.cap2001;
run;
Table 14.9 SAS Log
NOTE: Appending WORK.CAP2001 to WORK.CAPACITY.
WARNING: Variable Date was not found on BASE file.
ERROR: No appending done because of anomalies listed above.
        Use FORCE option to append these files.
NOTE: 0 observations added.
NOTE: The data set WORK.CAPACITY has 50 observations and 7 variables.
NOTE: Statements not processed because of errors noted above.
NOTE: PROCEDURE APPEND used (Total process time): 
      real time           0.02 seconds 
      cpu time            0.03 seconds
NOTE: The SAS System stopped processing this step because of errors.
When the FORCE option is used with PROC APPEND, the SAS log indicates that observations have been read from the DATA= data set, but that dropping or truncating occurs.
proc append base=work.capacity
            data=work.cap2001 force;
run;
Table 14.10 SAS Log
NOTE: Appending WORK.CAP2001 to WORK.CAPACITY.
WARNING: Variable Date was not found on BASE file.
NOTE: FORCE is specified, so dropping/truncating will occur.
NOTE: There were 50 observations read from the data set WORK.CAP2001.
NOTE: 50 observations added.
NOTE: The data set WORK.CAPACITY has 100 observations and 7 variables.
NOTE: PROCEDURE APPEND used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds
PROC PRINT output shows that the variable Date has been dropped from the appended version of Work.Capacity.
proc print 
     data=work.capacity 
         (firstobs=45 obs=55);
run;
Work.Capacity Data Set

Appending Variables with Different Lengths

If the DATA= data set contains variables with greater lengths than like-named variables in the BASE= data set, the FORCE option must be used with PROC APPEND. Using the FORCE option enables you to append the data sets. However, the DATA= variable values might be truncated.

Example

In the SAS data set Work.Acities, the variable City has a length of 22. In the SAS data set Work.WestAust, City has a length of 50. You can use the CONTENTS procedure to view the attributes of the variables in each data set.
proc contents data=work.acities;
run;
Work.Acities Data Set
proc contents data=work.westaust;
run;
Work.WestAust Data Set
When the following program is submitted, the SAS log indicates that the data sets were not appended because of different lengths for City in the BASE= and DATA= data sets.
proc append base=work.acities 
            data=work.westaust;
run;
Table 14.11 SAS Log
NOTE: Appending WORK.WESTAUST to WORK.ACITIES.
WARNING: Variable City has different lengths on BASE and
         DATA files (BASE 22 DATA 50).
ERROR: No appending done because of anomalies listed above.
       Use FORCE option to append these files.
NOTE: 0 observations added.
NOTE: The data set WORK.ACITIES has 50 observations and 4 variables.
NOTE: Statements not processed because of errors noted above.
NOTE: PROCEDURE APPEND used (Total process time):
      real time           1.44 seconds
      cpu time            0.06 seconds
NOTE: The SAS System stopped processing this step because of errors.
When the FORCE option is used, the SAS log indicates that the data sets are appended, but that dropping or truncating occurs.
proc append base=work.acities 
            data=work.westaust force;
run;
Table 14.12 SAS Log
NOTE: Appending WORK.WESTAUST to WORK.ACITIES.
WARNING: Variable City has different lengths on BASE and DATA files
         (BASE 22 DATA 50).
NOTE: FORCE is specified, so dropping/truncating will occur.
NOTE: There were 50 observations read from the data set WORK.WESTAUST.
NOTE: 50 observations added.
NOTE: The data set WORK.ACITIES has 100 observations and 4 variables.
NOTE: PROCEDURE APPEND used (Total process time):
      real time           1.44 seconds
      cpu time            0.06 seconds
PROC CONTENTS output for the appended version of Work.Acities shows that the variable City has retained a length of 22 from the BASE= data set. Also notice that the variable Code has retained the label Start Point from the BASE= data set.
proc contents 
     data=work.acities;
run;
Work.Acities Data Set
PROC PRINT output shows that some of the values of City are truncated in the appended version of Work.Acities.
proc print 
     data=work.acities 
          (firstobs=45 obs=55);
run;
Work.Acities Data Set

Appending Variables with Different Types

If the DATA= data set contains a variable that does not have the same type as the corresponding variable in the BASE= data set, the FORCE option must be used with PROC APPEND. Using the FORCE option enables you to append the data sets. However, missing values are assigned to the DATA= variable values for the variable whose type did not match.

Example

In the SAS data set Work.Allemps, the variable Phone is a character variable. In the SAS data set Work.Newemps, Phone is a numeric variable. You can use PROC CONTENTS to view the attributes of the variables in each data set.
proc contents data=work.allemps;
run;
Work.Allemps Data Set
proc contents data=work.newemps;
run;
Work.Newemps Data Set
When the following program is submitted, the SAS log indicates that there is a type mismatch for the variable Phone and that data sets were not appended.
proc append base=work.allemps 
            data=work.newemps;
run;
Table 14.13 SAS Log
NOTE: Appending WORK.NEWEMPS to WORK.ALLEMPS.
WARNING: Variable Phone not appended because of type mismatch.
ERROR: No appending done because of anomalies listed above.
       Use FORCE option to append these files.
NOTE: 0 observations added.
NOTE: The data set WORK.ALLEMPS has 550 observations and 5 variables.
NOTE: Statements not processed because of errors noted above.
NOTE: PROCEDURE APPEND used (Total process time):
      real time           0.08 seconds    
      cpu time            0.01 seconds
NOTE: The SAS System stopped processing this step because of errors.
When the FORCE option is used, the SAS log indicates that the data sets are appended, but that the variable Phone is not appended due to the type mismatch.
proc append base=work.allemps 
            data=work.newemps force;
run;
Table 14.14 SAS Log
NOTE: Appending WORK.NEWEMPS to WORK.ALLEMPS.
WARNING: Variable Phone not appended because of type mismatch.
NOTE: FORCE is specified, so dropping/truncating will occur.
NOTE: There were 19 observations read from the data set WORK.NEWEMPS.
NOTE: 19 observations added.
NOTE: The data set WORK.ALLEMPS has 569 observations and 5 variables.
NOTE: PROCEDURE APPEND used (Total process time):
      real time           0.05 seconds
      cpu time            0.02 seconds
PROC CONTENTS output for the appended version of Work.Allemps shows that the variable Phone has retained the type of character from the BASE= data set.
proc contents 
     data=work.allemps;
run;
Work.Allemps Data Set
PROC PRINT output of the appended version of Work.Allemps shows that the values for Phone are missing in the records that were read in from the DATA= data set.
proc print 
     data=work.allemps 
          (firstobs=45 obs=55);
run;
Work.Allemps Data Set
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.12.163.175