Add new observations to the end of a data set, while retaining the original name of the data set.
Featured Step | DATA step |
Featured Step Options and Statements | SET statement END= option, OUTPUT statement |
Data set FINALSALES contains nine years of final sales results for five categories of books.
FINALSALES

Obs  year  internet  networks        os      apps  training
  1  2001   $62,529   $49,070   $34,506   $45,055   $79,316
  2  2002   $66,068   $50,325   $33,711   $45,640   $75,248
  3  2003   $69,785   $46,015   $33,092   $45,841   $77,029
  4  2004   $64,115   $46,068   $35,185   $44,273   $76,394
  5  2005   $60,832   $51,456   $34,757   $45,015   $79,474
  6  2006   $66,635   $46,017   $34,121   $46,006   $78,829
  7  2007   $69,696   $50,846   $33,560   $45,468   $77,847
  8  2008   $65,127   $49,995   $35,899   $46,874   $79,364
  9  2009   $63,073   $48,654   $33,237   $44,064   $75,760
Output 8.7 PROJ_FINALSALES Data Set

Example 8.7 PROJ_FINALSALES Data Set Created with DATA Step

Obs  year  internet  networks        os      apps  training  type
  1  2001   $62,529   $49,070   $34,506   $45,055   $79,316  Final
  2  2002   $66,068   $50,325   $33,711   $45,640   $75,248  Final
  3  2003   $69,785   $46,015   $33,092   $45,841   $77,029  Final
  4  2004   $64,115   $46,068   $35,185   $44,273   $76,394  Final
  5  2005   $60,832   $51,456   $34,757   $45,015   $79,474  Final
  6  2006   $66,635   $46,017   $34,121   $46,006   $78,829  Final
  7  2007   $69,696   $50,846   $33,560   $45,468   $77,847  Final
  8  2008   $65,127   $49,995   $35,899   $46,874   $79,364  Final
  9  2009   $63,073   $48,654   $33,237   $44,064   $75,760  Final
 10  2010   $64,650   $48,800   $33,071   $44,108   $79,548  Projected
 11  2011   $66,266   $48,946   $32,906   $44,152   $83,525  Projected
 12  2012   $67,923   $49,093   $32,741   $44,196   $87,701  Projected
The following DATA step shows you how to use options in the SET statement to add observations to the end of a data set.
The data in input data set FINALSALES contain final book sales figures for five categories for the nine years from 2001 to 2009.
The goal of the DATA step is to create a new data set that contains all the observations in FINALSALES plus the projected sales figures for 2010 to 2012. Variable TYPE is defined to classify the observations as final sales figures or projected sales figures.
The END= option in the SET statement detects when the end of the data set has been reached. A DO loop executes when processing the last observation. The DO loop outputs three new observations by using information from the last observation in the input data set.
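To see how the END= variable behaves, the following minimal sketch writes its value to the SAS log on each iteration. The data set WORK.DEMO and the variable names ID and LAST are hypothetical and are used only for illustration; they are not part of the example.

```sas
/* Hypothetical three-row data set, used only for illustration */
data work.demo;
  input id;
  datalines;
1
2
3
;
run;

/* LAST is a temporary variable: it is 0 until the SET statement
   reads the final observation, when it becomes 1. It is not
   written to any output data set. */
data _null_;
  set work.demo end=last;
  put _n_= id= last=;
run;
```

The log should show LAST=0 for the first two observations and LAST=1 for the third, which is the condition that the IF-THEN block in this example tests.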
The loop that executes when processing the last observation in input data set FINALSALES projects the sales for the next three years, 2010 to 2012. An array of percentage change values for the five categories is used to project the change from one year to the next. The projected sales for 2010 are based on the actual sales for 2009. The projected sales for 2011 are based on the projected sales for 2010, and the projected sales for 2012 are based on the projected sales for 2011.
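The year-to-year projection can be written out as a short worked calculation. The symbols below are introduced here for illustration only: s_{c,t} is the sales figure for category c in year t, and g_c is the annual percentage change for that category, taken from the GROWTH array in the program (2.5 for INTERNET).

```latex
\[
  s_{c,t} = \operatorname{round}\!\left( s_{c,t-1} + \frac{g_c \, s_{c,t-1}}{100} \right)
\]
% Worked example for the INTERNET category, g = 2.5:
\begin{align*}
  s_{2010} &= \operatorname{round}(63073 \times 1.025) = \operatorname{round}(64649.825) = 64650 \\
  s_{2011} &= \operatorname{round}(64650 \times 1.025) = \operatorname{round}(66266.25)  = 66266 \\
  s_{2012} &= \operatorname{round}(66266 \times 1.025) = \operatorname{round}(67922.65)  = 67923
\end{align*}
```

These values match the INTERNET column for 2010 to 2012 in Output 8.7. Because each year is rounded before it feeds the next, the rounded value, not the unrounded one, is carried forward.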
Create data set PROJ_FINALSALES.
Process each observation in FINALSALES.
Define EOF so that the DATA step can detect when it is processing the last observation in FINALSALES.
Define array DEPT to hold the sales results for the five categories.
Define temporary array GROWTH. Initialize the five elements of GROWTH with the annual projected percentage change in sales for the five categories.
Assign a value to TYPE for observations with final sales figures.
Include an OUTPUT statement so that the DATA step writes out all observations in the input data set. This statement is needed here because the explicit OUTPUT statement in the following IF-THEN block turns off the automatic output that otherwise occurs at the end of each iteration.
Execute this block when processing the last observation in FINALSALES.
Assign a value to TYPE for the new observations.
Project values for the three years from 2010 to 2012.
Project values for the five categories.
Compute the projected result for a category based on the prior value and round to the nearest dollar.
Output each new observation.
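data proj_finalsales;
  set finalsales end=eof;          /* EOF is 1 only on the last observation */
  array dept{*} internet networks os apps training;
  /* Annual projected percentage change for each category */
  array growth{5} _temporary_ (2.5,0.3,-.5,.1,5);
  drop i;
  /* Make TYPE long enough for 'Projected'; without this, the
     RETAIN statement sets its length to 5 and truncates the value */
  length type $ 9;
  retain type 'Final';
  output;                          /* write every input observation */
  if eof then do;
    type='Projected';
    do year=2010 to 2012;
      do i=1 to dim(dept);
        /* prior value plus growth, rounded to the nearest dollar */
        dept{i}=round((dept{i} + growth{i}*dept{i}/100),1);
      end;
      output;                      /* write each projected observation */
    end;
  end;
run;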
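The role of the first OUTPUT statement can be checked with a small experiment. This is a sketch only: the data sets WORK.TWO, WORK.ONLY_NEW, and WORK.ALL_PLUS_NEW and the variables N and LAST are hypothetical names chosen for illustration.

```sas
/* Hypothetical two-row data set, used only for illustration */
data work.two;
  input n;
  datalines;
10
20
;
run;

/* No unconditional OUTPUT: because an OUTPUT statement appears
   in the step, SAS does not write observations automatically,
   so only the appended observation is written. */
data work.only_new;
  set work.two end=last;
  if last then do;
    n = n + 10;
    output;
  end;
run;

/* With the unconditional OUTPUT, every input observation is
   written first, followed by the appended observation. */
data work.all_plus_new;
  set work.two end=last;
  output;
  if last then do;
    n = n + 10;
    output;
  end;
run;
```

WORK.ONLY_NEW should contain one observation (N=30), while WORK.ALL_PLUS_NEW should contain three (N=10, 20, 30). The second pattern is the one that the program above uses.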