Example 8.8 Adding Observations to a Data Set Based on the Value of a Variable

Goal

Add a specific number of observations to a data set based on the value of one of its variables.

Example Features

Featured Step: DATA step
Featured Step Options and Statements: OUTPUT statement

Input Data Set

Data set NEWSOFTWARE contains the project steps and step duration for a new software installation.

         NEWSOFTWARE

 Obs    task          totaldays
  1     install            1
  2     benchmarks         2
  3     review 1           2
  4     parallel           5
  5     review 2           3
  6     complete           1
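
The source does not show how NEWSOFTWARE was created. To run the example, the listing above could be entered with a DATA step along these lines. This is only a sketch: the column positions and the resulting length of 10 for TASK are assumptions chosen to fit the values shown, including the embedded blanks in "review 1" and "review 2".

```sas
/* Hypothetical setup step: builds NEWSOFTWARE from the listing above. */
data newsoftware;
  /* Column input for TASK keeps embedded blanks such as "review 1".  */
  /* TOTALDAYS is then read with list input starting at column 11.    */
  input task $ 1-10 totaldays;
  datalines;
install    1
benchmarks 2
review 1   2
parallel   5
review 2   3
complete   1
;
run;
```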

Resulting Data Set

Output 8.8 NEWSOFTWAREDAILY Data Set

 Example 8.8 NEWSOFTWAREDAILY Data Set Created with DATA Step

Obs                  workdate   task         totaldays   taskday

  1       Monday, Mar 1, 2010   install          1          1
  2      Tuesday, Mar 2, 2010   benchmarks       2          1
  3    Wednesday, Mar 3, 2010   benchmarks       2          2
  4     Thursday, Mar 4, 2010   review 1         2          1
  5       Friday, Mar 5, 2010   review 1         2          2
  6       Monday, Mar 8, 2010   parallel         5          1
  7      Tuesday, Mar 9, 2010   parallel         5          2
  8   Wednesday, Mar 10, 2010   parallel         5          3
  9    Thursday, Mar 11, 2010   parallel         5          4
 10      Friday, Mar 12, 2010   parallel         5          5
 11      Monday, Mar 15, 2010   review 2         3          1
 12     Tuesday, Mar 16, 2010   review 2         3          2
 13   Wednesday, Mar 17, 2010   review 2         3          3
 14    Thursday, Mar 18, 2010   complete         1          1


Example Overview

The DATA step in this example uses the value of an existing variable in an input data set to determine how many observations to output for each observation in the input data set.

The sequence of tasks in installing some new software is stored in data set NEWSOFTWARE. Each observation has the title of one task and the number of days to complete the task. The goal of the DATA step is to output one observation for each day of the project.

The DATA step executes a DO loop for every observation in data set NEWSOFTWARE. The upper index of the DO loop is the value of TOTALDAYS for the observation that is currently being processed. The DO loop outputs an observation for each day that the task will be performed.
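
Stripped of the date handling, the core of the technique is an OUTPUT statement inside a DO loop whose upper bound comes from the data. The following is a minimal sketch, not the example's program; the output data set name EXPANDED is hypothetical, and the loop index doubles as the day counter.

```sas
/* Minimal sketch: write TOTALDAYS copies of each input observation. */
data expanded;
  set newsoftware;
  /* TASKDAY runs from 1 to TOTALDAYS; each pass writes one observation. */
  do taskday=1 to totaldays;
    output;
  end;
run;
```

Because an explicit OUTPUT statement is present, the DATA step does not write an observation automatically at the end of each iteration, so the final index value (TOTALDAYS plus 1) is never written. With the six input observations shown earlier, EXPANDED would contain 1+2+2+5+3+1 = 14 observations.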

The beginning date of the installation process is set by the RETAIN statement for WORKDATE at the beginning of the DATA step. The value of WORKDATE is incremented by 1 at the end of each iteration of the DO loop, after the observation for that day has been written. Before each observation is written, the DATA step checks whether WORKDATE falls on a weekend: a Saturday date is incremented by two days and a Sunday date by one day, which moves the task work to the following Monday.
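
The weekend test depends on the numbering that the WEEKDAY function uses: 1 for Sunday through 7 for Saturday. The short sketch below writes that value to the SAS log for four consecutive dates around the first weekend of the project. The variable names D and DOW are chosen only for illustration.

```sas
/* Illustrative check of WEEKDAY values; writes to the log only. */
data _null_;
  do d='05mar2010'd to '08mar2010'd;
    dow=weekday(d);
    /* DOW is 6 (Friday), 7 (Saturday), 1 (Sunday), 2 (Monday) */
    put d= date9. dow=;
  end;
run;
```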

Program

/* Create data set NEWSOFTWAREDAILY. */
data newsoftwaredaily;
  /* Assign attributes to new variable WORKDATE. */
  format workdate weekdate25.;
  /* Read each observation from NEWSOFTWARE. */
  set newsoftware;
  /* Initialize variable WORKDATE and retain its value across */
  /* iterations of the DATA step.                             */
  retain workdate '01mar2010'd;
  drop dayofweek i;
  /* Define a variable to track the day within the task. */
  taskday=0;
  /* Execute a DO loop the number of times equal to the value of TOTALDAYS. */
  do i=1 to totaldays;
    /* Increment the variable that tracks the day within the task. */
    taskday+1;
    /* Determine the day of the week of the current value of WORKDATE. */
    dayofweek=weekday(workdate);
    /* For Saturdays, increment WORKDATE two days to the following Monday. */
    if dayofweek=7 then workdate+2;
    /* For Sundays, increment WORKDATE one day to the following Monday. */
    else if dayofweek=1 then workdate+1;
    /* Output one observation for each day of a task. */
    output;
    /* Increment WORKDATE by 1 to move to the next day in the installation process. */
    workdate+1;
  end;
run;
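
As a possible variation, not the approach this example takes, the weekend logic could be delegated to the INTNX function with the WEEKDAY interval, which by default treats Saturday and Sunday as non-working days. The sketch below assumes that the starting date is itself a weekday; unlike the program above, it does not move a weekend starting date forward to Monday. The data set name NEWSOFTWAREDAILY_ALT is hypothetical.

```sas
/* Alternative sketch: advance WORKDATE by working days with INTNX. */
/* Assumes the retained starting date falls on a weekday.           */
data newsoftwaredaily_alt;
  format workdate weekdate25.;
  set newsoftware;
  retain workdate '01mar2010'd;
  /* The loop index serves as the day-within-task counter. */
  do taskday=1 to totaldays;
    output;
    /* Move to the next weekday; a Friday advances to the following Monday. */
    workdate=intnx('weekday', workdate, 1);
  end;
run;
```

Under that assumption the two programs should produce the same observations, which could be checked with PROC COMPARE (BASE=NEWSOFTWAREDAILY, COMPARE=NEWSOFTWAREDAILY_ALT).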
