Modifying Variables

Selected Useful Statements

Here are examples of statements that accomplish specific data-manipulation tasks.
Table 9.4 Manipulating Data Using the DATA Step
Task                                Example Code
Subset data                         if resthr<70 then delete;
                                    if tolerance='D';
Drop unwanted variables             drop timemin timesec;
Create or modify a variable         TotalTime=(timemin*60)+timesec;
Initialize and retain a variable    retain SumSec 5400;
Accumulate totals                   sumsec+totaltime;
Specify a variable's length         length TestLength $ 6;
Execute statements conditionally    if totaltime>800 then TestLength='Long';
                                    else if 750<=totaltime<=800
                                      then TestLength='Normal';
                                    else if totaltime<750
                                      then TestLength='Short';
The following topics discuss these tasks.
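As a rough sketch of how the statements in Table 9.4 might fit together, the following DATA step combines them in one place. It assumes that cert.tests contains the variables resthr, tolerance, timemin, and timesec, as the table's examples imply. Using both subsetting conditions at once is an illustrative choice, not something the table requires.

```sas
data work.stresstest;
  set cert.tests;
  /* Subset: drop low resting heart rates, keep only tolerance 'D'. */
  if resthr<70 then delete;
  if tolerance='D';
  /* DROP takes effect on output, so timemin and timesec */
  /* are still available for the calculation below.      */
  drop timemin timesec;
  TotalTime=(timemin*60)+timesec;
  /* Start the accumulator at 5400 and add each TotalTime to it. */
  retain SumSec 5400;
  sumsec+totaltime;
  /* Declare the length before the first assignment to TestLength. */
  length TestLength $ 6;
  if totaltime>800 then TestLength='Long';
  else if 750<=totaltime<=800 then TestLength='Normal';
  else if totaltime<750 then TestLength='Short';
run;
```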

Accumulating Totals

To add the result of an expression to an accumulator variable, you can use a sum statement in your DATA step.
Syntax, sum statement:
variable+expression;
  • variable specifies the name of the accumulator variable. This variable must be numeric. The variable is automatically set to 0 before the first observation is read. The variable's value is retained from one DATA step execution to the next.
  • expression is any valid SAS expression.
Note: If the expression produces a missing value, the sum statement ignores it.
The sum statement is one of the few SAS statements that do not begin with a keyword.
The sum statement adds the result of the expression on the right side of the plus sign (+) to the numeric variable on the left side. Because the accumulator variable is initialized to 0 rather than to missing before the first iteration of the DATA step, and because its value is retained from one iteration to the next, it holds a running total across observations.
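To illustrate the note above about missing values, here is a minimal sketch that compares a sum statement with an ordinary assignment statement. The data set work.sumdemo and the variables Amount, Total, and Running are hypothetical names chosen for this illustration.

```sas
data work.sumdemo;
  input Amount;
  /* Sum statement: Total starts at 0, is retained, and skips missing values. */
  Total+Amount;
  /* Assignment statement: a missing Amount makes the result missing. */
  retain Running 0;
  Running=Running+Amount;
  datalines;
10
.
5
;
run;

proc print data=work.sumdemo;
run;
```

In this sketch, Total takes the values 10, 10, and 15. Running becomes missing at the second observation and stays missing, because adding any number to a missing value with the + operator yields missing.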

Example: Accumulating Totals

To find the total number of elapsed seconds in the treadmill stress tests, you need a variable, SumSec, whose value begins at 0 and increases by the total seconds in each observation. The sum statement in the following DATA step creates that variable:
data work.stresstest; 
  set cert.tests; 
  TotalTime=(timemin*60)+timesec; 
  SumSec+totaltime; 
run;
The value of the variable on the left side of the plus sign, SumSec, begins at 0 and increases by the value of TotalTime with each observation.
SumSec   =   TotalTime   +   Previous total
     0
   758   =         758   +                0
  1363   =         605   +              758
  2036   =         673   +             1363
  2618   =         582   +             2036
  3324   =         706   +             2618

Initializing Sum Variables

In the previous example, the sum variable SumSec was initialized to 0 before the first observation was read. However, you can initialize SumSec to a value other than 0.
Use the RETAIN statement to assign an initial value, other than 0, to an accumulator variable in a sum statement.
The RETAIN statement has several purposes:
  • It assigns an initial value to a retained variable.
  • It prevents variables from being initialized each time the DATA step executes.
Syntax, RETAIN statement for initializing sum variables:
RETAIN variable <initial-value>;
  • variable is a variable whose values you want to retain.
  • initial-value specifies an initial value (numeric or character) for the preceding variable.
Note: The following statements are true about the RETAIN statement:
  • It is a compile-time-only statement that creates variables if they do not already exist.
  • It initializes the retained variable to missing before the first execution of the DATA step if you do not supply an initial value.
  • It has no effect on variables that are read with SET, MERGE, or UPDATE statements.
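RETAIN is not limited to sum statements. As a hedged sketch of the notes above, the following step keeps a running maximum in a retained variable that has no initial value. The names work.maxtime and MaxTime are hypothetical, and the step assumes that cert.tests contains timemin and timesec, as in the surrounding examples.

```sas
data work.maxtime;
  set cert.tests;
  TotalTime=(timemin*60)+timesec;
  /* No initial value is given, so MaxTime starts as missing. */
  retain MaxTime;
  /* MAX ignores missing arguments, so the first nonmissing */
  /* TotalTime becomes the starting maximum.                */
  MaxTime=max(MaxTime, TotalTime);
run;
```

Without the RETAIN statement, MaxTime would be reset to missing at the start of each iteration and would simply equal TotalTime in every observation.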

Example: RETAIN Statement

Suppose you want to add 5400 seconds (the accumulated total seconds from a previous treadmill stress test) to the variable SumSec in the StressTest data set when you create the data set. To initialize SumSec with the value 5400, use the RETAIN statement shown below. With this statement, the value of SumSec begins at 5400 and increases by the value of TotalTime with each observation.
data work.stresstest;
  set cert.tests;
  TotalTime=(timemin*60)+timesec;
  retain SumSec 5400;
  sumsec+totaltime;
run;
proc print data=work.stresstest;
run;
SumSec   =   TotalTime   +   Previous Total
  5400
  6158   =         758   +             5400
  6763   =         605   +             6158
  7436   =         673   +             6763
  8018   =         582   +             7436
  8724   =         706   +             8018
Last updated: August 23, 2018