Generate a specific number of random numbers within a range of values.
Featured Step | DATA step |
Featured Step Options and Statements | CATS, CEIL, CHAR, FLOOR, RANUNI, and SUBSTR functions |
Output 9.8 GEN_IDS Data Set

        GEN_IDS

Obs    id
  1    TY773
  2    DX963
  3    CE942
  4    ED565
  5    QL714
  6    MF854
  7    SA711
  8    PX734
  9    PL994
 10    AD925
 11    KA798
 12    VK794
 13    YD539
 14    CI706
 15    KI724
 16    XZ719
 17    NM940
 18    HC724
 19    UY685
 20    YA649
 21    PX665
 22    YI986
 23    ZZ725
 24    KL981
 25    CJ543
This example shows how to use the RANUNI random number function to generate random numbers within a range of values. The RANUNI function returns a number from the uniform distribution that is strictly greater than 0 and strictly less than 1.
The single argument to RANUNI is a seed value. This seed serves as a starting point from which to generate the random values. The seed must be an integer. A seed of 0 causes RANUNI to use the system clock time as the starting point, which generates a different stream of random numbers every time the program is run. A positive seed generates the same stream of random numbers every time the program is run.
By multiplying the value that is returned by RANUNI by a constant and then applying the SAS function CEIL or FLOOR, you can generate integers within a specific range of values.
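As an illustration of this scaling idea, the following minimal sketch draws five integers between a lower and an upper bound. The variable names LO, HI, and N are hypothetical and are not part of the example program; the seed 3713 is simply reused from this example.

```sas
data _null_;
  lo = 500;      /* hypothetical lower bound of the range */
  hi = 999;      /* hypothetical upper bound of the range */
  seed = 3713;   /* positive seed, so the stream is reproducible */
  do i = 1 to 5;
    /* RANUNI returns a value between 0 and 1 (exclusive).        */
    /* Scaling by the count of integers in the range and applying */
    /* CEIL gives 1..(hi-lo+1); adding lo-1 shifts it to lo..hi.  */
    n = lo - 1 + ceil(ranuni(seed) * (hi - lo + 1));
    put i= n=;
  end;
run;
```

With LO=500 and HI=999, the expression reduces to CEIL(RANUNI(seed)*500)+499, which is the form used later in this example.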
The UNIFORM function is an alias for the RANUNI function.
Data sets used throughout this book consist of fabricated data. Many data sets require ID values. The DATA steps that fabricate the data frequently use the RANUNI function to create these IDs.
The following DATA step shows a way to fabricate 25 five-byte ID values. It randomly picks characters from variable LETTERS, a 26-byte character variable. LETTERS is a constant whose 26 bytes are the 26 letters of the alphabet.
The seed value that is supplied to RANUNI is specified in variable SEED. Because this value is not 0 in this example, every time the DATA step executes, RANUNI generates the same stream of random numbers. If you make no other changes to the DATA step and keep the value of SEED the same, you would generate the same set of ID values each time you execute the DATA step. If you want a different set each time that can't be reproduced, specify 0 as the seed value. On each execution of RANUNI, the current seed is updated internally, but the value of the seed argument that is stored in SEED remains unchanged.
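A small sketch can make this behavior visible in the SAS log. It assumes the same seed value, 3713, that the example uses; the variable X is hypothetical and exists only for this illustration.

```sas
data _null_;
  retain seed 3713;        /* same nonzero seed as in the example */
  do i = 1 to 3;
    x = ranuni(seed);      /* next value in the random stream */
    put i= seed= x=;       /* write SEED and X to the log */
  end;
run;
```

If you submit this step more than once, the log should list the same three values of X each time, and SEED should be written as 3713 on every line even though the internal seed advances with each call.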
In this example, the requirement is that the ID values must have letters A-Z in the first two bytes and a number between 500 and 999 in the remaining three bytes.
An iterative DO loop selects two letters at random to place in the first two bytes of variable ID. The value that is returned by each invocation of RANUNI is multiplied by 26 and the CEIL function is applied to the result. The CEIL function returns the smallest integer that is greater than or equal to the argument. The values that are returned by RANUNI and multiplied by 26 will be rational numbers greater than 0 and less than 26. Applying CEIL to these values returns integers from 1 to 26 inclusively. The CHAR function picks the single character from FULLSTRING in that integer's position.
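One way to check this reasoning is to draw many positions and record the smallest and largest ones. The sketch below is illustrative only: the variables MINPOS, MAXPOS, FIRST, and LAST and the count of 10,000 draws are assumptions that do not appear in the example program.

```sas
data _null_;
  retain letters 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' seed 3713;
  do i = 1 to 10000;
    pos = ceil(ranuni(seed)*26);   /* integer position from 1 to 26 */
    minpos = min(minpos, pos);     /* MIN ignores the initial missing value */
    maxpos = max(maxpos, pos);     /* MAX ignores the initial missing value */
  end;
  first = char(letters, minpos);   /* letter at the smallest position drawn */
  last  = char(letters, maxpos);   /* letter at the largest position drawn */
  put minpos= maxpos= first= last=;
run;
```

With this many draws, the positions written to the log should stay within 1 and 26, so CHAR never looks past the end of LETTERS.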
The other invocation of RANUNI returns a value that is multiplied by 500. With the CEIL function applied to the result, random numbers between 1 and 500 inclusively are generated. The value 499 is added to each random number, which results in a list of random numbers between 500 and 999.
You can use the FLOOR function instead of the CEIL function if you want to obtain the largest integer that is less than or equal to the argument, but you would need to modify any multipliers or additive adjustments so that you obtain the required range of values.
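As a sketch of that adjustment, the following variation of the DATA step uses FLOOR in place of CEIL. The data set name GEN_IDS_FLOOR is hypothetical. Because FLOOR applied to a value between 0 and 26 (exclusive of 26) returns 0 through 25, 1 is added to obtain a position from 1 to 26. Likewise, FLOOR of a value between 0 and 500 returns 0 through 499, so 500 is added instead of 499.

```sas
data gen_ids_floor;                    /* hypothetical data set name */
  drop i j letters seed pos value;
  length id $ 5;
  retain letters 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' seed 3713;
  do i = 1 to 25;
    id = ' ';
    do j = 1 to 2;
      pos = floor(ranuni(seed)*26) + 1;       /* 0..25 shifted to 1..26 */
      id = cats(id, char(letters, pos));
    end;
    value = floor(ranuni(seed)*500) + 500;    /* 0..499 shifted to 500..999 */
    substr(id, 3) = put(value, 3.);
    output;
  end;
run;
```

The two forms differ only if RANUNI times the multiplier lands exactly on an integer, so with the same seed this step would be expected to produce the same ID values as the CEIL version.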
The following DATA step generates observations and does not read in a data set.
Create data set GEN_IDS.
Define variable ID.
Initialize LETTERS to equal the letters of the alphabet.
Initialize variable SEED that will be supplied as the argument to RANUNI.
Execute an iterative DO loop 25 times to generate 25 values for ID.
Initialize ID at the beginning of generating an ID value.
Fill the first two bytes of ID.
Generate a random number. Use the value of SEED as the seed value to the function. Multiply the random number by 26 and apply the CEIL function to the result to obtain an integer between 1 and 26 inclusively.
Fill the first two positions of ID.
Generate a random number. Multiply it by 500 and apply the CEIL function to return an integer between 1 and 500. Add 499 to the integer, which results in integers from 500 to 999 inclusively that are saved in variable VALUE. It is not necessary to change the seed value that is specified in this invocation of RANUNI. The stream of random numbers was initiated in the first call to RANUNI and continues from there. It is not reset in this DATA step.
Put the three-digit integer in the last three bytes of ID.
Write each ID value to a separate observation. Because this DATA step is generating data and not reading a data set, it is necessary to explicitly output an observation on each iteration of the DO loop so that there will be 25 observations in GEN_IDS.
data gen_ids;
  drop i j letters seed pos value;
  length id $ 5;
  retain letters 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
         seed 3713;
  do i=1 to 25;
    id=' ';
    do j=1 to 2;
      pos=ceil(ranuni(seed)*26);
      id=cats(id,char(letters,pos));
    end;
    value=ceil(ranuni(seed)*500)+499;
    substr(id,3)=put(value,3.);
    output;
  end;
run;