Specifying the Length of Character Variables

The Basics of Specifying Length

Remember that when you use list input to read raw data, character variables are assigned a default length of 8 bytes. This example shows what happens when list input is used to read character values that are longer than 8 characters.
The raw data file referenced by the fileref Citydata contains 1970 and 1980 population figures for several large U.S. cities. Notice that some city names are rather long.
Figure 18.6 Raw Data File with Character Values That Are Longer Than 8
The longer character values are truncated when they are read into the program data vector.
The PROC PRINT output from the following program shows the truncated values for City.
data sasuser.growth; 
   infile citydata; 
   input City $ Pop70 Pop80; 
run; 
proc print data=sasuser.growth; 
run;
Figure 18.7 Output with Truncated Values
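The truncation can be reproduced without the Citydata file. The following is a minimal sketch, assuming in-stream data in place of the raw file. The data set name work.growth_demo is hypothetical, only PHILADELPHIA comes from the text above, and the other city name and all population numbers are placeholders, not real figures.

```sas
/* Sketch only: in-stream data stands in for the Citydata raw file.   */
/* CHICAGO and every number below are placeholders, not real figures. */
data work.growth_demo;
   input City $ Pop70 Pop80;   /* list input: City gets the default length of 8 */
   datalines;
CHICAGO 100 200
PHILADELPHIA 300 400
;
run;

proc print data=work.growth_demo;
run;
```

In this sketch, the second observation should show City as PHILADEL, because only the first 8 characters of the value are stored.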

The LENGTH Statement

Remember, variable attributes are defined when the variable is first encountered in the DATA step. In the program below, the LENGTH statement precedes the INPUT statement and defines both the length and type of the variable City. A length of 12 has been assigned to accommodate PHILADELPHIA, which is the longest value for City.
data sasuser.growth; 
   infile citydata; 
   length City $ 12; 
   input City $ Pop70 Pop80; 
run; 
proc print data=sasuser.growth; 
run;
Figure 18.8 Raw Data File with Character Values That Are Longer Than 8
Because the LENGTH statement already defines City as a character variable, you do not need to specify City's type in the INPUT statement. However, leaving the $ in the INPUT statement does not produce an error. Your output should now display the complete values for City.
Figure 18.9 Output Using Length Statement
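As an illustration of omitting the $ from the INPUT statement, here is a minimal sketch that again assumes in-stream placeholder data instead of the Citydata file. The data set name work.growth_len is hypothetical, and the PROC CONTENTS step is added only as one way to check the type and length that were assigned to City.

```sas
/* Sketch only: placeholder rows stand in for the Citydata raw file. */
data work.growth_len;
   length City $ 12;          /* defines City as character with length 12 */
   input City Pop70 Pop80;    /* no $ needed: City is already character   */
   datalines;
CHICAGO 100 200
PHILADELPHIA 300 400
;
run;

/* Lists each variable with its type and length */
proc contents data=work.growth_len;
run;
```

With a length of 12, the value PHILADELPHIA fits without truncation.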
Note: Because variable attributes are defined when the variable is first encountered in the DATA step, a variable that is defined in a LENGTH statement (if it precedes an INPUT statement) appears first in the data set, regardless of the order of the variables in the INPUT statement.
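The effect on variable order can be seen in a small sketch. It assumes a hypothetical record layout in which the city name comes last, uses placeholder values, and writes to a hypothetical data set named work.order_demo. The VARNUM option of PROC CONTENTS lists variables by their position in the data set.

```sas
/* Sketch only: hypothetical layout with the city name last in each record. */
data work.order_demo;
   length City $ 12;           /* City is encountered first in the DATA step */
   input Pop70 Pop80 City;     /* City is read last from each record         */
   datalines;
100 200 CHICAGO
300 400 PHILADELPHIA
;
run;

/* VARNUM lists variables in data set order rather than alphabetically */
proc contents data=work.order_demo varnum;
run;
```

City should be listed as variable 1 even though it is the last variable named in the INPUT statement.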
Last updated: January 10, 2018