Converting Data with Functions

A Word about Converting Data

The following code automatically converts the variable PayRate from character to numeric.
data work.newtemp;
  set cert.temp;
  Salary=payrate*hours;
run;
You can also use the INPUT function before performing a calculation. The INPUT function converts character data values to numeric values.
You can use the PUT function to convert numeric data values to character values.

Potential Problems of Omitting INPUT or PUT

If you skip INPUT or PUT function when converting data, SAS detects the mismatched variables and tries an automatic character-to-numeric or numeric-to-character conversion. However, this action is not always successful. Suppose each value of PayRate begins with a dollar sign ($). When SAS tries to automatically convert the values of PayRate to numeric values, the dollar sign blocks the process. The values cannot be converted to numeric values. Similar problems can occur with automatic numeric-to-character conversion.
Therefore, it is a recommended best practice to include INPUT and PUT functions in your programs to avoid data type mismatches and automatic conversion.

Automatic Character-to-Numeric Conversion

By default, if you reference a character variable in a numeric context such as an arithmetic operation, SAS tries to convert the variable values to numeric. For example, in the DATA step below, the character variable PayRate appears in a numeric context. It is multiplied by the numeric variable Hours to create a new variable named Salary.
data work.newtemp; 
  set cert.temp; 
  Salary=payrate*hours; 
run;
When this step executes, SAS automatically attempts to convert the character values of PayRate to numeric values so that the calculation can occur. This conversion is completed by creating a temporary numeric value for each character value of PayRate. This temporary value is used in the calculation. The character values of PayRate are not replaced by numeric values.
Whenever data is automatically converted, a message is written to the SAS log stating that the conversion has occurred.
Log 14.1 SAS Log
9246  data work.temp;
9247  set cert.temp;
9248  salary=payrate*hours;
9249  run;

NOTE: Character values have been converted to numeric values at the places given by:
      (Line):(Column).
      9248:8
NOTE: There were 10 observations read from the data set CERT.TEMP.
NOTE: The data set WORK.TEMP has 10 observations and 16 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

When Automatic Conversion Occurs

Automatic character-to-numeric conversion occurs in the following circumstances:
  • A character value is assigned to a previously defined numeric variable, such as the numeric variable Rate.
    Rate=payrate;
  • A character value is used in an arithmetic operation.
    Salary=payrate*hours;
  • A character value is compared to a numeric value, using a comparison operator.
    if payrate>=rate;
  • A character value is specified in a function that requires numeric arguments.
    NewRate=sum(payrate,raise);
The following statements are true about automatic conversion.
  • It uses the w. informat, where w is the width of the character value that is being converted.
  • It produces a numeric missing value from any character value that does not conform to standard numeric notation (digits with an optional decimal point, leading sign, or scientific notation).
Table 14.2 Automatic Conversion of Character Variables
Character Value
Automatic Conversion
Numeric Value
12.47
12.47
-8.96
-8.96
1.243E1
12.43
1,742.64
.

Restriction for WHERE Expressions

The WHERE statement does not perform automatic conversions in comparisons. The simple program below demonstrates what happens when a WHERE expression encounters the wrong data type. The variable Number contains a numeric value, and the variable Character contains a character value, but the two WHERE statements specify the wrong data type.
data work.convtest; 
  Number=4; 
  Character='4'; 
run; 
proc print data=work.convtest; 
  where character=4; 
run; 
proc print data=work.convtest; 
  where number='4'; 
run;
This mismatch of character and numeric variables and values prevents the program from processing the WHERE statements. Automatic conversion is not performed. Instead, the program stops, and error messages are written to the SAS log.
Log 14.2 SAS Log
9254 data work.convtest;
9255  Number=4;
9256  Character='4';
9257 run;

NOTE: The data set WORK.CONVTEST has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

9258 proc print data=work.convtest;
9259  where character=4;
ERROR: WHERE clause operator requires compatible variables.
9260 run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

9261 proc print data=work.convtest;
9262  where number='4';
ERROR: WHERE clause operator requires compatible variables.
9263 run;

NOTE: The SAS System stopped processing this step because of errors.

Explicit Character-to-Numeric Conversion

Using the INPUT Function

Use the INPUT function to convert character data values to numeric values. You can explicitly convert the character values of PayRate to numeric values by using the INPUT function.
Syntax, INPUT function:
INPUT(source, informat)
  • source indicates the character variable, constant, or expression to be converted to a numeric value.
  • a numeric informat must also be specified, as in this example:
    input(payrate,2.)
When choosing the informat, be sure to select a numeric informat that can read the form of the values.
Table 14.3 Character Values and Associated Informats
Character Value
Informat
2115233
7.
2,115,233
COMMA9.

Example: INPUT Function

The function uses the numeric informat COMMA9. to read the values of the character variable SaleTest. Then the resulting numeric values are stored in the variable Test. Here is an example of the INPUT function:
Test=input(saletest,comma9.);
You can use the INPUT function to convert the character values of PayRate to numeric values.
Because PayRate has a length of 2, the numeric informat 2. is used to read the values of the variable.
input(payrate,2.)
In the following program, the function is added to the assignment statement in the DATA step.
data work.newtemp; 
  set cert.temp; 
  Salary=input(payrate,2.)*hours; 
run;
After the DATA step is executed, the new data set, which contains the variable Salary, is created. Notice that no conversion messages appear in the SAS log when the INPUT function is used.
Log 14.3 SAS Log
9272 data work.newtemp;
9273  set cert.temp;
9274  Salary=input(payrate,2.)*hours;
9275 run;

NOTE: There were 10 observations read from the data set CERT.TEMP.
Output 14.1 PROC PRINT Output of Work.NewTemp (partial output)
Partial Output: PROC PRINT Output of Work.NewTemp with Salary variable
The syntax of the INPUT function is very similar to the syntax of the PUT function (which performs numeric-to-character conversions).
INPUT(source, informat)
PUT(source, format))
However, note that the INPUT function requires an informat, whereas the PUT function requires a format. To remember which function requires a format versus an informat, note that the INPUT function requires an informat.

Automatic Numeric-to-Character Conversion

The automatic conversion of numeric data to character data is very similar to character-to-numeric conversion. Numeric data values are converted to character values whenever they are used in a character context.
For example, the numeric values of the variable Site are converted to character values if you do the following:
  • assign the numeric value to a previously defined character variable, such as the character variable SiteCode: SiteCode=site;
  • use the numeric value with an operator that requires a character value, such as the concatenation operator: SiteCode=site||dept;
  • specify the numeric value in a function that requires character arguments, such as the SUBSTR function: Region=substr(site,1,4);
Specifically, SAS writes the numeric value with the BEST12. format, and the resulting character value is right-aligned. This conversion occurs before the value is assigned or used with any operator or function. However, automatic numeric-to-character conversion can cause unexpected results. For example, suppose the original numeric value has fewer than 12 digits. The resulting character value has leading blanks, which might cause problems when you perform an operation or function.
Automatic numeric-to-character conversion also causes a message to be written to the SAS log indicating that the conversion has occurred.

Explicit Numeric-to-Character Conversion

Use the PUT function to explicitly convert numeric data values to character data values.
Suppose you want to create a new character variable named Assignment that concatenates the values of the numeric variable Site and the character variable Dept. The new variable values must contain the value of Site followed by a slash (/) and then the value of Dept (for example, 26/DP).
Figure 14.1 SAS Data Set Cert.Temp (partial data set)
The following figure shows selected variables from the Cert.Temp data set.
Here is an assignment statement that contains the concatenation operator (||) to indicate that Site should be concatenated with Dept, using a slash as a separator.
data work.newtemp; 
  set cert.temp; 
  Assignment=site||'/'||dept; 
run;
Note: The slash is enclosed in quotation marks. All character constants must be enclosed in quotation marks.
Submitting this DATA step causes SAS to automatically convert the numeric values of Site to character values because Site is used in a character context. The variable Site appears with the concatenation operator, which requires character values. To explicitly convert the numeric values of Site to character values, you must add the PUT function to your assignment statement.
Syntax, PUT function:
PUT(source, format)
  • source indicates the numeric variable, constant, or expression to be converted to a character value
  • a format matching the data type of the source must also be specified, as in this example:
    put(site,2.)
Here are several facts about the PUT function.
  • The PUT function always returns a character string.
  • The PUT function returns the source written with a format.
  • The format must agree with the source in type.
  • Numeric formats right-align the result; character formats left-align the result.
  • When you use the PUT function to create a variable that has not been previously identified, it creates a character variable whose length is equal to the format width.
When you use a numeric variable as the source, you must specify a numeric format.
To explicitly convert the numeric values of Site to character values, use the PUT function in an assignment statement, where Site is the source variable. Because Site has a length of 2, choose 2. as the numeric format. The DATA step adds the new variable from the assignment statement to the data set.
data work.newtemp; 
  set cert.temp; 
  Assignment=put(site,2.)||'/'||dept; 
run;
proc print data=work.newtemp;
run;
Output 14.2 PROC PRINT Output of Work.NewTemp (partial output)
This graphic displays the new Assignment column that was created by the PUT function.
Notice that no conversion messages appear in the SAS log when you use the PUT function.
Log 14.4 SAS Log
9355 data work.newtemp;
9356  set cert.temp;
9357  Assignment=put(site,2.)||'/'||dept;
9358  run;

NOTE: There were 10 observations read from the data set CERT.TEMP.
NOTE: The data set WORK.NEWTEMP has 10 observations and 17 variables.
Last updated: August 23, 2018
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.224.137.197