Creating Formats from SAS Data Sets

Overview

You have seen that you can create a format by specifying values and labels in a PROC FORMAT step. You can also create a format from a SAS data set that contains value information (called a control data set). To do this, you use the CNTLIN= option to read the data and create the format.
General form, CNTLIN= option in PROC FORMAT:
PROC FORMAT LIBRARY=libref.catalog
CNTLIN=SAS-data-set;
Here is an explanation of the syntax:
libref.catalog
is the name of the catalog in which you want to store the format.
SAS-data-set
is the name of the SAS data set that you want to use to create the format.

Example

Suppose you have a SAS data set named Routes that has variables that are required to create a format. You specify the data set in the CNTLIN= option as follows:
proc format lib=myfmts cntlin=routes;
run;
As you can see, the code for creating a format from a SAS data set is simple. However, the control data set must contain certain variables before it can be used to create a format. Therefore, most data sets must be restructured before they can be used.

Rules for Control Data Sets

When you create a format using programming statements, you specify the name of the format, the range or value, and the label for each range or value as shown in the VALUE statement below:
value rainfall 0='none';
The control data set you use to create a format must contain variables that supply this same information. That is, the data set that is specified in the CNTLIN= option must meet the following requirements:
  • It must contain the variables FmtName, Start, and Label, which contain the format name, value or beginning value in the range, and label.
  • It must contain the variable End if a range is specified. If there is no End variable, SAS assumes that the ending value of the format range is equal to the value of Start.
  • It must contain the variable Type for character formats, unless the value for FmtName begins with a $.
  • It must be grouped by FmtName if multiple formats are specified.

Example

Overview

Suppose you want to create a format that labels a three-letter airport code with the name of the city where the airport is located. You have a data set, Sasuser.Acities, that contains airport codes and airport cities. However, the data does not have the required variables for the CNTLIN= option.
Table 17.3 SAS Data Set Sasuser.Acities (Partial Listing)
City Where Airport Is Located
Start Point
Airport Name
Country Where Airport Is Located
Auckland
AKL
International
New Zealand
Amsterdam
AMS
Schiphol
Netherlands
Anchorage, AK
ANC
Anchorage International Airport
USA
Stockholm
ARN
Arlanda
Sweden
Athens (Athinai)
ATH
Hellinikon International Airport
Greece
Birmingham, AL
BHM
Birmingham International Airport
USA
Bangkok
BKK
Don Muang International Airport
Thailand
To create a format from this data set, you need to do these things:
  1. List data set variables.
  2. Restructure the data.

Step 1: List Data Set Variables

Remember that you need to have the variables FmtName, Start, and Label. You can submit a PROC CONTENTS step to get a listing of the variables in the Sasuser.Acities data set.
Partial Output
proc contents data=sasuser.acities; 
run;
Partial Output
Tip
You can also get a list of variable names by using PROC DATASETS with a CONTENTS statement or by viewing the properties of the SAS data set in the SAS Explorer window.

Step 2: Restructure the Data

Once you have looked at the data and know the variable names, you are ready to write a DATA step to manipulate the data. The variable Code is the three-letter airport code and the variable City is the city where the airport is located. You can rename the variable Code to Start and the variable City to Label, but you also need to create the variable FmtName.
The code below is an efficient way to prepare your data. The DATA step uses the following statements:
  • the KEEP statement to write only the specified variables to the output data set
  • the RETAIN statement to create the variable FmtName and set the value to '$airport'
  • the RENAME data set option to rename the variable Code to Start (you do not need a variable named End because you are labeling discrete values rather than ranges) and to rename the variable City to Label
data sasuser.aports;
   keep Start Label FmtName;
   retain FmtName '$airport';
   set sasuser.acities (rename=(Code=Start 
       City= Label));
run;

proc print data=sasuser.aports(obs=10) noobs;
run;
Below is the listing of the first ten observations in the new data set Sasuser.Aports.
first ten observations
This data set is now in the proper format to be used to create a format with the CNTLIN= option.
Once you have the data in the proper format, you can use the CNTLIN= option to create the format. The first PROC FORMAT step creates a format from the data set Sasuser.Aports. The second PROC FORMAT step documents the new format.
proc format library=sasuser cntlin=sasuser.aports;
run;

proc format library=sasuser fmtlib;
   select $airport;
run;
The first few lines of the output are shown below.
Table 17.4 Partial SAS Output
FORMAT NAME : $AIRPORT LENGTH: 22 NUMBER OF VALUES: 52 MIN LENGTH: 1 MAX LENGTH: 40 DEFAULT LENGTH: 22 FUZZ: 0
START
END
LABEL (VER. V7|V8 21OCT2002:14:13:14)
AKL
AKL
Auckland
AMS
AMS
Amsterdam
ANC
ANC
Anchorage, AK
ARN
ARN
Stockholm
ATH
ATH
Athens (Athinai)
BHM
BHM
Birmingham, AL
BKK
BKK
Bangkok

Apply the Format

Consider the format that is applied to the data set Sasuser.Cargo99. The following PROC PRINT code assigns the $AIRPORT. format to both the Dest and Origin variables:
options fmtsearch=(sasuser);
proc print data=sasuser.cargo99 (obs=5);
   var origin dest cargorev;
   format origin dest $airport.;
run;
Sasuser.Cargo99
Tip
For more information about using the CNTLIN= option, see the SAS documentation for the FORMAT procedure.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.137.217.198