The FREQ Procedure

What Does the FREQ Procedure Do?

PROC FREQ is a procedure that is used to give descriptive statistics about a SAS data set. The procedure creates one-way, two-way, and n-way frequency tables. It also describes data by reporting the distribution of variable values. The FREQ procedure creates crosstabulation tables to summarize data for two or more categorical variables by displaying the number of observations for each combination of variable values.
Tip
It is a best practice that you use the TABLES statement with PROC FREQ.

FREQ Procedure Syntax

The FREQ procedure can include many statements and options for controlling frequency output.
FREQ procedure syntax
Syntax, FREQ procedure:
PROC FREQ <options>;
RUN;
The following table lists the options that are available in the PROC FREQ statement.
Table 15.4 PROC FREQ Statement Options
Option
Description
COMPRESS
Begins the display of the next one-way frequency table on the same page as the preceding one-way table if there is enough space to begin the table. By default, the next one-way table begins on the current page only if the entire table fits on that page.
Note: The COMPRESS option is not valid with the PAGE option.
DATA=SAS-data-set
Names the SAS-data-set to be analyzed by PROC FREQ. If you omit the DATA= option, the procedure uses the most recently created SAS data set.
FORMCHAR(1,2,7)='formchar-string'
Defines the characters to be used for constructing the outlines and dividers for the cells of crosstabulation table displays. The formchar-string should be three characters long. The characters are used to draw the vertical separators (position 1), the horizontal separators (position 2), and the vertical-horizontal intersections (position 7). If you do not specify the FORMCHAR= option, PROC FREQ uses FORMCHAR(1,2,7)='|-+' by default.
Position 1
Default: |
The characters are used to draw vertical separators.
Position 2
Default: -
The characters are used to draw horizontal separators.
Position 7
Default: +
The characters are used to draw intersections of vertical and horizontal separators.
Specifying all blanks for formchar-string produces crosstabulation tables with no outlines or dividers (for example, FORMCHAR(1,2,7)='   '). You can use any character in formchar-string, including hexadecimal characters. If you use hexadecimal characters, you must put an x after the closing quotation mark.
NLEVELS
Displays the "Number of Variable Levels" table, which provides the number of levels for each variable named in the TABLES statements.
NOPRINT
Suppresses the display of all output. You can use the NOPRINT option when you want to create only an output data set.
ORDER=DATA | FORMATTED | FREQ | INTERNAL
Specifies the order of the variable levels in the frequency and crosstabulation tables, which you request in the TABLES statement.
The ORDER= option can take the following values:
DATA
order of appearance in the input data set
FORMATTED
external formatted value, except for numeric variables with no explicit format, which are sorted by their unformatted (internal) value
FREQ
descending frequency count; levels with the most observations come first in the order
INTERNAL
unformatted value
Note: The ORDER= option does not apply to missing values, which are always ordered first.
PAGE
Displays only one table per page. Otherwise, PROC FREQ displays multiple tables per page as space permits.
Note: The PAGE option is not valid with the COMPRESS option.
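As a rough sketch of how these PROC FREQ statement options combine, the following steps assume that the Cert library is assigned and that Cert.Usa contains the variable Dept. The OUT= option in the second step is a TABLES statement option that is not described in this section, and Work.DeptCounts is a hypothetical data set name chosen for illustration.

```sas
/* ORDER=FREQ lists the most frequent levels first.          */
/* NLEVELS adds the "Number of Variable Levels" table.       */
proc freq data=cert.usa order=freq nlevels;
  tables dept;
run;

/* NOPRINT suppresses all displayed output, so this step     */
/* only writes the counts to an output data set.             */
/* Work.DeptCounts is a hypothetical name.                   */
proc freq data=cert.usa noprint;
  tables dept / out=work.deptcounts;
run;

/* Inspect the data set that the previous step created. */
proc print data=work.deptcounts;
run;
```

The output data set holds one observation per level of Dept, along with the variables COUNT and PERCENT.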

Example: Creating a One-Way Frequency Table (Default)

By default, the FREQ procedure creates a one-way table that contains the frequency, percent, cumulative frequency, and cumulative percent of every value of every variable in the input data set. In the following example, the FREQ procedure creates a one-way frequency table for each variable in Cert.Usa.
proc freq data=cert.usa;
run;
Output 15.9 PROC FREQ Output of Cert.Usa
PROC FREQ Output of Cert.Usa: Dept Variable
PROC FREQ Output of Cert.Usa: WageCat Variable
PROC FREQ Output of Cert.Usa: WageRate Variable
PROC FREQ Output of Cert.Usa: Manager Variable
PROC FREQ Output of Cert.Usa: JobType Variable

Specifying Variables Using the TABLES Statement

By default, the FREQ procedure creates frequency tables for every variable in a data set. But this is not always what you want. A variable that has continuous numeric values (such as DateTime) can result in a lengthy and meaningless table. Likewise, a variable that has a unique value for each observation (such as FullName) is unsuitable for PROC FREQ processing. Frequency distributions work best with variables whose values are categorical, and whose values are better summarized by counts rather than by averages.
To specify the variables to be processed by the FREQ procedure, include a TABLES statement.
Syntax, TABLES statement:
TABLES variable(s);
variable(s) lists the variables to include.

Example: Creating a One-Way Table for One Variable

The TABLES statement tells SAS the specific frequency tables that you want to create. The following example creates only one frequency table, for the variable Sex, as specified in the TABLES statement. No tables are produced for the other variables in the data set.
proc freq data=cert.diabetes;
  tables sex;
run;
Output 15.10 One-Way Table for the Variable Sex
One-Way Table for the Variable Sex

Example: Determining the Report Layout

The order in which the variables appear in the TABLES statement determines the order in which they are listed in the PROC FREQ report.
Consider the SAS data set Cert.Loans. The variables Rate and Months are categorical variables, so they are the best choices for frequency tables.
proc freq data=cert.loans;
  tables rate months;
run;
Output 15.11 Frequency Tables for Rate and Months
Frequency Tables for Rate and Months
In addition to listing variables separately, you can use a numbered range of variables.
proc freq data=cert.survey;
  tables item1-item3;
run;
Output 15.12 Frequency Tables for Item1–Item3
Frequency Tables for Item1–Item3
Tip
To suppress the display of cumulative frequencies and cumulative percentages in one-way frequency tables and in list output, add the NOCUM option to your TABLES statement. Here is the syntax:
TABLES variable(s) / NOCUM;
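For instance, a minimal sketch that applies NOCUM to the Cert.Loans example shown earlier (assuming the Cert library is assigned) might look like this:

```sas
/* One-way tables for Rate and Months that show only the     */
/* Frequency and Percent columns. NOCUM drops the            */
/* cumulative frequency and cumulative percent columns.      */
proc freq data=cert.loans;
  tables rate months / nocum;
run;
```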

Create Two-Way and N-Way Tables

The simplest crosstabulation is a two-way table. To create a two-way or n-way table, join the variables with an asterisk (*) in the TABLES statement of a PROC FREQ step. A two-way request produces a single table. An n-way request produces a series of two-way tables: the last two variables form the rows and columns, and one table is produced for each combination of levels of the preceding variables.
Syntax, TABLES statement for crosstabulation:
TABLES variable-1 *variable-2 <* ... variable-n>;
In a two-way table request:
  • variable-1 specifies table rows.
  • variable-2 specifies table columns.
Tip: You can include up to 50 variables in a single multi-way table request.
When crosstabulations are specified, PROC FREQ produces tables with cells that contain the following statistics:
  • cell frequency
  • cell percentage of total frequency
  • cell percentage of row frequency
  • cell percentage of column frequency
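To make the four cell statistics concrete, here is a small self-contained sketch. The data set Work.Demo and its variables Sex and Level are invented for illustration and do not come from the Cert library. The values noted in the comments follow from the ten observations in the DATALINES block.

```sas
/* Hypothetical data: 4 observations for F, 6 for M. */
data work.demo;
  input Sex $ Level $;
  datalines;
F High
F Low
F Low
F Low
M High
M High
M High
M High
M Low
M Low
;
run;

/* Cell Sex=F, Level=High:                                   */
/*   Frequency = 1                                           */
/*   Percent   = 1/10 = 10.00  (share of all observations)   */
/*   Row Pct   = 1/4  = 25.00  (share of the F row)          */
/*   Col Pct   = 1/5  = 20.00  (share of the High column)    */
proc freq data=work.demo;
  tables Sex*Level;
run;
```

The three percentages differ only in their denominator: the table total, the row total, or the column total.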

Example: Creating Two-Way Tables

The following example creates a two-way table that shows the frequency of fasting glucose levels for each value of the variable Sex.
proc freq data=cert.diabetes;
  tables sex*fastgluc;
run;
Output 15.13 Two-Way Table Output Cert.Diabetes
PROC FREQ Output of Cert.Diabetes
Note that the first variable, Sex, forms the table rows, and the second variable, FastGluc, forms the columns. Reversing the order of the variables in the TABLES statement would reverse their positions in the table. Note also that the statistics are listed in the legend box.
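As a quick illustration of that reversal, the following sketch swaps the two variables from the example above (again assuming the Cert library is assigned):

```sas
/* FastGluc now forms the rows and Sex forms the columns. */
proc freq data=cert.diabetes;
  tables fastgluc*sex;
run;
```

The cell frequencies and overall percentages are unchanged. Because the rows and columns trade places, the values that appeared as row percentages now appear as column percentages, and vice versa.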

Examples: Creating N-Way Tables

The following example creates a series of two-way tables, one for each level of the variable Survived. Within each table, AG forms the rows and WhiteCells forms the columns.
proc format;
  value Survive 0='Dead'
                1='Alive';
run;
proc freq data=cert.leukemia;
  tables Survived*AG*WhiteCells;
  format Survived survive.;
run;
Output 15.14 N-Way Tables
Creating N-Way Tables: Table 1
Creating N-Way Tables: Table 2

Creating Tables Using the LIST Option

When three or more variables are specified, the multiple levels of n-way tables can produce considerable output. Such bulky, complex crosstabulations are often easier to read when they are arranged as a continuous list. Although this arrangement eliminates row and column frequencies and percentages, the results are compact and clear.
Tip
The LIST option is not available when you also specify statistical options.
To generate list output for crosstabulations, add a slash (/) and the LIST option to the TABLES statement in your PROC FREQ step.
Syntax, TABLES statement:
TABLES variable-1 *variable-2 <* ... variable-n> / LIST;
The variables are specified as in any crosstabulation request. In list output, each combination of variable values appears as one row of the list.
Tip: You can include up to 50 variables in a single multi-way table request.

Example: Using the LIST Option

This example uses the same variables as the previous n-way example: Survived, AG, and WhiteCells. Adding the LIST option to the TABLES statement replaces the series of two-way tables with a single continuous list, in which each row shows one combination of variable values. This makes the PROC FREQ output easier to read.
proc format;
  value survive 0='Dead'
                1='Alive';
run;
proc freq data=cert.leukemia;
  tables Survived*AG*WhiteCells / list;
  format Survived survive.;
run;
Output 15.15 PROC FREQ Output in List Format
PROC FREQ with LIST option

Example: Using the CROSSLIST Option

The CROSSLIST option displays crosstabulation tables in ODS column format instead of the default crosstabulation cell format. In a CROSSLIST table display, the rows correspond to the crosstabulation table cells, and the columns correspond to descriptive statistics such as Frequency and Percent. The table displays the same information as the default crosstabulation table; only the layout differs.
proc format;
  value survive 0='Dead'
                1='Alive';
run;
proc freq data=cert.leukemia;
  tables Survived*AG*whitecells / crosslist;
  format Survived survive.;
run;
Output 15.16 Table Created by the CROSSLIST Option Survived=Dead
Table Created by CROSSLIST Option Survived=Dead
Output 15.17 Table Created by the CROSSLIST Option Survived=Alive
Table Created by CROSSLIST Option Survived=Alive

Suppressing Table Information

Another way to control the format of crosstabulations is to limit the output of the FREQ procedure to a few specific statistics. Remember that when crosstabulations are run, PROC FREQ produces tables with cells that contain these statistics:
  • cell frequency
  • cell percentage of total frequency
  • cell percentage of row frequency
  • cell percentage of column frequency
You can use options to suppress any of these statistics. To control the depth of crosstabulation results, add any combination of the following options to the TABLES statement:
  • NOFREQ suppresses cell frequencies
  • NOPERCENT suppresses cell percentages of the total frequency
  • NOROW suppresses row percentages
  • NOCOL suppresses column percentages
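As one possible combination, the sketch below keeps only the cell counts in a two-way table of Sex by FastGluc from Cert.Diabetes (assuming the Cert library is assigned):

```sas
/* NOPERCENT, NOROW, and NOCOL together remove all three     */
/* percentages, so each cell shows only its frequency.       */
proc freq data=cert.diabetes;
  tables sex*fastgluc / nopercent norow nocol;
run;
```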

Example: Suppressing Percentages

You can suppress frequency counts, row percentages, and column percentages by using the NOFREQ, NOROW, and NOCOL options in the TABLES statement. Each cell then shows only its percentage of the total frequency.
proc format;
  value survive  0='Dead'
                 1='Alive';
run;
proc freq data=cert.leukemia;
  tables Survived*AG*whitecells / nofreq norow nocol;
  format Survived survive.;
run;
Output 15.18 Suppressing Percentage Information
Suppressing Table Information
Last updated: August 23, 2018