Creating Frequency Tables with PROC FREQ

The FREQ procedure produces frequency distributions for quantitative variables as well as classification variables. For example, you can use PROC FREQ to determine the percentage of participants who “agreed strongly” with a statement on a questionnaire, the percentage who “agreed somewhat,” and so forth.

The PROC FREQ and TABLES Statements

The general form for the procedure is as follows:

PROC FREQ   DATA=dataset-name;
   TABLES  variable-list  /   options;
RUN;
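
As one hedged illustration of the options slot in this general form, the sketch below requests a table for the SEX variable in the D1 data set from this chapter's volunteerism program and adds the NOCUM option, which suppresses the cumulative frequency and cumulative percent columns. NOCUM is chosen here only as an example; it is not part of the chapter's program.

```sas
/* Sketch only: D1 and SEX come from this chapter's program.    */
/* NOCUM is used purely to illustrate where options are placed. */
PROC FREQ DATA=D1;
   TABLES SEX / NOCUM;   /* omit the two cumulative columns */
RUN;
```

Options follow a slash after the variable list, and several options can be listed together, separated by blank spaces.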

In the TABLES statement, you list the names of the variables to be analyzed, with each name separated by at least one blank space. Below are the PROC FREQ and TABLES statements from the program presented earlier in this chapter (analyzing data from the volunteerism survey):

PROC FREQ   DATA=D1;
   TABLES SEX CLASS RESNEEDY;
RUN;

Reviewing the Output

The PROC FREQ and TABLES statements shown above for data set D1 cause SAS to create three frequency distributions: one for SEX, one for CLASS, and one for RESNEEDY. The results appear in Output 5.2.

Output 5.2. Results of the FREQ Procedure
                      The FREQ Procedure

                                   Cumulative     Cumulative
   SEX    Frequency     Percent     Frequency       Percent
   ---------------------------------------------------------
   F            11       78.57            11         78.57
   M             3       21.43            14        100.00


                                    Cumulative     Cumulative
  CLASS    Frequency     Percent     Frequency       Percent
  -----------------------------------------------------------
      .           2       14.29             2         14.29
      1           5       35.71             7         50.00
      2           3       21.43            10         71.43
      3           1        7.14            11         78.57
      4           2       14.29            13         92.86
      5           1        7.14            14        100.00


                                     Cumulative    Cumulative
RESNEEDY    Frequency     Percent     Frequency      Percent
-------------------------------------------------------------
       .           1        7.14             1          7.14
       1           2       14.29             3         21.43
       3           1        7.14             4         28.57
       4           7       50.00            11         78.57
       5           3       21.43            14        100.00

Output 5.2 shows that the name of the variable being analyzed appears on the far left side of each frequency distribution, just above the dashed line. The values assumed by the variable appear below the variable name. The first distribution provides information about the SEX variable; below the word “SEX” appear the values “F” and “M.” Information about female participants appears to the right of “F,” and information about male participants appears to the right of “M.” When reviewing a frequency distribution, it is useful to think of these values as categories to which participants can belong.

Under the heading “Frequency,” the output indicates the number of individuals in a given category. Here, you can see that 11 participants were female and 3 were male. Under “Percent,” the output gives the percentage of participants in each category. The table shows that 78.57% of the participants were female and 21.43% were male.

Under “Cumulative Frequency” is the number of observations that appear in the current category plus all of the preceding categories. For example, the first (top) category for SEX is “female.” There were 11 participants in that category, so the cumulative frequency is 11. The next category is “male,” and there are 3 participants in that category. The cumulative frequency for the “male” category is therefore 14 (because 11 + 3 = 14). In the same way, the “Cumulative Percent” column provides the percentage of observations in the current category plus all of the preceding categories.
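
The arithmetic behind these four columns can be checked by hand. The following is a minimal sketch, not part of the chapter's program: it reads the two SEX counts reported in Output 5.2 into a data set and recomputes the percent and cumulative columns. The data set name SEXCHECK and the variable names FREQ, TOTAL, PCT, CUMFREQ, and CUMPCT are invented for this illustration.

```sas
/* Sketch only: recompute the SEX table from the counts in Output 5.2. */
/* SEXCHECK and all variable names below are hypothetical.             */
DATA SEXCHECK;
   INPUT SEX $ FREQ;
   TOTAL   = 14;                     /* 11 females + 3 males            */
   PCT     = 100 * FREQ / TOTAL;     /* percent in this category        */
   CUMFREQ + FREQ;                   /* sum statement: running total    */
   CUMPCT  = 100 * CUMFREQ / TOTAL;  /* percent in this and prior rows  */
DATALINES;
F 11
M 3
;
RUN;

PROC PRINT DATA=SEXCHECK;
   FORMAT PCT CUMPCT 6.2;            /* two decimals, as in the output  */
RUN;
```

The sum statement (CUMFREQ + FREQ) keeps its value from one observation to the next, which is what makes the running total work. The printed values should agree with the SEX table in Output 5.2.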

The next table presents results for the CLASS variable. Notice that the first line of this table begins with a period (.) in the left-hand column (row 1). This indicates that there are missing data for two participants on the CLASS variable, a fact that can be verified by reviewing the data as they appear in the preceding program: CLASS values are blank for two participants.
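
Whether a row for missing values appears depends on the options in the TABLES statement. In current SAS releases, PROC FREQ by default leaves missing values out of the body of the table, reports their count in a “Frequency Missing” line beneath it, and bases the percentages on the nonmissing observations only. A table like the one shown for CLASS, in which the period has its own row and counts toward the percentages, corresponds to the MISSING option. If your own output lacks the row for the period, the following sketch (assuming the D1 data set from this chapter) requests it explicitly:

```sas
/* Sketch only: treat missing values of CLASS as a category of their own. */
PROC FREQ DATA=D1;
   TABLES CLASS / MISSING;   /* missing values get a row and count in percents */
RUN;
```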

If no participant appears in a given category, the value representing that category does not appear in the frequency distribution at all. This is demonstrated with the third table, which presents the frequency distribution for the RESNEEDY variable. Notice that, under the “RESNEEDY” heading, you can find only the values “1,” “3,” “4,” and “5.” The value “2” does not appear because none of the participants checked “Disagree Somewhat” for question 1.
