The FREQ procedure produces frequency distributions for quantitative variables as well as classification variables. For example, you can use PROC FREQ to determine the percentage of participants who “agreed strongly” with a statement on a questionnaire, the percentage who “agreed somewhat,” and so forth.
The general form for the procedure is as follows:
PROC FREQ DATA=dataset-name;
   TABLES variable-list / options;
RUN;
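As one illustration of how the options slot in this general form can be filled, the sketch below uses NOCUM, a TABLES option that suppresses the cumulative frequency and cumulative percent columns. The dataset name D1 and the variable SEX are assumed here only for illustration; they follow the names used in this chapter's volunteerism program.

```sas
/* Sketch only: assumes a dataset named D1 that contains a variable SEX. */
PROC FREQ DATA=D1;
   /* NOCUM omits the cumulative frequency and cumulative percent columns */
   TABLES SEX / NOCUM;
RUN;
```

Options always follow the slash; when no options are needed, the slash is omitted as well.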
In the TABLES statement, you list the names of the variables to be analyzed, with each name separated by at least one blank space. Below are the PROC FREQ and TABLES statements from the program presented earlier in this chapter (analyzing data from the volunteerism survey):
PROC FREQ DATA=D1;
   TABLES SEX CLASS RESNEEDY;
RUN;
These statements will cause SAS to create three frequency distributions: one for the SEX variable; one for CLASS; and one for RESNEEDY. This output appears in Output 5.2.
The FREQ Procedure

                                    Cumulative    Cumulative
SEX        Frequency     Percent     Frequency       Percent
------------------------------------------------------------
F                 11       78.57            11         78.57
M                  3       21.43            14        100.00

                                    Cumulative    Cumulative
CLASS      Frequency     Percent     Frequency       Percent
------------------------------------------------------------
.                  2       14.29             2         14.29
1                  5       35.71             7         50.00
2                  3       21.43            10         71.43
3                  1        7.14            11         78.57
4                  2       14.29            13         92.86
5                  1        7.14            14        100.00

                                    Cumulative    Cumulative
RESNEEDY   Frequency     Percent     Frequency       Percent
------------------------------------------------------------
.                  1        7.14             1          7.14
1                  2       14.29             3         21.43
3                  1        7.14             4         28.57
4                  7       50.00            11         78.57
5                  3       21.43            14        100.00
Output 5.2 shows that the name for the variable being analyzed appears on the far left side of the frequency distribution, just above the dotted line. The values assumed by the variable appear below this variable name. The first distribution provides information about the SEX variable; below the word “SEX” appear the values “F” and “M.” Information about female participants appears to the right of “F,” and information about males appears to the right of “M.” When reviewing a frequency distribution, it is useful to think of these different values as representing categories to which participants can belong.
Under the heading “Frequency,” the output indicates the number of individuals in a given category. Here, you can see that 11 participants were female while 3 were male. Below “Percent,” the percent of participants in each category appears. The table shows that 78.57% of the participants were female while 21.43% were male.
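As a quick check on the “Percent” column, each value is the category frequency divided by the total number of observations in the table (14 for SEX) and multiplied by 100:

```latex
\[
\text{Percent}_{F} = \frac{11}{14} \times 100 \approx 78.57,
\qquad
\text{Percent}_{M} = \frac{3}{14} \times 100 \approx 21.43
\]
```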
Under “Cumulative Frequency” is the number of observations that appear in the current category plus all of the preceding categories. For example, the first (top) category for SEX is “F” (female). There were 11 participants in that category, so the cumulative frequency is 11. The next category is “M” (male), and there are 3 participants in that category. The cumulative frequency for the “M” category is therefore 14 (because 11 + 3 = 14). In the same way, the “Cumulative Percent” column provides the percent of observations in the current category plus all of the preceding categories.
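The same logic can be checked against a longer table. Take the row for the value 2 in the CLASS table of Output 5.2: its cumulative frequency adds the counts for the missing-value (.) row, the value 1, and the value 2, and its cumulative percent divides that sum by the 14 observations in the table:

```latex
\[
\text{Cumulative Frequency} = 2 + 5 + 3 = 10,
\qquad
\text{Cumulative Percent} = \frac{10}{14} \times 100 \approx 71.43
\]
```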
The next table presents results for the CLASS variable. Notice that the first line of this table begins with a period (.) in the left-hand column (row 1). This indicates that there are missing data for two participants on the CLASS variable, a fact that can be verified by reviewing the data as they appear in the preceding program: CLASS values are blank for two participants.
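Whether a missing-value row like this appears inside the table depends on the TABLES options and may depend on the SAS release in use. In current releases, PROC FREQ by default leaves missing values out of the table body and reports their count in a “Frequency Missing” line beneath it. Adding the MISSING option treats missing values as an ordinary category that is included in the percentages and cumulative statistics, which matches the layout of Output 5.2. A minimal sketch, assuming the dataset D1 and the variables from the program above:

```sas
/* Sketch only: assumes dataset D1 with CLASS and RESNEEDY, as in this chapter. */
PROC FREQ DATA=D1;
   /* MISSING counts missing values (.) as a category in each table */
   TABLES CLASS RESNEEDY / MISSING;
RUN;
```

The related MISSPRINT option displays the missing-value frequencies in the table but leaves them out of the percentages.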
If no participant appears in a given category, the value representing that category does not appear in the frequency distribution at all. This is demonstrated with the third table, which presents the frequency distribution for the RESNEEDY variable. Notice that, under the “RESNEEDY” heading, you can find only the values “1,” “3,” “4,” and “5.” The value “2” does not appear because none of the participants checked “Disagree Somewhat” for question 1.