Compute statistics for the combinations of classification variables in a data set. Create a separate page for each value of one of the classification variables so that each page becomes a complete report for the observations with that value. Add a last page that summarizes the separate pages.
Summarization of Jobs by Region 1 for Each Gender and for All Employees -------------------------------------------------- --------------------------------------------- |Gender: Female | Region | | | |------------------------------ -----------------------------| | | | North | South | East | West | All Regions | | |--------------+--------------+ --------------+--------------+--------------| | | | % of | | % of | | % of | | % of | | % of | | |Count|Category|Count|Category| Count|Category|Count|Category|Count|Category| |------------------+-----+--------+-----+--------+ -----+--------+-----+--------+-----+--------| |Technical | 7| 50.0| 3| 75.0| 5| 50.0| 1| 16.7| 16| 47.1| |------------------+-----+--------+-----+--------+ -----+--------+-----+--------+-----+--------| |Manager/Supervisor| 7| 58.3| 6| 100.0| 7| 50.0| 0| 0| 20| 57.1| |------------------+-----+--------+-----+--------+ -----+--------+-----+--------+-----+--------| |Clerical | 7| 70.0| 2| 28.6| 4| 66.7| 1| 20.0| 14| 50.0| |------------------+-----+--------+-----+--------+ -----+--------+-----+--------+-----+--------| |Administrative | 0| 0| 6| 54.5| 5| 62.5| 0| 0| 11| 42.3| |------------------+-----+--------+-----+--------+ -----+--------+-----+--------+-----+--------| |All Jobs | 21| 50.0| 17| 60.7| 21| 55.3| 2| 13.3| 61| 49.6| -------------------------------------------------- --------------------------------------------- |
Summarization of Jobs by Region 2 for Each Gender and for All Employees -------------------------------------------------- --------------------------------------------- |Gender: Male | Region | | | |------------------------------ -----------------------------| | | | North | South | East | West | All Regions | | |--------------+--------------+ --------------+--------------+--------------| | | | % of | | % of | | % of | | % of | | % of | | |Count|Category|Count|Category| Count|Category|Count|Category|Count|Category| |------------------+-----+--------+-----+--------+ -----+--------+-----+--------+-----+--------| |Technical | 7| 50.0| 1| 25.0| 5| 50.0| 5| 83.3| 18| 52.9| |------------------+-----+--------+-----+--------+ -----+--------+-----+--------+-----+--------| |Manager/Supervisor| 5| 41.7| 0| 0| 7| 50.0| 3| 100.0| 15| 42.9| |------------------+-----+--------+-----+--------+ -----+--------+-----+--------+-----+--------| |Clerical | 3| 30.0| 5| 71.4| 2| 33.3| 4| 80.0| 14| 50.0| |------------------+-----+--------+-----+--------+ -----+--------+-----+--------+-----+--------| |Administrative | 6| 100.0| 5| 45.5| 3| 37.5| 1| 100.0| 15| 57.7| |------------------+-----+--------+-----+--------+ -----+--------+-----+--------+-----+--------| |All Jobs | 21| 50.0| 11| 39.3| 17| 44.7| 13| 86.7| 62| 50.4| -------------------------------------------------- --------------------------------------------- |
Data Set | JOBCLASS |
Featured Step | PROC TABULATE |
Featured Step Statements and Options | TABLE statement: PCTN statistic with denominator definition |
Formatting Features | PROC TABULATE statement: FORMAT= option
TABLE statement: BOX= and MISSTEXT= options TABLE statement: RTS= option when sending output to the LISTING destination |
Related Technique | OPTIONS NOBYLINE, PROC TABULATE with a BY statement, and a TITLE statement with #BYLINE |
A Closer Look | Comparing PROC TABULATE’s Use of the BY Statement and the Page Dimension |
Other Examples That Use This Data Set | Examples 3.4, 3.5, 6.3, and 6.7 |
Presenting tables on separate pages can clarify the differences among tables and make it easier to compare the tables.
The first two pages of this report show by gender the frequency counts and percentages for each job class in each region. The last page summarizes the first two and shows for all employees the frequency counts and percentages for each job class in each region.
The percentage in each table cell compares the corresponding frequency count for that table cell to the frequency count for all employees in the category represented by that table cell. On the first two pages, the percentage represents the proportion of each gender in the specific category defined by job class and region. Since the last page summarizes over gender, the percentage cells are all 100%.
Each observation in JOBCLASS corresponds to the information for one employee.
You can use BY processing with PROC TABULATE to produce a report that is similar to the one featured in this example.
The program that follows produces a basic crosstabulation of OCCUPAT and REGION, with totals for each value of the BY variable, GENDER. Figure 3.6 shows the output from this program.
Summarization of Jobs by Region 1 Data for Gender=Female ------------------------------------------------------------------------------------- | | Region | | | |---------------------------------------------------| | | | North | South | East | West |All Regions | | |------------+------------+------------+------------+------------| | |Count| % |Count| % |Count| % |Count| % |Count| % | |------------------+-----+------+-----+------+-----+------+-----+------+-----+------| |Technical | 7| 11.48| 3| 4.92| 5| 8.20| 1| 1.64| 16| 26.23| |------------------+-----+------+-----+------+-----+------+-----+------+-----+------| |Manager/Supervisor| 7| 11.48| 6| 9.84| 7| 11.48| 0| 0| 20| 32.79| |------------------+-----+------+-----+------+-----+------+-----+------+-----+------| |Clerical | 7| 11.48| 2| 3.28| 4| 6.56| 1| 1.64| 14| 22.95| |------------------+-----+------+-----+------+-----+------+-----+------+-----+------| |Administrative | 0| 0| 6| 9.84| 5| 8.20| 0| 0| 11| 18.03| |------------------+-----+------+-----+------+-----+------+-----+------+-----+------| |All Jobs | 21| 34.43| 17| 27.87| 21| 34.43| 2| 3.28| 61|100.00| ------------------------------------------------------------------------------------- ----------------------------------------page break -------------------------------------------------- 2 Summarization of Jobs by Region Data for Gender=Male ------------------------------------------------------------------------------------- | | Region | | | |---------------------------------------------------| | | | North | South | East | West |All Regions | | |------------+------------+------------+------------+------------| | |Count| % |Count| % |Count| % |Count| % |Count| % | |------------------+-----+------+-----+------+-----+------+-----+------+-----+------| |Technical | 7| 11.29| 1| 1.61| 5| 8.06| 5| 8.06| 18| 29.03| |------------------+-----+------+-----+------+-----+------+-----+------+-----+------| |Manager/Supervisor| 5| 8.06| 0| 0| 7| 11.29| 3| 4.84| 15| 24.19| |------------------+-----+------+-----+------+-----+------+-----+------+-----+------| |Clerical | 3| 4.84| 5| 8.06| 2| 3.23| 4| 6.45| 14| 22.58| |------------------+-----+------+-----+------+-----+------+-----+------+-----+------| |Administrative | 6| 9.68| 5| 8.06| 3| 4.84| 1| 1.61| 15| 24.19| |------------------+-----+------+-----+------+-----+------+-----+------+-----+------| |All Jobs | 21| 33.87| 11| 17.74| 17| 27.42| 13| 20.97| 62|100.00| |
When you use a BY statement instead of the page dimension, your report can compute percentages only for a table within a BY group. You cannot summarize information over all BY groups, because PROC TABULATE processes each BY group individually. Table 3.6 compares the usage of the page dimension and the BY statement in PROC TABULATE.
The following program generated the output in Figure 3.6.
Define formats to associate with the classification variables and the BY variable. | proc format; value gendfmt 1='Female' 2='Male'; value occupfmt 1='Technical' 2='Manager/Supervisor' 3='Clerical' 4='Administrative'; value regfmt 1='North' 2='South' 3='East' 4='West'; run; |
Sort the data set by the variable that defines the BR groups in the report. | proc sort data=jobclass; by gender; run; |
Prevent BY lines from being printed above each BY group of the report. | options nobyline; proc tabulate data=jobclass format=5.; title 'Summarization of Jobs by Region'; |
Put the default BY line text in the title in the position taken by #BYLINE. | title3 'Data for #byline'; |
Produce a table for each value of the BY variable. | by gender; |
Do not include the BY variable as a class variable. | class region occupat; table occupat=' ' all='All Jobs', (region='Region' all='All Regions') |
Compute percentages based on totals within the BY group. | *(n='Count' pctn='%'*f=7.2) / rts=20 misstext='0'; label gender='Gender'; format gender gendfmt. occupat occupfmt. region regfmt.; run; |
Reset the option to its default setting. | options byline; |
A Closer Look |
Table 3.6 describes the differences between PROC TABULATE with a BY statement and PROC TABULATE with a page dimension.
BY statement processing. See “BY statement” in the “Statements” section of SAS 9.1 Language Reference: Dictionary, and “Statements with the Same Function in Multiple Procedures” in the “Concepts” section of Base SAS 9.1 Procedures Guide.
The BYLINE/NOBYLINE SAS System Options. See “SAS System Options” in SAS 9.1 Language Reference: Dictionary.
Inserting BY-group information in titles. See “Creating Titles that Contain BY-Group Information” in the “Fundamental Concepts in Using Base SAS Procedures” section of Base SAS 9.1 Procedures Guide.
PROC TABULATE reference, usage information, and additional examples. See “The TABULATE Procedure,” in the “Procedures” section of Base SAS 9.1 Procedures Guide.
18.223.108.105