Example 3.6. Creating Multipage Summary Tables

Goal

Compute statistics for the combinations of classification variables in a data set. Create a separate page for each value of one of the classification variables so that each page becomes a complete report for the observations with that value. Add a last page that summarizes the separate pages.

Report

                                Summarization of
 Jobs by Region                                   1
                             for Each Gender and
 for All Employees

--------------------------------------------------
---------------------------------------------
|Gender: Female    |                         
 Region                           |              |
|                 
 |------------------------------
-----------------------------|              |
|                  |    North     |    South     |
     East     |     West     | All Regions  |
|                 
 |--------------+--------------+
--------------+--------------+--------------|
|                  |     |  % of  |     |  % of  |
     |  % of  |     |  % of  |     |  % of  |
|                 
 |Count|Category|Count|Category|
Count|Category|Count|Category|Count|Category|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Technical         |    7|    50.0|    3|    75.0|
    5|    50.0|    1|    16.7|   16|    47.1|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Manager/Supervisor|    7|    58.3|    6|   100.0|
    7|    50.0|    0|       0|   20|    57.1|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Clerical          |    7|    70.0|    2|    28.6|
    4|    66.7|    1|    20.0|   14|    50.0|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Administrative    |    0|       0|    6|    54.5|
    5|    62.5|    0|       0|   11|    42.3|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|All Jobs          |   21|    50.0|   17|    60.7|
   21|    55.3|    2|    13.3|   61|    49.6|
--------------------------------------------------
---------------------------------------------


                                Summarization of
 Jobs by Region                                   2
                             for Each Gender and
 for All Employees

--------------------------------------------------
---------------------------------------------
|Gender: Male      |                         
 Region                           |              |
|                 
 |------------------------------
-----------------------------|              |
|                  |    North     |    South     |
     East     |     West     | All Regions  |
|                 
 |--------------+--------------+
--------------+--------------+--------------|
|                  |     |  % of  |     |  % of  |
     |  % of  |     |  % of  |     |  % of  |
|                 
 |Count|Category|Count|Category|
Count|Category|Count|Category|Count|Category|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Technical         |    7|    50.0|    1|    25.0|
    5|    50.0|    5|    83.3|   18|    52.9|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Manager/Supervisor|    5|    41.7|    0|       0|
    7|    50.0|    3|   100.0|   15|    42.9|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Clerical          |    3|    30.0|    5|    71.4|
    2|    33.3|    4|    80.0|   14|    50.0|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Administrative    |    6|   100.0|    5|    45.5|
    3|    37.5|    1|   100.0|   15|    57.7|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|All Jobs          |   21|    50.0|   11|    39.3|
   17|    44.7|   13|    86.7|   62|    50.4|
--------------------------------------------------
---------------------------------------------


                                Summarization of
 Jobs by Region                                   3
                             for Each Gender and
 for All Employees

--------------------------------------------------
---------------------------------------------
|All Employees     |                         
 Region                           |              |
|                 
 |------------------------------
-----------------------------|              |
|                  |    North     |    South     |
     East     |     West     | All Regions  |
|                 
 |--------------+--------------+
--------------+--------------+--------------|
|                  |     |  % of  |     |  % of  |
     |  % of  |     |  % of  |     |  % of  |
|                 
 |Count|Category|Count|Category|
Count|Category|Count|Category|Count|Category|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Technical         |   14|   100.0|    4|   100.0|
   10|   100.0|    6|   100.0|   34|   100.0|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Manager/Supervisor|   12|   100.0|    6|   100.0|
   14|   100.0|    3|   100.0|   35|   100.0|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Clerical          |   10|   100.0|    7|   100.0|
    6|   100.0|    5|   100.0|   28|   100.0|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|Administrative    |    6|   100.0|   11|   100.0|
    8|   100.0|    1|   100.0|   26|   100.0|
|------------------+-----+--------+-----+--------+
-----+--------+-----+--------+-----+--------|
|All Jobs          |   42|   100.0|   28|   100.0|
   38|   100.0|   15|   100.0|  123|   100.0|
--------------------------------------------------
---------------------------------------------


Example Features

Data SetJOBCLASS
Featured StepPROC TABULATE
Featured Step Statements and OptionsTABLE statement: PCTN statistic with denominator definition
Formatting FeaturesPROC TABULATE statement: FORMAT= option

TABLE statement: BOX= and MISSTEXT= options

TABLE statement: RTS= option when sending output to the LISTING destination
Related TechniqueOPTIONS NOBYLINE, PROC TABULATE with a BY statement, and a TITLE statement with #BYLINE
A Closer LookComparing PROC TABULATE’s Use of the BY Statement and the Page Dimension
Other Examples That Use This Data SetExamples 3.4, 3.5, 6.3, and 6.7

Example Overview

Presenting tables on separate pages can clarify the differences among tables and make it easier to compare the tables.

The first two pages of this report show by gender the frequency counts and percentages for each job class in each region. The last page summarizes the first two and shows for all employees the frequency counts and percentages for each job class in each region.

The percentage in each table cell compares the corresponding frequency count for that table cell to the frequency count for all employees in the category represented by that table cell. On the first two pages, the percentage represents the proportion of each gender in the specific category defined by job class and region. Since the last page summarizes over gender, the percentage cells are all 100%.

Each observation in JOBCLASS corresponds to the information for one employee.

Program

Define formats to associate with the classification variables.
proc format;
  value gendfmt 1='Female'
                2='Male';
  value occupfmt 1='Technical'
                 2='Manager/Supervisor'
                 3='Clerical'
                 4='Administrative';
  value regfmt 1='North'
               2='South'
               3='East'
               4='West';
run;

proc tabulate data=jobclass

Specify a default format for each cell in the table.
     format=5.;

  title  'Summarization of Jobs by Region';
  title2  'for Each Gender and for All
           Employees';

  class gender region occupat;
  table

Start with the classifications in the page dimension. Supply headings to elements of the table by enclosing text within quotation marks.
        gender='Gender: '

Produce a page that summarizes the other pages by including ALL as one of the class variables in the page dimension.
        all='All Employees',

Specify the classifications in the row dimension.
        occupat=' '

Summarize over OCCUPAT in the last row on each page by including ALL as one of the class variables in the row dimension.
        all='All Jobs',

Conclude with the column dimension.
        (region='Region'

Summarize over REGION in the last column on each page by including ALL as one of the class variables in the column dimension.
        all='All Regions')

Nest the statistics under the class variables in the column dimension. As the denominator of the percentage calculation, use the sum of the frequency counts for males and females for each combination of OCCUPAT and REGION.
        *(n='Count' pctn<gender all>=
            '% of Category'*f=8.1)

When sending output to the LISTING destination, specify the space allocated to row titles.
        / rts=20

Specify the text to print in cells that contain missing values.
          misstext='0'

Place the default page heading in the empty box above the row titles.
          box=_page_;

Since GENDER is in the page dimension, write the formatted values of GENDER in the empty box above the row titles.
  format gender gendfmt. occupat occupfmt.
         region regfmt.;
run;


Related Technique

You can use BY processing with PROC TABULATE to produce a report that is similar to the one featured in this example.

The program that follows produces a basic crosstabulation of OCCUPAT and REGION, with totals for each value of the BY variable, GENDER. Figure 3.6 shows the output from this program.

Figure 3.6. Output from PROC TABULATE Using BY Processing
                                 Summarization of Jobs by Region                          
          1
                                     Data for Gender=Female
         -------------------------------------------------------------------------------------
         |                  |                      Region                       |            |
         |                  |---------------------------------------------------|            |
         |                  |   North    |   South    |    East    |    West    |All Regions |
         |                  |------------+------------+------------+------------+------------|
         |                  |Count|  %   |Count|  %   |Count|  %   |Count|  %   |Count|  %   |
         |------------------+-----+------+-----+------+-----+------+-----+------+-----+------|
         |Technical         |    7| 11.48|    3|  4.92|    5|  8.20|    1|  1.64|   16| 26.23|
         |------------------+-----+------+-----+------+-----+------+-----+------+-----+------|
         |Manager/Supervisor|    7| 11.48|    6|  9.84|    7| 11.48|    0|     0|   20| 32.79|
         |------------------+-----+------+-----+------+-----+------+-----+------+-----+------|
         |Clerical          |    7| 11.48|    2|  3.28|    4|  6.56|    1|  1.64|   14| 22.95|
         |------------------+-----+------+-----+------+-----+------+-----+------+-----+------|
         |Administrative    |    0|     0|    6|  9.84|    5|  8.20|    0|     0|   11| 18.03|
         |------------------+-----+------+-----+------+-----+------+-----+------+-----+------|
         |All Jobs          |   21| 34.43|   17| 27.87|   21| 34.43|    2|  3.28|   61|100.00|
         -------------------------------------------------------------------------------------

----------------------------------------page break
 --------------------------------------------------
                                                                                          
           2
                                 Summarization of Jobs by Region
                                       Data for Gender=Male
         -------------------------------------------------------------------------------------
         |                  |                      Region                       |            |
         |                  |---------------------------------------------------|            |
         |                  |   North    |   South    |    East    |    West    |All Regions |
         |                  |------------+------------+------------+------------+------------|
         |                  |Count|  %   |Count|  %   |Count|  %   |Count|  %   |Count|  %   |
         |------------------+-----+------+-----+------+-----+------+-----+------+-----+------|
         |Technical         |    7| 11.29|    1|  1.61|    5|  8.06|    5|  8.06|   18| 29.03|
         |------------------+-----+------+-----+------+-----+------+-----+------+-----+------|
         |Manager/Supervisor|    5|  8.06|    0|     0|    7| 11.29|    3|  4.84|   15| 24.19|
         |------------------+-----+------+-----+------+-----+------+-----+------+-----+------|
         |Clerical          |    3|  4.84|    5|  8.06|    2|  3.23|    4|  6.45|   14| 22.58|
         |------------------+-----+------+-----+------+-----+------+-----+------+-----+------|
         |Administrative    |    6|  9.68|    5|  8.06|    3|  4.84|    1|  1.61|   15| 24.19|
         |------------------+-----+------+-----+------+-----+------+-----+------+-----+------|
         |All Jobs          |   21| 33.87|   11| 17.74|   17| 27.42|   13| 20.97|   62|100.00|

When you use a BY statement instead of the page dimension, your report can compute percentages only for a table within a BY group. You cannot summarize information over all BY groups, because PROC TABULATE processes each BY group individually. Table 3.6 compares the usage of the page dimension and the BY statement in PROC TABULATE.

The following program generated the output in Figure 3.6.

Define formats to associate with the classification variables and the BY variable.
proc format;
  value gendfmt 1='Female'
                2='Male';
  value occupfmt 1='Technical'
                 2='Manager/Supervisor'
                 3='Clerical'
                 4='Administrative';
  value regfmt 1='North'
               2='South'
               3='East'
               4='West';
run;

Sort the data set by the variable that defines the BR groups in the report.
proc sort data=jobclass;
  by gender;
run;

Prevent BY lines from being printed above each BY group of the report.
options nobyline;

proc tabulate data=jobclass format=5.;
   title 'Summarization of Jobs by Region';

Put the default BY line text in the title in the position taken by #BYLINE.
title3 'Data for #byline';

Produce a table for each value of the BY variable.
by gender;

Do not include the BY variable as a class variable.
      class region occupat;
      table occupat=' ' all='All Jobs',
            (region='Region' all='All Regions')

Compute percentages based on totals within the BY group.
            *(n='Count' pctn='%'*f=7.2)

              / rts=20 misstext='0';
                label gender='Gender';
  format gender gendfmt. occupat occupfmt.
         region regfmt.;
run;

Reset the option to its default setting.
options byline;


A Closer Look

Comparing PROC TABULATE’s Use of the BY Statement and the Page Dimension

Table 3.6 describes the differences between PROC TABULATE with a BY statement and PROC TABULATE with a page dimension.

Table 3.6. Comparing the Use of the BY Statement and the Page Dimension in PROC TABULATE
IssuePROC TABULATE with a BY StatementPROC TABULATE with a Page Dimension in the TABLE Statement
Order of observations in the input data setThe observations in the input data set must by sorted by the BY variables.Sorting or indexing is unnecessary.
Creating one report summarizing all BY groupsYou cannot create one report that summarizes over the BY groups.Use ALL in the page dimension to create a report for all classes.
Calculating percentagesThe percentages in the table are based on the frequency counts for the BY group value. You cannot calculate percentages for a BY group using the total frequency counts for all BY groups combined, because PROC TABULATE prepares the individual reports separately. Data for the report for one BY group is not available to the report for another BY group.You can use denominator definitions to control the meaning of PCTN, or you can use the percentage statistics (ROWPCTN, COLPCTN, PAGEPCTN, and REPPCTN; ROWPCTSUM, COLPCTSUM, PAGEPCTSUM, and REPPCTSUM).
TitlesYou can use the #BYVAL, #BYVAR, and #BYLINE specifications in the TITLE statements to customize the titles for each BY group. When you use these specifications, suppress the default BY line by setting the SAS system option NOBYLINE.The BOX= option in the TABLE statement customizes the page headings, but the titles remain the same on each page.

Where to Go from Here

BY statement processing. See “BY statement” in the “Statements” section of SAS 9.1 Language Reference: Dictionary, and “Statements with the Same Function in Multiple Procedures” in the “Concepts” section of Base SAS 9.1 Procedures Guide.

The BYLINE/NOBYLINE SAS System Options. See “SAS System Options” in SAS 9.1 Language Reference: Dictionary.

Inserting BY-group information in titles. See “Creating Titles that Contain BY-Group Information” in the “Fundamental Concepts in Using Base SAS Procedures” section of Base SAS 9.1 Procedures Guide.

PROC TABULATE reference, usage information, and additional examples. See “The TABULATE Procedure,” in the “Procedures” section of Base SAS 9.1 Procedures Guide.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.223.108.105