Example 3.2. Computing Descriptive Statistics for Specific Groups

Goal

Compute specific descriptive statistics for specific combinations of several classification variables. Omit unneeded combinations of the classification variables from the report.

Save in a data set the minimum and maximum values for specific variables for use in a later example. Save record identifiers with the results.

Report

        Nutritional Information about Breads Available in the Region
          Values Per Bread Slice, Calories in kcal, Fiber in Grams

                            The MEANS Procedure

           Variable           N       Mean    Minimum     Maximum
           ------------------------------------------------------
           calories         124      87.59      56.00      138.00
           dietary_fiber    124       2.01       0.00        4.80
           ------------------------------------------------------

    Type of
    Bread        Variable           N       Mean    Minimum     Maximum
    -------------------------------------------------------------------
    Specialty    calories          68      89.44      71.00      138.00
                 dietary_fiber     68       2.02       0.00        4.80

    Sandwich     calories          56      85.34      56.00      111.00
                 dietary_fiber     56       1.99       0.50        3.90
    -------------------------------------------------------------------

    Primary Flour
    Ingredient        Variable          N      Mean   Minimum    Maximum
    --------------------------------------------------------------------
    Multigrain        calories         29     88.83     76.00     108.00
                      dietary_fiber    29      2.37      0.00       3.90

    Oatmeal           calories          5     93.00     84.00     111.00
                      dietary_fiber     5      3.30      2.80       3.90

    Rye               calories         20     86.70     71.00     105.00
                      dietary_fiber    20      2.16      1.00       4.20

    White             calories         45     87.42     56.00     138.00
                      dietary_fiber    45      1.11      0.00       2.90

    Whole Wheat       calories         25     86.08     72.00     101.00
                      dietary_fiber    25      2.82      1.00       4.80
    --------------------------------------------------------------------

     Source     Variable           N       Mean    Minimum     Maximum
     -----------------------------------------------------------------
     Bakery     calories          48      92.88      71.00      138.00
                dietary_fiber     48       2.10       0.30        4.40

     Grocery    calories          76      84.25      56.00      112.00
                dietary_fiber     76       1.95       0.00        4.80
     -----------------------------------------------------------------

              Type of
     Source   Bread      Variable         N     Mean  Minimum  Maximum
     ------------------------------------------------------------------
     Bakery   Specialty  calories         31    93.16    71.00    138.00
                         dietary_fiber    31     1.99     0.30      4.40

              Sandwich   calories         17    92.35    73.00    111.00
                         dietary_fiber    17     2.31     0.50      3.90

     Grocery  Specialty  calories         37    86.32    72.00    112.00
                         dietary_fiber    37     2.05     0.00      4.80

              Sandwich   calories         39    82.28    56.00    101.00
                         dietary_fiber    39     1.85     0.50      3.50
     ------------------------------------------------------------------

   Source    Brand             Variable          N      Mean    Minimum
   --------------------------------------------------------------------
   Bakery    Aunt Sal Bakes    calories         10     91.30      74.00
                               dietary_fiber    10      1.84       0.50

             Demeter           calories         15     96.67      71.00
                               dietary_fiber    15      2.29       0.50

             Downtown Bakers   calories         10     96.90      82.00
                               dietary_fiber    10      2.35       1.00

             Pain du Prairie   calories         13     86.62      74.00
                               dietary_fiber    13      1.89       0.30

   Grocery   BBB Brands        calories          5     81.80      65.00
                               dietary_fiber     5      1.76       1.20

             Choice 123        calories          6     80.67      71.00
                               dietary_fiber     6      1.52       0.50

             Fabulous Breads   calories         15     82.80      71.00
                               dietary_fiber    15      1.84       0.00

             Five Chimneys     calories         10     86.60      75.00
                               dietary_fiber    10      2.55       1.20

             Gaia's Hearth     calories         11     89.00      77.00
                               dietary_fiber    11      1.83       0.00

             Mill City Bakers  calories          9     85.33      66.00
                               dietary_fiber     9      1.92       0.50

             Owasco Ovens      calories         12     83.33      72.00
                               dietary_fiber    12      2.20       0.80

             RiseNShine Bread  calories          8     81.88      56.00
                               dietary_fiber     8      1.65       0.90
   --------------------------------------------------------------------

           Source    Brand             Variable         Maximum
           ----------------------------------------------------
           Bakery    Aunt Sal Bakes    calories          105.00
                                       dietary_fiber       3.90

                     Demeter           calories          111.00
                                       dietary_fiber       4.20

                     Downtown Bakers   calories          138.00
                                       dietary_fiber       4.30

                     Pain du Prairie   calories          108.00
                                       dietary_fiber       4.40

           Grocery   BBB Brands        calories           90.00
                                       dietary_fiber       2.60

                     Choice 123        calories           92.00
                                       dietary_fiber       2.30

                     Fabulous Breads   calories           97.00
                                       dietary_fiber       3.20

                     Five Chimneys     calories           98.00
                                       dietary_fiber       3.80

                     Gaia's Hearth     calories          101.00
                                       dietary_fiber       3.60

                     Mill City Bakers  calories          112.00
                                       dietary_fiber       3.10

                     Owasco Ovens      calories           92.00
                                       dietary_fiber       4.80

                     RiseNShine Bread  calories          100.00
                                       dietary_fiber       3.10
           ----------------------------------------------------


Example Features

Data Set: BREAD
Featured Step: PROC MEANS
Featured Step Statements and Options:
    PROC MEANS statement: NONOBS option
    CLASS statement
    TYPES statement
Formatting Features: PROC MEANS statement: MAXDEC= option; FW= option when sending output to the LISTING destination
Related Technique: PROC TABULATE
A Closer Look:
    Viewing the Output Data Set Created by This Example
    Creating Categories for Analysis with PROC MEANS
    Comparing the BY Statement and CLASS Statement in PROC MEANS
    Taking Advantage of the CLASS Statement in PROC MEANS
ODS Enhanced Version of Related Technique: Example 6.5
ODS Enhanced Version of Output Data Set from this Example: Example 6.15
Other Examples That Use This Data Set: Examples 6.5 and 6.15

Example Overview

This example summarizes nutritional information for a sample of bread products that are available either from grocery stores or from bakeries. The goal is to identify products with the lowest calories and highest dietary fiber in several categories. The data are categorized by four variables:

source (grocery store vs. bakery)
brand
primary flour ingredient in the product
type of bread (sandwich and a variety of specialty products)

The program computes statistics only for specific combinations of the four classification variables, not for all possible combinations of the four variables. It also computes the overall statistics. The results are saved in an output data set.

The following list shows the combinations for which this example saves statistics. One request computes overall statistics, three requests compute statistics for categories defined by a single classification variable, and two requests compute statistics for the categories defined by crossing two classification variables.

overall
type of bread
primary flour ingredient in the product
source
source and type of bread
source and brand

PROC MEANS saves the results of the six requests in one data set. Each observation in the output data set contains the results for one category. The request for overall statistics produces a single observation; each of the other requests produces one observation per category.

The output data set saves statistics on calories and dietary fiber and it saves information that identifies the category and request. Each observation also contains identifying information for the three breads with the lowest calories and for the three breads with the highest dietary fiber for the category that the observation represents. The identifying information for these rankings includes the brand, the primary flour ingredient, and the type. The rankings are shown in Example 6.15. See Figure 3.2b for a listing of the output data set.

Program

 
proc format;
  /* Define a format to group the types of products. Specify the
     missing value level, which will exist in the output data set. */
  value $type 'Sandwich'='Sandwich'
              ' '=' '
              other='Specialty';
run;

/* Compute specific statistics. */
proc means data=bread n mean min max
     /* Specify the maximum number of decimal places to display
        the statistics. */
     maxdec=2
     /* When sending output to the LISTING destination, specify the
        width of the field in which to display each statistic in
        the output. */
     fw=7
     /* Suppress the column that displays the total number of
        observations for each category (N Obs). Because there is no
        missing data in this data set, the N Obs column is identical
        to the N column. When analyzing your own data, verify this
        condition before applying this option. */
     nonobs;

  title 'Nutritional Information about Breads Available in the Region';
  title2 'Values Per Bread Slice, Calories in kcal, Fiber in Grams';

  /* Specify the variables whose values define the categories for
     the analysis. */
  class source brand flour type;

  /* Include the overall statistics in the report. */
  types ()
        /* List the specific combinations of the classification
           variables that define the categories for the analyses. */
        type flour source source*type
        source*brand;

  var calories dietary_fiber;

  /* Create an output data set containing specific statistics. */
  output out=breadstats
         /* Save the three extreme minimum calorie values for each
            category specified in the TYPES statement. */
         idgroup(min(calories) out[3]
             /* Save identifying information as well as the calorie
                values for the three selected observations per
                category. */
             (brand flour type calories)=
              /* Specify the variable name prefix for each of the
                 identifying variables and for the minimum calorie
                 value. */
              wherecal flourcal typecal mincal)
         /* Save the three extreme maximum fiber values for each
            category specified in the TYPES statement. */
         idgroup(max(dietary_fiber) out[3]
             /* Specify the variable name prefix for each of the
                identifying variables and for the maximum fiber
                value. */
             (brand flour type dietary_fiber)=
              wherefiber flourfiber typefiber
              maxfiber);

  /* Suppress the variable labels for the two analysis variables in
     the PROC MEANS output display. */
  label calories=' '
        dietary_fiber=' ';

  format type $type.;
run;
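
Before relying on NONOBS with your own data, you can check the analysis variables for missing values. The following step is a minimal sketch that assumes the same BREAD data set and analysis variables as the program above. If the N Miss column shows 0 for both variables, then N Obs and N agree in every category.

```sas
/* Count nonmissing (N) and missing (NMISS) values of the two
   analysis variables across the whole data set. */
proc means data=bread n nmiss;
  var calories dietary_fiber;
run;
```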


Related Technique

A similar report can be produced easily with PROC TABULATE, as shown in Figure 3.2a. The advantage of PROC TABULATE is that it provides you with more formatting options than PROC MEANS. The PROC MEANS step constructed six separate tables. With all six table specifications on one TABLE statement, PROC TABULATE combines the six tables into one large report. If you want PROC TABULATE to generate six separate tables, replace the single TABLE statement with a TABLE statement for each table.

Figure 3.2a. Output Produced by PROC TABULATE
             Nutritional Information about Breads Available in the Region
               Values Per Bread Slice, Calories in kcal, Fiber in Grams

--------------------------------------------------------------------------------------
|                            |    Calories per Slice     |Dietary Fiber(g) per Slice |
|                            |---------------------------+---------------------------|
|                            | N | Mean  |  Min  |  Max  | N | Mean  |  Min  |  Max  |
|----------------------------+---+-------+-------+-------+---+-------+-------+-------|
|Overall                     |124|  87.59|  56.00| 138.00|124|   2.01|   0.00|   4.80|
|----------------------------+---+-------+-------+-------+---+-------+-------+-------|
|Source                      |   |       |       |       |   |       |       |       |
|----------------------------|   |       |       |       |   |       |       |       |
|Bakery                      | 48|  92.88|  71.00| 138.00| 48|   2.10|   0.30|   4.40|
|----------------------------+---+-------+-------+-------+---+-------+-------+-------|
|Grocery                     | 76|  84.25|  56.00| 112.00| 76|   1.95|   0.00|   4.80|
|----------------------------+---+-------+-------+-------+---+-------+-------+-------|
|Primary Flour Ingredient    |   |       |       |       |   |       |       |       |
|----------------------------|   |       |       |       |   |       |       |       |
|Multigrain                  | 29|  88.83|  76.00| 108.00| 29|   2.37|   0.00|   3.90|
|----------------------------+---+-------+-------+-------+---+-------+-------+-------|
|Oatmeal                     |  5|  93.00|  84.00| 111.00|  5|   3.30|   2.80|   3.90|
|----------------------------+---+-------+-------+-------+---+-------+-------+-------|
|Rye                         | 20|  86.70|  71.00| 105.00| 20|   2.16|   1.00|   4.20|
|----------------------------+---+-------+-------+-------+---+-------+-------+-------|
|White                       | 45|  87.42|  56.00| 138.00| 45|   1.11|   0.00|   2.90|
|----------------------------+---+-------+-------+-------+---+-------+-------+-------|
|Whole Wheat                 | 25|  86.08|  72.00| 101.00| 25|   2.82|   1.00|   4.80|
|----------------------------+---+-------+-------+-------+---+-------+-------+-------|
|Type of Bread               |   |       |       |       |   |       |       |       |
|----------------------------|   |       |       |       |   |       |       |       |
|Specialty                   | 68|  89.44|  71.00| 138.00| 68|   2.02|   0.00|   4.80|
|----------------------------+---+-------+-------+-------+---+-------+-------+-------|
|Sandwich                    | 56|  85.34|  56.00| 111.00| 56|   1.99|   0.50|   3.90|
|----------------------------+---+-------+-------+-------+---+-------+-------+-------|
|Source       |Type of Bread |   |       |       |       |   |       |       |       |
|-------------+--------------|   |       |       |       |   |       |       |       |
|Bakery       |Specialty     | 31|  93.16|  71.00| 138.00| 31|   1.99|   0.30|   4.40|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |Sandwich      | 17|  92.35|  73.00| 111.00| 17|   2.31|   0.50|   3.90|
|-------------+--------------+---+-------+-------+-------+---+-------+-------+-------|
|Grocery      |Specialty     | 37|  86.32|  72.00| 112.00| 37|   2.05|   0.00|   4.80|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |Sandwich      | 39|  82.28|  56.00| 101.00| 39|   1.85|   0.50|   3.50|
|-------------+--------------+---+-------+-------+-------+---+-------+-------+-------|
|Source       |Brand         |   |       |       |       |   |       |       |       |
|-------------+--------------|   |       |       |       |   |       |       |       |
|Bakery       |Aunt Sal Bakes| 10|  91.30|  74.00| 105.00| 10|   1.84|   0.50|   3.90|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |Demeter       | 15|  96.67|  71.00| 111.00| 15|   2.29|   0.50|   4.20|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |Downtown      |   |       |       |       |   |       |       |       |
|             |Bakers        | 10|  96.90|  82.00| 138.00| 10|   2.35|   1.00|   4.30|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |Pain du       |   |       |       |       |   |       |       |       |
|             |Prairie       | 13|  86.62|  74.00| 108.00| 13|   1.89|   0.30|   4.40|
|-------------+--------------+---+-------+-------+-------+---+-------+-------+-------|
|Grocery      |BBB Brands    |  5|  81.80|  65.00|  90.00|  5|   1.76|   1.20|   2.60|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |Choice 123    |  6|  80.67|  71.00|  92.00|  6|   1.52|   0.50|   2.30|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |Fabulous      |   |       |       |       |   |       |       |       |
|             |Breads        | 15|  82.80|  71.00|  97.00| 15|   1.84|   0.00|   3.20|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |Five Chimneys | 10|  86.60|  75.00|  98.00| 10|   2.55|   1.20|   3.80|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |Gaia's Hearth | 11|  89.00|  77.00| 101.00| 11|   1.83|   0.00|   3.60|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |Mill City     |   |       |       |       |   |       |       |       |
|             |Bakers        |  9|  85.33|  66.00| 112.00|  9|   1.92|   0.50|   3.10|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |Owasco Ovens  | 12|  83.33|  72.00|  92.00| 12|   2.20|   0.80|   4.80|
|             |--------------+---+-------+-------+-------+---+-------+-------+-------|
|             |RiseNShine    |   |       |       |       |   |       |       |       |
|             |Bread         |  8|  81.88|  56.00| 100.00|  8|   1.65|   0.90|   3.10|
--------------------------------------------------------------------------------------

The advantages of PROC MEANS are in the features associated with the CLASS statement, such as the TYPES and WAYS statements and their options, and in the ease with which you can create output data sets that can then be processed by other procedures. While you can create data sets with PROC TABULATE, you cannot save identifying information about the observations that have the minimum and maximum values in the output data set as you can with PROC MEANS.
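
As an illustration of the WAYS statement mentioned above, the following sketch requests every one-way analysis of the four classification variables. It assumes the BREAD data set and the $TYPE. format from this example. Note that it is not equivalent to the TYPES statement in the main program: it adds a brand-only table and omits the overall and two-way requests.

```sas
proc means data=bread n mean min max maxdec=2 nonobs;
  class source brand flour type;

  /* Request all combinations formed from exactly one
     classification variable: source, brand, flour, and type. */
  ways 1;

  var calories dietary_fiber;
  format type $type.;
run;
```

WAYS is convenient when every combination of a given size is wanted. TYPES, as used in this example, is the better fit when only selected combinations are needed.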

The following PROC TABULATE step produces the output shown in Figure 3.2a.

 
proc tabulate data=bread;
  title  'Nutritional Information about Breads Available in the Region';
  title2 'Values Per Bread Slice, Calories in kcal, Fiber in Grams';

  /* Specify the same CLASS variables as in the PROC MEANS step. */
  class source brand flour type;
  var calories dietary_fiber;

  /* Compute an overall statistics table. */
  table all='Overall'
        /* Compute the one-way statistics tables. */
        source flour type
        /* Compute the two-way statistics tables. */
        source*type source*brand,
        /* Place the analysis variables in the column dimension and
           nest the statistics under each variable. */
        (calories dietary_fiber)*
        (n*f=3. (mean min max)*f=7.2) / rts=30;

  format type $type.;
run;
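
If separate tables are preferred, each request can go in its own TABLE statement. The following sketch shows that variation for three of the six requests; the remaining requests would follow the same pattern. It assumes the BREAD data set and the $TYPE. format defined earlier in this example.

```sas
proc tabulate data=bread;
  class source brand flour type;
  var calories dietary_fiber;

  /* One TABLE statement per table: overall, source, source*type */
  table all='Overall',
        (calories dietary_fiber)*(n*f=3. (mean min max)*f=7.2) / rts=30;
  table source,
        (calories dietary_fiber)*(n*f=3. (mean min max)*f=7.2) / rts=30;
  table source*type,
        (calories dietary_fiber)*(n*f=3. (mean min max)*f=7.2) / rts=30;

  format type $type.;
run;
```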


A Closer Look

Viewing the Output Data Set Created by This Example

The statistics computed by PROC MEANS in this example are saved in an output data set. The TYPES statement specified six requests, so the output data set BREADSTATS contains the results for all six requests.

SAS identifies each request with a unique number and saves this value in the variable _TYPE_ that it defines. (Note that the values of _TYPE_ are numeric by default; specifying the PROC MEANS statement option CHARTYPE defines these values as character values.)
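
The numeric values of _TYPE_ follow from the order of the variables in the CLASS statement. Each classification variable corresponds to one binary digit, with the rightmost variable (TYPE) as the low-order digit. The following DATA step is only an arithmetic sketch of that rule for the statement CLASS SOURCE BRAND FLOUR TYPE; the variable names in it are illustrative and it does not read any data set.

```sas
data _null_;
  /* Weight of each CLASS variable, rightmost variable first */
  type   = 1;
  flour  = 2;
  brand  = 4;
  source = 8;

  /* A crossing sums the weights of the variables it contains */
  source_type  = source + type;
  source_brand = source + brand;

  put source_type= source_brand=;
run;
```

The step writes source_type=9 source_brand=12 to the SAS log, which matches the _TYPE_ values of the two-way requests in this example. The overall request, which uses no classification variables, has a _TYPE_ value of 0.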

Submitting the following PROC PRINT step lists the contents of the output data set BREADSTATS. The values of _TYPE_ are formatted in the report as shown in Figure 3.2b. The variables identifying items with minimum calories and highest fiber are saved in the data set.

proc format;
  value _type_ 0='0: Overall'
               1='1: Type'
               2='2: Flour'
               8='8: Source'
               9='9: Source*Type'
              12='12: Source*Brand';
run;

proc print data=breadstats;
  title 'Nutritional Information about Breads Available in the Region';
  title2 'Values Per Bread Slice, Calories in kcal, Fiber in Grams';
  title3 'Breads with Fewest Calories and Most Dietary Fiber';

  by _type_;
  format _type_ _type_.;
run;

Figure 3.2b shows the PROC PRINT output. The _FREQ_ variable, which is another variable that PROC MEANS defines, contains the number of observations that the given category represents.
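
To examine a single request rather than the full listing, you can subset BREADSTATS on _TYPE_. The following step is a sketch that assumes BREADSTATS was created by the PROC MEANS step in this example. It selects the four observations for the source-by-type request and prints only the top-ranked bread for each ranking; the TITLE3 text is illustrative.

```sas
proc print data=breadstats noobs;
  /* _TYPE_=9 identifies the source*type request */
  where _type_ = 9;

  /* Category identifiers, category size, and the first-ranked
     bread for lowest calories and for highest fiber */
  var source type _freq_
      wherecal_1 flourcal_1 mincal_1
      wherefiber_1 flourfiber_1 maxfiber_1;

  title3 'Lowest-Calorie and Highest-Fiber Bread by Source and Type';
run;
```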

Figure 3.2b. PROC PRINT Listing of the BREADSTATS Data Set Created by PROC MEANS
                 Nutritional Information about Breads Available in the Region
                   Values Per Bread Slice, Calories in kcal, Fiber in Grams
                      Breads with Fewest Calories and Most Dietary Fiber

-------------------------------------- _TYPE_=0: Overall --------------------------------------

 Obs source   brand   flour   type   _FREQ_      wherecal_1      wherecal_2       wherecal_3

   1                                   124    RiseNShine Bread   BBB Brands    Mill City Bakers

 Obs flourcal_1 flourcal_2 flourcal_3 typecal_1 typecal_2 typecal_3 mincal_1 mincal_2 mincal_3

   1   White     White     White   Sandwich Sandwich Sandwich    56       65       66

 Obs wherefiber_1   wherefiber_2      wherefiber_3  flourfiber_1  flourfiber_2  flourfiber_3

   1 Owasco Ovens  Pain du Prairie  Downtown Bakers  Whole Wheat  Whole Wheat  Whole Wheat

 Obs typefiber_1    typefiber_2    typefiber_3    maxfiber_1    maxfiber_2     maxfiber_3


   1 Specialty     Specialty     Specialty        4.8          4.4          4.3


--------------------------------------- _TYPE_=1: Type ----------------------------------------

Obs  source  brand  flour    type     _FREQ_  wherecal_1         wherecal_2       wherecal_3

  2                        Specialty    68    Demeter           Owasco Ovens  Fabulous Breads
  3                        Sandwich     56    RiseNShine Bread  BBB Brands    Mill City Bakers

Obs  flourcal_1 flourcal_2   flourcal_3    typecal_1  typecal_2  typecal_3  mincal_1  mincal_2

  2    White    Whole Wheat  Whole Wheat  Specialty  Specialty  Specialty     71         72
  3    White    White        White        Sandwich   Sandwich   Sandwich      56         65

Obs  mincal_3    wherefiber_1    wherefiber_2       wherefiber_3     flourfiber_1   flourfiber_2

  2     74      Owasco Ovens     Pain du Prairie   Downtown Bakers   Whole Wheat    Whole Wheat
  3     66      Aunt Sal Bakes   Demeter           Gaia's Hearth     Multigrain     Oatmeal

Obs  flourfiber_3   typefiber_1   typefiber_2   typefiber_3   maxfiber_1   maxfiber_2   maxfiber_3

  2  Whole Wheat   Specialty    Specialty    Specialty       4.8         4.4          4.3
  3  Oatmeal       Sandwich     Sandwich     Sandwich        3.9         3.9          3.5


--------------------------------------- _TYPE_=2: Flour ---------------------------------------

 Obs source brand flour       type _FREQ_    wherecal_1    wherecal_2          wherecal_3

   4              Multigrain         29   RiseNShine Bread Choice 123       Mill City Bakers
   5              Oatmeal             5   Five Chimneys    Fabulous Breads  Owasco Ovens
   6              Rye                20   Mill City Bakers Five Chimneys    Fabulous Breads
   7              White              45   RiseNShine Bread BBB Brands       Mill City Bakers
   8              Whole Wheat        25   Owasco Ovens     Fabulous Breads  Owasco Ovens


 Obs flourcal_1   flourcal_2   flourcal_3   typecal_1  typecal_2  typecal_3  mincal_1  mincal_2

   4 Multigrain   Multigrain   Multigrain   Sandwich   Sandwich   Sandwich     76         76
   5 Oatmeal      Oatmeal      Oatmeal      Sandwich   Sandwich   Sandwich     84         85
   6 Rye          Rye          Rye          Sandwich   Sandwich   Sandwich     71         77
   7 White        White        White        Sandwich   Sandwich   Sandwich     56         65
   8 Whole Wheat  Whole Wheat  Whole Wheat  Specialty  Specialty  Sandwich     72         74


 Obs mincal_3  wherefiber_1     wherefiber_2     wherefiber_3     flourfiber_1  flourfiber_2

   4    77     Aunt Sal Bakes   Demeter          Demeter          Multigrain    Multigrain
   5    90     Demeter          Gaia's Hearth    Five Chimneys    Oatmeal       Oatmeal
   6    78     Demeter          Gaia's Hearth    Five Chimneys    Rye           Rye
   7    66     Downtown Bakers  Five Chimneys    Aunt Sal Bakes   White         White
   8    75     Owasco Ovens     Pain du Prairie  Downtown Bakers  Whole Wheat   Whole Wheat


 Obs flourfiber_3   typefiber_1   typefiber_2   typefiber_3   maxfiber_1   maxfiber_2   maxfiber_3

   4 Multigrain     Sandwich      Sandwich      Specialty     3.9          3.5          3.3
   5 Oatmeal        Sandwich      Sandwich      Sandwich      3.9          3.5          3.3
   6 Rye            Specialty     Specialty     Specialty     4.2          3.6          3.5
   7 White          Specialty     Sandwich      Specialty     2.9          2.3          2.3
   8 Whole Wheat    Specialty     Specialty     Specialty     4.8          4.4          4.3


-------------------------------------- _TYPE_=8: Source ---------------------------------------

 Obs  source   brand  flour  type  _FREQ_  wherecal_1        wherecal_2      wherecal_3

   9  Bakery                         48    Demeter           Demeter      Pain du Prairie
  10  Grocery                        76    RiseNShine Bread  BBB Brands   Mill City Bakers

  Obs flourcal_1  flourcal_2  flourcal_3 typecal_1 typecal_2 typecal_3  mincal_1  mincal_2

   9    White       White      White    Specialty  Sandwich  Specialty     71        73
  10    White       White      White    Sandwich   Sandwich  Sandwich      56        65

 Obs  mincal_3    wherefiber_1      wherefiber_2     wherefiber_3   flourfiber_1   flourfiber_2

   9     74      Pain du Prairie   Downtown Bakers   Demeter         Whole Wheat   Whole Wheat
  10     66      Owasco Ovens      Owasco Ovens      Five Chimneys   Whole Wheat   Whole Wheat

Obs   flourfiber_3   typefiber_1   typefiber_2   typefiber_3   maxfiber_1   maxfiber_2   maxfiber_3

   9  Rye           Specialty    Specialty    Specialty       4.4         4.3         4.2
  10  Whole Wheat   Specialty    Specialty    Specialty       4.8         4.0         3.8


------------------------------------ _TYPE_=9: Source*Type ------------------------------------

 Obs source  brand flour   type    _FREQ_ wherecal_1       wherecal_2         wherecal_3

  11 Bakery              Specialty   31   Demeter          Pain du Prairie  Aunt Sal Bakes
  12 Bakery              Sandwich    17   Demeter          Pain du Prairie  Aunt Sal Bakes
  13 Grocery             Specialty   37   Owasco Ovens     Fabulous Breads  RiseNShine Bread
  14 Grocery             Sandwich    39   RiseNShine Bread BBB Brands       Mill City Bakers


 Obs flourcal_1   flourcal_2  flourcal_3  typecal_1  typecal_2  typecal_3  mincal_1  mincal_2

  11 White        White          White    Specialty  Specialty  Specialty     71        74
  12 White        Whole Wheat    Rye      Sandwich   Sandwich   Sandwich      73        80
  13 Whole Wheat  Whole Wheat    White    Specialty  Specialty  Specialty     72        74
  14 White        White          White    Sandwich   Sandwich   Sandwich      56        65


 Obs mincal_3   wherefiber_1    wherefiber_2     wherefiber_3     flourfiber_1  flourfiber_2

  11    74     Pain du Prairie  Downtown Bakers  Demeter          Whole Wheat  Whole Wheat
  12    81     Aunt Sal Bakes   Demeter          Demeter          Multigrain   Oatmeal
  13    74     Owasco Ovens     Owasco Ovens     Five Chimneys    Whole Wheat  Whole Wheat
  14    66     Gaia's Hearth    Five Chimneys    Fabulous Breads  Oatmeal      Oatmeal


 Obs flourfiber_3   typefiber_1   typefiber_2   typefiber_3   maxfiber_1   maxfiber_2   maxfiber_3

  11 Rye           Specialty    Specialty    Specialty       4.4         4.3         4.2
  12 Multigrain    Sandwich     Sandwich     Sandwich        3.9         3.9         3.5
  13 Whole Wheat   Specialty    Specialty    Specialty       4.8         4.0         3.8
  14 Whole Wheat   Sandwich     Sandwich     Sandwich        3.5         3.3         3.1

----------------------------------- _TYPE_=12: Source*Brand -----------------------------------

Obs source    brand              flour   type   _FREQ_   wherecal_1          wherecal_2

 15 Bakery    Aunt Sal Bakes                      10     Aunt Sal Bakes      Aunt Sal Bakes
 16 Bakery    Demeter                             15     Demeter             Demeter
 17 Bakery    Downtown Bakers                     10     Downtown Bakers     Downtown Bakers
 18 Bakery    Pain du Prairie                     13     Pain du Prairie     Pain du Prairie
 19 Grocery   BBB Brands                           5     BBB Brands          BBB Brands
 20 Grocery   Choice 123                           6     Choice 123          Choice 123
 21 Grocery   Fabulous Breads                     15     Fabulous Breads     Fabulous Breads
 22 Grocery   Five Chimneys                       10     Five Chimneys       Five Chimneys
 23 Grocery   Gaia's Hearth                       11     Gaia's Hearth       Gaia's Hearth
 24 Grocery   Mill City Bakers                     9     Mill City Bakers    Mill City Bakers
 25 Grocery   Owasco Ovens                        12     Owasco Ovens        Owasco Ovens
 26 Grocery   RiseNShine Bread                     8     RiseNShine Bread    RiseNShine Bread


Obs wherecal_3       flourcal_1  flourcal_2  flourcal_3  typecal_1 typecal_2 typecal_3  mincal_1

 15 Aunt Sal Bakes   White       Whole Wheat Rye         Specialty Specialty Sandwich     74
 16 Demeter          White       White       Rye         Specialty Sandwich  Sandwich     71
 17 Downtown Bakers  Rye         White       Whole Wheat Sandwich  Specialty Specialty    82
 18 Pain du Prairie  White       Whole Wheat Whole Wheat Specialty Specialty Sandwich     74
 19 BBB Brands       White       Rye         Rye         Sandwich  Sandwich  Specialty    65
 20 Choice 123       White       White       Multigrain  Sandwich  Sandwich  Sandwich     71
 21 Fabulous Breads  White       Whole Wheat White       Sandwich  Specialty Specialty    71
 22 Five Chimneys    White       Rye         White       Specialty Sandwich  Specialty    75
 23 Gaia's Hearth    White       Whole Wheat Multigrain  Specialty Specialty Specialty    77
 24 Mill City Bakers White       Rye         Multigrain  Sandwich  Sandwich  Sandwich     66
 25 Owasco Ovens     Whole Wheat Whole Wheat Whole Wheat Specialty Sandwich  Specialty    72
 26 RiseNShine Bread White       White       Multigrain  Sandwich  Specialty Sandwich     56

Obs mincal_2  mincal_3  wherefiber_1      wherefiber_2      wherefiber_3       flourfiber_1

 15    81        81     Aunt Sal Bakes    Aunt Sal Bakes    Aunt Sal Bakes     Multigrain
 16    73        94     Demeter           Demeter           Demeter            Rye
 17    82        85     Downtown Bakers   Downtown Bakers   Downtown Bakers    Whole Wheat
 18    76        80     Pain du Prairie   Pain du Prairie   Pain du Prairie    Whole Wheat
 19    82        84     BBB Brands        BBB Brands        BBB Brands         Whole Wheat
 20    71        76     Choice 123        Choice 123        Choice 123         Multigrain
 21    74        77     Fabulous Breads   Fabulous Breads   Fabulous Breads    Multigrain
 22    77        82     Five Chimneys     Five Chimneys     Five Chimneys      Whole Wheat
 23    80        81     Gaia's Hearth     Gaia's Hearth     Gaia's Hearth      Rye
 24    71        77     Mill City Bakers  Mill City Bakers  Mill City Bakers   Multigrain
 25    75        79     Owasco Ovens      Owasco Ovens      Owasco Ovens       Whole Wheat
 26    74        76     RiseNShine Bread  RiseNShine Bread  RiseNShine Bread   Multigrain

Obs flourfiber_2 flourfiber_3 typefiber_1 typefiber_2 typefiber_3 maxfiber_1 maxfiber_2 maxfiber_3

 15 Rye          Whole Wheat  Sandwich    Specialty   Sandwich      3.9       3.5       2.4
 16 Whole Wheat  Oatmeal      Specialty   Specialty   Sandwich      4.2       3.9       3.9
 17 Whole Wheat  Rye          Specialty   Sandwich    Specialty     4.3       3.0       3.0
 18 Whole Wheat  Whole Wheat  Specialty   Sandwich    Specialty     4.4       2.9       2.8
 19 Rye          Rye          Sandwich    Specialty   Sandwich      2.6       2.0       1.5
 20 Multigrain   Whole Wheat  Specialty   Sandwich    Sandwich      2.3       2.0       1.8
 21 Whole Wheat  Rye          Specialty   Sandwich    Specialty     3.2       3.1       3.1
 22 Rye          Oatmeal      Specialty   Specialty   Sandwich      3.8       3.5       3.3
 23 Oatmeal      Whole Wheat  Specialty   Sandwich    Sandwich      3.6       3.5       2.3
 24 Multigrain   Whole Wheat  Specialty   Sandwich    Sandwich      3.1       2.9       2.6
 25 Whole Wheat  Rye          Specialty   Specialty   Specialty     4.8       4.0       2.9
 26 Whole Wheat  White        Specialty   Sandwich    Specialty     3.1       2.0       1.7

Creating Categories for Analysis with PROC MEANS

PROC MEANS provides several ways to define categories in generating statistics tables. The remaining topics in this section describe some of the ways you can create categories in PROC MEANS.

When you save PROC MEANS results in a data set, you can then go on to produce customized reports using other SAS procedures, DATA steps, and ODS. Understanding your choices for defining categories for analysis will help you more efficiently use PROC MEANS to create data sets that can in turn be used to generate your final reports.

Example 6.15 uses DATA steps and ODS features to produce a report summarizing some of the information saved in the output data set created by PROC MEANS in this example.
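As a small illustration of reusing saved results, the following sketch lists part of the output data set, one section per table. It assumes the output data set is named BREADSTATS, as on the OUTPUT statement in this example, and it selects only some of the saved variables. PROC MEANS writes the output data set in ascending order of _TYPE_ by default, so the BY statement needs no preceding sort.

```sas
/* List selected results from the PROC MEANS output data set.         */
/* Assumption: BREADSTATS is the data set named on the OUTPUT         */
/* statement in this example.                                         */
proc print data=breadstats noobs;
  by _type_;   /* one section per requested table */
  var source brand flour type _freq_
      mincal_1-mincal_3 maxfiber_1-maxfiber_3;
run;
```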

Comparing the BY Statement and CLASS Statement in PROC MEANS

The two main ways to tell PROC MEANS you want your analyses done by categories are the BY statement and the CLASS statement.

When you use a BY statement, you request a separate analysis of each BY group. Your analysis data set must be sorted or indexed by the variables on the BY statement, or you must add the NOTSORTED option to the BY statement in the PROC MEANS step.
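A minimal sketch of the BY-statement requirement follows. It sorts a copy of BREAD by SOURCE and then requests one analysis per BY group. The data set name BREAD_SORTED is a hypothetical name chosen for illustration.

```sas
/* Sort a copy of the analysis data set by the BY variable.           */
/* BREAD_SORTED is an illustrative name, not one used in the example. */
proc sort data=bread out=bread_sorted;
  by source;
run;

/* One set of statistics is produced for each value of SOURCE. */
proc means data=bread_sorted n mean min max;
  by source;
  var calories dietary_fiber;
run;
```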

When you use the CLASS statement, you specify the variables whose values define the categories for the analysis. If the PROC MEANS step includes no other statements or options associated with the CLASS statement, the categories are defined by all the possible combinations of the CLASS variables, with all CLASS variables represented in each category. For example, if you have four CLASS variables, the categories for analysis are all the four-way combinations of the values of the four variables.

You do not have to sort or index your analysis data set by the CLASS variables before executing a PROC MEANS step that contains a CLASS statement but no BY statement.

If you have several CLASS variables or your CLASS variables have many values, your report can be quite lengthy and may require additional computing resources because of the complexity. PROC MEANS must keep a copy of each unique value of each CLASS variable in memory. So in situations of limited computing resources, you may want to change some of your CLASS variables to BY variables. Also, you can adjust the PROC MEANS option SUMSIZE= to provide potentially more memory for your PROC MEANS step.

You can use a BY statement and a CLASS statement in the same PROC MEANS step. When you do, SAS analyzes the data by each BY group and applies the CLASS statement to each BY group.
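One way to combine the two statements is sketched below. It assumes that BREAD is first sorted by SOURCE into a data set called BREAD_SORTED (an illustrative name) and that the $TYPE format used in this example has already been defined.

```sas
/* Sort by the BY variable; CLASS variables need no sorting. */
proc sort data=bread out=bread_sorted;   /* BREAD_SORTED is a hypothetical name */
  by source;
run;

/* Within each SOURCE BY group, compute statistics for each */
/* formatted value of the CLASS variable TYPE.              */
proc means data=bread_sorted n mean min max;
  by source;
  class type;
  var calories dietary_fiber;
  format type $type.;   /* format from this example; assumed to exist */
run;
```

Moving a variable from the CLASS statement to the BY statement in this way trades the memory needed to hold its values for the cost of sorting the data first.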

Taking Advantage of the CLASS Statement in PROC MEANS

Several statements and options can be used in conjunction with the CLASS statement. These features allow you to produce tables of statistics for specific combinations of the CLASS variables as well as for specific values of the CLASS variables. You can also create output data sets that contain only these specific tables.

Understanding the choices in how to define categories will help you more efficiently focus only on the tables and categories that you need.

The BREAD data set contains two values for SOURCE, twelve values for BRAND, two values for TYPE when the TYPE format has been applied, and five values for FLOUR. To obtain the statistics shown in the preceding PROC MEANS report without using the TYPES statement, delete the TYPES statement and add the PRINTALLTYPES option to the PROC MEANS statement. A modified version of the featured PROC MEANS step that does this follows:

proc means data=bread n mean min max maxdec=2 fw=7
           nonobs printalltypes;

  title 'Nutritional Information about Breads Available in the Region';
  title2 'Values Per Bread Slice, Calories in kcal, Fiber in Grams';

  class source brand flour type;

  var calories dietary_fiber;
  output out=breadstats
         idgroup(min(calories) out[3]
           (brand flour type calories)=
            wherecal flourcal typecal mincal)
         idgroup(max(dietary_fiber) out[3]
           (brand flour type dietary_fiber)=
            wherefiber flourfiber typefiber maxfiber);

  label calories=' '
        dietary_fiber=' ';

  format type $type.;
run;

The PROC MEANS step above produces tables for all one-way, two-way, three-way, and four-way combinations of the four CLASS variables, plus an overall table. That is sixteen tables, whereas the TYPES statement in the original step requested only six: five with classification variables plus one overall table. The output data set from the step that computes all sixteen tables contains 405 observations; the output data set from the step that computes the six specific tables contains 26 observations.

The four-way table alone in the PROC MEANS step that includes the PRINTALLTYPES option could generate statistics for a maximum of 240 categories: 2 sources*12 brands*2 formatted values of type*5 flours=240. (The BREAD data set, however, does not contain data for all combinations. The total number of categories in the four-way table for this data set is 89.)
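The counts quoted above can be checked with a short sketch. The DATA step reproduces the arithmetic (four CLASS variables give 2**4 tables, and the four-way table has at most 2*12*2*5 categories). The PROC FREQ step assumes that BREADSTATS was created by the PRINTALLTYPES step shown earlier and counts the observations saved for each value of _TYPE_. With four CLASS variables, the four-way table is _TYPE_=15.

```sas
/* Arithmetic behind the table and category counts. */
data _null_;
  n_class      = 4;               /* SOURCE, BRAND, FLOUR, TYPE            */
  n_tables     = 2**n_class;      /* every subset of the CLASS variables,  */
                                  /* including the overall table           */
  max_four_way = 2 * 12 * 2 * 5;  /* sources*brands*formatted types*flours */
  put n_tables= max_four_way=;    /* writes both values to the SAS log     */
run;

/* Observations per table in the output data set.            */
/* Assumption: BREADSTATS comes from the PRINTALLTYPES step. */
proc freq data=breadstats;
  tables _type_ / nocum nopercent;
run;
```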

Table 3.2a illustrates several ways to define categories when you use the CLASS statement in combination with other PROC MEANS features.

Table 3.2a. Combining the CLASS Statement with PROC MEANS to Define Categories
Purpose | Specification | Description and Example

Purpose: Produce a table of statistics for the most complete combination of the CLASS variables

Additional statement: none

PROC MEANS statement options: none

CLASS statement options: none
proc means data=bread;
  class source brand flour type;
  var calories dietary_fiber;
run;

Result: Table with statistics for each combination of the values of the four variables taken four variables at a time.

Restrictions: The one-, two-, and three-way combinations are not evaluated.

If many CLASS variables exist and/or the CLASS variables have many values, the report can be lengthy and can require additional computing resources.

Purpose: Produce tables of statistics for all possible combinations of the CLASS variables at all levels, plus an overall table

Additional statement: none

PROC MEANS statement options: PRINTALLTYPES

CLASS statement options: none
proc means data=bread printalltypes;
  class source brand flour type;
  var calories dietary_fiber;
run;

Result: Tables with statistics for each combination of the values of the four variables taken one variable at a time, two variables at a time, three variables at a time, and four variables at a time. Also includes a table of overall statistics.

Restrictions: If many CLASS variables exist and/or the CLASS variables have many values, the report can be lengthy and can require additional computing resources.

Purpose: Produce tables for selected combinations of the CLASS variables

Additional statement: TYPES

PROC MEANS statement options: none

CLASS statement options: none
proc means data=bread;
  class source brand flour type;
  types () type flour source source*type
    source*brand;
  var calories dietary_fiber;
run;

Result: Six tables of statistics: one overall table, three one-way tables, and two two-way tables. The one-way tables have categories defined by TYPE, by FLOUR, and by SOURCE. The two-way tables have categories defined by the combinations of SOURCE and TYPE and of SOURCE and BRAND.

Note: The text () specifies the computation of an overall summary.

Purpose: Produce tables for selected levels of combinations of the CLASS variables

Additional statement: WAYS

PROC MEANS statement options: none

CLASS statement options: none
proc means data=bread;
  class source brand flour type;
  ways 2 4;
  var calories dietary_fiber;
run;

Result: Seven tables of statistics: six 2-way tables and one 4-way table. The 2-way tables: categories defined by the combination of SOURCE and BRAND; SOURCE and FLOUR; SOURCE and TYPE; BRAND and FLOUR; BRAND and TYPE; and FLOUR and TYPE. The 4-way table: categories defined by the combination of the values of SOURCE, BRAND, FLOUR, and TYPE.

Restrictions: If many CLASS variables exist and/or the CLASS variables have many values, the report can be lengthy and can require additional computing resources.

Purpose: Produce a table of statistics for the most complete combination of the CLASS variables. Included in the output are the categories defined in the CLASSDATA= data set, even if the categories are not present in the analysis data set. These categories have an N of 0.

Additional statement: none

PROC MEANS statement options: CLASSDATA=

CLASS statement options: none
proc means data=bread classdata=complex;
  class source flour;
  var calories dietary_fiber;
run;

PROC PRINT of data set COMPLEX:
 Obs    source       flour
  1     Grocery    Whole Wheat
  2     Bakery     Whole Wheat
  3     Other      Whole Wheat
  4     Grocery    Multigrain
  5     Bakery     Multigrain
  6     Other      Multigrain

Result: Twelve categories in the output. One table that contains all combinations of the values of SOURCE and FLOUR that are in the BREAD data set plus two additional rows. These two rows are SOURCE='Other' and FLOUR='Whole Wheat', and SOURCE='Other' and FLOUR='Multigrain'. These two rows have an N of 0 because they have no representation in the BREAD data set.

Restrictions: Must include all CLASS variables in the CLASSDATA= data set.

CLASS variables in the CLASSDATA= data set must be defined exactly as they are in the analysis data set.

Purpose: Produce a table of statistics for all possible combinations of CLASS variables, including those in the analysis data set as well as those represented in the CLASSDATA= data set, even if the categories are not present in the analysis data set. Categories without representation in the analysis data set have an N of 0.

Additional statement: none

PROC MEANS statement options: CLASSDATA=, COMPLETETYPES

CLASS statement options: none
proc means data=bread classdata=complex
     completetypes;
  class source flour;
  var calories dietary_fiber;
run;

This example uses the same COMPLEX data set as listed in the previous row in this table.

Result: Fifteen categories in the output. One table that contains all the combinations of the values of SOURCE and FLOUR that are present in the BREAD data set and in the COMPLEX data set (3 values of SOURCE * 5 values of FLOUR = 15 categories). The five rows where SOURCE='Other' have an N of 0.

Restrictions: Must include all CLASS variables in the CLASSDATA= data set.

CLASS variables in the CLASSDATA= data set must be defined exactly as they are in the analysis data set.

If specifications in the CLASSDATA= data set are not in the analysis data set, there are potentially many rows in the table with no statistics other than N=0. This example has five rows with N=0.

Purpose: Produce a table of statistics for the most complete combination of the CLASS variables for which categories are defined in the CLASSDATA= data set, even if the categories are not present in the analysis data set. These categories have an N of 0. Do not include categories that are only in the analysis data set and not in the CLASSDATA= data set.

Additional statement: none

PROC MEANS statement options: CLASSDATA=, EXCLUSIVE

CLASS statement options: none
proc means data=bread classdata=complex
     exclusive;
  class source flour;
  var calories dietary_fiber;
run;

This example uses the same COMPLEX data set as used in the previous two rows of this table.

Result: Six categories in one table in the output. The table contains only the combinations of the values of SOURCE and FLOUR that are present in the CLASSDATA= data set. Combinations that are present only in the analysis data set are excluded from the results. Two categories are generated for each of the three values of SOURCE that are present in COMPLEX, one for FLOUR='Whole Wheat' and one for FLOUR='Multigrain'. The two rows for SOURCE='Other' have an N of 0.

Restrictions: Must include all CLASS variables in the CLASSDATA= data set.

CLASS variables in the CLASSDATA= data set must be defined exactly as they are in the analysis data set.

Purpose: Produce tables for all combinations of the CLASS variables as specified by the formats associated with them, even if the combinations are not in the analysis data set

Additional statement: FORMAT, if formats not already assigned to CLASS variables

PROC MEANS statement options: COMPLETETYPES

CLASS statement options: PRELOADFMT
proc format;
  value $texture
        'Multigrain','Oatmeal','Whole Wheat'='Whole Grain'
        'White','Bleached White'='Refined'
        'Garbanzo','Soy'='Beans';
run;

proc means data=bread completetypes;
  class flour / preloadfmt;
  var calories dietary_fiber;
  format flour $texture.;
run;

Result: Four categories in one table in the output. The table contains the categories in the variable FLOUR as formatted by $TEXTURE. Values not represented in $TEXTURE are included in the analysis as their unformatted values. Values that are in $TEXTURE, but not in the BREAD data set, are included in the analysis. The report will show a row for “Beans” with N=0, because there are no products in the BREAD data set whose primary flour ingredient is garbanzo bean flour or soy flour. The report will show a row for “Rye,” the one value in the BREAD data set that is not included in $TEXTURE.

Restrictions: Need to define formats for the CLASS variables before the PROC MEANS step.

Purpose: Produce tables for all combinations of the CLASS variables as specified by the formats associated with them. Omit combinations that are in the associated format, but not in the data set. Omit combinations that are in the data set and not in the associated format.

Additional statement: FORMAT, if formats not already assigned to CLASS variables

PROC MEANS statement options: none

CLASS statement options: PRELOADFMT, EXCLUSIVE
proc format;
  value $texture
        'Multigrain','Oatmeal','Whole Wheat'='Whole Grain'
        'White','Bleached White'='Refined'
        'Garbanzo','Soy'='Beans';
run;

proc means data=bread;
  class flour / preloadfmt exclusive;
   var calories dietary_fiber;
   format flour $texture.;
run;

Result: One table that contains the categories in the variable FLOUR as formatted by $TEXTURE and restricted by the EXCLUSIVE option: one category for “Whole Grain” and one for “Refined.” The one value of FLOUR not represented in $TEXTURE (“Rye”) is not included in the report. The one formatted value in $TEXTURE (“Beans”) that does not have any corresponding values in FLOUR is not included in the report.

Restrictions: Need to define formats for the CLASS variables before the PROC MEANS step.

Note: This discussion places the EXCLUSIVE option on the CLASS statement. If you place the EXCLUSIVE option on the PROC MEANS statement instead, your report will contain the same results plus a row for “Rye.”
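The CLASSDATA= entries in Table 3.2a rely on a data set named COMPLEX, which is listed but not created in the table. One possible way to build it is sketched below. The variable lengths are assumptions made for illustration; because CLASS variables in the CLASSDATA= data set must be defined exactly as they are in the analysis data set, adjust the lengths to match SOURCE and FLOUR in BREAD.

```sas
/* Build the COMPLEX data set used with CLASSDATA=.             */
/* Assumption: the lengths below are illustrative; they must    */
/* match the definitions of SOURCE and FLOUR in BREAD.          */
data complex;
  length source $ 7 flour $ 11;
  infile datalines dsd;      /* comma-delimited, so 'Whole Wheat' keeps its blank */
  input source flour;
  datalines;
Grocery,Whole Wheat
Bakery,Whole Wheat
Other,Whole Wheat
Grocery,Multigrain
Bakery,Multigrain
Other,Multigrain
;
run;
```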

Table 3.2a does not describe the process of defining categories in which some of the values are represented multiple times. You can do this by creating multilabel formats with PROC FORMAT. See Example 3.7 for an example of applying multilabel formats.
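Example 3.7 is the reference for this technique. As a brief sketch only, a multilabel format lets one value of FLOUR contribute to more than one category when the MLF option is added to the CLASS statement. The format name $GRAINML and its labels are hypothetical and are not part of this example.

```sas
/* A multilabel format: several flours fall into two categories. */
/* $GRAINML and its labels are illustrative names only.          */
proc format;
  value $grainml (multilabel)
        'Multigrain','Oatmeal','Whole Wheat'='Whole Grain'
        'Multigrain','Oatmeal','Whole Wheat','Rye'='Not White';
run;

/* MLF tells PROC MEANS to use every label that applies to a value. */
proc means data=bread n mean min max;
  class flour / mlf;
  var calories dietary_fiber;
  format flour $grainml.;
run;
```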

Where to Go from Here

PROC FORMAT reference, usage information, and additional examples. See “The FORMAT Procedure” in the “Procedures” section of Base SAS 9.1 Procedures Guide.

PROC MEANS reference, usage information, and additional examples. See “The MEANS Procedure” in the “Procedures” section of Base SAS 9.1 Procedures Guide.

PROC TABULATE reference, usage information, and additional examples. See “The TABULATE Procedure” in the “Procedures” section of Base SAS 9.1 Procedures Guide.
