Using the LSMEANS Statement to Analyze Data from Unbalanced Designs

The examples presented so far in this chapter use balanced designs, meaning that the same number of observations or participants has been assigned to each group or cell. Of the 18 children in the aggression study (both examples), six were assigned to each of the three treatment conditions (i.e., 0 grams of sugar, 20 grams of sugar, and 40 grams of sugar). When a research design is balanced, it is generally appropriate to use the MEANS statement with PROC GLM.

In contrast, a research design is said to be unbalanced if the number of observations per cell is unequal. In the aggression study, for instance, if 8 children were assigned to the first treatment condition (0 grams of sugar at lunch) and 5 children were assigned to each of the remaining cells (i.e., 20 grams of sugar and 40 grams of sugar), the design would be unbalanced.

When analyzing data from an unbalanced research design, it is preferable to use the LSMEANS statement in your program rather than the MEANS statement. This is because the LSMEANS statement estimates the marginal means over a balanced population. In other words, LSMEANS estimates what the marginal means would be if you had equal cell sizes, so the estimates are less likely to be biased by the unequal cell sizes. (“LSMEANS” stands for “least-squares means.”)

Writing the LSMEANS Statements

Below is the syntax for the PROC GLM step of a SAS program that uses the LSMEANS statement rather than the MEANS statement:

PROC GLM DATA=filename;
   CLASS    predictor-variable;
   MODEL    criterion-variable = predictor-variable;
   LSMEANS  predictor-variable / PDIFF ADJUST=TUKEY;
   LSMEANS  predictor-variable;
RUN;

The preceding syntax is very similar to the syntax that used the MEANS statement, presented earlier in this chapter. The primary difference is that the two MEANS statements have been replaced with two LSMEANS statements. In addition, the PDIFF option asks SAS to print p values for the significance tests related to the multiple comparison procedure. (These p values tell you whether there are significant differences between the least-squares means for the different levels of the predictor variable.) The ADJUST=TUKEY option requests a Tukey multiple comparison adjustment of the p values for differences between least-squares means.

The Actual SAS Statements

Below are the actual statements that you would include in a SAS program to request a one-way ANOVA using the LSMEANS statement rather than the MEANS statement. These statements could be used to analyze data from the studies described in this chapter. Because both of our examples have equal numbers of participants in each cell, the SAS output is not reproduced here (the output would be virtually identical to the earlier output, given that both designs are balanced).

PROC GLM DATA=D1;
   CLASS   REWARDGROUP;
   MODEL   COMMITMENT = REWARDGROUP;
   LSMEANS REWARDGROUP / PDIFF ADJUST=TUKEY;
   LSMEANS REWARDGROUP;
RUN;
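To try these statements on an unbalanced design, the following minimal sketch builds a small data set with 8, 5, and 5 children in the three sugar conditions and then requests least-squares means. The data set name (UNBAL), the variable names (SUGARGRP and AGGRESSION), and all of the scores are hypothetical and were invented only for illustration; they are not data from the studies in this chapter.

```sas
/* Hypothetical unbalanced data: 8, 5, and 5 children per condition. */
/* SUGARGRP = grams of sugar at lunch; AGGRESSION = invented score.  */
DATA UNBAL;
   INPUT SUGARGRP AGGRESSION @@;   /* @@ reads several observations per line */
DATALINES;
0 3   0 4   0 2   0 5   0 3   0 4   0 2   0 3
20 5  20 6  20 4  20 7  20 5
40 6  40 8  40 7  40 9  40 6
;
RUN;

/* Confirm the unequal cell sizes before running the analysis. */
PROC FREQ DATA=UNBAL;
   TABLES SUGARGRP;
RUN;

/* One-way ANOVA with least-squares means and Tukey-adjusted p values. */
PROC GLM DATA=UNBAL;
   CLASS   SUGARGRP;
   MODEL   AGGRESSION = SUGARGRP;
   LSMEANS SUGARGRP / PDIFF ADJUST=TUKEY;
RUN;
QUIT;   /* PROC GLM is interactive; QUIT ends the procedure */
```

The PROC FREQ step is optional; it is included here only as a quick check that the cell sizes really are unequal.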

This discussion of the LSMEANS statement and the associated SAS program is provided for reference, should you encounter data with unequal numbers of observations per cell.
