Chapter 6 Understanding Linear Models Concepts

6.1 Introduction

6.2 The Dummy-Variable Model

6.2.1 The Simplest Case: A One-Way Classification

6.2.2 Parameter Estimates for a One-Way Classification

6.2.3 Using PROC GLM for Analysis of Variance

6.2.4 Estimable Functions in a One-Way Classification

6.3 Two-Way Classification: Unbalanced Data

6.3.1 General Considerations

6.3.2 Sums of Squares Computed by PROC GLM

6.3.3 Interpreting Sums of Squares in Reduction Notation

6.3.4 Interpreting Sums of Squares in the μ-Model Notation

6.3.5 An Example of Unbalanced Two-Way Classification

6.3.6 The MEANS, LSMEANS, CONTRAST, and ESTIMATE Statements in a Two-Way Layout

6.3.7 Estimable Functions for a Two-Way Classification

6.3.7.1 The General Form of Estimable Functions

6.3.7.2 Interpreting Sums of Squares Using Estimable Functions

6.3.7.3 Estimating Estimable Functions

6.3.7.4 Interpreting LSMEANS, CONTRAST, and ESTIMATE Results Using Estimable Functions

6.3.8 Empty Cells

6.4 Mixed-Model Issues

6.4.1 Proper Error Terms

6.4.2 More on Expected Mean Squares

6.4.3 An Issue of Model Formulation Related to Expected Mean Squares

6.5 ANOVA Issues for Unbalanced Mixed Models

6.5.1 Using Expected Mean Squares to Construct Approximate F-Tests for Fixed Effects

6.6 GLS and Likelihood Methodology for the Mixed Model

6.6.1 An Overview of Generalized Least Squares Methodology

6.6.2 Some Practical Issues about Generalized Least Squares Methodology

6.1 Introduction

The purpose of this chapter is to provide detailed information about how PROC GLM and PROC MIXED work for certain applications. This is certainly not complete documentation; rather, it provides enough information for a basic understanding and a basis for further reading.

Both GLM and MIXED utilize “dummy” variables, which are also called “indicator” variables in mathematics. They are created whenever a CLASS statement is specified. The primary distinction between GLM and MIXED in this regard is that MIXED separates the sets of dummy variables into a group for fixed effects and a group for random effects, whereas the primary computations of GLM treat the dummy variables as representing fixed effects. The general linear model approach uses dummy variables in a regression model. Although this technique is useful in all situations, it is primarily applied to analysis of variance with unbalanced data, where the direct computation of sums of squares fails, and to analysis of covariance and associated techniques.

While the dummy variable approach is capable of handling a vast array of applications, it also presents some complications that must be overcome. Two of the principal complications regarding fixed effects are

❏ specifying model parameters and their estimates

❏ setting up meaningful combinations of parameters for testing and estimation.

Both of these are concerned with estimable functions. These complications must be dealt with in computer programs using general linear models. The purpose of this chapter is to explain, with the use of fairly simple examples, how the GLM procedure deals with the complications. A more technical description of GLM features is given in the SAS/STAT User’s Guide, Version 8, Volume 2.

This chapter describes the essence of general linear model and mixed-model computations. It is more or less self-contained, and you will notice some overlap with previous and subsequent chapters. In particular, the CONTRAST and ESTIMATE statements are discussed in Chapter 3, “Analysis of Variance for Balanced Data,” and the RANDOM statement is discussed in Chapter 4, “Analyzing Data with Random Effects.” This present chapter delves more deeply into some of the same topics. Section 6.2 provides essential concepts of using dummy variables in the context of a one-way classification. Section 6.3 does the same for a two-way classification with both factors fixed. Then Section 6.4 discusses technical issues for mixed models.

6.2 The Dummy-Variable Model

This section presents the analysis-of-variance model using dummy variables, methods for specifying model parameters, and the methods used by PROC GLM. For simplicity, an analysis-of-variance model with one-way classification that results from a completely randomized design illustrates the discussion. In application, however, such a structure might be adequately (and more efficiently) analyzed by using the ANOVA procedure (see Section 3.4, “Analysis of One-Way Classification of Data”).

6.2.1 The Simplest Case: A One-Way Classification

Data for the one-way classification consist of measurements classified according to a one-dimensional criterion. An example of this kind of structure is a set of student exam scores, where each student is taught by one of three teachers. The exam scores are thus grouped or classified according to TEACHER. The most straightforward model for data of this type is

yij = μi + εij

where

yij

represents the jth measurement in the ith group.

μi

represents the population mean for the ith group.

εij

represents the random error with mean=0 and variance=σ2.

i = 1,..., t

where t equals the number of groups.

j = 1,..., ni

where ni equals the number of observations in the ith group.

This is called the means or μ-model because it uses the means μ1,..., μt as the basic parameters in the mathematical expression for the model (Hocking and Speed 1975). The corresponding estimates of these parameters are

μ̂1 = ȳ1., ..., μ̂t = ȳt.

where ȳi. = (Σj yij)/ni is the mean of the ni observations in group i.

In these situations, the statistical inference of interest is often about differences between the means of the form (μi − μi′) or between the means and some reference or baseline value μ. Therefore, many statistical textbooks present a model for the one-way structure that employs these differences as basic parameters. This is the familiar analysis-of-variance model illustrated in Section 2.3.4:

yij = μ + τi + εij

where μ equals the reference value and

τi = μi − μ

Thus, the means can be expressed as

μi = μ + τi

This relates the set of t means μ1,..., μt to a set of t+1 parameters, μ, τ1,..., τt. Therefore, this model is said to be overspecified. Consequently, the parameters μ, τ1,..., τt are not well defined. For any set of values of μ1,..., μt, there are infinitely many choices for μ, τ1,..., τt that satisfy the basic equations μi = μ + τi, i = 1,..., t. The choice may depend on the situation at hand, or it may not be necessary to fully define the parameters.
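This overspecification is easy to demonstrate numerically. The following sketch (plain Python, not SAS, using hypothetical group means) shows that any choice of μ reproduces the same μi:

```python
# Hypothetical group means mu_1, ..., mu_t; any values work the same way.
mus = [8.0, 6.0, 8.0]

# For every arbitrary choice of mu, tau_i = mu_i - mu satisfies mu_i = mu + tau_i,
# so the mapping from (mu, tau_1, ..., tau_t) to the means is many-to-one.
for mu in (0.0, 1.5, -3.0):
    taus = [m - mu for m in mus]
    assert all(abs((mu + tau) - m) < 1e-12 for tau, m in zip(taus, mus))
```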

For the implementation of the dummy-variable model, the analysis-of-variance model

yij = μ + τi + εij

is rewritten as a regression model

yij = μ + τ1x1 + ... + τtxt + εij

where the dummy variables x1,...,xt are defined as follows:

x1 equals 1 for an observation in group 1 and 0 otherwise.
x2 equals 1 for an observation in group 2 and 0 otherwise.
·  
·  
·  
xt equals 1 for an observation in group t and 0 otherwise.
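The coding rules above can be sketched in a few lines. This is a minimal illustration in Python (not SAS syntax), using a hypothetical list of group labels; PROC GLM builds the same columns internally from a CLASS variable:

```python
# Hypothetical classification variable with t = 3 levels.
groups = ["A", "A", "B", "C", "C"]
levels = sorted(set(groups))   # levels in sorted order, as PROC GLM uses

# Each row: an intercept 1 followed by one dummy column per level.
X = [[1] + [1 if g == lev else 0 for lev in levels] for g in groups]

# Exactly one dummy is 1 in each row, so the intercept column equals
# the sum of the dummy columns.
for row in X:
    assert row[0] == sum(row[1:])
```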

In matrix notation, the model equations for the data become

Y = Xβ + ε

where

Y = (y11, ..., y1n1, y21, ..., y2n2, ..., yt1, ..., ytnt)′

is the vector of observations, β = (μ, τ1, ..., τt)′ is the parameter vector, and ε is the corresponding vector of random errors. The X matrix consists of a column of 1s (for μ) followed by the t dummy-variable columns x1, ..., xt; each row of X contains a 1 in the intercept column and a single additional 1 in the column for that observation's group.

Thus, the matrices of the normal equations are

X′X =

[ n.   n1   n2   ...  nt ]
[ n1   n1   0    ...  0  ]
[ n2   0    n2   ...  0  ]
[ ...                    ]
[ nt   0    0    ...  nt ]

and X′Y = (Y.., Y1., Y2., ..., Yt.)′

where Yi. is the total for the ith group and Y.. is the grand total. The normal equations (X′X)β̂ = X′Y are equivalent to the set

μ̂ + τ̂1 = ȳ1.
μ̂ + τ̂2 = ȳ2.
...
μ̂ + τ̂t = ȳt.

Because there are only t equations, there is no unique solution for the (t+1) estimates μ̂, τ̂1, ..., τ̂t. Correspondingly, the X′X matrix describing the set of normal equations is of dimension (t+1) × (t+1) but of rank t: the first row of X′X is equal to the sum of the other t rows, and the same relationship exists among the columns of X′X. Therefore, X′X is said to be of less than full rank.
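A small numeric check (plain Python, with an arbitrary choice of group sizes) confirms both the structure of X′X shown above and its rank deficiency:

```python
# Group sizes n1, n2, n3 (arbitrary); build the one-way X matrix.
sizes = [2, 1, 2]
t = len(sizes)
obs_group = [i for i, n in enumerate(sizes) for _ in range(n)]
X = [[1] + [1 if g == j else 0 for j in range(t)] for g in obs_group]

# X'X by direct summation.
p = t + 1
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]

# First row is (n., n1, ..., nt) and equals the sum of the other t rows,
# so X'X is (t+1) x (t+1) but only of rank t.
assert XtX[0] == [sum(sizes)] + sizes
assert XtX[0] == [sum(col) for col in zip(*XtX[1:])]
```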

6.2.2 Parameter Estimates for a One-Way Classification

There are two popular methods for obtaining estimates with a less-than-full-rank model. Restrictions can be imposed on the parameters to obtain a full-rank model, or a generalized inverse of X′X can be obtained. PROC GLM uses the latter method. This section reviews both methods in order to put the approach used by PROC GLM into perspective.

The restrictions method is based on the fact that any definition of one of the parameters in the model (say the reference parameter) causes the other parameters to be uniquely defined. The definition can be restated in the form of a restriction. Another view of the term restriction is to define the parameters to have a unique interpretation. The corresponding estimates are then required to coincide with the definition of the parameters.

One type of restriction is to define one of the τi to equal 0, say τt = 0. In this case, μ becomes the mean of the tth group, μt = μ + τt = μ, and τi becomes the difference between the mean for the ith group and the mean for the tth group, τi = μi − μ = μi − μt.

The corresponding restriction on the solution to the normal equations is to require τ̂t = 0. Requiring τ̂t = 0 leads automatically to a unique set of values for the remaining estimates μ̂, τ̂1, ..., τ̂t−1, because τt is dropped from the linear model. Consequently, the column corresponding to τt is dropped from the X matrix, producing the following model equation:

Y = Xβ + ε

where β = (μ, τ1, ..., τt−1)′ and X is the earlier matrix with the column for τt deleted. Rows for observations in groups 1 through t−1 keep a 1 in the intercept column and a 1 in their group's dummy column; rows for observations in group t contain a 1 in the intercept column and 0s elsewhere.

The solution to the corresponding normal equations (X′X)β̂ = X′Y, where X′X is now nonsingular, results in

μ̂ = ȳt.
τ̂1 = ȳ1. − ȳt.
τ̂2 = ȳ2. − ȳt.
...
τ̂t−1 = ȳ(t−1). − ȳt.

Another approach defines μ to be the mean of μ1, μ2,..., μt; that is, μ = (μ1 + μ2 + ... + μt)/t. Then μ is called the grand mean and the τi are called the group effects. From this definition of μ, it follows that Σi τi = 0. Consequently,

τt = −τ1 − τ2 − ... − τt−1

Therefore, observations ytj = μ + τt + εtj in the tth group can be written

ytj = μ − τ1 − τ2 − ... − τt−1 + εtj

The parameter τt is dropped from the model, which now becomes

Y = Xβ + ε

where β = (μ, τ1, τ2, ..., τt−1)′. The X matrix is the same as before for observations in groups 1 through t−1, but each row for an observation in group t now contains a 1 in the intercept column and −1 in each of the τ1, ..., τt−1 columns.

The solution to the corresponding normal equation yields

μ̂ = (ȳ1. + ... + ȳt.)/t
τ̂1 = ȳ1. − ȳ..
τ̂2 = ȳ2. − ȳ..
...
τ̂t−1 = ȳ(t−1). − ȳ..

and the implementation of the condition τt = −τ1 − τ2 − ... − τt−1 yields

τ̂t = ȳt. − ȳ..

The use of generalized inverses and estimable functions may be preferable for a variety of reasons. In the restrictions method, it might not be clear which particular restriction is desired. In cases of empty cells in multiway classifications, it can be difficult to define the parameters. In fact, it is often hard to identify the empty cells in large, multiway classifications, let alone to define a set of parameters that adequately describe all pertinent effects and interactions. The generalized-inverse approach partially removes the burden of defining parameters from the data analyst.

Section 2.4.4, “Using the Generalized Inverse,” showed that there is no unique solution to a system of equations with a less-than-full-rank coefficient matrix and introduced the generalized inverse to obtain a nonunique solution. Although the set of parameter estimates produced using the generalized inverse is not unique, there is a class of linear functions of parameters called estimable functions for which unique estimates do exist. For example, the function (τi − τj) is estimable: its least-squares estimate is the same regardless of the particular solution obtained for the normal equations. For a discussion of the definition of estimable functions as it relates to the theory of linear models, see Graybill (1976) or Searle (1971).
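Both restriction schemes, and the invariance of estimable functions between them, can be verified numerically. A sketch in plain Python with hypothetical group data:

```python
from statistics import mean

# Hypothetical one-way data: group label -> observations.
data = {1: [7.0, 9.0], 2: [5.0, 6.0, 7.0], 3: [8.0]}
ybar = {i: mean(v) for i, v in data.items()}
t = len(data)

# Solution under the set-to-zero restriction tau_t = 0
# (the form PROC GLM's generalized inverse produces).
mu_a = ybar[t]
tau_a = {i: ybar[i] - ybar[t] for i in data}

# Solution under the sum-to-zero restriction (mu = mean of the group means).
mu_b = sum(ybar.values()) / t
tau_b = {i: ybar[i] - mu_b for i in data}

# tau_1 - tau_2 is estimable: both solutions give the same value.
d1 = tau_a[1] - tau_a[2]
d2 = tau_b[1] - tau_b[2]
assert abs(d1 - d2) < 1e-12

# mu by itself is not estimable: the two solutions disagree.
assert abs(mu_a - mu_b) > 1e-9
```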

PROC GLM uses a generalized inverse to obtain a solution that produces one set of estimates. The technique, in some respects, is parallel to using a set of restrictions that set some of the parameter estimates to 0. Quantities to be estimated or comparisons to be made are specified, and PROC GLM determines whether or not the estimates or comparisons represent estimable functions. PROC GLM then provides estimates, standard errors, and test statistics.

For certain applications, there is more than one set of hypotheses that can be tested. To cover these situations, PROC GLM provides four types of sums of squares and associated F-statistics, and also gives additional information to assist in interpreting the hypotheses tested.

6.2.3 Using PROC GLM for Analysis of Variance

Using PROC GLM for analysis of variance is similar to using PROC ANOVA; the statements listed for PROC ANOVA in Section 3.3.2, “Using the ANOVA and GLM Procedures,” are also used for PROC GLM. In addition to the statements listed for PROC ANOVA, the following SAS statements can be used with PROC GLM:

CONTRAST 'label' effect values < ... effect values> < / options>;
ESTIMATE 'label' effect values < ... effect values> < / options>;
ID variables;
LSMEANS effects< / options>;
OUTPUT <OUT=SAS-data-set> keyword= names < . . . keyword=names>;
RANDOM effects< / options>;
WEIGHT variable;

The CONTRAST statement provides a way of obtaining custom hypotheses tests. The ESTIMATE statement can be used to estimate linear functions of the parameters. The LSMEANS (least-squares means) statement specifies effects for which least-squares estimates of means are computed. The uses of these statements are illustrated in Section 6.2.4, “Estimable Functions in the One-Way Classification,” and Section 6.3.6, “MEANS, LSMEANS, CONTRAST, and ESTIMATE Statements in the Two-Way Layout.” The RANDOM statement specifies which effects in the model are random (see Section 6.4.1, “Proper Error Terms”). When predicted values are requested as a MODEL statement option, values of the variable specified in the ID statement are printed for identification beside each observed, predicted, and residual value. The OUTPUT statement produces an output data set that contains the original data set values along with predicted and residual values. The WEIGHT statement is used when a weighted residual sum of squares is needed. For more information, refer to Chapter 24 in the SAS/STAT User’s Guide, Version 8, Volume 2.

Implementing PROC GLM for an analysis-of-variance model is illustrated by an example of test scores made by students in three classes taught by three different teachers. The data appear in Output 6.1.

Output 6.1 Data for One-Way Analysis of Variance

The SAS System
 
Obs    teach score1 score2
 
1    JAY 69 75
2    JAY 69 70
3    JAY 71 73
4    JAY 78 82
5    JAY 79 81
6    JAY 73 75
7    PAT 69 70
8    PAT 68 74
9    PAT 75 80
10    PAT 78 85
11    PAT 68 68
12    PAT 63 68
13    PAT 72 74
14    PAT 63 66
15    PAT 71 76
16    PAT 72 78
17    PAT 71 73
18    PAT 70 73
19    PAT 56 59
20    PAT 77 83
21    ROBIN 72 79
22    ROBIN 64 65
23    ROBIN 74 74
24    ROBIN 72 75
25    ROBIN 82 84
26    ROBIN 69 68
27    ROBIN 76 76
28    ROBIN 68 65
29    ROBIN 78 79
30    ROBIN 70 71
31    ROBIN 60 61

In terms of the analysis-of-variance model described above, the τj are the parameters associated with the different teachers (TEACH)—τ1 is associated with JAY, τ2 with PAT, and τ3 with ROBIN. The following SAS statements are used to analyze SCORE2:

proc glm;
   class teach;
   model score2=teach / solution xpx i;
run;

In this example, the CLASS variable TEACH identifies the three classes. In effect, PROC GLM establishes a dummy variable (1 for presence, 0 for absence) for each level of each CLASS variable. In this example, the CLASS statement causes PROC GLM to create dummy variables corresponding to JAY, PAT, and ROBIN, resulting in the following X matrix:

      INTERCEPT  JAY  PAT  ROBIN
      [ 1         1    0    0 ]    6 rows for Jay's group
      [ ...                   ]
X =   [ 1         0    1    0 ]    14 rows for Pat's group
      [ ...                   ]
      [ 1         0    0    1 ]    11 rows for Robin's group
      [ ...                   ]

Note that the columns for the dummy variables are in alphabetical order; the column positioning depends only on the values of the CLASS variable. For example, the column for JAY would appear after the columns for PAT and ROBIN if the value JAY were replaced by ZJAY.

The MODEL statement has the same purpose in PROC GLM as it does in PROC REG and PROC ANOVA. Note that the MODEL statement contains the SOLUTION option. This option is used because PROC GLM does not automatically print the estimated parameter vector when a model contains a CLASS statement. The results of the SAS statements shown above appear in Output 6.2.

Output 6.2 One-Way Analysis of Variance from PROC GLM

The GLM Procedure
 
Dependent Variable: score2  
 
  Sum of  
  Source   DF Squares   Mean Square  F Value   Pr > F
  Model 2 49.735861 24.867930 0.56 0.5776
 
  Error 28   1243.941558 44.426484
 
  Corrected Total 30 1293.677419
 
R-Square Coeff Var Root MSE score2 Mean
 
0.038445 9.062496 6.665320 73.54839
 
Source DF Type I SS Mean Square F Value Pr > F
 
teach 2 49.73586091 24.86793046 0.56 0.5776
 
Source DF Type III SS Mean Square F Value Pr > F
 
teach 2 49.73586091 24.86793046 0.56 0.5776
 
      Standard    
Parameter   Estimate   Error  t Value Pr > |t|
 
Intercept   72.45454545 B 2.00966945 36.05  <.0001
teach JAY 3.54545455 B 3.38277775  1.05  0.3036
teach PAT 0.90259740 B 2.68553376  0.34  0.7393
teach ROBIN 0.00000000 B  ⋅          ⋅    ⋅      
 
NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

The first portion of the output, as in previous examples, shows the statistics for the overall model.

The second portion partitions the model sum of squares (MODEL SS) into portions corresponding to factors defined by the list of variables in the MODEL statement. In this model there is only one factor, TEACH, so the Type I and Type III SS are the same as the MODEL SS. Type II and Type IV have no special meaning here and would be the same as Type I and Type III.

The final portion of the output contains the parameter estimates obtained with the generalized inverse. Specifying XPX and I in the list of options in the MODEL statement causes the X′X and (X′X)− matrices to be printed. Results appear in Output 6.3.

Output 6.3 X′X and (X′X)− Matrices for a One-Way Classification

The GLM Procedure
 
The X'X Matrix
 
  Intercept  teach JAY  teach PAT  teach ROBIN  score2  
 
Intercept 31 6 14 11 2280
teach JAY 6 6 0 0 456
teach PAT 14 0 14 0 1027
teach ROBIN 11 0 0 11 797
score2 2280 456 1027 797 168984
X'X Generalized Inverse (g2)
 
  Intercept  teach JAY  teach PAT  teach ROBIN  score2
 
Intercept  0.0909090909  -0.090909091  -0.090909091 0  72.454545455
teach JAY -0.090909091 0.2575757576 0.0909090909 0 3.5454545455
teach PAT -0.090909091 0.0909090909 0.1623376623 0 0.9025974026
teach ROBIN 0 0 0 0 0
score2 72.454545455 3.5454545455 0.9025974026 0 1243.9415584

For this example, the matrix X′Y is

X′Y = (overall SCORE2 total, SCORE2 total for JAY, SCORE2 total for PAT, SCORE2 total for ROBIN)′ = (2280, 456, 1027, 797)′

Taking (X′X)− from the PROC GLM output and using X′Y above, the solution β̂ = (X′X)−X′Y is

[ β̂0 ]   [  .0909  −.0909  −.0909  .0000 ] [ 2280 ]   [ 72.45 ]
[ β̂1 ] = [ −.0909   .2575   .0909  .0000 ] [  456 ] = [  3.54 ]
[ β̂2 ]   [ −.0909   .0909   .1623  .0000 ] [ 1027 ]   [  0.90 ]
[ β̂3 ]   [  .0000   .0000   .0000  .0000 ] [  797 ]   [  0.00 ]

As pointed out in Section 2.4.4, the particular generalized inverse used by PROC GLM causes the last row and column of (X′X)− to be set to 0. This yields a set of parameter estimates equivalent in this example to the set given by the restriction that τ3 = 0. Using the principles discussed in Section 6.2.2, “Parameter Estimates for a One-Way Classification,” it follows that the INTERCEPT μ̂ is actually the mean for the reference group ROBIN. The estimate τ̂1 labeled JAY is the difference between the mean for Jay’s group and the mean for Robin’s group, and similarly, the estimate τ̂2 labeled PAT is the mean for Pat’s group minus the mean for Robin’s group. Finally, the estimate τ̂3 labeled ROBIN, which is set to 0, can be viewed as the mean for Robin’s group minus the mean for Robin’s group.
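These relationships can be checked directly from the SCORE2 data in Output 6.1. A plain-Python sketch of the arithmetic (not the SAS computation itself):

```python
from statistics import mean

# SCORE2 values from Output 6.1, grouped by teacher.
score2 = {
    "JAY":   [75, 70, 73, 82, 81, 75],
    "PAT":   [70, 74, 80, 85, 68, 68, 74, 66, 76, 78, 73, 73, 59, 83],
    "ROBIN": [79, 65, 74, 75, 84, 68, 76, 65, 79, 71, 61],
}
ybar = {t: mean(v) for t, v in score2.items()}

# The g2 inverse zeroes the ROBIN row and column, so the solution is the
# one obtained under the restriction tau_ROBIN = 0.
intercept = ybar["ROBIN"]
tau = {t: ybar[t] - ybar["ROBIN"] for t in score2}

assert abs(intercept - 72.45454545) < 1e-7   # matches Output 6.2
assert abs(tau["JAY"] - 3.54545455) < 1e-7
assert abs(tau["PAT"] - 0.90259740) < 1e-7
assert tau["ROBIN"] == 0.0
```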

Remember that these estimates are not unique—that is, they depend on the alphabetical order of the values of the CLASS variable. This fact is recognized in the output by the letter ‘B’ (for biased) following the estimates, which is explained in the note after the listing of estimates.

The other MODEL statement options (P, CLM, CLI, and TOLERANCE), as well as the BY, ID, WEIGHT, FREQ, and OUTPUT statements, are not affected by the use of CLASS variables and may be used as described in Section 2.2.4, “The SS1 and SS2 Options: Two Types of Sums of Squares” and Section 2.2.5, “Tests of Subsets and Linear Combinations of Coefficients.”

6.2.4 Estimable Functions in a One-Way Classification

It is often the case that the particular parameter estimates obtained by the SOLUTION option in PROC GLM are not the estimates of interest, or there may be additional functions of the parameters that you want to estimate. You can specify such other estimates with PROC GLM.

An estimable function is a member of a special class of linear functions of parameters (see Section 2.2.4). An estimable function of the parameters has a definite interpretation regardless of how the parameters themselves are specified. Denote with L a vector of coefficients (L1, L2,…, Lt, Lt+1). Then Lβ = L1μ + L2τ1 +…+ Lt+1τt is a linear function of the model parameters and is estimable (for this example) if it can be expressed as a linear function of the means μ1,…,μt. Let β̂ be a solution to the normal equations. The function Lβ is estimated by Lβ̂, the corresponding linear function of the parameter estimates. If Lβ is estimable, then Lβ̂ will have the same value regardless of the solution obtained from the normal equations. In the example,

β̂ = (μ̂, τ̂1, τ̂2, τ̂3)′ = (72.454, 3.545, 0.902, 0.000)′, labeled INTERCEPT, JAY, PAT, and ROBIN, respectively.

To illustrate, define

L = [1 1 0 0]

Then Lβ̂ = μ̂ + τ̂1 = μ̂1 = 76.0, which is the estimate of the mean score of Jay's group. Alternately, let

L = [0 1 −1 0]

Then Lβ̂ = (τ̂1 − τ̂2) = μ̂1 – μ̂2 = 2.643, the estimated mean difference between Jay’s and Pat’s groups. Because both of these are estimable functions, identical estimates would be obtained using a different generalized inverse—for example, if different names for the teachers changed the order of the dummy variables.
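A quick check of these two estimable functions, using the solution vector from Output 6.2 (plain Python):

```python
# Solution vector (intercept, JAY, PAT, ROBIN) from Output 6.2.
beta_hat = [72.4545455, 3.5454545, 0.9025974, 0.0]

def l_beta(L, beta):
    # L'beta for a coefficient vector L
    return sum(li * bi for li, bi in zip(L, beta))

jay_mean = l_beta([1, 1, 0, 0], beta_hat)     # mu + tau_1 = Jay's group mean
jay_vs_pat = l_beta([0, 1, -1, 0], beta_hat)  # tau_1 - tau_2 = mu_1 - mu_2

assert abs(jay_mean - 76.0) < 1e-6
assert abs(jay_vs_pat - 2.643) < 1e-3
```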

Variances of these estimates can be readily obtained with standard formulas that involve elements of the generalized inverse (see Section 2.2.4).

You can obtain the general form of the estimable functions with the E option in the MODEL statement:

model score2 = teach / e ;

Output 6.4 shows you that L4, the coefficient of τ3, must be equal to L1 − L2 − L3. Equivalently, L1 = L2 + L3 + L4. That is, the coefficient on μ must be the sum of the coefficients on τ1, τ2, and τ3.

Output 6.4 Obtaining the General Form of Estimable Functions Using the E Option

The GLM Procedure
 
General Form of Estimable Functions
 
Effect   Coefficients
 
Intercept   L1
teach JAY L2
teach PAT L3
teach ROBIN L1-L2-L3
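The condition in Output 6.4 can be wrapped in a small checker. A plain-Python sketch for this one-way model, with the coefficient order matching Output 6.4:

```python
def is_estimable(L, tol=1e-12):
    """General form from Output 6.4: L is estimable iff L1 = L2 + L3 + L4."""
    L1, L2, L3, L4 = L
    return abs(L1 - (L2 + L3 + L4)) < tol

assert is_estimable([1, 1, 0, 0])       # mu + tau_1, Jay's mean
assert is_estimable([0, 1, -1, 0])      # tau_1 - tau_2
assert is_estimable([0, -2, 1, 1])      # the JAY-vs-others contrast
assert not is_estimable([1, 0, 0, 0])   # mu alone
assert not is_estimable([0, 1, 0, 0])   # tau_1 alone
```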

PROC GLM calculates estimates and variances for several special types of estimable functions with LSMEANS, CONTRAST, or ESTIMATE statements as well as estimates of user-supplied functions.

The LSMEANS statement produces the least-squares estimates of CLASS variable means—these are sometimes referred to as adjusted means. For the one-way structure, these are simply the ordinary means. In terms of model parameter estimates, they are μ̂ + τ̂i. The following SAS statement lists the least-squares means for the three teachers for all dependent variables in the MODEL statement:

lsmeans teach / options;

The available options in the LSMEANS statement are

STDERR

prints the standard errors of each estimated least-squares mean and the t-statistic for a test of the hypothesis that the mean is 0

PDIFF

prints the p-values for the tests of equality of all pairs of CLASS means

E

prints a description of the linear function used to obtain each least-squares mean; this has importance in more complex situations

E=

specifies an effect in the model to use as an error term

ETYPE=

specifies the type (1, 2, 3, or 4) of the effect specified in the E= option

SINGULAR=

tunes the estimability checking

For more information, refer to Chapter 24 in the SAS/STAT User's Guide, Version 8, Volume 2. Output 6.5 shows results from the following SAS statement:

lsmeans teach / stderr pdiff;

Output 6.5 Results of the LSMEANS Statement

The GLM Procedure
Least Squares Means
 
  score2 Standard  LSMEAN
teach LSMEAN Error  Pr > |t| Number
 
JAY   76.0000000   2.7211053 <.0001 1
PAT 73.3571429 1.7813816 <.0001 2
ROBIN 72.4545455 2.0096694 <.0001 3
 
Least Squares Means for effect teach
Pr > |t| for H0: LSMean(i)=LSMean(j)
 
Dependent Variable: score2
 
i/j   1 2 3
 
1   0.4233 0.3036
2 0.4233   0.7393
3 0.3036 0.7393  

NOTE: To ensure overall protection level, only probabilities associated with pre-planned comparisons should be used.

The least-squares mean for JAY is computed as μ̂ + τ̂1 = 72.45 + 3.55 = 76.0. Note that this linear function has coefficients L1=1, L2=1, L3=0, and L4=0, so it meets the estimability condition L1 = L2 + L3 + L4.
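For a one-way classification, both the least-squares means and their standard errors in Output 6.5 can be reproduced by hand: the standard error of each mean is sqrt(MSE/ni). A plain-Python sketch using the SCORE2 data:

```python
from math import sqrt
from statistics import mean

score2 = {
    "JAY":   [75, 70, 73, 82, 81, 75],
    "PAT":   [70, 74, 80, 85, 68, 68, 74, 66, 76, 78, 73, 73, 59, 83],
    "ROBIN": [79, 65, 74, 75, 84, 68, 76, 65, 79, 71, 61],
}

# Error mean square: pooled within-group sum of squares over n - t = 28 df.
sse = sum(sum((y - mean(v)) ** 2 for y in v) for v in score2.values())
n = sum(len(v) for v in score2.values())
mse = sse / (n - len(score2))

lsmean = {t: mean(v) for t, v in score2.items()}
stderr = {t: sqrt(mse / len(v)) for t, v in score2.items()}

assert abs(mse - 44.426484) < 1e-5            # matches Output 6.2
assert abs(lsmean["JAY"] - 76.0) < 1e-9
assert abs(stderr["JAY"] - 2.7211053) < 1e-6  # matches Output 6.5
assert abs(stderr["PAT"] - 1.7813816) < 1e-6
```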

Least-squares means should not, in general, be confused with ordinary means, which are available with a MEANS statement. The MEANS statement produces simple, unadjusted means of all observations in each class or treatment. Except for one-way designs and some nested and balanced factorial structures that are normally analyzed with PROC ANOVA, these unadjusted means are generally not equal to the least-squares means. Note that for this example, the least-squares means are the same as the means obtained with the MEANS statement. (The MEANS statement is discussed in Section 3.4.2.)

A contrast is a linear function such that the elements of the coefficient vector sum to 0 for each effect. PROC GLM can be instructed to calculate a sum of squares and associated F-test due to one or more contrasts.

As an example, assume that teacher JAY used a special teaching method. You might then be interested in testing whether Jay’s students had mean scores different from the students of the other teachers, and whether PAT and ROBIN, using the same method, produced different mean scores. The corresponding contrasts are shown below:

Multipliers for TEACH

Contrast        JAY   PAT   ROBIN

JAY vs others   −2    +1    +1
PAT vs ROBIN     0    −1    +1

Taking β = (μ, τ1, τ2, τ3)′, the contrasts are

Lβ = −2μ1 + μ2 + μ3 = −2τ1 + τ2 + τ3   (JAY vs others)

and

Lβ = −μ2 + μ3 = −τ2 + τ3   (PAT vs ROBIN)

The corresponding CONTRAST statements are as follows:

contrast 'JAY vs others' teach -2 1 1;
contrast 'PAT vs ROBIN' teach 0 -1 1;

The results appear in Output 6.6.

Output 6.6 Results of the CONTRAST and ESTIMATE Statements

The GLM Procedure
 
Contrast DF Contrast SS Mean Square F Value Pr > F
 
JAY vs others 1 46.19421179 46.19421179 1.04 0.3166
PAT vs ROBIN 1 5.01844156 5.01844156 0.11 0.7393
    Standard    
Parameter Estimate Error t Value Pr > |t|
 
LSM JAY 76.0000000 2.72110530 27.93 <.0001
LSM PAT 73.3571429 1.78138157 41.18 <.0001
LSM ROBIN 72.4545455 2.00966945 36.05 <.0001
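For a one-way classification with unequal group sizes, each single-degree-of-freedom contrast sum of squares in Output 6.6 follows the standard formula SS = (Σi ci ȳi.)² / (Σi ci²/ni). A plain-Python check against the SCORE2 group means:

```python
# Group means and sizes for JAY, PAT, ROBIN (from the SCORE2 data).
ybar = [76.0, 1027 / 14, 797 / 11]
sizes = [6, 14, 11]

def contrast_ss(c, ybar, n):
    # Single-df contrast sum of squares for unequal group sizes.
    num = sum(ci * yi for ci, yi in zip(c, ybar)) ** 2
    den = sum(ci ** 2 / ni for ci, ni in zip(c, n))
    return num / den

ss_jay_vs_others = contrast_ss([-2, 1, 1], ybar, sizes)
ss_pat_vs_robin = contrast_ss([0, -1, 1], ybar, sizes)

assert abs(ss_jay_vs_others - 46.19421179) < 1e-5  # matches Output 6.6
assert abs(ss_pat_vs_robin - 5.01844156) < 1e-6
```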

Keep the following points in mind when using the CONTRAST statement:

❏ You must know how many classes (categories) are present in the effect and in what order they are sorted by PROC GLM. If there are more classes in the data than coefficients specified in the CONTRAST statement, PROC GLM adds trailing zeros. In other words, there is no check that the proper number of classes has been specified.

❏ The name or label of the contrast must be 20 characters or less.

❏ Available CONTRAST statement options are

    E prints the entire L vector.
    E=effect specifies an alternate error term.
    ETYPE=n specifies the type (1, 2, 3, or 4) of the E=effect.

❏ Multiple degrees-of-freedom contrasts can be specified by repeating the effect name and coefficients as needed. Thus, the statement

contrast 'ALL' teach -2 1 1, teach 0 -1 1;

produces a two DF sum of squares due to both contrasts. This feature can be used to obtain partial sums of squares for effects through the reduction principle, using sums of squares from multiple degrees-of-freedom contrasts that include and exclude the desired contrasts.

❏ If a non-estimable contrast has been specified, a message to that effect appears in the SAS log.

❏ Although only (t–1) linearly independent contrasts exist for t classes, any number of contrasts can be specified.

❏ The contrast sums of squares are not adjusted for (that is, not partial with respect to) other contrasts that may be specified for the same effect (see the fourth point above).

❏ The CONTRAST statement is not available with PROC ANOVA; thus, the computational inefficiency of PROC GLM for analyzing balanced data may be justified if contrasts are required. However, contrast variables can be defined in a DATA step and estimates and statistics can be obtained by a full-rank regression analysis.

The ESTIMATE statement is used to obtain statistics for estimable functions other than least-squares means and contrasts, although it can also be used for these. For the current example, the ESTIMATE statement is used to re-estimate the least-squares means.

The respective least-squares means for JAY, PAT, and ROBIN estimate μ1 = μ + τ1, μ2 = μ + τ2, and μ3 = μ + τ3. The following statements duplicate the least-squares means:

estimate 'LSM JAY' intercept 1 teach 1;
estimate 'LSM PAT' intercept 1 teach 0 1;
estimate 'LSM ROBIN' intercept 1 teach 0 0 1;

Note the use of the term INTERCEPT (referring to μ) and the fact that the procedure supplies trailing zero-valued coefficients. The results of these statements appear after the listing of parameter estimates at the bottom of Output 6.6 for convenient comparison with the results of the LSMEANS statement.

6.3 Two-Way Classification: Unbalanced Data

The major applications of the two-way structure are the two-factor factorial experiment and the randomized blocks. These applications usually have balanced data. In this section, the two-way classification with unbalanced data is explored. This introduces new questions, such as how means and sums of squares should be computed.

6.3.1 General Considerations

The two-way classification model is

yijk = μ + αi + βj + (αβ)ij + εijk

where

yijk

equals the kth observed score for the (i, j)th cell.

αi

equals the effect of the ith level of factor A.

βj

equals the effect of the jth level of factor B.

(αβ)ij

equals the interaction effect for the ith level of factor A and the jth level of factor B.

εijk

equals the random error associated with individual observations.

The model can be defined without the interaction term when appropriate. Let nij denote the number of observations in the cell for level i of A and level j of B. If μij denotes the population cell mean for level i of A and level j of B, then

μij = μ + αi + βj + (αβ)ij

At this point, no further restrictions on the parameters are assumed.

The computational formulas for PROC ANOVA that use the various treatment means provide correct statistics for balanced data—that is, data with an equal number of observations (nij=n for all i, j) for each treatment combination. When data are not balanced, sums of squares computed by PROC ANOVA can contain functions of the other parameters of the model, and thereby produce biased results.

To illustrate the effects of unbalanced data on the estimation of differences between means and computation of sums of squares, consider the data in this two-way table:

            B
         1       2
A   1   7, 9     5
    2   8       4, 6

Within level 1 of B, the cell mean for each level of A is 8—that is, ȳ11. = (7+9)/2 = 8 and ȳ21. = 8. Hence, there is no evidence of a difference between the levels of A within level 1 of B. Similarly, there is no evidence of a difference between levels of A within level 2 of B, because ȳ12. = 5 and ȳ22. = (4+6)/2 = 5. Therefore, you may conclude that there is no evidence in the table of a difference between the levels of A. However, the marginal means for A are

ȳ1.. = (7 + 9 + 5)/3 = 7

and

ȳ2.. = (8 + 4 + 6)/3 = 6

The difference of 7 − 6 = 1 between these marginal means may be erroneously interpreted as measuring an overall effect of the factor A. Actually, the observed difference between the marginal means for the two levels of A measures the effect of factor B in addition to the effect of factor A. This can be verified by expressing the observations in terms of the analysis-of-variance model yijk = μ + αi + βj. (For simplicity, the interaction and error terms have been left out of the model.)

                 B
         1                  2
A   1    7 = μ + α1 + β1    5 = μ + α1 + β2
         9 = μ + α1 + β1
    2    8 = μ + α2 + β1    4 = μ + α2 + β2
                            6 = μ + α2 + β2

The difference between marginal means for A1 and A2 is shown to be

ȳ1.. − ȳ2.. = (1/3)[(α1 + β1) + (α1 + β1) + (α1 + β2)] − (1/3)[(α2 + β1) + (α2 + β2) + (α2 + β2)]
           = (α1 − α2) + (1/3)(β1 − β2)

Thus, instead of estimating (α1 − α2), the difference between the marginal means of A estimates (α1 − α2) plus a function of the factor B parameters, (β1 − β2)/3. In other words, the difference between the A marginal means is biased by factor B effects.

The null hypothesis about A that would normally be tested is

H0: α1 – α2 = 0

However, for this example, the sum of squares for A computed by PROC ANOVA can be shown to equal 3(ȳ1.. − ȳ2..)²/2. Hence, the PROC ANOVA F-test for A actually tests the hypothesis

H0: (α1 − α2) + (β1 − β2)/3 = 0

which involves the factor B difference (β1 − β2) in addition to the factor A difference (α1 − α2).
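
The unadjusted sum of squares for A is easy to check by hand. The following Python sketch (an illustration of the arithmetic, not SAS code) reproduces the one-way formula and confirms that it equals 3(ȳ1.. − ȳ2..)²/2 for these data:

```python
import numpy as np

# Observations from the table: level 1 of A = {7, 9, 5}, level 2 of A = {8, 4, 6}
y1 = np.array([7.0, 9.0, 5.0])
y2 = np.array([8.0, 4.0, 6.0])
grand = np.concatenate([y1, y2]).mean()  # 6.5

# One-way (unadjusted) sum of squares for A, as PROC ANOVA would compute it
ss_a = 3 * (y1.mean() - grand) ** 2 + 3 * (y2.mean() - grand) ** 2

print(ss_a)                                  # 1.5
print(3 * (y1.mean() - y2.mean()) ** 2 / 2)  # 1.5, matching the formula in the text
```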

In terms of the μ model yijk = μij + εijk, you usually want to estimate (μ11 + μ12)/2 and (μ21 + μ22)/2 or the difference between these quantities. However, the A marginal means for the example are

ȳ1.. = (2μ11 + μ12)/3 + ε̄1..

and

ȳ2.. = (μ21 + 2μ22)/3 + ε̄2..

These means estimate (2μ11 + μ12)/3 and (μ21 + 2μ22)/3, which are functions of the cell frequencies and might not be meaningful.

In summary, a major problem in the analysis of unbalanced data is the contamination of differences between factor means by effects of other factors. The solution to this problem is to adjust the means to remove the contaminating effects.
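
The adjustment can be seen numerically on the small example above. This Python sketch (illustrative only; the actual adjustment in SAS is done by the LSMEANS statement) contrasts the raw marginal means with unweighted averages of the cell means:

```python
import numpy as np

# Cells of the small example: (A level, B level) -> observations
cells = {(1, 1): [7.0, 9.0], (1, 2): [5.0], (2, 1): [8.0], (2, 2): [4.0, 6.0]}

# Raw marginal means for A pool all observations in a level of A
raw_a1 = np.mean(cells[(1, 1)] + cells[(1, 2)])  # 7.0
raw_a2 = np.mean(cells[(2, 1)] + cells[(2, 2)])  # 6.0

# Adjusted means average the cell means with equal weight on each level of B
adj_a1 = (np.mean(cells[(1, 1)]) + np.mean(cells[(1, 2)])) / 2  # 6.5
adj_a2 = (np.mean(cells[(2, 1)]) + np.mean(cells[(2, 2)])) / 2  # 6.5

print(raw_a1 - raw_a2)  # 1.0 -- contaminated by the factor B difference
print(adj_a1 - adj_a2)  # 0.0 -- adjustment removes the contamination
```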

6.3.2 Sums of Squares Computed by PROC GLM

PROC GLM recognizes different theoretical approaches to analysis of variance by providing four types of sums of squares and associated statistics. The four types of sums of squares in PROC GLM are called Type I, Type II, Type III, and Type IV (SAS Institute). The four types of sums of squares are explained in general, conceptual terms, followed by more technical descriptions.

Type I sums of squares retain the properties discussed in Chapter 2, “Regression.” They correspond to adding each source (factor) sequentially to the model in the order listed. For example, the Type I sum of squares for the first factor listed is the same as PROC ANOVA would compute for that effect. It reflects differences between unadjusted means of that factor as if the data consisted of a one-way structure. The Type I SS may not be particularly useful for analysis of unbalanced multiway structures but may be useful for nested models, polynomial models, and certain tests involving the homogeneity of regression coefficients (see Chapter 7, “Analysis of Covariance”). Also, comparing Type I and other types of sums of squares provides some information on the effect of the lack of balance.

Type II sums of squares are more difficult to understand. Generally, the Type II SS for an effect U, which may be a main effect or interaction, is adjusted for an effect V if and only if V does not contain U. Specifically, for a two-factor structure with interaction, the main effects, A and B, are not adjusted for the A*B interactions because the symbol A*B contains both A and B. Factor A is adjusted for B because the symbol B does not contain A. Similarly, B is adjusted for A, and the A*B interaction is adjusted for the two main effects.

Type II sums of squares for the main effects A and B are mainly appropriate for situations in which no interaction is present. These are the sums of squares presented in many major statistical textbooks. Their method of computation is often referred to as the method of fitting constants.

The Type II analysis relates to the following general guidelines often given in applied statistical texts. First, test for the significance of the A*B interaction. If A*B is insignificant, delete it from the model and analyze main effects, each adjusted for the other. If A*B is significant, then abandon main-effects analysis and focus your attention on simple effects.
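
The “contains” rule can be stated compactly. Here is a hypothetical helper function (an illustration, not anything inside PROC GLM) that treats each effect as the set of factors appearing in its symbol:

```python
def adjusted_for(u, v):
    """Type II rule: effect u is adjusted for effect v iff v does not contain u.
    Effects are given as sets of factor names, e.g. {'A'} or {'A', 'B'}."""
    return not v.issuperset(u)

print(adjusted_for({'A'}, {'B'}))       # True:  A is adjusted for B
print(adjusted_for({'A'}, {'A', 'B'}))  # False: A is not adjusted for A*B
print(adjusted_for({'A', 'B'}, {'A'}))  # True:  A*B is adjusted for A
```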

Note that for full-rank regression models, the Type II sums of squares are adjusted for cross-product terms. This occurs because, for example,

y = β0 + β1x1 + β2x2 + β3x1x2 + ε

where the product x1x2 is dealt with simply as another independent variable with no concept of order of the term.

The Type III sums of squares correspond to Yates’s weighted squares of means analysis. Their principal use is in situations that require a comparison of main effects even in the presence of interaction. Type III sums of squares are partial sums of squares. In this sense, each effect is adjusted for all other effects. In particular, main effects A and B are adjusted for the interaction A*B if all these terms are in the model. If the model contains only main effects, then Type II and Type III analyses are the same. See Steel and Torrie (1980), Searle (1971), and Speed et al. (1978) for further discussion of the method of fitting constants and the method of weighted squares of means.

The Type IV functions were designed primarily for situations where there are empty cells. The principles underlying the Type IV sums of squares are quite involved and can be discussed only in a framework using the general construction of estimable functions. It should be noted that the Type IV functions are not necessarily unique when there are empty cells, but the functions are identical to those provided by Type III when there are no empty cells.

You can request four sums of squares in PROC GLM as options in the MODEL statement. For example, the following SAS statement specifies the printing of Type I and Type IV sums of squares:

model . . . / ss1 ss4;

Any or all types may be requested. If no sums of squares are specified, PROC GLM computes the Type I and Type III sums of squares by default.

The next two sections interpret the different sums of squares in terms of reduction notation and the μ-model.

6.3.3 Interpreting Sums of Squares in Reduction Notation

The types of sums of squares can be explained in terms of the reduction notation that is developed for regression models in Chapter 2. This requires writing the model as a regression model using dummy variables, with certain restrictions imposed on the parameters to give them unique interpretation.

As an example, consider a 2×3 factorial structure with nij observations in the cell in row i, column j. The equation for the model is

yijk = μ + αi + βj + αβij + εijk

where i = 1, 2, j = 1, 2, 3, and k = 1, . . . , nij. Assume nij > 0 for all i, j. An expression of the form R(α | μ, β) means the same as R(α1, α2 | μ, β1, β2, β3). The sums of squares printed by PROC GLM can be interpreted in reduction notation most easily under the restrictions

Σi αi = Σj βj = Σi αβij = Σj αβij = 0    (6.1)

that is, by taking an X matrix with full-column rank given by

          μ   α1   β1   β2  αβ11 αβ12
X  =  [   1    1    1    0    1    0  ]   n11 rows for observations in cell 11
      [   1    1    0    1    0    1  ]   n12 rows for observations in cell 12
      [   1    1   −1   −1   −1   −1  ]   n13 rows for observations in cell 13
      [   1   −1    1    0   −1    0  ]   n21 rows for observations in cell 21
      [   1   −1    0    1    0   −1  ]   n22 rows for observations in cell 22
      [   1   −1   −1   −1    1    1  ]   n23 rows for observations in cell 23

(Each distinct row is repeated for all of the observations in its cell.)

With this set of restrictions or definitions of the parameters, the sums of squares that result from the following MODEL statement are summarized below:

model y=a b a*b / ss1 ss2 ss3 ss4;

Effect   Type I           Type II          Type III = Type IV

A        R(α|μ)           R(α|μ,β)         R(α|μ,β,αβ)
B        R(β|μ,α)         R(β|μ,α)         R(β|μ,α,αβ)
A*B      R(αβ|μ,α,β)      R(αβ|μ,α,β)      R(αβ|μ,α,β)

You should be careful when using reduction notation with less-than-full-rank models. If no restrictions had been specified on the model for the two-way structure above, then R(α | μ, β, αβ) = 0 because the columns of the X matrix corresponding to the αi would be linearly dependent on the columns corresponding to μ and the αβij.

In addition, the dependence of reduction notation on the restrictions imposed cannot be overemphasized. For example, imposing the restriction

α2 = β3 = αβ21 = αβ22 = αβ13 = αβ23 = 0    (6.2)

results in a different value for R(α | μ, β, αβ). Although the restrictions of equation (6.1) are those that correspond to the sums of squares computed by PROC GLM, the restrictions of equation (6.2) are those that correspond to the (biased) parameter estimates computed by PROC GLM.

There is a relationship between the four types of sums of squares and the four types of data structures in a two-way classification. The relationship derives from the principles of adjustment that the sums-of-squares types obey. Letting nij denote the number of observations in level i of factor A and level j of factor B, the four types of data structures are

❏ equal cell frequencies: nij=common value for all i, j

❏ proportionate cell frequencies: nij / nil = nkj / nkl for all i, j, k, l

❏ disproportionate, nonzero cell frequencies: nij / nil ≠ nkj / nkl for some i, j, k, l, but nij > 0 for all i, j

❏ empty cell(s): nij=0 for some i, j.

The display below shows the relationship between sums-of-squares types and data structure types pertaining to the following MODEL statement:

model y=a b a*b / ss1 ss2 ss3 ss4;

For example, writing III=IV indicates that Type III is equal to Type IV.

Data Structure Type

Effect   1 (Equal nij)    2 (Proportionate nij)   3 (Disproportionate,   4 (Empty Cell)
                                                     nonzero nij)

A        I=II=III=IV      I=II, III=IV            III=IV
B        I=II=III=IV      I=II, III=IV            I=II, III=IV           I=II
A*B      I=II=III=IV      I=II=III=IV             I=II=III=IV            I=II=III=IV

6.3.4 Interpreting Sums of Squares in the μ-Model Notation

The μ model for the two-way structure takes the form

yijk = μij + εijk    (6.3)

The parameters of the model relate to the parameters of the standard analysis-of-variance model according to the equation

μij = μ + αi + βj + αβij

This relation holds regardless of any restriction that may be imposed upon the α, β, and αβ parameters. The advantage of using the μ-model notation over standard analysis-of-variance notation is that all of the μij parameters are clearly defined without specifying restrictions; thus, a hypothesis stated in terms of the μij can be easily understood.

Speed et al. (1978) give interpretations of the different types of sums of squares (I, II, III, and IV) computed by PROC GLM using the μ-model notation. It is assumed that all nij > 0, making Type III equal to Type IV.

Using their results, the sums of squares obtained from the following MODEL statement are expressed in terms of the μij as given in Table 6.1.

model response = a b a*b / ss1 ss2 ss3 ss4;

Table 6.1 Interpretation of Sums of Squares in the μ-Model Notation

Effect A
  Type I         (Σj n1j μ1j)/n1. = ... = (Σj naj μaj)/na.
  Type II        Σj n1j μ1j = Σi Σj (n1j nij μij)/n.j , ..., Σj naj μaj = Σi Σj (naj nij μij)/n.j
  Type III & IV  μ11 + ... + μ1b = ... = μa1 + ... + μab
                 that is, μ̄1. = ... = μ̄a., where μ̄i. = Σj μij/b

Effect B
  Type I & II    Σi ni1 μi1 = Σi Σj (ni1 nij μij)/ni. , ..., Σi nib μib = Σi Σj (nib nij μij)/ni.
  Type III & IV  μ11 + ... + μa1 = ... = μ1b + ... + μab
                 that is, μ̄.1 = ... = μ̄.b, where μ̄.j = Σi μij/a

Effect A*B
  Types I, II, III, IV   μij − μim − μlj + μlm = 0 for all i, j, l, m

Table 6.1 shows that the tests can be expressed in terms of equalities of weighted cell means, only some of which are easily interpretable. Considering the Type I A effect, the weights nij/ni. attached to μij are simply the fraction of the ni. observations in level i of A that were in level j of B. If these weights reflect the distribution across the levels of B in the population of units in level i of A, then the Type I test may have meaningful interpretation. That is, suppose the population of units in level i of A is made up of a fraction ρi1 of units in level 1 of B, of a fraction ρi2 of units in level 2 of B, and so on, where ρi1 + … + ρib = 1. Then it may be reasonable to test

H0 : Σjρ1jμ1j = ... = Σjρajμaj

which would be the Type I test in case nij/ni. = ρij.

Practical interpretation of the Type II weights is more difficult—refer to Section 6.3.7.2, “Interpreting Sums of Squares Using Estimable Functions.” Recall that the Type II tests are primarily for main effects with no interaction. You can see from Table 6.1 that the Type II hypothesis clearly depends on the nij, the numbers of observations in the cells.

The interpretation of Type III and Type IV tests is clear because all weights are unity. When the hypotheses are stated in terms of the μ model, the benefit of the Type III test is more apparent because the Type III hypothesis does not depend on the nij, the numbers of observations in the cells. Type I and Type II hypotheses do depend on the nij, and this may not be desirable.

For example, suppose a scientist sets up an experiment with ten plants in each combination of four levels of nitrogen (N) and three levels of lime (P). Suppose also that some plants die in some of the cells for reasons unrelated to the effects of N and P, leaving some cells with nij <10. A hypothesis test concerning the effects of N and P, which depends on the values of nij, would be contaminated by the accidental variation in the nij. The scientific method declares that the hypotheses to be tested should be stated before data are collected. It would be impossible to state Type I and Type II hypotheses prior to data collection because the hypotheses depend on the nij, which are known only after data are collected.

Note that the Type I and Type II hypotheses are different for effect A but the same for effect B. This occurs because the Type I sums of squares are model-order dependent. Being sequential, the Type I sums of squares are A (unadjusted), B (adjusted for A), and A*B (adjusted for A and B). Thus, the Type I sums of squares for the effects A and B listed in the MODEL statement in the order A, B, A*B would not be the same as in the order B, A, A*B. The Type II hypotheses are not model-order dependent because, for the two-way structure, both Type II main-effect sums of squares are adjusted for each other—that is, A (adjusted for B), B (adjusted for A), and A*B (adjusted for A and B). These Type II sums of squares are the partial sums of squares if no A*B interaction is specified in the model, in which case Type II, Type III, and Type IV would be the same.

Another interpretation of these tests is given by Hocking and Speed (1980), who point out that Type I, Type II, and Type III=Type IV tests for effect A each represent a test of

H0: μ11 +…+ μ1b =…= μa1 +…+ μab

subject to certain conditions on the cell means. The conditions are

Type I

no B effect, μ̄.1 = … = μ̄.b, and
no A*B effect, μij − μim − μlj + μlm = 0 for all i, j, l, m

Type II

no A*B effect, μij − μim − μlj + μlm = 0 for all i, j, l, m

Type III=Type IV

none (provided nij > 0 for all i, j).

6.3.5 An Example of Unbalanced Two-Way Classification

This example is a two-factor layout with data presented by Harvey (1975). Two types of feed rations (factor A) are given to calves from three different sires (factor B). The observed dependent variable yijk (variable Y) is the coded amount of weight gained by each calf. Because unequal numbers of calves of each sire are fed each ration, this is an unbalanced experiment. However, there are observations for each ration-sire combination; that is, there are no empty cells. The data appear in Output 6.7.

Output 6.7 Data for an Unbalanced Two-Way Classification

Obs a b y
 
1 1 1 5
2 1 1 6
3 1 2 2
4 1 2 3
5 1 2 5
6 1 2 6
7 1 2 7
8 1 3 1
9 2 1 2
10 2 1 3
11 2 2 8
12 2 2 8
13 2 2 9
14 2 3 4
15 2 3 4
16 2 3 6
17 2 3 6
18 2 3 7

The analysis-of-variance model for these data is

yijk = μ + αi + βj + αβij + εijk

where

i

equals 1, 2

j

equals 1, 2, 3.

The parameters μ, αi, βj, and αβij are as defined in Section 6.3.1, “General Considerations.” The model contains twelve parameters, which are more than can be estimated uniquely from the six cell means that are the basis for estimating parameters. The analysis is implemented with the following SAS statements:

proc glm;
   class a b;
   model y=a b a*b / ss1 ss2 ss3 ss4 solution;

The statements above cause PROC GLM to create the following twelve dummy variables:

❏ 1 dummy variable for the mean (or intercept)

❏ 2 dummy variables for factor A (ration)

❏ 3 dummy variables for factor B (sire)

❏ 6 dummy variables for the interaction A*B (all six possible pairwise products of the variables from factor A with those from factor B).
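
The overspecification can be seen directly. The following Python sketch (an illustration, not GLM's internal code) builds one row of the twelve dummy variables for each of the six cells and checks the rank of the resulting matrix:

```python
import numpy as np

# Build the twelve 0/1 dummy columns that CLASS a b creates for a=2, b=3 levels
a_levels, b_levels = 2, 3
cells = [(i, j) for i in range(a_levels) for j in range(b_levels)]
rows = []
for (i, j) in cells:
    r = [1]                                      # 1 column for the intercept
    r += [int(i == k) for k in range(a_levels)]  # 2 columns for factor A
    r += [int(j == k) for k in range(b_levels)]  # 3 columns for factor B
    r += [int((i, j) == c) for c in cells]       # 6 columns for A*B
    rows.append(r)
X = np.array(rows, dtype=float)

print(X.shape[1])                 # 12 columns
print(np.linalg.matrix_rank(X))   # rank 6: only six estimable degrees of freedom
```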

The options requested are SOLUTION, and for purposes of illustration, all four types of sums of squares. The results appear in Output 6.8.

Output 6.8 Sums of Squares for an Unbalanced Two-Way Classification

The GLM Procedure
 
Dependent Variable: y  
 
  Sum of  
  Source DF Squares   Mean Square  F Value  Pr > F
 
  Model 5 63.71111111 12.74222222 5.87 0.0057
 
  Error 12  26.06666667  2.17222222
 
  Corrected Total 17 89.77777778
 
 
R-Square Coeff Var Root MSE yield Mean
 
0.709653 28.83612 1.473846 5.111111
 
Source DF Type I SS Mean Square F Value Pr > F
 
a 1 7.80277778 7.80277778 3.59 0.0824
b 2 20.49185393 10.24592697 4.72 0.0308
a*b 2 35.41647940 17.70823970 8.15 0.0058
 
Source DF Type II SS Mean Square F Value Pr > F
 
a 1 15.85018727 15.85018727 7.30 0.0193
b 2 20.49185393 10.24592697 4.72 0.0308
a*b 2 35.41647940 17.70823970 8.15 0.0058
 
Source DF Type III SS Mean Square F Value Pr > F
 
a 1 9.64065041 9.64065041 4.44 0.0569
b 2 30.86591760 15.43295880 7.10 0.0092
a*b 2 35.41647940 17.70823970 8.15 0.0058
 
Source DF Type IV SS Mean Square F Value Pr > F
 
a 1 9.64065041 9.64065041 4.44 0.0569
b 2 30.86591760 15.43295880 7.10 0.0092
a*b 2 35.41647940 17.70823970 8.15 0.0058
 
      Standard    
Parameter   Estimate Error  t Value  Pr > |t|
 
Intercept     5.400000000 B   0.65912400 8.19 <.0001
a 1    -4.400000000 B 1.61451747 -2.73 0.0184
a 2    0.000000000 B  ⋅         ⋅    ⋅      
b 1    -2.900000000 B 1.23310809 -2.35 0.0366
b 2    2.933333333 B 1.07634498 2.73 0.0184
b 3    0.000000000 B  ⋅         ⋅    ⋅      
a*b 1  1 7.400000000 B 2.18606699 3.39 0.0054
a*b 1  2 0.666666667 B 1.94040851 0.34 0.7371
a*b 1  3 0.000000000 B  ⋅         ⋅    ⋅      
a*b 2  1 0.000000000 B  ⋅         ⋅    ⋅      
a*b 2  2 0.000000000 B  ⋅         ⋅    ⋅      
a*b 2  3 0.000000000 B  ⋅         ⋅    ⋅      
NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.

The first portion shows the statistics for the overall model. The overspecification of the model is obvious: The twelve dummy variables generate only six degrees of freedom (five for the terms listed in the MODEL statement plus one for the intercept).

The next portion of the output shows the four types of sums of squares. Note that Types III and IV give identical results. This is because nij>0 for all i, j.
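
The Type I sums of squares in Output 6.8 can be reproduced as reductions in residual sums of squares from sequentially fitted dummy-variable regressions. The following Python/NumPy sketch (an illustration of the computation, not SAS code) uses the data from Output 6.7:

```python
import numpy as np

# Harvey (1975) data from Output 6.7: (ration a, sire b, gain y)
data = [(1,1,5),(1,1,6),(1,2,2),(1,2,3),(1,2,5),(1,2,6),(1,2,7),(1,3,1),
        (2,1,2),(2,1,3),(2,2,8),(2,2,8),(2,2,9),
        (2,3,4),(2,3,4),(2,3,6),(2,3,6),(2,3,7)]
a = np.array([d[0] for d in data])
b = np.array([d[1] for d in data])
y = np.array([d[2] for d in data], dtype=float)

def dummies(f):
    """One 0/1 indicator column per level, as the CLASS statement creates."""
    return np.column_stack([(f == lev).astype(float) for lev in np.unique(f)])

def rss(X):
    """Residual sum of squares; lstsq handles the singular (less-than-full-rank) X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ coef) ** 2))

mu = np.ones((len(y), 1))
A, B = dummies(a), dummies(b)
AB = np.column_stack([A[:, i] * B[:, j] for i in range(2) for j in range(3)])

# Type I SS: reduction in RSS as each term is added in the order listed
r0 = rss(mu)
r1 = rss(np.hstack([mu, A]))
r2 = rss(np.hstack([mu, A, B]))
r3 = rss(np.hstack([mu, A, B, AB]))
print(round(r0 - r1, 5))  # Type I SS for a   (7.80278 in Output 6.8)
print(round(r1 - r2, 5))  # Type I SS for b   (20.49185)
print(round(r2 - r3, 5))  # Type I SS for a*b (35.41648)
print(round(r3, 5))       # error SS          (26.06667)
```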

As noted in Section 6.3.3, “Interpreting Sums of Squares in Reduction Notation,” the parameter estimates printed by PROC GLM are a solution to the normal equations corresponding to the restrictions in equation (6.2). The corresponding conditions on the parameter estimates are

α̂2 = β̂3 = αβ̂13 = αβ̂21 = αβ̂22 = αβ̂23 = 0    (6.4)

These values, which now equal 0, appear in Output 6.8.

Section 6.3.4, “Interpreting Sums of Squares in the μ-Model Notation,” also shows that the parameters of the μ model relate to the parameters of the standard analysis-of-variance model (see equation 6.3). A corresponding relation holds between the respective parameter estimates, namely

ŷij = μ̂ij = μ̂ + α̂i + β̂j + αβ̂ij    (6.5)

Putting equations (6.4) and (6.5) together as shown in the table below gives the interpretation of the parameter estimates printed by PROC GLM. The table below also shows the relationship between means and parameter estimates.

          Sire 1                            Sire 2                            Sire 3
Ration 1  μ̂11 = ȳ11. = μ̂ + α̂1 + β̂1 + αβ̂11   μ̂12 = ȳ12. = μ̂ + α̂1 + β̂2 + αβ̂12   μ̂13 = ȳ13. = μ̂ + α̂1
Ration 2  μ̂21 = ȳ21. = μ̂ + β̂1               μ̂22 = ȳ22. = μ̂ + β̂2               μ̂23 = ȳ23. = μ̂

Note especially the following items in Output 6.8:

❏ The intercept μ̂ printed by PROC GLM is the cell mean ȳ23. for the lower right-hand cell (ration 2, sire 3).

❏ The estimate α̂1 = −4.4 is the difference between the cell means for the two rations fed to sire 3, α̂1 = ȳ13. − ȳ23..

❏ The interaction parameter estimate αβ̂11 = 7.4 is the interaction of rations 1 and 2 by sires 1 and 3, αβ̂11 = ȳ11. − ȳ13. − ȳ21. + ȳ23.. Generally, in a two-way layout with a rows and b columns, the interaction parameter estimate αβ̂ij = ȳij. − ȳib. − ȳaj. + ȳab. measures the interaction of rows i and a by columns j and b.

6.3.6 The MEANS, LSMEANS, CONTRAST, and ESTIMATE Statements in a Two-Way Layout

The parameter estimates printed by PROC GLM are the result of a computational algorithm and may or may not be the estimates with the greatest practical value. However, there is no single choice of estimates (corresponding to a particular generalized inverse or set of restrictions) that satisfies the requirements of all applications. In most instances, specific estimable functions of these parameter estimates, like the estimates obtained with the LSMEANS, CONTRAST, and ESTIMATE statements, can be used to provide more useful estimates. The CONTRAST and ESTIMATE statements for balanced data applications are discussed in Chapter 2.

The CONTRAST, LSMEANS, and ESTIMATE statements are similar for one- and two-way models, but principles and interpretations become more complex. Consider the results from the following SAS statements:

proc glm;
   class a b;
   model y=a b a*b;
   means a b;
   lsmeans a b / stderr;
   contrast 'A EFFECT' a -1 1;
   contrast 'B 1 vs 2 & 3' b -2 1 1;
   contrast 'B 2 vs 3' b 0 -1 1;
   contrast 'ALL B' b -2 1 1, b 0 -1 1;
   contrast 'A*B 2 vs 3' a*b 0 1 -1 0 -1 1;
   estimate 'B2, B3 MEAN' intercept 1 a .5 .5 b 0 .5 .5
                                 a*b 0 .25 .25 0 .25 .25;
   estimate 'A in B1' a -1 1 a*b -1 0 0 1;

The MEANS statement provides the raw or unadjusted main-effect and interaction means. The LSMEANS statement produces least-squares (adjusted) means for main effects together with their standard errors. The results of these two statements are combined in Output 6.9. (PROC GLM prints these results on separate pages.)

Output 6.9 Results of the MEANS and LSMEANS Statements for a Two-Way Classification

The GLM Procedure
 
Level of   --------------y--------------
a N     Mean Std Dev
 
1 8     4.37500000 2.13390989
2 10     5.70000000 2.35937845
 
Level of   --------------y--------------
b N     Mean Std Dev
 
1 4     4.00000000 1.82574186
2 8     6.00000000 2.50713268
3 6     5.00000000 1.54919334
 
The GLM Procedure
Least Squares Means
 
    Standard     
a y LSMEAN Error   Pr > |t|
 
1 3.70000000 0.64055339   <.0001
2 5.41111111 0.49940294   <.0001
 
 
    Standard    
b y LSMEAN Error   Pr > |t|
 
1 4.00000000 0.73692303   0.0002
2 6.46666667 0.53817249   <.0001
3 3.20000000 0.80725874   0.0019

The raw and least-squares means are different for all levels except B1, which is balanced with respect to factor A.

Quantities estimated by the raw means and least-squares means can be expressed in terms of the μ model. For level 1 of factor A, the raw mean (4.375) is an estimate of (2μ11 + 5μ12 + μ13)/(2 + 5 + 1), whereas the least-squares mean (3.700) is an estimate of (μ11 + μ12 + μ13)/3. The raw means estimate weighted averages of the μij whose weights are a function of sample sizes. The least-squares means estimate unweighted averages of the μij.
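
This distinction can be checked from the cell statistics of the data in Output 6.7. The following Python sketch (illustrative, not SAS) reproduces the factor A values in Output 6.9:

```python
# Cell means and cell sizes from the Harvey data in Output 6.7
ybar = {(1, 1): 5.5, (1, 2): 4.6, (1, 3): 1.0,
        (2, 1): 2.5, (2, 2): 25 / 3, (2, 3): 5.4}
n = {(1, 1): 2, (1, 2): 5, (1, 3): 1, (2, 1): 2, (2, 2): 3, (2, 3): 5}

for i in (1, 2):
    # Raw mean: cell means weighted by cell sizes (MEANS statement)
    raw = sum(n[i, j] * ybar[i, j] for j in (1, 2, 3)) / sum(n[i, j] for j in (1, 2, 3))
    # Least-squares mean: unweighted average of cell means (LSMEANS statement)
    lsm = sum(ybar[i, j] for j in (1, 2, 3)) / 3
    print(i, round(raw, 4), round(lsm, 4))
# 1 4.375 3.7
# 2 5.7 5.4111
```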

The results of the five CONTRAST statements appear in Output 6.10.

Output 6.10 Contrast in a Two-Way Classification

Dependent Variable: y  
 
 Contrast  DF  Contrast SS  Mean Square  F Value  Pr > F
 
 A effect 1 9.64065041 9.64065041 4.44 0.0569
 B 1 vs 2 & 3 1 1.93798450 1.93798450 0.89 0.3635
 B 2 vs 3 1 24.62564103 24.62564103 11.34 0.0056
 ALL B 2 30.86591760 15.43295880 7.10 0.0092
 A*B 2 vs 3 1 0.25641026 0.25641026 0.12 0.7371

The first four CONTRAST statements are similar to those presented for the one-way structure. Note that when a contrast uses all available degrees of freedom for the factor (such as the ALL B contrast), the sums of squares are the same as the Type III sums of squares for the factor.

The fifth CONTRAST statement requests the interaction between the factor A contrast and the B2 vs 3 contrast. It is constructed by computing the product of corresponding main-effect contrasts for each AB treatment combination. The procedure is illustrated in the table below:

Construction of Interaction Contrast

                              Level of Factor B
                                1      2      3
              Factor A
Level of      Contrast    Factor B Contrast
Factor A                        0      1     −1

    1             1             0      1     −1
    2            −1             0     −1      1

Main-effect contrasts are given on the top and left, and interaction contrasts (products of marginal entries) are given in the body of the table. These are inserted into the CONTRAST statement in the same order of interaction cells as indicated by the CLASS statement (levels of B within levels of A).

In terms of the μ model, the hypothesis tested by the F-statistic for this interaction contrast is

H0: μ12 − μ13 − μ22 + μ23 = 0
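
The coefficient products and the resulting estimate are easy to verify. This Python sketch (an illustration, not SAS) forms the products and evaluates the contrast at the cell means from Output 6.7:

```python
# Interaction contrast coefficients: products of the main-effect contrasts
a_con = [1, -1]          # factor A contrast
b_con = [0, 1, -1]       # B 2 vs 3 contrast
ab_con = [ai * bj for ai in a_con for bj in b_con]
print(ab_con)            # [0, 1, -1, 0, -1, 1] -- as in the CONTRAST statement

# Evaluate the contrast at the cell means (order 11, 12, 13, 21, 22, 23)
cell_means = [5.5, 4.6, 1.0, 2.5, 25 / 3, 5.4]
est = sum(c * m for c, m in zip(ab_con, cell_means))
print(round(est, 4))     # 0.6667: estimate of mu12 - mu13 - mu22 + mu23
```

Note that this value, 0.6667, matches the αβ̂12 estimate in Output 6.8, and the p-value for the A*B 2 vs 3 contrast (0.7371) matches the t-test for that parameter.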

The two ESTIMATE statements request estimates of linear functions of the model parameters. The first function to be estimated is the average of the cell means for levels 2 and 3 of factor B. The other statement requests an estimate of the effect of factor A within level 1 of factor B, or an estimate of μ21 – μ11. Output 6.11 summarizes results from the ESTIMATE statements.

Output 6.11 Parameter Estimates in a Two-Way Classification

    Standard    
Parameter Estimate Error  t Value  Pr > |t|
 
B2, B3 MEAN 4.83333333 0.48510213 9.96 <.0001
A in B1 -3.00000000 1.47384606 -2.04 0.0645

Expressing each comparison in terms of model parameters αi, βj, (αβ)ij is the key to filling in the coefficients of the CONTRAST and ESTIMATE statements. Consider the ESTIMATE A in B1 statement, which is used to estimate μ21 – μ11. Writing this expression as a function of the model parameters by substituting

μij = μ + αi + βj + αβij

yields

μ21 − μ11 = −α1 + α2 − αβ11 + αβ21

The −α1 + α2 term tells you to insert A -1 1 into the ESTIMATE statement. There are no β’s in the function, so no B expression appears in the ESTIMATE statement. The −αβ11 + αβ21 term tells you to insert A*B -1 0 0 1 0 0 or, equivalently, A*B -1 0 0 1 into the ESTIMATE statement. The ordering in the statement

class a b;

specifies that the ordering of the coefficients following A*B corresponds to αβ11 αβ12 αβ13 αβ21 αβ22 αβ23. The SAS statement

class b a;

would indicate an ordering that corresponds to αβ11 αβ21 αβ12 αβ22 αβ13 αβ23.

Now consider the CONTRAST A statement. The hypothesis to be tested is

H0: −(μ11 + μ12 + μ13)/3 + (μ21 + μ22 + μ23)/ 3 = 0

Substituting μij = μ + αi + βj + αβij gives the equivalent hypothesis:

H0: −α1 + α2 − (αβ11 + αβ12 + αβ13)/3
  + (αβ21 + αβ22 + αβ23) / 3 = 0

Again the −α1 + α2 term in the function tells you to insert A -1 1. This brings up an important usage note: Specifying A -1 1 causes the coefficients of the A*B interaction term to be automatically included by PROC GLM. That is, the SAS statement

contrast 'A' a -1 1;

is equivalent to the statement

contrast 'A' a -1 1 a*b -.333333 -.333333 -.333333 .333333 .333333 .333333;

Similarly, the SAS statement

estimate 'A EFFECT' a -1 1;

provides an estimate of

– (μ11 + μ12 + μ13)/3 + (μ21 + μ22 + μ23)/3

without explicitly specifying the coefficients of the αβij terms. However, you should note that specifying the αβij coefficients does not cause PROC GLM to automatically include coefficients for αi or βj. For example, the term A -1 1 must appear in the ESTIMATE A in B1 statement. Similarly, a contrast to test H0: −μ11 + μ21 = 0 requires the following statement:

contrast 'A in B1' a -1 1 a*b -1 0 0 1;

The A -1 1 term must be included.

6.3.7 Estimable Functions for a Two-Way Classification

The previous section discussed the application of the CONTRAST statement, which employs the concept of estimable functions. PROC GLM can display the construction of estimable functions as an optional request in the MODEL, LSMEANS, CONTRAST, and ESTIMATE statements. This section discusses the construction of estimable functions and their relation to the sums of squares and associated hypotheses available in the GLM procedure, and to CONTRAST, ESTIMATE, and LSMEANS statements. The presentation of estimable functions consists of results obtained using the unbalanced factorial data given in Output 6.7. For more thorough discussions of these principles, see Graybill (1976) and Searle (1971).

6.3.7.1 The General Form of Estimable Functions

The general form of estimable functions is a vector of elements that are the building blocks for generating specific estimable functions. The number of unique symbols in the vector is the maximum number of linearly independent estimable functions, which is equal to the rank of the X′X matrix. In the GLM procedure, the general form is displayed by the E option in the MODEL statement:

proc glm;
   class a b;
   model y=a b a*b / e solution;

Table 6.2 gives the vector of coefficients of the general form of estimable functions for our example. There are only six elements (L1, L2, L4, L5, L7, L8), which correspond to the number of degrees of freedom in the model (including the intercept). The number of elements for an effect corresponds to the degrees of freedom for that effect; for example, L4 and L5 are introduced opposite the effect B, indicating B has 2 degrees of freedom.

Table 6.2 General Form of Estimable Functions

Effect           Parameters*   Coefficients
Intercept        μ             L1
A          1     α1            L2
           2     α2            L1 − L2
B          1     β1            L4
           2     β2            L5
           3     β3            L1 − L4 − L5
A*B      1 1     αβ11          L7
         1 2     αβ12          L8
         1 3     αβ13          L2 − L7 − L8
         2 1     αβ21          L4 − L7
         2 2     αβ22          L5 − L8
         2 3     αβ23          L1 − L2 − L4 − L5 + L7 + L8

* These are implied by the output but not printed in this manner.

According to Table 6.2, any estimable function Lβ must be of the form

Lβ = L1μ + L2α1 + (L1 − L2)α2 + L4β1 + L5β2 + (L1 − L4 − L5)β3 + L7αβ11 + L8αβ12 + (L2 − L7 − L8)αβ13 + (L4 − L7)αβ21 + (L5 − L8)αβ22 + (L1 − L2 − L4 − L5 + L7 + L8)αβ23    (6.6)

for some specific values of L1 through L8. The various tests in PROC GLM test hypotheses of the form H0: Lβ =0.

Coefficients for any specific estimable function are constructed by assigning values to the individual L’s. For example, setting L2=1 and all others equal to 0 provides the estimable function α1 − α2 + αβ13 − αβ23. It is clear, however, that no estimable function can be constructed in this manner to equal α1 or α2 individually. That is, no matter what values you choose for L1 through L8, you cannot make Lβ = α1 or Lβ = α2. This is because α1 and α2 are non-estimable; without additional restrictions there is no linear function of the data whose expected value is α1 or α2.
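The general form lends itself to a quick computational check. The following Python sketch is not from the text; it simply encodes the coefficients of Table 6.2 so that specific estimable functions can be generated by assigning values to L1 through L8:

```python
# Encode Table 6.2 (a sketch, not SAS output): the coefficient of each
# model parameter in the general form of estimable functions.
def estimable_coeffs(L1=0, L2=0, L4=0, L5=0, L7=0, L8=0):
    return {
        "mu":   L1,
        "a1":   L2,         "a2":   L1 - L2,
        "b1":   L4,         "b2":   L5,        "b3":   L1 - L4 - L5,
        "ab11": L7,         "ab12": L8,        "ab13": L2 - L7 - L8,
        "ab21": L4 - L7,    "ab22": L5 - L8,
        "ab23": L1 - L2 - L4 - L5 + L7 + L8,
    }

# Setting L2=1 (all others 0) yields alpha1 - alpha2 + alphabeta13 - alphabeta23:
nonzero = {k: v for k, v in estimable_coeffs(L2=1).items() if v != 0}
print(nonzero)   # {'a1': 1, 'a2': -1, 'ab13': 1, 'ab23': -1}
```

No assignment of L1 through L8 leaves only the α1 coefficient nonzero, which is the computational face of the non-estimability of α1 alone.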

6.3.7.2 Interpreting Sums of Squares Using Estimable Functions

The coefficients required to construct estimable functions for each effect in the MODEL statement are available for any type of sum of squares requested as an option in the MODEL statement. For example,

model y = a b a*b / e e1 e2 e3;

will provide the general form, the coefficients of estimable functions for Types I, II, and III, and the corresponding sums of squares for each effect listed in the MODEL statement.

Table 6.3 gives the coefficients of the Type I, Type II, and Type III estimable functions associated with factor A. Types III and IV are identical for this example because all nij > 0.

Table 6.3 Estimable Functions for Factor A

Effect           Type I         Type II        Type III
                 Coefficients   Coefficients   Coefficients
Intercept        0              0              0
A          1     L2             L2             L2
           2     −L2            −L2            −L2
B          1     0.05*L2        0              0
           2     0.325*L2       0              0
           3     −0.375*L2      0              0
A*B      1 1     0.25*L2        0.2697*L2      0.3333*L2
         1 2     0.625*L2       0.5056*L2      0.3333*L2
         1 3     0.125*L2       0.2247*L2      0.3333*L2
         2 1     −0.2*L2        −0.2697*L2     −0.3333*L2
         2 2     −0.3*L2        −0.5056*L2     −0.3333*L2
         2 3     −0.5*L2        −0.2247*L2     −0.3333*L2

All coefficients involve only one element (L2), since the A effect has only 1 degree of freedom. Estimable functions are constructed by assigning specific values to the elements. For factor A, with only one element, the natural choice is L2=1. Applying this to the Type I coefficients generates the estimable function

Lβ = α1 − α2 + 0.05β1 + 0.325β2 − 0.375β3 + 0.25αβ11 + 0.625αβ12 + 0.125αβ13 − 0.2αβ21 − 0.3αβ22 − 0.5αβ23

Thus, using the Type I sum of squares in the numerator of an F-statistic tests the hypothesis Lβ = 0 for this particular L. In addition to α1 − α2, the function involves the factor B parameters

0.05β1 + 0.325β2 − 0.375β3

as well as a function of the interaction parameters. This is to be expected, since the Type I function for A is unadjusted: it is based on the difference between the two A factor means (ȳ1.. − ȳ2..).

As explained in Section 6.3.1, the mean for A level 1 is

ȳ1.. = (1/8)(2ȳ11. + 5ȳ12. + ȳ13.)

Each cell mean ȳij. is an estimate of the function μ + αi + βj + αβij. Omitting for this discussion the interaction parameters, ȳ1.. is an estimate of

(1/8)[2(μ + α1 + β1) + 5(μ + α1 + β2) + (μ + α1 + β3)] = μ + α1 + (0.25β1 + 0.625β2 + 0.125β3)

Likewise, ȳ2.. is an estimate of

μ + α2 + (0.2β1 + 0.3β2 + 0.5β3)

Hence, (ȳ1.. − ȳ2..) is an estimate of

α1 − α2 + 0.05β1 + 0.325β2 − 0.375β3

which is the function provided by Type I.
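As an arithmetic check, the Type I coefficients of the B parameters can be reproduced from the cell frequencies implied by the example (n11, n12, n13 = 2, 5, 1 and n21, n22, n23 = 2, 3, 5). A Python sketch, not from the text:

```python
# Cell frequencies assumed from the text's weights.
n1 = [2, 5, 1]   # A=1 cells at B=1,2,3
n2 = [2, 3, 5]   # A=2 cells at B=1,2,3

p1 = [n / sum(n1) for n in n1]   # weights inside ybar1.. : 0.25, 0.625, 0.125
p2 = [n / sum(n2) for n in n2]   # weights inside ybar2.. : 0.2, 0.3, 0.5
beta_coeffs = [round(a - b, 4) for a, b in zip(p1, p2)]
print(beta_coeffs)   # [0.05, 0.325, -0.375]
```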

The coefficients associated with A*B provide the interaction terms in the Type I estimable functions and are useful for expressing these functions and interpreting the tests in terms of the μ-model. Recall that

μij = μ + αi + βj + αβij

A little algebra shows that any estimable function Lβ with coefficients as given in Table 6.3 can also be written as

Lβ = L7μ11 + L8μ12 + (L2 − L7 − L8)μ13 + (L4 − L7)μ21 + (L5 − L8)μ22 + (L1 − L2 − L4 − L5 + L7 + L8)μ23    (6.7)

This is easily verified by starting with Lβ in equation (6.7), replacing each μij with μ + αi + βj + αβij, and combining terms to end up with the original expression for Lβ in equation (6.6). For example, after factoring out L2, we see that the Type I estimable function for A is

Lβ = L2(0.25μ11 + 0.625μ12 + 0.125μ13 − 0.2μ21 − 0.3μ22 − 0.5μ23)

Thus the Type I F-test for A tests the hypothesis H0: Lβ = 0, or equivalently,

H0: 0.25μ11 + 0.625μ12 + 0.125μ13 = 0.2μ21 + 0.3μ22 + 0.5μ23

This is the hypothesis that is tested in Table 6.1. Since the coefficients are functions of the frequencies of the cells, the hypothesis might not be particularly useful.

Applying the same method to the Type II coefficients for A, we have, after setting L2=1,

Lβ = 0.2697μ11 + 0.5056μ12 + 0.2247μ13 − 0.2697μ21 − 0.5056μ22 − 0.2247μ23
   = 0.2697(μ11 − μ21) + 0.5056(μ12 − μ22) + 0.2247(μ13 − μ23)

This expression sheds some light on the meaning of the Type II coefficients. Recall that the Type II SS are based on a main-effects model. With no interaction we have, for example, μ11 − μ21 = μ12 − μ22 = μ13 − μ23. Let Δ denote the common value of these differences. The Type II coefficients are the coefficients of the best linear unbiased estimate Δ̂ of Δ, given by

Δ̂ = Σj wj(ȳ1j. − ȳ2j.)

where

wj = [n1j n2j/(n1j + n2j)] / Σk [n1k n2k/(n1k + n2k)]

For example,

0.2697 = [(2)(2)/(2+2)] / [(2)(2)/(2+2) + (5)(3)/(5+3) + (1)(5)/(1+5)]

Note that these are functions of cell frequencies and thus do not necessarily generate meaningful hypotheses.
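The weight computation can be verified numerically. This Python sketch, not from the text, uses the example's cell frequencies:

```python
# Type II weights w_j computed from the example's cell frequencies.
n1 = [2, 5, 1]
n2 = [2, 3, 5]

h = [a * b / (a + b) for a, b in zip(n1, n2)]   # n1j*n2j/(n1j+n2j)
w = [round(x / sum(h), 4) for x in h]
print(w)   # [0.2697, 0.5056, 0.2247]
```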

Type III (and Type IV) estimable functions for A (Table 6.3) likewise do not involve the parameters of the B factor. Further, in terms of the parameters of the cell means (μ-model),

Lβ = (1/3)(μ11 + μ12 + μ13) − (1/3)(μ21 + μ22 + μ23)

Thus the Type III F-statistic tests H0: μ̄1. = μ̄2., as stated in Table 6.1. Note again that this hypothesis does not involve the cell frequencies nij.

Table 6.4 gives the coefficients of estimable functions for factor B. There are 2 degrees of freedom for factor B, and thus two elements, L4 and L5. Consider first the Type III coefficients because they are more straightforward. The Type III F-test for factor B simultaneously tests that two linearly independent functions are equal to 0; the functions are obtained by selecting two choices of values for L4 and L5.

The simplest choices are to take L4=1, L5=0 and L4=0, L5=1. This gives the estimable functions

L1β = β1 − β3 + (αβ11 − αβ13 + αβ21 − αβ23)/2

and

L2β = β2 − β3 + (αβ12 − αβ13 + αβ22 − αβ23)/2

In terms of the μ model, this gives

L1β = (μ11 − μ13 + μ21 − μ23) / 2

and

L2β = (μ12 − μ13 + μ22 − μ23) / 2

Thus, the Type III F-statistic tests

H0: μ̄.1 = μ̄.3 and μ̄.2 = μ̄.3

or, in equivalent form

H0: μ̄.1 = μ̄.2 = μ̄.3    (6.8)

Another set of choices is L4=1, L5=−1 and L4=L5=1. These lead to

H0: μ̄.1 = μ̄.2 and (μ̄.1 + μ̄.2)/2 = μ̄.3

which is also equivalent to equation (6.8). Therefore, both sets of choices lead to the same H0. Importantly, the H0 in equation (6.8) does not depend on the cell frequencies and thus is desirable for the usual case where cell frequencies are unrelated to the effects of the factors. Table 6.4 gives the coefficients of the Type I & Type II and Type III & Type IV estimable functions associated with factor B.
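That the two sets of (L4, L5) choices define the same hypothesis can be confirmed by checking that the two pairs span the same space. A Python sketch, not from the text:

```python
import numpy as np

# Rows are (L4, L5) choices; each row determines one estimable function,
# so two pairs define the same hypothesis iff they span the same space.
set1 = np.array([[1, 0], [0, 1]])     # L4=1,L5=0  and  L4=0,L5=1
set2 = np.array([[1, -1], [1, 1]])    # L4=1,L5=-1 and  L4=L5=1

r1 = np.linalg.matrix_rank(set1)
r2 = np.linalg.matrix_rank(set2)
r_both = np.linalg.matrix_rank(np.vstack([set1, set2]))
same_hypothesis = (r1 == r2 == r_both)
print(same_hypothesis)   # True
```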

Table 6.4 Estimable Functions for Factor B

Effect           Type I & Type II          Type III & Type IV
                 Coefficients              Coefficients
Intercept        0                         0
A          1     0                         0
           2     0                         0
B          1     L4                        L4
           2     L5                        L5
           3     −L4 − L5                  −L4 − L5
A*B      1 1     0.4101*L4 − 0.1236*L5     0.5*L4
         1 2     −0.1658*L4 + 0.3933*L5    0.5*L5
         1 3     −0.2416*L4 − 0.2697*L5    −0.5*L4 − 0.5*L5
         2 1     0.5899*L4 + 0.1236*L5     0.5*L4
         2 2     0.1658*L4 + 0.6067*L5     0.5*L5
         2 3     −0.7584*L4 − 0.7303*L5    −0.5*L4 − 0.5*L5

Recall that since B followed A in the MODEL statement, the Type I SS for B is the same as the Type II SS for B. The coefficients are again functions of the cell frequencies. The nature of the function is not easy to determine, but it is similar to that of the Type II coefficients for factor A (see Table 6.3).

As a matter of computational interest, the Type II estimable functions for B are equal to the Type III estimable functions if there is no interaction. For then αβij = 0, and Table 6.4 shows that the coefficients for αi and βj are the same for Types II and III. This is not to say that the Type II SS and Type III SS will be equal, but rather that they give tests of the same hypothesis when there is no interaction. If, indeed, there is no interaction, then the Type II F-test is more powerful than the Type III F-test. The assumption of no interaction is, however, probably rarely satisfied in nature.

Table 6.5 Estimable Functions for A*B

Effect           Coefficients for All Types
Intercept        0
A          1     0
           2     0
B          1     0
           2     0
           3     0
A*B      1 1     L7
         1 2     L8
         1 3     −L7 − L8
         2 1     −L7
         2 2     −L8
         2 3     L7 + L8

Table 6.5 gives the coefficients of the estimable functions for the A*B interaction; again, two elements (L7 and L8) are available. In this case all types give the same results, since for each type the interaction effects are adjusted for all other effects. The estimable functions can be readily interpreted if the coefficients are recorded in the 2×3 cell format implied by the factorial array. For example, let L7 = −1 and L8 = 0; the resulting function can be illustrated as follows:

              B
          1     2     3
A    1   −1     0    +1
     2   +1     0    −1

This is the interaction in the 2×2 subtable consisting of the columns for B1 and B3, or the interaction of the contrast (α1 − α2) with the contrast (β1 − β3).
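The same coefficients can be generated as the outer product of a row contrast for A and a column contrast for B; this Python sketch assumes nothing beyond Table 6.5:

```python
import numpy as np

# With L7 = -1 and L8 = 0, the A*B coefficients of Table 6.5 are the
# outer product of the A contrast (-1, +1) and the B contrast (+1, 0, -1).
cell_coeffs = np.outer([-1, 1], [1, 0, -1])
print(cell_coeffs)
# [[-1  0  1]
#  [ 1  0 -1]]
```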

6.3.7.3 Estimating Estimable Functions

Estimates of estimable functions can be obtained by multiplying the vector of coefficients by the vector of parameter estimates from the SOLUTION option. For example, letting L2=1 in Table 6.3 for Type I results in the vector

L = (0, 1, −1, .05, .325, −.375, .25, .625, .125, −.2, −.3, −.5)

The vector of parameter estimates (see Output 6.7) is

β̂ = (5.4 −2.4 .0 −2.9 2.933 .0 5.4 −1.33 .0 .0 .0 .0)

The estimate Lβ̂ = 1.075 is equal to ȳ1.. − ȳ2.., the unadjusted treatment difference. Likewise, using the Type III coefficients gives an estimate of 1.044, which is the difference between the two least-squares means of the A factor (see Table 6.8).

Variances of these estimates can be obtained by the standard formula for the variance of a linear function. The estimated variance of Lβ̂ is s²L(X′X)⁻L′, where s², the estimated error variance, is the residual mean square from the overall analysis of variance, and (X′X)⁻ is a generalized inverse of the X′X matrix generated by the dummy variables. The square root of this variance provides the standard error of the estimated function; hence, a t-test is readily constructed.
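The following Python sketch illustrates the computation on a toy one-way layout (the data and cell sizes are invented for illustration, not the book's example); numpy's pseudoinverse serves as the generalized inverse:

```python
import numpy as np

# Dummy-variable X for y = mu + alpha_i, levels A1 (3 obs) and A2 (2 obs).
X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]], dtype=float)
y = np.array([3.0, 4.0, 5.0, 7.0, 9.0])

XtX_inv = np.linalg.pinv(X.T @ X)          # a generalized inverse of X'X
beta_hat = XtX_inv @ X.T @ y               # one solution to the normal equations
resid = y - X @ beta_hat
s2 = resid @ resid / (len(y) - np.linalg.matrix_rank(X))   # residual MS

L = np.array([0.0, 1.0, -1.0])             # alpha1 - alpha2 (estimable)
estimate = L @ beta_hat                    # invariant to choice of g-inverse
std_err = np.sqrt(s2 * (L @ XtX_inv @ L))  # sqrt of s^2 * L (X'X)^- L'
t_stat = estimate / std_err
```

Because L is estimable, both the estimate and its standard error are invariant to the particular generalized inverse used.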

Since the LSMEANS, CONTRAST, and ESTIMATE statements offer methods of estimating and testing the most desired functions, the preceding technique is seldom employed. However, if these statements produce functions that are nonestimable, the generation of estimates from scratch may provide otherwise unobtainable estimates.

6.3.7.4 Interpreting LSMEANS, CONTRAST, and ESTIMATE Results Using Estimable Functions

Sometimes it may be useful to examine the construction of estimable functions associated with the LSMEANS, CONTRAST, and ESTIMATE statements. Information on the construction of these functions is available by specifying E as one of the options in the statement. (Don’t confuse this with the E=effect option, which specifies an alternate error term.) Output 6.12 shows the results from the E option:

proc glm;
   class a b;
   model y=a b a*b / solution;
   contrast 'B 2 vs 3' b 0 -1 1 / e;
   estimate 'A in B1' a -1 1 a*b -1 0 0 1 / e;
   lsmeans a / stderr e;

Output 6.12 Estimable Functions for the LSMEANS, CONTRAST, and ESTIMATE Statements

Coefficients for Contrast B 2 vs 3
 
Row 1
 
Intercept 0
 
a 1 0
a 2 0
 
b 1 0
b 2 -1
b 3 1
 
a*b 1 1 0
a*b 1 2 -0.5
a*b 1 3 0.5
a*b 2 1 0
a*b 2 2 -0.5
a*b 2 3 0.5
 
Coefficients for Estimate A in B1
 
Row 1
 
Intercept 0
a 1 -1
a 2 1
 
b 1 0
b 2 0
b 3 0
 
a*b 1 1 -1
a*b 1 2 0
a*b 1 3 0
a*b 2 1 1
a*b 2 2 0
a*b 2 3 0
 
Coefficients for a Least Square Means
 
a Level
Effect 1 2
 
Intercept 1 1
a 1   1 0
a 2   0 1
b 1   0.33333333 0.33333333
b 2   0.33333333 0.33333333
b 3   0.33333333 0.33333333
a*b 1 1   0.33333333 0
a*b 1 2   0.33333333 0
a*b 1 3   0.33333333 0
a*b 2 1   0 0.33333333
a*b 2 2   0 0.33333333
a*b 2 3   0 0.33333333

The hypothesis tested by the CONTRAST statement is

H0: − β2 + β3 − .5αβ12 + .5αβ13 − .5αβ22 + .5αβ23 = 0

or in the μ-model notation

H0: −.5μ12 + .5μ13 − .5μ22 + .5μ23 = 0

Note that the coefficients of the interaction effects are supplied by the procedure.

The function estimated by the ESTIMATE statement is

Lβ = −α1 + α2 − αβ11 + αβ21

or in μ-model notation

Lβ = −μ11 + μ21

The least-squares mean for A1 estimates

μ + α1 + (β1 + β2 + β3 + αβ11 + αβ12 + αβ13) / 3

or, in μ-model notation

(μ11 + μ12 + μ13) / 3
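The distinction between the least-squares mean (an unweighted average of cell means) and the raw factor mean (weighted by cell size) can be illustrated with a short Python sketch; the cell means here are invented, while the cell sizes follow the A1 row of the example:

```python
# Invented cell means for illustration; cell sizes 2, 5, 1 as in the example.
cell_means = [10.0, 20.0, 30.0]        # ybar_11., ybar_12., ybar_13.
n = [2, 5, 1]

lsmean = sum(cell_means) / 3           # unweighted: (mu11 + mu12 + mu13)/3
raw_mean = sum(m * k for m, k in zip(cell_means, n)) / sum(n)
print(lsmean, raw_mean)   # 20.0 18.75
```

With unbalanced data the two generally differ, which is why the LSMEANS statement, rather than the MEANS statement, estimates the μ-model quantity above.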
