6.2. The Mixed Effects Linear Model

Let yi be the pi × 1 vector of repeated measures on the ith subject. Then consider a mixed effects model described as

yi = Xiβ + Ziνi + ϵi,  i = 1,...,n,   (6.1)

where Xi and Zi are the known matrices of orders pi by q and pi by r respectively, and β is the fixed q by 1 vector of unknown (nonrandom) parameters. The r by 1 vectors νi are random effects with E(νi)=0 and D(νi)=σ2G1. Finally, ϵi are the pi by 1 vectors of random errors whose elements are no longer required to be uncorrelated. We assume that E(ϵi)=0, D(ϵi)=σ2Ri, cov(νi, ϵi)=0, cov(ϵi, ϵi′)=0, cov(νi, νi′)=0 for all i ≠ i′, and cov(νi, ϵi′)=0. Such assumptions seem reasonable for repeated measures data, where subjects are assumed to be independent, yet the repeated observations on a given subject may be correlated. Note here that Ri is the appropriate pi × pi submatrix of a p × p positive definite matrix, where p is the number of time points in the data set at which observations have been made. An appropriate covariance structure can be assigned to the data by a suitable choice of the matrices G1 and Ri. Note that since yi is a pi by 1 vector, i = 1,...,n, the model can accommodate unbalanced repeated measures data, that is, data in which not all subjects have been observed at all time points.

The n submodels in Equation 6.1 can be stacked one below the other to give a single model

(y1′,...,yn′)′ = (X1′,...,Xn′)′β + diag(Z1,...,Zn)(ν1′,...,νn′)′ + (ϵ1′,...,ϵn′)′,

or

y = Xβ + Zν + ϵ,   (6.2)

where the definitions of y, X, Z, ν, and ϵ in terms of the matrices and vectors of the submodels are self-explanatory. In view of the assumptions made on Equation 6.1, we have E(ν)=0, E(ϵ)=0,

D(ν) = σ2(In ⊗ G1) = σ2G, say,

and

D(ϵ) = σ2 diag(R1,...,Rn) = σ2R, say.
The symbol ⊗ here stands for the Kronecker product (Rao, 1973), defined for two matrices Us × t=(uij) and Wl × m=(wij) as the sl × tm block matrix

U ⊗ W = (uijW),

whose (i, j)th block is the l × m matrix uijW.
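As a concrete check of this definition, the following minimal sketch (assuming numpy is available; the matrices are hypothetical) verifies the block pattern, which `np.kron` implements directly, and builds D(ν) = σ2(In ⊗ G1):

```python
import numpy as np

# Hypothetical small matrices to check the definition of U (x) W.
U = np.array([[1.0, 2.0],
              [3.0, 4.0]])          # s x t = 2 x 2
W = np.array([[0.0, 1.0, 2.0],
              [3.0, 4.0, 5.0]])     # l x m = 2 x 3

K = np.kron(U, W)                   # (s*l) x (t*m) = 4 x 6

# The (i, j)th l x m block of K is u_ij * W, exactly as in the definition.
for i in range(2):
    for j in range(2):
        assert np.allclose(K[2*i:2*i+2, 3*j:3*j+3], U[i, j] * W)

# D(nu) = sigma^2 (I_n (x) G1): n copies of G1 down the block diagonal.
n, sigma2 = 3, 2.0
G1 = np.array([[1.0, 0.5],
               [0.5, 2.0]])
D_nu = sigma2 * np.kron(np.eye(n), G1)
assert np.allclose(D_nu[2:4, 2:4], sigma2 * G1)   # a diagonal block
assert np.allclose(D_nu[0:2, 2:4], 0.0)           # an off-diagonal block
```

The block-diagonal form of D(ν) reflects the assumed independence of the random effects across subjects.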
It follows from Equation 6.2 that

D(y) = σ2(ZGZ′ + R) = σ2V, say.
The above representation D(y) = σ2V is adopted as a convenience in the algorithm of the MIXED procedure. It may be remarked that in many situations the variance covariance matrix of y may not be in the above form, where the parameter σ2 has been explicitly factored out. However, with appropriate (but not necessarily unique) modifications of the matrices G and R, some parameter σ2 (not necessarily unique) can always be factored out. For example, to factor out σ2, one only needs to divide all elements of G and R by σ2 and use these reparametrized versions to define the appropriate V. Thus, there is no loss of generality in defining the covariance structure in this way, as is done in the MIXED procedure algorithm.
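The block construction of D(y) = σ2(ZGZ′ + R) and the reparametrization argument above can be sketched numerically as follows (a hypothetical numpy example with made-up, deliberately unbalanced pi):

```python
import numpy as np

rng = np.random.default_rng(0)

# Unbalanced design: subject i contributes p_i observations (hypothetical sizes).
p = [3, 2, 4]
n, r = len(p), 2
N = sum(p)

# Block-diagonal Z = diag(Z1, ..., Zn) assembled by hand.
Z_blocks = [rng.standard_normal((pi, r)) for pi in p]
Z = np.zeros((N, n * r))
row = 0
for i, Zi in enumerate(Z_blocks):
    Z[row:row + p[i], i * r:(i + 1) * r] = Zi
    row += p[i]

# G = I_n (x) G1 and R = diag(R1, ..., Rn); here each Ri is an identity block.
G1 = np.array([[1.0, 0.3],
               [0.3, 0.5]])
G = np.kron(np.eye(n), G1)
R = np.eye(N)

sigma2 = 1.7
V = Z @ G @ Z.T + R
Sigma = sigma2 * V              # D(y) = sigma^2 (Z G Z' + R) = sigma^2 V

# Factoring sigma^2 out is only a reparametrization of G and R:
G_star, R_star = sigma2 * G, sigma2 * R
assert np.allclose(Z @ G_star @ Z.T + R_star, Sigma)
```

The final assertion is the no-loss-of-generality point: dividing G and R by σ2 (here, going from G_star, R_star to G, R) leaves the implied D(y) unchanged.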

6.2.1. Estimation of Effects When V Is Known

If G1 and R1,...,Rn are assumed to be known, then the Best (minimum mean squared error) Linear Unbiased Estimator (BLUE) of β, obtained by generalized least squares, is given by (assuming that it uniquely exists)

β̂ = (X′(ZGZ′ + R)−1X)−1X′(ZGZ′ + R)−1y = (X′V−1X)−1X′V−1y.   (6.3)
The variance covariance matrix of β̂ is

D(β̂) = σ2(X′(ZGZ′ + R)−1X)−1 = σ2(X′V−1X)−1.
Similarly, the Best Linear Unbiased Predictor (BLUP) of ν is given by ν̂ = GZ′(ZGZ′ + R)−1(y − Xβ̂). Further, an unbiased estimator of σ2 is obtained as

σ̂2 = (y − Xβ̂)′V−1(y − Xβ̂)/ν2,

where N = p1 + ··· + pn is the total number of observations and ν2 = N − Rank(X) is the error degrees of freedom. If X′(ZGZ′ + R)−1X does not admit an inverse, then for most estimation problems a generalized inverse would replace the inverse in Equation 6.3, provided estimability of the functions under consideration has been ensured.
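A minimal numerical sketch of the BLUE, its covariance factor, and the unbiased estimator of σ2 when V is known (numpy; the AR(1)-type V and all dimensions below are illustrative assumptions, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
N, q = 12, 2
X = np.column_stack([np.ones(N), rng.standard_normal(N)])
beta = np.array([1.0, -2.0])

# A known positive definite V; this AR(1)-type correlation is an illustrative choice.
idx = np.arange(N)
V = 0.6 ** np.abs(idx[:, None] - idx[None, :])

# Simulate y with D(y) = sigma^2 V, taking sigma^2 = 1.
y = X @ beta + np.linalg.cholesky(V) @ rng.standard_normal(N)

# BLUE (generalized least squares) and its covariance factor (X'V^{-1}X)^{-1}.
Vinv = np.linalg.inv(V)
XtVinvX = X.T @ Vinv @ X
beta_hat = np.linalg.solve(XtVinvX, X.T @ Vinv @ y)
cov_factor = np.linalg.inv(XtVinvX)     # D(beta_hat) = sigma^2 * cov_factor

# Unbiased estimator of sigma^2 with nu2 = N - Rank(X) error degrees of freedom.
resid = y - X @ beta_hat
nu2 = N - np.linalg.matrix_rank(X)
sigma2_hat = resid @ Vinv @ resid / nu2
```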

The above BLUE of β and the BLUP of ν can also be obtained by solving the system of mixed model equations (Henderson, 1984),

(X′R−1X)β̂ + (X′R−1Z)ν̂ = X′R−1y,
(Z′R−1X)β̂ + (Z′R−1Z + G−1)ν̂ = Z′R−1y.
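That the mixed model equations reproduce the GLS BLUE and the BLUP can be verified numerically; the random-intercept design below is a made-up illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy random-intercept design: 4 subjects, 3 observations each (hypothetical sizes).
n_subj, p_i, q = 4, 3, 2
N = n_subj * p_i
X = np.column_stack([np.ones(N), rng.standard_normal(N)])
Z = np.kron(np.eye(n_subj), np.ones((p_i, 1)))   # one random intercept per subject
G = 0.8 * np.eye(n_subj)
R = np.eye(N)
y = rng.standard_normal(N)

# Direct GLS BLUE and BLUP with V = ZGZ' + R.
V = Z @ G @ Z.T + R
Vinv = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
nu_blup = G @ Z.T @ Vinv @ (y - X @ beta_gls)

# Henderson's mixed model equations: one symmetric system in (beta, nu).
Rinv, Ginv = np.linalg.inv(R), np.linalg.inv(G)
C = np.block([[X.T @ Rinv @ X, X.T @ Rinv @ Z],
              [Z.T @ Rinv @ X, Z.T @ Rinv @ Z + Ginv]])
rhs = np.concatenate([X.T @ Rinv @ y, Z.T @ Rinv @ y])
sol = np.linalg.solve(C, rhs)

assert np.allclose(sol[:q], beta_gls)   # same BLUE
assert np.allclose(sol[q:], nu_blup)    # same BLUP
```

The practical appeal of Henderson's system is that it involves R−1 and G−1 rather than the typically much larger V−1.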
If, in addition, multivariate normality is assumed for νi and ϵi, i = 1,...,n, then

y ~ N(Xβ, σ2(ZGZ′ + R)) = N(Xβ, σ2V).

In this case β̂ and ν̂ are also the maximum likelihood estimator and the maximum likelihood predictor of β and ν respectively.

Consider the problem of testing a linear hypothesis of the form H0: Lβ = 0, where L is a full (row) rank matrix. Then the usual test statistic for testing H0 is

F = β̂′L′(L(X′V−1X)−1L′)−1Lβ̂/(ν1σ̂2),

which under the null hypothesis H0 is distributed as Fν1,ν2, where ν1 = Rank(L), ν2 is the error degrees of freedom, and V = (ZGZ′ + R).
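A sketch of this F statistic for a hypothetical two-row L (numpy; with V taken as the identity for simplicity, the statistic reduces to the classical regression F test):

```python
import numpy as np

rng = np.random.default_rng(3)
N, q = 20, 3
X = np.column_stack([np.ones(N), rng.standard_normal((N, q - 1))])
V = np.eye(N)                       # known V; identity keeps the sketch simple
y = X @ np.array([1.0, 0.0, 0.0]) + rng.standard_normal(N)

# H0: L beta = 0 tests that the two slope coefficients are both zero.
L = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

Vinv = np.linalg.inv(V)
XtVX_inv = np.linalg.inv(X.T @ Vinv @ X)
beta_hat = XtVX_inv @ X.T @ Vinv @ y

nu1 = np.linalg.matrix_rank(L)
nu2 = N - np.linalg.matrix_rank(X)
resid = y - X @ beta_hat
sigma2_hat = resid @ Vinv @ resid / nu2

Lb = L @ beta_hat
F = Lb @ np.linalg.solve(L @ XtVX_inv @ L.T, Lb) / (nu1 * sigma2_hat)
# Under H0, F is distributed as F with (nu1, nu2) degrees of freedom.
```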

6.2.2. Estimation of σ2 and V

When the matrices G and/or R (or V) are unknown, they can be estimated using the standard likelihood based methods (i.e., ML or REML) under the assumption of joint multivariate normality of ν and ϵ. In practice, a certain structure is assumed on one or both of these matrices so that V is a function of only a few unknown parameters, say θ1,...,θs. The methods are iterative: first, for a fixed value of V, an estimator of β is obtained using the form of the BLUE; then the likelihood function is maximized with respect to θ1,...,θs to get an estimate of V. These two steps are iterated until a certain user specified convergence criterion is met.
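The two-step scheme can be sketched for a hypothetical one-parameter structure V(θ) = θZZ′ + I (a random-intercept variance ratio), profiling β and σ2 at each candidate θ and minimizing −2 times the resulting log-likelihood; a grid search stands in here for the iterative maximization:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical structure: V(theta) = theta * Z Z' + I, theta the variance ratio.
n_subj, p_i = 5, 4
N = n_subj * p_i
Z = np.kron(np.eye(n_subj), np.ones((p_i, 1)))
X = np.column_stack([np.ones(N), rng.standard_normal(N)])
theta_true = 1.5
V_true = theta_true * Z @ Z.T + np.eye(N)
y = X @ np.array([2.0, 1.0]) + np.linalg.cholesky(V_true) @ rng.standard_normal(N)

def neg2_profile_loglik(theta):
    # Step 1: for fixed V(theta), compute the BLUE-form estimator of beta.
    V = theta * Z @ Z.T + np.eye(N)
    Vinv = np.linalg.inv(V)
    beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
    # Step 2: profile sigma^2 = r'V^{-1}r / N; additive constants are dropped.
    r = y - X @ beta
    _, logdet = np.linalg.slogdet(V)
    return logdet + N * np.log(r @ Vinv @ r)

grid = np.concatenate([[1.0], np.linspace(0.01, 5.0, 120)])
theta_hat = min(grid, key=neg2_profile_loglik)
```

In practice PROC MIXED uses Newton-type iterations rather than a grid, but the alternation between a GLS step for β and a maximization step for the covariance parameters is the same.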

The ML estimators of θ1,...,θs, and hence of V (and hence of G and R), and of σ2 are obtained by maximizing the logarithm of the normal likelihood function,

l(β, σ2, θ1,...,θs) = −(1/2)[N log(2πσ2) + log|V| + (y − Xβ)′V−1(y − Xβ)/σ2],

simultaneously with respect to these parameters, where N = p1 + ··· + pn. The ML estimator of σ2, expressed in terms of the ML estimates β̂ and V̂, will be σ̂2 = (y − Xβ̂)′V̂−1(y − Xβ̂)/N. The ML estimates of θ1,...,θs generally have to be obtained using iterative numerical schemes.

Alternatively, estimators of θ1,...,θs, and finally of σ2, can be obtained by maximizing the function

l(θ1,...,θs) = −(1/2) log|V| − (N/2) log(r′V−1r) − (N/2)[1 + log(2π/N)],

where r = y − X(X′V−1X)−X′V−1y, which is obtained from the log-likelihood function after factoring out and profiling the residual variance σ2 = r′V−1r/N.

Similarly, another set of estimators, commonly known as the Restricted Maximum Likelihood (REML) estimators, is obtained by maximizing the function (after profiling σ2)

lR(θ1,...,θs) = −(1/2) log|V| − (1/2) log|X′V−1X| − ((N − k)/2) log(r′V−1r) − ((N − k)/2)[1 + log(2π/(N − k))],

where k = Rank(X). The ML and REML estimators are known to be asymptotically equivalent.
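One well-known consequence of the difference between the two criteria: when V is known (taken as the identity below, an illustrative special case) and only σ2 is unknown, the ML criterion yields σ̂2 = RSS/N while the REML criterion yields σ̂2 = RSS/(N − k), correcting for the k degrees of freedom spent estimating β. A minimal numpy illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
N, k = 15, 3
X = np.column_stack([np.ones(N), rng.standard_normal((N, k - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.standard_normal(N)

# With V = I known, beta_hat is ordinary least squares and the two criteria
# differ only in the divisor used when profiling sigma^2.
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
rss = float(np.sum((y - X @ beta_hat) ** 2))

sigma2_ml = rss / N            # ML divides by N
sigma2_reml = rss / (N - k)    # REML divides by N - k, k = Rank(X)

assert sigma2_reml > sigma2_ml
```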

Suppose θ̂ is the ML estimate of θ = (θ1,...,θs)′. Let h(θ) be a certain, possibly vector valued, function of θ. Then the three asymptotic tests (Wald, likelihood ratio, and Rao's score) of H0: h(θ) = 0 against the alternative H1: h(θ) ≠ 0 are given by

TW = h(θ̂)′[H(θ̂)I−1(θ̂)H(θ̂)′]−1h(θ̂),
TL = 2[l(θ̂) − l(θ̃)],
TR = U(θ̃)′I−1(θ̃)U(θ̃),

where θ̃ is the ML estimator of θ under the null hypothesis H0, U(θ) = ∂l(θ)/∂θ, H(θ) = ∂h(θ)/∂θ′, and I(θ) is the Fisher information matrix (Rao, 1973).

Under certain regularity conditions each of the statistics TW, TL, and TR has an asymptotic χr2 distribution under H0, where r=Rank(H(θ)). See Rao (1973) and also Sen and Singer (1993) for proofs and more details about these tests. Since REML and ML estimates are asymptotically equivalent one may alternatively use the REML estimates in the above expressions.

Since, under certain regularity conditions, the ML estimator θ̂ asymptotically follows a multivariate normal distribution with mean vector θ and variance covariance matrix I−1(θ), one can also construct a test for a hypothesis about any component θi of θ using the standard normal distribution. This asymptotic test is also known as Wald's test. Using this asymptotic result, approximate confidence intervals can be constructed as well.

6.2.3. Estimation of Effects When V Is Estimated

Suppose Ĝ and R̂ are the estimators of G and R respectively, obtained by using one of the above two methods. Then the respective estimates of β and ν are obtained by solving the plug-in version of the mixed model equations,

(X′R̂−1X)β̂ + (X′R̂−1Z)ν̂ = X′R̂−1y,
(Z′R̂−1X)β̂ + (Z′R̂−1Z + Ĝ−1)ν̂ = Z′R̂−1y,

where the estimators Ĝ and R̂ have been used in place of G and R in the mixed model equations stated earlier. Upon solving, we obtain β̂ = (X′V̂−1X)−X′V̂−1y and ν̂ = ĜZ′V̂−1(y − Xβ̂), where V̂ is obtained by substituting Ĝ and R̂ for G and R respectively in V. Note that β̂ is an estimator of the best linear unbiased estimator (BLUE) (X′V−1X)−X′V−1y of β and ν̂ is an estimator of the best linear unbiased predictor (BLUP) GZ′V−1(y − X(X′V−1X)−X′V−1y) of the random effects vector ν.

For simplicity of presentation, let us denote the estimate of σ2 by σ̂2, whatever method may have been used for the estimation. Let Ĉ be a generalized inverse of the coefficient matrix of the plug-in mixed model equations, partitioned into blocks Ĉ11, Ĉ12, Ĉ21, and Ĉ22 conformably with (β̂′, ν̂′)′. The estimated variance and covariance matrices of these estimators are then D̂(β̂) = σ̂2Ĉ11, the estimated covariance matrix of (β̂, ν̂ − ν) is σ̂2Ĉ12, and D̂(ν̂ − ν) = σ̂2Ĉ22. It may however be cautioned that

D̂(β̂) = σ̂2Ĉ11 = σ̂2(X′V̂−1X)−

usually underestimates D(β̂), the true variance covariance matrix of β̂.

6.2.4. Tests for Fixed Effect Parameters

Consider the problem of testing a linear hypothesis of the form H0: Lβ = 0, where L is a full (row) rank matrix. A suggested test statistic for H0 is

F = β̂′L′(Lσ̂2Ĉ11L′)−1Lβ̂/ν1,  ν1 = Rank(L).

The exact distribution of F is complicated, for several reasons. For example, β̂ is only an approximate version of the BLUE, since G1 and R1,...,Rn are unknown and hence their estimates have been used in its expression. The matrix σ̂2Ĉ11 is likewise only an estimated version of the variance covariance matrix of β̂. Further, the distribution of F also depends on the type of unbalancedness that exists in the data. However, for large samples the test statistic F will have an approximate F distribution with numerator degrees of freedom ν1 = Rank(L) and denominator degrees of freedom ν2 appropriately estimated.

A brief description of the MIXED procedure follows. This procedure implements the likelihood based approach described above and hence is useful in the repeated measures context. More specific details of this procedure will be provided in later sections as the need arises.
