6.3. An Overview of the MIXED Procedure

PROC MIXED, available in SAS/STAT, can be used to analyze data under the model stated earlier in Equation 6.1. To describe the choices available in PROC MIXED, we follow the notation used above. More details can be found in SAS/STAT Software: Changes and Enhancements through Release 6.12.

Consider the mixed effects model defined earlier in Equation 6.1,

$$y = X\beta + Z\nu + \epsilon.$$

The first statement that invokes this procedure is

proc mixed;

The inferential approach in the MIXED procedure is predominantly likelihood based, so the multivariate normality assumptions stated in the previous section are needed. The ML and REML estimation procedures have been implemented for the estimation of all parameters and the prediction of random effects. Hypothesis tests rely heavily on likelihood based functions; examples are the likelihood ratio test (LRT) and Wald's test. An additional estimation procedure, MIVQUE0, suggested by C. R. Rao (1972), is also available. This method requires no normality assumption. Consequently, no statistical tests based on MIVQUE0 are available.

6.3.1. Structures for G and R

Recall that G is the variance covariance matrix of the random effects and R is the variance covariance matrix of the error vectors corresponding to repeated measures on the same subject. Various choices of structures for G and R are available in the MIXED procedure. Structures for G are selected using the TYPE= option in the RANDOM statement, and those for R are selected using the TYPE= option in the REPEATED statement of PROC MIXED. The MIXED procedure also allows a Kronecker product covariance structure for R (the Kronecker product is called the direct product, with notation @, in the SAS/IML documentation). See Example 10 for an illustration.
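As a minimal sketch of how the two TYPE= options appear in a program, the call below requests an unstructured G for a random intercept and slope, and a first-order autoregressive R. The data set GROWTH and its variables (subject, group, time, t, y) are hypothetical and assumed only for illustration; t is taken to be a numeric copy of the classification variable time.

```sas
/* Hypothetical long-format data: one row per subject-by-occasion. */
proc mixed data=growth;
   class subject group time;
   model y = group t group*t;                      /* fixed effects   */
   random intercept t / subject=subject type=un;   /* structure for G */
   repeated time / subject=subject type=ar(1);     /* structure for R */
run;
```

Replacing type=ar(1) by another supported structure, such as type=cs or type=un, changes only the assumed form of R; the fixed effects part of the model is unchanged.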

6.3.2. Estimation of G and R

As stated earlier, estimation of the covariance parameters is carried out using one of three methods: maximum likelihood (ML), restricted maximum likelihood (REML), and minimum variance quadratic unbiased estimation (MIVQUE0). The first two are iterative and require the joint multivariate normality of the error vector ϵ and the random effects ν.

The minimum variance quadratic unbiased estimator, MIVQUE0 (developed using the formulas given in Rao (1972)), is non-iterative and requires no multivariate normality assumption. It is usually used as an initial estimator in the iterative process of the ML or REML method. For certain designs with balanced data this estimator coincides with the REML estimator.

Using the METHOD = ML, REML, or MIVQUE0 option of the PROC MIXED statement, one of the three methods of estimating the covariance parameters can be adopted. For example, the syntax for using the REML method is

proc mixed method=reml;

Note that, in the case of ML and REML, this specification also implements the same method of estimation for the fixed effect parameters. For MIVQUE0, the fixed effects are obtained by generalized least squares, with the estimates of G and R used in place of the true values.
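To illustrate the METHOD= option, the sketch below fits one model by ML and then by MIVQUE0. The data set GROWTH, its variables, and the compound symmetry structure chosen for R are assumptions made only for this example. When METHOD= is omitted, PROC MIXED uses REML.

```sas
/* Data set GROWTH and its variables are hypothetical. */
proc mixed data=growth method=ml;        /* maximum likelihood    */
   class subject group time;
   model y = group time;
   repeated time / subject=subject type=cs;
run;

proc mixed data=growth method=mivque0;   /* non-iterative MIVQUE0 */
   class subject group time;
   model y = group time;
   repeated time / subject=subject type=cs;
run;
```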

6.3.3. Selection of Appropriate Structure for G and R

Given the numerous choices of structures for G and R, one problem a practitioner faces is selecting an appropriate structure. Under the heading "Model Fitting Information," PROC MIXED prints statistics that are helpful in selecting an appropriate covariance structure for G, for R, or for both. Two that are often used are Akaike's Information Criterion (AIC) and Schwarz's Bayesian Criterion (BIC).

Akaike's Information Criterion (AIC) is defined as

$$\mathrm{AIC} = l(\hat{\theta}) - q,$$

where l(θ) is the log-likelihood function given in Equation 6.4 (or the restricted log-likelihood function), $l(\hat{\theta})$ is its maximized value (the maximum log-likelihood or the maximum restricted log-likelihood), and q is the number of estimated covariance parameters. The structure, expressed in terms of θ, with the largest AIC is preferred.

Schwarz's Bayesian Criterion (BIC) is defined as

$$\mathrm{BIC} = l(\hat{\theta}) - \frac{q}{2}\log n^{*},$$

where n* = n for ML and n* = (n − k) for REML. As with AIC, a model with a larger value of BIC is preferred.
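As a purely illustrative calculation, take the larger-is-better forms AIC = l(θ̂) − q and BIC = l(θ̂) − (q/2) log n*, and suppose two candidate structures give maximized log-likelihoods of −120.4 (Structure A, q = 2) and −118.9 (Structure B, q = 6), with n* = 60. These numbers are hypothetical and do not come from any data set in this chapter.

```latex
\begin{align*}
\text{Structure A } (q = 2):\quad
  \mathrm{AIC} &= -120.4 - 2 = -122.4, &
  \mathrm{BIC} &= -120.4 - \tfrac{2}{2}\log 60 \approx -124.5,\\
\text{Structure B } (q = 6):\quad
  \mathrm{AIC} &= -118.9 - 6 = -124.9, &
  \mathrm{BIC} &= -118.9 - \tfrac{6}{2}\log 60 \approx -131.2.
\end{align*}
```

Under both criteria the simpler Structure A has the larger value and would be preferred, even though Structure B attains the higher log-likelihood. BIC penalizes the four extra parameters more heavily than AIC does.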

Keselman, Algina, Kowalchuk and Wolfinger (1998) indicate through extensive simulation studies that AIC performs better than BIC in identifying the true model. However, both criteria frequently fail to identify the correct covariance structure. These authors also speculate that the poor performance of BIC may be due to the fact that in PROC MIXED the penalty term for BIC is a function of n, the total number of observations, rather than the number of subjects. In SAS Version 7, the number of subjects rather than the number of observations is used in the penalty term for BIC.

In the context of selecting a covariance structure for R, a likelihood ratio test (LRT) on the covariance structure can be performed to decide whether a particular covariance structure is adequate. This approach will be discussed in detail in the next section.
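One way to organize such a comparison is sketched below: the same fixed effects model is fitted under several candidate structures for R, and the fit statistics from each run are collected into one data set. The data set GROWTH, its variables, and the macro name FITCOV are hypothetical. The sketch also assumes a release that supports the Output Delivery System, where the PROC MIXED fit table is named FitStatistics; Release 6.12 does not work this way. Later releases print the information criteria in a smaller-is-better form, so the table headings should be checked before the values are compared.

```sas
/* Hypothetical macro: fit one covariance structure for R and save */
/* the fit statistics table produced by that run.                  */
%macro fitcov(type, out);
   ods output FitStatistics=&out;
   proc mixed data=growth method=reml;
      class subject group time;
      model y = group time group*time;
      repeated time / subject=subject type=&type;
   run;
%mend fitcov;

%fitcov(cs, fit_cs)
%fitcov(ar(1), fit_ar1)
%fitcov(un, fit_un)

/* Stack the three tables and label each row with its structure. */
data fit_all;
   length structure $8;
   set fit_cs(in=a) fit_ar1(in=b) fit_un(in=c);
   if a then structure = 'CS';
   else if b then structure = 'AR(1)';
   else structure = 'UN';
run;

proc print data=fit_all;
run;
```

The fixed effects specification is held constant across the runs because restricted likelihoods from models with different fixed effects are not comparable.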

6.3.4. Inference for Covariance Parameters

The estimates, their standard errors (SEs), and asymptotic tests based on the standard normal distribution (Wald's tests) for each of the covariance parameters are produced when the COVTEST option is specified in the PROC MIXED statement. The standard errors and Wald's tests follow from the general theory of maximum likelihood estimation, which states that, under certain regularity conditions, the vector of ML estimates of a vector parameter is consistent and asymptotically multivariate normal, with the inverse of Fisher's information matrix as its variance covariance matrix. The tests provided on covariance parameters are for two-sided alternative hypotheses. Care should therefore be exercised in interpreting the p value, since in certain cases it is more meaningful to test a hypothesis against a one-sided alternative (e.g., when the parameters are interpreted as variance components).

PROC MIXED also provides confidence intervals for all the unknown parameters in the variance covariance matrix. The 95% confidence intervals for these parameters can be obtained using the CL option in the PROC MIXED statement. The default 95% for the confidence level can be changed, if needed, using the ALPHA= option of the PROC MIXED statement. For the parameters which have a natural lower bound constraint of zero (for example, the variance components and the diagonal elements of the variance covariance matrix), the confidence intervals are provided using the Satterthwaite approximation. For all the other parameters, the confidence limits are obtained using the corresponding Wald's statistics.
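The options described above are all given in the PROC MIXED statement. In the following sketch, COVTEST requests the standard errors and Wald's tests, CL requests confidence limits for the covariance parameters, and ALPHA=0.10 changes the confidence level from 95% to 90%. The data set GROWTH, its variables, and the particular model are hypothetical.

```sas
/* Data set GROWTH and its variables are hypothetical. */
proc mixed data=growth covtest cl alpha=0.10;
   class subject group time;
   model y = group time;
   random intercept / subject=subject;           /* variance component in G */
   repeated time / subject=subject type=ar(1);   /* AR(1) parameters in R   */
run;
```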

6.3.5. Inference for Fixed and Random Effects Parameters

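As indicated earlier, a linear hypothesis of the form H0: Lβ=0, where L is a full rank matrix, is tested using the approximate F test described in the previous section. With $\hat{V}$ denoting the estimated variance covariance matrix of y, the test statistic

$$F = \frac{(L\hat{\beta})' \left[ L (X'\hat{V}^{-1}X)^{-} L' \right]^{-1} (L\hat{\beta})}{\mathrm{Rank}(L)}$$
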
under the null hypothesis H0 is approximately distributed as an F with degrees of freedom ν1 and ν2, where ν1 = Rank(L) and ν2 is the degrees of freedom of the error sum of squares. In practice, however, different estimates of ν2 can be used to improve the approximation. The MIXED procedure allows one to specify predetermined degrees of freedom using the DDF= option in the MODEL statement. The procedure also provides several built-in choices for ν2 through the DDFM= option in the MODEL statement. For example, the DDFM=RESIDUAL option conducts all the tests using the error sum of squares degrees of freedom, which is n − Rank(X). The DDFM=SATTERTH option uses a general Satterthwaite approximation to obtain ν2. The default sums of squares used are of Type III. Type I sums of squares can be requested instead using the HTYPE=1 option in the MODEL statement. Approximate p values of the tests are reported using a certain standard estimate of ν2.
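The MODEL statement options just described might be used as in the following sketch. The first call requests Satterthwaite denominator degrees of freedom. The second supplies user-chosen denominator degrees of freedom for the first and third effects, leaves the second at its default (a missing value, written as a period), and asks for Type I tests. The data set GROWTH, its variables, and the values 20 and 40 are hypothetical.

```sas
/* Data set GROWTH and its variables are hypothetical. */
proc mixed data=growth;
   class subject group time;
   model y = group time group*time / ddfm=satterth;
   repeated time / subject=subject type=un;
run;

/* Arbitrary illustrative denominator df: 20 for group, default */
/* for time, 40 for group*time; Type I tests instead of Type III. */
proc mixed data=growth;
   class subject group time;
   model y = group time group*time / ddf=20,.,40 htype=1;
   repeated time / subject=subject type=un;
run;
```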

As an alternative to the above F statistic one can use the log-likelihood ratio test statistic or the chi-square statistic associated with that. The degrees of freedom of the chi-square distribution are determined by taking the difference between the number of parameters in the full model and that in the reduced (under the null hypothesis) model. This chi-square test can be requested using the CHISQ option in the MODEL statement and METHOD=ML option in the PROC MIXED statement. Since the REML method produces estimators that are not the maximum likelihood estimators, whenever chi-square tests are requested the ML and not the REML method must be used for estimating the parameters.
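A likelihood ratio comparison of this kind can be set up by fitting the full and the reduced model with METHOD=ML and the same covariance structure, as in the sketch below. Here the reduced model drops the group*time interaction. The data set GROWTH, its variables, and the choice of compound symmetry for R are assumptions made for illustration. The CHISQ option is included in the full model to request the chi-square tests for the fixed effects.

```sas
/* Full model (data set GROWTH is hypothetical). */
proc mixed data=growth method=ml;
   class subject group time;
   model y = group time group*time / chisq;
   repeated time / subject=subject type=cs;
run;

/* Reduced model under H0: no group*time interaction. */
proc mixed data=growth method=ml;
   class subject group time;
   model y = group time;
   repeated time / subject=subject type=cs;
run;

/* The difference between the two reported -2 log-likelihood values */
/* is referred to a chi-square distribution whose df equals the     */
/* number of fixed effect parameters dropped; the p value can be    */
/* computed in a DATA step as 1 - probchi(difference, df).          */
```

With g groups and t occasions, dropping the interaction removes (g − 1)(t − 1) parameters, which gives the degrees of freedom for the test.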

Least squares means for the fixed effects are obtained using the LSMEANS statement of PROC MIXED. Multiple comparison tests of these effects using one of the standard methods (for example, Tukey's method) can be carried out using the ADJUST= option in the LSMEANS statement. Estimation and testing of hypotheses about specific contrasts of the parameters can be carried out using the ESTIMATE and CONTRAST statements in PROC MIXED.
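A brief sketch of these three statements follows. It assumes a hypothetical data set GROWTH in which the classification variable group has exactly three levels, so that each coefficient list has three entries in the sorted order of the levels. The labels in quotation marks are arbitrary.

```sas
/* Data set GROWTH is hypothetical; group is assumed to have 3 levels. */
proc mixed data=growth;
   class subject group time;
   model y = group time;
   repeated time / subject=subject type=cs;
   lsmeans group / diff adjust=tukey;          /* Tukey-adjusted pairwise tests */
   estimate 'group 1 vs group 2' group 1 -1 0; /* single contrast with SE       */
   contrast 'any group difference'             /* joint 2 df test               */
            group 1 -1 0,
            group 1 0 -1;
run;
```

The ESTIMATE statement reports the estimate and standard error of a single linear combination, whereas the CONTRAST statement tests several rows jointly.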

In the following sections, we provide several applications of this approach in conjunction with PROC MIXED for a variety of models. It is not possible to address all the aspects and options of this very general procedure. For different applications and detailed description, we refer the reader to Littell, Milliken, Stroup, and Wolfinger (1996) and The MIXED Procedure Chapter in SAS/STAT Software: Changes and Enhancements through Release 6.12.
