Multilevel regression

To address these issues, we can rely on a kind of analysis that partials out (removes) the variance due to the context: multilevel regression analysis (also known as mixed-effects regression). We will not go into the details of the computations behind these complex analyses, but will simply provide the information necessary to understand and perform them at a basic level. The necessary diagnostic checks are not fully presented here; simply note that the diagnostics for linear regression apply, and that additional checks should be performed, such as examining the normality of the residuals at level 2. We will not discuss this further here. The Handbook of Multilevel Analysis, edited by De Leeuw and Meijer, provides the necessary information on diagnostics for multilevel models.

When we discussed regression in Chapter 9, Linear Regression, we showed that the value of a criterion attribute for an observation is computed as the sum of:

  • The intercept (the predicted value when all included predictors equal 0)
  • The slope coefficient multiplied by the value of the predictor (for each predictor)
  • The residual (the difference between the predicted value and the observed value)

We explained how the regression algorithm finds the parameters that minimize the residuals on the whole sample.
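This decomposition can be verified directly in R. The following sketch (using simulated data, not the chapter's dataset) fits a simple lm() model and reconstructs each observed value from the fitted intercept, slope, and residual:

```r
# A minimal sketch using simulated data: observed value =
# intercept + slope * predictor + residual
set.seed(1)
x <- rnorm(50)
y <- 2 + 3 * x + rnorm(50)              # simulated criterion attribute
fit <- lm(y ~ x)
b <- coef(fit)                          # b[1] = intercept, b[2] = slope
reconstructed <- b[1] + b[2] * x + residuals(fit)
all.equal(as.numeric(reconstructed), y) # TRUE: the decomposition holds
```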

Random intercepts and fixed slopes

In multilevel modeling, when considering predictors only at level 1 and considering a common slope for all groups, the value of a criterion attribute on an observation is schematically computed as the sum of:

  • A common intercept
  • A group-specific residual (which is the difference between the group's intercept and the common intercept)
  • The slope coefficient multiplied by the value of the predictor (for each predictor)
  • An observation specific residual

In this type of model, the effect of the attributes at level 1 is considered the same across all groups. Only the intercept varies.
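In R, such a random-intercept, fixed-slope model can be fitted with the lmer() function. The following sketch assumes the lme4 package is installed and uses the data frame df built later in this section (with the attributes WorkSat and Depers, and the grouping factor hosp):

```r
# A sketch of a random-intercept, fixed-slope model, assuming the
# lme4 package is installed and df has been generated as shown below
library(lme4)
m1 <- lmer(WorkSat ~ Depers + (1 | hosp), data = df)
fixef(m1)   # the common intercept and the common slope for Depers
ranef(m1)   # the group-specific intercept residuals, one per hospital
```

The term (1 | hosp) is what allows the intercept (1) to vary between hospitals, while the slope of Depers remains common to all groups.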

When considering predictors at levels 1 and 2 and considering common slopes for all groups, the computation is schematically the sum of:

  • A common intercept
  • A group-specific residual corresponding to the difference between the group's intercept and the common intercept
  • An overall slope coefficient multiplied by the value of the predictor (for each predictor at level 1)
  • An observation-specific residual

The computations are actually more complex, but this goes beyond the material covered in this chapter. Simply note that it is the job of multilevel regression to find the parameters that minimize the residuals on the whole sample.
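As an illustration of a model with a predictor at each level, the following sketch adds a hypothetical level-2 attribute (a hospital-level value, invented here purely for illustration, since the chapter's dataset contains no level-2 predictor); the slopes remain common to all groups:

```r
# A sketch with a hypothetical level-2 predictor; HospSize is
# invented for illustration (one value per hospital), and the
# lme4 package is assumed to be installed
library(lme4)
df$HospSize <- rnorm(17)[as.integer(df$hosp)]  # constant within each hospital
m2 <- lmer(WorkSat ~ Depers + HospSize + (1 | hosp), data = df)
fixef(m2)   # common intercept, level-1 slope, and level-2 slope
```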

Random intercepts and random slopes

Until now, we have considered that the slope of level 1 predictors is the same across groups. This is not always the case. Let's examine this with an example. We will use simulated data generated from real data, with attributes about burnout (personal accomplishment, depersonalization, and emotional exhaustion) and work satisfaction. You might remember that we have used similar data in the chapter about regression.

The following code loads the covariance data and generates the dataset from it (100 observations for each of 17 hospitals):

library(MASS)
set.seed(999)
Covariances = read.table("Covariances.dat", sep = "\t", header = TRUE)
df = data.frame(matrix(nrow = 0, ncol = 4))
colnames(df) = c("Accomp", "Depers", "Exhaus", "WorkSat")
for (i in 1:17) {
  start_ln = 1 + ((i - 1) * 4)
  end_ln = start_ln + 3
  covs = Covariances[start_ln:end_ln, 3:6]
  rownames(covs) = Covariances[start_ln:end_ln, 2]
  dat = mvrnorm(n = 100, rep(0, 4), covs)
  df = rbind(df, dat)
}
df$hosp = as.factor(rep(1:17, each = 100))

The following code plots the relationship (using an lm() model) between depersonalization and work satisfaction for each hospital:

library(lattice)
xyplot(WorkSat ~ Depers | hosp, data = df, panel = function(x, y) {
  panel.xyplot(x, y)
  panel.lmline(x, y)
})

As can be seen on the following plot, there is usually a negative relationship between depersonalization and work satisfaction, but groups do not show this pattern to the same extent, and in some cases the relationship is not even present.

Figure: The relationship between depersonalization and work satisfaction, by hospital

Whether or not to take these variations into account is the analyst's decision. We will discuss this further in the next section. A random slopes model is one in which the slopes are allowed to vary between groups.

For now, let's examine the schematic computation of the individual values when dealing with random slopes models. This value is obtained as the sum of:

  • A common intercept.
  • A group-specific residual (the difference between the group's intercept and the common intercept).
  • A group-specific coefficient multiplied by the value of the predictor, for each predictor for which random slopes are included. These group-specific coefficients are composed of a fixed part (the common slope) and a random part (the residual corresponding, for each slope, to the group's variation around that slope).
  • For each predictor for which the slope is not allowed to vary between groups (if any), a slope coefficient multiplied by the value for the predictor.
  • An observation-specific residual.

Again, the job of multilevel regression is to find the parameters that minimize the residuals. Note that the effect of some predictors can be defined as varying between groups, and the others as not varying.
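The random-slopes computation described above can be sketched in R as follows, again assuming the lme4 package is installed and using the data frame df generated earlier:

```r
# A sketch of a random-intercepts, random-slopes model, assuming the
# lme4 package is installed and df has been generated as shown above
library(lme4)
m3 <- lmer(WorkSat ~ Depers + (1 + Depers | hosp), data = df)
fixef(m3)      # the common intercept and the common (average) slope
coef(m3)$hosp  # per-hospital intercepts and slopes (fixed + random parts)
```

The term (1 + Depers | hosp) allows both the intercept and the slope of Depers to vary between hospitals; replacing it with (1 | hosp) would constrain the slope to be common, as in the previous model.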
