Random effects

So far, we have explored models with a fixed set of levels for each effect. This makes sense when we control the possible levels of an effect and are interested in measuring them. It also makes sense when we have a blocking effect with a finite (and small) set of values (for example, the sex or occupation of a person). In some cases, however, we will have a huge number of levels that are individually unimportant. For example, if we want to measure whether a drug is effective and we have multiple observations per person, we want to add a blocking effect for each person. In these cases, we are not interested in the effect per se, but we certainly want to use it as a control variable in our model. A model that uses proper blocks will be more efficient: think of ANOVA as a method of attributing variability to factors. If we have the right factors in place, we can be more confident that we are putting the variance in the right buckets.

Treating such an effect as fixed raises two issues. The first is that, if we have lots of levels (people, in our example), our model will require lots of parameters. If the number of observations for some people is small, those parameters won't be accurately estimated. The second is that estimating a parameter for each person is generally not very useful, since we only use this effect to control the variability in the model. As usually happens in statistics, we prefer models that are as simple as possible, since more complex models have a natural tendency to overfit (instead of fitting the structure of the data, the model starts to capture the noise).

The models we have used so far in ANOVA are fixed effects models: each effect (factor) with k levels requires k-1 parameters. Random effects models are different: here we assume that all observations belonging to the same subject are correlated, because they all share a random shock. For that effect, we then need to estimate a single parameter: the variance of the random effect. When all of the effects are random (an unusual situation), we say that the model is a random effects model. If all of the effects are fixed, we say that the model is a fixed effects model. And if we have both fixed and random effects, we say that the model is a mixed effects model.
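The idea of a shared random shock can be illustrated with a small simulation. The following is only a sketch, using just the Python standard library. The function names (simulate, variance_components), the grand mean of 10, and the standard deviations are all assumptions made for illustration. The variance of the random effect is recovered here with the classical method-of-moments estimator for a balanced one-way layout, which is a simple stand-in for the likelihood-based fitting that mixed-model software typically uses.

```python
import random
import statistics


def simulate(n_subjects=2000, n_obs=5, sigma_u=2.0, sigma_e=1.0, seed=0):
    """Simulate balanced data: each subject has its own shared random shock.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        shock = rng.gauss(0.0, sigma_u)  # drawn once, shared by the subject
        row = [10.0 + shock + rng.gauss(0.0, sigma_e) for _ in range(n_obs)]
        data.append(row)
    return data


def variance_components(data):
    """Method-of-moments estimates for a balanced one-way random effects layout.

    Returns (variance of the random effect, residual variance).
    """
    n_obs = len(data[0])
    subject_means = [statistics.fmean(row) for row in data]
    # Within-subject mean square: pooled sample variance inside each subject.
    ms_within = statistics.fmean(statistics.variance(row) for row in data)
    # Between-subject mean square: its expectation is var_e + n_obs * var_u.
    ms_between = n_obs * statistics.variance(subject_means)
    var_u = max((ms_between - ms_within) / n_obs, 0.0)
    return var_u, ms_within


data = simulate()
var_u, var_e = variance_components(data)
# Share of total variance due to the subject: the correlation between
# two observations taken on the same subject.
icc = var_u / (var_u + var_e)
print(f"var_u={var_u:.2f} var_e={var_e:.2f} icc={icc:.2f}")
```

With 2,000 simulated subjects, the model summarizes the subject effect with one variance instead of 1,999 separate fixed-effect parameters. The ratio var_u / (var_u + var_e) is the implied correlation between two observations from the same subject.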

How can we decide whether an effect should be fixed or random? In most cases, we decide by asking whether the levels of the effect are a sample from a larger population. This is a natural assumption when working with people, companies, or animals.
