The Gauss-Markov theorem

To assess the statistical properties of the model and conduct inference, we need to make assumptions about the residuals, that is, the properties of the unexplained part of the output. The Gauss-Markov theorem (GMT) defines the assumptions required for OLS to produce unbiased estimates of the model parameters, and for these estimates to have the lowest standard error among all linear models for cross-sectional data.

The baseline multiple regression model makes the following GMT assumptions:

  1. In the population, linearity holds, y = β₀ + β₁x₁ + … + βₖxₖ + ε, where β₀, …, βₖ are unknown but constant and ε is a random error
  2. The data for the input variables x₁, …, xₖ are a random sample from the population
  3. No perfect collinearity—there are no exact linear relationships among the input variables
  4. The error has a conditional mean of zero given any of the inputs: E[ε | x₁, …, xₖ] = 0
  5. Homoskedasticity, the error term ε has constant variance given the inputs: Var(ε | x₁, …, xₖ) = σ²
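The assumptions can be made concrete with a small simulation. The sketch below (a hypothetical example; the variable names and the chosen parameter values are illustrative, not from the text) generates data that satisfies all five conditions and recovers the parameters with OLS via `np.linalg.lstsq`:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
beta = np.array([1.0, 2.0, -0.5])          # true but "unknown" parameters

X = np.column_stack([np.ones(n),           # intercept column
                     rng.normal(size=n),   # x1: a random sample (assumption 2)
                     rng.normal(size=n)])  # x2: no exact linear relation to x1 (3)
eps = rng.normal(scale=1.0, size=n)        # E[eps|X] = 0 (4), constant variance (5)
y = X @ beta + eps                         # linearity in the population (1)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                            # close to the true [1.0, 2.0, -0.5]
```

Rerunning with different random seeds shows the estimates scattering symmetrically around the true values, which is what unbiasedness means in practice.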

The fourth assumption implies that no omitted variable exists that is correlated with any of the input variables. Under the first four assumptions, the OLS method delivers unbiased estimates: including an irrelevant variable does not bias the intercept and slope estimates, but omitting a relevant variable will bias the OLS estimates. OLS is then also consistent: as the sample size increases, the estimates converge to the true values and the standard errors become arbitrarily small. The converse is unfortunately also true: if the conditional expectation of the error is not zero because the model misses a relevant variable or the functional form is wrong (that is, quadratic or log terms are missing), then all parameter estimates are biased. If the error is correlated with any of the input variables, then OLS is also not consistent, that is, adding more data will not remove the bias.
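Omitted-variable bias is easy to demonstrate numerically. In this hypothetical sketch (variable names and coefficients are illustrative assumptions), x₂ is relevant and correlated with x₁; dropping it biases the slope on x₁, and no amount of extra data removes the bias:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)          # relevant AND correlated with x1
y = 1.0 + 2.0 * x1 + 1.5 * x2 + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients via np.linalg.lstsq."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Full model: both inputs included -> unbiased estimates near (2.0, 1.5)
full = ols(np.column_stack([np.ones(n), x1, x2]), y)

# Short model: x2 omitted -> slope on x1 absorbs 1.5 * 0.8, drifting to ~3.2
short = ols(np.column_stack([np.ones(n), x1]), y)
print(full[1], short[1])
```

The biased slope converges to β₁ + β₂ · Cov(x₁, x₂)/Var(x₁) = 2 + 1.5 × 0.8 = 3.2 as n grows, which is why more data does not help.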

If we add the fifth assumption, then OLS also produces the best linear unbiased estimates (BLUE), where best means that the estimates have the lowest standard error among all linear estimators. Hence, if the five assumptions hold and statistical inference is the goal, then the OLS estimates are the way to go. If the goal, however, is to predict, then we will see that other estimators exist that trade off some bias for a lower variance to achieve superior predictive performance in many settings.
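One such biased estimator is ridge regression, which shrinks the coefficients toward zero. A minimal sketch using the closed-form ridge solution (the sample size, penalty λ, and noise level are illustrative assumptions; for simplicity there is no intercept and the penalty is applied to all coefficients):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 30, 5                                # small, noisy sample: OLS variance is high
X = rng.normal(size=(n, k))
y = X @ np.ones(k) + rng.normal(scale=3.0, size=n)

# OLS: solve the normal equations (X'X) b = X'y
ols_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Ridge: add lam * I to X'X, shrinking the solution toward zero
lam = 10.0
ridge_hat = np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ y)

# The ridge coefficient vector always has a smaller norm than the OLS one
print(np.linalg.norm(ols_hat), np.linalg.norm(ridge_hat))
```

The shrinkage introduces bias but reduces the variance of the estimates, which can lower out-of-sample prediction error when the sample is small or noisy.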

Now that we have introduced the basic OLS assumptions, we can take a look at inference in small and large samples.
