What Are We Assuming?

The preceding discussion outlines the conditions under which we can generalize using regression analysis. First, we need a logical or theoretical reason to anticipate that Y and X have a linear relationship. Second, we need the default method[3] that we use to estimate the line of best fit to work reliably. We know that the method works reliably when the random errors, εᵢ, satisfy four conditions:

[3] Like all statistical software, JMP uses a default line-fitting method known as ordinary least squares estimation, or OLS. A full discussion of OLS is well beyond the scope of this book, but it's worth noting that these assumptions refer to OLS in particular, not to regression in general.

  • They are normally distributed.

  • They have a mean value of 0.

  • They have a constant variance, σ², regardless of the value of X.

  • They are independent across observations.
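To make the four conditions concrete, here is a minimal Python sketch (this is illustrative only, not part of JMP; the sample size and parameter values are made up). It simulates data whose errors satisfy all four conditions, fits a line by ordinary least squares, and confirms that the residuals average to zero:

```python
import numpy as np

# Simulate data that satisfies the four error conditions:
# errors are normally distributed, have mean 0, have constant
# variance sigma^2, and are independent across observations.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)            # predictor values (hypothetical)
beta0, beta1, sigma = 2.0, 0.5, 1.0  # hypothetical "true" parameters
eps = rng.normal(0.0, sigma, n)      # i.i.d. N(0, sigma^2) errors
y = beta0 + beta1 * x + eps

# Fit the line of best fit by ordinary least squares.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# With an intercept in the model, OLS residuals sum to zero
# (up to rounding), and with well-behaved errors the estimated
# slope and intercept land close to the true values.
print(residuals.mean())              # essentially 0
print(slope, intercept)              # near 0.5 and 2.0
```

When the errors behave as simulated here, the estimates are trustworthy; the red flags below describe what a scatterplot tends to look like when they do not.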

At this early stage in the presentation of this technique, it might be difficult to grasp all of the implications of these conditions. Start by understanding that the following might be red flags to look for in a scatterplot with a fitted line:

  • The points seem to bend or oscillate around the line.

  • There are a small number of outliers that stand well apart from the mass of the points.

  • The points seem snugly concentrated near one end of the line, but fan out toward the other end.

  • The points are concentrated at some distance from the line, with relatively few points near it.

In this example none of these trouble signs is present. In the next chapter, we'll learn more about looking for problems with the important conditions for inference. For now, let's proceed.
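One of these red flags, the "fan out" pattern, can be checked numerically as well as visually. The following sketch (again illustrative, with made-up parameters) generates data whose error spread grows with X, deliberately violating the constant-variance condition, and compares the residual spread at the two ends of the line:

```python
import numpy as np

# Simulate data that violates the constant-variance condition:
# the error spread grows with X, producing the "fan out" pattern.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, 300))
y = 2.0 + 0.5 * x + rng.normal(0.0, 0.2 + 0.3 * x)  # spread grows with x

slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# A rough diagnostic: compare residual spread at the low end of X
# with the spread at the high end. Roughly equal spreads are
# consistent with constant variance; a large gap is a warning sign.
half = len(x) // 2
low_spread = residuals[:half].std()   # spread near the low end of X
high_spread = residuals[half:].std()  # spread near the high end of X
print(low_spread < high_spread)       # True here: the fan-out red flag
```

This split-the-sample comparison is only a crude screen; the next chapter's residual plots are the more systematic way to look for these problems.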
