Introduction to Linear Regression

Getting Started

In Chapter 8 you learned what is meant by correlation and covariance. These concepts describe the linear association between two variables. We now move on to linear regression. Essentially, regression takes covariances and formalizes them into linear relationships in which we assume that one of the variables depends on the others.
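
To make the link between covariance and regression concrete, here is a minimal sketch in Python for the simplest case of one independent variable. In that case the regression slope is the covariance of the two variables divided by the variance of the independent variable. The data values and the function names (sample_covariance, simple_regression) are made up for illustration only and are not taken from the case example.

```python
from statistics import mean, variance


def sample_covariance(x, y):
    """Sample covariance of two equal-length lists (divides by n - 1)."""
    mean_x, mean_y = mean(x), mean(y)
    products = [(a - mean_x) * (b - mean_y) for a, b in zip(x, y)]
    return sum(products) / (len(x) - 1)


def simple_regression(x, y):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    slope = sample_covariance(x, y) / variance(x)
    intercept = mean(y) - slope * mean(x)
    return slope, intercept


# Hypothetical illustration data, not from the book's case example.
trust = [1, 2, 3, 4, 5]
sales = [12, 14, 19, 21, 24]

slope, intercept = simple_regression(trust, sales)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

With more than one independent variable the arithmetic is more involved, but the idea of turning co-variation into a linear relationship is the same.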

A Definition of Linear Regression

A definition of regression helps to describe the major issues further. Regression is about associations between multiple variables where you believe that one of the variables (the dependent variable, or DV) is explained or predicted by the others (the independent variables, or IVs). To understand regression we must understand each of these concepts: association, variables, explanation and prediction. Some of these concepts were discussed earlier in the text, but I will provide reminders here.

Variables in Linear Regression

Regression is about relating variables to one another: specifically, we treat one variable, such as sales, as dependent on the others (the independent variables), which are assumed to be causing it. Two characteristics of the variables are crucial.
  1. Type of variable:
    1. In linear regression of the form we are discussing here, the dependent variable needs to be continuous (interval or ratio data). There are other types of regression that are suitable for dependent variables that are ordinal or categorical; these are not discussed here. Sales, the dependent variable in our case example, is certainly continuous.
    2. In various types of regression, the independent variables that, as a set, are assumed to explain or predict the dependent variable can be of any type (continuous, ordinal or categorical). However, when independent variables are ordinal or categorical, there are special ways to deal with them in practice, which I discuss later. It is therefore crucial to know the type of each variable.
  2. Spread of continuous variables: spread is at the heart of regression analysis, as the next section explains.

Explaining Variance: The Aim of Linear Regression

Remember that in regression we have a dependent variable that we try to explain using one or more predictor (independent) variables. In the example used in this chapter, we wish to explain why and how customers buy more or fewer services by looking at predictors such as satisfaction and trust. This is desirable either because such explanation has intrinsic value to our analysis, or because explaining it helps us to predict the dependent variable in the future.
What do we mean by “explaining” a dependent variable? This question is at the absolute heart of understanding regression.
Saying that the aim is to explain the dependent variable (sales) is all very well, but exactly what do we want to explain about it? In fact, in regression what we are trying to explain is why, when or how the dependent variable occurs at different levels, in other words why, when and how it spreads away from the center. Why do some customers have high sales, while some have low or medium sales?
In other words, in regression our aim is to explain the spread of the dependent variable. Precisely, we want to explain the statistical variance of the dependent variable based on given levels of independent variables.
Remember, the variance is the standard deviation squared. Therefore, if the dependent variable has a variance of 120, we want to see if other variables can explain this spread, that is, accurately explain when one observation will be low, another medium, and another high on the dependent variable.
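As a rough illustration of what "explaining variance" means, the sketch below first checks that the variance equals the standard deviation squared, then fits a line with one predictor and asks how much of the spread in the dependent variable is left over once the predictor is taken into account. The data are hypothetical, and the "explained share" summary (often called R-squared) is my own choice of illustration rather than something defined in this section.

```python
from statistics import mean, stdev, variance

# Hypothetical illustration data, not from the book's case example.
trust = [1, 2, 3, 4, 5]
sales = [12, 14, 19, 21, 24]

# The variance is the standard deviation squared.
total_variance = variance(sales)
sd_squared = stdev(sales) ** 2

# Fit a least-squares line: sales = intercept + slope * trust.
mean_x, mean_y = mean(trust), mean(sales)
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(trust, sales))
    / sum((x - mean_x) ** 2 for x in trust)
)
intercept = mean_y - slope * mean_x

# Residuals are the part of each observation the line does not account for.
residuals = [y - (intercept + slope * x) for x, y in zip(trust, sales)]
unexplained_variance = sum(r ** 2 for r in residuals) / (len(sales) - 1)

# Share of the spread in sales that the predictor accounts for.
explained_share = 1 - unexplained_variance / total_variance

print(f"total variance       = {total_variance:.3f}")
print(f"unexplained variance = {unexplained_variance:.3f}")
print(f"explained share      = {explained_share:.3f}")
```

In this made-up data set the predictor accounts for almost all of the spread; real data will rarely be this tidy.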
A second question is how we phrase the effect of an independent variable on the dependent variable. What we wish to be able to say is that when the independent variable increases by 1 unit, in whatever metric it is measured, the dependent variable changes by so many of its own units.
For example, we want to be able to say something like “if trust increases by 1 unit (an increase of 1 unit in the independent variable) then the dependent variable Sales is expected to increase by $203,764.” (This is only an example; the actual association measure might be bigger or smaller than this.)
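The "1 unit" reading can be sketched in a few lines of Python. The slope below reuses the $203,764 figure from the example sentence; the intercept of 1,000,000 and the function name predict_sales are assumptions invented purely so that the code runs, not estimates from any data.

```python
# Illustrative coefficients only: the slope echoes the example in the text,
# and the intercept is an invented placeholder.
SLOPE_PER_TRUST_UNIT = 203_764   # expected change in sales per 1-unit rise in trust
INTERCEPT = 1_000_000            # hypothetical expected sales when trust is 0


def predict_sales(trust):
    """Expected sales for a given trust score under the illustrative line."""
    return INTERCEPT + SLOPE_PER_TRUST_UNIT * trust


# Raising trust by exactly 1 unit changes the prediction by exactly the slope,
# whatever the starting level of trust.
change_from_3_to_4 = predict_sales(4) - predict_sales(3)
change_from_6_to_7 = predict_sales(7) - predict_sales(6)

print(change_from_3_to_4)
print(change_from_6_to_7)
```

Because the relationship is linear, the expected change is the same at every starting level of the independent variable.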
Last updated: April 18, 2017