Working With Covariation

When we talk of covariation, there are three essential things to consider: the direction, the strength, and the statistical significance of such covariation. Let us start with direction, and use another example for illustration. Suppose a market research agency is working on a project for a chain of stores that has not been very profitable in the recent past. Management has therefore undertaken certain initiatives to improve their profitability, and one of these pertains to productivity enhancements in the stores through workforce reduction. The research charter then is to ascertain the link between productivity and profitability. One school of thought could be that stores that are more productive will be more profitable in the long run because of the elimination of redundancies and reduction of the number of in-store employees. The other school of thought could be that fewer in-store employees will lead to less favorable customer experiences, which in turn will lead to customer exodus and therefore lower store profitability in the long run. From a statistical perspective, while both these arguments acknowledge that there is likely to be covariation between the measures of productivity and profitability, there is less certainty regarding the direction of the covariation: It could be positive or negative. Positive covariation means that as productivity goes up, profitability will go up too, while negative covariation means that as productivity goes up, profitability will go down.
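To make the idea of direction concrete, here is a minimal sketch that reads the direction off the sign of the sample covariance. The store figures and the names `productivity`, `profit_view_a`, `profit_view_b`, `sample_covariance`, and `direction` are all hypothetical, invented for illustration rather than drawn from the research project described above.

```python
# Hypothetical store-level figures, invented purely for illustration.
productivity = [10.0, 12.0, 14.0, 16.0, 18.0]  # e.g., sales per employee
profit_view_a = [2.0, 2.5, 3.1, 3.4, 4.0]      # pattern the first school of thought expects
profit_view_b = [4.0, 3.6, 3.1, 2.7, 2.0]      # pattern the second school of thought expects


def sample_covariance(xs, ys):
    """Return the sample covariance of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (len(xs) - 1)


def direction(xs, ys):
    """Describe the direction of covariation from the sign of the covariance."""
    cov = sample_covariance(xs, ys)
    if cov > 0:
        return "positive"
    if cov < 0:
        return "negative"
    return "none"


print(direction(productivity, profit_view_a))
print(direction(productivity, profit_view_b))
```

The sign alone settles which school of thought the data favor. It says nothing yet about how strong the relationship is or whether it could have arisen by chance.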

After ascertaining the direction of the relationship, the next charter is to determine the strength of the relationship. In the previous example, then, if the analysis confirmed a negative relationship between productivity and profitability, the next analytical step is to determine its magnitude. Such magnitude estimation will help assess whether every percentage point increase in productivity would lead to a percentage point decrease in profit, or to a decrease smaller or larger than a percentage point. Simply knowing that the relationship between productivity and profitability is negative is not going to provide much help to management. Instead, they would want to know whether the magnitude of this negative relationship is meaningfully large. While some of the covariance-based techniques such as regression analysis can help determine the magnitude, simpler techniques such as correlation analysis can sometimes provide an upper and lower bound for relationship strength.
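As a rough illustration of the two ways of gauging strength mentioned above, the following sketch computes a simple regression slope (the change in profit per percentage point of productivity) and a Pearson correlation coefficient. The data and the names `productivity_change`, `profit_change`, `ols_slope`, and `pearson_r` are hypothetical, and ordinary least squares with a single predictor is assumed here only as the simplest form of regression analysis.

```python
import math

# Hypothetical percentage-point changes per store, invented for illustration.
productivity_change = [1.0, 2.0, 3.0, 4.0, 5.0]
profit_change = [-0.4, -1.1, -1.4, -2.1, -2.5]


def _centered_sums(xs, ys):
    """Return (Sxy, Sxx, Syy): sums of cross-products and squares about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs: change in y per unit of x."""
    sxy, sxx, _ = _centered_sums(xs, ys)
    return sxy / sxx


def pearson_r(xs, ys):
    """Pearson correlation coefficient, a unit-free measure between -1 and +1."""
    sxy, sxx, syy = _centered_sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)


slope = ols_slope(productivity_change, profit_change)
r = pearson_r(productivity_change, profit_change)
print(f"slope: {slope:.2f} points of profit per point of productivity")
print(f"correlation: {r:.3f}")
```

The slope answers the management question directly, in the units of the data, while the correlation coefficient summarizes how tightly the two measures move together regardless of units.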

Finally, when the direction and magnitude of covarying measures have been determined, we examine the statistical significance of the relationship. The purpose of this step is to assess whether the observed covariance relationship is true and authentic, or merely an outcome of chance. So if the observed relationship is being tested at a confidence level of, say, 99%, in simple English it means that we are willing to accept a 1 in 100 chance of concluding that a relationship exists when in the real world it does not (i.e., it is zero) and we observe it simply by chance. Thus if the analysis points to a significant relationship between variables, the analytical team can be 99% confident that the relationship is different from zero, with only a 1% chance that the conclusion might be erroneous. Most statisticians and analysts commonly use confidence levels of 90%, 95%, or 99%, which means that they are willing to accept 10 in 100, 5 in 100, or 1 in 100 chances, respectively, of concluding that a relationship exists when in reality it does not. However, an important point of caution when working with statistical significance is that the volume of data influences such conclusions. In the interest of keeping the discussion nontechnical, we will simply state that a larger volume of data gives a statistical test greater power, and therefore increases the likelihood of finding statistically significant relationships. For a data set with, say, a million records, a correlation coefficient of 0.05 (the correlation coefficient ranges from -1 to +1, so its magnitude ranges from zero to one, a point we explain shortly) is very likely to be statistically significant. On the other hand, for a data set with 50 observations, even a correlation of 0.30 may not be significantly different from zero. Thus, as one reviews the statistical significance information, one has to be careful about the statistical versus the substantive difference. One needs to ask whether 0.05 is substantively different from zero just because the difference is statistically significant. Conversely, one should make a similar argument and ask whether a correlation coefficient of 0.30 should be dismissed merely because it was not statistically different from zero.
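The effect of data volume on significance can be checked with a short calculation. The sketch below is an approximation under stated assumptions, not the method used in this book: it applies the Fisher z-transformation with a normal approximation to obtain a two-sided p-value for a correlation coefficient, and the function name `approx_two_sided_p` is hypothetical.

```python
import math


def approx_two_sided_p(r, n):
    """Approximate two-sided p-value for testing that a correlation is zero.

    Assumption: uses the Fisher z-transformation, under which atanh(r) is
    roughly normal with standard error 1 / sqrt(n - 3) when the true
    correlation is zero. This is an approximation, not an exact test.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and +1")
    z = math.atanh(r) * math.sqrt(n - 3)
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


alpha = 0.01  # corresponds to the 99% confidence level discussed above

for r, n in [(0.05, 1_000_000), (0.30, 50)]:
    p = approx_two_sided_p(r, n)
    verdict = "significant" if p < alpha else "not significant"
    print(f"r = {r:.2f}, n = {n}: p = {p:.4g} ({verdict} at the 99% level)")
```

Under this approximation, the tiny correlation in the million-record data set clears the 99% threshold, while the much larger correlation in the 50-observation data set does not. That is exactly why statistical and substantive significance have to be judged separately.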
