Intermediate statistics – associations

In the previous chapter, you learned about descriptive statistics methods for getting information about the distribution of discrete and continuous variables. In a data science project, the next typical step is to check for associations between pairs of variables.

When checking for the associations between pairs of variables, you have three possibilities:

  • Both variables are discrete
  • Both variables are continuous
  • There is one discrete and one continuous variable

Beyond associations between two variables, this section also introduces linear regression, one of the most important statistical methods, where you model a single response (or dependent) variable with a regression formula that includes one or more predictor (or independent) variables.
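To make the idea of a regression formula concrete, here is a minimal sketch of a simple linear regression with one predictor, fitted by ordinary least squares using only the Python standard library. The function name fit_simple_regression and the hours and score values are made up for illustration; they are not taken from this book's data sets.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-deviations and sum of squared deviations of the predictor
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: predictor (hours) and response (score)
hours = [1, 2, 3, 4, 5]
score = [3, 5, 7, 9, 11]

intercept, slope = fit_simple_regression(hours, score)
print(f"score = {intercept:.2f} + {slope:.2f} * hours")
```

The slope estimates how much the response changes for a one-unit change in the predictor. With more than one predictor, the same least-squares idea applies, but the fit is usually delegated to a statistics library.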

Altogether, you will learn about the following in this section:

  • Chi-squared test of the independence of two discrete variables
  • Phi coefficient, contingency coefficient, and Cramer's V coefficient, which measure the association of two discrete variables
  • Covariance and correlations between two continuous variables
  • t-test and one-way ANOVA, which test for an association between one continuous and one discrete variable
  • Simple and polynomial linear regression
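
As a rough preview of the first measures in this list, the following sketch computes the chi-squared statistic and Cramér's V for a small contingency table, and the Pearson correlation for two continuous variables. It uses only the Python standard library. The function names and the sample numbers are assumptions made for illustration, and the sketch reports the statistics only, without the p-value a full test would provide.

```python
from math import sqrt


def chi_squared(table):
    """Return (chi-squared statistic, total count) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


def cramers_v(table):
    """Cramér's V: the chi-squared statistic rescaled to the range 0..1."""
    stat, n = chi_squared(table)
    k = min(len(table), len(table[0]))
    return sqrt(stat / (n * (k - 1)))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical 2 x 2 table of counts for two discrete variables
counts = [[10, 20],
          [20, 10]]
stat, n = chi_squared(counts)
v = cramers_v(counts)
print(f"chi-squared = {stat:.3f}, Cramér's V = {v:.3f}")

# Hypothetical continuous variables with a perfect linear relationship
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
print(f"Pearson r = {r:.3f}")
```

For a 2 x 2 table, Cramér's V has the same magnitude as the phi coefficient, so both would report the same strength of association here.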