Summary

In this chapter, we looked at the types of data you might encounter (numeric, categorical, and ordinal), how to categorize them, and how to treat each kind differently. We also walked through the statistical concepts of mean, median, and mode, and saw why the choice between median and mean matters: the median is often the better choice because outliers can skew the mean.
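
As a quick reminder of why outliers matter, here is a minimal sketch using Python's standard `statistics` module. The module choice and the income figures are assumptions made up for illustration; they are not taken from the chapter's notebooks.

```python
import statistics

# Hypothetical incomes, invented for illustration only.
incomes = [28_000, 31_000, 33_000, 33_000, 38_000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
mode_income = statistics.mode(incomes)

# Add a single extreme value and recompute.
incomes_with_outlier = incomes + [10_000_000]

mean_with_outlier = statistics.mean(incomes_with_outlier)
median_with_outlier = statistics.median(incomes_with_outlier)

print("mean:", mean_income, "->", mean_with_outlier)
print("median:", median_income, "->", median_with_outlier)
print("mode:", mode_income)
```

The single extreme value pulls the mean above a million, while the median stays where most of the data sits.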

Next, we computed the mean, median, and mode using Python in an IPython Notebook file. We covered standard deviation and variance in depth, saw that they measure the spread of a data distribution, and learned how to compute them in Python. We also saw how to visualize and measure the chance of a given range of values occurring in a dataset using probability density functions and probability mass functions.
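
The relationship between variance and standard deviation can be sketched as follows. This again assumes the standard `statistics` module rather than whatever library the chapter's notebooks use, and the data values are hypothetical.

```python
import math
import statistics

# Hypothetical data points, invented for illustration only.
data = [1, 4, 5, 4, 8]

# Population variance: the average squared deviation from the mean.
mean = statistics.mean(data)
population_variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation is the square root of the variance.
population_std_dev = math.sqrt(population_variance)

# Sample variance divides by (n - 1) instead of n.
sample_variance = statistics.variance(data)

print("population variance:", population_variance)
print("population std dev:", population_std_dev)
print("sample variance:", sample_variance)
```

The hand-computed population variance should agree with `statistics.pvariance(data)`; the sample variance is slightly larger because it divides by one fewer data point.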

We looked at the common types of data distributions (the uniform, normal or Gaussian, and exponential distributions, and the binomial and Poisson probability mass functions) and how to visualize them using Python. We also covered percentiles and moments and saw how to compute them using Python.
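
Percentiles and the first four moments can be computed along these lines. This is a sketch under stated assumptions: it uses only the standard library, draws a hypothetical sample from a standard normal distribution with a fixed seed, and defines a helper, `standardized_moment`, that is introduced here for illustration and is not from the chapter.

```python
import random
import statistics

# Hypothetical sample: 10,000 draws from a standard normal distribution.
random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# quantiles(n=100) returns the 99 cut points between percentiles,
# so index 49 is the 50th percentile and index 89 is the 90th.
cut_points = statistics.quantiles(samples, n=100)
p50 = cut_points[49]
p90 = cut_points[89]

# First moment (mean) and second moment (variance).
mean = statistics.fmean(samples)
variance = statistics.pvariance(samples, mu=mean)
std_dev = variance ** 0.5


def standardized_moment(values, center, scale, order):
    """Average of ((v - center) / scale) ** order over all values."""
    return sum(((v - center) / scale) ** order for v in values) / len(values)


# Third moment (skew) and fourth moment (kurtosis, reported as excess over 3).
skewness = standardized_moment(samples, mean, std_dev, 3)
excess_kurtosis = standardized_moment(samples, mean, std_dev, 4) - 3.0

print("50th percentile:", p50)
print("90th percentile:", p90)
print("mean:", mean, "variance:", variance)
print("skew:", skewness, "excess kurtosis:", excess_kurtosis)
```

For a normal sample like this one, the mean, skew, and excess kurtosis should all land close to zero and the variance close to one.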

In the next chapter, we'll look at using the matplotlib library more extensively, and also dive into the more advanced topics of covariance and correlation.

 
