Let's play around in Python and actually compute these moments to see how it's done. To follow along with me, go ahead and open up Moments.ipynb.
Let's create that same normal distribution of random data again. We're going to make it centered around zero, with a 0.5 standard deviation and 10,000 data points, and plot that out:
import numpy as np
import matplotlib.pyplot as plt

vals = np.random.normal(0, 0.5, 10000)

plt.hist(vals, 50)
plt.show()
So again, we get a randomly generated set of data with a normal distribution around zero.
Now, we find the mean and variance. We've done this before; NumPy gives you mean and var functions to compute them. So, we just call np.mean to find the first moment, which is just a fancy name for the mean, as shown in the following code:
np.mean(vals)
This gives the following output in our example:
Out[2]: -0.0012769999428169742
The output turns out to be very close to zero, just like we would expect for normally distributed data centered around zero. So, the world makes sense so far.
Now we find the second moment, which is just another name for variance. We can do that with the following code, as we've seen before:
np.var(vals)
Providing the following output:
Out[3]: 0.25221246428323563
That output turns out to be about 0.25, and again, that works out with a nice sanity check. Remember that standard deviation is the square root of variance. If you take the square root of 0.25, it comes out to 0.5, which is the standard deviation we specified while creating this data, so again, that checks out too.
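To make that sanity check concrete, here's a quick snippet you could run yourself (this isn't part of the notebook, and it assumes the same vals array from above): take the square root of the variance and compare it to NumPy's built-in standard deviation.

import numpy as np

# Standard deviation is the square root of the variance,
# so both of these should print roughly 0.5 for our data.
print(np.sqrt(np.var(vals)))
print(np.std(vals))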
The third moment is skew, and to compute it we're going to need to use the SciPy package instead of NumPy. But that again is built into any scientific computing package like Enthought Canopy or Anaconda. Once we have SciPy, the function call is as simple as our earlier two:
import scipy.stats as sp
sp.skew(vals)
This displays the following output:
Out[4]: 0.020055795996111746
We can just call sp.skew on the vals array, and that gives us the skew value. Since this data is centered around zero, it should have almost zero skew. It turns out that with random variation it does skew a tiny bit; the slightly positive value here means the tail is pulled out a touch to the right, but it's so close to zero that you may not even be able to see it in the histogram.
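If you'd like to see an obviously nonzero skew for comparison, here's a small sketch of my own (not from the book's notebook): an exponential distribution has a long tail to the right, so its skew comes out strongly positive, and mirroring the data flips the sign.

import numpy as np
import scipy.stats as sp

# Exponential data has a long right tail, so skew is strongly positive.
skewed_vals = np.random.exponential(1.0, 10000)
print(sp.skew(skewed_vals))   # roughly 2

# Negating the data mirrors it, so the tail points left and skew goes negative.
print(sp.skew(-skewed_vals))  # roughly -2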
The fourth moment is kurtosis, which describes the shape of the tails. Again, for a normal distribution this should be about zero. SciPy provides us with another simple function call:
sp.kurtosis(vals)
And here's the output:
Out[5]: 0.059954502386585506
Indeed, it turns out to be very close to zero. Kurtosis describes our data distribution in two linked ways: how heavy the tails are, and how sharp the peak is. If I squish the tails down, that pushes the peak up to be pointier; likewise, if I push the peak down, that spreads things out a little, making the tails a bit fatter and the peak a bit lower. So that's what kurtosis means, and in this example, kurtosis is near zero because it is just a plain old normal distribution.
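If you're curious how other shapes compare, here's a little experiment you could try (again, my own sketch, not from the notebook). Note that scipy.stats.kurtosis reports excess kurtosis by default, so a normal distribution lands near zero, a fat-tailed Laplace distribution comes out positive, and a uniform distribution with no tails at all comes out negative.

import numpy as np
import scipy.stats as sp

normal_vals = np.random.normal(0, 0.5, 10000)
laplace_vals = np.random.laplace(0, 0.5, 10000)   # fatter tails, sharper peak
uniform_vals = np.random.uniform(-1, 1, 10000)    # no tails at all

print(sp.kurtosis(normal_vals))   # close to 0
print(sp.kurtosis(laplace_vals))  # positive, around 3
print(sp.kurtosis(uniform_vals))  # negative, about -1.2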
If you want to play around with it, go ahead and try modifying the distribution. Make it centered around something other than 0, and see whether that changes anything. Should it? Well, it really shouldn't, apart from the mean itself, because skew and kurtosis are measures of the shape of the distribution, and they don't say anything about where that distribution actually sits. That's what the moments are all about. Go ahead and try different center values and different standard deviation values, and see what effect they have: the mean will follow the center you choose and the variance will follow the standard deviation you choose, but the skew and kurtosis shouldn't really budge. Play around, find out.
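Here's one way you might set up that experiment (a sketch of my own, assuming the same imports as before): generate two datasets that differ only in where they're centered, and compare all four moments side by side.

import numpy as np
import scipy.stats as sp

centered = np.random.normal(0, 0.5, 10000)   # centered at 0
shifted = np.random.normal(10, 0.5, 10000)   # same shape, centered at 10

for data in (centered, shifted):
    # The mean moves with the center; variance, skew, and kurtosis barely change.
    print(np.mean(data), np.var(data), sp.skew(data), sp.kurtosis(data))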
There you have percentiles and moments. Percentiles are a pretty simple concept. Moments sound hard, but it's actually pretty easy to understand how to do it, and it's easy in Python too. Now you have that under your belt. It's time to move on.