In Timothy Sturm's example, we claimed that the histogram of some data seemed to fit a normal distribution. SciPy has a few routines to help us find the distribution that best fits a random variable, together with the parameters of that fit. For example, for the data in that problem, the mean and standard deviation of the best-fitting normal distribution can be found in the following way:
>>> from scipy.stats import norm     # Gaussian distribution
>>> mean, std = norm.fit(dataDiff)
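Since dataDiff comes from the earlier example and is not defined in this excerpt, the following self-contained sketch uses synthetic normally distributed data as a stand-in for dataDiff, to illustrate that norm.fit returns maximum likelihood estimates of the location (mean) and scale (standard deviation):

```python
import numpy as np
from scipy.stats import norm

# Synthetic stand-in for dataDiff: 1000 samples drawn from N(0.5, 2.0)
rng = np.random.default_rng(0)
dataDiff = rng.normal(loc=0.5, scale=2.0, size=1000)

# norm.fit returns the MLE of (loc, scale), i.e. the sample mean and std
mean, std = norm.fit(dataDiff)
print(mean, std)  # close to 0.5 and 2.0
```

With 1000 samples, the estimates typically land within a few percent of the true parameters.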
We can now plot the normalized (density) histogram of the data, together with the computed probability density function, as follows:
>>> plt.hist(dataDiff, density=True)
>>> x = numpy.linspace(dataDiff.min(), dataDiff.max(), 1000)
>>> pdf = norm.pdf(x, mean, std)
>>> plt.plot(x, pdf)
>>> plt.show()
We will obtain the following graph, showing the maximum likelihood estimate of the normal distribution that best fits dataDiff:
We may even fit the best probability density function without specifying any particular distribution, thanks to a non-parametric technique, kernel density estimation. An algorithm to perform Gaussian kernel density estimation is available in the scipy.stats submodule as gaussian_kde. Let us show it by example, with the same data as before:
>>> from scipy.stats import gaussian_kde
>>> pdf = gaussian_kde(dataDiff)
A slightly different plotting session from the one given before offers us the following graph, showing the probability density function obtained by kernel density estimation on dataDiff:
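The smoothness of a kernel density estimate is governed by its bandwidth. gaussian_kde accepts a bw_method argument: a rule name such as 'scott' (the default) or 'silverman', or a scalar that is used directly as the bandwidth factor. A small sketch, again with synthetic data standing in for dataDiff:

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic stand-in for dataDiff
rng = np.random.default_rng(1)
sample = rng.normal(size=500)
x = np.linspace(sample.min(), sample.max(), 1000)

# Default bandwidth uses Scott's rule; a smaller scalar gives a wigglier estimate
pdf_scott = gaussian_kde(sample)
pdf_narrow = gaussian_kde(sample, bw_method=0.2)

print(pdf_scott.factor, pdf_narrow.factor)
print(pdf_scott.evaluate(x)[:3])  # density values on the grid
```

Choosing the bandwidth is the key trade-off in kernel density estimation: too large oversmooths real features, too small chases noise in the sample.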
The full piece of code is as follows:
>>> from scipy.stats import gaussian_kde
>>> pdf = gaussian_kde(dataDiff)
>>> pdf = pdf.evaluate(x)
>>> plt.hist(dataDiff, density=True)
>>> plt.plot(x, pdf, 'k')
>>> plt.savefig("hist2.png")
>>> plt.show()
For comparative purposes, the last two plots can be combined into one:
>>> plt.hist(dataDiff, density=True)
>>> plt.plot(x, pdf, 'k.-', label='Kernel fit')
>>> plt.plot(x, norm.pdf(x, mean, std), 'r', label='Normal fit')
>>> plt.legend()
>>> plt.savefig("hist3.png")
>>> plt.show()
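Beyond visual comparison, the quality of the normal fit can be assessed quantitatively, for instance with a Kolmogorov-Smirnov test via scipy.stats.kstest. A sketch with synthetic data standing in for dataDiff (note that testing against parameters fitted from the same sample makes the reported p-value optimistic):

```python
import numpy as np
from scipy.stats import norm, kstest

# Synthetic stand-in for dataDiff
rng = np.random.default_rng(2)
sample = rng.normal(loc=1.0, scale=0.5, size=400)

mean, std = norm.fit(sample)

# Compare the empirical distribution against the fitted normal
stat, pvalue = kstest(sample, 'norm', args=(mean, std))
print(stat, pvalue)
```

A small KS statistic (and a large p-value) indicates no evidence against the normal model for this sample.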