The pandas DataFrame has a dozen statistical methods. The following table lists these methods along with a short description:
Using the same data as in the previous example, we will demonstrate these statistical methods. The full script is in the stats_demo.py file of this book's code bundle:
    import Quandl

    # Data from http://www.quandl.com/SIDC/SUNSPOTS_A-Sunspot-Numbers-Annual
    # PyPi url https://pypi.python.org/pypi/Quandl
    sunspots = Quandl.get("SIDC/SUNSPOTS_A")
    print "Describe", sunspots.describe()
    print "Non NaN observations", sunspots.count()
    print "MAD", sunspots.mad()
    print "Median", sunspots.median()
    print "Min", sunspots.min()
    print "Max", sunspots.max()
    print "Mode", sunspots.mode()
    print "Standard Deviation", sunspots.std()
    print "Variance", sunspots.var()
    print "Skewness", sunspots.skew()
    print "Kurtosis", sunspots.kurt()
The following is the output of the script:
    Describe            Number
    count  314.000000
    mean    49.528662
    std     40.277766
    min      0.000000
    25%     16.000000
    50%     40.000000
    75%     69.275000
    max    190.200000

    [8 rows x 1 columns]
    Non NaN observations Number    314
    dtype: int64
    MAD Number    32.483184
    dtype: float64
    Median Number    40
    dtype: float64
    Min Number    0
    dtype: float64
    Max Number    190.2
    dtype: float64
    Mode    Number
    0      47

    [1 rows x 1 columns]
    Standard Deviation Number    40.277766
    dtype: float64
    Variance Number    1622.298473
    dtype: float64
    Skewness Number    0.994262
    dtype: float64
    Kurtosis Number    0.469034
    dtype: float64
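If the sunspot data is not at hand, the same methods can be tried on a small, made-up DataFrame. The following is only a sketch under a few assumptions: it targets Python 3 and a recent pandas release, the five values in the Number column are invented for illustration and are not sunspot numbers, and mean_abs_deviation() is a hypothetical helper of our own. The helper is there because DataFrame.mad() was removed in pandas 2.0, so we compute the mean absolute deviation around the mean by hand:

```python
import pandas as pd

# Made-up values for illustration only (not the sunspot data).
df = pd.DataFrame({"Number": [1.0, 2.0, 2.0, 3.0, 7.0]})


def mean_abs_deviation(frame):
    """Mean absolute deviation around the mean, per column.

    Hypothetical helper standing in for DataFrame.mad(),
    which is not available in pandas 2.0 and later.
    """
    return (frame - frame.mean()).abs().mean()


print("Describe", df.describe(), sep="\n")
print("Non NaN observations", df.count(), sep="\n")
print("MAD", mean_abs_deviation(df), sep="\n")
print("Median", df.median(), sep="\n")
print("Min", df.min(), sep="\n")
print("Max", df.max(), sep="\n")
print("Mode", df.mode(), sep="\n")
print("Standard Deviation", df.std(), sep="\n")
print("Variance", df.var(), sep="\n")
print("Skewness", df.skew(), sep="\n")
print("Kurtosis", df.kurt(), sep="\n")
```

Note that std() and var() use the sample definitions (dividing by n - 1) by default, so for these five values the variance is 22 / 4 = 5.5 rather than 22 / 5.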