In order to calculate standard descriptive statistics, you can use the following built-in functions:
- VARIANCE: This gives you the variance of your data, which calculates what the difference is for each individual point and the mean of all the points. If zero is returned, then all the data points are the same. A larger value returned means that individual data points are farther from the mean.
- STDDEV: This gives you the standard deviation, which helps you understand how spread out your data is, as well as how close each individual data point is to the mean (you calculate the mean by using the AVG built-in function). If zero is returned, then all the data points are exactly the same. A larger value returned means that individual data points are farther from the mean. Using the standard deviation gives you a standard way of knowing what is normal, and helps to find outlier values.
Execute the following query to see each of the statistical aggregate functions in action:
USE lahmansbaseballdb;
SELECT
STDDEV(h) AS 'stddev',
VARIANCE(h) AS 'variance'
FROM batting;
The previous query provides the results shown in the following table:
stddev |
stddev |
52.4710125 |
2753.20716 |
The first column shows you the stddev of all hits in the batting table. The second column shows you the variance of hits in the batting table.