Kurtosis

Kurtosis quantifies differences in the shape of distributions that may look very similar in terms of mean and variance yet are actually different. In such cases, kurtosis is a good measure of how much of a distribution's weight lies in its tails compared with its middle.
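To make the idea concrete, here is a minimal sketch in plain Scala that computes kurtosis from the second and fourth central moments. It assumes the excess-kurtosis convention (fourth standardized moment minus 3, so a normal distribution scores about 0), which is the convention Spark's kurtosis function is understood to follow. The object name, the helper excessKurtosis, and both sample sequences are made up for illustration; this is not Spark's implementation.

```scala
object KurtosisSketch {
  /** Excess kurtosis from population central moments: m4 / m2^2 - 3. */
  def excessKurtosis(xs: Seq[Double]): Double = {
    require(xs.nonEmpty, "need at least one value")
    val n = xs.size.toDouble
    val mean = xs.sum / n
    // Second central moment (population variance)
    val m2 = xs.map(x => math.pow(x - mean, 2)).sum / n
    // Fourth central moment, which is dominated by values far from the mean
    val m4 = xs.map(x => math.pow(x - mean, 4)).sum / n
    m4 / (m2 * m2) - 3.0
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical samples: one evenly spread, one with most mass in the middle and two extremes
    val evenlySpread = Seq(1.0, 2.0, 3.0, 4.0, 5.0)
    val heavyTailed  = Seq(-10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0)
    println(excessKurtosis(evenlySpread)) // negative: lighter tails than a normal distribution
    println(excessKurtosis(heavyTailed))  // positive: heavier tails than a normal distribution
  }
}
```

The evenly spread sample yields a negative value and the sample with two extreme points yields a positive one, which is the tail-versus-middle contrast described above.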

The kurtosis API has two overloads, as follows. Which one you use depends on whether you refer to the column by name or pass a Column expression.

def kurtosis(columnName: String): Column
Aggregate function: returns the kurtosis of the values in a group.

def kurtosis(e: Column): Column
Aggregate function: returns the kurtosis of the values in a group.

Let's look at an example of invoking kurtosis on the Population column of the DataFrame:

scala> import org.apache.spark.sql.functions._
scala> statesPopulationDF.select(kurtosis("Population")).show

+--------------------+
|kurtosis(Population)|
+--------------------+
|   7.727421920829375|
+--------------------+
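The Column overload is handy when kurtosis is one of several aggregates or is computed per group. The following is a minimal sketch, assuming a local SparkSession and a small made-up DataFrame named toyDF with State and Population columns; the object name, the compute helper, and all values are hypothetical and stand in for statesPopulationDF.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, kurtosis}

object KurtosisOverloads {
  /** Returns kurtosis computed via the String overload and via the Column overload. */
  def compute(spark: SparkSession): (Double, Double) = {
    import spark.implicits._

    // Made-up toy data, only to exercise both overloads
    val toyDF = Seq(
      ("A", 1.0), ("A", 2.0), ("A", 3.0), ("A", 4.0), ("A", 50.0),
      ("B", 10.0), ("B", 11.0), ("B", 12.0), ("B", 13.0), ("B", 14.0)
    ).toDF("State", "Population")

    // String overload: refer to the column by name
    val byName = toyDF.select(kurtosis("Population")).first().getDouble(0)
    // Column overload: pass a Column expression
    val byColumn = toyDF.select(kurtosis(col("Population"))).first().getDouble(0)

    // Per-group kurtosis, using the Column overload inside agg
    toyDF.groupBy("State")
      .agg(kurtosis(col("Population")).alias("kurtosis"))
      .show()

    (byName, byColumn)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("kurtosis-overloads")
      .getOrCreate()
    try println(compute(spark))
    finally spark.stop()
  }
}
```

Both overloads build the same aggregate expression, so the two values in the returned pair should agree.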