Useful numerical methods of NumPy arrays

NumPy arrays provide many built-in methods, a number of which are statistical functions that you can use for data analysis. The following examples demonstrate several of the most useful ones.

Note

Most of these functions work on multi-dimensional arrays, and the axis parameter specifies the axis along which the function is applied. We will examine this for the .min() and .max() methods, but the axis parameter applies to many other NumPy functions.

The .min() and .max() methods return the minimum and maximum values in an array. The .argmin() and .argmax() methods return the position of the minimum or maximum value; when no axis is given, that position is an index into the flattened array:

In [82]:
   # demonstrate some of the properties of NumPy arrays
   m = np.arange(10, 19).reshape(3, 3)
   print (m)
   print ("{0} min of the entire matrix".format(m.min()))
   print ("{0} max of entire matrix".format(m.max()))
   print ("{0} position of the min value".format(m.argmin()))
   print ("{0} position of the max value".format(m.argmax()))
   print ("{0} mins down each column".format(m.min(axis = 0)))
   print ("{0} mins across each row".format(m.min(axis = 1)))
   print ("{0} maxs down each column".format(m.max(axis = 0)))
   print ("{0} maxs across each row".format(m.max(axis = 1)))

   [[10 11 12]
    [13 14 15]
    [16 17 18]]
   10 min of the entire matrix
   18 max of entire matrix
   0 position of the min value
   8 position of the max value
   [10 11 12] mins down each column
   [10 13 16] mins across each row
   [16 17 18] maxs down each column
   [12 15 18] maxs across each row
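
Because .argmin() and .argmax() report a position in the flattened array when no axis is given, it can help to convert that position back into row and column coordinates. The following is a minimal sketch of one way to do this using np.unravel_index; the variable names are chosen for illustration only:

```python
import numpy as np

m = np.arange(10, 19).reshape(3, 3)

# with no axis, argmax() returns an index into the flattened array
flat_pos = m.argmax()

# np.unravel_index converts the flat index into (row, column) coordinates
row, col = np.unravel_index(flat_pos, m.shape)
print(int(flat_pos), (int(row), int(col)))

# with an axis, argmax() returns one position per column or per row
col_pos = m.argmax(axis=0)  # row index of the max within each column
row_pos = m.argmax(axis=1)  # column index of the max within each row
print(col_pos, row_pos)
```

For this matrix the largest value, 18, sits at flat position 8, which corresponds to row 2, column 2.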

The .mean(), .std(), and .var() methods compute the mathematical mean, standard deviation, and variance of the values in an array:

In [83]:
   # demonstrate included statistical methods
   a = np.arange(10)
   a

Out[83]:
   array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [84]:
   a.mean(), a.std(), a.var()

Out[84]:
   (4.5, 2.8722813232690143, 8.25)
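
By default, .std() and .var() divide by the number of elements N, which gives the population statistics shown above. If you need the sample statistics, which divide by N - 1, both methods accept a ddof ("delta degrees of freedom") parameter. A short sketch, with variable names that are illustrative only:

```python
import numpy as np

a = np.arange(10)

# default ddof=0: divide by N (population variance and standard deviation)
pop_var = a.var()
pop_std = a.std()

# ddof=1: divide by N - 1 (sample variance and standard deviation)
sample_var = a.var(ddof=1)
sample_std = a.std(ddof=1)

print(pop_var, sample_var)
print(pop_std, sample_std)
```

The sample values are always slightly larger than the population values for the same data, and the difference shrinks as the array grows.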

The sum and product of all the elements in an array can be computed with the .sum() and .prod() methods:

In [85]:
   # demonstrate sum and prod
   a = np.arange(1, 6)
   a

Out[85]:
   array([1, 2, 3, 4, 5])

In [86]:
   a.sum(), a.prod()

Out[86]:
   (15, 120)
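
As with .min() and .max(), the axis parameter also applies to .sum() and .prod(). The following sketch uses a small 2 x 3 matrix, chosen here purely for illustration, to show the per-column and per-row results:

```python
import numpy as np

m = np.arange(1, 7).reshape(2, 3)  # [[1 2 3], [4 5 6]]

col_sums = m.sum(axis=0)    # sums down each column: [5 7 9]
row_sums = m.sum(axis=1)    # sums across each row: [6 15]
col_prods = m.prod(axis=0)  # products down each column: [4 10 18]
row_prods = m.prod(axis=1)  # products across each row: [6 120]

print(col_sums, row_sums)
print(col_prods, row_prods)
```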

The cumulative sums and products can be computed with the .cumsum() and .cumprod() methods:

In [87]:
   # and cumulative sum and prod
   a.cumsum(), a.cumprod()

Out[87]:
   (array([ 1,  3,  6, 10, 15]), array([  1,   2,   6,  24, 120]))
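
The last element of a cumulative sum equals the total sum, and on a multi-dimensional array the result depends on the axis argument. A brief sketch under those assumptions, using an illustrative 2 x 3 matrix:

```python
import numpy as np

a = np.arange(1, 6)
running = a.cumsum()

# the final running total matches the overall sum
print(int(running[-1]), int(a.sum()))

m = np.arange(1, 7).reshape(2, 3)  # [[1 2 3], [4 5 6]]

flat_cum = m.cumsum()        # no axis: accumulates over the flattened array
down_cols = m.cumsum(axis=0) # accumulates down each column
across_rows = m.cumsum(axis=1) # accumulates across each row

print(flat_cum)
print(down_cols)
print(across_rows)
```

Note that without an axis, .cumsum() returns a one-dimensional array, whereas with an axis the result keeps the shape of the input.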

The .all() method returns True if all elements of an array are true, and .any() returns True if any element of the array is true.

In [88]:
   # applying logical operators
   a = np.arange(10)
   (a < 5).any() # any < 5?

Out[88]:
   True

In [89]:
   (a < 5).all() # all < 5?

Out[89]:
   False
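
The .any() and .all() methods also accept the axis parameter, and because True counts as 1 when summed, .sum() on a Boolean array gives the number of matching elements. A small sketch, with illustrative names:

```python
import numpy as np

a = np.arange(10)
mask = a < 5

# summing a Boolean array counts the True values
count_below_5 = int(mask.sum())
print(count_below_5)

m = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]]

# is every element in each row greater than 2?
rows_all = (m > 2).all(axis=1)

# does each column contain at least one element greater than 2?
cols_any = (m > 2).any(axis=0)

print(rows_all, cols_any)
```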

The .size property returns the number of elements in the array across all dimensions:

In [90]:
   # size is always the total number of elements
   np.arange(10).reshape(2, 5).size

Out[90]:
   10

The .ndim property returns the number of dimensions of an array:

In [91]:
   # .ndim will give you the total # of dimensions
   np.arange(10).reshape(2,5).ndim

Out[91]:
   2
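
The .size and .ndim properties are both derived from the array's .shape: .ndim is the length of the shape tuple, and .size is the product of its entries. A quick sketch that checks this relationship:

```python
import numpy as np

m = np.arange(10).reshape(2, 5)

shape = m.shape               # (2, 5)
ndim_from_shape = len(shape)  # number of dimensions
size_from_shape = int(np.prod(shape))  # total number of elements

print(shape, m.ndim, m.size)
print(ndim_from_shape == m.ndim, size_from_shape == m.size)
```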

NumPy provides many other statistical and descriptive functions besides those demonstrated here. This was meant to be a brief overview of NumPy arrays; the next two chapters, on pandas Series and DataFrame objects, will dive deeper into these additional methods.
