Mathematical operations with NaN

The pandas and numpy libraries handle NaN values differently for mathematical operations.

Consider the following example:

import numpy as np
import pandas as pd

ar1 = np.array([100, 200, np.nan, 300])
ser1 = pd.Series(ar1)

ar1.mean(), ser1.mean()

The output of the preceding code is the following:

(nan, 200.0)

Note the following things:

  • When a standard NumPy aggregation such as mean() encounters a NaN value, it returns NaN.
  • Pandas, on the other hand, skips the NaN values by default and carries on with the computation. For the sum operation, this is equivalent to treating NaN as 0. If all the values are NaN, the mean is also NaN, whereas the sum is 0.0 in pandas 0.22 and later unless min_count=1 is passed.
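The two points above can be checked with a short sketch. The variable names are illustrative and not part of the original example; the skipna and min_count parameters and the np.nanmean function are standard pandas and NumPy features.

```python
import numpy as np
import pandas as pd

values = np.array([100, 200, np.nan, 300])
series = pd.Series(values)

# Plain NumPy aggregations propagate NaN.
numpy_mean = values.mean()                       # nan
# NumPy's NaN-aware variant skips missing values instead.
numpy_nanmean = np.nanmean(values)               # 200.0

# pandas skips NaN by default (skipna=True).
pandas_mean = series.mean()                      # 200.0
# Passing skipna=False reproduces the NumPy behavior.
pandas_mean_strict = series.mean(skipna=False)   # nan

# Edge case: a Series that contains only NaN values.
all_missing = pd.Series([np.nan, np.nan])
all_missing_sum = all_missing.sum()                     # 0.0 in pandas 0.22 and later
all_missing_sum_strict = all_missing.sum(min_count=1)   # nan
all_missing_mean = all_missing.mean()                   # nan
```

The min_count argument sets how many valid values are required before a sum is reported, which is useful when an all-missing column should not silently read as zero.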

Let's compute the total quantity of fruits sold by store4:

ser2 = dfx.store4
ser2.sum()

The output of the preceding code is as follows:

38.0

Note that store4 has five NaN values. During summation, however, these values are skipped, which is equivalent to treating them as 0, so the result is 38.0.
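To try this without the full dfx DataFrame, the following sketch builds a stand-in for the store4 column. The individual values (20 for apple and 18 for watermelon) are an assumption inferred from the outputs shown in this section, not the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfx.store4, reconstructed from the printed outputs.
store4 = pd.Series(
    [20, np.nan, np.nan, np.nan, np.nan, 18, np.nan],
    index=["apple", "banana", "kiwi", "grapes", "mango", "watermelon", "oranges"],
    name="store4",
)

missing_count = int(store4.isna().sum())   # number of NaN entries: 5
valid_count = int(store4.count())          # number of non-NaN entries: 2

total = store4.sum()                       # NaN skipped by default: 38.0
strict_total = store4.sum(skipna=False)    # NaN propagates: nan
```

Checking isna().sum() alongside the total is a cheap way to see how much of a column actually contributed to the result.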

Similarly, we can compute averages as shown here:

ser2.mean()

The output of the code is the following:

19.0
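The result of 19.0 shows that the mean is taken over the valid values only. A minimal sketch comparing this with an explicit zero fill follows; the Series is a hypothetical stand-in for dfx.store4 with values assumed from the outputs in this section.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfx.store4 (values assumed from the printed outputs).
store4 = pd.Series(
    [20, np.nan, np.nan, np.nan, np.nan, 18, np.nan],
    index=["apple", "banana", "kiwi", "grapes", "mango", "watermelon", "oranges"],
    name="store4",
)

# Default behavior: NaN entries are dropped from both numerator and denominator.
skipped_mean = store4.mean()                 # (20 + 18) / 2 = 19.0

# Treating NaN as 0 keeps all seven rows in the denominator.
zero_filled_mean = store4.fillna(0).mean()   # 38 / 7, roughly 5.43
```

Which of the two is appropriate depends on whether a missing entry means "no sales" or "no data"; pandas assumes the latter unless the values are filled explicitly.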

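Note that the NaNs are skipped rather than treated as 0s: the mean is taken over the two valid values only (38.0 / 2 = 19.0). Missing values are skipped in the same way during cumulative summing: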

ser2.cumsum()

And the output of the preceding code is as follows:

apple 20.0
banana NaN
kiwi NaN
grapes NaN
mango NaN
watermelon 38.0
oranges NaN
Name: store4, dtype: float64

Note that the running total advances only at positions that hold actual values; positions that hold NaN remain NaN in the output.
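The cumulative behavior can be compared against two alternatives in a short sketch. As before, the Series is a hypothetical stand-in for dfx.store4 with values assumed from the output above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfx.store4 (values assumed from the printed output).
store4 = pd.Series(
    [20, np.nan, np.nan, np.nan, np.nan, 18, np.nan],
    index=["apple", "banana", "kiwi", "grapes", "mango", "watermelon", "oranges"],
    name="store4",
)

# Default: NaN positions stay NaN, but the running total carries across them.
running = store4.cumsum()

# skipna=False: once a NaN is met, every later entry is NaN as well.
running_strict = store4.cumsum(skipna=False)

# Filling with 0 first gives a running total with no gaps.
running_filled = store4.fillna(0).cumsum()

print(running)
print(running_strict)
print(running_filled)
```

With the default settings, watermelon reports 38.0 even though four NaN rows sit between it and apple, which matches the output shown earlier.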
