Filling missing values

We can use the fillna() method to replace NaN values with a value of our choice.

Check the following example:

filledDf = dfx.fillna(0)
filledDf

The output of the preceding code is shown in the following screenshot:

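Note that in the preceding dataframe, all the NaN values are replaced by 0. Replacing the values with 0 will affect several statistics, including the mean, median, and standard deviation. The sum is unaffected, because pandas skips NaN values by default when summing.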

Check the difference in the following two examples:

dfx.mean()

And the output of the preceding code is as follows:

store1 20.0
store2 21.0
store3 22.0
store4 19.0
store5 NaN
dtype: float64

Now, let's compute the mean from the filled dataframe with the following command:

filledDf.mean()

And the output we get is as follows:

store1 17.142857
store2 18.000000
store3 18.857143
store4 5.428571
store5 0.000000
dtype: float64

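Note that the values are different, and for some columns substantially so: the mean of store4 drops from 19.0 to about 5.43, and store5 changes from NaN to 0. The zeros are counted as real observations, which pulls each mean down. Hence, filling with 0 might not be the optimal solution.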
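To see where the shift comes from, here is a minimal sketch on a small, made-up dataframe (the values and the name df are assumptions for illustration, not the dfx used above). By default, mean() skips NaN, so it divides by the number of non-missing rows; after fillna(0), it divides by all rows. The last step fills each column with its own mean, which is only one possible alternative and not necessarily the right choice for every dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only (not the dfx from the text).
df = pd.DataFrame({
    "store1": [10.0, 20.0, np.nan, 30.0],
    "store2": [np.nan, np.nan, np.nan, np.nan],
})

# NaN values are skipped by default: store1 = 60 / 3 = 20.0, store2 = NaN.
mean_skipna = df.mean()

# After filling with 0 the zeros count as observations:
# store1 = 60 / 4 = 15.0, store2 = 0.0.
mean_zero = df.fillna(0).mean()

# One possible alternative: fill each column with its own mean.
# Passing a Series to fillna() matches values to columns by label.
# store2 has no valid values, so its mean is NaN and it stays unfilled.
filled_mean = df.fillna(df.mean())
mean_after = filled_mean.mean()

print(pd.DataFrame({
    "skip NaN": mean_skipna,
    "fill 0": mean_zero,
    "fill column mean": mean_after,
}))
```

Filling with the column mean leaves the mean of store1 at 20.0, but it does reduce the spread of the column, so the trade-off depends on which statistics matter for the analysis.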
