Puzzle 18: Off with Their NaNs
| import numpy as np |
| import pandas as pd |
| |
| s = pd.Series([1, np.nan, 3]) |
| print(s[~(s == np.nan)]) |
Guess the Output
Try to guess what the output is before reading on.
This code will print:
| 0 1.0 |
| 1 NaN |
| 2 3.0 |
| dtype: float64 |
We covered some floating-point oddities in Puzzle 12, Multiplying. NaN (or np.nan) is another one. The name NaN stands for "not a number." It serves two purposes: marking the result of an invalid computation and representing missing values.
Here’s an example of a bad computation:
| In [1]: np.float64(0)/np.float64(0) |
| RuntimeWarning: invalid value encountered in |
| double_scalars np.float64(0)/np.float64(0) |
| Out[1]: nan |
You see a warning but not an exception, and the return value is nan.
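If you would rather get an exception than a warning, NumPy's np.errstate context manager can change how invalid floating-point operations are handled. The following is a minimal sketch, assuming you want the stricter behavior only inside one block; the names zero and raised are illustrative.

```python
import numpy as np

zero = np.float64(0)

# Inside this block, invalid operations such as 0/0 raise instead of warning.
try:
    with np.errstate(invalid='raise'):
        zero / zero
    raised = False
except FloatingPointError:
    raised = True

print(raised)  # True
```

Outside the with block, NumPy goes back to its default of warning and returning nan.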
nan does not equal any number, including itself.
| In [2]: np.nan == np.nan |
| Out[2]: False |
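The same rule applies element by element in a Series, which is what trips up the teaser. Here is a short sketch of the mask the teaser builds; the names mask and kept are illustrative and not part of the original code.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, np.nan, 3])

# Comparing with np.nan is False for every element, including the NaN itself.
mask = s == np.nan
print(mask.tolist())  # [False, False, False]

# Negating an all-False mask gives an all-True mask, so nothing is filtered out.
kept = s[~mask]
print(len(kept))  # 3
```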
To check that a value is nan, you need to use a special function such as pandas.isnull:
| In [3]: pd.isnull(np.nan) |
| Out[3]: True |
In the teaser, s == np.nan is False for every element, so negating it selects every row. You can use pandas.isnull to fix the teaser:
| import numpy as np |
| import pandas as pd |
| |
| s = pd.Series([1, np.nan, 3]) |
| print(s[~pd.isnull(s)]) |
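Pandas also offers shortcuts for the same filtering. The sketch below shows two alternatives, Series.notna and Series.dropna; it is one possible way to write the fix, not the only one, and the names by_mask and by_dropna are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, np.nan, 3])

# notna() is the element-wise negation of isnull().
by_mask = s[s.notna()]

# dropna() returns a new Series without the missing values.
by_dropna = s.dropna()

print(by_dropna)
print(by_mask.equals(by_dropna))  # True
```

Both versions keep the original index labels (0 and 2), just like the pd.isnull version.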
pandas.isnull works with all Pandas “missing” values: None, pandas.NaT (not a time), and the new pandas.NA.
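As a quick check, the sketch below calls pd.isnull on each of these values. It assumes a Pandas version that provides pandas.NA (1.0 or later); the names missing and results are illustrative.

```python
import numpy as np
import pandas as pd

# Each of these is treated as "missing" by Pandas.
missing = [np.nan, None, pd.NaT, pd.NA]
results = [pd.isnull(value) for value in missing]
print(results)  # [True, True, True, True]

# Ordinary values, including 0 and the empty string, are not missing.
print(pd.isnull(0), pd.isnull(""))  # False False
```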
Floating-point numbers have several other special values, such as inf (infinity), -inf, -0, and +0. You can learn more about them at the following links.
http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.isnull.html
http://pandas.pydata.org/pandas-docs/stable/user_guide/missing_data.html#missing-data-na
http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
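To get a feel for the special floating-point values mentioned above, here is a small sketch using plain Python floats and np.inf. The names diff and neg_zero are illustrative.

```python
import math
import numpy as np

# Infinity compares greater than any finite float.
print(np.inf > 1e308)  # True

# inf - inf is an invalid computation, so the result is nan.
diff = float("inf") - float("inf")
print(math.isnan(diff))  # True

# -0.0 and +0.0 compare equal, but they carry different signs.
neg_zero = -0.0
print(neg_zero == 0.0)             # True
print(math.copysign(1, neg_zero))  # -1.0
```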