How it works...

By default, the mask method replaces the values it covers with missing values. Its first parameter is the condition, which is often a boolean Series such as criteria. Because mask is called from a DataFrame, every value in each row where the condition is True changes to missing. Step 3 takes this masked DataFrame and drops the rows in which all values are missing. Step 4 shows how to do the same procedure with boolean indexing.
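The following is a minimal sketch of the mask-then-drop idea next to its boolean-indexing counterpart. The toy DataFrame, its column names, and the threshold are hypothetical stand-ins for the recipe's data; only mask, dropna, and boolean indexing come from the recipe itself.

```python
import pandas as pd

# Hypothetical toy data; the recipe itself works on a larger dataset.
df = pd.DataFrame(
    {"score": [8, 3, 9, 2], "votes": [100, 50, 75, 20]},
    index=["a", "b", "c", "d"],
)

# Boolean Series aligned on the row index: True marks the rows to cover up.
criteria = df["score"] < 5

# mask sets every value to NaN in the rows where criteria is True.
masked = df.mask(criteria)

# Dropping rows that are entirely missing leaves only the unmasked rows.
df_mask = masked.dropna(how="all")

# Boolean indexing selects the same rows directly by inverting the condition.
df_boolean = df[~criteria]
```

Both results hold the same rows, but the masked version passed through missing values along the way, which can change the column data types.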

During a data analysis, it is very important to continually validate results. Checking the equality of Series and DataFrames is an extremely common approach to validation. Our first attempt, in step 4, yielded an unexpected result. Basic sanity checks, such as confirming that the number of rows and columns match or that the row and column names are the same, are worth running before going deeper.
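As an illustration, the sanity checks described above might look like the sketch below. The two small DataFrames are hypothetical; shape, Index.equals, and columns are standard pandas attributes and methods.

```python
import pandas as pd

# Hypothetical frames that hold the same values with different data types.
left = pd.DataFrame({"x": [1, 2]}, index=["a", "b"])
right = pd.DataFrame({"x": [1.0, 2.0]}, index=["a", "b"])

# Same number of rows and columns?
same_shape = left.shape == right.shape

# Same row labels and same column labels, in the same order?
same_rows = left.index.equals(right.index)
same_columns = left.columns.equals(right.columns)
```

If all three checks pass and the DataFrames still compare as unequal, the difference lies in the values or in the data types.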

Step 6 compares the two Series of data types. This is where we uncover why the DataFrames were not equivalent: the equals method checks that both the values and the data types are the same. The assert_frame_equal function from step 7 has many parameters for testing equality in a variety of ways. Notice that there is no output after calling assert_frame_equal. This function returns None when the two DataFrames passed to it are equal and raises an error when they are not.
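A short sketch of the three comparisons discussed here, using hypothetical DataFrames that differ only in data type. The check_dtype parameter is one of the options assert_frame_equal accepts; whether the recipe uses exactly this option is an assumption.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical frames: identical values, different data types.
df_int = pd.DataFrame({"votes": [100, 75]})
df_float = df_int.astype("float64")

# equals requires matching data types as well as values, so this is False.
same = df_int.equals(df_float)

# Comparing the two dtypes Series pinpoints the columns that differ.
dtype_match = df_int.dtypes == df_float.dtypes

# With check_dtype=False only the values are compared.
# The call returns None when the frames are equal.
result = assert_frame_equal(df_int, df_float, check_dtype=False)

# Without that option, the data type mismatch raises an AssertionError.
try:
    assert_frame_equal(df_int, df_float)
    raised = False
except AssertionError:
    raised = True
```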
