How it works...

Step 2 appears at first to create two unique objects, but in fact it creates a single object that is referred to by two different variable names. The expression employee['BASE_SALARY'] technically creates a view, not a brand new copy. This is verified with the is operator, which tests object identity rather than equality of values.

In pandas, a view is not a new object but just a reference to another object, usually some subset of a DataFrame. Because the object is shared, a change made through one variable name is visible through the other, which can cause many issues.
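The difference between a second name for the same object and an independent copy can be sketched as follows. The tiny employee DataFrame here is a hypothetical stand-in for the recipe's dataset, and the sketch uses plain assignment for the alias, since whether two separate employee['BASE_SALARY'] lookups return the very same object may depend on the pandas version.

```python
import pandas as pd

# Hypothetical stand-in for the recipe's employee DataFrame
employee = pd.DataFrame(
    {"BASE_SALARY": [121862.0, 26125.0, 45279.0]},
    index=pd.Index(["Hispanic", "Hispanic", "White"], name="RACE"),
)

salary1 = employee["BASE_SALARY"]
salary_alias = salary1        # a second name bound to the same object
salary_copy = salary1.copy()  # an independent object with equal values

print(salary1 is salary_alias)      # True: same object
print(salary1 is salary_copy)       # False: different objects
print(salary1.equals(salary_copy))  # True: same values and index
```

The is operator only answers whether two names point at one object; equals compares the contents, which is why the copy passes the second check but not the first.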

To ensure that the two variables reference completely different objects, we use the copy Series method and again verify with the is operator that they are different objects. Step 4 uses the sort_index method to sort the Series by race. Step 5 adds these two Series together. Inspecting just the head does not make clear what has been produced.
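As a minimal sketch of the copy-then-sort step, assuming a small Series with made-up values and unique race labels (the recipe's real data has many duplicate labels), sorting the copy leaves the original untouched, and addition pairs values by label rather than by position:

```python
import pandas as pd

# Made-up values; labels are unique here to keep the result easy to follow
salary1 = pd.Series(
    [10, 20, 30],
    index=pd.Index(["White", "Asian", "Hispanic"], name="RACE"),
)

# Copy first, then sort the copy by its index labels
salary2 = salary1.copy().sort_index()

print(salary1.index.tolist())  # ['White', 'Asian', 'Hispanic'] (unchanged)
print(salary2.index.tolist())  # ['Asian', 'Hispanic', 'White']

# Addition aligns on index labels, so each label is added to itself
total = salary1 + salary2
print(total.loc["Asian"])      # 40
```

With unique labels the result has one row per label, so nothing surprising happens. The surprise in the recipe comes from duplicate labels.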

Step 6 adds salary1 to itself for comparison with the addition of the two different Series. The lengths of all the Series in this recipe are output, and we clearly see that series_add has now exploded to over one million values. A Cartesian product took place for each unique value in the index because the indexes were not exactly the same. This recipe dramatically shows how much of an impact the index can have when combining multiple Series or DataFrames.
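The effect can be reproduced on a toy scale. This is only a sketch with invented labels and values, not the recipe's data: two Series carry the same duplicate labels in a different order, so their indexes are not identical and each label contributes (count on the left) times (count on the right) rows to the sum.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "a", "b"])
s2 = pd.Series([10, 20, 30], index=["a", "b", "a"])  # same labels, different order

same = s1 + s1   # identical indexes: plain element-wise addition
mixed = s1 + s2  # differing indexes: every "a" pairs with every "a"

print(len(same))   # 3
print(len(mixed))  # 5, that is 2 * 2 rows for "a" plus 1 * 1 row for "b"

# Predicted length: per-label product of occurrence counts, summed
expected = (s1.index.value_counts() * s2.index.value_counts()).sum()
print(expected)    # 5
```

Scaling the same arithmetic up to a few thousand rows with a handful of heavily repeated labels is enough to reach the million-row result described above.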
