Sorting a data frame

Sorting in most languages means reorganizing the dataset that you are working with. In data frames, one route to sorting is to select another index for the data frame and order the rows by it. Unless you specify otherwise, a data frame starts out with a basic incremental row index (0, 1, 2, and so on) that pandas supplies by default. You can change the index used to access the data frame and then sort on that index to arrange the data frame in the manner that you want.
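As a minimal sketch of that default index, assuming a small hypothetical frame (the names and ages below are made up and are not the Titanic data), we can inspect the row labels directly:

```python
import pandas as pd

# Hypothetical three-row frame; the values are illustrative only
df = pd.DataFrame({
    "name": ["Cole", "Adams", "Baker"],
    "age": [36.0, 29.0, 24.0],
})

# With no index specified, pandas supplies an incremental RangeIndex
print(type(df.index).__name__)
print(list(df.index))
```

The row labels here are simply the positions 0, 1, and 2, in the order the rows were created.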

If we look at the display of the (Titanic) data frame, we notice the unnamed first column of ordinal values:

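If we were to assign another index to the data frame, its values would replace the ordinal row labels. On its own, set_index does not reorder the rows; chaining sort_index() afterward sorts the data frame by the new index. For example: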

df.set_index('name').head()

Remember, set_index returns a new data frame. Since we have not assigned that result (with the name index) to a variable, our original data frame is still intact.
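The following is a minimal sketch of this behavior, assuming a small hypothetical frame in place of the Titanic data (the names and ages are made up). It shows that set_index only swaps the row labels, that sort_index is what orders the rows by those labels, and that the original frame is left untouched:

```python
import pandas as pd

# Hypothetical stand-in for the Titanic data frame
df = pd.DataFrame({
    "name": ["Cole", "Adams", "Baker"],
    "age": [36.0, 29.0, 24.0],
})

by_name = df.set_index("name")         # new frame; rows keep their original order
sorted_by_name = by_name.sort_index()  # rows ordered by the name labels

print(list(by_name.index))
print(list(sorted_by_name.index))
print(list(df.index))  # the original frame still carries its default labels
```

To keep the re-indexed, sorted frame, assign the result to a variable as done here with sorted_by_name.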

Along the lines of the prior section, a data frame can also be sorted directly by the values in a column, using the sort_values method. For example, we could use the following script:

print(df.sort_values(by='home.dest', ascending=True).head())  

This script takes the data frame, sorts it by the home.dest column in ascending order, and prints the first five records (in that order).
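As a hedged illustration of the same call, the sketch below uses a small hypothetical frame (made-up names and destinations, one of them missing) rather than the real Titanic data. It also shows the na_position parameter of sort_values, which controls where missing values are placed:

```python
import pandas as pd

# Hypothetical rows; one home.dest value is missing on purpose
df = pd.DataFrame({
    "name": ["Cole", "Adams", "Baker"],
    "home.dest": ["Paris, France", None, "Boston, MA"],
})

# Ascending sort; missing values go last by default
ascending = df.sort_values(by="home.dest", ascending=True)

# Descending sort, with missing values moved to the top
descending = df.sort_values(by="home.dest", ascending=False, na_position="first")

print(ascending["name"].tolist())
print(descending["name"].tolist())
```

As with set_index, sort_values returns a new data frame, so df itself keeps its original row order.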

We would see results as follows:
