Manipulating columns in a data frame

A useful column manipulation is sorting. We can sort the earlier age count data to determine the most common ages of travelers on the boat, using the sort_values method.

The script is as follows:

import pandas as pd
# reading an .xls file requires the xlrd package to be installed
df = pd.read_excel('http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.xls')
# the [[]] syntax extracts the column(s) into a new DataFrame
# we group by the age column, then
# count the rows for each distinct age
ages = df[['age']].groupby('age')['age'].count()
print("The most common ages")
# ascending=False puts the largest counts first
print(ages.sort_values(ascending=False))

The resulting Jupyter display is as follows. The data shows that there were many younger travelers on board, which helps explain why there were so many babies as well.
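
To see the sort in isolation, here is a minimal sketch on a small in-memory data frame. The sample ages are made up for illustration and are not taken from the Titanic data set. The sketch also shows value_counts, an alternative not used in the script above, which should produce the same counts already sorted from largest to smallest.

```python
import pandas as pd

# Hypothetical sample data, NOT taken from the Titanic data set
sample = pd.DataFrame({'age': [22, 30, 22, 1, 30, 22, None]})

# Count the rows for each age (missing ages are skipped),
# then sort the counts from largest to smallest
counts = sample.groupby('age')['age'].count()
most_common = counts.sort_values(ascending=False)
print(most_common)

# Alternative: value_counts() counts each distinct value and
# returns the result in descending order of count by default
shortcut = sample['age'].value_counts()
print(shortcut)
```

In this sample, age 22 appears three times, so it comes first in both results. Passing ascending=True (the default for sort_values) would list the least common ages first instead.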
