How to do it...

  1. Read in the college dataset, and find a few basic summary statistics on the undergraduate population and SAT math scores by state and religious affiliation:
>>> import pandas as pd
>>> college = pd.read_csv('data/college.csv')
>>> cg = (college.groupby(['STABBR', 'RELAFFIL'])
...              [['UGDS', 'SATMTMID']]
...              .agg(['count', 'min', 'max']).head(6))
  2. Notice that both index levels have names, which are the old column names. The column levels, on the other hand, do not have names. Use the rename_axis method to supply level names to them:
>>> cg = cg.rename_axis(['AGG_COLS', 'AGG_FUNCS'], axis='columns')
>>> cg
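The effect of rename_axis can also be checked without data/college.csv. The following is a minimal sketch on a made-up five-row frame; the values are invented for illustration, and the names toy, agg, and named are hypothetical, not part of the recipe:

```python
import pandas as pd

# Made-up stand-in for the college data; values are illustrative only.
toy = pd.DataFrame({
    'STABBR': ['AK', 'AK', 'AK', 'AL', 'AL'],
    'RELAFFIL': [0, 0, 1, 0, 1],
    'UGDS': [100.0, 300.0, 50.0, 1000.0, 200.0],
    'SATMTMID': [500.0, 520.0, 480.0, 550.0, 510.0],
})
agg = (toy.groupby(['STABBR', 'RELAFFIL'])[['UGDS', 'SATMTMID']]
          .agg(['count', 'min', 'max']))

# The index levels inherit the grouping column names;
# the two column levels start out unnamed.
print(list(agg.index.names))
print(list(agg.columns.names))

# rename_axis returns a new frame whose column levels carry the given names.
named = agg.rename_axis(['AGG_COLS', 'AGG_FUNCS'], axis='columns')
print(list(named.columns.names))
```

Because rename_axis returns a new object, the result has to be assigned back (as the recipe does with cg) for the names to persist.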
  3. Now that each axis level has a name, reshaping is a breeze. Use the stack method to move the AGG_FUNCS column level to an index level:
>>> cg.stack('AGG_FUNCS').head()
  4. By default, stacking places the new index level in the innermost position. Use the swaplevel method to switch the placement of the levels:
>>> cg.stack('AGG_FUNCS').swaplevel('AGG_FUNCS', 'STABBR',
...                                 axis='index').head()
  5. We can continue to make use of the axis level names by sorting levels with the sort_index method:
>>> (cg.stack('AGG_FUNCS')
...    .swaplevel('AGG_FUNCS', 'STABBR', axis='index')
...    .sort_index(level='RELAFFIL', axis='index')
...    .sort_index(level='AGG_COLS', axis='columns').head(6))
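To see how the level names move through stack, swaplevel, and sort_index, the chain can be broken into separate steps. This is a sketch on a made-up frame (invented values, hypothetical variable names), not the college data itself:

```python
import pandas as pd

# Made-up stand-in for the college data; values are illustrative only.
toy = pd.DataFrame({
    'STABBR': ['AK', 'AK', 'AK', 'AL', 'AL'],
    'RELAFFIL': [0, 0, 1, 0, 1],
    'UGDS': [100.0, 300.0, 50.0, 1000.0, 200.0],
    'SATMTMID': [500.0, 520.0, 480.0, 550.0, 510.0],
})
named = (toy.groupby(['STABBR', 'RELAFFIL'])[['UGDS', 'SATMTMID']]
            .agg(['count', 'min', 'max'])
            .rename_axis(['AGG_COLS', 'AGG_FUNCS'], axis='columns'))

# stack moves the AGG_FUNCS column level to the innermost index position.
stacked = named.stack('AGG_FUNCS')
print(list(stacked.index.names))

# swaplevel exchanges two index levels, referred to here by name.
swapped = stacked.swaplevel('AGG_FUNCS', 'STABBR', axis='index')
print(list(swapped.index.names))

# sort_index can target a level by name on either axis.
ordered = (swapped.sort_index(level='RELAFFIL', axis='index')
                  .sort_index(level='AGG_COLS', axis='columns'))
print(ordered)
```

Referring to levels by name rather than by integer position keeps each call readable even after the levels have been rearranged.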
  6. To completely reshape your data, you might need to stack some column levels while unstacking some index levels. Chain the two methods together in a single command:
>>> cg.stack('AGG_FUNCS').unstack(['RELAFFIL', 'STABBR'])
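The stack/unstack chain effectively trades levels between the two axes. A small sketch on made-up data (invented values, hypothetical names) shows where each level ends up:

```python
import pandas as pd

# Made-up stand-in for the college data; values are illustrative only.
toy = pd.DataFrame({
    'STABBR': ['AK', 'AK', 'AK', 'AL', 'AL'],
    'RELAFFIL': [0, 0, 1, 0, 1],
    'UGDS': [100.0, 300.0, 50.0, 1000.0, 200.0],
    'SATMTMID': [500.0, 520.0, 480.0, 550.0, 510.0],
})
named = (toy.groupby(['STABBR', 'RELAFFIL'])[['UGDS', 'SATMTMID']]
            .agg(['count', 'min', 'max'])
            .rename_axis(['AGG_COLS', 'AGG_FUNCS'], axis='columns'))

# AGG_FUNCS goes from the columns to the index, while RELAFFIL and STABBR
# go from the index to the columns.
reshaped = named.stack('AGG_FUNCS').unstack(['RELAFFIL', 'STABBR'])

print(reshaped.index.name)
print(list(reshaped.columns.names))
print(reshaped.shape)
```

In this toy frame every state and affiliation combination is present, so no empty columns appear; with real data, combinations that never occur together would show up as columns of missing values.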
  7. Stack all the columns at once to return a Series:
>>> cg.stack(['AGG_FUNCS', 'AGG_COLS']).head(12)
STABBR  RELAFFIL  AGG_FUNCS  AGG_COLS
AK      0         count      UGDS            7.0
                             SATMTMID        0.0
                  min        UGDS          109.0
                  max        UGDS        12865.0
        1         count      UGDS            3.0
                             SATMTMID        1.0
                  min        UGDS           27.0
                             SATMTMID      503.0
                  max        UGDS          275.0
                             SATMTMID      503.0
AL      0         count      UGDS           71.0
                             SATMTMID       13.0
dtype: float64
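In the output above, AK with RELAFFIL 0 has a SATMTMID count of 0 and no min or max row for SATMTMID. One plausible reading is that those aggregates were missing values and were dropped during stacking. Whether stack drops them by default depends on the pandas version, so the sketch below (made-up data, hypothetical names) calls dropna explicitly to get the same effect:

```python
import pandas as pd

nan = float('nan')

# Made-up stand-in; the AK / 0 group has no SAT scores at all.
toy = pd.DataFrame({
    'STABBR': ['AK', 'AK', 'AK', 'AL'],
    'RELAFFIL': [0, 0, 1, 0],
    'UGDS': [100.0, 300.0, 50.0, 1000.0],
    'SATMTMID': [nan, nan, 480.0, 550.0],
})
named = (toy.groupby(['STABBR', 'RELAFFIL'])[['UGDS', 'SATMTMID']]
            .agg(['count', 'min', 'max'])
            .rename_axis(['AGG_COLS', 'AGG_FUNCS'], axis='columns'))

# Stacking every column level leaves no columns, so the result is a Series.
# dropna is called explicitly so the result does not depend on the
# default missing-value handling of the installed pandas version.
s = named.stack(['AGG_FUNCS', 'AGG_COLS']).dropna()

print(type(s).__name__)
print(list(s.index.names))
print(s)
```

The count for the all-missing group survives as 0, while its min and max entries are absent, which mirrors the pattern in the recipe's output.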