How to do it...

  1. Read in the meetup dataset, convert the join_date column into a Timestamp, place it in the index, and output the first five rows:
>>> import pandas as pd
>>> meetup = pd.read_csv('data/meetup_groups.csv',
...                      parse_dates=['join_date'],
...                      index_col='join_date')
>>> meetup.head()
  2. Let's get the number of people who joined each group each week:
>>> group_count = meetup.groupby([pd.Grouper(freq='W'), 'group']).size()
>>> group_count.head()
join_date   group
2010-11-07  houstonr     5
2010-11-14  houstonr    11
2010-11-21  houstonr     2
2010-12-05  houstonr     1
2011-01-16  houstonr     2
dtype: int64
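To see how the weekly bins are formed, here is a minimal sketch on made-up data (the `toy` frame and its group names `a` and `b` are hypothetical, not part of the meetup dataset). It assumes the default behavior of `freq='W'`, which closes each week on Sunday and labels the bin with that Sunday's date:

```python
import pandas as pd

# Hypothetical toy data: five sign-ups across two groups, indexed by join date
toy = pd.DataFrame(
    {'group': ['a', 'a', 'b', 'a', 'b']},
    index=pd.to_datetime(['2021-01-04', '2021-01-06', '2021-01-08',
                          '2021-01-12', '2021-01-17']),
)
toy.index.name = 'join_date'

# freq='W' bins by week ending on Sunday; each label is that Sunday's date
weekly = toy.groupby([pd.Grouper(freq='W'), 'group']).size()
print(weekly)
```

The three dates from Monday 2021-01-04 to Friday 2021-01-08 fall in the bin labeled 2021-01-10, while 2021-01-12 and Sunday 2021-01-17 both land in the bin labeled 2021-01-17.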
  3. Unstack the group level so that each meetup group has its own column of data:
>>> gc2 = group_count.unstack('group', fill_value=0)
>>> gc2.tail()
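The effect of `fill_value=0` is easiest to see on a tiny example. The sketch below uses a hypothetical two-level Series (`counts`) in which group `b` has no entry for the second week; without `fill_value`, that cell would be a missing value:

```python
import pandas as pd

# Hypothetical weekly counts with a two-level index (week, group)
idx = pd.MultiIndex.from_tuples(
    [('2021-01-10', 'a'), ('2021-01-10', 'b'), ('2021-01-17', 'a')],
    names=['join_date', 'group'],
)
counts = pd.Series([2, 1, 1], index=idx)

# Group 'b' has no row for the second week, so that cell is filled with 0
wide = counts.unstack('group', fill_value=0)
print(wide)
```

Filling with zero is the natural choice here, since a week with no recorded sign-ups means zero new members.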
  4. Each value is the number of members who joined a group in that particular week. Let's take the cumulative sum of each column to get the running total of members in each group:
>>> group_total = gc2.cumsum()
>>> group_total.tail()
  5. Many stacked area charts use the percentage of the total so that each row always adds up to 100 percent. Let's divide each row by the row total to find this percentage:
>>> row_total = group_total.sum(axis='columns')
>>> group_cum_pct = group_total.div(row_total, axis='index')
>>> group_cum_pct.tail()
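The cumulative sum and the row-wise division can be checked by hand on a small frame. This is a sketch with hypothetical numbers (the `weekly` frame below is invented for illustration):

```python
import pandas as pd

# Hypothetical weekly join counts for two groups
weekly = pd.DataFrame({'a': [1, 3, 0], 'b': [1, 0, 5]})

running = weekly.cumsum()                      # members to date, per group
row_total = running.sum(axis='columns')        # all members to date
share = running.div(row_total, axis='index')   # align the totals on the row index

print(share)
```

Here the running totals are 1, 4, 4 for `a` and 1, 1, 6 for `b`, so the row totals are 2, 5, 10 and every row of `share` sums to 1. Passing `axis='index'` matters: it tells `div` to match `row_total` against the row labels rather than the column labels.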
  6. We can now create our stacked area plot, which stacks the columns cumulatively, one on top of the other:
>>> ax = group_cum_pct.plot(kind='area', figsize=(18,4),
...                         cmap='Greys', xlim=('2013-6', None),
...                         ylim=(0, 1), legend=False)
>>> ax.figure.suptitle('Houston Meetup Groups', size=25)
>>> ax.set_xlabel('')
>>> ax.yaxis.tick_right()

>>> plot_kwargs = dict(xycoords='axes fraction', size=15)
>>> ax.annotate('R Users', xy=(.1, .7),
...             color='w', **plot_kwargs)
>>> ax.annotate('Data Visualization', xy=(.25, .16),
...             color='k', **plot_kwargs)
>>> ax.annotate('Energy Data Science', xy=(.5, .55),
...             color='k', **plot_kwargs)
>>> ax.annotate('Data Science', xy=(.83, .07),
...             color='k', **plot_kwargs)
>>> ax.annotate('Machine Learning', xy=(.86, .78),
...             color='w', **plot_kwargs)
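The plotting step can be tried without the meetup data. The following is a minimal sketch on a hypothetical frame of shares (`share`, with invented columns `a` and `b`); it assumes matplotlib is installed and selects the non-interactive `Agg` backend so that it runs without a display:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical cumulative shares for two groups; each row sums to 1
share = pd.DataFrame({'a': [0.5, 0.8, 0.4], 'b': [0.5, 0.2, 0.6]})

# kind='area' stacks the columns, so 'b' is drawn on top of 'a'
ax = share.plot(kind='area', ylim=(0, 1), legend=False)

# With xycoords='axes fraction', (0, 0) is the bottom-left corner of the axes
# and (1, 1) is the top-right corner, regardless of the data limits
label = ax.annotate('a', xy=(.5, .25), xycoords='axes fraction', size=15)
```

Because the labels are positioned in axes fractions rather than data coordinates, they stay in place if the x-axis limits change, but they have to be placed by eye over the band they describe.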