How to do it...

  1. Read in the Denver crime HDF5 file, place the REPORTED_DATE column in the index, and sort it:
>>> crime_sort = pd.read_hdf('data/crime.h5', 'crime') \
...                .set_index('REPORTED_DATE') \
...                .sort_index()
  2. The DatetimeIndex itself has many of the same attributes and methods as a pandas Timestamp. Let's take a look at some that they have in common:
>>> common_attrs = set(dir(crime_sort.index)) & \
...                set(dir(pd.Timestamp))
>>> print([attr for attr in common_attrs if attr[0] != '_'])

['to_pydatetime', 'normalize', 'day', 'dayofyear', 'freq', 'ceil',
'microsecond', 'tzinfo', 'weekday_name', 'min', 'quarter', 'month',
'tz_convert', 'tz_localize', 'is_month_start', 'nanosecond', 'tz',
'to_datetime', 'dayofweek', 'year', 'date', 'resolution', 'is_quarter_end',
'weekofyear', 'is_quarter_start', 'max', 'is_year_end', 'week', 'round',
'strftime', 'offset', 'second', 'is_leap_year', 'is_year_start',
'is_month_end', 'to_period', 'minute', 'weekday', 'hour', 'freqstr',
'floor', 'time', 'to_julian_date', 'days_in_month', 'daysinmonth']
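To make the overlap concrete, here is a minimal sketch that reads the same attribute from a single Timestamp and from a DatetimeIndex. The dates are made up for illustration and are not taken from the crime dataset.

```python
import pandas as pd

# Hypothetical values, chosen only to illustrate the shared attributes
ts = pd.Timestamp('2017-03-31 14:25')
idx = pd.DatetimeIndex(['2017-03-31 14:25', '2018-01-01 00:00'])

# On a Timestamp, the attribute is a single scalar value
ts_year = ts.year
ts_quarter_end = ts.is_quarter_end

# On a DatetimeIndex, the same attribute name yields one value per element
idx_years = list(idx.year)
idx_quarter_ends = list(idx.is_quarter_end)
```

The attribute names are identical; only the shape of the result differs, which is what lets the later steps work directly on the index.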
  3. We can then use the index to find weekday names, similar to what was done in step 2 of the preceding recipe:
>>> crime_sort.index.weekday_name.value_counts()
Monday       70024
Friday       69621
Wednesday    69538
Thursday     69287
Tuesday      68394
Saturday     58834
Sunday       55213
Name: REPORTED_DATE, dtype: int64
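In newer pandas releases the weekday_name attribute has been removed in favor of the day_name() method. The following sketch shows the equivalent call on a small made-up index; the dates are illustrative assumptions, not rows from the Denver data.

```python
import pandas as pd

# Hypothetical dates for illustration only
idx = pd.DatetimeIndex(['2017-01-02', '2017-01-03', '2017-01-09'],
                       name='REPORTED_DATE')

# day_name() is the method-based counterpart of the older weekday_name attribute
names = idx.day_name()
counts = names.value_counts()

# A single Timestamp exposes the same method
first_day = pd.Timestamp('2017-01-02').day_name()
```

If the recipe is run on a recent pandas version, replacing weekday_name with day_name() should produce the same weekday labels.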
  4. Somewhat surprisingly, the groupby method can accept a function as an argument. This function is implicitly passed the index, and its return value is used to form groups. Let's see this in action by grouping with a function that turns the index into a weekday name and then sums the number of crimes and traffic accidents separately:
>>> crime_sort.groupby(lambda x: x.weekday_name) \
...           [['IS_CRIME', 'IS_TRAFFIC']].sum()
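The idea can be tried on a tiny frame where the sums are easy to check by hand. This is only a sketch: the frame below is a made-up stand-in for crime_sort, and day_name() is used in place of weekday_name on the assumption that a recent pandas version is installed.

```python
import pandas as pd

# Made-up stand-in for crime_sort; the values are illustrative only
df = pd.DataFrame(
    {'IS_CRIME': [1, 0, 1, 1], 'IS_TRAFFIC': [0, 1, 0, 1]},
    index=pd.DatetimeIndex(['2017-01-02 08:00', '2017-01-02 17:30',
                            '2017-01-03 09:15', '2017-01-09 23:45'],
                           name='REPORTED_DATE'))

# Grouping with a function: its return value becomes the group label
by_func = df.groupby(lambda x: x.day_name())[['IS_CRIME', 'IS_TRAFFIC']].sum()

# Equivalent grouping with labels computed from the index up front
by_labels = df.groupby(df.index.day_name())[['IS_CRIME', 'IS_TRAFFIC']].sum()
```

Both forms give the same totals here. Passing a function simply saves the step of building the labels first.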
  5. You can use a list of functions to group by both the hour of day and year, and then reshape the table to make it more readable:
>>> funcs = [lambda x: x.round('2h').hour, lambda x: x.year]
>>> cr_group = crime_sort.groupby(funcs) \
...                      [['IS_CRIME', 'IS_TRAFFIC']].sum()
>>> cr_final = cr_group.unstack()
>>> cr_final.style.highlight_max(color='lightgrey')
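The grouping and reshaping above can be sketched on a few made-up rows to show what the two functions produce. The data is hypothetical, and fill_value=0 is an addition for this small example only, since not every hour and year combination occurs in it.

```python
import pandas as pd

# Made-up rows for illustration; timestamps avoid exact rounding ties
df = pd.DataFrame(
    {'IS_CRIME': [1, 1, 0, 1], 'IS_TRAFFIC': [0, 0, 1, 0]},
    index=pd.DatetimeIndex(['2016-03-01 08:40', '2016-03-01 09:10',
                            '2017-06-15 08:59', '2017-06-15 14:00'],
                           name='REPORTED_DATE'))

# First function: hour after rounding to the nearest 2 hours
# Second function: calendar year
funcs = [lambda x: x.round('2h').hour, lambda x: x.year]

# The result has a two-level index: (rounded hour, year)
grouped = df.groupby(funcs)[['IS_CRIME', 'IS_TRAFFIC']].sum()

# Move the year level into the columns; fill missing combinations with 0
wide = grouped.unstack(fill_value=0)
```

After unstack, each row is a two-hour bucket and each column pairs a measure with a year, which is the layout that the style highlighting in the last step operates on.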