Refactoring timezones

Next, we want to convert the timestamps to our local timezone:

  1. We can convert timezones by using the function given here:
import datetime
import pytz

def refactor_timezone(x):
    # Convert a timezone-aware datetime to US/Eastern.
    est = pytz.timezone('US/Eastern')
    return x.astimezone(est)

Note that the preceding code converts each timestamp to the US/Eastern timezone. You can choose whatever timezone you like.
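
As a minimal sketch of the same conversion done on a whole column at once, the snippet below uses pandas' `.dt` accessor instead of a per-row function. The two sample timestamps are hypothetical stand-ins for `dfs['date']`, and the sketch assumes the raw values are timezone-naive and recorded in UTC; if your column is already timezone-aware, the `tz_localize` step is not needed.

```python
import pandas as pd

# Hypothetical sample standing in for dfs['date']:
# timezone-naive timestamps that we assume were recorded in UTC.
naive = pd.Series(pd.to_datetime(["2021-01-15 17:30:00", "2021-07-15 17:30:00"]))

# Attach the UTC timezone first; conversion is refused on timezone-naive data.
aware = naive.dt.tz_localize("UTC")

# Vectorized conversion to US/Eastern. The winter date shifts by 5 hours
# and the summer date by 4 hours because of daylight saving time.
eastern = aware.dt.tz_convert("US/Eastern")

print(eastern.dt.hour.tolist())
```

Working on the column directly avoids calling a Python function once per row, which tends to matter on larger datasets.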

  1. Now that our function is created, let's call it:
dfs['date'] = dfs['date'].apply(refactor_timezone)
  1. Next, we want to convert the day of the week variable into the name of the day, as in Saturday, Sunday, and so on. We can do that as shown here (note that weekday_name was removed in pandas 1.0, so we call day_name() instead):
dfs['dayofweek'] = dfs['date'].apply(lambda x: x.day_name())
dfs['dayofweek'] = pd.Categorical(dfs['dayofweek'], categories=[
    'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday',
    'Saturday', 'Sunday'], ordered=True)
  1. Great! Next, we do the same process for the time of the day. See the snippet given here:
dfs['timeofday'] = dfs['date'].apply(lambda x: x.hour + x.minute/60 + x.second/3600)
  1. Next, we refactor the hour, the year integer, and the year fraction, respectively. First, refactor the hour as shown here:
dfs['hour'] = dfs['date'].apply(lambda x: x.hour)
  1. Refactor the year integer as shown here:
dfs['year_int'] = dfs['date'].apply(lambda x: x.year)
  1. Lastly, refactor the year fraction as shown here:
dfs['year'] = dfs['date'].apply(lambda x: x.year + x.dayofyear/365.25)
  1. Having done that, we can set the date as the index. We then no longer need the original date column, so we can remove it:
dfs.index = dfs['date']
del dfs['date']
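
The steps above can also be sketched with vectorized `.dt` accessors rather than `apply`. This is an illustrative variant, not the approach used in the steps themselves: the two-row `dfs` frame is hypothetical sample data, and the column names simply mirror the ones created above.

```python
import pandas as pd

# Hypothetical two-row frame standing in for dfs, with UTC timestamps.
dfs = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-16 06:30:00", "2021-07-18 18:45:00"], utc=True)
})

# Convert to US/Eastern before deriving any calendar fields.
dfs["date"] = dfs["date"].dt.tz_convert("US/Eastern")

# Day name as an ordered categorical so that sorting follows the week.
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
        "Saturday", "Sunday"]
dfs["dayofweek"] = pd.Categorical(dfs["date"].dt.day_name(),
                                  categories=days, ordered=True)

# Time of day as fractional hours, plus the integer hour.
dfs["timeofday"] = (dfs["date"].dt.hour
                    + dfs["date"].dt.minute / 60
                    + dfs["date"].dt.second / 3600)
dfs["hour"] = dfs["date"].dt.hour

# Integer year and approximate fractional year (365.25 days per year).
dfs["year_int"] = dfs["date"].dt.year
dfs["year"] = dfs["date"].dt.year + dfs["date"].dt.dayofyear / 365.25

# Move the date into the index; this also drops it from the columns.
dfs = dfs.set_index("date")

print(dfs)
```

Using `set_index` combines the last two steps, since it moves the column into the index and removes it from the columns in one call.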

Great! Good work so far. We have successfully executed our data transformation steps. If some of the steps were not clear, don't worry—we are going to deal with each of these phases in detail in upcoming chapters.
