Dates and times can be very valuable and interesting
information. For example, a customer’s sign-up date on a credit
card is an indication of the customer’s tenure (how long the
customer has been with the company), cohort (a group of people joining
at the same time), etc. An end date is an indication of customer turnover.
The distance between start and end dates is an indication of total
tenure or retention. Times during the day might be captured in situations
such as employee shift-work; for instance you might log call center
agents in and out, and capture this data. Therefore, capturing and
analyzing dates and times can be very important.
Capturing dates or times
is easy enough. All database or spreadsheet programs have methods
for entering them, although there are complexities too. Actually analyzing
dates or times in a sensible manner is sometimes more difficult. For
example, say I want to estimate the tenure (number of months or days
employed with the company) of employees based on each employee’s
start date, and today is 22 July 2016. How do I do it? If Nelson started
on the 15 February 2010, how do I calculate the tenure that is the
gap between now and that start date?
To solve this problem,
data analysis software uses specific protocols. For example, in the
background, Microsoft Excel stores dates in terms of days
away from 1 January 1900. Therefore it turns dates
into a proper numerical variable. The background number for 1 January
2008 is 39448 because this date is that many days after the 1 January
1900. Similarly, Excel stores times as fractions of days. SAS uses
a similar protocol, just with 1 Jan 1960 as the start date.
This is the protocol
used in practically all analysis programs. It allows the program to
do date-based analysis very easily, as long as it can recognize that
the data represents dates in the first place.