pandas has extensive built-in capabilities for representing dates, times, and intervals of time. Many of the calculations needed for time-series data require a richer and more precise representation of time than Python or NumPy provides.
To address this, pandas provides its own representations of dates, times, time intervals, and periods. These implementations add the capabilities required to model time-series data, such as converting sampled data from one frequency to another and applying different calendars to account for business days and holidays in financial calculations.
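As a brief illustration of these two capabilities, the following sketch changes the frequency of a small series and applies a business-day calendar. The sample data is invented purely for illustration, and the calls shown (resample and pd.offsets.BDay) are only one of several ways to do this:

```python
import pandas as pd

# Ten made-up daily observations starting on Monday, 2014-12-15 (values 0..9)
daily = pd.Series(range(10),
                  index=pd.date_range('2014-12-15', periods=10, freq='D'))

# Change the sampling frequency: sum the daily values into weekly bins
weekly = daily.resample('W').sum()

# Apply a business-day calendar: one business day after a Friday
# skips the weekend and lands on the following Monday
friday = pd.Timestamp('2014-12-19')
next_business_day = friday + pd.offsets.BDay(1)

print(weekly)
print(next_business_day)
```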
We will examine several of the common constructs in both Python and pandas to represent dates, time, and combinations of both, as well as intervals of time. There are many details to each of these, so here, we will focus just on the parts and patterns involved with each that are important for the understanding of the examples in the remainder of the chapter.
The datetime class is part of Python's standard datetime module, not pandas. It represents a fixed point in time at a specific date and time; the related date and time classes from the same module represent a day without a time component and a time without a date component, respectively.
With respect to pandas, datetime objects do not have the precision needed for much of the mathematics involved in extensive calculations on time-series data. However, they are commonly used to initialize pandas objects, with pandas converting them into Timestamp objects behind the scenes. They are therefore worth a brief mention here, as they will be used frequently during initialization.
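To make the behind-the-scenes conversion concrete, here is a minimal sketch that builds a pandas index from plain datetime objects and inspects what comes back. The variable names are arbitrary, and a DatetimeIndex is just one of the pandas objects that performs this conversion:

```python
from datetime import datetime
import pandas as pd

# Plain Python datetime objects used to initialize a pandas index
stamps = [datetime(2014, 12, 15), datetime(2014, 12, 16)]
index = pd.DatetimeIndex(stamps)

# Elements read back from the index are pandas Timestamp objects
first = index[0]
print(type(first).__name__)

# The converted value still compares equal to the original datetime
print(first == stamps[0])
```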
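A datetime object can be initialized using a minimum of three parameters representing year, month, and day (the examples in this section assume that datetime has been imported with from datetime import datetime and that pandas has been imported as pd):
In [2]: # datetime object for Dec 15 2014
        datetime(2014, 12, 15)
Out[2]: datetime.datetime(2014, 12, 15, 0, 0)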
Notice that the result has defaulted two values to 0; these represent the hour and minute. The hour and minute components can also be specified by passing two more values to the constructor. The following creates a datetime object that also specifies 5:30 p.m.:
In [3]: # specific date and also with a time of 5:30 pm
        datetime(2014, 12, 15, 17, 30)
Out[3]: datetime.datetime(2014, 12, 15, 17, 30)
The current date and time can be determined using the datetime.now() method, which retrieves the local date and time:
In [4]: # get the local "now" (date and time)
        # can take a time zone, but that's not demonstrated here
        datetime.now()
Out[4]: datetime.datetime(2015, 3, 6, 11, 7, 51, 216921)
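For completeness, here is a small sketch of the time zone form mentioned in the comment above. It uses timezone.utc from the standard library as the example zone; any other tzinfo object could be passed instead:

```python
from datetime import datetime, timezone

# Without an argument, now() returns a "naive" local datetime (no tzinfo)
local_now = datetime.now()

# Passing a tzinfo object returns an "aware" datetime in that zone
utc_now = datetime.now(timezone.utc)

print(local_now.tzinfo)
print(utc_now.tzinfo)
```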
A datetime.date object represents a specific day (no time). One can be obtained from a datetime object by passing it to datetime.date, which here is the date() method of the datetime class invoked through the class rather than on an instance:
In [5]: # a date without time can be represented
        # by creating a date using a datetime object
        datetime.date(datetime(2014, 12, 15))
Out[5]: datetime.date(2014, 12, 15)
To get the current local date, use the following:
In [6]: # get just the current date
        datetime.now().date()
Out[6]: datetime.date(2015, 3, 6)
A time without a date component is represented by a datetime.time object. One can be obtained from a datetime object in the same way, by passing it to the time() method of the datetime class:
In [7]: # get just a time from a datetime
        datetime.time(datetime(2014, 12, 15, 17, 30))
Out[7]: datetime.time(17, 30)
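The date and time classes can also be constructed directly, without going through a datetime object first. The following sketch imports them by name from the standard datetime module and recombines them with datetime.combine; the variable names are only illustrative:

```python
from datetime import date, datetime, time

# Construct a day and a time of day directly
day = date(2014, 12, 15)
moment = time(17, 30)

# Combine the two parts back into a full datetime
combined = datetime.combine(day, moment)

print(day, moment, combined)
```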
The current local time can be retrieved using the following:
In [8]: # get the current local time
        datetime.now().time()
Out[8]: datetime.time(11, 7, 51, 233760)
Specific dates and times in pandas are represented using the Timestamp class (located at pandas.tslib.Timestamp in older pandas releases and exposed as pandas.Timestamp in current ones). Timestamp is based on the datetime64 dtype and supports nanosecond precision, whereas the Python datetime object stops at microseconds. Timestamp objects are generally interchangeable with datetime objects, so you can typically use them wherever you would use datetime objects.
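The precision difference and the interchangeability can both be checked directly. The following is a minimal sketch; the fractional-second value is arbitrary, and warn=False is passed only to silence the warning pandas raises when nanoseconds are discarded:

```python
from datetime import datetime
import pandas as pd

# A Timestamp keeps nine fractional digits (nanoseconds)
ts = pd.Timestamp('2014-12-15 17:30:00.123456789')
print(ts.microsecond, ts.nanosecond)

# Timestamp is a subclass of datetime, so it can stand in for one
print(isinstance(ts, datetime))

# Converting to a plain datetime drops the nanosecond part
py_dt = ts.to_pydatetime(warn=False)
print(py_dt)
```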
You can create a Timestamp object using pd.Timestamp (the top-level alias for the class) and passing a string representing a date, a time, or a date and time:
In [9]: # a timestamp representing a specific date
        pd.Timestamp('2014-12-15')
Out[9]: Timestamp('2014-12-15 00:00:00')
A time element can also be specified, as shown here:
In [10]: # a timestamp with both date and time
         pd.Timestamp('2014-12-15 17:30')
Out[10]: Timestamp('2014-12-15 17:30:00')
A Timestamp can also be created using just a time, in which case the current local date is assigned by default:
In [11]: # timestamp with just a time
         # which adds in the current local date
         pd.Timestamp('17:30')
Out[11]: Timestamp('2015-03-06 17:30:00')
The following demonstrates how to retrieve the current date and time using Timestamp:
In [12]: # get the current date and time (now)
         pd.Timestamp("now")
Out[12]: Timestamp('2015-03-06 11:07:51.254386')
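Strings are not the only accepted input. As a brief sketch, a Timestamp can also be built from an existing datetime object or from individual keyword fields; the three forms below are expected to describe the same instant:

```python
from datetime import datetime
import pandas as pd

# Three ways to describe 5:30 p.m. on Dec 15 2014
from_string = pd.Timestamp('2014-12-15 17:30')
from_datetime = pd.Timestamp(datetime(2014, 12, 15, 17, 30))
from_fields = pd.Timestamp(year=2014, month=12, day=15, hour=17, minute=30)

print(from_string == from_datetime == from_fields)
```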
Normally, as a pandas user, you will not create Timestamp objects directly. Many of the pandas functions that use dates and times allow you to pass in a datetime object or a text representation of a date/time, and the functions perform the conversion internally.
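As an illustration of this internal conversion, the sketch below passes both a string and a datetime object to pd.date_range and pd.to_datetime. These two functions are chosen only as examples; many other pandas functions accept the same kinds of input:

```python
from datetime import datetime
import pandas as pd

# The start of a range given as text and as a datetime object
from_text = pd.date_range('2014-12-15', periods=3, freq='D')
from_dt = pd.date_range(datetime(2014, 12, 15), periods=3, freq='D')

# Both ranges contain the same Timestamp values
print((from_text == from_dt).all())

# pd.to_datetime performs the same conversion on a single value
print(pd.to_datetime('2014-12-15') == pd.to_datetime(datetime(2014, 12, 15)))
```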
A difference between two pandas Timestamp objects is represented by a pandas Timedelta object (pd.Timedelta), the pandas counterpart of Python's datetime.timedelta, which represents an exact difference in time. These objects commonly arise when determining the duration between two dates, or when calculating the date at a specific interval of time from another date and/or time.
To demonstrate, the following uses a pd.Timedelta object to calculate a one-day increase in the time from the specified date:
In [13]: # what is one day from 2014-11-30?
         today = datetime(2014, 11, 30)
         tomorrow = today + pd.Timedelta(days=1)
         tomorrow
Out[13]: datetime.datetime(2014, 12, 1, 0, 0)
The following demonstrates how to calculate how many days there are between two dates:
In [14]: # how many days between these two dates?
         date1 = datetime(2014, 12, 2)
         date2 = datetime(2014, 11, 28)
         date1 - date2
Out[14]: datetime.timedelta(4)
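The same arithmetic can be carried out with Timestamp objects, in which case the result is a pandas Timedelta. The following is a minimal sketch with arbitrarily chosen dates; it also shows two common ways to read the duration back out:

```python
from datetime import timedelta
import pandas as pd

start = pd.Timestamp('2014-11-28')
end = pd.Timestamp('2014-12-02 06:00')

# Subtracting two Timestamps yields a pandas Timedelta
gap = end - start
print(type(gap).__name__)

# Whole days in the gap, and the full duration in seconds
print(gap.days)
print(gap.total_seconds())

# A Timedelta can be added back to a Timestamp
print(start + pd.Timedelta(days=1))
```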