Parsing date-time values on any platform can be tedious, not least because dates and times can be written in so many different formats.
The lubridate package makes it easy to work with dates and times in R. In addition, data.table includes several helpful functions for working with date and time values. We start by installing and loading lubridate:
install.packages("lubridate")
library(lubridate)
We can create a sample table with dates and times as follows:
tdate <- data.frame(
  dt = as.POSIXct("2010-03-18 19:08:10 EDT") + 1000000 * runif(1000, -100, 100),
  value = sample(LETTERS, 1000, replace = TRUE)
)
head(tdate)
str(tdate)
Note that the dt column is already of class POSIXct. In general, however, date-times read from files often arrive as strings. We'll create a separate column, called dts, that holds the dt column cast to a string, as shown:
tdate$dts <- as.character(tdate$dt)
str(tdate)
The lubridate package provides an easy method to translate the character column into dates and times, as in the following:
ymd_hms(tdate$dts[1:3]) # The character values in dts are read in as dates/times
# [1] "2012-02-25 01:15:02 UTC" "2007-03-01 15:18:32 UTC" "2011-09-19 13:34:47 UTC"
class(tdate$dts[1:5])
# [1] "character"
class(ymd_hms(tdate$dts[1:5]))
# [1] "POSIXct" "POSIXt"
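Because tdate is generated randomly, the values above will differ from run to run. As a small reproducible sketch of the same conversion, the block below parses a fixed character vector (the values are hypothetical, chosen to match the sample output) and shows that ymd_hms assumes UTC unless a tz argument is supplied. The "America/New_York" zone is only an illustrative choice.

```r
library(lubridate)

# Hypothetical raw values, as they might arrive from a CSV file
raw <- c("2012-02-25 01:15:02", "2007-03-01 15:18:32", "2011-09-19 13:34:47")

parsed_utc <- ymd_hms(raw)                           # time zone defaults to UTC
parsed_ny  <- ymd_hms(raw, tz = "America/New_York")  # same clock times, New York zone

class(parsed_utc)
tz(parsed_utc)

# The two results are different instants: the same clock time occurs
# later in New York than in UTC
offset_hours <- as.numeric(difftime(parsed_ny[1], parsed_utc[1], units = "hours"))
offset_hours
```

If the strings were recorded in a local time zone, passing tz at parse time is usually simpler than correcting the values afterwards.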
In general, lubridate can interpret a string or number as a date or date-time as long as we name the order of its components using the letters y, m, and d (and h, m, and s for times), as shown:
# For example:
ymd(20190420) # Read 20190420 as year 2019, month 04 and day 20
mdy("04202019") # Quoted so that the leading zero is preserved
dym(20201904)
dmy(20042019)
# We can verify whether they are equivalent using identical
identical(ymd(20190420), mdy("04202019"))
identical(ymd(20190420), dym(20201904))
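The same functions also accept strings with separators, which is the more common case when reading files. The following is a minimal sketch with made-up input strings; it also shows that an impossible date is returned as NA rather than raising an error (quiet = TRUE only suppresses the parsing warning).

```r
library(lubridate)

# Three spellings of 20 April 2019, each parsed by the function
# whose name matches the order of the components
d_ymd <- ymd("2019-04-20")
d_mdy <- mdy("04/20/2019")
d_dmy <- dmy("20.04.2019")

same_date <- identical(d_ymd, d_mdy) && identical(d_ymd, d_dmy)
same_date

# 30 February does not exist, so the result is NA
bad <- ymd("2019-02-30", quiet = TRUE)
is.na(bad)
```

Checking for NA after parsing is a simple way to catch rows whose format does not match the function that was chosen.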
Once the data type is interpreted as date and time, we can then use other operations, such as extracting only parts of the date and time values, as shown:
tdate$dts[1:3]
# [1] "2012-02-25 01:15:02" "2007-03-01 15:18:32" "2011-09-19 13:34:47"
ymd_hms(tdate$dts[1:3])
# [1] "2012-02-25 01:15:02 UTC" "2007-03-01 15:18:32 UTC" "2011-09-19 13:34:47 UTC"
year(ymd_hms(tdate$dts[1:3]))
# [1] 2012 2007 2011
month(ymd_hms(tdate$dts[1:3]))
# [1] 2 3 9
day(ymd_hms(tdate$dts[1:3]))
# [1] 25 1 19
yday(ymd_hms(tdate$dts[1:3])) # nth day of the year
# [1] 56 60 262
wday(ymd_hms(tdate$dts[1:3])) # Weekday as a number (1 = Sunday by default)
# [1] 7 5 2
wday(ymd_hms(tdate$dts[1:3]), label = TRUE) # Weekday as a label
# [1] Sat Thu Mon
# lubridate also includes functions to perform simple date/time calculations, such as:
ymd(20190420) + days(10)
# [1] "2019-04-30"
ymd(20190420) + months(2)
# [1] "2019-06-20"
ymd(20190420) + years(2)
# [1] "2021-04-20"
my_birthday <- ymd("20010101")
# The standard difftime object (obtained when subtracting dates/times) can be a bit difficult to interpret
today() - my_birthday # The result depends on the day the code is run
# Time difference of 6299 days
class(today() - my_birthday)
# [1] "difftime"
The lubridate package includes duration helpers named d*, such as dseconds, ddays, and dhours, and the as.duration function converts a difftime value into a duration that is easier to interpret, as shown:
as.duration(today() - my_birthday)
# [1] "544233600s (~17.25 years)"
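The d* helpers themselves are not exercised above, so here is a short sketch of how they might be used. It assumes a fixed end date of 2018-04-01 (an illustrative value standing in for today(), so that the result does not change from day to day).

```r
library(lubridate)

# Durations are exact lengths of time, stored in seconds
one_day  <- ddays(1)
two_h_30 <- dhours(2) + dseconds(30)
one_day
two_h_30

start <- ymd("2001-01-01")
end   <- ymd("2018-04-01")  # assumed fixed date, used instead of today()

# Convert the difftime into a duration, then express it in days
elapsed <- as.duration(end - start)
elapsed
n_days <- as.numeric(elapsed, "days")
n_days
```

Working in durations keeps the arithmetic in seconds, which avoids ambiguity about the units a difftime happens to be printed in.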
Note that we can also use data.table in order to work on date and time values. Common date and time functions in data.table include the following:
IDateTime(x, ...)
second(x)
minute(x)
hour(x)
yday(x)
wday(x)
mday(x)
week(x)
isoweek(x)
month(x)
quarter(x)
year(x)
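As a brief sketch of how these functions fit together, the block below splits two example timestamps (hypothetical values, fixed to UTC) into integer date and time columns with IDateTime, then extracts individual components. The data.table:: prefix is used because lubridate exports functions with the same names, and the one loaded last would otherwise mask the other.

```r
library(data.table)

# Hypothetical timestamps, pinned to UTC so results do not depend on the session
stamps <- as.POSIXct(c("2012-02-25 01:15:02", "2007-03-01 15:18:32"), tz = "UTC")

# IDateTime returns a data.table with an idate and an itime column
parts <- IDateTime(stamps)
parts

yrs  <- data.table::year(parts$idate)
mths <- data.table::month(parts$idate)
ydys <- data.table::yday(parts$idate)   # day of the year
hrs  <- data.table::hour(parts$itime)
yrs; mths; ydys; hrs
```

Storing dates and times as separate integer columns can make grouping by day or by hour convenient in data.table queries.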