Well, it looks like the number of flights fluctuates a lot by day of the week, which is indeed a dominant characteristic of human-related activities. Let's verify that by decomposing this time-series into seasonal, trend, and random components with moving averages, which lets us identify and remove the weekly seasonality.
Although this can be done manually by utilizing the diff and lag functions, there's a much more straightforward way to do so with the decompose function from the stats package:
> plot(decompose(ts(daily$N, frequency = 7)))
Removing the weekly seasonal spikes reveals the overall trend of the number of flights in 2011. As the x axis shows the number of weeks since January 1 (based on the frequency being 7), the peak interval between 25 and 35 refers to the summertime, and the lowest number of flights happened in the 46th week, probably due to Thanksgiving Day.
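To make the moving-average idea concrete, here is a minimal sketch of an additive decomposition done by hand. It runs on a synthetic daily series (the names n, weekly_effect, and flights are made up for this illustration and are not the daily dataset used above), and it assumes the default additive model with an odd frequency, where the trend is a plain centered 7-day moving average:

```r
# Synthetic stand-in for a year of daily counts (hypothetical data,
# not the flights dataset): slow trend + weekly pattern + noise
set.seed(42)
n <- 365
weekly_effect <- c(-100, -10, 35, -15, -10, 50, 50)
flights <- ts(600 + 50 * sin(seq(0, pi, length.out = n)) +
              rep(weekly_effect, length.out = n) + rnorm(n, sd = 10),
              frequency = 7)

# 1. Trend: centered moving average over one full period (7 days)
trend <- stats::filter(flights, rep(1 / 7, 7), sides = 2)

# 2. Seasonal figure: mean detrended value per position in the week,
#    re-centered so that the seven effects sum to zero
detrended <- as.numeric(flights - trend)
position <- as.integer(cycle(flights))
manual_figure <- tapply(detrended, position, mean, na.rm = TRUE)
manual_figure <- as.numeric(manual_figure - mean(manual_figure))

# 3. Random component: what is left after removing trend and seasonality
random <- as.numeric(flights) - as.numeric(trend) - manual_figure[position]

# The hand-made seasonal figure should agree with decompose()
all.equal(manual_figure, decompose(flights)$figure)
```

The moving average cannot be computed for the first and last three days, so the trend and random components start and end with missing values, just as they do in the output of decompose.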
But the weekly seasonality is probably more interesting. Well, it's pretty hard to spot anything on the preceding plot as the very same 7-day repetition can be seen 52 times on the seasonal plot. So, instead, let's extract that data and show it in a table with the appropriate headers:
> setNames(decompose(ts(daily$N, frequency = 7))$figure,
+          weekdays(daily$date[1:7]))
   Saturday      Sunday      Monday     Tuesday   Wednesday 
-102.171776   -8.051328   36.595731  -14.928941   -9.483886 
   Thursday      Friday 
  48.335226   49.704974 
So the seasonal effects (the preceding numbers representing the relative distance from the average) suggest that the greatest number of flights happened on Monday and on the last two weekdays (Thursday and Friday), while there is only a relatively small number of flights on Saturdays.
Unfortunately, we cannot decompose the yearly seasonal component of this time-series, as we have data only for one year, and we need data for at least two time periods for the given frequency:
> decompose(ts(daily$N, frequency = 365))
Error in decompose(ts(daily$N, frequency = 365)) : 
  time series has no or less than 2 periods
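The error comes from a simple length check: decompose needs at least two complete periods of non-missing observations. A small guard, sketched here with a hypothetical helper name (can_decompose) and synthetic data rather than the daily dataset, makes that condition explicit before calling the function:

```r
# Hypothetical helper: TRUE if the series covers at least two full periods
can_decompose <- function(x) {
  f <- frequency(x)
  f > 1 && length(na.omit(x)) >= 2 * f
}

set.seed(1)
one_year  <- ts(rnorm(365), frequency = 365)      # a single yearly period
two_years <- ts(rnorm(2 * 365), frequency = 365)  # two yearly periods

can_decompose(one_year)
can_decompose(two_years)

# tryCatch turns the error into a value instead of stopping the script
result <- tryCatch(decompose(one_year),
                   error = function(e) conditionMessage(e))
result
```

With a second year of daily data, the same call with frequency = 365 would go through.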
For more advanced seasonal decomposition, see the stl function of the stats package, which fits local polynomial regression (loess) models to the time-series data. The next section will cover some of this background.
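As a rough illustration of what an stl call might look like, the following sketch decomposes a synthetic daily series with a weekly pattern. The data and the names weekly_effect and flights are assumptions made for this example; only stl and its s.window argument come from the stats package:

```r
# Synthetic daily series with a weekly pattern (hypothetical data)
set.seed(42)
n <- 365
weekly_effect <- c(-100, -10, 35, -15, -10, 50, 50)
flights <- ts(600 + rep(weekly_effect, length.out = n) + rnorm(n, sd = 10),
              frequency = 7)

# s.window = "periodic" forces the same weekly pattern in every week
fit <- stl(flights, s.window = "periodic")

# The components are stored as columns of a multivariate time-series
components <- fit$time.series
colnames(components)

# One week of seasonal effects, comparable to decompose()$figure
stl_figure <- as.numeric(components[1:7, "seasonal"])
round(stl_figure, 1)
```

Passing an odd number to s.window instead of "periodic" lets the seasonal pattern change slowly over time, which decompose cannot do.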