Chapter 10. Time-series Data

A time series is a sequence of measurements of one or more variables taken over a period of time at a specific interval. Once a time series is captured, it is typically analyzed to identify patterns in the data, in essence determining what happens as time goes by. The ability to process time-series data is essential in the modern world, whether to analyze financial information or to monitor exercise on a wearable device and match that activity to goals and diet.

pandas provides extensive support for working with time-series data. Such work frequently requires you to perform a number of tasks, such as the following:

  • Converting string-based dates and times into date and time objects
  • Standardizing date and time values to specific time zones
  • Generating sequences of fixed-frequency dates and time intervals
  • Efficiently reading/writing the value at a specific time in a series
  • Converting an existing time series to another with a new frequency of sampling
  • Computing relative dates, not only taking into account time zones, but also dealing with specific calendars based upon business days
  • Identifying missing samples in a time series and determining appropriate substitute values
  • Shifting dates and time forward or backward by a given amount
  • Calculating aggregate summaries of values as time changes
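
As a brief preview, several of these tasks can be sketched in a few lines. This is a minimal illustration with made-up values, not one of the chapter's numbered examples; the variable names and the choice of forward-filling as the substitute strategy are assumptions made here for demonstration.

```python
import pandas as pd

# Convert string-based dates into pandas timestamp objects
stamps = pd.to_datetime(["2014-08-01", "2014-08-02", "2014-08-04"])

# Generate a fixed-frequency (daily) sequence of dates
days = pd.date_range("2014-08-01", periods=4, freq="D")

# Build a series indexed by time and read the value at a specific date
s = pd.Series([10.0, 11.0, 13.0], index=stamps)
value = s.loc[pd.Timestamp("2014-08-02")]

# Identify the missing sample by aligning to the full daily range
aligned = s.reindex(days)
missing = aligned[aligned.isna()].index

# Substitute a value by carrying the last observation forward
# (one possible strategy, chosen here only for illustration)
filled = aligned.ffill()

# Standardize the index to a specific time zone
utc = filled.tz_localize("UTC")
```

Here `missing` holds the single date absent from the original samples (August 3), and `filled` carries the value from August 2 forward into that gap.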

pandas provides capabilities to handle all of these tasks (and more). In this chapter, we will examine each of these scenarios and see how to use pandas to address them. We will start by looking at how pandas represents dates and times differently from Python. Next, we will look at how pandas can create indexes based on dates and times, and then at how it represents differences in time with timedelta and durations of time with Period objects. We will then progress to examining calendars and time zones and how they can be used to facilitate various calculations. The chapter will finish with an examination of operations on time-series data, including shifts, up and down sampling, and moving-window calculations.

Specifically, in this chapter, we will cover:

  • Creating time series with specific frequencies
  • Date offsets
  • Representation of differences in time with timedelta
  • Durations of time with Period objects
  • Calendars
  • Time zones
  • Shifting and lagging
  • Up and down sampling
  • Time series moving-window operations
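
To give a rough feel for how these topics look in code, the following sketch touches several of them on a small series. The data is made up for illustration and the variable names are assumptions; each operation is only hinted at here.

```python
import pandas as pd

# Six daily observations starting on Friday, August 1, 2014
idx = pd.date_range("2014-08-01", periods=6, freq="D")
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)

# Difference between two points in time, represented as a Timedelta
span = idx[-1] - idx[0]

# Date offset that respects business days: Friday plus one
# business day lands on the following Monday
next_bday = idx[0] + pd.offsets.BDay(1)

# A Period represents a span of time rather than a single instant
month = pd.Period("2014-08", freq="M")

# Shifting (lagging) moves values forward one step along the index
lagged = s.shift(1)

# Down sampling: aggregate daily values into two-day means
two_day = s.resample("2D").mean()

# Moving-window calculation: three-day rolling mean
rolling_mean = s.rolling(window=3).mean()
```

Note that `lagged` and `rolling_mean` begin with missing values, since there is no earlier data to shift in or to fill the first windows.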

Setting up the IPython notebook

To utilize the examples in this chapter, we will need to include the following imports and settings:

In [1]:
   # import pandas, numpy and datetime
   import numpy as np
   import pandas as pd

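   # needed for representing dates and times
   # (a preceding "import datetime" was redundant: this import
   # rebinds the name datetime to the class in either case)
   from datetime import datetime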

   # Set some pandas options for controlling output
   pd.set_option('display.notebook_repr_html', False)
   pd.set_option('display.max_columns', 10)
   pd.set_option('display.max_rows', 10)

   # matplotlib and inline graphics
   import matplotlib.pyplot as plt
   %matplotlib inline