Slicing time series data

In this recipe, we will learn how to slice time series data using pandas. Slicing lets you extract the data that falls within a given interval of the series. We will use date strings to select these subsets.

How to do it…

  1. Create a new Python file, and import the following packages:
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    
    from convert_to_timeseries import convert_data_to_timeseries
  2. We will use the same text file that we used in the previous recipe to slice and dice the data:
    # Input file containing data
    input_file = 'data_timeseries.txt'
  3. We will use the third column again (index 2, because column numbering starts at zero):
    # Load data
    column_num = 2
    data_timeseries = convert_data_to_timeseries(input_file, column_num)
  4. Let's assume that we want to extract the data between a given start year and end year. We define them as follows:
    # Plot within a certain year range
    start = '2008'
    end = '2015'
  5. Plot the data within the given year range:
    plt.figure()
    data_timeseries[start:end].plot()
    plt.title('Data from ' + start + ' to ' + end)
  6. We can also slice the data based on a certain range of months:
    # Plot within a certain range of dates
    start = '2007-2'
    end = '2007-11'
  7. Plot the data, as follows:
    plt.figure()
    data_timeseries[start:end].plot()
    plt.title('Data from ' + start + ' to ' + end)
    
    plt.show()
  8. The full code is given in the slicing_data.py file that is provided to you. If you run the code, you will see the following image:
    [Figure: plot titled "Data from 2008 to 2015"]
  9. The next figure displays a shorter time frame, so it looks as if we have zoomed in:
    [Figure: plot titled "Data from 2007-2 to 2007-11"]
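
The slicing used in the steps above can also be tried without the input file or the convert_to_timeseries helper. The following is a minimal sketch, assuming a made-up monthly series built with pd.date_range; the names index, series, by_year, and by_month are hypothetical and are not part of the recipe's code. It illustrates that slicing a date-indexed series with date strings keeps both endpoints, so an end of '2015' includes every entry in 2015:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the recipe's data: one value per month,
# from January 2005 to December 2016 (not the real data_timeseries.txt)
index = pd.date_range(start='2005-01', end='2016-12', freq='MS')
series = pd.Series(np.arange(len(index), dtype=float), index=index)

# Slice by year: both the start year and the end year are included
by_year = series.loc['2008':'2015']

# Slice by month: February 2007 through November 2007, inclusive
by_month = series.loc['2007-02':'2007-11']

print(by_year.index[0], by_year.index[-1])
print(len(by_month))
```

Calling .plot() on by_year or by_month would then draw the same kind of figures as steps 5 and 7.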