Operating on time series data

Now that we know how to slice data and extract various subsets, let's discuss how to operate on time series data. The pandas library lets you combine columns arithmetically and filter rows by condition, and we will do both in this recipe.

How to do it…

  1. Create a new Python file, and import the following packages:
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    
    from convert_to_timeseries import convert_data_to_timeseries
  2. We will use the same text file that we used in the previous recipe:
    # Input file containing data
    input_file = 'data_timeseries.txt'
  3. We will use both the third and fourth columns in this text file (the column index passed to the function is zero-based, so these are 2 and 3):
    # Load data
    data1 = convert_data_to_timeseries(input_file, 2)
    data2 = convert_data_to_timeseries(input_file, 3)
  4. Convert the data into a pandas data frame:
    dataframe = pd.DataFrame({'first': data1, 'second': data2})
  5. Plot the data in the given year range:
    # Plot data
    dataframe['1952':'1955'].plot()
    plt.title('Data overlapped on top of each other')
  6. Let's assume that we want to plot the difference between the two columns that we just loaded in the given year range. We can do this using the following lines:
    # Plot the difference
    plt.figure()
    difference = dataframe['1952':'1955']['first'] - dataframe['1952':'1955']['second']
    difference.plot()
    plt.title('Difference (first - second)')
  7. If we want to filter the data based on different conditions for the first and second columns, we can combine the conditions with the & operator and plot the rows that satisfy both:
    # When 'first' is greater than a certain threshold
    # and 'second' is smaller than a certain threshold
    dataframe[(dataframe['first'] > 60) & (dataframe['second'] < 20)].plot()
    plt.title('first > 60 and second < 20')
    
    plt.show()
  8. The full code is in the operating_on_data.py file that is already provided to you. If you run the code, the first figure will look like the following:
    [Figure: both columns plotted together for 1952 to 1955]
  9. The second output figure denotes the difference, as follows:
    [Figure: difference between the first and second columns for 1952 to 1955]
  10. The third output figure denotes the filtered data, as follows:
    [Figure: rows where first > 60 and second < 20]
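
If you do not have data_timeseries.txt or the convert_to_timeseries helper at hand, the following minimal sketch reproduces the three operations from this recipe on synthetic data. The monthly date range, the random values, and the variable names subset and filtered are assumptions made for illustration; they are not taken from the original data file.

```python
import numpy as np
import pandas as pd

# Synthetic monthly data standing in for the two columns loaded
# from data_timeseries.txt (assumption: the real values differ)
index = pd.date_range(start='1950-01', end='1957-12', freq='MS')
rng = np.random.default_rng(0)
dataframe = pd.DataFrame(
    {
        'first': rng.uniform(0, 100, len(index)),
        'second': rng.uniform(0, 100, len(index)),
    },
    index=index,
)

# Slice once by year range; string labels select whole years
subset = dataframe.loc['1952':'1955']

# Element-wise difference of the two columns, aligned on the date index
difference = subset['first'] - subset['second']

# Keep only the rows where both conditions hold
mask = (dataframe['first'] > 60) & (dataframe['second'] < 20)
filtered = dataframe[mask]

print(subset.shape)
print(difference.head())
print(filtered.head())
```

Slicing once into subset avoids repeating the year range for each column. Calling subset.plot(), difference.plot(), or filtered.plot() followed by plt.show() gives the same kinds of figures as in the steps above.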