Dynamic time warping

Next, I want to introduce another model, which uses a completely different algorithm: dynamic time warping (DTW). It gives you a distance metric that measures how similar two time series are, even when one is shifted or stretched in time relative to the other. The smaller the distance, the more similar the two series:

  1. To get started, we'll need to pip install the fastdtw library:
!pip install fastdtw 
  1. Once that is installed, we'll import the additional libraries we'll need:
from scipy.spatial.distance import euclidean 
from fastdtw import fastdtw 
  1. Next, we'll create the function that will take in two series and return the distance between them:
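def dtw_dist(x, y): 
    # Reshape to column vectors so euclidean() always receives 1-D points;
    # recent SciPy releases may reject scalar inputs (np is NumPy, as used elsewhere in this chapter)
    x, y = np.reshape(x, (-1, 1)), np.reshape(y, (-1, 1)) 
    distance, path = fastdtw(x, y, dist=euclidean) 
    return distance 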
  1. Now, we'll split our 18 years' worth of time series data into distinct five-day periods. Each period yields four daily percentage changes, which we'll pair with the percentage change on the day that follows the period. These pairs will serve as our x and y data, as follows:
tseries = [] 
tlen = 5 
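for i in range(tlen, len(sp), tlen): 
    # four daily percentage changes within the period (the first value is NaN, so drop it)
    pctc = sp['Close'].iloc[i-tlen:i].pct_change().iloc[1:].values * 100 
    # percentage change on the day after the period ends
    res = sp['Close'].iloc[i-tlen:i+1].pct_change().iloc[-1] * 100 
    tseries.append((pctc, res)) 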
  1. We can take a look at our first series to get an idea of what the data looks like:
tseries[0] 

This returns a tuple: an array holding the period's four daily percentage changes, followed by the next day's percentage change.

  1. Now that we have each series, we can run them all through our algorithm to get the distance metric for each series against every other series:
dist_pairs = [] 
for i in range(len(tseries)): 
    for j in range(len(tseries)): 
        dist = dtw_dist(tseries[i][0], tseries[j][0]) 
        dist_pairs.append((i,j,dist,tseries[i][1], tseries[j][1])) 
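
For intuition about what the distance represents: fastdtw computes an approximation of the classic dynamic-programming DTW distance. The following is a minimal sketch of the exact version, not the library's implementation; the function name `dtw_exact` and the toy sequences are hypothetical, and the absolute difference is assumed as the local cost.

```python
def dtw_exact(x, y):
    """Exact DTW distance between two 1-D sequences, using |a - b| as the local cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] holds the cheapest alignment of x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            # extend the best of the three allowed predecessor alignments
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]


a = [0.0, 1.0, 2.0, 1.0, 0.0]
b = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]   # same shape as a, delayed by one step
c = [2.0, 1.0, 0.0, 1.0, 2.0]        # mirrored shape

print(dtw_exact(a, a))   # a series compared with itself
print(dtw_exact(a, b))   # a delayed copy
print(dtw_exact(a, c))   # a differently shaped series
```

Because the warping path may repeat points, the delayed copy `b` aligns with `a` at no cost, while the mirrored series `c` cannot be aligned cheaply. This tolerance to shifts is what separates DTW from a point-by-point comparison.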

Once we have the pairwise distances, we can place them into a DataFrame. We'll drop pairs that have a distance of 0, as they represent the same series. We'll also sort by the series indices, which run in chronological order, and keep only the pairs where the first series comes before the second:

dist_frame = pd.DataFrame(dist_pairs, columns=['A','B','Dist', 'A Ret', 'B Ret']) 
 
sf = dist_frame[dist_frame['Dist']>0].sort_values(['A','B']).reset_index(drop=True) 
 
sfe = sf[sf['A']<sf['B']] 
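
Since only pairs with A before B are kept, the same rows could also be produced by computing just those pairs, which roughly halves the number of distance calls. The following is a sketch under stated assumptions: `toy_series` and `abs_dist` are hypothetical stand-ins for `tseries` and `dtw_dist`, used so that the example runs on its own.

```python
from itertools import combinations


def abs_dist(x, y):
    """Stand-in distance for this sketch: sum of element-wise absolute differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


# Toy (pattern, next-day return) tuples, shaped like the entries of tseries
toy_series = [
    ([0.1, -0.2, 0.3, 0.0], 0.5),
    ([0.1, -0.2, 0.3, 0.0], -0.1),   # identical pattern, so the distance is 0
    ([0.2, -0.1, 0.3, 0.1], 0.4),
]

pairs = []
# combinations() yields each (i, j) with i < j exactly once
for (i, (pat_i, ret_i)), (j, (pat_j, ret_j)) in combinations(enumerate(toy_series), 2):
    dist = abs_dist(pat_i, pat_j)
    if dist > 0:   # same zero-distance filter as in the text
        pairs.append((i, j, dist, ret_i, ret_j))

for row in pairs:
    print(row)
```

The resulting tuples have the same layout as the rows of `dist_pairs`, so they could be passed to `pd.DataFrame` with the same column names.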

And finally, we'll limit our trades to the pairs where the distance is at most 1 and the first series has a positive return:

winf = sfe[(sfe['Dist']<=1)&(sfe['A Ret']>0)] 
 
winf 

This displays the remaining pairs, one row per pair, with the columns A, B, Dist, A Ret, and B Ret.

Let's see what one of our top patterns (A:6 and B:598) looks like when plotted:

plt.plot(np.arange(4), tseries[6][0]); 

The preceding code plots the four daily percentage changes of series 6.

Now, we'll plot the second one:

plt.plot(np.arange(4), tseries[598][0]) 

The preceding code plots the four daily percentage changes of series 598.

As you can see, the curves are nearly identical, which is exactly what we want. We're going to try to find all curves that have positive next-day gains and then, once we have a curve that is highly similar to one of these profitable curves, we'll buy it in anticipation of another gain.
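
One possible reading of this rule can be sketched in a few lines. Everything here is illustrative rather than the chapter's own implementation: `profitable_matches`, `history`, and `new_pattern` are hypothetical names, the data is made up, and an exact DTW helper with absolute-difference cost stands in for `dtw_dist` so that the example runs without fastdtw.

```python
def dtw_exact(x, y):
    """Exact DTW distance with |a - b| as the local cost (stand-in for dtw_dist)."""
    inf = float("inf")
    cost = [[inf] * (len(y) + 1) for _ in range(len(x) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            cost[i][j] = abs(x[i - 1] - y[j - 1]) + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[-1][-1]


def profitable_matches(new_pattern, history, max_dist=1.0):
    """Return (index, distance, next-day return) for each past pattern that is
    within max_dist of new_pattern and was followed by a gain."""
    matches = []
    for idx, (pattern, next_ret) in enumerate(history):
        dist = dtw_exact(new_pattern, pattern)
        if dist <= max_dist and next_ret > 0:
            matches.append((idx, dist, next_ret))
    return matches


# Made-up (pattern, next-day return) tuples, shaped like the entries of tseries
history = [
    ([0.5, -0.2, 0.3, 0.1], 0.8),    # similar shape, followed by a gain
    ([-1.0, 0.9, -1.2, 0.7], 0.6),   # followed by a gain, but a different shape
    ([0.5, -0.2, 0.3, 0.2], -0.4),   # similar shape, followed by a loss
]
new_pattern = [0.4, -0.2, 0.3, 0.1]

matches = profitable_matches(new_pattern, history)
print("buy" if matches else "no trade", matches)
```

Here the threshold of 1 mirrors the filter used for `winf`. Whether a single match should be enough to trigger a trade is a design choice that this sketch leaves open.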
