Time for action – trading correlated pairs

For this tutorial, we will use two sample data sets containing the bare minimum of end-of-day price data. The first company is BHP Billiton (BHP), which is active in petroleum production and the mining of metals and diamonds. The second is Vale (VALE), which is also a metals and mining company. The two businesses therefore overlap, although not completely. To trade correlated pairs, follow these steps:

  1. First, load the data, specifically the close price of the two securities, from the CSV files in the example code directory of this chapter and calculate the returns. If you don't remember how to do it, there are plenty of examples in the previous chapter.
  2. Covariance tells us how two variables vary together; it is nothing more than unnormalized correlation. Compute the covariance matrix from the returns with the cov function (it's not strictly necessary to do this, but it will allow us to demonstrate a few matrix operations):
    covariance = np.cov(bhp_returns, vale_returns) 
    print "Covariance", covariance

    The covariance matrix is as follows:

    Covariance [[ 0.00028179  0.00019766]
               [ 0.00019766  0.00030123]]
    
  3. View the values on the diagonal with the diagonal function:
    print "Covariance diagonal", covariance.diagonal()

    The diagonal values of the covariance matrix are as follows:

    Covariance diagonal [ 0.00028179  0.00030123]
    

    Note

    Notice that the values on the diagonal are not equal to each other. This is different from the correlation matrix, where every diagonal value is 1.

  4. Compute the trace, the sum of the diagonal values, with the trace function:
    print "Covariance trace", covariance.trace()

    The trace of the covariance matrix is as follows:

    Covariance trace 0.00058302354992
    
  5. The correlation of two vectors is defined as the covariance, divided by the product of the respective standard deviations of the vectors. The equation for vectors a and b is:
    $$\mathrm{corr}(a, b) = \frac{\mathrm{cov}(a, b)}{\sigma_a \sigma_b}$$

    Try it out:

    print covariance/ (bhp_returns.std() * vale_returns.std())

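    The result is as follows. Only the off-diagonal entries should be read as a correlation, because every entry is divided by the same product of the two standard deviations. The values are also slightly too large, since cov normalizes by N - 1 by default, whereas std normalizes by N: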

    [[ 1.00173366  0.70264666]
    [ 0.70264666  1.0708476 ]]
    
  6. We will measure the correlation of our pair with the correlation coefficient. The correlation coefficient takes values between -1 and 1. The correlation of a set of values with itself is 1 by definition. This would be the ideal value; however, we will also be happy with a slightly lower value. Calculate the correlation coefficient (or, more accurately, the correlation matrix) with the corrcoef function:
    print "Correlation coefficient", np.corrcoef(bhp_returns, vale_returns)

    The coefficients are as follows:

    [[ 1.          0.67841747]
    [ 0.67841747  1.        ]]
    

    The values on the diagonal are just the correlations of BHP and VALE with themselves and are, therefore, equal to 1. In all probability, no real calculation takes place. The other two values are equal to each other since correlation is symmetrical, meaning that the correlation of BHP with VALE is equal to the correlation of VALE with BHP. At roughly 0.68, the correlation is moderate rather than strong.

  7. Another important point is whether the two stocks under consideration are in sync or not. Two stocks are considered out of sync if the difference between their prices is more than two standard deviations away from the mean of the differences.

    If they are out of sync, we could initiate a trade, hoping that they eventually will get back in sync again. Compute the difference between the close prices of the two securities to check the synchronization:

    difference = bhp - vale

    Check whether the last difference in price is out of sync; see the following code:

    avg = np.mean(difference)
    dev = np.std(difference)
    print "Out of sync", np.abs(difference[-1] - avg) > 2 * dev

    Unfortunately, we cannot trade yet:

    Out of sync False
    
  8. Plotting requires Matplotlib; this will be discussed in Chapter 9, Plotting with Matplotlib. Plotting can be done as follows:
    t = np.arange(len(bhp_returns))
    plot(t, bhp_returns, lw=1)
    plot(t, vale_returns, lw=2)
    show()

    The resulting plot:

    [Figure: line plot of the BHP and VALE returns]
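
The out-of-sync check from step 7 can be wrapped in a small reusable helper. What follows is only a minimal sketch, assuming Python 3 and made-up price arrays instead of the BHP.csv and VALE.csv data; the function name is_out_of_sync and its n_dev parameter are hypothetical and do not appear in the example code of this chapter.

```python
import numpy as np

def is_out_of_sync(prices_a, prices_b, n_dev=2.0):
    """Hypothetical helper: True if the latest price difference lies more
    than n_dev standard deviations from the mean of all differences."""
    difference = np.asarray(prices_a, dtype=float) - np.asarray(prices_b, dtype=float)
    avg = np.mean(difference)
    dev = np.std(difference)  # population standard deviation, as in step 7
    return bool(np.abs(difference[-1] - avg) > n_dev * dev)

# Made-up prices for illustration only (not the BHP/VALE data)
base = np.array([9.0, 9.1, 8.9, 9.0, 9.2, 9.1, 8.8, 9.0, 9.1, 9.0])
calm_gap = np.array([1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.2, 0.8, 1.1, 1.0])
wide_gap = calm_gap.copy()
wide_gap[-1] = 3.0  # the last difference jumps far away from the others

print("Calm pair out of sync:", is_out_of_sync(base + calm_gap, base))
print("Wide pair out of sync:", is_out_of_sync(base + wide_gap, base))
```

With these made-up numbers, only the second pair should be flagged, because its final difference sits well outside the two-standard-deviation band. Keep in mind that the outlier itself inflates the standard deviation, so a short history makes the test less sensitive.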

What just happened?

We analyzed the relation of the closing stock prices of BHP and VALE. To be precise, we calculated the correlation of their stock returns. This was achieved with the corrcoef function. Further, we saw how the covariance matrix can be computed, from which the correlation can be derived. As a bonus, a demonstration was given of the diagonal and trace functions that can give us the diagonal values and the trace of a matrix, respectively (see correlation.py):

import numpy as np
from matplotlib.pyplot import plot
from matplotlib.pyplot import show

bhp = np.loadtxt('BHP.csv', delimiter=',', usecols=(6,), unpack=True)

bhp_returns = np.diff(bhp) / bhp[ : -1]

vale = np.loadtxt('VALE.csv', delimiter=',', usecols=(6,), unpack=True)

vale_returns = np.diff(vale) / vale[ : -1]

covariance = np.cov(bhp_returns, vale_returns) 
print "Covariance", covariance

print "Covariance diagonal", covariance.diagonal()
print "Covariance trace", covariance.trace()

print covariance/ (bhp_returns.std() * vale_returns.std())

print "Correlation coefficient", np.corrcoef(bhp_returns, vale_returns)

difference = bhp - vale
avg = np.mean(difference)
dev = np.std(difference)

print "Out of sync", np.abs(difference[-1] - avg) > 2 * dev

t = np.arange(len(bhp_returns))
plot(t, bhp_returns, lw=1)
plot(t, vale_returns, lw=2)
show()
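
The listing divides the whole covariance matrix by a single product of standard deviations, so only its off-diagonal entries approximate the correlation. As a hedged alternative, the sketch below divides each entry by the matching pair of standard deviations, taken from the diagonal of the covariance matrix itself so that the normalization is consistent. It assumes Python 3 and a recent NumPy (for np.random.default_rng) and uses randomly generated returns rather than the real data; the function name correlation_from_covariance is hypothetical.

```python
import numpy as np

def correlation_from_covariance(x, y):
    """Hypothetical helper: derive the 2x2 correlation matrix from np.cov."""
    covariance = np.cov(x, y)             # normalized by N - 1 by default
    std = np.sqrt(covariance.diagonal())  # matching standard deviations
    # Entry (i, j) is divided by std[i] * std[j]
    return covariance / np.outer(std, std)

# Randomly generated "returns" for illustration only
rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.02, 30)
y = 0.7 * x + rng.normal(0.0, 0.01, 30)

corr = correlation_from_covariance(x, y)
print("Derived correlation matrix")
print(corr)
print("Matches corrcoef:", np.allclose(corr, np.corrcoef(x, y)))
```

Because the standard deviations come from the same covariance matrix, the diagonal is 1 up to rounding and the result should agree with corrcoef.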

Pop quiz – calculating covariance

Q1. Which function returns the covariance of two arrays?

  1. covariance
  2. covar
  3. cov
  4. cvar