By the WebSim™ Team
This chapter provides a selection of expression and Python examples to get a new user started. It also describes some common alpha examples and discusses good practices to follow when building alphas.
Try the alpha expressions in Table 31.1 with different Universe, Delay, Neutralization, etc. settings.
Table 31.1 Sample alpha expressions
Expression | Description |
1/close | Use inverse of daily close price as stock weights. More capital is allocated to stocks with lower daily close prices. Similarly, in the examples below, more capital is allocated to stocks with higher weights, as defined in the “Expression” column. |
volume/adv20 | Use daily volume relative to its average over the past 20 days (adv20) as stock weights. |
Correlation(close, open, 10) | Use correlation between daily close and open prices in the past 10 days as stock weights. |
open | Use daily open price as stock weights. |
(high + low)/2 - close | Use difference between average of daily high and low prices and daily close price as stock weights. |
vwap < close ? high : low | Use daily high as stock weights if the stock closes higher than daily volume weighted average price (vwap), or otherwise use daily low as stock weights. |
Rank(adv20) | Use rank of average daily volume in past 20 days (adv20) as stock weights. |
Min(0.5*(open+close), vwap) | Use the lesser of the open-close average and vwap as stock weights. |
Max(0.5*(high+low), vwap) | Use the greater of the high-low average and vwap as stock weights. |
1/StdDev(returns, 22) | Use inverse of standard deviation of stock returns in past 22 days as stock weights. |
Sum(sharesout, 5) | Use sum of outstanding shares in past 5 days as stock weights. |
Covariance(vwap, returns, 22) | Use covariance of vwap and returns for the past 22 days as stock weights. |
1/Abs(0.5*(open+close) - vwap) | Use inverse of the absolute difference between the open-close average and vwap as stock weights. |
Correlation(vwap, Delay(close, 1), 5) | Use correlation between vwap and previous day’s close for past 5 days as stock weights. |
Delta(close, 5) | Use difference between daily close and close on the date 5 days earlier as stock weights. |
Decay_linear(sharesout*vwap, 5) | Use linear decay of vwap multiplied by sharesout over the last 5 days as stock weights. |
Decay_exp(close, 0.25, 5) | Use exponential decay of close with smoothing factor 0.25 over the last 5 days as stock weights. |
Product(volume/sharesout, 5) | Use product of volume/sharesout ratio for the past 5 days as stock weights. |
Tail(close/vwap, 0.9, 1.1, 1.0) | Use close/vwap ratio as stock weights if it is less than 0.9 or greater than 1.1, or otherwise use 1 as stock weights. |
Sign(close-vwap) | Use 1 if close-vwap is positive or otherwise -1 as stock weights. |
SignedPower(close-open, 0.5) | Use sqrt of the absolute difference between close and open, keeping the sign of close-open, as stock weights. |
Pasteurize(1/(close-open)) | Use inverse of close-open pasteurized (set to NaN if it is INF or if the underlying instrument is not in the universe) as stock weights. |
Log(high/low) | Use natural logarithm of high/low ratio as stock weights. |
IndNeutralize(volume*vwap, 1) | Use market neutralized volume*vwap product as stock weights. |
Scale(close^0.5) | Use scaled sqrt of close (scaled such that the Book size is 1) as stock weights. |
Ts_Min(open, 22) | Use minimum open over the last 22 days as stock weights. |
Ts_Max(close, 22) | Use maximum close over the last 22 days as stock weights. |
Ts_Rank(volume, 22) | Use rank of current volume over the past 22 days as stock weights. |
Ts_Skewness(returns, 11) | Use skewness of returns over the last 11 days as stock weights. |
Ts_Kurtosis(returns, 11) | Use kurtosis of returns over the last 11 days as stock weights. |
Ts_Moment(returns, 3, 11) | Use 3rd central moment of returns over the last 11 days as stock weights. |
CountNans((close-open)^0.5, 22) | Use number of NaN values in (close-open)^0.5 for the past 22 days as stock weights. |
Step(1250)*close | Use close*Step(1250) product as stock weights. |
Sum_i(Delta(close,i),i,4,6,2) | Use summation of Delta(close,i) over i from 4 to 6 step 2 as stock weights. |
Call_i(Ts_Rank(x,5),x, close>vwap ? close : high) | Use Ts_Rank(x,5) as stock weights where x is daily close price if it’s higher than vwap, or otherwise use daily high price as stock weights. |
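To make the meaning of a few entries in Table 31.1 concrete, the sketch below mimics 1/close, Delta(close, 5), and Rank(close) with NumPy on a small made-up price matrix. The data and the helper functions inverse_close, delta, and rank are assumptions for illustration only; they are not the WebSim™ operators themselves, and the rank helper ignores ties.

```python
import numpy as np

# Synthetic close prices: rows are dates (oldest first), columns are instruments.
close = np.array([
    [10.0, 20.0, 40.0],
    [11.0, 19.5, 41.0],
    [12.0, 19.0, 42.0],
    [13.0, 18.5, 43.0],
    [14.0, 18.0, 44.0],
    [15.0, 17.5, 45.0],
])

def inverse_close(data, di):
    """Mimics 1/close: cheaper stocks receive larger weights."""
    return 1.0 / data[di]

def delta(data, di, n):
    """Mimics Delta(x, n): value on date di minus the value n days earlier."""
    return data[di] - data[di - n]

def rank(values):
    """Mimics Rank(x): cross-sectional rank scaled to [0, 1], assuming no ties."""
    order = values.argsort().argsort()
    return order / (len(values) - 1)

di = 5
w_inv = inverse_close(close, di)
w_delta = delta(close, di, 5)
w_rank = rank(close[di])
```

Each result is a vector with one weight per instrument, which is the shape every expression in the table ultimately produces.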
You should have a good working knowledge of the Python programming language to develop Python alphas on WebSim™. Useful links to online Python tutorials are given in Table 31.2.
Table 31.2 Links to online Python tutorials
The user should abide by the terms of use of the above-mentioned sites. The links are listed for the user’s convenience only.
Keep in mind that any user-submitted Python code is always prefixed by WebSim™ with the following:
The WebSim™ application accesses market data on the back end by calling the Python function GetData on the data registry. For example:
As explained before, the market data can be thought of as a matrix of values provided for each stock for every date the data is made available. The dates are mapped to date indices di and the instruments are mapped to instrument indices ii as shown in Table 31.3.
Table 31.3 Mapping dates to date indices di and instruments to instrument indices ii
Dates | di(date index) | Instruments | ii(instr index) |
20100101 | 0 | MSFT | 0 |
20100102 | 1 | AAPL | 1 |
20100103 | 2 | PG | 2 |
20100104 | 3 | GOOG | 3 |
20100107 | 4 | AA | 4 |
20100108 | 5 | K | 5 |
… | … | … | … |
Market data, e.g. close price, would be arranged in the form of a matrix as shown in Table 31.4.
Table 31.4 Market data arranged in the form of a matrix
Dates / Instruments | MSFT (ii=0) | HOG (ii=1) | AAPL (ii=2) | GOOG (ii=3) | PG (ii=4) | … |
20100104(di=0) | 30.95 | 25.46 | 214.01 | 626.75 | 61.12 | … |
20100105(di=1) | 30.96 | 25.65 | 214.38 | 623.99 | 61.14 | … |
20100106(di=2) | 30.77 | 25.59 | 210.97 | 608.26 | 60.85 | … |
20100107(di=3) | 30.452 | 25.8 | 210.58 | 594.1 | 60.52 | … |
20100108(di=4) | 30.66 | 25.53 | 211.98 | 602.02 | 60.44 | … |
… | … | … | … | … | … | … |
Now, to access Apple’s close price on Jan 7, 2010, we need to use close[3,2], that is, date index di=3 and instrument index ii=2.
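The lookup can be reproduced with a plain NumPy array holding the values from Table 31.4. This is only a sketch: the lists dates and instruments are stand-ins for the mapping that WebSim™ maintains internally.

```python
import numpy as np

# Stand-ins for the date and instrument mappings of Table 31.4.
dates = [20100104, 20100105, 20100106, 20100107, 20100108]
instruments = ["MSFT", "HOG", "AAPL", "GOOG", "PG"]

# Close prices: one row per date index di, one column per instrument index ii.
close = np.array([
    [30.95, 25.46, 214.01, 626.75, 61.12],
    [30.96, 25.65, 214.38, 623.99, 61.14],
    [30.77, 25.59, 210.97, 608.26, 60.85],
    [30.452, 25.80, 210.58, 594.10, 60.52],
    [30.66, 25.53, 211.98, 602.02, 60.44],
])

di = dates.index(20100107)       # date index of Jan 7, 2010
ii = instruments.index("AAPL")   # instrument index of Apple
aapl_close = close[di, ii]
```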
The sole purpose of the Generate() function, which you must implement in your code, is to populate the resulting alpha vector with a weight for every stock. The Generate function evaluates the alpha expression for every date (it acts like a loop that iterates through all date indices); hence its arguments are di (the date index corresponding to the current date) and alpha (the resulting vector that needs to be filled). For example:
Data can be accessed using dataname[di-delay,ii].
dataname[date index, instrument index] gives the value of dataname for that particular date and that particular instrument. To assign an expression to an alpha, use alpha[instrument index] = expression.
For example, alpha[ii] = dataname[di,ii].
An example alpha code to use close price data is given below:
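A minimal sketch of such an alpha follows. On WebSim™ the close matrix and the delay setting come from the platform; here they are replaced by a small synthetic array and a hard-coded value so that the snippet runs on its own, and the choice of 1/close as the expression is only an illustration.

```python
import numpy as np

# Stand-ins for what the platform would supply (assumptions for this sketch).
delay = 1
close = np.array([
    [10.0, 20.0, 30.0],
    [11.0, 19.0, 31.0],
    [12.0, 18.0, 32.0],
])

def Generate(di, alpha):
    # Fill the weight of every instrument with the inverse of its delayed close.
    alpha[:] = 1.0 / close[di - delay, :]

# Emulate a single call for date index 2.
alpha = np.full(close.shape[1], np.nan)
Generate(2, alpha)
```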
Since we are using the list-slicing functionality (“:”) of Python, it executes the expression for each instrument (corresponding column cell in the matrix).
Another example that uses returns data to define alpha[:] is given below:
To access data for an instrument over a certain window period, say n days, use dataname[di-delay-n: di-delay, ii]. An example for this is given below:
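The sketch below averages close over an n-day window for all instruments at once and ends with a validity check. The arrays close and valid, the value of delay, and the indexing of valid by [di, ii] are assumptions standing in for what the platform provides.

```python
import numpy as np

# Stand-ins for platform-provided data (assumptions for this sketch).
delay = 1
n = 3
close = np.arange(1.0, 13.0).reshape(6, 2)   # 6 dates x 2 instruments
valid = np.ones((6, 2), dtype=bool)
valid[5, 1] = False                           # instrument 1 leaves the universe

def Generate(di, alpha):
    # Rows di-delay-n .. di-delay-1: the slice end is exclusive.
    window = close[di - delay - n: di - delay, :]
    alpha[:] = window.mean(axis=0)
    # Validity check: drop instruments that are not in the universe.
    alpha[:] = np.where(valid[di, :], alpha, np.nan)

alpha = np.empty(close.shape[1])
Generate(5, alpha)
```

Because Python slices exclude their end point, the window covers exactly n rows and does not include row di-delay itself.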
The user should note that the window period chosen should be less than the number of lookback days (set to 256 days by default). This value can be retrieved using the Python function Build.GetBackdays().
The last line is added as a validity check for values in the resultant alpha array. This will ensure that the values for instruments that don’t belong to the TOP3000 universe will be filtered out. Notice that this uses the valid variable, which was initialized as mentioned in the Python code header. This is explained further in the next section.
Here are several common examples of how Python can be used.
The following example shows how to access and use multiple data fields at the same time. Use vectorization wherever possible and avoid loops, which are slow. This example uses NumPy’s built-in arithmetic (numpy.subtract is called automatically by the - operator). Here, the alpha vector is assigned the expression close - high.
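A sketch of this vectorized form is given below, with small synthetic close and high matrices and a hard-coded delay standing in for the platform data.

```python
import numpy as np

# Stand-ins for platform-provided data (assumptions for this sketch).
delay = 1
close = np.array([[10.0, 20.0], [11.0, 21.0]])
high = np.array([[10.5, 20.2], [11.4, 21.9]])

def Generate(di, alpha):
    # One vectorized subtraction covers every instrument; no explicit loop.
    alpha[:] = close[di - delay, :] - high[di - delay, :]

alpha = np.empty(close.shape[1])
Generate(1, alpha)
```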
The following Python code shows how to define custom functions and use them. It also uses the NumPy function numpy.where(), which should be used instead of loops to perform vector comparisons.
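As an illustration, the sketch below wraps numpy.where() in a custom function that mirrors the expression vwap < close ? high : low from Table 31.1. The function name pick_high_or_low and the input vectors are made up for this example.

```python
import numpy as np

def pick_high_or_low(close, vwap, high, low):
    """Element-wise choice: high where the stock closed above vwap, else low."""
    return np.where(vwap < close, high, low)

# One value per instrument for a single date (synthetic data).
close = np.array([10.0, 20.0])
vwap = np.array([9.5, 20.5])
high = np.array([10.4, 21.0])
low = np.array([9.2, 19.5])

weights = pick_high_or_low(close, vwap, high, low)
```

The comparison and the selection both run over the whole vector at once, which is what replaces the per-instrument loop.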
The valid matrix has a list of valid instruments (for example, 3,000 instruments for TOP3000) and is automatically available:
The above valid-check function can be inserted at the end of all your alpha codes as:
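One possible shape for such a check is sketched below. It assumes that valid is a boolean matrix indexed as valid[di, ii]; the helper name apply_valid and the sample data are hypothetical.

```python
import numpy as np

def apply_valid(di, alpha, valid):
    # Keep weights of valid instruments; set all others to NaN (in place).
    alpha[:] = np.where(valid[di, :], alpha, np.nan)

# Synthetic universe membership: 2 dates x 3 instruments.
valid = np.array([
    [True, True, False],
    [True, False, True],
])

alpha = np.array([0.2, -0.1, 0.4])
apply_valid(1, alpha, valid)   # would be the last statement of an alpha
```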
A list of SciPy’s statistical functions can be found at SciPy.org. We will be using the scipy.stats.rankdata() function here. This assigns ranks to the alpha weights, dealing with ties appropriately.
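A small sketch of ranking with scipy.stats.rankdata() follows. The high vector is synthetic, and the rescaling to the interval from 0 to 1 is one common choice rather than necessarily the one used by the platform.

```python
import numpy as np
from scipy.stats import rankdata

# Synthetic adjusted high prices for five instruments on one date.
high = np.array([12.0, 30.0, 12.0, 45.0, 20.0])

# rankdata returns ranks starting at 1; tied values share their average rank.
ranks = rankdata(high)

# Rescale so that the largest value maps to 1 and an untied smallest value to 0.
scaled = (ranks - 1.0) / (len(high) - 1.0)
```

The two instruments tied at 12.0 both receive rank 1.5, so neither lands exactly on 0 after rescaling.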
The above alpha expression ranks (on a scale from 0 to 1) adjusted High prices.
The following alpha example shows the usage of a SciPy function called scipy.stats.kurtosis() on close price for 11 days.
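A sketch of such an alpha, using randomly generated prices and a hard-coded delay in place of the platform data:

```python
import numpy as np
from scipy.stats import kurtosis

# Stand-ins for platform-provided data (assumptions for this sketch).
delay = 1
n = 11
rng = np.random.default_rng(0)
close = 100.0 + rng.normal(size=(15, 3)).cumsum(axis=0)   # 15 dates x 3 instruments

def Generate(di, alpha):
    window = close[di - delay - n: di - delay, :]   # 11 rows per instrument
    # Default settings give the excess (Fisher) kurtosis of each column.
    alpha[:] = kurtosis(window, axis=0)

alpha = np.empty(close.shape[1])
Generate(14, alpha)
```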
The following alpha example shows the usage of a SciPy function called scipy.stats.skew() on returns for 10 days.
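The same pattern applies to skewness. In this sketch the returns matrix is randomly generated and delay is hard-coded; both are assumptions standing in for the platform data.

```python
import numpy as np
from scipy.stats import skew

# Stand-ins for platform-provided data (assumptions for this sketch).
delay = 1
n = 10
rng = np.random.default_rng(1)
returns = rng.normal(scale=0.01, size=(14, 3))   # 14 dates x 3 instruments

def Generate(di, alpha):
    window = returns[di - delay - n: di - delay, :]   # 10 rows per instrument
    # Sample skewness of each instrument's returns over the window.
    alpha[:] = skew(window, axis=0)

alpha = np.empty(returns.shape[1])
Generate(12, alpha)
```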
A list of NumPy’s statistical functions can be found here: NumPy.org.
The alpha example below shows usage of NumPy’s mean and standard deviation functions, numpy.mean() and numpy.std():
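One way to combine the two is a z-score of the most recent close in the window, sketched below. The choice of a z-score, the synthetic close matrix, and the hard-coded delay are assumptions made for illustration.

```python
import numpy as np

# Stand-ins for platform-provided data (assumptions for this sketch).
delay = 1
n = 5
close = np.array([
    [1.0, 10.0],
    [2.0, 10.0],
    [3.0, 10.0],
    [4.0, 10.0],
    [5.0, 20.0],
    [6.0, 21.0],
    [7.0, 22.0],
])

def Generate(di, alpha):
    window = close[di - delay - n: di - delay, :]
    latest = window[-1]
    # Distance of the latest close from the window mean, in standard deviations.
    alpha[:] = (latest - np.mean(window, axis=0)) / np.std(window, axis=0)

alpha = np.empty(close.shape[1])
Generate(6, alpha)
```

numpy.std() computes the population standard deviation by default (ddof=0); pass ddof=1 if the sample estimate is wanted.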
The alpha example below shows usage of NumPy’s maximum function numpy.amax():
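A sketch of an alpha that applies numpy.amax() within each industry is given below: every close is divided by the highest close in its industry. The close and industry matrices are synthetic, the layout industry[di, ii] follows the text, and the particular expression is an assumption.

```python
import numpy as np

# Stand-ins for platform-provided data (assumptions for this sketch).
delay = 1
close = np.array([
    [10.0, 40.0, 20.0, 50.0],
    [11.0, 41.0, 21.0, 51.0],
])
# industry[di, ii] holds an integer industry index for each instrument.
industry = np.array([
    [0, 1, 0, 1],
    [0, 1, 0, 1],
])

def Generate(di, alpha):
    row = di - delay
    # Looping over the few industries is cheap; work inside each is vectorized.
    for ind in np.unique(industry[row]):
        members = industry[row] == ind
        ind_close = close[row, members]
        alpha[members] = ind_close / np.amax(ind_close)

alpha = np.empty(close.shape[1])
Generate(1, alpha)
```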
Industry data here is a NumPy array of indices assigned for every available industry. Industries such as Forestry, Metal Mining, Electrical Work, Meat Packing Plants, Textile Mills, Book Printing, etc., have indices assigned to them. The alpha above shows how this industry data can be accessed and used.
Table 31.5 shows a sample of instrument indices, industry indices (values of industry[di-delay,ii]), and close data.
For the above alpha example, the industry close array values for each “I” will be as follows:
Note that you can also access and use sector and subindustry data using dr.GetData() in your alphas.
In the code given below, five days’ values of close price and vwap as training sample are used to calculate regression weights w1 and w2.
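A minimal sketch of such a fit with numpy.linalg.lstsq() is shown below. The text does not state the regression target, so this example assumes a model y = w1*close + w2*vwap without an intercept and builds a synthetic target from known weights so that the result can be checked. All data values are made up.

```python
import numpy as np

# Five days of training data for one instrument (synthetic values).
close_win = np.array([10.0, 11.0, 12.0, 11.0, 13.0])
vwap_win = np.array([10.2, 10.8, 12.1, 11.3, 12.7])

# Assumed target: built from known weights so the fit is verifiable.
y = 0.3 * close_win + 0.7 * vwap_win

# Design matrix with one column per regressor (no intercept term).
X = np.column_stack([close_win, vwap_win])

# Least-squares solution of X @ w = y.
w, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)
w1, w2 = w
```

With real data the target would typically be a quantity to be predicted, and the fitted w1 and w2 would then be applied to the latest close and vwap to form the alpha value.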