By Peng Yan
Traditionally, an alpha is defined as the active return from an investment after risk adjustment is applied. In this chapter, alpha means a quantitative model to predict future investment returns.
The start of an alpha design can be a hypothesis, a paper, a story, an inspiration, or just a random idea.
Similar to academic research, many hypotheses are wrong and many trials futile; only a few succeed. We are human, and so are other market participants. People differ in their ideas, and only a small fraction of those ideas generate profits consistently in the real market. At times you will strongly believe a model will work, yet testing proves otherwise, or vice versa.
Asset prices can be affected by many factors, either directly or indirectly. A single idea may capture just one of these factors and neglect the others.
We call the process of testing an idea simulation. There are different simulation methods.
In our working environment, simulation means backtesting: when we have an idea, we apply it to historical data and check the model's performance. The assumption behind backtesting is that if an idea worked in the past, it is more likely to work in the future. By the same token, a model will not be considered if it has no historical simulation performance.
Backtest results are used for model pre-selection, for comparison between different models, and for judging an alpha's potential value. They include measures such as the Sharpe ratio, turnover, returns, and correlation.
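To make two of these measures concrete, the sketch below computes an annualized Sharpe ratio from daily returns and an average daily turnover from a position matrix. The function names and conventions (252 trading days, turnover as fraction of book traded) are illustrative assumptions, not a prescribed methodology.

```python
import numpy as np

def sharpe_ratio(daily_returns):
    """Annualized Sharpe ratio, assuming ~252 trading days per year."""
    daily_returns = np.asarray(daily_returns, dtype=float)
    return np.sqrt(252) * daily_returns.mean() / daily_returns.std()

def turnover(positions):
    """Average daily fraction of the book that is traded.

    positions: (days, instruments) array of dollar positions.
    """
    positions = np.asarray(positions, dtype=float)
    traded = np.abs(np.diff(positions, axis=0)).sum(axis=1)  # dollars traded each day
    book = np.abs(positions[1:]).sum(axis=1)                 # book size each day
    return (traded / book).mean()
```

A strategy that holds fixed positions has zero turnover; one that flips its entire book every day has turnover of 2 under this convention (every position is closed and an opposite one opened).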
Backtesting is just one additional step once we have an idea. Good backtest performance is not sufficient for a profitable strategy; many other factors affect real investment. As a general matter, one should not invest capital solely on the basis of backtest results. Some of the reasons are:
Overfitting is the topic of this chapter. The term comes from statistics and machine learning and is critical in our backtest framework. Financial markets are noisy, and even a very good model may have only minimal positive predictive power. Under the efficient market hypothesis, it is presumed that there are no arbitrage opportunities from which to profit. When you see good simulation results, you need to evaluate the models' overfitting risk carefully.
Multiple techniques have been proposed to reduce overfitting risk, for example 10-fold cross-validation, regularization, and prior probabilities. Ten-fold cross-validation breaks the data into 10 sets of size n/10, trains on 9 of them and tests on the remaining one, then repeats this 10 times and takes the mean accuracy. Regularization, as used in statistics and machine learning for model selection, prevents overfitting by penalizing models with extreme parameter values. The prior probability of an uncertain quantity p is the probability distribution that expresses one's uncertainty about p before some evidence is taken into account. Recently there have been papers on overfitting in the quantitative investment field, e.g. Bailey (2014a), Bailey (2014b), Beaudan (2013), Burns (2006), Harvey et al. (2014), Lopez de Prado (2013), and Schorfheide and Wolpin (2012).
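The first two techniques can be combined in a short sketch: manual 10-fold cross-validation wrapped around closed-form ridge regression, where the penalty term shrinks extreme coefficients. The helper names and the closed-form solver are illustrative assumptions; any regularized learner could be substituted.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: the lam * I term penalizes large coefficients."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def kfold_mse(X, y, lam, k=10):
    """Mean out-of-fold squared error over k folds (train on k-1, test on 1)."""
    folds = np.array_split(np.arange(len(y)), k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = ridge_fit(X[train], y[train], lam)
        errs.append(np.mean((X[test] @ w - y[test]) ** 2))
    return float(np.mean(errs))
```

Sweeping `lam` over a grid and keeping the value with the lowest cross-validated error is one common way to choose the penalty strength without peeking at the test data.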
There are some guidelines to reduce the overfitting risk. Some of them are borrowed from the statistical/machine learning field.
Out-of-sample test: to test an alpha model properly, the out-of-sample test must be a true out-of-sample test. That is, we build a model, run it daily in the real environment, and monitor how it performs. It is not valid to (1) backtest models on the most recent N years of data and use the N years before that as the out-of-sample period, or (2) fit models on one subset of instruments and use the remaining instruments as the out-of-sample universe. In case (1), the recent market reflects information from older history, so models that work recently tend also to work in the past. In case (2), instruments are correlated, so models that perform well in one universe tend to perform well in another.
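When live testing is not yet possible, a time-ordered walk-forward split at least preserves the causal direction that the flawed schemes above violate: the test window always comes strictly after the training window. The sketch below is an assumed, minimal illustration of that splitting logic, not a complete evaluation framework.

```python
import numpy as np

def walk_forward_splits(n_days, train_window, test_window):
    """Yield (train_idx, test_idx) pairs where the test period strictly
    follows the training period in time -- never the reverse."""
    start = 0
    while start + train_window + test_window <= n_days:
        train = np.arange(start, start + train_window)
        test = np.arange(start + train_window, start + train_window + test_window)
        yield train, test
        start += test_window  # roll the window forward in time
```

Each successive split rolls forward by one test window, so every data point is evaluated at most once with a model that has seen only its past.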
Please note: as the number of out-of-sample alphas grows, the out-of-sample test itself becomes biased. Some alphas will perform well purely by luck, so out-of-sample performance at the single-alpha level is inadequate.
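A quick simulation makes this multiple-testing bias tangible: generate many alphas that are pure noise, with zero true predictive power, and the best of them will still show an impressive annualized Sharpe ratio by chance alone. The parameters below (1,000 alphas, one year of daily PnL) are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n_alphas, n_days = 1000, 252

# Pure-noise "alphas": every one has zero true predictive power.
pnl = rng.normal(0.0, 0.01, size=(n_alphas, n_days))
sharpes = np.sqrt(252) * pnl.mean(axis=1) / pnl.std(axis=1)

best = sharpes.max()
print(f"average Sharpe: {sharpes.mean():.2f}, best Sharpe: {best:.2f}")
```

The average Sharpe is near zero, as it should be, yet the maximum over 1,000 trials is typically around 3, which is why a single lucky alpha surviving an out-of-sample period proves little on its own.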
These alternative methods can be useful: