Ten considerations for a backtesting model

In the previous section, we performed one replication of a backtest. Our result looks fairly optimistic. However, is this sufficient to conclude that the model is good? The truth is that backtesting involves a great deal of research and has spawned a literature of its own. The following list briefly covers some of the considerations you might want to keep in mind when implementing your backtests.

Resources restricting your model

The resources available to your backtesting system limit how faithfully you can implement your backtest. A financial model that generates signals using only the last closing price needs just a set of historical closing prices. A trading system that reads from the order book requires all levels of order book data to be available on every tick, which adds considerably to the storage requirements. Other resources, such as exchange data, estimation techniques, and computing power, also constrain the nature of the model that can be used.

Criteria of evaluation of the model

How can we conclude that a model is good? Some factors worth considering are the Sharpe ratio, hit ratio, average rate of return, VaR statistics, and the minimum and maximum drawdown encountered. How should a combination of these factors be balanced so that a model is usable? How much maximum drawdown can be tolerated in exchange for a high Sharpe ratio?
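As a minimal sketch, assuming daily returns stored in a NumPy array, two of these metrics might be computed as follows (the random returns are stand-in data):

import numpy as np

def sharpe_ratio(returns, periods_per_year=252):
    # Annualized Sharpe ratio, assuming a zero risk-free rate for simplicity.
    return np.sqrt(periods_per_year) * returns.mean() / returns.std()

def max_drawdown(returns):
    # Largest peak-to-trough decline of the cumulative return curve.
    cumulative = np.cumprod(1 + returns)
    running_peak = np.maximum.accumulate(cumulative)
    return (cumulative / running_peak - 1).min()

daily_returns = np.random.normal(0.0005, 0.01, 252)  # stand-in daily returns
print(sharpe_ratio(daily_returns))
print(max_drawdown(daily_returns))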

Estimating the quality of backtest parameters

Using a variety of parameters on a model typically gives varied results, so running multiple parameterizations yields an additional set of results for each candidate model. Can the parameters from the best-performing model be trusted? Methods such as model averaging can help us correct overly optimistic estimates.

Note

Model averaging takes the average fit across a number of models, as opposed to using the single best model.
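A minimal sketch of equal-weight model averaging, assuming three hypothetical candidate models have already produced one-step-ahead forecasts (the values are illustrative only):

import numpy as np

# Hypothetical forecasts from three candidate parameterizations.
forecasts = {
    'model_a': 0.0012,
    'model_b': 0.0007,
    'model_c': -0.0003,
}

# Use the mean forecast instead of committing to the single
# best-performing (and possibly overfit) model.
averaged_forecast = np.mean(list(forecasts.values()))
print(averaged_forecast)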

Be prepared to face model risk

Perhaps, after extensive backtesting, you find yourself with a high-quality model. How long will it stay that way? Model risk refers to the possibility that the market structure or the model parameters change over time, or that a regime change abruptly alters the functional form of your model. By then, you may not even be certain that your model is still correct. One way of addressing model risk is to use model averaging.

Performance of a backtest with in-sample data

Backtesting lets us perform extensive parameter searches that optimize the results of a model. Such optimization exploits not only the genuine features of the sample data but also its idiosyncrasies, and historical data can never fully mimic the data that streams from live markets. As a result, these optimized results will always produce an overly optimistic assessment of the model and the strategy used.
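One common safeguard is to tune parameters on an in-sample window only, and then evaluate the chosen parameters once on held-out data. Here is a minimal sketch under that assumption, using a toy moving-average strategy and stand-in random prices:

import numpy as np

def backtest(prices, lookback):
    # Toy strategy: hold the asset when price is above its moving average.
    ma = np.convolve(prices, np.ones(lookback) / lookback, mode='valid')
    rets = np.diff(prices[lookback - 1:]) / prices[lookback - 1:-1]
    signal = (prices[lookback - 1:-1] > ma[:-1]).astype(float)
    return np.prod(1 + signal * rets) - 1  # cumulative return

prices = 100 * np.cumprod(1 + np.random.normal(0, 0.01, 1000))  # stand-in data

# Tune the lookback on the in-sample window only, then judge the chosen
# value once on the untouched out-of-sample window.
split = int(len(prices) * 0.7)
in_sample, out_of_sample = prices[:split], prices[split:]
best = max(range(5, 55, 5), key=lambda lb: backtest(in_sample, lb))
print(best, backtest(out_of_sample, best))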

Addressing common pitfalls in backtesting

The most common error made in backtesting is look-ahead bias, and it comes in many forms. For example, parameter estimates may be derived from the entire period of the sample data, which amounts to using information from the future. Statistical estimates such as these, and model selection itself, should instead be performed sequentially, using only the data available at each point in time, which can be difficult to do in practice.
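The following sketch contrasts a sequential, expanding-window estimate with a full-sample estimate that leaks future information (the price series is stand-in data):

import numpy as np

prices = 100 * np.cumprod(1 + np.random.normal(0, 0.01, 500))  # stand-in data
returns = np.diff(prices) / prices[:-1]

# Correct: at each step t, estimate using only the data known up to t
# (an expanding window), never the full sample.
sequential_estimates = [returns[:t].mean() for t in range(60, len(returns))]

# Wrong (look-ahead bias): one estimate over the entire sample, applied
# retroactively to every historical date.
biased_estimate = returns.mean()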

Errors in data come in all forms, from hardware, software, and human errors that can occur as data is routed through distribution vendors. Listed companies may split, merge, or delist, resulting in substantial changes to their stock prices; these corporate actions can lead to survivorship bias in our models. Failure to clean data properly gives undue influence to the idiosyncratic aspects of the data, and thus affects the model parameters; a split-adjustment sketch follows the note below.

Note

Survivorship bias is the logical error of concentrating on results that have survived some past selection process. For example, a stock market index may report a strong performance even in bad times because poorly performing stocks are dropped from its composition, resulting in an overestimation of past returns.
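As an illustration of one such cleaning step, the following sketch back-adjusts prices for a 2-for-1 stock split so that computed returns reflect economic reality rather than the corporate action (the dates, prices, and split are all invented):

import pandas as pd

# Hypothetical raw closing prices around a 2-for-1 split; the apparent
# halving on the split date is not a real 50% loss.
raw = pd.Series([100.0, 102.0, 51.5, 52.0],
                index=pd.to_datetime(['2014-01-01', '2014-01-02',
                                      '2014-01-03', '2014-01-06']))

# Back-adjust all prices before the split date by the split ratio.
split_date, ratio = pd.Timestamp('2014-01-03'), 2.0
adjusted = raw.copy()
adjusted[adjusted.index < split_date] /= ratio
print(adjusted.pct_change())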

Failure to use shrinkage estimators or model averaging can produce reported results containing extreme values, making comparison and evaluation difficult.

Note

In statistics, a shrinkage estimator is an alternative to a raw estimator, such as an ordinary least squares estimator, designed to produce a smaller mean squared error. It works by shrinking raw estimates from the model output towards zero or some other fixed constant value.
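A minimal sketch of the idea, shrinking a raw sample mean towards a target of zero; the shrinkage intensity here is a hand-picked tuning choice, not a prescribed value:

import numpy as np

sample_returns = np.random.normal(0.001, 0.02, 60)  # stand-in data

# Shrink the raw sample mean towards a fixed target (here, zero).
# lam = 0 keeps the raw estimate; lam = 1 uses the target only.
raw_estimate = sample_returns.mean()
target, lam = 0.0, 0.5
shrunk_estimate = (1 - lam) * raw_estimate + lam * target
print(raw_estimate, shrunk_estimate)

In practice, the intensity can be chosen analytically rather than by hand; the Ledoit-Wolf estimator for covariance matrices is a well-known example of this approach.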

Have a common sense idea of your model

Common sense is often lacking in our models. We may attempt to explain a trendless variable with a trended variable, or infer causation from correlation. Are logarithmic values used when the context requires them, and avoided when it does not?
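To take the logarithm question as a concrete example, here is a small sketch showing how simple and log returns can tell different stories over the same invented price path:

import numpy as np

prices = np.array([100.0, 110.0, 99.0])

simple_returns = prices[1:] / prices[:-1] - 1  # +10% then -10%
log_returns = np.diff(np.log(prices))

print(simple_returns.sum())  # 0.0, yet the asset is down 1% overall
print(log_returns.sum())     # about -0.01005 = log(99/100), the true picture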

Understanding the context for the model

Having a common sense idea of a model is barely sufficient. A good model takes into account the history, the personnel involved, the operating constraints, common peculiarities, and a full understanding of the rationale behind the model. Do commodity prices follow seasonal movements? How was the data gathered? Are the formulas used in the computation of variables reliable? These questions can help us determine the causes, should things go wrong.

Make sure you have the right data

Not many of us have access to tick-level data, and lower-resolution data misses the detailed information in between. Even tick-level data can be fraught with errors. Summary statistics, such as the mean, standard error, maximum, minimum, and correlations, tell us a great deal about the nature of the data: whether we can really use it, and whether the backtest parameter estimates inferred from it are credible.
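With pandas, for instance, a first pass over the data might look like this minimal sketch (the two return series are stand-in data):

import numpy as np
import pandas as pd

# Stand-in daily returns for two instruments; in practice these would
# come from your data vendor.
data = pd.DataFrame({
    'asset_a': np.random.normal(0.0005, 0.010, 252),
    'asset_b': np.random.normal(0.0003, 0.015, 252),
})

print(data.describe())  # count, mean, std, min, quartiles, and max per column
print(data.corr())      # pairwise correlations between the instruments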

When data cleaning is performed, we might ask the following questions: What should we look out for? Are the values realistic and logical? How is missing data coded?
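A sketch of such checks, using an invented price series containing a gap and a suspicious zero:

import numpy as np
import pandas as pd

prices = pd.Series([100.0, 101.5, np.nan, 0.0, 103.2])  # invented data

print(prices.isna().sum())  # how many values are missing?
print((prices <= 0).sum())  # zero or negative prices are rarely logical

# One possible treatment: recode the suspicious zero as missing, then
# forward-fill short gaps explicitly instead of letting them propagate.
cleaned = prices.replace(0.0, np.nan).ffill()
print(cleaned)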

Devise a system for reporting data and results. Graphs help the human eye pick out patterns that might otherwise go unnoticed: histograms might reveal an unexpected distribution, and residual plots might show unexpected patterns in the prediction errors. Scatterplots of residualized data may point to additional modeling opportunities, as sketched after the note below.

Note

Residualized data consists of the differences, or residuals, between the observed values and those predicted by the model.
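For instance, a quick residual diagnostic with matplotlib might look like the following sketch, using simulated data and a simple least-squares fit:

import numpy as np
import matplotlib.pyplot as plt

x = np.random.normal(0, 1, 200)
y = 2 * x + np.random.normal(0, 1, 200)  # simulated linear relationship

slope, intercept = np.polyfit(x, y, 1)  # simple least-squares fit
residuals = y - (slope * x + intercept)

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.hist(residuals, bins=20)   # does the distribution look as expected?
ax2.scatter(x, residuals)      # visible structure suggests a misspecified model
plt.show()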

Data mine your results

The results from running several iterations of backtests are one source of information about your model, and running your model under real-time conditions provides another. Data mining all of this wealth of information yields data-driven conclusions that help avoid tailoring the model specification to the sample data. It is recommended that you use shrinkage estimators or model averaging when reporting your results.
