Chapter 14
Threats to Validity
Profitable real trading systems are complex and must cope with varying market
scenarios. While the empirical results have demonstrated the effectiveness of the
proposed strategies, there is still a long way to go before the production stage. In
this chapter, we discuss the various assumptions made in the trading model, the
back-tests, and so on.
This chapter is organized as follows: Section 14.1 discusses the assumptions on
the model, and Section 14.2 discusses the assumptions on the mean reversion princi-
ples. Section 14.3 discusses the proposed algorithms from a theoretical perspective.
Section 14.4 validates the empirical studies. Finally, Section 14.5 summarizes this
chapter and proposes some future directions.
14.1 On Model Assumptions
Any statement about the encouraging empirical results achieved by the proposed
algorithms would be incomplete without acknowledging the simplifying assumptions
behind them. To recall, we made several assumptions regarding transaction costs,
market liquidity, and market impact that would affect the algorithms’ practical
deployment.
The first assumption is that no transaction cost exists. In Section 13.4, we have
already examined the effects of varying transaction costs, and the results show that
the proposed algorithms can withstand moderate transaction costs in most cases.
Currently, with the widespread adoption of electronic communication networks and
multilateral trading facilities in financial markets, various online trading brokers
charge very small transaction cost rates, especially for large institutional investors.
They also offer flat rates, which depend on the trading volume one reaches. Such
measures help portfolio managers lower their transaction cost rates.
Footnote: For example, for US equities and options, E*Trade (, accessed on
16 March 2011.) charges only $9.99 for $50,000+ or 30+ stocks per quarter.
The second assumption is that the market is liquid and one can buy and sell any
quantity at quoted prices. In practice, low market liquidity often means a large
bid–ask spread, that is, a large gap between the prices quoted for an immediate bid
and an immediate ask. As a result, the execution of orders may incur a discrepancy
between the prices
T&F Cat #K23731 — K23731_C014 — page 129 — 9/26/2015 — 8:12
sent by algorithms and the prices actually executed. Moreover, stocks are often traded
in multiples of lots, a lot being the standard trading unit containing a number of
shares. In this situation, the quantity of a stock may not be arbitrarily divisible. In
our numerical evaluations, we have tried to minimize the effect of market liquidity by
choosing stocks with large market capitalizations, which usually have small bid–ask
spreads and small price discrepancies, and thus high market liquidity.
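The first two assumptions can be relaxed in a back-test along the following lines. This is a minimal sketch, not the procedure used in the book: the proportional cost model, the lot size of 100 shares, the 0.1% rate, and all function names are assumptions chosen for illustration.

```python
def round_to_lots(target_shares, lot_size):
    """Round a desired (long-only) share count down to a whole number of lots."""
    if lot_size <= 0:
        raise ValueError("lot_size must be positive")
    return int(target_shares // lot_size) * lot_size


def executable_holdings(wealth, weights, prices, lot_size=100):
    """Turn target portfolio weights into whole-lot share counts.

    Whatever cannot be invested because of lot rounding is left as cash.
    """
    shares = [round_to_lots(wealth * w / p, lot_size)
              for w, p in zip(weights, prices)]
    cash = wealth - sum(s * p for s, p in zip(shares, prices))
    return shares, cash


def proportional_cost(old_values, new_values, rate):
    """Cost charged as a fixed fraction of the total value traded (assumed model)."""
    traded = sum(abs(new - old) for old, new in zip(old_values, new_values))
    return rate * traded


# Hypothetical two-asset example.
prices = [33.0, 120.0]
shares, cash = executable_holdings(100_000.0, [0.5, 0.5], prices)
new_values = [s * p for s, p in zip(shares, prices)]
cost = proportional_cost([50_000.0, 50_000.0], new_values, rate=0.001)
print(shares, cash, cost)
```

In this example the lot constraint leaves part of the wealth uninvested, so the realized portfolio differs from the one the algorithm requested even before any cost is charged.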
The third assumption is that a portfolio strategy would have no impact on the
market, that is, the stock market will not be affected by any trading algorithms. In
practice, the impact can be neglected if the market capitalization of a portfolio is not too
large. However, as the experimental results show, the portfolio wealth generated by the
proposed algorithms increases very fast, which would inevitably impact the markets.
One simple way to handle this issue is to scale down the portfolio, as is done by
many quantitative funds. Moreover, the development of sell-side algorithmic trading,
which slices a big order into multiple smaller orders and schedules these orders to
minimize their market impact, can significantly decrease the potential market impact
of the proposed algorithms.
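The order-slicing idea can be illustrated with the simplest possible schedule, which splits a parent order into near-equal child orders. This is only a sketch: real execution algorithms typically schedule child orders using volume profiles and market conditions, and the function name `slice_order` is hypothetical.

```python
def slice_order(total_shares, n_slices):
    """Split a parent order into n_slices near-equal child orders.

    Any remainder is spread one share at a time over the earliest slices,
    so the child orders always sum to the parent order.
    """
    if n_slices <= 0:
        raise ValueError("n_slices must be positive")
    if total_shares < 0:
        raise ValueError("total_shares must be non-negative")
    base, remainder = divmod(total_shares, n_slices)
    return [base + 1 if i < remainder else base for i in range(n_slices)]


# Hypothetical example: 10,000 shares executed in 6 child orders.
children = slice_order(10_000, 6)
print(children)
```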
Here, we emphasize again that our current study assumes a “perfect market,”
which is consistent with existing studies in the literature. It is important to note that
even in such a perfect financial market, no algorithm has ever claimed such high
performance, especially on the standard NYSE (O) dataset. Though past performance
may not be a reliable indicator of future performance, such encouraging results do
give us confidence that the proposed algorithms may work well in future unseen
markets.
14.2 On Mean Reversion Assumptions
Though the proposed mean reversion algorithms perform well on most datasets, we
do not claim that they perform well on arbitrary portfolio pools. Note that
passive–aggressive mean reversion (PAMR) and confidence-weighted mean reversion
(CWMR) rely on the assumption that (single-period) mean reversion exists in a
portfolio pool, that is, that buying stocks that underperformed in previous periods is
profitable. The preceding experiments suggest that, in most cases, such mean reversion
does exist. However, this assumption may fail to hold in certain cases, especially
when portfolio components are poorly selected. The performance of PAMR/CWMR on
the DJIA dataset indicates that (single-period) mean reversion may not exist in that
dataset. Although both are based on mean reversion, PAMR and Anticor are formulated
with different time periods of mean reversion, which may explain why Anticor
achieves good performance on DJIA. This also motivates the proposed online moving
average reversion (OLMAR), which exploits multiple-period instead of single-period
mean reversion. Thus, before investing in a real market, it is crucial to ensure that
the motivating mean reversion, either single period or multiple period, does exist in
the portfolio pool. In
academia, the mean reversion property of a single stock has been extensively studied
(Poterba and Summers 1988; Hillebrand 2003; Exley et al. 2004); one natural way
to test it is to calculate the sign of its autocorrelation (Poterba and Summers 1988).
In contrast, the mean reversion property of a portfolio has received little academic
attention. Our motivation for CWMR (Table 10.1) provides a preliminary method to test
single-period mean reversion. Unlike mean reversion in a single stock, mean
reversion in a portfolio concerns not only the mean reversion of individual stocks but
also the interactions among different stocks.
Footnote (to the discussion of market liquidity in Section 14.1): However, we cannot
say that we have removed or eliminated the impact of the bid–ask spread.
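The single-stock check mentioned in Section 14.2, the sign of the autocorrelation, can be sketched as follows. Applying it to a series of periodic returns at lag 1 is an assumption made here for illustration, and the function name is hypothetical. A negative value is only suggestive of mean reversion; the sketch says nothing about statistical significance, nor about the interactions among stocks in a portfolio.

```python
def lag1_autocorrelation(returns):
    """Sample lag-1 autocorrelation of a return series.

    A negative value means that above-average returns tend to be followed by
    below-average ones, which is the pattern mean reversion would produce.
    """
    n = len(returns)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(returns) / n
    variance_sum = sum((r - mean) ** 2 for r in returns)
    if variance_sum == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance_sum = sum((returns[t] - mean) * (returns[t - 1] - mean)
                         for t in range(1, n))
    return covariance_sum / variance_sum


# Hypothetical return series: one alternating, one steadily trending.
reverting = [0.01, -0.01, 0.01, -0.01, 0.01, -0.01]
trending = [0.01, 0.02, 0.03, 0.04, 0.05]
print(lag1_autocorrelation(reverting) < 0)  # alternating series: negative sign
print(lag1_autocorrelation(trending) > 0)   # trending series: positive sign
```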
14.3 On Theoretical Analysis
In this book, our evaluations focus on empirical aspects of the strategies, which is
unfair to some theoretically guaranteed methods, such as UP, EG, and ONS. Although
the four proposed algorithms are not designed to asymptotically achieve the
exponential growth rate of a specific expert, such as BCRP, it is worth explaining the
theoretical analysis that is missing from our study.
On the one hand, we give no theoretical guarantee, or universal property, for
the four proposed algorithms. In particular, we find it hard to prove the universal
property for CORN, as it utilizes a correlation coefficient to select a similarity set.
For the three mean reversion algorithms, since the mean reversion trading idea is
counterintuitive, it is difficult to provide a traditional regret bound. Although we
cannot prove the traditional regret bound, the proposed algorithms do provide strong
empirical evidence, which sequentially advances the state of the art.
On the other hand, it is possible to utilize certain meta-algorithms (Li et al. 2012,
2013; Li and Hoi 2012) that combine the proposed algorithms and some universal
portfolio selection algorithms, such that the entire meta-system enjoys the universal
property (Das and Banerjee 2011, Corollary 1). Meanwhile, such a meta-system can
also benefit from the proposed algorithms and can produce significantly high empirical
performance. Note that even with a worst-case guarantee, some existing universal
algorithms perform poorly on the datasets. In any case, even though it is convenient
to propose a universal meta-system, the theoretical properties of the original
algorithms remain an open question and deserve further exploration.
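To make the combination idea concrete, the sketch below uses the simplest rule, a buy-and-hold mixture that splits the initial wealth equally among the base algorithms and never rebalances between them. This is not the meta-algorithm of the cited works, only an illustration under that assumption, and the function name is hypothetical. Its final wealth is the average of the base algorithms' final wealths, hence at least 1/K of the best of K base algorithms.

```python
def mixture_wealth(expert_relatives):
    """Cumulative wealth of an equal-weight buy-and-hold mixture of experts.

    expert_relatives[k][t] is the factor by which expert k multiplies its
    wealth in period t. Initial wealth is 1, split equally among experts.
    """
    n_experts = len(expert_relatives)
    if n_experts == 0:
        raise ValueError("need at least one expert")
    n_periods = len(expert_relatives[0])
    sub_wealth = [1.0 / n_experts] * n_experts
    history = []
    for t in range(n_periods):
        sub_wealth = [w * rel[t] for w, rel in zip(sub_wealth, expert_relatives)]
        history.append(sum(sub_wealth))
    return history


# Hypothetical example: one strong expert and one weak expert over two periods.
history = mixture_wealth([[1.1, 1.2], [0.9, 1.0]])
best_final = max(1.1 * 1.2, 0.9 * 1.0)
print(history[-1], best_final / 2)
```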
14.4 On Back-Tests
Due to the unavailability of intraday data and order books, we have conducted
all the experiments on public daily data, even though this may suffer from certain
potential problems. One potential problem is that our algorithms may be earning
“dealers’ profits” in an uncontrolled and unfair way, or may simply be profiting from
the “bid–ask bounce” (McInish and Wood 1992; Porter 1992), which results from
trades executing alternately at the market maker’s bid and ask quotes. This suspicion
is compatible with the algorithms being contrarian strategies, such as PAMR, CWMR,
and OLMAR. To rule out this possibility, one could remove the bid–ask bounce by
replacing the market prices with the midpoint of the best bid and ask prices (Gosnell
et al. 1996). However, calculating the midpoints of the best bid and ask prices
requires access to the order book, which is usually private and not free, rather than
simply the log of transactions. Another possibility would be to take into account only
“sell-type” (or only “buy-type”) transactions, meaning the transactions in response to
market orders to sell, in which case the buying counterpart would be the one issuing a
limit order. However, this approach also requires knowing the order type (Keim and
Madhavan 1995; Foucault et al. 2005) of each trade, which is usually not available to
the public.
Footnote (to the regret-bound discussion in Section 14.3): Borodin et al. (2004)
failed to provide a regret bound for the Anticor strategy, which passively exploits the
mean reversion idea.
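The effect of the bid–ask bounce, and of the midpoint remedy, can be seen in a small sketch. The quotes and trade prices are made up for illustration, and the function names are hypothetical. With unchanged quotes, trades that alternate between the bid and the ask produce price relatives that oscillate around 1, which a contrarian strategy could mistake for mean reversion, whereas midpoint prices show no movement at all.

```python
def midpoint(bid, ask):
    """Midpoint of the best bid and best ask quotes."""
    if bid <= 0 or ask < bid:
        raise ValueError("expected 0 < bid <= ask")
    return (bid + ask) / 2.0


def price_relatives(prices):
    """Ratio of each price to the previous one."""
    return [prices[t] / prices[t - 1] for t in range(1, len(prices))]


# Hypothetical data: quotes never change, trades alternate between bid and ask.
quotes = [(9.99, 10.01)] * 4
trade_prices = [9.99, 10.01, 9.99, 10.01]
mid_prices = [midpoint(bid, ask) for bid, ask in quotes]

print(price_relatives(trade_prices))  # oscillates above and below 1
print(price_relatives(mid_prices))    # constant, no spurious reversal
```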
Back-tests in historical markets may suffer from “data-snooping bias” issues, one
of which is the dataset selection issue. On the one hand, we selected four datasets,
the NYSE (O), TSE, SP500, and DJIA datasets, based on previous studies and without
regard to the proposed approaches. On the other hand, we developed the proposed
algorithms solely on the NYSE (O) dataset, while the other five datasets (NYSE (N),
TSE, SP500, MSCI, and DJIA) were obtained after the algorithms were fully developed.
However, even though we are cautious about the dataset selection issue, it may still
appear in the experiments, especially for the datasets with a relatively long history,
that is, NYSE (O) and NYSE (N). The NYSE (O) dataset, pioneered by Cover (1991) and
followed by other researchers, is a “standard” dataset in the online portfolio selection
community. Since it contains 36 large-cap NYSE stocks that survived for 22 years, it
suffers from extreme survival bias. Nevertheless, it remains useful for comparing
different algorithms, as done in all previous studies. The NYSE (N) dataset, as a
continuation of NYSE (O), contains 23 assets that survived from the previous 36
stocks for another 25 years. Therefore, it is even worse than its predecessor in terms
of survival bias. In summary, even though the empirical results on these datasets
clearly show the effectiveness of the proposed algorithms, one cannot make claims
without noting the deficiencies of these datasets.
Another common bias is the asset selection issue. Four of the six datasets (the
NYSE (O), TSE, SP500, and DJIA) were collected by others, and to the best of our
knowledge, their assets are mainly the largest blue chip stocks in their respective
markets. As a continuation of NYSE (O), we collected NYSE (N) ourselves, which again
contains several of the largest surviving stocks in NYSE (O). The remaining dataset
(MSCI) is chosen according to the world indices. In summary, we try to avoid
the asset selection bias by arbitrarily choosing some representative stocks in their
respective markets, which usually have large capitalization and high liquidity and
thus reduce the market impact caused by any proposed portfolio strategies.
Footnote: In fact, we collected this dataset in response to review comments on
Li et al. (2012), which means that the dataset did not exist before that paper's
third-round submission.
Moreover, there is some criticism regarding the datasets’ liquidity, since the model
assumes that the assets are available in unbounded quantities for buying or selling
in any given trading period. In Table 13.1, we observe cumulative wealth of
10^[exponent missing] or more, and there are assets with capitalization less than
10^[exponent missing]; then, obviously, the liquidity assumption is not fulfilled. In
NYSE (O), there are many such assets, and even in NYSE (N) there are four such
assets: SHERW, KODAK, COMME, and KINAR. The most “dangerous” asset is KINAR,
identified as asset #23 in Table 13.5, for which no capitalization data are available
but which is certainly a very small asset. One remedy is to consider only the
remaining 19 assets out of the 23 in the experiments, as done by Györfi et al. (2012,
Chapter 2).
Finally, following existing model assumptions and experimental settings, we do not
consider low-quality assets, such as bankrupt and penny stocks. Bankrupt stock data
are difficult to acquire; thus, we cannot observe how such stocks behave or predict
how the proposed algorithms would behave on them. In reality, bankruptcy rarely
happens to blue chip stocks because a failing stock would typically be removed
from the list of blue chip stocks before it actually goes into bankruptcy. Penny
stocks lack sufficient liquidity to support the trading frequency required for our
current research. Besides, one could also explore many practical strategies, such as
technical and fundamental analysis, to exclude such low-quality stocks from the asset
pool at an early stage.
14.5 Summary
This chapter discussed several assumptions in our models and back-tests, which are
shared by much empirical research on trading strategies. When back-testing a strategy,
researchers should be aware of these assumptions so that they can take measures to
weaken their impact on the profits in real trading.