Agarwal et al. (2006) proposed the online Newton step (ONS) by solving the optimization problem (4.4) with $L_2$-norm regularization via online convex optimization (Zinkevich 2003; Hazan 2006; Hazan et al. 2006, 2007). Similar to the regular offline Newton method, the basic idea of ONS is to replace the log term with its second-order Taylor expansion at $\mathbf{b}_t$, and then solve for the closed-form updates. Finally, the update rule of ONS is
$$\mathbf{b}_1 = \left(\frac{1}{m}, \ldots, \frac{1}{m}\right), \qquad \mathbf{b}_{t+1} = \Pi_{\Delta_m}^{A_t}\!\left(\delta A_t^{-1} \mathbf{c}_t\right),$$
with $A_t = \sum_{\tau=1}^{t} \frac{\mathbf{x}_\tau \mathbf{x}_\tau^\top}{(\mathbf{b}_\tau \cdot \mathbf{x}_\tau)^2} + I_m$ and $\mathbf{c}_t = \left(1 + \frac{1}{\beta}\right) \sum_{\tau=1}^{t} \frac{\mathbf{x}_\tau}{\mathbf{b}_\tau \cdot \mathbf{x}_\tau}$, where $\beta$ is a trade-off
parameter, $\delta$ is a scaling term, $I_m$ denotes the $m \times m$ identity matrix, and $\Pi_{\Delta_m}^{A_t}(\cdot)$ is an exact projection onto the simplex domain (in the norm induced by $A_t$).
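To make the update concrete, the following is a minimal NumPy sketch of one ONS step under the definitions above. The function name, the default values of $\beta$ and $\delta$, and the use of SciPy's SLSQP solver for the $A_t$-norm projection onto the simplex are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def ons_update(x_hist, b_hist, beta=1.0, delta=0.125):
    """One Online Newton Step (ONS) update -- a minimal sketch.

    x_hist : (t, m) array of past price-relative vectors x_1..x_t
    b_hist : (t, m) array of past portfolios b_1..b_t
    beta, delta : trade-off and scaling parameters (illustrative defaults)
    Returns b_{t+1}, the next portfolio.
    """
    t, m = x_hist.shape
    returns = np.einsum("ti,ti->t", b_hist, x_hist)      # b_tau . x_tau per period

    # A_t = sum_tau x x^T / (b.x)^2 + I_m
    A = np.eye(m)
    for tau in range(t):
        A += np.outer(x_hist[tau], x_hist[tau]) / returns[tau] ** 2

    # c_t = (1 + 1/beta) * sum_tau x / (b.x)
    c = (1.0 + 1.0 / beta) * np.sum(x_hist / returns[:, None], axis=0)

    p = delta * np.linalg.solve(A, c)                    # delta * A_t^{-1} c_t

    # Exact projection onto the simplex in the A_t-norm:
    # argmin_b (b - p)^T A_t (b - p)  s.t.  b >= 0, sum(b) = 1
    res = minimize(
        lambda b: (b - p) @ A @ (b - p),
        x0=np.full(m, 1.0 / m),
        bounds=[(0.0, 1.0)] * m,
        constraints=[{"type": "eq", "fun": lambda b: b.sum() - 1.0}],
        method="SLSQP",
    )
    return res.x
```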
ONS's regret bound is $O(m^{1.5} \log(mn))$, which is slightly worse than that of Cover's UP. Since it iteratively updates the first- and second-order information, it costs $O(m^3)$ per period, which is independent of the number of periods. To sum up, its total time cost is $O(m^3 n)$.
While FTRL focuses on worst-case investing, Hazan and Kale (2009, 2012) linked worst-case investing with the average-case investing widely used in practice, that is, the geometric Brownian motion (GBM) model (Bachelier 1900; Osborne 1959; Cootner 1964). The authors designed an investment strategy that is universal in the worst case and is capable of exploiting the GBM model. The algorithm, the so-called Exp-Concave-FTL, follows a formulation similar to ONS, that is,
$$\mathbf{b}_{t+1} = \arg\max_{\mathbf{b} \in \Delta_m} \sum_{\tau=1}^{t} \log(\mathbf{b} \cdot \mathbf{x}_\tau) - \frac{1}{2}\|\mathbf{b}\|^2.$$
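For concreteness, a direct numerical solve of this per-period objective might look like the following sketch. The function name and the use of SciPy's SLSQP solver over the simplex are illustrative assumptions, not the authors' method, which relies on online convex optimization techniques.

```python
import numpy as np
from scipy.optimize import minimize

def exp_concave_ftl_step(x_hist):
    """Sketch of one Exp-Concave-FTL step: maximize the regularized
    log-wealth over the simplex (equivalently, minimize its negation).

    x_hist : (t, m) array of past price-relative vectors (assumed positive,
             so the log term is well defined).
    """
    t, m = x_hist.shape

    def neg_objective(b):
        # -( sum_tau log(b . x_tau) - 0.5 * ||b||^2 )
        return -(np.sum(np.log(x_hist @ b)) - 0.5 * np.dot(b, b))

    res = minimize(
        neg_objective,
        x0=np.full(m, 1.0 / m),
        bounds=[(0.0, 1.0)] * m,
        constraints=[{"type": "eq", "fun": lambda b: b.sum() - 1.0}],
        method="SLSQP",
    )
    return res.x
```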
The optimization problem can be efficiently solved via online convex optimization, which typically incurs a high time complexity (similar to that of ONS). If the stock price follows the GBM model, the regret bound becomes $O(m \log Q)$, where $Q$ is the quadratic variability, calculated as $n-1$ times the sample variance of the price relative vectors. Since $Q$ is typically much smaller than $n$, the regret bound is significantly improved over the previous $O(m \log n)$.
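As a small illustration of the quantity $Q$, a sketch computation under the reading "$n-1$ times the sample variance of the price relative vectors" might look as follows; the function name and the interpretation of the vector-valued sample variance as summed per-coordinate deviations are assumptions.

```python
import numpy as np

def quadratic_variability(x_all):
    """Q = (n - 1) * sample variance of the price-relative vectors,
    read here as the total squared deviation from their mean.

    x_all : (n, m) array of all price-relative vectors.
    """
    mu = x_all.mean(axis=0)
    # Sum of squared deviations equals (n - 1) times the summed
    # per-coordinate sample variances (ddof = 1).
    return np.sum((x_all - mu) ** 2)
```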
Besides the improved regret bound, the authors also discussed the relationship between their algorithm and trading frequency. The authors asserted that increasing the trading frequency would decrease the variance of the minimum-variance CRP, while the regret stays the same. Therefore, performance is expected to improve as the trading frequency increases, which was empirically observed by Agarwal et al. (2006).
Das and Banerjee (2011) further extended the FTRL approach to a generalized MA termed online Newton update (ONU), which guarantees that the overall performance is no worse than that of any convex combination of the base experts.
4.5 Summary
Follow the winner is the main principle of online portfolio selection research. While most algorithms in this category are supported by theoretical guarantees (regret bounds), their empirical performance is not outstanding (cf. the empirical results in Chapter 13). We believe the main reason for this phenomenon is that their target in hindsight, or BCRP, assumes that price relatives are i.i.d., which may contradict the empirical evidence of real markets. The next chapter will introduce a different principle, follow the loser, which makes a different assumption regarding market behaviors.