Chapter 4 - Follow theWinner (1/2)

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 4

Follow the Winner

The ﬁrst principle, follow the winner, is characterized by increasing the weights of

more successful experts or stocks. Rather than targeting market and best stock, algo-

rithms in this category often aim to track the BCRP strategy, that is, their target is to

be universal.

This chapter is organized as follows. Section 4.1 introduces Cover’s universal

portfolios (UP) algorithm, and Section 4.2 details the exponential gradient (EG) algo-

rithm. Sections 4.3 and 4.4 introduce the follow the leader (FTL) and follow the

regularized leader (FTRL) approaches, respectively. Finally, Section 4.5 summarizes

the follow the winner principle.

4.1 Universal Portfolios

The basic idea of Universal Portfolios–type (UP-type) algorithms is to assign the

capital to base experts of a single class, let the experts run, and ﬁnally pool their wealth.

They are analogous to the Buy-and-Hold (BAH) strategy. In particular, BAH’s base

experts belong to a special strategy investing on a single asset, and thus the number

of experts equals that of assets. In other words, BAH buys individual stocks, lets

the stocks go, and ﬁnally pools their individual wealth. On the other hand, the base

experts in the UP-type algorithms can be any single strategy class that invests over the

whole markets. Besides, UP-type algorithms are also similar to the meta-algorithms

(MA) in Chapter 7, while the latter applies to base experts of multiple classes.

Cover (1991) proposed the universal portfolio strategy, and Cover and Ordentlich

(1996) further reﬁned the algorithm as μ-weighted universal portfolio, in which μ

denotes a given distribution on the space of valid portfolio 

. Intuitively, Cover’s

UP operates similar to a fund of funds (FOF),

∗

and its main idea is to buy and hold

parameterized CRP strategies over the whole simplex domain. In particular, it initially

invests a proportion of wealth dμ(b) to each portfolio manager operating CRPstrategy

with b ∈ 

, and lets the CRP managers run. Then, at the end, each manager will

grow his wealth to S

(b)dμ(b). Finally, Cover’s UP pools the individual experts’

∗

An FOF holds a portfolio of other investment funds, rather than directly investing in stocks, futures,

etc.

T&F Cat #K23731 — K23731_C004 — page 23 — 9/28/2015 — 21:07

24 FOLLOW THE WINNER

wealth over the continuum of portfolio strategies. Note that S

(b) = e

(b)

, which

means that the portfolio grows at an exponential rate of W

(b).

Formally, its update scheme (Cover and Ordentlich 1996, Deﬁnition 1) can

be interpreted as a historical performance-weighted average of all valid constant

rebalanced portfolios,

t+1





(b)dμ(b)





(b)dμ(b)

Note that at the beginning of period t +1, one CRP manager’s wealth (historical

performance) equals S

(b)dμ(b). Incorporating the initial wealth of S

= 1, the ﬁnal

cumulative wealth is the weighted average of CRP managers’ wealth (Cover and

Ordentlich 1996, Eq. (24)):

(UP) =





(b)dμ(b). (4.1)

One special case is that μ equals a uniform distribution; the portfolio update reduces

to Cover’s UP (Cover 1991, Eq. (1.3)). Another special cases is Dirichlet



,...,



weighted UP (Cover and Ordentlich 1996), which is proved to be a more optimal

allocation.

Alternatively, if a loss function is deﬁned as the negative logarithmic function

of portfolio return, Cover’s UP is actually an exponentially weighted average fore-

caster (Cesa-Bianchi and Lugosi 2006). The regret (Cover 1991) achieved by Cover’s

UP is O(m log n), and its time complexity is O(n

), where m denotes the number

of stocks and n refers to the number of periods. Cover and Ordentlich (1996) proved

that the



,...,



weighted UP has the same scale of regret bound, but a better

constant term (Cover and Ordentlich 1996, Theorem 2).

As Cover’s UP is based on an ideal market model, one research direction is to

extend the algorithm to handle various realistic assumptions. Cover and Ordentlich

(1996) considered side information, including experts’ opinions and fundamental

data. Cover and Ordentlich (1998) extended the algorithm to handle short selling and

margin, and Blum and Kalai (1999) took account of transaction costs.

Another research direction is to generalize Cover’s UP with different base classes,

rather than the CRP strategy. Jamshidian (1992) generalized the algorithm for con-

tinuous time markets and presented its long-term performance. Vovk and Watkins

(1998) applied the aggregating algorithm (AA) (Vovk 1990) to a ﬁnite number of

arbitrary investment strategies, of which Cover’s UP becomes a specialized case

when applied to an inﬁnite number of CRPs. Ordentlich and Cover (1998) analyzed

the minimal ratio of ﬁnal wealth achieved by any nonanticipating investment strategy

to that of BCRP and presented a strategy to achieve such an optimal ratio. Cross and

Barron (2003) generalized Cover’s UP from the CRP strategy class to any parameter-

ized target class and proposed a computation favorable universal strategy. Akcoglu

et al. (2005) extended Cover’s UP from a parameterized CRP class to a wide class

of investment strategies, including trading strategies operating on a single stock and

portfolio strategies operating on the whole stock market. Kozat and Singer (2011)

proposed a similar universal algorithm based on the class of semiconstant rebalanced

T&F Cat #K23731 — K23731_C004 — page 24 — 9/28/2015 — 21:07

EXPONENTIAL GRADIENT 25

portfolios (Helmbold et al. 1998), which provides good asymptotic performance in

case of nonzero transaction costs.

Besides our intuitive analysis, various works have been proposed to discuss the

connection between Cover’s UP with universal prediction (Feder et al. 1992), data

compression (Rissanen 1983), and Markowitz’s mean variance theory (Markowitz

1952). Algoet (1992) discussed a universal scheme for prediction, gambling, and

portfolio selection. Cover (1996) and Ordentlich (1996) discussed the connection

between UP selection and data compression. Belentepe (2005) presented a statistical

view of Cover’s UP strategy and claimed its approximate equivalence to the mean

variance portfolio theory.

Although Cover’s UP has a tight regret bound, its implementation is exponential

to the number of stocks, which restricts its practical applicability. To handle the

computational issue, Kalai and Vempala (2002) presented an efﬁcient implementation

based on rapidly mixing nonuniform random walks, improving the running time from

original O(n

) to O(m

4.2 Exponential Gradient

Algorithms in the exponential gradient type (EG type) focus on the following

optimization formulation:

t+1

= arg max

b∈

η log b ·x

−R(b, b

), (4.2)

where R(b, b

) denotes a regularization term and η > 0 denotes a learning rate. One

straightforward interpretation is to track the best stock in the last period while keeping

previous portfolio information via a regularization term.

Helmbold et al. (1998) proposed the EG strategy, which is based on the same

algorithm for mixture estimation (Helmbold et al. 1997). Following Equation 4.2,

EG adopts relative entropy as its regularization term, that is,

R(b, b

) =



i=1

log

t,i

EG’s formulation is convex in b; however, it is hard to solve since the log function is

nonlinear. Thus, the authors adopted log’s ﬁrst-order Taylor expansion at b

, that is,

log b·x

≈ log(b

·x

) +

·x

(b −b

Then the nonlinear log term becomes linear and the optimization is easy to solve.

Solving the optimization, we can obtain EG’s update rule as

t+1,i

= b

t,i

exp



t,i

·x





Z, i = 1,...,m,

where Z denotes the normalization term such that the portfolio weights sum to 1.

T&F Cat #K23731 — K23731_C004 — page 25 — 9/28/2015 — 21:07

26 FOLLOW THE WINNER

Besides the multiplicative update rule (EG), the optimization problem can also

be solved using the gradient projection (GP) and expectation–maximization (EM)

(Helmbold et al. 1997). Rather than EG’s relative entropy, GP adopts an L2-norm

regularization, and EM adopts an χ

regularization term, that is,

R(b, b

) =





i=1

−b

t,i

)



i=1

−b

t,i

)

t,i

Solving corresponding optimization problems, we can obtain GP’s update rule as

t+1,i

= b

t,i

+η

⎛

⎝

t,i

·x

−



j=1

t,j

·x

⎞

⎠

and EM’s update rule as

t+1,i

= b

t,i



t,i

·x

−1



The latter can also be viewed as EG’s ﬁrst-order approximation.

One key parameter for the EG-type algorithms is the learning rate η. To achieve a

universal regret bound, η has to be small. However, as η → 0, its update approaches

uniform,

∗

which degrades to UCRP. Such an analysis will be empirically veriﬁed in

Section 13.6.

EG has a regret bound of O(



n log m) and a running time of O(mn). The regret

is not as tight as Cover’s UP; however, its linear time substantially surpasses that of

UP. Besides, the authors also proposed a variant by transforming all price relatives,

which has a tight regret bound of O(m log n). Though not ofﬁcially proposed for

online portfolio selection (Helmbold et al. 1997), GP can straightforwardly achieve

a regret of O(

√

mn), which is signiﬁcantly worse than EG.

Das and Banerjee (2011) generalized EG-type algorithms to an MA named online

gradient updates, which combine underlying experts such that the overall system

performs no worse than any convex combination of its base experts.

4.3 Follow the Leader

FTL strategies directly track the BCRP till time t:

t+1

= b

∗

= arg max

b∈



τ=1

log(b ·x

). (4.3)

Intuitively, this category follows the BCRP leader over the known periods, and the

ultimate leader is BCRP over the whole periods.

∗

In case of η = 0, b

t+1,i

= b

t,i

=···=b

1,i

T&F Cat #K23731 — K23731_C004 — page 26 — 9/28/2015 — 21:07

FOLLOW THE REGULARIZED LEADER 27

Ordentlich (1996, Chapter 4.4) brieﬂy mentioned a strategy to obtain portfolios

by mixing BCRP to date and uniform portfolio:

t+1

t +1

∗

t +1

However, its worst-case regret bound is worse than that of Cover’s UP.

Gaivoronski and Stella (2000) proposed successive constant rebalanced port-

folios (SCRPs) and weighted successive constant rebalanced portfolios (WSCRPs)

for stationary markets. For each period, SCRP directly adopts the BCRP to date,

t+1

= b

∗

. The authors further solved the optimal portfolio b

∗

via stochastic opti-

mization (Birge and Louveaux 1997), resulting in the updates (Gaivoronski and Stella

2000, Algorithm 1). On the other hand, WSCRP outputs a convex combination of the

SCRP and previous portfolio:

t+1

= (1 −γ)b

∗

+γb

where γ ∈[0, 1] represents a trade-off parameter. The regret bounds achieved by

SCRP and WSCRP are O(mlog n), which is the same as that of Cover’s UP.

Rather thanassuminga stationary market, somealgorithmsin this categoryassume

that the historical market is nonstationary. Gaivoronski and Stella (2000) proposed

variable rebalanced portfolios (VRP), which calculate the BCRP on a latest sliding

window. To be speciﬁc, VRP updates the portfolio as

t+1

= arg max

b∈



τ=t−W +1

log(b ·x

where W denotes a speciﬁed window size.

Gaivoronski and Stella (2003) further proposed adaptive portfolio selection

(APS). By changing the objective, APS can handle three portfolio selection tasks,

that is, adaptive Markowitz portfolio, log-optimal constant rebalanced portfolio, and

index tracking. To handle the transaction cost issue, they further proposed threshold

portfolio selection, which only rebalances the portfolio if the expected return of a new

portfolio exceeds that of the last portfolio by a threshold.

4.4 Follow the Regularized Leader

The FTRL approach adds a regularization term to Equation 4.3:

t+1

= arg max

b∈



τ=1

log(b ·x

) −

R(b), (4.4)

where β denotes a trade-off parameter and R(b) is the regularization term on b. Note

that the ﬁrst term includes all historical information; thus, the regularization term only

relates to the next portfolio, which is different from the EG algorithm. One typical

regularization is L2-norm, that is, R(b) =b

T&F Cat #K23731 — K23731_C004 — page 27 — 9/28/2015 — 21:07

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 4 - Follow theWinner (1/2)

Create new playlist

Sign In

Sign Up

Table of Contents for
Chapter 4 - Follow theWinner (1/2)