Chapter 6
Pattern Matching
Besides follow the winner and follow the loser, another category utilizes both
winners and losers, and it is based on pattern matching. This category mainly covers
nonparametric sequential investment strategies, which guarantee an optimal growth
of capital under minimal assumptions on the market, that is, stationary and ergodic
of the financial time series. Based on nonparametric prediction (Györfi and Schäfer
2003), this category consists of several pattern matching–based investment strate-
gies (Györfi et al. 2006, 2007, 2008; Li et al. 2011a). Note that in the data-mining
communities, some researchers focus on detecting important signals or patterns in
time series (Mcinish and Wood 1992; Berndt and Clifford 1994; Agrawal and Srikant
1995; Srikant and Agrawal 1996; Ting et al. 2006; Cañete et al. 2008; Du et al. 2009),
which is beyond our discussion.
In general, the pattern matching–based approaches (Györfi et al. 2006) consist of
two steps,thatis, the sampleselectionand portfolio optimization steps.Supposewe are
choosing a portfolio for period t +1. First, the sample selection step selects a set C
t
of
similar historical indices, whose corresponding price relatives will be used to predict
the next one. Then, each price relative vector x
i
,i C
t
, is assigned a probability of
P
i
,i C
t
. Existing methods often choose uniform probability P
i
=
1
|C
t
|
, where |·|
denotes the cardinality of a set. Second, the portfolio optimization step learns an
optimal portfolio based on the selected set, that is,
b
t+1
= arg max
b
m
U(b,C
t
),
where U(·) is a specified utility function, such as log utility. In case of an empty
sample set, a uniform portfolio is adopted.
In this chapter, we concretize the sample selection step in Section 6.1 and the port-
folio optimization step in Section 6.2. We finally combine the two steps to formulate
specific online portfolio selection algorithms in Section 6.3. Based on the principle,
we further proposed the correlation-driven nonparametric learning (CORN) algorithm
in Chapter 8.
35
T&F Cat #K23731 — K23731_C006 — page 35 — 9/26/2015 — 8:11
36 PATTERN MATCHING
6.1 Sample Selection Techniques
The general idea of this step is to select similar samples from historical price relatives
by comparing two preceding market windows. Suppose we are locating the price rel-
ative vectors that are similar to the next vector x
t+1
. The basic routine is to iterate all
historic price relatives x
i
,i = w +1,...,t and count x
i
as one similar vector, if its
preceding market window x
i1
iw
is similar to the latest market window x
t
tw+1
. A set
C
t
contains the indexes of similar price relatives. Note that the market window
is a w ×m-matrix and the similarity is typically calculated on the concatenated
w ×m-vectors. Algorithm 6.1 further illustrates the procedure.
Algorithm 6.1: Sample selection procedure (C(x
t
1
,w)).
Input: x
t
1
: Historical market sequence; w: window size.
Output: C
t
: Index set of similar price relatives.
Initialize C
t
=∅;
if t w then
return;
end
for i = w +1,w+2,...,t do
if x
i1
iw
is similar to x
t
tw+1
then
C
t
= C
t
i;
end
end
A nonparametric histogram-based sample selection (Györfi and Schäfer 2003)
predefines a set of discretized partitions, partitions both the latest market win-
dow (x
t
tw+1
) and historical market windows (x
i1
iw
,i = w +1,...,t), and finally
chooses price relatives x
i
whose preceding market window (x
i1
iw
) is in the same
partition as x
t
tw+1
. In particular, given a partition P = A
j
,j = 1, 2,...,d, which
discretizes R
m
+
into d disjoint sets, and a corresponding discretization function
G(x) = j , we can define the similarity set as
C
H
x
t
1
,w
=
w<i<t+1 : G
x
t
tw+1
= G
x
i1
iw

.
Nonparametric kernel-based sample selection (Györfi et al. 2006) identifies the
similarity set by evaluating the Euclidean distance between two market windows,
C
K
x
t
1
,w
=
w<i<t+1 :
x
t
tw+1
x
i1
iw
c
,
where c and are thresholds used to control the number of similar samples.
Nonparametric nearest neighbor-based sample selection (Györfi et al. 2008)
searches price relatives whose preceding market windows are within the k nearest
neighbors of the latest market window, that is,
C
N
x
t
1
,w
=
w<i<t+1 : x
i1
iw
is among the k NNs of x
t
tw+1
,
where k is a threshold parameter.
T&F Cat #K23731 — K23731_C006 — page 36 — 9/26/2015 — 8:11
PORTFOLIO OPTIMIZATION TECHNIQUES 37
6.2 Portfolio Optimization Techniques
The second step of pattern matching–based approaches is to construct an optimal
portfolio based on the sample set C
t
. Two main principles are Kelly’s (1956) capital
growth portfolio and Markowitz’s (1952) mean variance portfolio.
Györfi et al. (2006) proposed to figure out a log-optimal (Kelly) portfolio, based
on similar price relatives, which clearly follows the capital growth portfolio theory.
Given a sample set C
t
, the log-optimal utility function is defined as
U
L
(b,C
t
) = E{log b ·x|x
i
,i C
t
}=
iC
t
P
i
log b·x
i
,
where P
i
denotes the probability assigned to x
i
,i C
t
. Györfi et al. (2006) assumed
a uniform probability, thus equivalently,
U
L
(b,C
t
) =
iC
t
log b·x
i
. (6.1)
Maximizing the above function results in a BCRP portfolio (Cover 1991) over the
similar price relatives.
Györfi et al. (2007) introduced semi-log-optimal utility function, which approx-
imates log utility in Equation 6.1 aiming to release its computational complexity;
and Vajda (2006) presented corresponding theoretical analysis and proved its
universality. The semi-log-optimal utility function is defined as
U
S
(b,C
t
) = E{f(b ·x)|x
i
,i C
t
}=
iC
t
P
i
f(b ·x
i
),
where f(·) is the second-order Taylor expansion of log z with respect to z = 1, that
is,
f(z) = z 1
1
2
(z 1)
2
.
Györfi et al. (2007) adopted a uniform probability of P
i
, thus, equivalently,
U
S
(b,C
t
) =
iC
t
f(b ·x
i
).
Ottucsák and Vajda (2007) proposed a Markowitz-type utility function, which
further generalizes the semi-log-optimalstrategy.The basic idea is to trade offbetween
portfolio mean and variance, which is similar to Markowitz’s mean variance theory.
To be specific, its utility function is defined as
U
M
(b,C
t
) = E{b·x|x
i
,i C
t
}−λVar {b ·x|x
i
,i C
t
}
= E{b ·x|x
i
,i C
t
}−λE{(b ·x)
2
|x
i
,i C
t
}+λ(E{b ·x|x
i
,i C
t
})
2
,
where λ is a trade-off parameter. In particular, simple numerical transformations show
that the semi-log-optimal portfolio is one special case of this utility function.
T&F Cat #K23731 — K23731_C006 — page 37 — 9/26/2015 — 8:11
38 PATTERN MATCHING
To solve the problem with transaction costs, Györfi and Vajda (2008) proposed
a GV-type utility function
by incorporating the transaction costs (Gyorfi and Walk
2012),
U
T
(b,C
t
) = E{log b ·x +log w(b
t
, b, x
t
)},
where w(·) (0, 1) is the transaction cost factor, which represents the remaining pro-
portion after transaction costs. With a uniform probability assumption, it is equivalent
to calculate:
U
T
(b,C
t
) =
iC
t
(log b·x
i
+log w(b
t
, b, x
t
)).
In any of the above procedures, if the similarity set is non-empty, we can obtain an
optimal portfolio based on the similar price relatives and their assumed probability. In
the case of an empty set, we can choose either a uniform portfolio or the last portfolio.
6.3 Combinations
Finally, let us combine the two steps and describe specific algorithms in the pattern
matching–based approach. Table 6.1 summarizes all existing combinations.
One default utility function is the log-optimal function. Györfi and Schäfer (2003)
introduced thenonparametrichistogram-basedlog-optimal investment strategy(B
H
),
which combines the histogram-based sample selection and log-optimal utility func-
tion. Györfi et al. (2006) presented the nonparametric kernel-based log-optimal
investment strategy (B
K
), which combines the kernel-based sample selection and
log-optimal utility function. Györfi et al. (2008) proposed the nonparametric near-
est neighbor log-optimal investment strategy (B
NN
), which combines the nearest
neighbor sample selection and log-optimal utility function.
Besides the log-optimal utility function, several algorithms using different util-
ity functions have been proposed. Györfi et al. (2007) proposed the nonparametric
kernel-based semi-log-optimal investment strategy (B
S
) by combining the kernel-
based sample selection and semi-log-optimal utility function, which greatly eases
Table 6.1 Pattern matching–based approaches: sample selection and portfolio optimization
Sample Selection Techniques
Portfolio Optimization Histogram Kernel Nearest Neighbor
Log-optimal B
H
:C
H
+U
L
B
K
:C
K
+U
L
B
NN
:C
N
+U
L
Correlation-driven CORN
Semi-log-optimal B
S
:C
K
+U
S
Markowitz-type B
M
:C
K
+U
M
GV-type B
GV
:C
K
+U
R
Note: —, no algorithm in the combinations.
Algorithm 2 in Györfi and Vajda (2008).
T&F Cat #K23731 — K23731_C006 — page 38 — 9/26/2015 — 8:11
SUMMARY 39
the computation of B
K
. Ottucsák and Vajda (2007) proposed the nonparametric
kernel-based Markowitz-type investment strategy (B
M
) by combining the kernel-
based sample selection and Markowitz-type utility function. Györfi and Vajda (2008)
proposed the nonparametric kernel-based GV-type investment strategy (B
GV
)by
combining the kernel-based sample selection and GV-type utility function to select
portfolios in case of nonzero transaction costs.
6.4 Summary
This chaptersummarizesthepattern matching–based principle, which mainlyincludes
pattern-matching and portfolio optimization steps. Empirically, these algorithms
exploit recurring patterns over the history and produce good empirical performance.
One of its key problems is to identify the recurring patterns, which leads to our CORN
strategy in Chapter 8.
T&F Cat #K23731 — K23731_C006 — page 39 — 9/26/2015 — 8:11
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.227.251.194