The actual tragedies of life bear no relation to one’s preconceived ideas. In the event, one is always bewildered by their simplicity, their grandeur of design, and by that element of the bizarre which seems inherent in them.
Jean Cocteau
On the one hand, vectorized backtesting with NumPy and pandas is generally convenient and efficient to implement because the code is concise, and it is fast to execute because these packages are optimized for such operations. However, the approach cannot cope with all types of trading strategies, nor with all phenomena that trading reality confronts an algorithmic trader with. Potential shortcomings of vectorized backtesting are:
Look-ahead bias: Vectorized backtesting is based on the complete data set available and does not take into account that new data arrives incrementally.
Simplification: For example, fixed transaction costs cannot be modeled by vectorization, which is mainly based on relative returns. Likewise, fixed amounts per trade or the non-divisibility of single financial instruments (for example, a share of a stock) cannot be modeled properly.
Non-recursiveness: Algorithms that embody trading strategies might have recourse to state variables over time, such as the profit and loss up to a certain point in time or similar path-dependent statistics. Vectorization cannot cope with such features.
On the other hand, event-based backtesting addresses these issues with a more realistic approach to modeling trading realities. On a basic level, an event is characterized by the arrival of new data. When backtesting a trading strategy for the Apple, Inc. stock based on end-of-day data, an event would be a new closing price for the Apple stock. An event can also be a change in an interest rate or the hitting of a stop loss level. Advantages of the event-based backtesting approach generally are:
Incremental approach: As in the trading reality, backtesting takes place on the premise that new data arrives incrementally, tick-by-tick and quote-by-quote.
Realistic modeling: One has complete freedom to model those processes that are triggered by a new and specific event.
Path dependency: It is straightforward to keep track of conditional, recursive, or otherwise path-dependent statistics, such as the maximum or minimum price seen so far, and to include them in the trading algorithm.
Re-usability: Backtesting different types of trading strategies requires a similar base functionality that can be implemented and unified through object-oriented programming.
Close to trading: Certain elements of an event-based backtesting system can sometimes also be used for the automated implementation of the trading strategy.
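The advantages listed above can be illustrated with a minimal sketch of an event loop. The prices, the fixed fee, and the buy rule are made-up assumptions for illustration only and are not part of the classes developed in this chapter. Each bar is processed on its own, only whole units are bought, a fixed fee is charged per trade, and a running maximum is tracked as a path-dependent statistic.

```python
# Hypothetical closing prices; each list element is one bar (event).
prices = [100.0, 102.0, 99.0, 97.0, 101.0, 105.0]

cash = 1000.0            # running cash balance
units = 0                # whole units held (no fractional shares)
fixed_fee = 5.0          # assumed fixed transaction cost per trade
running_max = prices[0]  # path-dependent statistic: highest price so far

for bar in range(1, len(prices)):
    price = prices[bar]  # only data up to and including this bar is used
    running_max = max(running_max, price)
    # illustrative rule: buy once when the price is at least 3% below the
    # highest price seen so far
    if units == 0 and price <= 0.97 * running_max:
        units = int((cash - fixed_fee) / price)  # non-divisible units
        cash -= units * price + fixed_fee

net_wealth = cash + units * prices[-1]
print(f'units={units} | cash={cash:.2f} | net wealth={net_wealth:.2f}')
```

Neither the fixed fee nor the integer number of units can be expressed naturally in terms of relative returns, which is why such features are easier to handle bar by bar.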
In what follows, a new event is generally identified by a bar which represents one unit of new data. For example, events can be one-minute bars for an intraday trading strategy or one-day bars for a trading strategy based on daily closing prices.
The chapter is organized as follows. “Backtesting Base Class” presents a base class for the event-based backtesting of trading strategies. “Long Only Backtesting Class” and “Long Short Backtesting Class” make use of the base class to implement long only and long-short backtesting classes, respectively.
The goals of this chapter are to understand event-based modeling, to create classes that allow for more realistic backtesting, and to have a foundational backtesting infrastructure available as a starting point for further enhancements and refinements.
When it comes to building the infrastructure (in the form of a Python class) for event-based backtesting, a couple of requirements must be met:
Retrieving and preparing data: The base class shall take care of the data retrieval and possibly the preparation for the backtesting itself. To keep the discussion focused, end-of-day (EOD) data as read from a CSV file is the type of data the base class shall allow for.
Helper and convenience functions: It shall provide a couple of helper and convenience functions that make backtesting easier. Examples are functions for plotting data, printing out state variables or returning date and price information for a given bar.
Placing orders: The base class shall cover the placing of basic buy and sell orders. For simplicity, only market buy and sell orders are modeled.
Closing out positions: At the end of any backtesting, market positions (if any) need to be closed out. The base class shall take care of this final trade.
If the base class meets these requirements, respective classes to backtest strategies based on simple moving averages (SMAs), momentum, or mean reversion (see Chapter 4) as well as on machine learning-based prediction (see Chapter 5) can be built upon it. “Backtesting Base Class” presents an implementation of such a base class called BacktestBase. The following walks through the individual methods of this class to give an overview of its design.
With regard to the special method __init__, only a few things are noteworthy. First, the initial amount available is stored twice: in the attribute initial_amount, which is kept constant, and in the regular attribute amount, which represents the running balance. Second, the default assumption is that there are no transaction costs.
    def __init__(self, symbol, start, end, amount,
                 ftc=0.0, ptc=0.0, verbose=True):
        self.symbol = symbol
        self.start = start
        self.end = end
        self.initial_amount = amount
        self.amount = amount
        self.ftc = ftc
        self.ptc = ptc
        self.units = 0
        self.position = 0
        self.trades = 0
        self.verbose = verbose
        self.get_data()
Stores the initial amount in a private attribute.
Sets the starting cash balance value.
Defines fixed transaction costs per trade.
Defines proportional transaction costs per trade.
Units of the instrument (for example, number of shares) in the portfolio initially.
Sets the initial position to market neutral.
Sets the initial number of trades to zero.
Sets self.verbose to True to get full output.
During initialization, the get_data method is called, which retrieves EOD data from a CSV file for the provided symbol and the given time interval. It also calculates the log returns. The Python code that follows has already been used extensively in Chapter 4 and Chapter 5 and therefore does not need to be explained in detail here.
    def get_data(self):
        ''' Retrieves and prepares the data.
        '''
        raw = pd.read_csv('http://hilpisch.com/pyalgo_eikon_eod_data.csv',
                          index_col=0, parse_dates=True).dropna()
        raw = pd.DataFrame(raw[self.symbol])
        raw = raw.loc[self.start:self.end]
        raw.rename(columns={self.symbol: 'price'}, inplace=True)
        raw['return'] = np.log(raw / raw.shift(1))
        self.data = raw.dropna()
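The preparation steps can also be tried offline on a small, made-up price series. The following is a sketch only: the helper name prepare_prices, the symbol XYZ, and the sample values are hypothetical and do not belong to the BacktestBase class. It selects one price column, renames it to price, and derives the log returns from consecutive prices.

```python
import numpy as np
import pandas as pd


def prepare_prices(prices, symbol):
    """Mimic the preparation steps on an in-memory price Series.

    Hypothetical helper for illustration; not part of BacktestBase.
    """
    raw = pd.DataFrame(prices.rename(symbol))
    raw = raw.rename(columns={symbol: 'price'})
    # log return of bar t: log(price_t / price_{t-1})
    raw['return'] = np.log(raw['price'] / raw['price'].shift(1))
    return raw.dropna()  # the first bar has no return and is dropped


index = pd.date_range('2020-01-01', periods=4, freq='D')
sample = pd.Series([100.0, 110.0, 99.0, 99.0], index=index)
data = prepare_prices(sample, 'XYZ')
print(data)
```

Because the first bar has no predecessor, the resulting data set is one row shorter than the raw price series.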
The .plot_data() method is just a simple helper method to plot the (adjusted close) values for the provided symbol.
    def plot_data(self, cols=None):
        ''' Plots the closing prices for symbol.
        '''
        if cols is None:
            cols = ['price']
        # plot the selected columns (the default is the price column)
        self.data[cols].plot(figsize=(10, 6), title=self.symbol)
A method that gets frequently called is .get_date_price(). For a given bar, it returns the date and price information.
    def get_date_price(self, bar):
        ''' Return date and price for bar.
        '''
        date = str(self.data.index[bar])[:10]
        price = self.data.price.iloc[bar]
        return date, price
.print_balance() prints out the current cash balance given a certain bar, while .print_net_wealth() does the same for the net wealth (= current balance plus value of trading position).
    def print_balance(self, bar):
        ''' Print out current cash balance info.
        '''
        date, price = self.get_date_price(bar)
        print(f'{date} | current balance {self.amount:.2f}')

    def print_net_wealth(self, bar):
        ''' Print out current net wealth info.
        '''
        date, price = self.get_date_price(bar)
        net_wealth = self.units * price + self.amount
        print(f'{date} | current net wealth {net_wealth:.2f}')
Two core methods are .place_buy_order() and .place_sell_order(). They allow the emulated buying and selling of units of a financial instrument. First, the .place_buy_order() method, which is commented in detail.
    def place_buy_order(self, bar, units=None, amount=None):
        ''' Place a buy order. '''
        date, price = self.get_date_price(bar)
        if units is None:
            units = int(amount / price)
        self.amount -= (units * price) * (1 + self.ptc) + self.ftc
        self.units += units
        self.trades += 1
        if self.verbose:
            print(f'{date} | buying {units} units at {price:.2f}')
            self.print_balance(bar)
            self.print_net_wealth(bar)
The date and price information for the given bar is retrieved.
If no value for units is given …
… the number of units is calculated given the value for amount (note that one needs to be given). The calculation does not include transaction costs.
The current cash balance is reduced by the cash outlays for the units of the instrument to be bought plus the proportional and fixed transaction costs. Note that it is not checked whether there is enough liquidity available.
The value of self.units is increased by the number of units bought.
This increases the counter for the number of trades by one.
If self.verbose is True …
… print out information about trade execution …
… the current cash balance …
… and the current net wealth.
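The cash arithmetic of a buy order can be checked with a small worked example. The function below is a hypothetical stand-alone sketch (the name buy_cash_outlay and the numbers are assumptions) that mirrors the two steps described above: sizing the order without regard to transaction costs and then charging proportional and fixed costs on top.

```python
def buy_cash_outlay(price, amount, ptc=0.0, ftc=0.0):
    """Return (units, cash outlay) for a market buy sized by amount.

    Hypothetical helper for illustration; it mirrors the arithmetic of a
    buy order but is not part of the backtesting classes.
    """
    units = int(amount / price)               # sizing ignores costs
    outlay = units * price * (1 + ptc) + ftc  # costs are added on top
    return units, outlay


units, outlay = buy_cash_outlay(price=150.0, amount=10000, ptc=0.01, ftc=10.0)
balance = 10000 - outlay
print(units, round(outlay, 2), round(balance, 2))
```

With these assumed numbers, the order is sized with the full cash amount, so the transaction costs push the cash balance slightly below zero. This is the practical consequence of not checking for sufficient liquidity.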
Second, the .place_sell_order() method, which has only two minor adjustments compared to the .place_buy_order() method.
    def place_sell_order(self, bar, units=None, amount=None):
        ''' Place a sell order. '''
        date, price = self.get_date_price(bar)
        if units is None:
            units = int(amount / price)
        self.amount += (units * price) * (1 - self.ptc) - self.ftc
        self.units -= units
        self.trades += 1
        if self.verbose:
            print(f'{date} | selling {units} units at {price:.2f}')
            self.print_balance(bar)
            self.print_net_wealth(bar)
The current cash balance is increased by the proceeds of the sale minus transaction costs.
The value of self.units is decreased by the number of units sold.
No matter what kind of trading strategy is backtested, the position at the end of the backtesting period needs to be closed out. The code in the BacktestBase class assumes that the position is not liquidated but rather accounted for with its asset value to calculate and print the performance figures.
    def close_out(self, bar):
        ''' Closing out a long or short position. '''
        date, price = self.get_date_price(bar)
        if self.verbose:
            # report the inventory before it is reset to zero
            print(f'{date} | inventory {self.units} units at {price:.2f}')
            print('=' * 55)
        self.amount += self.units * price
        self.units = 0
        self.trades += 1
        print('Final balance [$] {:.2f}'.format(self.amount))
        perf = ((self.amount - self.initial_amount) /
                self.initial_amount * 100)
        print('Net Performance [%] {:.2f}'.format(perf))
        print('Trades Executed [#] {}'.format(self.trades))
        print('=' * 55)
No transaction costs are subtracted at the end.
The final balance consists of the current cash balance plus the value of the trading position.
This calculates the net performance in percent.
The final part of the Python script is the __main__ section, which gets executed when the file is run as a script.
if __name__ == '__main__':
    bb = BacktestBase('AAPL.O', '2010-1-1', '2019-12-31', 10000)
    print(bb.data.info())
    print(bb.data.tail())
    bb.plot_data()
It instantiates an object based on the BacktestBase class. This leads automatically to the data retrieval for the symbol provided. Figure 6-1 shows the resulting plot. The output below shows the meta information for the respective DataFrame object and the five most recent data rows.
In [1]: %run BacktestBase.py
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 2515 entries, 2010-01-05 to 2019-12-31
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   price   2515 non-null   float64
 1   return  2515 non-null   float64
dtypes: float64(2)
memory usage: 58.9 KB
None
             price    return
Date
2019-12-24  284.27  0.000950
2019-12-26  289.91  0.019646
2019-12-27  289.80 -0.000380
2019-12-30  291.52  0.005918
2019-12-31  293.65  0.007280

In [2]:
The two subsequent sections present classes to backtest long only and long short trading strategies. Since these classes rely on the base class presented in this section, the implementation of the backtesting routines themselves is rather concise.
Object-oriented programming makes it possible to build a basic backtesting infrastructure in the form of a Python class. Standard functionality needed during the backtesting of different kinds of algorithmic trading strategies is made available by such a class in a non-redundant, easy-to-maintain fashion. It is also straightforward to enhance the base class to provide more features by default that might benefit a multitude of other classes built on top of it.
Certain investor preferences or regulations might prohibit short selling as part of a trading strategy. As a consequence, a trader or portfolio manager is only allowed to enter long positions or to park capital in the form of cash or similar low risk assets, like money market accounts. “Long Only Backtesting Class” shows the code of a backtesting class for long only strategies called BacktestLongOnly. Since it relies on and inherits from the BacktestBase class, the code to implement the three strategies based on SMAs, momentum, and mean reversion is rather concise.
The method .run_mean_reversion_strategy() implements the backtesting procedure for the mean-reversion-based strategy. This method is commented in detail since it might be a bit trickier from an implementation standpoint. The basic insights, however, carry over easily to the methods implementing the other two strategies.
    def run_mean_reversion_strategy(self, SMA, threshold):
        ''' Backtesting a mean reversion-based strategy.

        Parameters
        ==========
        SMA: int
            simple moving average in days
        threshold: float
            absolute value for deviation-based signal relative to SMA
        '''
        msg = f'\n\nRunning mean reversion strategy | '
        msg += f'SMA={SMA} & thr={threshold}'
        msg += f'\nfixed costs {self.ftc} | '
        msg += f'proportional costs {self.ptc}'
        print(msg)
        print('=' * 55)
        self.position = 0
        self.trades = 0
        self.amount = self.initial_amount
        self.data['SMA'] = self.data['price'].rolling(SMA).mean()

        for bar in range(SMA, len(self.data)):
            if self.position == 0:
                if (self.data['price'].iloc[bar] <
                        self.data['SMA'].iloc[bar] - threshold):
                    self.place_buy_order(bar, amount=self.amount)
                    self.position = 1
            elif self.position == 1:
                if self.data['price'].iloc[bar] >= self.data['SMA'].iloc[bar]:
                    self.place_sell_order(bar, units=self.units)
                    self.position = 0
        self.close_out(bar)
At the beginning, this method prints out an overview of the major parameters for the backtesting.
The position is set to market neutral, which is done here for more clarity and should be the case anyway.
The current cash balance is reset to the initial amount in case another backtest run has overwritten the value.
This calculates the SMA values needed for the strategy implementation.
The start value SMA ensures that there are SMA values available to start implementing and backtesting the strategy.
The condition checks whether the position is market neutral.
If the position is market neutral, it is checked whether the current price is low enough relative to the SMA to trigger a buy order and to go long.
This executes the buy order in the amount of the current cash balance.
The market position is set to long.
The condition checks whether the position is long the market.
If that is the case, it is checked whether the current price has returned to the SMA level or above.
In such a case, a sell order is placed for all units of the financial instrument.
The market position is set to neutral again.
At the end of the backtesting period, the market position gets closed out if one is open.
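The decision logic described in the steps above can be isolated from the order handling. The following is a minimal sketch under stated assumptions: the function name long_only_positions and the sample prices are hypothetical, and the function only records the position held after each bar instead of placing orders.

```python
import pandas as pd


def long_only_positions(prices, sma, threshold):
    """Return the position (0 or 1) held after each bar.

    Hypothetical helper that reproduces only the signal logic of the
    long only mean reversion strategy, without cash or order handling.
    """
    sma_values = prices.rolling(sma).mean()
    position = 0
    positions = []
    for bar in range(sma, len(prices)):
        price, level = prices.iloc[bar], sma_values.iloc[bar]
        if position == 0 and price < level - threshold:
            position = 1  # price far enough below the SMA: go long
        elif position == 1 and price >= level:
            position = 0  # price back at or above the SMA: go neutral
        positions.append(position)
    return positions


prices = pd.Series([10.0, 10.0, 10.0, 7.0, 8.0, 12.0, 12.0])
print(long_only_positions(prices, sma=3, threshold=1.0))
```

For the assumed prices, the position turns long on the first bar that falls more than the threshold below the SMA and returns to neutral once the price is back at or above the SMA.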
Executing the Python script in “Long Only Backtesting Class” yields backtesting results as shown below. The examples illustrate the influence of fixed and proportional transaction costs. First, taking transaction costs into account reduces the performance in every case shown. Second, the results bring to light the importance of the number of trades a certain strategy triggers over time. Without transaction costs, the momentum strategy significantly outperforms the SMA-based strategy. With transaction costs, the SMA-based strategy outperforms the momentum strategy since it relies on fewer trades.
Running SMA strategy | SMA1=42 & SMA2=252
fixed costs 0.0 | proportional costs 0.0
=======================================================
Final balance [$] 56204.95
Net Performance [%] 462.05
=======================================================

Running momentum strategy | 60 days
fixed costs 0.0 | proportional costs 0.0
=======================================================
Final balance [$] 136716.52
Net Performance [%] 1267.17
=======================================================

Running mean reversion strategy | SMA=50 & thr=5
fixed costs 0.0 | proportional costs 0.0
=======================================================
Final balance [$] 53907.99
Net Performance [%] 439.08
=======================================================

Running SMA strategy | SMA1=42 & SMA2=252
fixed costs 10.0 | proportional costs 0.01
=======================================================
Final balance [$] 51959.62
Net Performance [%] 419.60
=======================================================

Running momentum strategy | 60 days
fixed costs 10.0 | proportional costs 0.01
=======================================================
Final balance [$] 38074.26
Net Performance [%] 280.74
=======================================================

Running mean reversion strategy | SMA=50 & thr=5
fixed costs 10.0 | proportional costs 0.01
=======================================================
Final balance [$] 15375.48
Net Performance [%] 53.75
=======================================================
Chapter 5 emphasizes that there are two sides of the performance coin: the hit ratio for the correct prediction of the market direction and the market timing, that is, when exactly the prediction is correct. The results shown here illustrate that there is even a “third side”: the number of trades triggered by a strategy. A strategy that demands a higher frequency of trades has to bear higher transaction costs, which can easily eat up an alleged outperformance over another strategy with no or low transaction costs. Among other things, this often makes the case for low-cost passive investment strategies based, for example, on exchange-traded funds (ETFs).
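A rough back-of-the-envelope calculation shows why the number of trades matters so much. The sketch below rests on a strong simplifying assumption that is not taken from the backtesting classes: every trade turns over the complete wealth and pays only the proportional cost rate, so that a factor of (1 - ptc) applies per trade. The function name cost_drag is hypothetical.

```python
def cost_drag(trades, ptc):
    """Fraction of wealth left after paying ptc on each of `trades` trades.

    Simplifying assumption: each trade turns over the full wealth and
    fixed costs are ignored.
    """
    return (1 - ptc) ** trades


for trades in (10, 100, 500):
    print(trades, round(cost_drag(trades, ptc=0.01), 4))
```

Under this assumption, a proportional cost rate of 1% leaves roughly 90% of the wealth after 10 trades but only a bit more than a third after 100 trades, before any trading profits are taken into account.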
“Long Short Backtesting Class” presents the BacktestLongShort class, which also inherits from the BacktestBase class. In addition to implementing the respective methods for the backtesting of the different strategies, it also implements two additional methods to go long and short, respectively. Only the .go_long() method is commented in detail since the .go_short() method does exactly the same in the opposite direction.
    def go_long(self, bar, units=None, amount=None):
        if self.position == -1:
            self.place_buy_order(bar, units=-self.units)
        if units:
            self.place_buy_order(bar, units=units)
        elif amount:
            if amount == 'all':
                amount = self.amount
            self.place_buy_order(bar, amount=amount)

    def go_short(self, bar, units=None, amount=None):
        if self.position == 1:
            self.place_sell_order(bar, units=self.units)
        if units:
            self.place_sell_order(bar, units=units)
        elif amount:
            if amount == 'all':
                amount = self.amount
            self.place_sell_order(bar, amount=amount)
In addition to bar, the methods expect either a number for the units of the traded instrument or a currency amount.
In the .go_long() case, it is first checked whether there is a short position.
If so, this short position gets closed first.
It is then checked whether units is given …
… which triggers a buy order accordingly.
If amount is given, there can be two cases.
First, the value is 'all', which translates into …
… all the available cash in the current cash balance.
Second, the value is a number that is then simply taken to place the respective buy order. Note that it is not checked, for instance, whether there is enough liquidity.
To keep the implementation concise throughout, there are many simplifications in the Python classes that transfer responsibility to the user. For example, the classes do not check whether there is enough liquidity to execute a trade. This is an economic simplification since, in theory, one could assume enough or even unlimited credit for the algorithmic trader. As another example, certain methods expect that at least one of two parameters (either units or amount) is specified. There is no code that catches the case where neither is set. This is a technical simplification.
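Both simplifications could be lifted with a small guard that runs before an order is placed. The following is one possible sketch, not the approach of the classes in this chapter: the function name resolve_units, its signature, and the error messages are hypothetical, and transaction costs are deliberately left out of the liquidity check.

```python
def resolve_units(price, cash, units=None, amount=None):
    """Derive the order size and check liquidity before a buy order.

    Hypothetical guard for illustration; transaction costs are ignored.
    """
    if units is None and amount is None:
        raise ValueError('either units or amount must be specified')
    if units is None:
        units = int(amount / price)  # same sizing rule as the buy order
    if units * price > cash:
        raise ValueError('not enough liquidity to execute the trade')
    return units


print(resolve_units(price=50.0, cash=1000.0, amount=500.0))
```

Whether a missing parameter or insufficient cash should raise an exception, reduce the order size, or be tolerated is a design decision that depends on the economic assumptions of the backtest.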
The following presents the core loop from the .run_mean_reversion_strategy() method of the BacktestLongShort class. Again, the mean reversion strategy is picked since the implementation is a bit more involved. For instance, it is the only strategy that also leads to intermediate market neutral positions. This necessitates more checks compared to the other two strategies, as seen in “Long Short Backtesting Class”.
        for bar in range(SMA, len(self.data)):
            if self.position == 0:
                if (self.data['price'].iloc[bar] <
                        self.data['SMA'].iloc[bar] - threshold):
                    self.go_long(bar, amount=self.initial_amount)
                    self.position = 1
                elif (self.data['price'].iloc[bar] >
                        self.data['SMA'].iloc[bar] + threshold):
                    self.go_short(bar, amount=self.initial_amount)
                    self.position = -1
            elif self.position == 1:
                if self.data['price'].iloc[bar] >= self.data['SMA'].iloc[bar]:
                    self.place_sell_order(bar, units=self.units)
                    self.position = 0
            elif self.position == -1:
                if self.data['price'].iloc[bar] <= self.data['SMA'].iloc[bar]:
                    self.place_buy_order(bar, units=-self.units)
                    self.position = 0
        self.close_out(bar)
The first top level condition checks whether the position is market neutral.
If this is true, it is then checked whether the current price is low enough relative to the SMA.
In such a case, the .go_long() method is called …
… and the market position is set to long.
If the current price is high enough relative to the SMA, the .go_short() method is called …
… and the market position is set to short.
The second top level condition checks for a long market position.
In such a case, it is further checked whether the current price is at or above the SMA level again.
If so, the long position gets closed out by selling all units in the portfolio.
The market position is reset to neutral.
Finally, the third top level condition checks for a short position.
If the current price is at or below the SMA …
… a buy order for all units short is triggered to close out the short position.
The market position is then reset to neutral.
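The three top level conditions above form a small state machine with the states long, short, and neutral. The sketch below isolates this logic under stated assumptions: the function name long_short_positions and the sample prices are hypothetical, and the function records positions only, without placing orders or tracking cash.

```python
import pandas as pd


def long_short_positions(prices, sma, threshold):
    """Return the position (-1, 0, or 1) held after each bar.

    Hypothetical helper that reproduces only the signal logic of the
    long-short mean reversion strategy.
    """
    sma_values = prices.rolling(sma).mean()
    position = 0
    positions = []
    for bar in range(sma, len(prices)):
        price, level = prices.iloc[bar], sma_values.iloc[bar]
        if position == 0:
            if price < level - threshold:
                position = 1   # low enough relative to the SMA: go long
            elif price > level + threshold:
                position = -1  # high enough relative to the SMA: go short
        elif position == 1:
            if price >= level:
                position = 0   # close out the long position
        elif position == -1:
            if price <= level:
                position = 0   # close out the short position
        positions.append(position)
    return positions


prices = pd.Series([10.0, 10.0, 10.0, 7.0, 12.0, 15.0, 9.0])
print(long_short_positions(prices, sma=3, threshold=1.0))
```

Note that a long position is never flipped directly into a short position in this logic. The position always passes through the neutral state first, which is why the mean reversion strategy needs more checks than the other two.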
Executing the Python script in “Long Short Backtesting Class” yields performance results that shed further light on strategy characteristics. One might be inclined to assume that adding the flexibility to short a financial instrument yields better results. However, reality shows that this is not necessarily true. All strategies perform worse, both without and after transaction costs. Some configurations pile up net losses or even end in a position of debt. Although these are specific results only, they illustrate that it is risky in such a context to jump to conclusions too early and to not take into account limits for piling up debt.
Running SMA strategy | SMA1=42 & SMA2=252
fixed costs 0.0 | proportional costs 0.0
=======================================================
Final balance [$] 45631.83
Net Performance [%] 356.32
=======================================================

Running momentum strategy | 60 days
fixed costs 0.0 | proportional costs 0.0
=======================================================
Final balance [$] 105236.62
Net Performance [%] 952.37
=======================================================

Running mean reversion strategy | SMA=50 & thr=5
fixed costs 0.0 | proportional costs 0.0
=======================================================
Final balance [$] 17279.15
Net Performance [%] 72.79
=======================================================

Running SMA strategy | SMA1=42 & SMA2=252
fixed costs 10.0 | proportional costs 0.01
=======================================================
Final balance [$] 38369.65
Net Performance [%] 283.70
=======================================================

Running momentum strategy | 60 days
fixed costs 10.0 | proportional costs 0.01
=======================================================
Final balance [$] 6883.45
Net Performance [%] -31.17
=======================================================

Running mean reversion strategy | SMA=50 & thr=5
fixed costs 10.0 | proportional costs 0.01
=======================================================
Final balance [$] -5110.97
Net Performance [%] -151.11
=======================================================
Situations where trading might eat up all the initial equity and might even lead to a position of debt arise, for example, in the context of trading contracts-for-difference (CFDs). These are highly leveraged products for which the trader only needs to put down, say, 5% of the position value as the initial margin (when the leverage is 20). If the position value changes by 10%, say, the trader might be required to meet a corresponding margin call. For a long position of 100,000 USD, equity of 5,000 USD is required. If the position drops to 90,000 USD, the equity is wiped out and the trader needs to put down 5,000 USD more to cover the losses. This assumes that no margin stop outs are in place, which would close the position as soon as the remaining equity drops to 0 USD.
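The margin arithmetic of this example can be written down in a few lines. The function name cfd_equity is hypothetical, and the sketch assumes a long position without margin stop out, financing costs, or spreads.

```python
def cfd_equity(position_value, margin_rate, price_change):
    """Equity left after a relative price move in a leveraged long position.

    Hypothetical helper; assumes no stop out, financing costs, or spreads.
    """
    initial_margin = position_value * margin_rate  # equity put down
    pnl = position_value * price_change            # profit or loss
    return initial_margin + pnl


equity = cfd_equity(position_value=100_000, margin_rate=0.05,
                    price_change=-0.10)
print(f'remaining equity: {equity:.2f} USD')
```

A negative result is the amount the trader has to put down in addition to the lost initial margin, which matches the 5,000 USD in the example above.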
This chapter presents classes for the event-based backtesting of trading strategies. Compared to vectorized backtesting, event-based backtesting makes intentional and heavy use of loops and iterations to be able to tackle every single new event (in general, the arrival of new data) individually. This allows for a more flexible approach that can, among other things, easily cope with fixed transaction costs or more complex strategies (and variations thereof).
“Backtesting Base Class” presents a base class with certain methods useful for the backtesting of a variety of trading strategies. “Long Only Backtesting Class” and “Long Short Backtesting Class” build on this infrastructure to implement classes that allow the backtesting of long only and long short trading strategies. Mainly for comparison reasons, the implementations include all three strategies formally introduced in Chapter 4. Taking the classes of this chapter as a starting point, enhancements and refinements are easily achieved.
Previous chapters introduce the basic ideas and concepts with regard to the three trading strategies covered in this chapter. This chapter for the first time makes more systematic use of Python classes and object-oriented programming (OOP). A good introduction to OOP with Python and Python’s data model is found in Ramalho (2021). A more concise introduction to OOP applied to finance is found in Hilpisch (2018, ch. 6).
Hilpisch, Yves (2018): Python for Finance — Mastering Data-Driven Finance. 2nd ed., O’Reilly, Beijing et al.
Ramalho, Luciano (2021): Fluent Python — Clear, Concise, and Effective Programming. 2nd ed., O’Reilly, Beijing et al.
The Python ecosystem provides a number of optional packages that allow the backtesting of algorithmic trading strategies. Four of them are the following:
Zipline, for example, powers the popular Quantopian platform for the backtesting of algorithmic trading strategies but can also be installed and used locally.
Although these packages might allow for a more thorough backtesting of algorithmic trading strategies than the rather simple classes presented in this chapter, the main goal of this book is to empower the reader and algorithmic trader to implement Python code in a self-contained fashion. Even if standard packages are later used to do the actual backtesting, a good understanding of the different approaches and their mechanics is for sure beneficial if not required.
This section presents Python scripts referenced and used in this chapter.
The following Python code contains the base class for event-based backtesting.
#
# Python Script with Base Class
# for Event-based Backtesting
#
# Python for Algorithmic Trading
# (c) Dr. Yves J. Hilpisch
# The Python Quants GmbH
#
import numpy as np
import pandas as pd
from pylab import mpl, plt
# note: Matplotlib 3.6 and later name this style 'seaborn-v0_8'
plt.style.use('seaborn')
mpl.rcParams['font.family'] = 'serif'


class BacktestBase(object):
    ''' Base class for event-based backtesting of trading strategies.

    Attributes
    ==========
    symbol: str
        TR RIC (financial instrument) to be used
    start: str
        start date for data selection
    end: str
        end date for data selection
    amount: float
        amount to be invested either once or per trade
    ftc: float
        fixed transaction costs per trade (buy or sell)
    ptc: float
        proportional transaction costs per trade (buy or sell)

    Methods
    =======
    get_data:
        retrieves and prepares the base data set
    plot_data:
        plots the closing price for the symbol
    get_date_price:
        returns the date and price for the given bar
    print_balance:
        prints out the current (cash) balance
    print_net_wealth:
        prints out the current net wealth
    place_buy_order:
        places a buy order
    place_sell_order:
        places a sell order
    close_out:
        closes out a long or short position
    '''
    def __init__(self, symbol, start, end, amount,
                 ftc=0.0, ptc=0.0, verbose=True):
        self.symbol = symbol
        self.start = start
        self.end = end
        self.initial_amount = amount
        self.amount = amount
        self.ftc = ftc
        self.ptc = ptc
        self.units = 0
        self.position = 0
        self.trades = 0
        self.verbose = verbose
        self.get_data()
    def get_data(self):
        ''' Retrieves and prepares the data.
        '''
        raw = pd.read_csv('http://hilpisch.com/pyalgo_eikon_eod_data.csv',
                          index_col=0, parse_dates=True).dropna()
        raw = pd.DataFrame(raw[self.symbol])
        raw = raw.loc[self.start:self.end]
        raw.rename(columns={self.symbol: 'price'}, inplace=True)
        raw['return'] = np.log(raw / raw.shift(1))
        self.data = raw.dropna()
    def plot_data(self, cols=None):
        ''' Plots the closing prices for symbol.
        '''
        if cols is None:
            cols = ['price']
        # plot the selected columns (the default is the price column)
        self.data[cols].plot(figsize=(10, 6), title=self.symbol)

    def get_date_price(self, bar):
        ''' Return date and price for bar.
        '''
        date = str(self.data.index[bar])[:10]
        price = self.data.price.iloc[bar]
        return date, price
    def print_balance(self, bar):
        ''' Print out current cash balance info.
        '''
        date, price = self.get_date_price(bar)
        print(f'{date} | current balance {self.amount:.2f}')

    def print_net_wealth(self, bar):
        ''' Print out current net wealth info.
        '''
        date, price = self.get_date_price(bar)
        net_wealth = self.units * price + self.amount
        print(f'{date} | current net wealth {net_wealth:.2f}')
    def place_buy_order(self, bar, units=None, amount=None):
        ''' Place a buy order.
        '''
        date, price = self.get_date_price(bar)
        if units is None:
            units = int(amount / price)
        self.amount -= (units * price) * (1 + self.ptc) + self.ftc
        self.units += units
        self.trades += 1
        if self.verbose:
            print(f'{date} | buying {units} units at {price:.2f}')
            self.print_balance(bar)
            self.print_net_wealth(bar)
    def place_sell_order(self, bar, units=None, amount=None):
        ''' Place a sell order.
        '''
        date, price = self.get_date_price(bar)
        if units is None:
            units = int(amount / price)
        self.amount += (units * price) * (1 - self.ptc) - self.ftc
        self.units -= units
        self.trades += 1
        if self.verbose:
            print(f'{date} | selling {units} units at {price:.2f}')
            self.print_balance(bar)
            self.print_net_wealth(bar)
    def close_out(self, bar):
        ''' Closing out a long or short position.
        '''
        date, price = self.get_date_price(bar)
        self.amount += self.units * price
        self.units = 0
        self.trades += 1
        if self.verbose:
            print(f'{date} | inventory {self.units} units at {price:.2f}')
            print('=' * 55)
        print('Final balance [$] {:.2f}'.format(self.amount))
        perf = ((self.amount - self.initial_amount) /
                self.initial_amount * 100)
        print('Net Performance [%] {:.2f}'.format(perf))
        print('Trades Executed [#] {:.2f}'.format(self.trades))
        print('=' * 55)
if __name__ == '__main__':
    bb = BacktestBase('AAPL.O', '2010-1-1', '2019-12-31', 10000)
    print(bb.data.info())
    print(bb.data.tail())
    bb.plot_data()
    plt.savefig('../../images/ch06/backtestbaseplot.png')
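The cash bookkeeping in place_buy_order and place_sell_order can be checked in isolation. The following is a minimal sketch and not part of the original script: the function names cash_after_buy and cash_after_sell are hypothetical, and the arithmetic mirrors the two cash update lines of the class under the assumption that ptc is a fraction of the traded value and ftc a fixed amount per trade.

```python
def cash_after_buy(cash, price, units=None, amount=None, ptc=0.0, ftc=0.0):
    """Return (units, cash) after a buy order (hypothetical helper)."""
    if units is None:
        units = int(amount / price)  # whole units only
    # the traded value plus proportional and fixed costs leaves the account
    cash -= units * price * (1 + ptc) + ftc
    return units, cash


def cash_after_sell(cash, price, units, ptc=0.0, ftc=0.0):
    """Return cash after a sell order (hypothetical helper)."""
    # the traded value net of proportional and fixed costs is credited
    cash += units * price * (1 - ptc) - ftc
    return cash


units, cash = cash_after_buy(10000, 100.0, amount=10000, ptc=0.01, ftc=10.0)
print(units, round(cash, 2))  # 100 -110.0
cash = cash_after_sell(cash, 110.0, units, ptc=0.01, ftc=10.0)
print(round(cash, 2))  # 10770.0
```

Note that investing the full cash balance with positive transaction costs leaves a slightly negative balance in this sketch, because the costs are charged on top of the invested amount.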
The following presents Python code with a class for the event-based backtesting of long-only strategies, with implementations for strategies based on SMAs, momentum, and mean reversion.
#
# Python Script with Long Only Class
# for Event-based Backtesting
#
# Python for Algorithmic Trading
# (c) Dr. Yves J. Hilpisch
# The Python Quants GmbH
#
from BacktestBase import *


class BacktestLongOnly(BacktestBase):
    def run_sma_strategy(self, SMA1, SMA2):
        ''' Backtesting a SMA-based strategy.

        Parameters
        ==========
        SMA1, SMA2: int
            shorter and longer term simple moving average (in days)
        '''
        msg = f'\nRunning SMA strategy | SMA1={SMA1} & SMA2={SMA2}'
        msg += f'\nfixed costs {self.ftc} | '
        msg += f'proportional costs {self.ptc}'
        print(msg)
        print('=' * 55)
        self.position = 0  # initial neutral position
        self.trades = 0  # no trades yet
        self.amount = self.initial_amount  # reset initial capital
        self.data['SMA1'] = self.data['price'].rolling(SMA1).mean()
        self.data['SMA2'] = self.data['price'].rolling(SMA2).mean()

        for bar in range(SMA2, len(self.data)):
            if self.position == 0:
                if self.data['SMA1'].iloc[bar] > self.data['SMA2'].iloc[bar]:
                    self.place_buy_order(bar, amount=self.amount)
                    self.position = 1  # long position
            elif self.position == 1:
                if self.data['SMA1'].iloc[bar] < self.data['SMA2'].iloc[bar]:
                    self.place_sell_order(bar, units=self.units)
                    self.position = 0  # market neutral
        self.close_out(bar)
    def run_momentum_strategy(self, momentum):
        ''' Backtesting a momentum-based strategy.

        Parameters
        ==========
        momentum: int
            number of days for mean return calculation
        '''
        msg = f'\nRunning momentum strategy | {momentum} days'
        msg += f'\nfixed costs {self.ftc} | '
        msg += f'proportional costs {self.ptc}'
        print(msg)
        print('=' * 55)
        self.position = 0  # initial neutral position
        self.trades = 0  # no trades yet
        self.amount = self.initial_amount  # reset initial capital
        self.data['momentum'] = self.data['return'].rolling(momentum).mean()

        for bar in range(momentum, len(self.data)):
            if self.position == 0:
                if self.data['momentum'].iloc[bar] > 0:
                    self.place_buy_order(bar, amount=self.amount)
                    self.position = 1  # long position
            elif self.position == 1:
                if self.data['momentum'].iloc[bar] < 0:
                    self.place_sell_order(bar, units=self.units)
                    self.position = 0  # market neutral
        self.close_out(bar)
    def run_mean_reversion_strategy(self, SMA, threshold):
        ''' Backtesting a mean reversion-based strategy.

        Parameters
        ==========
        SMA: int
            simple moving average in days
        threshold: float
            absolute value for deviation-based signal relative to SMA
        '''
        msg = f'\nRunning mean reversion strategy | '
        msg += f'SMA={SMA} & thr={threshold}'
        msg += f'\nfixed costs {self.ftc} | '
        msg += f'proportional costs {self.ptc}'
        print(msg)
        print('=' * 55)
        self.position = 0
        self.trades = 0
        self.amount = self.initial_amount

        self.data['SMA'] = self.data['price'].rolling(SMA).mean()

        for bar in range(SMA, len(self.data)):
            if self.position == 0:
                if (self.data['price'].iloc[bar] <
                        self.data['SMA'].iloc[bar] - threshold):
                    self.place_buy_order(bar, amount=self.amount)
                    self.position = 1
            elif self.position == 1:
                if self.data['price'].iloc[bar] >= self.data['SMA'].iloc[bar]:
                    self.place_sell_order(bar, units=self.units)
                    self.position = 0
        self.close_out(bar)
if __name__ == '__main__':
    def run_strategies():
        lobt.run_sma_strategy(42, 252)
        lobt.run_momentum_strategy(60)
        lobt.run_mean_reversion_strategy(50, 5)
    lobt = BacktestLongOnly('AAPL.O', '2010-1-1', '2019-12-31', 10000,
                            verbose=False)
    run_strategies()
    # transaction costs: 10 USD fix, 1% variable
    lobt = BacktestLongOnly('AAPL.O', '2010-1-1', '2019-12-31',
                            10000, 10.0, 0.01, False)
    run_strategies()
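The bar-by-bar logic of the long-only SMA strategy can be replayed without pandas or market data. The following is a minimal sketch under simplifying assumptions: prices are a plain list, the helper names sma and long_only_sma_positions are hypothetical, and no orders or costs are modeled. Only the position state (0 for neutral, 1 for long) is tracked, starting at the index of the longer window as the loop in the class does.

```python
def sma(prices, window, bar):
    """Mean of the last `window` prices up to and including index `bar`."""
    return sum(prices[bar - window + 1:bar + 1]) / window


def long_only_sma_positions(prices, short, long):
    """Replay prices bar by bar and record the position after each bar."""
    position = 0  # 0 = market neutral, 1 = long
    positions = []
    for bar in range(long, len(prices)):
        fast, slow = sma(prices, short, bar), sma(prices, long, bar)
        if position == 0 and fast > slow:
            position = 1  # a buy order would be placed here
        elif position == 1 and fast < slow:
            position = 0  # a sell order would be placed here
        positions.append(position)
    return positions


prices = [10, 10, 10, 10, 11, 12, 13, 12, 10, 8, 7]
print(long_only_sma_positions(prices, short=2, long=4))
# [1, 1, 1, 1, 0, 0, 0]
```

Each decision uses only the prices up to the current bar, which is the property that distinguishes the event-based loop from a vectorized calculation over the full data set.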
The following Python code contains a class for the event-based backtesting of long/short strategies, with implementations for strategies based on SMAs, momentum, and mean reversion.
#
# Python Script with Long Short Class
# for Event-based Backtesting
#
# Python for Algorithmic Trading
# (c) Dr. Yves J. Hilpisch
# The Python Quants GmbH
#
from BacktestBase import *


class BacktestLongShort(BacktestBase):
    def go_long(self, bar, units=None, amount=None):
        if self.position == -1:
            self.place_buy_order(bar, units=-self.units)
        if units:
            self.place_buy_order(bar, units=units)
        elif amount:
            if amount == 'all':
                amount = self.amount
            self.place_buy_order(bar, amount=amount)
    def go_short(self, bar, units=None, amount=None):
        if self.position == 1:
            self.place_sell_order(bar, units=self.units)
        if units:
            self.place_sell_order(bar, units=units)
        elif amount:
            if amount == 'all':
                amount = self.amount
            self.place_sell_order(bar, amount=amount)
    def run_sma_strategy(self, SMA1, SMA2):
        msg = f'\nRunning SMA strategy | SMA1={SMA1} & SMA2={SMA2}'
        msg += f'\nfixed costs {self.ftc} | '
        msg += f'proportional costs {self.ptc}'
        print(msg)
        print('=' * 55)
        self.position = 0  # initial neutral position
        self.trades = 0  # no trades yet
        self.amount = self.initial_amount  # reset initial capital
        self.data['SMA1'] = self.data['price'].rolling(SMA1).mean()
        self.data['SMA2'] = self.data['price'].rolling(SMA2).mean()

        for bar in range(SMA2, len(self.data)):
            if self.position in [0, -1]:
                if self.data['SMA1'].iloc[bar] > self.data['SMA2'].iloc[bar]:
                    self.go_long(bar, amount='all')
                    self.position = 1  # long position
            if self.position in [0, 1]:
                if self.data['SMA1'].iloc[bar] < self.data['SMA2'].iloc[bar]:
                    self.go_short(bar, amount='all')
                    self.position = -1  # short position
        self.close_out(bar)
    def run_momentum_strategy(self, momentum):
        msg = f'\nRunning momentum strategy | {momentum} days'
        msg += f'\nfixed costs {self.ftc} | '
        msg += f'proportional costs {self.ptc}'
        print(msg)
        print('=' * 55)
        self.position = 0  # initial neutral position
        self.trades = 0  # no trades yet
        self.amount = self.initial_amount  # reset initial capital
        self.data['momentum'] = self.data['return'].rolling(momentum).mean()

        for bar in range(momentum, len(self.data)):
            if self.position in [0, -1]:
                if self.data['momentum'].iloc[bar] > 0:
                    self.go_long(bar, amount='all')
                    self.position = 1  # long position
            if self.position in [0, 1]:
                if self.data['momentum'].iloc[bar] <= 0:
                    self.go_short(bar, amount='all')
                    self.position = -1  # short position
        self.close_out(bar)
    def run_mean_reversion_strategy(self, SMA, threshold):
        msg = f'\nRunning mean reversion strategy | '
        msg += f'SMA={SMA} & thr={threshold}'
        msg += f'\nfixed costs {self.ftc} | '
        msg += f'proportional costs {self.ptc}'
        print(msg)
        print('=' * 55)
        self.position = 0  # initial neutral position
        self.trades = 0  # no trades yet
        self.amount = self.initial_amount  # reset initial capital

        self.data['SMA'] = self.data['price'].rolling(SMA).mean()

        for bar in range(SMA, len(self.data)):
            if self.position == 0:
                if (self.data['price'].iloc[bar] <
                        self.data['SMA'].iloc[bar] - threshold):
                    self.go_long(bar, amount=self.initial_amount)
                    self.position = 1
                elif (self.data['price'].iloc[bar] >
                        self.data['SMA'].iloc[bar] + threshold):
                    self.go_short(bar, amount=self.initial_amount)
                    self.position = -1
            elif self.position == 1:
                if self.data['price'].iloc[bar] >= self.data['SMA'].iloc[bar]:
                    self.place_sell_order(bar, units=self.units)
                    self.position = 0
            elif self.position == -1:
                if self.data['price'].iloc[bar] <= self.data['SMA'].iloc[bar]:
                    self.place_buy_order(bar, units=-self.units)
                    self.position = 0
        self.close_out(bar)
if __name__ == '__main__':
    def run_strategies():
        lsbt.run_sma_strategy(42, 252)
        lsbt.run_momentum_strategy(60)
        lsbt.run_mean_reversion_strategy(50, 5)
    lsbt = BacktestLongShort('EUR=', '2010-1-1', '2019-12-31', 10000,
                             verbose=False)
    run_strategies()
    # transaction costs: 10 USD fix, 1% variable
    lsbt = BacktestLongShort('AAPL.O', '2010-1-1', '2019-12-31',
                             10000, 10.0, 0.01, False)
    run_strategies()
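The way go_long and go_short flip a position, by first closing the opposite position and then opening the new one, can be illustrated with a stripped-down account. The following is a minimal sketch, not the class from the script: the Account class and the module-level go_long and go_short functions are hypothetical, transaction costs are ignored, and the sign of units (instead of a separate position attribute) indicates the position.

```python
class Account:
    """Hypothetical cash and inventory bookkeeping without costs."""

    def __init__(self, cash):
        self.cash = cash
        self.units = 0  # negative units represent a short position

    def buy(self, price, units):
        self.cash -= units * price
        self.units += units

    def sell(self, price, units):
        self.cash += units * price
        self.units -= units

    def net_wealth(self, price):
        return self.cash + self.units * price


def go_long(acct, price, units):
    if acct.units < 0:
        acct.buy(price, -acct.units)  # cover the short position first
    acct.buy(price, units)


def go_short(acct, price, units):
    if acct.units > 0:
        acct.sell(price, acct.units)  # close the long position first
    acct.sell(price, units)


acct = Account(1000.0)
go_long(acct, 10.0, 50)
go_short(acct, 12.0, 50)  # sells the 50 long units, then 50 more
print(acct.units, acct.cash, acct.net_wealth(12.0))  # -50 1700.0 1100.0
go_long(acct, 9.0, 50)    # covers the short at 9, then buys 50
print(acct.units, acct.cash, acct.net_wealth(9.0))   # 50 800.0 1250.0
```

In this sketch the short position gains when the price falls from 12 to 9, which shows up as the increase in net wealth from 1100 to 1250.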