Symbols
3 Month T-bill
using, from FRED database 281, 282
13 Week T-bill
A
absolute percent error (APE) 261
accuracy paradox 568
Adaptive Moment Estimation (Adam) 656
Adaptive Synthetic Sampling (ADASYN) 564
drawbacks 564
additive model 144
Akaike Information Criterion (AIC) 156, 190
Alpha Vantage
reference link 21
altair
reference link 69
American options
pricing, with Least Squares Monte Carlo 353-356
pricing, with QuantLib 357-361
analysis of variance (ANOVA) 610
anchored walk-forward validation 207
antithetic variates 346
ARCH effect 104
ARCH model 306
stock returns’ volatility, modeling with 306-311
area under the ROC curve (AUC) 512
ARIMA (Autoregressive Integrated Moving Average) 305
ARIMA class models
reference links 189
time series, modeling with 176-189
Artificial Neural Networks (ANNs) 500, 646
asset allocation 371
asset allocation problem
results, comparing from two formulations 401, 402
asset prices 26
asset returns
stylized facts, investigating 98
Augmented Dickey-Fuller (ADF) 153
auto-ARIMA
best-fitting ARIMA model, finding with 190-202
autocorrelation 102
values, small and decreasing in squared/absolute returns 104-109
AutoETS 173
used, for selecting ETS model 173-175
autoregressive (AR) model 306
AutoRegressive Network (AR-Net) 683
average precision 516
B
backtesting 415
fixed commission per order 439
fixed commission per share 438, 439
backtest, trading strategies
look-ahead bias 415
meeting investment objectives and constraints 416
multiple testing 416
outlier detection and treatment 416
realistic trading environment 416
representative sample period 416
survivorship bias 415
backtrader
backtesting with 431
used, for event-driven backtesting 424-431
backward difference encoder 561
Balanced Random Forest 570
barrier options 361
Bayesian hyperparameter optimization 580, 581
shortcomings 581
Bayesian Online Change Point Detection (BOCPD) 91
best-fitting ARIMA model
finding, with auto-ARIMA 190-203
Binary encoder 562
Black-Scholes (BS) 359
bokeh 64
URL 69
boosting 549
procedures 550
Boruta algorithm 621
Box-Cox transformation 165
Brownian motion 340
built-in cross-validation 259-261
buy/sell strategy, based on Bollinger bands
C
Calmar ratio 377
candlestick chart 69
creating, with mplfinance 72-74
creating, with pure plotly 72-74
candlestick patterns
Capital Asset Pricing Model (CAPM)
Carhart’s four-factor model
estimating 292
momentum factor 292
Catboost encoder 562
categorical encoders
ML pipelines, fitting with 557-561
categorical variables
CCC-GARCH model
used, for multivariate volatility forecasting 325-329
changepoint 86
detecting, in time series 86-88
detection algorithms, using 90
classification evaluation metrics
class imbalance
approaches, for handling 562-569
class inheritance 525
CoinGecko
Bitcoin’s current price 23
trending coins 23
combinatorial purged cross-validation algorithm 219
conditional covariance matrix
forecasting, with DCC-GARCH model 330-336
conditional heteroskedasticity 305
conditional hyperparameter spaces 588, 589
confusion matrix
possible values 510
Consumer Price Index (CPI) 30, 158
continuation patterns 124
convex optimization, with cvxpy
used, for finding efficient frontier 397-400
count encoder 561
cross-entropy loss function 656
cross-sectional factor models
estimating, with Fama-MacBeth regression 298-303
cross-validation 206
used, for tuning hyperparameters 529-536
cumulative distribution function (CDF) 350
currencies
curse of dimensionality 499
D
data, obtaining
obtaining, from Yahoo Finance 2, 4
data-generating process (DGP) 152
data leakage 488
handling, with k-fold target encoding 561
dataset
loading, from CSV file into Python 462-468
DCC-GARCH model
conditional covariance matrix, forecasting with 330-336
decision trees
cons 506
evaluation criteria 511
pros 506
visualizing, with dtreeviz 517, 518
warning 505
DeepAR, Amazon
used, for time series forecasting 669-677
DeepVAR model
downside deviation 377
dtreeviz
decision trees, visualizing with 517, 518
dummy-variable trap 500
E
encoding categorical features
alternative approaches, exploring 553-556
ensemble classifiers
AdaBoost 552
CatBoost 552
Extremely Randomized Trees 552
Histogram-Based gradient boosting 553
NGBoost 552
ensemble models
averaging methods 545
boosting methods 545
entity embedding 646
equally-weighted (1/n) portfolio 372
performance, evaluating 373-376
estimators 519
ETS methods
references 175
European call/put option 348
European options
pricing, with Monte Carlo simulations 348-351
evaluation or event timestamp 218
evening star pattern 129
event-driven backtesting 424
excess kurtosis 109
exchange-traded fund (ETF) 376
Exclusive Feature Bundling (EFB) 551
expanders 137
explainable AI (XAI)
explainable AI techniques
exploring 624
Individual Conditional Expectation (ICE) 624
Partial Dependence Plot (PDP) 624
SHapley Additive exPlanations (SHAP) 625
Exploratory Data Analysis (EDA) 470
explored hyperparameters 590-594
exponential moving average (EMA) 72, 114
exponential smoothing methods
time series, modeling with 166-172
Extreme Gradient Boosting (XGBoost)
concepts 550
F
factor model
features 275
Fama-French Factors 284
Fama-French’s five-factor model
investment factor 292
profitability factor 292
Fama-French three-factor model
market factor (MKT) 283
size factor 283
value factor 283
Fama-MacBeth regression 298
reference link 303
used, for estimating cross-sectional factor models 298-303
fastai
reference link 657
Tabular Learner, exploring 646-656
fast Fourier transform (FFT) 265
feature engineering
applying, for time series 220-230
examples 220
for time series 220
feature importance
disadvantages 598
drop column feature importance 598
investigating 597
permutation feature importance 598
feature selection
combining, with hyperparameter tuning 621, 622
feature selection techniques
approaches 620
embedded methods 610
filter methods 610
wrapper methods 610
Fisher’s kurtosis 109
Five-factor model
reference link 297
forecast
versus multi-step forecast 695-698
Forex API
reference link 40
FRED database
3 Month T-bill, using from 282
frequency
modifying, of time series data 31-34
frontier 381
finding, using convex optimization with cvxpy 397-400
finding, using optimization with scipy 389-395
finding, with Monte Carlo simulations 382-387
G
GARCH model 306
conditional mean model 315
conditional volatility model 315, 316
error distribution 316
multivariate GARCH model 337
stock returns’ volatility, modeling with 312-315
univariate GARCH model 337
volatility, forecasting with 316-324
generative adversarial networks (GANs) 369
geometric Brownian motion (GBM) 341
pros 346
used, for simulating stock price dynamics 340-347
ghost batch normalization 668
ghost feature 80
Gini importance 597
TabNet, exploring 658
Gradient-based One-Side Sampling (GOSS) 551
Gradient Boosted Trees 549
gradient descent 550
Graphics Processing Units (GPUs) 645
greedy algorithm 505
Greeks
used, for measuring price sensitivity 352
grid searches
used, for tuning hyperparameters 529-536
with multiple classifiers 539, 540
group time series validation 217
H
Hampel filter 81
used, for outlier detection 81-84
Hashing encoder 562
heatmap 380
helmert encoder 561
Heroku
URL 142
Hierarchical Risk Parity (HRP) 406
advantages 406
used, for finding optimal portfolio 406-409
Holt’s double exponential smoothing (DES) 167
Holt’s linear trend method 167
Holt-Winters’ seasonal smoothing 167
hot-deck imputation 493
Hurst exponent
used, for detecting patterns in time series 94-98
hyperparameter
tuning, with grid searches and cross-validation 529-536
I
iceberg orders 42
identified patterns
using, as features for model/strategy 128, 129
imbalance bars 48
Individual Conditional Expectation (ICE) 624
advantages 624
disadvantages 624
inflation 28
information value (IV) 556
interactive visualizations
interactive web app
building, for technical analysis with Streamlit 129-138
reference link 39
interpretability 624
interquartile range (IQR) 152, 484
Intrinio
features 11
isolation forest 571
J
James-Stein encoder 562
Jensen’s alpha 279
joint hypothesis problem 279
K
Kaggle
reference link 541
Kendall’s Tau 92
kernel density estimate (KDE) 379, 473, 590
k-fold cross-validation 206
k-fold target encoding
data leakage, handling with 561
KNN
drawbacks 498
kurtosis 377
Kwiatkowski-Phillips-Schmidt-Shin (KPSS) 153
L
label encoding 499
Least Squares Monte Carlo (LSMC)
American options, pricing with 353-356
Leave One Out Encoding (LOOE) 556
LightGBM
features 551
possibilities 609
linear interpolation 38
line plots
Ljung-Box test
reference link 189
Local Interpretable Model-agnostic Explanations (LIME) 641
locally estimated scatterplot smoothing (LOESS) 149, 151
calculating 28
long/short strategy based on RSI
M
Mann-Kendall (MK) test 92
Markov Chain Monte Carlo (MCMC) 249
Markowitz’s curse 405
Matthews correlation coefficient 516
Max drawdown 376
Maximum Likelihood Estimation (MLE) 176
Maximum Relevance Minimum Redundancy (MRMR) 620
maximum Sharpe ratio portfolio 388, 389
Mean Absolute Error (MAE) 216
Mean Absolute Percentage Error (MAPE) 174, 216
Mean Decrease in Impurity (MDI) 597
mean encoding 555
mean-reversion 94
Mean Squared Error (MSE) 216
mean-variance analysis 371
mean-variance portfolio optimization
M-estimate encoder 561
Meta
Prophet 248
metrics
calculating, to evaluate portfolio’s performance 446
Minimum Variance portfolio 389
Minimum Volatility portfolio 388
MissForest
advantages 498
disadvantages 498
missing at random (MAR) 492
missing completely at random (MCAR) 492
missingno library
available visualizations 496, 497
missing not at random (MNAR) 492
missing time series data
missing values
categorizing 492
identifying 492
ML-based approaches
used, for imputing missing values 497, 498
modern portfolio theory (MPT) 371
assumptions 372
momentum factor
reference link 297
monotonic constraints 551
Monte Carlo
Value-at-Risk (VaR), estimating with 363-369
Monte Carlo simulations
used, for finding efficient frontier 382-387
used, for improving valuation function 351, 352
used, for pricing European options 348-351
moving average convergence divergence (MACD) 114
downloading, with API 122, 124
moving average crossover strategy
backtesting, with crypto data 447-453
multiple imputation approaches 497
Multiple Imputation by Chained Equations (MICE) 497
Multiple Seasonal-Trend Decomposition using LOESS (MSTL) 149
multiplicative model 145
multivariate GARCH model
estimation, parallelizing 337, 338
multivariate volatility forecasting
Mutual Information (MI) score 618
N
Nasdaq Data Link
reference link 6
nearest neighbors imputation 497
negative skewness (third moment) 99
nested cross-validation 208
NeuralProphet
features 698
forecast, versus multi-step forecast 695-698
holidays and special events, adding 694, 695
time series forecasting 682-693
non-Gaussian distribution of returns 99, 100, 101, 108
descriptive statistics 109
histogram, of returns 108
Q-Q plot 108
non-stationary data
drawbacks 153
non-systematic component
noise 144
O
Omega ratio 376
OneHotEncoder
categories, specifying for 504
one-hot encoding 499
issues 554
pandas, using 503
warning 505
Open, High, Low, and Close (OHLC) 4
prices 70
optimal portfolio
finding, with Hierarchical Risk Parity 406-409
optimization, with scipy
used, for finding efficient frontier 389-395
oracle approximating shrinkage (OAS) 409
ordinal encoding 561
ordinary least squares (OLS) 176
Ornstein-Uhlenbeck process 94
outlier detection
outliers 77
identifying, with stock returns 84-86
oversampling methods
Borderline SMOTE 570
K-means SMOTE 571
SVM SMOTE 571
Synthetic Minority Oversampling Technique for Nominal and Continuous (SMOTE-NC) 570
P
pandas
using, for one-hot encoding 504
vectorized backtesting with 417-421
Partial Dependence Plot (PDP)
advantages 625
disadvantages 625
patterns
detecting, in time series with Hurst exponent 94-98
Pearson’s correlation coefficient 484
permutation feature importance 598
pros and cons 599
Phillips-Perron (PP) 157
pipelines
benefits 519
custom transformers, adding 524-528
elements, accessing 528
used, for organizing projects 519, 520
Platform as a Service (PaaS) 142
point anomaly detection 78
portfolio rebalancing 291
Precision-Recall curve 513
prices
Principal Components Analysis (PCA) 565
probability density function (PDF) 100, 352
Proof of Concept (PoC) 504
Prophet 249
features 248
model, tuning 261
PyCaret
features 272
URL 86
using, for time series forecasting 262-272
pycoingecko library
reference link 23
pyfolio 447
PyPortfolioOpt 410
efficient frontier, obtaining 410-412
Python libraries, on AI explainability
investigating 642
PyTorch Forecasting 677
Q
quantile plot 380
quantile-quantile (Q-Q) plot 100, 108
QuantLib
American options, pricing with 357-361
quantstats 377
pandas DataFrames/Series, enriching with new methods 380
quarter plot 62
R
Random Forest model
feature importance, evaluating 599-608
random oversampling 563
random search (randomized grid search) 530
random undersampling 563
random walk 94
realized volatility 32
Receiver Operating Characteristic (ROC) 511
Rectified Linear Unit (ReLU) 655
Recursive Feature Elimination (RFE) 619
reduced regression
time series, forecasting as 235-247
reduction process 235
relative strength index (RSI) 114, 433
rescaled range (R/S) analysis 98
returns
adjusting, for inflation 28-30
benefit 26
log returns 26
simple returns 26
reversal patterns 124
RobustStatDetector 90
Rolling Sharpe ratio 378
rolling statistics
used, for outlier detection 77-80
rolling three-factor model
Root Mean Squared Error (RMSE) 216
S
seaborn 61
seasonal decomposition
approaches 152
seasonality 58
seasonal patterns
additional information, visualizing 61-63
sequential attention 658
Sequential Least-Squares Programming (SLSQP) algorithm 394
Sequential Model-Based Optimization (SMBO) 580
serial correlation 102
SHapley Additive exPlanations (SHAP) 625, 626
advantages 626
disadvantages 626
Sharpe ratio 376
signal types 436
simple exponential smoothing (SES) 167
Simple Moving Average (SMA) 72, 115, 417
simple returns 26
calculating 27
Simplified Wrapper and Interface Generator (SWIG) 357
single imputation approaches 497
Singular Value Decomposition (SVD) 368
sizers
reference link 440
skew 377
sktime
advantages 247
documentation link 86
slippage 416
Sortino ratio 376
sparsemax 668
squared/absolute returns
autocorrelation values, small and decreasing 104-109
stacked ensemble
goal 573
stationarity 25
correcting, in time series 158-164
testing, in time series 152-158
advantages 149
stochastic differential equations (SDEs) 340
Stochastic Gradient Boosted Trees 550
stock-keeping units (SKUs) 669
stock price dynamics
stock returns’ volatility
modeling, with ARCH models 306-311
modeling, with GARCH models 312-315
Streamlit
documentation link 139
reference link 139
sign up page, reference link 139
Streamlit cloud, reference link 142
using, to build interactive web app for technical analysis 129-138
stylized facts 98
absence, of autocorrelation in returns 102-109
autocorrelation, small and decreasing in squared/absolute returns 104-109
investigating, of asset returns 98
non-Gaussian distribution of returns 99-108
volatility clustering 102, 109
successive halving
used, for performing faster search 536-538
sum encoder 561
surrogate model 580
symmetric MAPE (sMAPE) 245
Synthetic Minority Oversampling Technique for Nominal and Continuous (SMOTE-NC) 570
Synthetic Minority Oversampling Technique (SMOTE) 563
systematic components
level 144
seasonality 144
trend 144
T
TabNet, Google
implementation, in PyTorch 668
Tabular Learner, fastai
tail ratio 377
reference link 129
URL 120
tangency portfolio 387
target encoding 555
target orders 453
technical analysis (TA) 113
interactive web app, building with Streamlit 129-138
technical indicators
techniques, for tackling overfitting
batch normalization 655
dropout 655
weight decay 655
term frequency-inverse document frequency (TF-IDF) 523
test sets
Three-Factor Model 284
tick bars 43
time bars
drawbacks 42
time-related features
time series
feature engineering, applying 220-230
forecasting as reduced regression 235-247
modeling, with ARIMA class models 176-189
modeling, with exponential smoothing methods 166-173
non-systematic components 144
patterns, detecting with Hurst exponent 94-98
stationarity, correcting 158-164
stationarity, testing for 152-158
systematic components 144
time series data
frequency, modifying of 31-34
time series decomposition
goals 144
references 152
time series forecast accuracy
evaluating, metrics 216
time series forecasting
NeuralProphet, using for 683-693
trade data
training sets
transformations, applying to data
continuous variables, discretizing 524
numerical features, scaling 524
outliers, transforming/removing 524
transformers 519
treeinterpreter 642
tree-structured Parzen Estimator (TPE) 580
trends
detecting, in time series 92-94
Tukey’s fences 484
U
undersampling methods
Edited Nearest Neighbors 570
NearMiss 570
Tomek links 570
underwater plot 376
V
validation methods
validation set 490
valuation function
improving, with Monte Carlo simulations 351, 352
Value-at-Risk (VaR) 305
estimating, with Monte Carlo 363-369
Variance Inflation Factor (VIF) 620
variance targeting 336
variance thresholding 620
vectorized backtesting 417
transaction costs, accounting 422, 423
volatility clustering 102, 109
volatility forecasting
analytical approach 317
bootstrap forecasts 317
simulation-based forecasts 317
Volatility Index (VIX) 305
volatility trading 305
volume bars 43
volume-weighted average price (VWAP) 48
W
walk-forward validation 207
used, for calculating model performance 208
weak stationarity 153
Weight of Evidence (WoE) encoding 556
winsorization 81
wrapper techniques
backward feature selection 620
considerations 621
exhaustive feature selection 621
forward feature selection 620
stepwise selection 621
X
XGBoost
possibilities 609
predictions, explaining 627-641
Y
Yahoo Finance
libraries 5
reference link 5