Index

Symbols

3 Month T-bill

using, from FRED database 281, 282

13 Week T-bill

using 280, 281

A

absolute percent error (APE) 261

accuracy paradox 568

Adaptive Moment Estimation (Adam) 656

Adaptive Synthetic Sampling (ADASYN) 564

drawbacks 564

additive model 144

Akaike Information Criterion (AIC) 156, 190

allowable leverage 402-405

alpha 5, 377

Alpha Vantage

data, obtaining from 16-20

reference link 21

altair

reference link 69

American options

pricing, with Least Squares Monte Carlo 353-356

pricing, with QuantLib 357-361

analysis of variance (ANOVA) 610

anchored walk-forward validation 207

antithetic variates 346

ARCH effect 104

ARCH model 306

stock returns’ volatility, modeling with 306-311

area under the ROC curve (AUC) 512

ARIMA (Autoregressive Integrated Moving Average) 305

ARIMA class models

reference links 189

time series, modeling with 176-189

Artificial Neural Networks (ANNs) 500, 646

asset allocation 371

asset allocation problem

results, comparing from two formulations 401, 402

asset prices 26

asset returns

stylized facts, investigating 98

Augmented Dickey-Fuller (ADF) 153

auto-ARIMA

best-fitting ARIMA model, finding with 190-202

autocorrelation 102

values, small and decreasing in squared/absolute returns 104-109

AutoETS 173

used, for selecting ETS model 173-175

autoregressive (AR) model 306

AutoRegressive Network (AR-Net) 683

average precision 516

B

backtesting 415

AllInSizer 437, 438

fixed commission per order 439

fixed commission per share 438, 439

backtest, trading strategies

look-ahead bias 415

meeting investment objectives and constraints 416

multiple testing 416

outlier detection and treatment 416

realistic trading environment 416

representative sample period 416

survivorship bias 415

backtrader

backtesting with 431

used, for event-driven backtesting 424-431

backward difference encoder 561

Balanced Random Forest 570

barrier options 361

pricing 361-363

Bayesian hyperparameter optimization 580, 581

libraries 595, 596

running 582-587

shortcomings 581

Bayesian Online Change Point Detection (BOCPD) 91

best-fitting ARIMA model

finding, with auto-ARIMA 190-203

beta 2, 377

Binary encoder 562

Black-Scholes (BS) 359

bokeh 64

URL 69

Bollinger bands 114, 440

boosting 549

procedures 550

Boruta algorithm 621

Box-Cox transformation 165

Brownian motion 340

built-in cross-validation 259-261

buy/sell strategy, based on Bollinger bands

backtesting 440-445

C

Calmar ratio 377

candlestick chart 69

creating 69-72

creating, with mplfinance 72-74

creating, with pure plotly 72-74

candlestick patterns

recognizing 124-128

Capital Asset Pricing Model (CAPM)

estimating 276, 277

implementing 277-279

Carhart’s four-factor model

estimating 292

implementing 293-297

momentum factor 292

CatBoost encoder 562

categorical encoders

ML pipelines, fitting with 557-561

library 504, 505

categorical variables

encoding 499-503

CCC-GARCH model

used, for multivariate volatility forecasting 325-329

changepoint 86

detecting, in time series 86-88

detection algorithms, using 90

classification evaluation metrics

exploring 513, 514

class imbalance

approaches, for handling 562-569

class inheritance 525

CoinGecko

Bitcoin’s current price 23

data, obtaining from 21, 22

trending coins 23

combinatorial purged cross-validation algorithm 219

conditional covariance matrix

forecasting, with DCC-GARCH model 330-336

conditional heteroskedasticity 305

conditional hyperparameter spaces 588, 589

confusion matrix

possible values 510

Consumer Price Index (CPI) 30, 158

continuation patterns 124

convex optimization, with cvxpy

used, for finding efficient frontier 397-400

count encoder 561

cross-entropy loss function 656

cross-sectional factor models

estimating, with Fama-MacBeth regression 298-303

cross-validation 206

used, for tuning hyperparameters 529-536

cumulative distribution function (CDF) 350

currencies

converting 39, 40

curse of dimensionality 499

D

data, obtaining

from Alpha Vantage 16-20

from CoinGecko 21, 22

from Intrinio 9-11

from Nasdaq Data Link 5-8

from Yahoo Finance 2, 4

data-generating process (DGP) 152

data leakage 488

handling, with k-fold target encoding 561

dataset

loading, from CSV file into Python 462-468

DCC-GARCH model

conditional covariance matrix, forecasting with 330-336

decision trees

cons 506

evaluation criteria 511

fitting 505-513

pros 506

visualizing, with dtreeviz 517, 518

warning 505

DeepAR, Amazon

used, for time series forecasting 669-677

DeepVAR model

training 678-682

discretization 339, 345

downside deviation 377

dtreeviz

decision trees, visualizing with 517, 518

dummy-variable trap 500

E

encoding categorical features

alternative approaches, exploring 553-556

drawbacks 554, 555

ensemble classifiers

AdaBoost 552

CatBoost 552

exploring 544, 545

Extremely Randomized Trees 552

Histogram-Based gradient boosting 553

NGBoost 552

training 545-549

ensemble models

averaging methods 545

boosting methods 545

entity embedding 646

equally-weighted (1/n) portfolio 372

performance, evaluating 373-376

estimators 519

ETS methods

references 175

European call/put option 348

European options

pricing, with Monte Carlo simulations 348-351

evaluation or event timestamp 218

evening star pattern 129

event-driven backtesting 424

with backtrader 424-431

excess kurtosis 109

exchange-traded fund (ETF) 376

Exclusive Feature Bundling (EFB) 551

expanders 137

explainable AI (XAI)

benefits 623, 624

explainable AI techniques

exploring 624

Individual Conditional Expectation (ICE) 624

Partial Dependence Plot (PDP) 624

SHapley Additive exPlanations (SHAP) 625

Exploratory Data Analysis (EDA) 470

carrying out 470-488

explored hyperparameters 590-594

exponential moving average (EMA) 72, 114

exponential smoothing methods

time series, modeling with 166-172

Extreme Gradient Boosting (XGBoost)

concepts 550

F

factor model

features 275

Fama-French Factors 284

Fama-French’s five-factor model

implementing 293-297

investment factor 292

profitability factor 292

Fama-French three-factor model

estimating 283-287

market factor (MKT) 283

size factor 283

value factor 283

Fama-MacBeth regression 298

reference link 303

used, for estimating cross-sectional factor models 298-303

fastai

features 656, 657

reference link 657

Tabular Learner, exploring 646-656

fast Fourier transform (FFT) 265

feature engineering

applying, for time series 220-230

examples 220

for time series 220

feature importance

benefits 597, 598

disadvantages 598

drop column feature importance 598

investigating 597

permutation feature importance 598

feature selection

combining, with hyperparameter tuning 621, 622

feature selection techniques

approaches 620

embedded methods 610

exploring 610-619

filter methods 610

wrapper methods 610

Fisher’s kurtosis 109

Five-factor model

reference link 297

forecast

versus multi-step forecast 695-698

Forex API

reference link 40

FRED database

3 Month T-bill, using from 282

frequency

modifying, of time series data 31-34

frontier 381

finding, convex optimization with cvxpy used 397-400

finding, optimization with scipy used 389-395

finding, with Monte Carlo simulations 382-387

G

GARCH model 306

conditional mean model 315

conditional volatility model 315, 316

error distribution 316

estimation details 336, 337

multivariate GARCH model 337

stock returns’ volatility, modeling with 312-315

univariate GARCH model 337

volatility, forecasting with 316-324

generative adversarial networks (GANs) 369

geometric Brownian motion (GBM) 341

pros 346

used, for simulating stock price dynamics 340-347

ghost batch normalization 668

ghost feature 80

Gini importance 597

Google

TabNet, exploring 658

Gradient-based One-Side Sampling (GOSS) 551

Gradient Boosted Trees 549

gradient descent 550

Graphics Processing Units (GPUs) 645

greedy algorithm 505

Greeks

used, for measuring price sensitivity 352

grid searches

used, for tuning hyperparameters 529-536

with multiple classifiers 539, 540

group time series validation 217

H

Hampel filter 81

used, for outlier detection 81-84

Hashing encoder 562

heatmap 380

Helmert encoder 561

Heroku

URL 142

Hierarchical Risk Parity (HRP) 406

advantages 406

used, for finding optimal portfolio 406-409

Holt’s double exponential smoothing (DES) 167

Holt’s linear trend method 167

Holt-Winters’ seasonal smoothing 167

hot-deck imputation 493

Hurst exponent

used, for detecting patterns in time series 94-98

hyperparameter

tuning, with grid searches and cross-validation 529-536

I

iceberg orders 42

identified patterns

using, as features for model/strategy 128, 129

imbalance bars 48

Individual Conditional Expectation (ICE) 624

advantages 624

disadvantages 624

inflation 28

returns, adjusting for 28-30

information value (IV) 556

interactive visualizations

creating 64-67

interactive web app

building, for technical analysis with Streamlit 129-138

interpolation methods 37-39

reference link 39

interpretability 624

interquartile range (IQR) 152, 484

Intrinio

data, obtaining from 9-11

features 11

isolation forest 571

J

James-Stein encoder 562

Jensen’s alpha 279

joint hypothesis problem 279

K

Kaggle

reference link 541

Kendall’s Tau 92

kernel density estimate (KDE) 379, 473, 590

k-fold cross-validation 206

k-fold target encoding

data leakage, handling with 561

KNN

drawbacks 498

kurtosis 377

Kwiatkowski-Phillips-Schmidt-Shin (KPSS) 153

L

label encoding 499

Least Squares Monte Carlo (LSMC)

American options, pricing with 353-356

Leave One Out Encoding (LOOE) 556

leverage effect 105-109

investigating 109-111

LightGBM

features 551

possibilities 609

linear interpolation 38

line plots

creating 55-57

Ljung-Box test

reference link 189

Local Interpretable Model-agnostic Explanations (LIME) 641

locally estimated scatterplot smoothing (LOESS) 149, 151

log returns 26, 421

calculating 28

long/short strategy based on RSI

backtesting 433-437

M

Mann-Kendall (MK) test 92

Markov Chain Monte Carlo (MCMC) 249

Markowitz’s curse 405

Matthews correlation coefficient 516

Max drawdown 376

Maximum Likelihood Estimation (MLE) 176

Maximum Relevance Minimum Redundancy (MRMR) 620

maximum Sharpe ratio portfolio 388, 389

Mean Absolute Error (MAE) 216

Mean Absolute Percentage Error (MAPE) 174, 216

Mean Decrease in Impurity (MDI) 597

mean encoding 555

mean-reversion 94

Mean Squared Error (MSE) 216

mean-variance analysis 371

mean-variance portfolio optimization

backtesting 454-459

M-estimate encoder 561

Meta

Prophet 248

metrics

calculating, to evaluate portfolio’s performance 446

Minimum Variance portfolio 389

Minimum Volatility portfolio 388

MissForest

advantages 498

disadvantages 498

missing at random (MAR) 492

missing completely at random (MCAR) 492

missingno library

available visualizations 496, 497

missing not at random (MNAR) 492

missing time series data

imputing, ways 34-37

missing values

categorizing 492

dealing with 492-495

identifying 492

ML-based approaches

used, for imputing missing values 497, 498

modern portfolio theory (MPT) 371

assumptions 372

momentum factor

reference link 297

monotonic constraints 551

Monte Carlo

Value-at-Risk (VaR), estimating with 363-369

Monte Carlo simulations

used, for finding efficient frontier 382-387

used, for improving valuation function 351, 352

used, for pricing European options 348-351

moving average convergence divergence (MACD) 114

downloading, with API 122, 124

moving average crossover strategy

backtesting, with crypto data 447-453

multiple imputation approaches 497

Multiple Imputation by Chained Equations (MICE) 497

Multiple Seasonal-Trend Decomposition using LOESS (MSTL) 149

multiplicative model 145

multivariate GARCH model

estimation, parallelizing 337, 338

multivariate volatility forecasting

with CCC-GARCH model 325-329

Mutual Information (MI) score 618

N

Nasdaq Data Link

data, obtaining from 5-8

reference link 6

nearest neighbors imputation 497

negative skewness (third moment) 99

nested cross-validation 208

NeuralProphet

features 698

forecast, versus multi-step forecast 695-698

holidays and special events, adding 694, 695

time series forecasting 682-693

non-Gaussian distribution of returns 99, 100, 101, 108

descriptive statistics 109

histogram, of returns 108

Q-Q plot 108

non-stationary data

drawbacks 153

non-systematic component

noise 144

O

Omega ratio 376

OneHotEncoder

categories, specifying for 504

one-hot encoding 499

issues 554

pandas, using 503

warning 505

Open, High, Low, and Close (OHLC) 4

prices 70

optimal portfolio

finding, with Hierarchical Risk Parity 406-409

optimization, with scipy

used, for finding efficient frontier 389-395

oracle approximating shrinkage (OAS) 409

ordinal encoding 561

ordinary least squares (OLS) 176

Ornstein-Uhlenbeck process 94

outlier detection

with Hampel filter 81-84

with rolling statistics 77-80

outliers 77

identifying, with stock returns 84-86

oversampling methods

Borderline SMOTE 570

K-means SMOTE 571

SVM SMOTE 571

Synthetic Minority Oversampling Technique for Nominal and Continuous (SMOTE-NC) 570

P

pandas

using, for one-hot encoding 504

vectorized backtesting with 417-421

Partial Dependence Plot (PDP)

advantages 625

disadvantages 625

patterns

detecting, in time series with Hurst exponent 94-98

Pearson’s correlation coefficient 484

permutation feature importance 598

pros and cons 599

Phillips-Perron (PP) 157

pipelines

benefits 519

building 520-523

custom transformers, adding 524-528

elements, accessing 528

used, for organizing projects 519, 520

Platform as a Service (PaaS) 142

point anomaly detection 78

portfolio rebalancing 291

Precision-Recall curve 513

analyzing 514-517

prices

converting, to returns 25-27

Principal Components Analysis (PCA) 565

probability density function (PDF) 100, 352

Proof of Concept (PoC) 504

Prophet 249

features 248

model, tuning 261

used, in forecasting 248-258

PyCaret

features 272

URL 86

using, for time series forecasting 262-272

pycoingecko library

reference link 23

pyfolio 447

PyPortfolioOpt 410

efficient frontier, obtaining 410-412

Python libraries, on AI explainability

investigating 642

PyTorch Forecasting 677

Q

quantile plot 380

quantile-quantile (Q-Q) plot 100, 108

QuantLib

American options, pricing with 357-361

quantstats 377

pandas DataFrames/Series, enriching with new methods 380

quarter plot 62

R

Random Forest model

feature importance, evaluating 599-608

random oversampling 563

random search (randomized grid search) 530

random undersampling 563

random walk 94

realized volatility 32

Receiver Operating Characteristic (ROC) 511

Rectified Linear Unit (ReLU) 655

Recursive Feature Elimination (RFE) 619

reduced regression

time series, forecasting as 235-247

reduction process 235

relative strength index (RSI) 114, 433

rescaled range (R/S) analysis 98

returns

adjusting, for inflation 28-30

benefit 26

log returns 26

prices, converting to 25-28

simple returns 26

reversal patterns 124

RobustStatDetector 90

Rolling Sharpe ratio 378

rolling statistics

used, for outlier detection 77-80

rolling three-factor model

implementing 288-291

Root Mean Squared Error (RMSE) 216

S

seaborn 61

seasonal decomposition

approaches 152

seasonality 58

seasonal patterns

additional information, visualizing 61-63

visualizing 58-60

sequential attention 658

Sequential Least-Squares Programming (SLSQP) algorithm 394

Sequential Model-Based Optimization (SMBO) 580

serial correlation 102

SHapley Additive exPlanations (SHAP) 625, 626

advantages 626

disadvantages 626

Sharpe ratio 376

signal types 436

simple exponential smoothing (SES) 167

Simple Moving Average (SMA) 72, 115, 417

calculating 431, 432

simple returns 26

calculating 27

Simplified Wrapper and Interface Generator (SWIG) 357

single imputation approaches 497

Singular Value Decomposition (SVD) 368

sizers

reference link 440

skew 377

sktime

advantages 247

documentation link 86

slippage 416

Sortino ratio 376

sparsemax 668

squared/absolute returns

autocorrelation values, small and decreasing 104-109

stacked ensemble

creating 573-579

stacking 573, 574

goal 573

stationarity 25

correcting, in time series 158-164

testing, in time series 152-158

STL decomposition 150, 151

advantages 149

stochastic differential equations (SDEs) 340

Stochastic Gradient Boosted Trees 550

stock-keeping units (SKUs) 669

stock price dynamics

simulating, with GBM 340-347

stock returns’ volatility

modeling, with ARCH models 306-311

modeling, with GARCH models 312-315

Streamlit

documentation link 139

reference link 139

sign up page, reference link 139

Streamlit cloud, reference link 142

using, to build interactive web app for technical analysis 129-138

stylized facts 98

absence, of autocorrelation in returns 102-109

autocorrelation, small and decreasing in squared/absolute returns 104-109

investigating, of asset returns 98

leverage effect 105-109

non-Gaussian distribution of returns 99-108

volatility clustering 102, 109

successive halving

used, for performing faster search 536-538

sum encoder 561

surrogate model 580

symmetric MAPE (sMAPE) 245

Synthetic Minority Oversampling Technique for Nominal and Continuous (SMOTE-NC) 570

Synthetic Minority Oversampling Technique (SMOTE) 563

systematic components

level 144

seasonality 144

trend 144

T

TabNet, Google

exploring 658-668

features 658, 659

implementation, in PyTorch 668

Tabular Learner, fastai

exploring 646-656

tail ratio 377

TA-Lib 114, 119

reference link 129

URL 120

tangency portfolio 387

target encoding 555

target orders 453

technical analysis (TA) 113

deploying 139-142

interactive web app, building with Streamlit 129-138

technical indicators

calculating 113-119

downloading 120-122

techniques, for tackling overfitting

batch normalization 655

dropout 655

weight decay 655

term frequency-inverse document frequency (TF-IDF) 523

test sets

data, splitting into 488-492

Three-Factor Model 284

tick bars 43

time bars

drawbacks 42

time-related features

creating 230-235

time series

changepoints, detecting 86-88

feature engineering, applying 220-230

forecasting as reduced regression 235-247

modeling, with ARIMA class models 176-189

modeling, with exponential smoothing methods 166-173

non-systematic components 144

patterns, detecting with Hurst exponent 94-98

stationarity, correcting 158-164

stationarity, testing for 152-158

systematic components 144

trends, detecting 92-94

validation methods 206-216

time series data

frequency, modifying of 31-34

visualizing 52-54

time series decomposition

goals 144

performing 146-148

references 152

time series forecast accuracy

evaluating, metrics 216

time series forecasting

NeuralProphet, using for 683-693

with Amazon’s DeepAR 669-677

with PyCaret 262-272

trade data

aggregating, ways 43-48

training sets

data, splitting into 488-492

transformations, applying to data

continuous variables, discretizing 524

numerical features, scaling 524

outliers, transforming/removing 524

transformers 519

treeinterpreter 642

tree-structured Parzen Estimator (TPE) 580

trends

detecting, in time series 92-94

Tukey’s fences 484

U

undersampling methods

Edited Nearest Neighbors 570

NearMiss 570

Tomek links 570

underwater plot 376

V

validation methods

for time series 206-216

validation set 490

valuation function

improving, with Monte Carlo simulations 351, 352

Value-at-Risk (VaR) 305

estimating, with Monte Carlo 363-369

Variance Inflation Factor (VIF) 620

variance targeting 336

variance thresholding 620

vectorized backtesting 417

transaction costs, accounting 422, 423

with pandas 417-421

volatility clustering 102, 109

volatility forecasting

analytical approach 317

bootstrap forecasts 317

GARCH models, using 316-324

simulation-based forecasts 317

Volatility Index (VIX) 305

volatility trading 305

volume bars 43

volume-weighted average price (VWAP) 48

W

walk-forward validation 207

used, for calculating model performance 208

weak stationarity 153

Weight of Evidence (WoE) encoding 556

winsorization 81

wrapper techniques

backward feature selection 620

considerations 621

exhaustive feature selection 621

forward feature selection 620

stepwise selection 621

X

XGBoost

possibilities 609

predictions, explaining 627-641

Y

Yahoo Finance

data, obtaining from 2, 4

libraries 5

reference link 5
