advanced_grid_search method 114
AI (artificial intelligence) 1
AIF360 (AI Fairness 360) 96 – 100
arbitrary category imputation 47 – 48
autoencoders
training to learn features 127 – 130
BERT (bidirectional encoder representations from transformers)
transfer learning with 131 – 133
BinaryLabelDataset dataframe 97
binning 54 – 55
case studies
COVID-19 diagnostics case study 71
law school success prediction case study 102
object recognition case study 160
social media sentiment classification case study 137
categorical data construction 54 – 59
binning 54 – 55
categorical dummy bucketing 234 – 236
constructing dummy features from categorical data 227 – 228
domain-specific feature construction 58 – 59
when to dummify categorical variables vs. leaving as single column 232 – 233
categorical dummy bucketing 234 – 236
CI/CD (continuous integration/continuous delivery) 198
COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) dataset 73 – 75
computer vision case study 160
feature construction 140 – 142
histogram of oriented gradients 143 – 148
principal component analysis 147 – 148
using fine-tuned VGG-11 features with logistic regression 158 – 159
using pretrained VGG-11 as feature extractor 151 – 154
problem statement and success definition 140
ConvNets (convolutional networks) 149
COVID-19 diagnostics case study 71
COVID-flu diagnostic dataset 36 – 39
exploratory data analysis 39 – 41
categorical data construction 54 – 59
numerical feature transformations 48 – 54
imputing missing qualitative data 46 – 48
imputing missing quantitative data 41 – 46
problem statement and success definition 36 – 39
create_feature_group method 211
CSAT (customer satisfaction score) 21
daily price features 179 – 180
benefits of using feature stores 198 – 202
creating training data in Hopsworks 215 – 220
MLOps and feature stores 198 – 203
setting up feature stores with Hopsworks 204 – 215
connecting to Hopsworks with HSFS API 204 – 206
using feature groups to select data 213 – 215
Wikipedia and MLOps and feature stores 202 – 203
data-type-specific feature engineering techniques 226 – 232
constructing dummy features from categorical data 227 – 228
standardization and normalization 228
day trading case study
feature construction 166 – 185
domain-specific features 179 – 185
rolling/expanding window features 169 – 178
feature selection 185 – 188
recursive feature elimination 187 – 188
selecting features using machine learning 186 – 187
dimension reduction, optimizing with principal component analysis 147 – 148
disparate impact
treating using Yeo-Johnson transformers 91 – 96
domain-specific feature construction 58 – 59
DummifyRiskFactor transformer 57
dummy features
categorical dummy bucketing 234 – 236
constructing from categorical data 227 – 228
when to dummify categorical variables vs. leaving as single column 232 – 233
why not to dummify everything 232 – 233
EDA (exploratory data analysis) 5
EMA (exponential moving average) 180
end-of-tail imputation 43 – 46
machine learning complexity and speed 31 – 32
expanding window features 170 – 172
exploratory data analysis (EDA) 5
COVID-19 diagnostics case study 39 – 41
law school success prediction case study 76 – 79
exponential moving average (EMA) 180
fair representation implementation 96 – 100
fairness
definitions of fairness 79 – 81
disparate treatment vs. disparate impact 79
how to know if bias in data needs to be dealt with 234
measuring bias in baseline model 85 – 90
feature construction
basics of 25 – 26, 224
COVID-19 diagnostics case study 48 – 59
categorical data construction 54 – 59
numerical feature transformations 48 – 54
day trading case study 166 – 185
domain-specific features 179 – 185
rolling/expanding window features 169 – 178
law school success prediction case study
object recognition case study 140 – 142
social media sentiment classification case study 109
feature engineering 12, 33, 241
as crucial as machine learning model choice 222 – 223
categorical dummy bucketing 234 – 236
combining learned features with conventional features 236 – 239
data-type-specific techniques 226 – 232
machine learning complexity and speed 31 – 32
frequently asked questions 232 – 234
how to know if bias in data needs to be dealt with 234
when to dummify categorical variables vs. leaving as single column 232 – 233
why not to dummify everything 232 – 233
further reading material 240 – 241
great data and great models 4 – 5
qualitative data vs. quantitative data 16
need for 3 – 4
not one-size-fits-all solution 223
pipeline 5 – 8, 221 – 222
raw-data vectorizers 239 – 240
types of 9 – 10, 24 – 29, 224 – 226
feature construction 25 – 26, 224
feature extraction 27 – 28, 225
feature improvement 24 – 25, 224
feature selection 26 – 27, 224 – 225
feature extraction
basics of 27 – 28, 225
day trading case study 189 – 192
law school success prediction case study 96 – 100
object recognition case study
histogram of oriented gradients 143 – 148
principal component analysis 147 – 148
social media sentiment classification case study 123 – 125
feature groups
using to select data 213 – 215
feature improvement
basics of 24 – 25, 224
COVID-19 diagnostics case study 41 – 48
imputing missing qualitative data 46 – 48
imputing missing quantitative data 41 – 46
social media sentiment classification case study 118 – 123
cleaning noise from text 118 – 120
standardizing tokens 120 – 123
feature learning
basics of 28 – 29, 226
object recognition case study 149 – 159
using fine-tuned VGG-11 features with logistic regression 158 – 159
using pretrained VGG-11 as feature extractor 151 – 154
social media sentiment classification case study 125 – 135
BERT's pretrained features 133 – 135
feature selection
basics of 26 – 27, 224 – 225
COVID-19 diagnostics case study 66 – 69
day trading case study 185 – 188
recursive feature elimination 187 – 188
selecting features using machine learning 186 – 187
feature stores
single source of features 200 – 202
creating training data in Hopsworks 215 – 220
setting up with Hopsworks 204 – 215
connecting to Hopsworks with HSFS API 204 – 206
using feature groups to select data 213 – 215
GANs (generative adversarial networks) 10
COVID-flu diagnostic dataset 36 – 39
exploratory data analysis 39 – 41
categorical data construction 54 – 59
numerical feature transformations 48 – 54
imputing missing qualitative data 46 – 48
imputing missing quantitative data 41 – 46
problem statement and success definition 36 – 39
HOGs (histogram of oriented gradients) 143 – 148
Hopsworks
creating training data in 215 – 220
setting up feature stores with 204 – 215
connecting to Hopsworks with HSFS API 204 – 206
IDF (inverse document frequency) 115
k-NN (k-nearest neighbors) 52
KBinsDiscretizer class 55, 234
law school success prediction case study 102
exploratory data analysis 76 – 79
fairness and bias measurement 79 – 81
definitions of fairness 79 – 81
disparate treatment vs. disparate impact 79
problem statement and success definition 75
logistic regression, using fine-tuned VGG-11 features with 158 – 159
MACD (moving average convergence divergence) 180 – 181
Machine Learning Bookcamp (Grigorev) 241
min-max standardization scales 52
ML (machine learning)
feature construction 166 – 185
feature engineering as crucial as ML model choice 222 – 223
feature selection with 68 – 69
pipeline 5 – 8
selecting features using 186 – 187
MLM (masked language model) 132
MLOps Engineering at Scale (Osipov) 241
most-frequent category imputation 47
moving average convergence divergence (MACD) 180 – 181
NLP (Natural Language Processing) 4, 137
cleaning noise from text 118 – 120
standardizing tokens 120 – 123
BERT's pretrained features 133 – 135
problem statement and success definition 108
text vectorization 108 – 117, 135
TF-IDF vectorization 115 – 117
tweet sentiment dataset 105 – 108
NLTK (Natural Language Toolkit) 120 – 121
noise, cleaning from text 118 – 120
defined 16 – 17
NSP (next sentence prediction) 132
numerical feature transformations 48 – 54
object recognition case study 160
feature construction 140 – 142
histogram of oriented gradients 143 – 148
principal component analysis 147 – 148
using fine-tuned VGG-11 features with logistic regression 158 – 159
using pretrained VGG-11 as feature extractor 151 – 154
problem statement and success definition 140
PCA (principal component analysis) 27, 123, 147 – 148
polynomial feature extraction 189 – 192
Principles of Data Science, The, Second Edition (Ozdemir and Kakade) 241
qualitative data, imputing missing 46 – 48
arbitrary category imputation 47 – 48
most-frequent category imputation 47
quantitative data, imputing missing 41 – 46
end-of-tail imputation 43 – 46
raw-data vectorizers 239 – 240
RFE (recursive feature elimination) 187 – 188
rolling window features 169 – 170
SelectFromModel object 186, 223
singular value decomposition (SVD) 27, 123 – 125
social media sentiment classification case study 137
cleaning noise from text 118 – 120
standardizing tokens 120 – 123
BERT's pretrained features 133 – 135
problem statement and success definition 108
text vectorization 108 – 117, 135
TF-IDF vectorization 115 – 117
tweet sentiment dataset 105 – 108
structured data
constructing dummy features from categorical data 227 – 228
standardization and normalization 228
SVD (singular value decomposition) 27, 123 – 125
text vectorization 108 – 117, 135
TF-IDF vectorization 115 – 117
TF-IDF (term-frequency inverse document-frequency) vectorization 115 – 117
time series analysis case study 196
feature construction 166 – 185
domain-specific features 179 – 185
rolling/expanding window features 169 – 178
feature selection 185 – 188
recursive feature elimination 187 – 188
selecting features using machine learning 186 – 187
time series cross-validation splitting 173 – 178
tokens, standardizing 120 – 123
tweet sentiment dataset 105 – 108
Twitter
day trading case study 181 – 184
social media sentiment classification case study 137
unstructured data 14 – 15, 230 – 232
raw-data vectorizers 239 – 240
TF-IDF vectorization 115 – 117
VGG (Visual Geometry Group) 149
using fine-tuned VGG-11 features with logistic regression 158 – 159
using pretrained VGG-11 as feature extractor 151 – 154
Yeo-Johnson transformers 91 – 96
z-score standardization scales 52