As this ebook edition doesn't have fixed pagination, the page numbers below are hyperlinked for reference only, based on the printed edition of this book.
Symbols
reference link 81
A
advanced model configurations
majority class, selecting 223, 224
Amazon AWS monitoring
reference link 75
Amazon SageMaker
benefits 292
used, for serving model 295-304
Amazon SageMaker Studio Lab 293
Amazon SageMaker Training Compiler 293
Apache Airflow
installing 202
used, for creating pipeline 204-211
used, for demonstrating machine learning pipeline 211-215
API Gateway 114
automatic periodic triggers 141
advantages 143
cron jobs 141
disadvantages 144
B
batch model serving
automatic periodic triggers 141
components 136
continuous model evaluation, using to retrain 144, 145
latency 134
manual trigger 137
offline inference, serving for 145
on-demand inference, serving for 145
throughput 135
types 136
batch model serving, scenarios example 145
recommendation 146
sentiment analysis 146
batch model serving, techniques 146
periodic batch update, setting up 147-149
predictions, pulling by server application 150-152
predictions, storing in persistent store 149
BentoML 271
frameworks 9
reference link 7
Boyer-Moore algorithm
used, for selecting majority class in classification problem 223, 224
Boyer-Moore majority voting algorithm 224
business logic
post-inference business logic 231
pre-inference business logic 231
business logic in model serving, technical approaches 232
feature transformation 234
post-processing, prediction 235, 236
business logic pattern 229, 231
model, serving 32
C
CAP principle 44
reference link 44
Combiner block 196
common metrics, for training and monitoring
concept drift
reference link 83
continuous integration/continuous deployment (CI/CD) pipeline 12
continuous model evaluation 74-77
advantages 144
common metrics, for training and monitoring 85
disadvantages 144
errors monitoring 77
monitoring 75
rare classes, monitoring 102, 103
serving resources, enhancing 84
continuous model evaluation, business metrics
account deletion count 74
average time spent 74
registration count 74
continuous model evaluation, metrics
business perspectives 75
operational perspectives 75
continuous model evaluation, performance 97
metrics, plotting on dashboard 99, 100
notification, setting for performance drops 102
continuous model evaluation, service/operational metrics
availability 74
error rate 74
latency 74
cron expressions 134
D
data drift
reference link 83
DB fiddle
reference link 119
decision path 47
reference link 48
decision tree model
deep learning model 48
bias of hidden layer 52
weights from hidden layer to output layer 52
deep neural network 48
design patterns 16
in software engineering 15, 16
directed acyclic graph (DAG) 200, 201
Distributed Denial of Service (DDoS) 32
Docker
used, for TensorFlow Serving 242-247
DynamicManager 241
E
ensemble model serving
use cases 31
ensemble pattern 218
techniques, using 219
ensemble pattern, techniques
model selection 225
Error estimation model 196
errors monitoring 77
evaluation period 220
F
factory pattern 18
feature transformation 234
G
gini 60
gini coefficient 60
gini impurity 60
gini index 60
reference link 60
Google Cloud Monitoring
reference link 75
H
hashing techniques
advantages 131
disadvantages 131
I
ingress deployment 259
intercept 58
K
keyed prediction model 106-108
continued model evaluation, metrics, computing 117, 118
need for 115
order of predictions, rearranging 116, 117
tasks 122
keyed prediction model serving 27, 28
keyed prediction model, techniques
keys, creating 126
keys, passing with features from client 124, 125
keys, removing before prediction 125, 126
predictions, tagging with keys 126
keyed prediction model, use cases
exploring 108
model servers asynchronously, running 112-115
multi-threaded programming 108-112
keys
creating 126
L
latency 134
loaders 241
M
libraries 189
machine learning (ML) models
hyperparameters 43
input data 43
input data, using as states 43, 44
reference link 85
states 43
machine learning (ML) pipeline
demonstrating, with Apache Airflow 211-214
machine learning (ML) pipeline, stages
data cleaning 201
data collection 201
feature extraction 201
model, saving 201
model, testing 201
model, training 201
majority class
selecting, in classification problem with Boyer-Moore algorithm 223, 224
majority vote
reference link 218
Manager, version policy
availability-preserving policy 241
resource-preserving policy 241
manual trigger 137
advantages 137
disadvantages 137
MapReduce 42
Mean Absolute Percentage Error (MAPE) 91
Mean Square Error (MSE) 46, 91, 107
reference link 46
ML Frameworks in TFX, usage
reference link 239
ML serving patterns 22
serving approaches 22
serving philosophies 22
MNIST model
model
BentoML, using to serve 284-287
defining 5
formats 6
reference link 6
importance 11
model serving patterns
model serving, with TensorFlow Serving
operations 242
model training
model weights
Multi Model Server (MMS) 13
N
non-blocking operations 114
non-pure function 57
notebook
creating, in Amazon SageMaker 295-301
O
offline inference
serving for 145
offline model serving 134
on-demand inference
serving for 145
online model serving
issues 30
online model serving, challenges 169
class imbalance 172
concurrent requests, handling 172-174
latency 172
newly arrived data for training, using 169
online training model, underperforming 170-172
overfitting 172
online model serving, requests 156
online model serving, use cases 166
emergency center, recommending 166, 167
estimated delivery time of delivery trucks, predicting 169
favorite soccer team, predicting 167, 168
hurricane or storm path, predicting 168
P
periodic batch update
used, for making predictions 180
phase two model 180
pickle module
reference link 6
pipeline pattern 200
advantages 215
disadvantages 216
post-training integer quantization
reference link 183
prediction drift
reference link 84
predictions
pulling, by server application 150-152
storing, in persistent store 149
python-crontab library
reference link 143
PyTorch pre-trained AlexNet model to ONNX format
reference link 7
Q
quantization aware training
reference link 188
R
random states
in machine learning model 45-48
Ray Serve 253
ensemble pattern, using 261-265
ingress deployment 259
pipeline pattern, using 265-269
using, to serve model 261
regression model
end-to-end dummy example, of serving 226-228
Representational State Transfer (REST) 41
reference link 23
request timeout errors 84
REST API design
reference link 81
RNN model
route planners
S
SageMaker Autopilot 294
SageMaker Canvas 292
SageMaker Clarify 294
SageMaker Data Wrangler 293
SageMaker Edge Manager 294
SageMaker Feature Store 294
SageMaker Ground Truth Plus 292
SageMaker Inference Recommender 293
SageMaker ML Lineage Tracking 293
SageMaker Model Building Pipelines 293
SageMaker Model Monitor 294
SageMaker model registry 293
SageMaker Neo 294
SageMaker projects 293
SageMaker serverless endpoints 293
SageMaker Studio 292
SageMaker Studio notebooks 294
servable stream 241
service 274
Service Level Agreement (SLA) 30
serving 4
serving approaches pattern 22, 28
batch model serving 29
business logic pattern model serving 32
ensemble model serving 31
online model serving 30
two-phase prediction model serving 30
versus serving philosophy pattern 28
serving philosophy pattern 22
continuous model evaluation 26, 27
keyed prediction model serving 27, 28
shallow neural network 48
single responsibility principle 17
singleton pattern 41
slope 58
software engineering
source 241
Square Error (SE) 76
staged rollout 218
reference link 218
stateful functions 39
states, extracting from 40, 41
stateful models
reference link 43
state information 23
stateless functions 38
benefits 38
example 38
properties 38
stateless model serving 37
states in decision tree model 59-61
states impact, mitigating from ML model
fixed random seed, using during training 62
serving without params, from param store 66-70
states, moving to separate location 62-66
states in regression model 58, 59
T
TensorFlow
reference link 242
TensorFlow servable
examples 240
TensorFlow Serving 239
advanced model configurations, using 248-252
aspired versions 241
loaders 241
Manager 241
source 241
using, to serve models 242
TensorFlow Serving configuration
reference link 252
too many requests errors 84
traffic shadowing
reference link 218
two-phase model serving 180, 181
route planners, use cases 196, 197
two-phase model serving, techniques
converted model, saving 184-186
full integer quantization of model 184-186
MNIST model, training 183, 184
models, training for phase one and phase two 190-193
phase one model, training with reduced features 189, 190
two-phase prediction model serving 30
two-phase prediction pattern 179
U
underfitting 26
User Acceptance Testing (UAT) 26
V
VGG-16 MNIST classification
reference link 192
W
web application development life cycle 5