A
activation function
rectifier activation function 131, 132, 133
sigmoid activation function 130, 131
threshold activation function 129
AI
networking 330
AI blueprint
Deep Reinforcement Learning algorithm, implementing 328
environment, building 327
testing 328
training 328
AI, for cost implications
brain, building 270
environment, building 269
testing
training 270
AI models
Deep Q-Learning 3
Q-Learning 3
AI, real-world business problem
artificial neural network 221, 222, 223
deep reinforcement learning algorithm, implementing 238, 240, 241, 243, 244, 245
demonstrating 258, 260, 263, 264, 265, 266, 267, 269
environment, building 224, 225, 226, 227, 229, 230, 231
implementing, with dropout regularization technique 237, 238
implementing, without dropout regularization technique 233, 234, 235, 236, 237
training, with early stopping 254, 255
training, without early stopping 246, 247, 248, 249, 251, 252, 253
AI Solution
building, with Q-Learning 101, 102, 103
AI solution refresher
about 96
initialization (first iteration) 96
AI solution, Snake game
about 293
experience replay memory 295
Anaconda
about 191
installation link 191
argmax method
about 147
arrays
artificial intelligence (AI)
adding, value to business 6
companion robots 6
education 5
employment 5
energy consumption 4
environment 6
global economy 6
healthcare 4
models 2
robots 5
security 5
smart homes 5
transport and logistics 5
Artificial Neural Networks (ANNs) 113
assumptions, server environment
energy costs, approximating 210, 211
server temperature, approximating 209, 210
B
Batch Gradient Descent 140, 141, 142
C
channel 273
classes
car class exercise 31
Colaboratory
reference link 11
Convolutional Layer 276
Convolutional Neural Network (CNN) 4
Convolutional Neural Networks (CNNs)
about 271
full connection 284
max pooling 278, 279, 280, 283
D
Deep Convolutional Q-Learning 4
Deep Learning
about 125
Gradient Descent 137
Neural Networks, learning 135
Neural Networks, working 133, 134
neuron 125
Deep Q-Learning
transitions 149
Deep Q-Learning for Business 259
Deep Q-Network (DQN) 240
E
e-commerce business, issues 60, 61
environment, self-driving car
goal, defining 156, 157, 158, 159
output actions 165
parameters, setting 160, 161, 162
system of rewards, defining 166, 168
environment, Snake game
actions, defining 290, 291, 292
building 288
rewards, defining 292
Exploration 147
F
feature detectors 275
Feature Map 275
Flattening 280
for loops
exercise 27
functions
exercise 28
G
GitHub page
about 9
Gradient Descent
Batch Gradient Descent 140, 141, 142
Mini-Batch Gradient Descent 145
Stochastic Gradient Descent (SGD) 143, 144, 145
H
house prices prediction
about 114
data preparation 119
dataset, uploading 114, 115, 116
Neural Network, building 122, 123
Neural Network, training 123
I
if conditions
exercise 23
if statements
exercise 23
Integrated Development Environment (IDE) 10
intermediate goal
Internet of Things (IoT) 4
K
Kivy
about 191
installing 197, 198, 200, 201, 202, 203, 204, 205, 206
reference link 153
L
lists
M
Markov Decision Process (MDP) 37, 38
maze
Q-Learning, applying 78
Mean Squared Error (MSE) 222
Mini-Batch Gradient Descent 145
Multi-Armed Bandit problem
N
Natural Language Processing (NLP) 271
Neural Networks
learning 135
neuron
about 125
NumPy array
creating 21
NumPy library
about 21
functions, reference link 22
O
Object-Oriented Programming (OOP) 232
objects
OpenAI Gym 329
operations
about 19
P
Pooled Feature Map 279
Pooling Layer 280
PricewaterhouseCoopers (PwC) 1
principles, Reinforcement Learning
about 34
AI Environment 37
inference mode 40
input and output system 34, 35
Markov Decision Process (MDP) 38
training and inference 38
Python 3.6
used, for creating virtual environment 193, 195
virtual environment 191
PyTorch
Q
Q-Learning 3
about 91
applying, to maze 78
used, for building AI Solution 101, 102, 103
Q-Learning, maze
AI, building 85
applying 78
environment, building 79
Q-value 85
reinforcement intuition 88
states, defining 79
Q-Learning process
about 88
inference mode 89
training mode 89
R
real-world business problem
environment, building 208
rectifier activation function 131
Reinforcement Learning
principles 34
reference link 34
reward attribution
S
self-driving car
Deep Q-Learning, implementing 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190
demonstrating 191
environment, building 153
experience replay, implementing 177, 178, 179
implementing 170
Kivy, installing 197, 198, 199, 200, 201, 202, 203, 204, 205, 206
libraries, importing 171
neural network architecture, creating 172, 173, 174, 175
virtual environment, creating with Python 3.6 193, 194, 195
server environment
actions, defining 214
assumptions 209
final simulation example 216, 217, 218, 219
overall functioning 212, 213, 214
parameters 208
simulation 211
states, defining 214
variables 208
sigmoid activation function 130
simulation
environment, building 61, 62, 63
Snake game
about 288
AI, training 310, 311, 312, 313, 314, 315, 316
Anaconda Prompt, installing 319, 320, 321, 322, 323, 324
brain, building 304, 305, 306, 307
demonstrating 318
environment, building 288, 297, 298, 299, 300, 301, 302, 303, 304
experience replay memory, building 308, 309
Softmax method
about 147
Standard model
versus Thompson Sampling model 57, 58
states 35
Stochastic Gradient Descent (SGD) 143, 144, 145
T
TensorFlow
about 170
text
displaying 18
displaying, with print() method 18
Thompson Sampling 3
Thompson Sampling model
actions, simulating 66
coding 43, 44, 45, 46, 47, 69, 70, 71, 72, 73
distribution 48, 49, 50, 51, 52
implementation 68
Multi-Armed Bandit problem, tackling 52, 53, 54, 55
versus Random Selection 69
Thompson Sampling model, versus Random Selection
about 69
performance measure 69
threshold activation function 129, 130
V
variables
about 19
W
warehouse
environment, building 94
priority locations 94
warehouse environment
actions 95
AI solution refresher 96
building, elements 94
rewards 95
states 94
while loops