act()
method, defining DQN agent, 299–300, 301
Action potential, of biological neurons, 85–86
Action(s)
deep Q-learning network theory, 290–292
DeepMind DQN and, 59
Markov decision processes and, 286
reinforcement learning problems and, 55–56
Activation functions
calculus behind backpropagation, 335–336
choosing neuron type, 96
convolutional example, 164–166
nonlinear nature of, in deep learning architectures, 95
softmax layer of fast food-classifying network, 106–108
tanh neuron, 94
Activation maps
convolutional networks and, 238
in discriminator network, 267–268
in generator network, 269, 272
LeNet-5 ConvNet architecture, 173–175
as output from convolutional kernels, 163–167
pooling layers spatially reducing, 169–170
Actor-critic algorithm, RL agent, 307–308
AdaGrad optimizer, 146
Adaptive moment estimation (Adam) optimizer, 147, 148–149
Adversarial network, GANs, 272–274
Agent(s)
deep Q-learning network theory, 290–292
deep reinforcement learning and, 57
DeepMind DQN, 58
DQN. See DQN agents
optimal policy in deep reinforcement learning, 289–290
reinforcement learning problems of machine learning, 54–56
reinforcement learning theory, 283
SLM Lab, 304
AGI (artificial general intelligence), 72, 326–328
AI. See Artificial intelligence (AI)
AlexNet
CNN model inspired by, 176–177
history of deep learning for NLP, 25
ReLU neurons in, 95
Algorithms, development of AGI and, 327
AlphaGo Master, 64
Amazon review polarity, NLP training/validation samples, 316
ANI (artificial narrow intelligence), 72, 326–327
Architecture
AlexNet hierarchical, 16
bidirectional LSTM sentiment classifier, 248
convolutional sentiment classifier, 237–238
deep learning model, 52
deep net in Keras model, 148–149
dense sentiment classifier, 229–231
generalist neural network as single network, 52
intermediate-depth neural network, 127–128
LeNet-5 hierarchical, 9–11, 172–176
LSTM, 247
multi-ConvNet sentiment classifier, 253–254
regression model network, 150–151
residual network, 182
RNN sentiment classifier, 243–244
shallow neural network, 78–79, 83
stacked recurrent model, 249
TensorFlow Playground, 17
weight initialization, 133–135
Arithmetic, on fake human faces, 41–44
Art. See Machine art
Artificial general intelligence (AGI), 72, 326–328
Artificial intelligence (AI)
deep learning for NLP relevant to, 53
driven by deep learning, 52
general-purpose learning algorithms for, 58
history of chess and, 65
machine learning as subset of, 50
OpenAI Gym environments as, 68–70
Artificial narrow intelligence (ANI), 72, 326–327
Artificial neural networks (ANNs). See also Artificial neurons, constituting ANNs
AlphaGo Zero development, 63
architecture for shallow networks, 83–84
building model for DQN agent, 297–298
deep reinforcement learning using, 56–57
dominating representation learning, 51
hot dog-detecting dense network, 101–106
input layer, 99
key concepts, 110
manipulation of objects via, 67–68
schematic diagram of shallow network, 77–79
softmax layer of fast food-classifying network, 106–108
summary, 110
Artificial neurons
deep learning and, 22
deep learning model architectures, 51–52
Artificial neurons, constituting ANNs
biological neuroanatomy, 85–86
choosing neuron type, 96
hot dog/not hot dog detector, 86–90
key concepts, 97
modern neurons/activation functions, 91–95
most important equation in this book, 90–91
overview of, 85
perceptrons as early, 86
summary, 96
tanh neuron, 94
Artificial super intelligence (ASI), 72, 327
astype()
method, LeNet-5 in Keras, 172
Attention, seq2seq and, 250
Backpropagation
of bidirectional LSTMs, 247
cross-entropy costs and, 114
enabling neural networks to learn, 113
minimizing cost, 115
partial-derivative calculus behind, 335–337
training recurrent neural networks, 241
tuning hidden-layer and neuron counts, 125–126
BAIR (Berkeley Artificial Intelligence Research) Lab, 44–45
Batch normalization
deep neural networks in Keras, 148
improving deep networks, 138–139
network architecture regression model, 150–151
Batch size
of 1, also known as online learning, 124
building own project and, 320
escaping local minimum of cost, 122–124
as hyperparameter, 119
and stochastic gradient descent, 119–122
Behavioral cloning, 307
Benchmarking performance, SLM Lab, 304
Bengio, Yoshua
Turing Award for deep learning, 15
weight initialization and Glorot normal distribution, 135–137
Berkeley Artificial Intelligence Research (BAIR) Lab, 44–45
BERT (bi-directional encoder representations from transformers), NLP, 251
beta (β) hyperparameter
batch normalization adding, 139
Bi-directional encoder representations from transformers (BERT), NLP, 251
bias (b)
adding to convolutional layers, 162
in convolutional example, 164
minimizing cost via gradient descent, 115–116
notation for neural networks, 333
Bidirectional LSTMs (Bi-LSTMs), 247–249
Binary-only restriction, of perceptrons, 91–92
Biological neurons
creating perceptron algorithm with, 86
ReLU neuron activation function, 94–95
Board games
overview of, 59
boston_housing
dataset, 149–150
Bostrom, Nick, 72
Bounding boxes, developing YOLO, 185–186
build_discriminator
function, 266–268
Caffe, deep learning library, 324
Calculus, in backpropagation, 335–337
callbacks
argument
dense sentiment classifier, 232
Cambrian explosion, 3
Capsule networks, machine vision and, 192
Cart-Pole game
defining DQN agent for. See DQN agents
DQN agent interaction with OpenAI Gym, 300–303
estimating optimal Q-value, 292
hyperparameter optimization using SLM Lab, 304
Markov decision processes in, 288–289
as reinforcement learning problem, 284–286
CartPole, OpenAI Gym environment, 70
CBOW (continuous bag of words), word2vec, 207, 208
Cell body, biological neurons, 85–86
Cerebral cortex, processing visual information, 3–4
cGAN (conditional GAN), 45
Chain rule of calculus, backpropagation and, 124
Chatbots, natural language processing in, 23–24
Checkpoints, dense sentiment classifier, 231
Chen, Chen, deep learning image processing, 47–48
Chess
vs. Go board complexity, 59, 61
Classification
adding layers to transfer learning model, 190
convolutional sentiment classifier, 235–239
of film reviews by sentiment, 229–235
natural language. See Natural language classification
as supervised learning problem, 53–54
CNNs. See Convolutional neural networks (CNNs)
CNTK, deep learning library, 324
Coding shallow network in Keras
designing neural network architecture, 83
installation, 76
MNIST handwritten digits, 76–77
schematic diagram of network, 77–79
software dependencies for shallow net, 80
summary, 84
training deep learning model, 83–84
Color, visual cortex detects, 7–8
Compiling
adversarial network, 274
dense sentiment classifier, 231
discriminator network, 269
network model for DQN agent, 298
Complex neurons
forming primary visual cortex, 6–7
neocognitron and, 9
Computational complexity
minimizing number of kernels to avoid, 163
from piping images into dense networks, 160
Computational homogeneity, with Software 2.0, 325
Computing power, AGI and development of, 327
Conditional imitation learning algorithms, 307
Content generation, building socially beneficial projects, 318
Context words, running word2vec, 207–209
Continuous bag of words (CBOW), word2vec, 207, 208
Continuous variable, supervised learning problem, 54
Contracting path, U-Net, 187–188
Conv2D
dependency, LeNet-5 in Keras, 171–174
Convolutional filter hyperparameters, CNNs, 168–169
Convolutional layers
convolutional neural networks (CNNs) and, 160–162
general approach to CNN design, 176
working with pooling layers, 169–170
Convolutional layers, GANs
birth of GANs, 41
convolutional neural networks (CNNs) and, 52–53
results of latent space arithmetic, 42–44
Convolutional neural networks (CNNs)
computational complexity, 160
contemporary machine vision and, 52–53
convolutional filter hyperparameters, 168–169
DeepMind DQN using, 58
detecting spatial patterns among words, 235–239
developing Faster R-CNN, 184–185
general approach to CNN design, 176
image segmentation with Mask R-CNN, 187
manipulation of objects via, 67–68
model inspired by AlexNet, 176–178
model inspired by VGGNet, 178–179
object detection with Fast R-CNN, 184
object detection with R-CNN, 183–184
overview of, 159
transfer learning model of, 188–192
two-dimensional structure of visual imagery, 159–160
Convolutional sentiment classifier, 235–239, 252–256
Conv2DTranspose
layers, in generator networks, 270
Corpus
one-hot encoding of words within, 25–26
word2vec architectures for, 208
Cortes, Corinna, curating MNIST dataset, 77–78
Cost (loss) functions
building own project, 319
in stochastic gradient descent, 120
training deep networks and, 111
using backpropagation to calculate gradient of, 124–125, 335–337
Cost, minimizing via optimization
batch size and stochastic gradient descent, 119–122
escaping local minimum, 122–124
training deep networks and, 115
Count-based methods, word2vec vs., 208
Cross-entropy cost
essential GAN theory, 262
minimizes impact of neuron saturation, 113–115, 131
Data
augmentation, training deep networks, 145
development of AGI and, 327
Data generators, training, 190–191
DataFrame, IMDb validation data, 234
Datasets, deep reinforcement learning using larger, 57
De-convolutional layers, generator networks, 269–270, 272
deCNN, generator network as, 270
Deep Blue, history of chess, 65
Deep learning
code. See Coding shallow network in Keras
computational representations of language. See Language, computational representations of
definition of, 22
elements of natural human language in, 33–35
Google Duplex as NLP based on, 35–37
natural language processing and, 23–25, 37
networks learn representations automatically, 22–23
reinforcement learning combined with. See Reinforcement learning, deep
training deep networks. See Training deep networks
Deep learning, introduction
machine vision. See Machine vision
Quick, Draw! game, 19
summary, 20
traditional machine learning vs., 11–12
Deep learning projects, building own
artificial general intelligence approach, 326–328
converting existing machine learning project, 316–317
deep learning libraries, 321–324
deep reinforcement learning, 316
machine vision and GANs, 313–315
modeling process, including hyperparameter tuning, 318–321
natural language processing, 315–316
overview of, 313
resources for further projects, 317–318
Deep networks, improving
deep neural network in Keras, 147–149
model generalization (avoiding overfitting), 140–145
overview of, 131
summary, 154
weight initialization, 131–135
Xavier Glorot distributions, 135–137
Deep Q-learning networks (DQNs)
DeepMind video game and, 58–60
defining DQN agent. See DQN agents
Deep reinforcement learning. See Reinforcement learning, deep
DeepMind
AlphaGo Zero board game, 62–65
Google acquiring, 59
DeepMind Lab
building own deep learning project with, 316
deep reinforcement learning, 69, 71
Dendrites, and biological neurons, 85–86
Denormalization, in batch normalization, 139
Dense layers
architecting intermediate net in Keras, 127–128
artificial neural networks with, 99–100
CNN model inspired by AlexNet, 177–178
computational complexity and, 160
convolutional layers vs., 168
deep learning and, 51
Fast R-CNN and, 184
general approach to CNN design, 176
LeNet-5 in Keras and, 172–173, 175–176
multi-ConvNet model architecture, 253–255
in natural language processing, 224–225, 230–231, 236–238
networks designed for sequential data, 243
in shallow networks, 109
using weight initialization for deep networks, 132–133, 137
in wide and deep model architecture, 317
Dense network
building socially beneficial projects, 318
defined, 100
revisiting shallow network, 108–110
softmax layer of fast food-classifying network, 106–108
Dense sentiment classifier, 229–235
Dense Sentiment Classifier Jupyter notebook. See Natural language classification
Dependencies
Cart-Pole DQN agent, 293
convolutional sentiment classifier, 236
LeNet-5 in Keras, 171
loading GAN for Quick, Draw! game, 264–265
loading IMDb film reviews, 222–223
preprocessing natural language, 197
regression model, 150
TensorFlow with Keras layers, 323
Dimensionality reduction, plotting word vectors, 213–217
Discount factor (decay), Markov decision processes, 288–289
Discounted future reward
expected, 290
maximizing, 290
Discriminator network, GANs
code for training, 277
Distributed representations, localist representations vs., 32
Dot product notation, perceptron equation, 90–91
DQN agents
building neural network model for, 297–298
drawbacks of, 306
hyperparameter optimization using SLM Lab, 304
initialization parameters, 295–297
interacting with OpenAI Gym environment, 300–303
remembering gameplay, 298
selecting action to take, 299–300
training via memory replay, 298–299
DQNs. See Deep Q-learning networks (DQNs)
Dropout
for AlexNet in Keras, 177
for deep neural network in Keras, 148
Eager mode, TensorFlow, 322–323
Ease of use, Software 2.0, 326
Efros, Alexei, 44
ELMo (embeddings from language models), transfer learning, 251
Elo scores
AlphaZero game, 66
Encoder-decoder structure, NMT, 250
Environment(s)
DeepMind DQN, 58
popular deep reinforcement learning, 68
reinforcement learning problems of machine learning, 54–56
reinforcement learning theory, 283
training agents simultaneously via SLM Lab in multiple, 304
Epochs of training, checkpointing model parameters after, 231–232
Essential theory. See Theory, essential
exp
function, softmax layer of fast food-classifying network, 106–108
Expanding path, U-Net, 187–188
Experiment graph, SLM Lab, 304
Expertise, subject-matter
AlexNet reducing requirement for, 17
deep learning easing requirement for, 22–23
Exploding gradients, ANNs, 138
Extrinsic evaluations, evaluating word vectors, 209
Face detection
arithmetic on fake human faces, 41–44
birth of generative adversarial networks, 39–41
engineered features for robust real-time, 12–13
in visual cortex, 8
Facebook, fastText library, 33
False negative, IMDb reviews, 236
False positive, IMDb reviews, 235
Fan Hui, AlphaGo match, 62
Fancy optimizers, deep network improvement, 145–147
Fashion-MNIST dataset, deep learning project, 313–315
Fast food-classifying network, softmax layer of, 106–108
Fast R-CNN, object detection, 184
Faster R-CNN, object detection, 184–185
Feature engineering
AlexNet automatic vs. expertise-driven, 17
defined, 11
traditional machine learning and, 12–13
traditional machine learning vs. deep learning, 11–12
Feature maps
convolutional example of, 163–167
image segmentation with U-Net, 188
transfer learning model and, 188–192
Feedforward neural networks, training, 241
FetchPickAndPlace, OpenAI Gym, 70
Figure Eight
image-classification model, 315
natural language processing model, 316
Filters. See Kernels (filters)
Finn, Chelsea, 67
fit_generator()
method, transfer learning, 191–192
Fitting, dense sentiment classifier, 232
Flatten
layer, LeNet-5 in Keras, 171–174
FloatTensor, PyTorch, 339
for
loop, GAN training, 275–281
Formal notation, neural networks, 333–334
Forward propagation
backpropagation vs., 124
defined, 103
in hot dog-detecting dense network, 101–106
notation for neural networks, 334
in softmax layer of fast food-classifying network, 106–108
in stochastic gradient descent, 120, 121
Frozen Lake game, 316
Fukushima, Kunihiko, neocognitron, 9–12
Fully connected layer (as dense layer), 99
Functional API, non-sequential architectures and Keras, 251–256
Fusiform face area, face detection in visual cortex, 8
Game-playing machines
artificial intelligence, 49–50
artificial neural networks (ANNs), 51
categories of machine learning problems, 53–56
deep reinforcement learning, 56–57
machine learning, 50
manipulation of objects, 67–68
natural language processing, 53
popular deep reinforcement learning environments, 68–71
representation learning, 51
Software 2.0 and, 326
summary, 72
gamma (γ), batch normalization adding, 139
GANs. See Generative adversarial networks (GANs)
Gated recurrent units (GRUs), 249–250
gberg_sents, tokenizing natural language, 199
Generative adversarial networks (GANs)
actor-critic algorithm reminiscent of, 308
adversarial network component, 272–274
arithmetic on fake human faces, 41–44
building and tuning own, 315
creating photorealistic images from text, 45–46
discriminator network component, 266–269
generator network component, 269–272
high-level concepts behind, 39
image processing using deep learning, 47–48
making photos photorealistic, 45
Quick, Draw! game dataset, 263–266
reducing computational complexity with, 170
Software 2.0 and, 326
summary, 281
Generator network, GANs, 269–272
GitHub repository, Quick, Draw! game dataset, 263
Global minimum of cost, training deep networks for, 122–124
Glorot normal distribution, improving deep networks, 135–137
GloVe
converting natural words to word vectors, 28
as major alternative to word2vec, 208
Goodfellow, Ian
arithmetic on fake human faces and, 41–44
Google Duplex technology, deep-learning-based NLP, 35–37
GPUs (graphics processing units), deep reinforcement learning, 57
Gradient descent
batch size and stochastic, 119–122
cross-entropy costs and, 114
enabling neural networks to learn, 113
escaping local minimum using, 122–124
training deep networks with batch size/stochastic, 119–122
Graesser, Laura, 304
Graphics processing units (GPUs), deep reinforcement learning, 57
GRUs (gated recurrent units), 249–250
Gutenberg, Johannes, 197
HandManipulateBlock, OpenAI Gym, 70
Handwritten digits, MNIST, 76–78
Hidden layers
artificial neural network with, 99
building network model for DQN agent, 297
calculus behind backpropagation, 337
deep learning model architectures, 51–52
dense layers within. See Dense layers
forward propagation in dense network through, 102–106
hot dog-detecting dense network, 101–106
neural network notation, 333–334
schematic diagram of shallow network, 79
TensorFlow Playground demo, 100
tuning neuron count and number of, 125–126
Hidden state, LSTM, 245
Hierarchical softmax, training word2vec, 208
Hinton, Geoffrey
developing capsule networks, 192
developing t-distributed stochastic neighbor embedding, 213–214
as godfather of deep learning, 14–15
Histogram of validation data
convolutional sentiment classifier, 239
dense sentiment classifier, 233–234
Hochreiter, Sepp, 244
Hot dog-detecting dense network, 101–106
Hot dog/not hot dog detector, perceptrons, 86–90
Hubel, David
LeNet-5 model built on work of, 10–12
machine vision approach using work of, 8–9
research on visual cortex, 4–7
Human and machine language. See also Language, computational representations of
deep learning for natural language processing, 21–25
elements of natural human language in, 33–35
Google Duplex technology, 35–37
summary, 37
Humanoid, OpenAI Gym environment, 70
Hyperparameters. See also Parameters
in artificial neural networks, 130
automating search for, 321
batch size, 119
convolutional filter, 163, 167
convolutional sentiment classifier, 236–237
learning rate, 118
for loading IMDb film reviews, 223–225
multi-ConvNet sentiment classifier, 253
number of epochs of training, 122
optimizing with SLM Lab, 303–306
reducing model overfitting with dropout, 144–145
RMSProp and AdaDelta, 147
ILSVRC (ImageNet Large Scale Visual Recognition Challenge)
ResNet, 182
traditional ML vs. deep learning entrants, 13–14
Image classification
building socially beneficial projects using, 318
ILSVRC competition for, 182
machine vision datasets for deep learning, 313–315
object detection vs., 183
Image segmentation applications, machine vision, 186–188
ImageDataGenerator
class, transfer learning, 190–191
Images
creating photorealistic. See Machine art
processing using deep learning, 46–48
IMDb (Internet Movie Database) film reviews. See Natural language classification
Imitation learning, agents beyond DQN optimizing, 307
Infrastructure, rapid advances in, 327
Initialization parameters, DQN agent, 295–297
Input layer
artificial neural networks with, 99
of deep learning model architectures, 51–52
hot dog-detecting dense network, 101–106
LSTM, 245
notation for neural networks, 333
schematic diagram of shallow network, 79
TensorFlow Playground demo, 100
Installation
of code notebooks, 76
PyTorch, 341
Integer labels, converting to one-hot, 82–83
Intermediate Net in Keras Jupyter notebook, 127–129
Internal covariate shift, batch normalization, 138–139
Internet Movie Database (IMDb) film reviews. See Natural language classification
Intrinsic evaluations, word vectors, 209
iter
argument, running word2vec, 210
Kaggle
image-classification model, 315
natural language processing model, 316
Kasparov, Garry, 65
Keng, Wah Loon, 304
Keras
AlexNet and VGGNet in, 176–179
coding in. See Coding shallow network in Keras
deep learning library in, 321–323
deep neural network in, 147–149
functional API, non-sequential architectures and, 251–256
implementing RNN, 242
intermediate-depth neural network in, 127–129
loading IMDb film reviews in, 225–226
parameter-adjustment in, 144
TensorBoard dashboard in, 152–154
weight initialization in, 132–135
Kernels (filters)
convolutional example of, 164–167
of convolutional layers, 160–162
number in convolutional layer, 162–163
size, convolutional filter hyperparameter, 167
Key concepts
artificial neural networks (ANNs), 110
artificial neurons that constitute ANNs, 97
deep reinforcement learning, 308–309
generative adversarial networks (GANs), 281–282
improving deep networks, 154–155
machine vision, 193
natural language processing (NLP), 256–257
training deep networks, 130
L1 vs. L2 regularization, reducing model overfitting, 141–142
Language. See Human and machine language
Language, computational representations of
localist vs. distributed representations, 32–33
one-hot representations of words, 25–26
overview of, 25
word2viz tool for exploring, 30–32
LASSO regression, reducing model overfitting, 141–142
Latent space
arithmetic on fake human faces in, 42–44
birth of generative adversarial networks, 40–41
Layers
deep learning model architectures, 51–52
Leaky ReLU activation function, 96
Learn Python the Hard Way (Shaw), 75
Learning rate
batch normalization allowing for higher, 139
building own project, 320
shortcomings of improving SGD with momentum, 146
as step size in gradient descent, 117–119
LeCun, Yann
on fabricating realistic images, 39
MNIST handwritten digits curated by, 76–78
Turing Award for deep learning, 15
Legg, Shane, 58
Lemmatization, as sophisticated alternative to stemming, 196
LeNet-5 model, 9–11, 171–176
Les 3 Brasseurs bar, 39
Levine, Sergey, 67
Libraries, deep learning, 321–324
Linear regression, object detection with R-CNN, 183–184
List comprehension
adding word stemming to, 201
removing stop words and punctuation, 200–201
load()
method, neural network model for DQN agent, 300
Loading data
coding shallow network in Keras, 79–81
load_weights()
method, loading model parameters, 232
Local minimum of cost, escaping, 122–124
Localist representations, distributed representations vs., 32–33
Long short-term memory (LSTM) cells
bidirectional (Bi-LSTMs), 247–248
implementing with Keras, 246–247
as layer of NLP, 53
Long-term memory, LSTM, 245–246
Lowercase
converting all characters in NLP to, 195–196, 199–200
processing full corpus, 204–206
LSTM. See Long short-term memory (LSTM) cells
LunarLander, OpenAI Gym environment, 70
Maaten, Laurens van der, 213–214
Machine art
arithmetic on fake human faces, 41–44
creating photorealistic images from text, 45–46
image processing using deep learning, 46–48
make your own sketches photorealistic, 45
overview of, 39
summary, 48
Machine language. See Human and machine language
Machine learning (ML). See also Traditional machine learning (ML) approach
overview of, 50
reinforcement learning problems of, 54–56
representation learning as branch of, 51
supervised learning problems of, 53–54
traditional machine learning vs. representation learning techniques, 22
unsupervised learning problems of, 54
Machine translation, NLP in, 23–24
Machine vision
AlexNet and VGGNet in Keras, 176–179
CNNs. See Convolutional neural networks (CNNs)
converting existing project, 316–317
datasets for deep learning image-classification models, 313–315
key concepts, 193
object recognition tasks, 52–53
Quick, Draw! game, 19
Software 2.0 and, 326
traditional machine learning approach, 12–13
Machine vision, applications of
capsule networks, 192
Fast R-CNN, 184
Mask R-CNN, 187
object detection, 183
overview of, 182
Magnetic resonance imaging (MRI), and visual cortex, 7–8
Manipulation of objects, 67–68
Markov decision process (MDP), 286–290
Mask R-CNN, image segmentation with, 186–187
Maas, Andrew, 203
matplotlib, weight initialization, 132
max
operation, pooling layers, 170
Max-pooling layers
AlexNet and VGGNet in Keras, 176–179
MaxPooling2D
dependency, LeNet-5 in Keras, 171–174
MCTS (Monte Carlo tree search) algorithm, 61, 66
MDP (Markov decision process), 286–290
Memory
batch size/stochastic gradient descent and, 119–122
DQN agent gameplay, 298
Software 2.0 and, 326
training DQN agent via replay of, 298–299
Metrics, SLM Lab performance, 305–306
Milestones, deep learning for NLP, 24–25
min_count
argument, word2vec, 210–211
Minibatches, splitting training data into, 119–122
ML. See Machine learning (ML)
MNIST handwritten digits
calculus for backpropagation, 337
coding shallow network in Keras, 76–78
computational complexity in dense networks, 160
Fashion-MNIST dataset deep learning project, 313–315
loading data for shallow net, 80–81
loss of two-dimensional imagery in dense networks, 159–160
reformatting data for shallow net, 81–83
schematic diagram of shallow network, 77–79
software dependencies for shallow net, 80
in stochastic gradient descent, 120
training deep networks with data augmentation, 145
Model generalization. See Overfitting, avoiding
Model optimization, agents beyond DQN using, 307
ModelCheckpoint()
object, dense sentiment classifier, 231–232
Modeling process, building own project, 318–321
Monte Carlo tree search (MCTS) algorithm, 61, 66
Morphemes, natural human language, 34
Morphology, natural human language, 34–35
most_similar()
method, word2vec, 212–213
Motion, detecting in visual cortex, 7–8
Mountain Car game, 316
MRI (magnetic resonance imaging), and visual cortex, 7–8
Müller, Vincent, 72
Multi ConvNet Sentiment Classifier Jupyter notebook, 320
MXNet, deep learning library, 324
n-dimensional spaces, 42–43, 339
Natural human language, elements of, 33–35
Natural language classification
dense network classifier architecture, 229–235
with familiar networks, 222
loading IMDb film reviews, 222–226
standardizing length of reviews, 228–229
Natural Language Preprocessing Jupyter notebook, 197
Natural language processing (NLP)
building own deep learning project, 315–316
building socially beneficial projects, 318
computational representations of. See Language, computational representations of
deep learning approaches to, 53
Google Duplex as deep-learning, 35–37
history of deep learning, 24–25
learning representations automatically, 22–23
natural human language elements of, 33–35
natural language classification in. See Natural language classification
networks designed for sequential data, 240–251
non-sequential architectures, 251–256
overview of, 195
preprocessing. See Preprocessing natural language data
Software 2.0 and, 326
summary, 256
transfer learning in, 251
word embedding with word2vec. See word2vec
n_components, plotting word vectors, 214
Negative rewards, reinforcement learning problems and, 56
Negative sampling, training word2vec, 208
Neocognitron
LeNet-5 advantages over, 13–14
Nesterov momentum optimizer, stochastic gradient descent, 146
Network architecture, regression model, 150–151
Network depth, as hyperparameter, 125–126
Neural Information Processing Systems (NIPS) conference, 41
Neural machine translation (NMT), seq2seq models, 250
Neural networks
building deep in PyTorch, 343–344
coding shallow in Keras, 83
Neuron saturation. See Saturated neurons
Neurons
AlexNet vs. LeNet-5, 17
behaviors of biological, 85–86
forming primary visual cortex, 4–7
regions processing visual stimuli in visual cortex, 7–8
TensorFlow Playground and, 17–19
tuning hidden-layer count and number of, 126
next_state, DQN agent gameplay, 298
NIPS (Neural Information Processing Systems) conference, 41
n_iter, plotting word vectors, 214
NLP. See Natural language processing (NLP)
NMT (neural machine translation), seq2seq models, 250
Noë, Alva, 39
Non-sequential model architecture, 251–256
Non-trainable params, model object, 109–110
Nonlinearity, of ReLU neuron, 95
Notation, formal neural network, 333–334
Number of epochs of training
as hyperparameter, 122
rule of thumb for learning rate, 119
stochastic gradient descent and, 119–122
training deep learning model, 83–84
NumPy, PyTorch compatibility with, 324
Object detection
with Fast R-CNN, 184
as machine vision application, 182–183
understanding, 183
Objective function, maximizing reward with, 290
Objects
recognition tasks of machine vision, 52–53
Occam’s razor, neuron count and, 126
On-device processing, machine learning for, 46–48
One-hot format
computational representations of language via, 25–26
converting integer labels to, 82–83
localist vs. distributed representations, 32–33
Online resources
building deep learning projects, 317–318
pretrained word vectors, 230
OpenAI Gym
building deep learning projects, 316
deep reinforcement learning, 68–70
interacting with environment, 300–303
Optimal policy
building neural network model for, 288–290
estimating optimal action via Q-learning, 290–292
Optimal Q-value (Q*), estimating, 291–292
Optimization
agents beyond DQN using, 306–307
fancy optimizers for stochastic gradient descent, 145–147
hyperparameter optimizers, 130, 303–306
minimizing cost via. See Cost, minimizing via optimization
stochastic gradient descent. See Stochastic gradient descent (SGD)
Output layer
artificial neural network with, 99
batch normalization and, 139
building network model for DQN agent, 298
calculus behind backpropagation, 335, 337
deep learning model architectures, 51–52
LSTM, 245
notation for neural networks, 334
schematic diagram of shallow network, 79
softmax layer for multiclass problems, 106–108
softmax layer of fast food-classifying network, 106–107
TensorFlow Playground demo, 100
Overfitting, avoiding
building your own project, 320
data augmentation, 145
Pac-Man
discount factor (decay) and, 288–289
DQN agent initialization and, 296
Padding
convolutional example of, 163–167
as convolutional filter hyperparameter, 167–168
standardizing length of IMDb film reviews, 228–229
Parameter initialization, building own project, 319
Parameters. See also Hyperparameters
Cart-Pole DQN agent initialization, 295–297
creating dense network classifier architecture, 230–232
escaping local minimum, 122–124
gradient descent minimizing cost across multiple, 116–117
pooling layers reducing overall, 169–170
saving model, 300
weight initialization, 132–135
Parametric ReLU activation function, 96
Partial-derivative calculus, cross-entropy cost, 114–115
Patches, in convolutional layers, 160
PCA (principal component analysis), 213
Perceptrons
choosing, 96
hot dog/not hot dog detector example, 86–90
modern neurons vs., 91
as most important equation in this book, 90–91
overview of, 86
Performance
hyperparameter optimization using SLM Lab, 303–306
Software 2.0 and, 326
PG. See Policy gradient (PG) algorithm
Phonemes, natural human language and, 34
Phonology, natural human language and, 34–35
Photorealistic images, creating. See Machine art
Phraser()
method, NLP, 202–203, 204–205
Phrases()
method, NLP, 202–203, 204–205
pix2pix web application, 45–46
Pixels
computational complexity and, 160
converting integers to floats, 82
convolutional example of, 163–167
convolutional layers and, 160–162
handwritten MNIST digits as, 77–78
kernel size hyperparameter of convolutional filters, 167
reformatting data for shallow net, 81–83
schematic diagram of shallow network, 78–79
two-dimensional imagery and, 159–160
Plotting
GAN training accuracy, 281
Policy function (π), discounted future reward, 288–290
Policy gradient (PG) algorithm
actor-critic using Q-learning with, 307–308
in deep reinforcement learning, 68
REINFORCE algorithm as, 307
Policy networks, AlphaGo, 61
Policy optimization
agents beyond DQN using, 307
building neural network model for, 288–290
estimating optimal action via Q-learning, 290–292
RL agent using actor-critic with, 307–308
Positive rewards, deep reinforcement learning, 56, 57
Prediction
selecting action for DQN agent, 300
training dense sentiment classifier, 232
training DQN agent via memory replay, 299
word2vec using predictive models, 208
Preprocessing natural language data
converting all characters to lowercase, 199–200
removing stop words and punctuation, 200–201
stemming, 201
Principal component analysis (PCA), 213
Probability distribution, Markov decision processes, 288
Processing power, AlexNet vs. LeNet-5, 16–17
Project Gutenberg. See Preprocessing natural language data
Punctuation
processing full corpus, 204–206
Python, for example code in this book, 75–76
PyTorch
building deep neural network in, 343–344
deep learning library, 323–324
installation, 341
Q-learning networks
actor-critic combining PG algorithms with, 307–308
DQNs. See Deep Q-learning networks (DQNs)
Q-value functions
agents beyond DQN optimizing, 306
drawbacks of DQN agents, 306
training DQN agent via memory replay, 299
Quake III Arena, DeepMind Lab built on, 69
Quick, Draw! game
for hundreds of machine-drawn sketches, 48
introduction to deep learning, 19
R-CNN
Fast R-CNN, 184
object detection application, 183–184
RAM (memory), batch size/stochastic gradient descent and, 119–122
rand
function, DQN agent action selection, 299–300
randrange
function, DQN agent action selection, 300
Rectified linear unit neurons. See ReLU (rectified linear unit) neurons
Recurrent neural networks (RNNs)
LSTM cell as layer of NLP in, 53
stacked recurrent models, 248–250
Reformatting data, coding shallow network, 81–83
Regions of interest (ROIs)
developing Faster R-CNN, 184–185
image segmentation with Mask R-CNN, 187
object detection with Fast R-CNN, 184
object detection with R-CNN, 183–184
Regression, improving deep networks, 149–152
REINFORCE algorithm, agents beyond DQN using, 307
Reinforcement learning
building socially beneficial projects, 318
overview of, 49
problems of machine learning, 54–56
as sequential decision-making problems, 284
Reinforcement Learning: An Introduction (Sutton and Barto), 292
Reinforcement learning, deep
board games. See Board games
building own project. See Deep learning projects, building own
essential theory of deep Q-learning networks, 290–292
essential theory of reinforcement learning, 283–286
game-playing applications. See Game-playing machines
hyperparameter optimization with SLM Lab, 303–306
interacting with OpenAI Gym environment, 300–303
manipulation of objects, 67–68
Markov decision processes, 286–288
popular learning environments for, 68–71
summary, 308
ReLU (rectified linear unit) neurons
with Glorot distributions, 136–137
neural network model for DQN agent, 297
as preferred neuron type, 96
TensorFlow Playground demo, 100
Representation learning, 22, 51
requires_grad
argument, PyTorch, 342
Residual networks (ResNets), 180–182
Resources, building deep learning projects, 317–318
return_sequences=True, stacking recurrent layers, 248
Reward(s)
deep Q-learning network theory, 290–292
DeepMind DQN and, 59
DQN agent gameplay, 298
Markov decision processes (MDPs), 287–289
reinforcement learning problems and, 56
theory of reinforcement learning, 283
training DQN agent via memory replay, 298–299
Ridge regression, reducing model overfitting, 141–142
RMSProp optimizer, 147
ROC AUC metric
as area under ROC curve, 217–218
for sentiment classifier model architectures, 256
ROIs. See Regions of interest (ROIs)
Round of training, stochastic gradient descent, 120–121
Running time, Software 2.0 and, 325
Sabour, Sara, 192
Saturated neurons
as flaw in calculating quadratic cost, 112–113
minimizing impact using cross-entropy cost, 113–115
reducing with cross-entropy cost and weight initialization, 131–135
weight initialization, Glorot normal distribution, 136
Saving model parameters, 300
Schematic diagram
activation values in feature map of convolutional layer, 164
coding shallow network in Keras, 77–79
of discriminator network, 268
of generator network, 270
of LSTM, 245
of recurrent neural network, 241
wide and deep modeling, 317
Schmidhuber, Jürgen, 244
Search, automating hyperparameter, 321
Sedol, Lee, 62
See-in-the-Dark dataset, image processing, 47–48
Semantics, natural human language and, 34–35
sentences
argument, word2vec, 210
Sentiment classifier
LSTM architecture, 247
non-sequential architecture example, 251–255
performance of model architectures, 256
seq2seq (sequence-to-sequence), and attention, 250
Sequential decision-making problems, 284
Sequential model, building for DQN agent, 297–298
sg
argument, word2vec, 210
SG (skip-gram) architecture, 207, 208
SGD. See Stochastic gradient descent (SGD)
Shadow Dexterous Hand, OpenAI Gym, 70
Shallow network
coding. See Coding shallow network in Keras
intermediate-depth neural network in, 127–129
Short-term memory, LSTM, 245–246
Sigmoid Function Jupyter notebook, 105
Sigmoid neuron(s)
for binary classification problems, 100–101, 105–106
choosing, 96
for shallow net in Keras, 79, 83
softmax function with single neuron equivalent to using, 108
weight initialization and, 133–137
Similarity score, running word2vec, 212–213
Simple neurons
forming primary visual cortex, 6–7
SimpleRNN()
layer, RNN sentiment classifier, 243
size
argument, word2vec, 210
Skip-gram (SG) architecture, 207, 208
Socially beneficial projects, deep learning projects, 318
Softmax layer, fast food-classifying network, 106–108
Softmax probability output, Fast R-CNN, 184
Software dependencies, shallow net in Keras, 80
Software 2.0, deep learning models, 324–326
Speech recognition, NLP in, 24
Spell-checkers, 24
Squared error, as quadratic cost, 112
Stacked recurrent models, 248–250
StackGAN, photorealistic images from text, 45–46
State(s)
deep Q-learning network theory and, 290–292
DeepMind DQN and, 58
DQN agent, remembering gameplay, 298
Markov decision processes and, 286
optimal policy in deep reinforcement learning and, 289–290
reinforcement learning problems and, 56
reinforcement learning via Cart-Pole game and, 286
theory of reinforcement learning, 284
Static scatterplot, plotting word vectors, 214–216
Stemming, word
overview of, 201
preprocessing natural language via, 196
Stochastic gradient descent (SGD)
escaping local minimum of cost via, 122–124
training deep networks using batch size and, 119–124
Stop words
how to remove, 200
Stride length
as convolutional filter hyperparameter, 167
reducing computational complexity, 170
Suleyman, Mustafa, 58
Supervised learning problems, machine learning, 53–54
Support vector machines, R-CNN, 183–184
Sutton, Richard, 292
Tanh neurons
activation function of, 94
choosing, 96
with Glorot distributions, 136–137
Target word
converting natural words to word vectors, 27–28
Tensor processing units (TPUs), Google training neural networks, 64
TensorBoard dashboard, 152–154
TensorFlow Playground, 17–19, 100
Tensors, PyTorch
automatic differentiation in, 342–343
building deep neural network, 343–344
compatibility with NumPy operations, 324
Terminal state, theory of reinforcement learning, 284
Text, creating photorealistic images from, 45–46
Text-to-speech (TTS) engine, Google Duplex, 36–37
Theano, deep learning library, 324
Theory, essential
of deep Q-learning networks, 290–292
of reinforcement learning, 283–284
Threshold value, perceptron equation, 89–91
Tokenization
natural human language and, 35–36
preprocessing natural language, 195, 197–199
Torch, PyTorch as extension of, 323–324
torch.nn.NLLLoss()
function, PyTorch, 344
TPUs (tensor processing units), Google training neural networks, 64
Traditional machine learning (ML) approach
deep learning approach vs., 11–12
entrants into ILSVRC using, 14–15
natural human language in, 33–35
one-hot encoding of words in, 25–26
train()
method
training DQN agent, 299
Training
AlphaGo vs. AlphaGo Zero, 63–65
Training deep networks
batch size and stochastic gradient descent, 119–122
coding shallow network in Keras, 83–84
convolutional sentiment classifier, 238
data augmentation for, 145
deep neural network in Keras, 147–149
dense sentiment classifier, 232
escaping local minimum, 122–124
generative adversarial networks (GANs), 259–262, 275–281
intermediate-depth neural network, 128–129
intermediate net in Keras, 127–129
key concepts, 130
minimizing cost via optimization, 115
overview of, 111
preventing overfitting with dropout, 142–145
recurrent neural networks (RNNs), 241
running word2vec, 208
transfer learning model of, 188–192
tuning hidden-layer and neuron counts, 125–126
via memory replay for DQN agent, 298–299
Transfer learning
natural language and, 230
in NLP, 251
Truncation, standardizing film review length, 228–229
TSNE()
method, plotting word vectors, 214–216
TTS (text-to-speech) engine, Google Duplex, 36–37
Two-dimensional images, flattening to one dimension, 82
Two-dimensional structure of visual imagery
retaining in convolutional layers, 167
retaining using LeNet-5 in Keras, 172
U-Net, image segmentation, 187–188
ULMFiT (universal language model fine-tuning), transfer learning, 251
United States Postal Service, LeNet-5 reading ZIP codes, 11
Unity ML-Agents plug-in, 71, 304
Unstable gradients, improving deep networks, 137–139
Unsupervised learning problems, machine learning, 54
Value functions, Q-learning, 291–292
Value networks, AlphaGo algorithm, 61
Value optimization
agents beyond DQN using, 306
RL agent using actor-critic algorithm and, 307–308
Vanishing gradient problem
in artificial neural networks, 137–138
performance degradation in deep CNNs, 179–180
Vector space
embeddings. See Word vectors
latent space similarities to, 42–43
word meaning represented by three dimensions, 27–29
Visual imagery, two-dimensional structure of, 159–160
Visual perception, 3–8
WaveNet, Google Duplex TTS engine, 36–37
Weight initialization, 131–137
Weighted sum, perceptron algorithm, 86–89
Weight(s)
backpropagation and, 125, 335–337
convolutional example of, 163–167
of kernels in convolutional layers, 160–162
minimizing cost via gradient descent, 115–116
notation for neural networks, 334
Wide and deep modeling approach, Google, 317
Wiesel, Torsten
LeNet-5 model built on work of, 10–12
machine vision using work of, 8–9
research on visual cortex, 4–7
window
argument, word2vec, 210
Wittgenstein, Ludwig, 21
Word embeddings. See Word vectors
Word vectors. See also word2vec
capturing word meaning, 195
computational representations. See Language, computational representations of
convolutional filters detecting triplets of, 239
evaluating, 209
localist vs. distributed representations, 32–33
in NLP. See Natural language processing (NLP)
online pretrained, 230
training on natural language data, 229–230
word2viz tool for exploring, 30–32
word2vec
converting natural words to word vectors, 28
essential theory behind, 206–209
evaluating word vectors, 209
fastText as leading alternative to, 209
plotting word vectors, 213–217
word embeddings, 206
Words
creating embeddings with word2vec. See word2vec
natural human language and, 33–35
preprocessing natural language. See Preprocessing natural language data
word_tokenize()
method, natural language, 199
workers
argument, word2vec, 211
Xavier Glorot distributions, improving deep networks, 135–137
Yelp review polarity, 316
Zhang, Xiang, 315