Figures

P.1     Santiago Cajal (1852–1934)

P.2     A hand-drawn diagram from Cajal’s (1894) publication showing the growth of a neuron (a–e) and contrasting neurons from frog (A), lizard (B), rat (C), and human (D) samples

P.3     The reading trilobite enjoys expanding your knowledge

P.4     This trilobite calls attention to tricky passages of text

1.1     The number of species on our planet began to increase rapidly 550 million years ago, during the prehistoric Cambrian period

1.2     A bespectacled trilobite

1.3     The Nobel Prize-winning neurophysiologists Torsten Wiesel (left) and David Hubel

1.4     Hubel and Wiesel used a light projector to present slides to anesthetized cats while they recorded the activity of neurons in the cats’ primary visual cortex

1.5     A simple cell in the primary visual cortex of a cat fires at different rates, depending on the orientation of a line shown to the cat

1.6     A caricature of how consecutive layers of biological neurons represent visual information in the brain of, for example, a cat or a human

1.7     Regions of the visual cortex

1.8     Abridged timeline of biological and machine vision, highlighting the key historical moments in the deep learning and traditional machine learning approaches to vision that are covered in this section

1.9     Paris-born Yann LeCun is one of the preeminent figures in artificial neural network and deep learning research

1.10   Yoshua Bengio is another of the leading characters in artificial neural networks and deep learning

1.11   LeNet-5 retains the hierarchical architecture uncovered in the primary visual cortex by Hubel and Wiesel and leveraged by Fukushima in his neocognitron

1.12   Feature engineering—the transformation of raw data into thoughtfully derived input variables—often predominates in the application of traditional machine learning algorithms

1.13   Engineered features leveraged by Viola and Jones (2001) to detect faces reliably

1.14   The hulking ImageNet dataset was the brainchild of Chinese-American computer science professor Fei-Fei Li and her colleagues at Princeton in 2009

1.15   Performance of the top entrants to the ILSVRC by year

1.16   The eminent British-Canadian artificial neural network pioneer Geoffrey Hinton, habitually referred to as “the godfather of deep learning” in the popular press

1.17   AlexNet’s hierarchical architecture is reminiscent of LeNet-5, with the first (left-hand) layer representing simple visual features like edges, and deeper layers representing increasingly complex features and abstract concepts

1.18   This deep neural network is ready to learn how to distinguish a spiral of orange dots (negative cases) from blue dots (positive cases) based on their position on the X1 and X2 axes of the grid on the right

1.19   The network after training

2.1     Venn diagram that distinguishes the traditional family from the representation learning family of machine learning techniques

2.2     NLP sits at the intersection of the fields of computer science, linguistics, and artificial intelligence

2.3     Milestones involving the application of deep learning to natural language processing

2.4     One-hot encodings of words, such as this example, predominate in the traditional machine learning approach to natural language processing

2.5     A toy-sized example for demonstrating the high-level process behind techniques like word2vec and GloVe that convert natural language into word vectors

2.6     Diagram of word meaning as represented by a three-dimensional vector space

2.7     Examples of word-vector arithmetic

2.8     The default screen for word2viz, a tool for exploring word vectors interactively

2.9     Relationships between the elements of natural human language

3.1     High-level schematic of a generative adversarial network (GAN)

3.2     Results presented in Goodfellow and colleagues’ 2014 GAN paper

3.3     An example of latent-space arithmetic from Radford et al. (2016)

3.4     A cartoon of the latent space associated with generative adversarial networks (GANs)

3.5     Photos converted into the styles of well-known painters by CycleGANs

3.6     A mutant three-eyed cat (right-hand panel) synthesized via the pix2pix web application

3.7     Photorealistic high-resolution images output by StackGAN, which involves two GANs stacked upon each other

3.8     A sample image (left) processed using a traditional pipeline (center) and the deep learning pipeline by Chen et al. (right)

3.9     Novel “hand drawings” of apples produced by the GAN architecture we develop together in Chapter 12

4.1     Venn diagram showing the relative positioning of the major concepts covered over the course of this book

4.2     Generalization of deep learning model architectures

4.3     The reinforcement learning loop

4.4     Demis Hassabis cofounded DeepMind in 2010 after completing his PhD in cognitive neuroscience at University College London

4.5     The normalized performance scores of Mnih and colleagues’ (2015) DQN relative to a professional game tester: 0 percent represents random play, and 100 percent represents the pro’s best performance

4.6     The Go board game

4.7     David Silver is a Cambridge- and Alberta-educated researcher at Google DeepMind

4.8     The Elo score of AlphaGo (blue) relative to Fan Hui (green) and several Go programs (red)

4.9     Comparing Elo scores between AlphaGo Zero and other AlphaGo variations or other Go programs

4.10   Comparing Elo scores between AlphaZero and each of its opponents in chess, shogi, and Go

4.11   Chelsea Finn is a doctoral candidate at the University of California, Berkeley, in its AI Research Lab

4.12   Sample images from Levine, Finn, et al. (2016) exhibiting various object-manipulation actions the robot was trained to perform

4.13   A sampling of OpenAI Gym environments

4.14   A DeepMind Lab environment, in which positive-reward points are awarded for capturing scrumptious green apples

5.1     A sample of a dozen images from the MNIST dataset

5.2     The Danish computer scientist Corinna Cortes is head of research at Google’s New York office

5.3     Each handwritten MNIST digit is stored as a 28×28-pixel grayscale image

5.4     A rough schematic of the shallow artificial-neural-network architecture we’re whipping up in this chapter

5.5     The first MNIST digit in the validation dataset (X_valid[0]) is a seven

6.1     The anatomy of a biological neuron

6.2     The American neurobiology and behavior researcher Frank Rosenblatt

6.3     Schematic diagram of a perceptron, an early artificial neuron

6.4     First example of a hot dog-detecting perceptron: In this instance, it predicts there is indeed a hot dog

6.5     Second example of a hot dog-detecting perceptron: In this instance, it predicts there is not a hot dog

6.6     Third example of a hot dog-detecting perceptron: In this instance, it again predicts the object presented to it is a hot dog

6.7     The general equation for artificial neurons that we will return to time and again

6.8     The perceptron’s transition from outputting zero to outputting one happens suddenly, making it challenging to gently tune w and b to match a desired output

6.9     The sigmoid activation function

6.10   The tanh activation function

6.11   The ReLU activation function

7.1     A dense network of artificial neurons, highlighting the inputs to the neuron labeled a1

7.2     Our hot dog-detecting network from Figure 7.1, now highlighting the activation output of neuron a1, which is provided as an input to both neuron a4 and neuron a5

7.3     Our hot dog-detecting network, with the activations providing input to the output neuron ŷ highlighted

7.4     Our food-detecting network, now with three softmax neurons in the output layer

7.5     A summary of the model object from our Shallow Net in Keras Jupyter notebook

8.1     Plot reproducing the tanh activation function shown in Figure 6.10, drawing attention to the high and low values of z at which a neuron is saturated

8.2     A trilobite using gradient descent to find the value of a parameter p associated with minimal cost, C

8.3     A trilobite exploring along two model parameters—p1 and p2—in order to minimize cost via gradient descent

8.4     The learning rate (η) of gradient descent expressed as the size of a trilobite

8.5     An individual round of training with stochastic gradient descent

8.6     An outline of the overall process for training a neural network with stochastic gradient descent

8.7     A trilobite applying vanilla gradient descent from a random starting point (top panel) is ensnared by a local minimum of cost (middle panel)

8.8     The speed of learning over epochs of training for a deep learning network with five hidden layers

8.9     A summary of the model object from our Intermediate Net in Keras Jupyter notebook

8.10   The performance of our intermediate-depth neural network over its first four epochs of training

9.1     Histogram of the a activations output by a layer of sigmoid neurons, with weights initialized using a standard normal distribution

9.2     Histogram of the a activations output by a layer of sigmoid neurons, with weights initialized using the Glorot normal distribution

9.3     The activations output by a dense layer of 256 neurons, while varying activation function (tanh or ReLU) and weight initialization (standard normal or Glorot uniform)

9.4     Batch normalization transforms the distribution of the activations output by a given layer of neurons toward a standard normal distribution

9.5     Fitting y given x using models with varying numbers of parameters

9.6     Dropout, a technique for reducing model overfitting, involves the removal of randomly selected neurons from a network’s hidden layers in each round of training

9.7     Our deep neural network architecture peaked at a 97.87 percent validation accuracy following 15 epochs of training, besting the accuracy of our shallow and intermediate-depth architectures

9.8     The TensorBoard dashboard enables you to, epoch over epoch, visually track your model’s cost (loss) and accuracy (acc) across both your training data and your validation (val) data

10.1   When reading a page of a book written in English, we begin in the top-left corner and read to the right

10.2   A 3×3 kernel and a 3×3-pixel window

10.3   This schematic diagram demonstrates how the activation values in a feature map are calculated in a convolutional layer

10.4   A continuation of the convolutional example from Figure 10.3, now showing the activation for the next filter position

10.5   Finally, the activation for the last filter position has been calculated, and the activation map is complete

10.6   A graphical representation of the input array (left; represented here is a three-channel RGB image of size 32×32 with the kernel patch currently focused on the first—i.e., top-left—position) and the activation map (right)

10.7   An example of a max-pooling layer being passed a 4×4 activation map

10.8   A summary of our LeNet-5-inspired ConvNet architecture

10.9   Our LeNet-5-inspired ConvNet architecture peaked at a 99.27 percent validation accuracy following nine epochs of training, thereby outperforming the accuracy of the dense nets we trained earlier in the book

10.10 A general approach to CNN design: A block (shown in red) of convolutional layers (often one to three of them) and a pooling layer is repeated several times

10.11 A schematic representation of a residual module

10.12 Shown at left is the conventional representation of residual blocks within a residual network

10.13 These are examples of various machine vision applications

10.14 These are examples of object detection (performed on four separate images by the Faster R-CNN algorithm)

10.15 This is an example of image segmentation (as performed by the Mask R-CNN algorithm)

10.16 With convolutional neural networks, which are agnostic to the relative positioning of image features, the figure on the left and the one on the right are equally likely to be classified as Geoff Hinton’s face

11.1   The second sentence of Jane Austen’s classic Emma tokenized to the word level

11.2   A dictionary of bigrams detected within our corpus

11.3   Sample of a dictionary of bigrams detected within our lowercased and punctuation-free corpus

11.4   Sample of a more conservatively thresholded dictionary of bigrams

11.5   Clean, preprocessed sentence from the Project Gutenberg corpus

11.6   The location of the token “dog” within the 64-dimensional word-vector space we generated using a corpus of books from Project Gutenberg

11.7   This is a Pandas DataFrame containing a two-dimensional representation of the word-vector space we created from the Project Gutenberg corpus

11.8   Static two-dimensional word-vector scatterplot

11.9   Interactive bokeh two-dimensional word-vector plot

11.10 Clothing words from the Project Gutenberg corpus, revealed by zooming in to a region of the broader bokeh plot from Figure 11.9

11.11 The (orange-shaded) area under the curve of the receiver operating characteristic, determined using the TPRs and FPRs from Table 11.5

11.12 The first two film reviews from the training dataset of Andrew Maas and colleagues’ (2011) IMDb dataset

11.13 The first film review from the training dataset, now shown as a character string

11.14 The first film review from the training dataset, now shown in full as a character string

11.15 The sixth film review from the training dataset, padded with the PAD token at the beginning so that—like all the other reviews—it has a length of 100 tokens

11.16 Dense sentiment classifier model summary

11.17 Training the dense sentiment classifier

11.18 Histogram of validation data ŷ values for the second epoch of our dense sentiment classifier

11.19 DataFrame of y and ŷ values for the IMDb validation data

11.20 An example of a false positive: This negative review was misclassified as positive by our model

11.21 An example of a false negative: This positive review was misclassified as negative by our model

11.22 Convolutional sentiment classifier model summary

11.23 Training the convolutional sentiment classifier

11.24 Histogram of validation data ŷ values for the third epoch of our convolutional sentiment classifier

11.25 Schematic diagram of a recurrent neural network

11.26 Schematic diagram of an LSTM

11.27 A non-sequential model architecture: Three parallel streams of convolutional layers—each with a unique filter length (k = 2, k = 3, or k = 4)—receive input from a word-embedding layer

12.1   Highly simplified schematic diagrams of the two models that make up a typical GAN: the generator (left) and the discriminator (right)

12.2   This is an outline of the discriminator training loop

12.3   An outline of the generator training loop

12.4   Example of sketches drawn by humans who have played the Quick, Draw! game

12.5   The directory structure inside the Docker container that is running Jupyter

12.6   This example bitmap is the 4,243rd sketch from the apple category of the Quick, Draw! dataset

12.7   A schematic representation of our discriminator network for predicting whether an input image is real (in this case, a hand-drawn apple from the Quick, Draw! dataset) or fake (produced by an image generator)

12.8   A schematic representation of our generator network, which takes in noise (in this case, representing 32 latent-space dimensions) and outputs a 28×28-pixel image

12.9   Shown here is a summary of the whole adversarial network

12.10 Fake apple sketches generated after 100 epochs of training our GAN

12.11 Fake apple sketches after 200 epochs of training our GAN

12.12 Fake apple sketches after 1,000 epochs of training our GAN

12.13 GAN training loss over epochs

12.14 GAN training accuracy over epochs

13.1   The objective of the Cart-Pole game is to keep the brown pole balanced upright on top of the black cart for as long as possible

13.2   The Cart-Pole game ends early if the pole falls toward horizontal or the cart is navigated off-screen

13.3   The reinforcement learning loop (top; a rehashed version of Figure 4.3, provided again here for convenience) can be considered a Markov decision process, which is defined by the five components S, A, R, ℙ, and γ (bottom)

13.4   Based on the discount factor γ, in a Markov decision process more-distant reward is discounted relative to reward that’s more immediately attainable

13.5   The policy function π enables an agent to map any state s (from the set of all possible states S) to an action a from the set of all possible actions A

13.6   The biggest star in the field of reinforcement learning, Richard Sutton has long been a computer science professor at the University of Alberta

13.7   As in Figure 13.4, here we use the Pac-Man environment (with a green trilobite representing a DQN agent in place of the Mr. Pac-Man character) to illustrate a reinforcement learning concept

13.8   The performance of our DQN agent during its first 10 episodes playing the Cart-Pole game

13.9   The performance of our DQN agent during its final 10 episodes playing the Cart-Pole game

13.10 An experiment run with SLM Lab, investigating the impact of various hyperparameters (e.g., hidden-layer architecture, activation function, learning rate) on the performance of a DQN agent within the Cart-Pole environment

13.11 The broad categories of deep reinforcement learning agents

13.12 The actor-critic algorithm combines the policy gradient approach to reinforcement learning (playing the role of actor) with the Q-learning approach (playing the role of critic)

14.1   Following our pixel-by-pixel rendering of an MNIST digit (Figure 5.3), this is an example of an image from the Fashion-MNIST dataset

14.2   The wide and deep model architecture concatenates together inputs from two separate legs

14.3   A plot of training loss (red) and validation loss (blue) over epochs of model training

14.4   A strictly structured grid search (shown in the left-hand panel) is less likely to identify optimal hyperparameters for a given model than a search over values that are sampled randomly over the same hyperparameter ranges (right-hand panel)

14.5   The relative Google search frequency (from October 2015 to February 2019) of five of the most popular deep learning libraries

14.6   Andrej Karpathy is the director of AI at Tesla, the California-based automotive and energy firm

14.7   Trilobite waving good-bye
