© Poornachandra Sarang 2021
P. Sarang, Artificial Neural Networks with TensorFlow 2, https://doi.org/10.1007/978-1-4842-6150-7_13

13. Image Generation

Poornachandra Sarang, Mumbai, India

Did you ever imagine that neural networks could generate complex color images? Anime characters, the faces of celebrities, even a bedroom? All of these are possible with one of the most interesting ideas in neural networks: Generative Adversarial Networks (GANs). The idea was introduced by Ian J. Goodfellow and his colleagues in 2014. The images created by a well-trained GAN can look so real that it becomes hard to tell a fake image from a real one. Be warned: generating complex images of this nature requires a lot of resources to train the network, but it certainly works, as you will see in this chapter. So let us look at what a GAN is.

GAN – Generative Adversarial Network

In a GAN, there are two neural network models which are trained simultaneously by an adversarial process. One network is called a generator, and the other is called a discriminator. The generator (the artist) learns to create images that look real. The discriminator (the critic) learns to tell real images apart from the fakes. So these are two competing models which try to beat each other. At the end, if you are able to train the generator to outperform the discriminator, you have achieved your goal.

How Does GAN Work?

As I said, a GAN consists of two networks. Training a GAN has two parts:

  1. Keeping the generator idle, train the discriminator. Train the discriminator on real images for a number of epochs and see if it can correctly predict them as real. In the same training phase, train the discriminator on the fake images (generated by the generator) and see if it can predict them as fake.

  2. Keeping the discriminator idle, train the generator. Use the prediction results made by the discriminator on the fake images to improve upon those images.

The preceding steps are repeated for a large number of epochs, and the results (fake images) are examined manually to see if they look real. If they do, stop the training. If not, continue the preceding two steps until the fake image looks real.

The whole process is depicted in Figure 13-1.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig1_HTML.jpg
Figure 13-1

GAN working schematic

The Generator

The generator schematic is shown in Figure 13-2.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig2_HTML.jpg
Figure 13-2

Generator architecture

The generator takes a random noise vector of some dimension. We will be using a dimension of 100 in our examples. From this random vector, it generates an image of 64x64x3. The image is upscaled step by step through a series of transposed convolutional layers. Each transposed convolutional layer is followed by a batch normalization and a leaky ReLU. The leaky ReLU attempts to fix the “dying ReLU” problem: for negative inputs, instead of outputting 0, it outputs the input scaled by a small slope of 0.01 or so, so the gradient never becomes exactly zero. We use strides in each convolutional layer, instead of pooling or fixed upsampling layers, to avoid unstable training.
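The difference between ReLU and leaky ReLU can be seen on a handful of numbers. The following is a minimal sketch in plain Python, not the Keras layer itself; the function names are illustrative, and the slope of 0.01 is the value quoted in the text. The Keras LeakyReLU layer has its own default slope, which may differ, so check the value used by your installed version.

```python
def relu(x):
    """Standard ReLU: every negative input is clamped to zero."""
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: negative inputs are scaled by a small slope alpha
    (0.01 here is the illustrative value quoted in the text)."""
    return x if x > 0 else alpha * x


# Compare the two activations on a few sample inputs.
for value in (-2.0, -0.5, 0.0, 1.5):
    print(value, relu(value), leaky_relu(value))
```

For negative inputs, ReLU returns exactly 0 while the leaky version keeps a small negative output, which is what keeps a gradient flowing through the unit.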

Note how, at each convolution, the image is upscaled to create a final image of 64x64x3.

The Discriminator

The discriminator schematic is shown in Figure 13-3.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig3_HTML.jpg
Figure 13-3

Discriminator architecture

The discriminator also uses convolutional layers, which downsize the given image step by step before it is classified as real or fake.

Mathematical Formulation

The working of GAN can be formulated with the simple mathematical equation given here:
$$ \min_{G}\max_{D} V\left(D,G\right)=\min_{G}\max_{D}\left(\mathbb{E}_{x\sim p_{data}(x)}\left[\log D(x)\right]+\mathbb{E}_{z\sim p_{z}(z)}\left[\log\left(1-D\left(G(z)\right)\right)\right]\right) $$

where G represents the generator and D the discriminator. The p_data(x) represents the distribution of real data and p_z(z) the distribution of the random noise fed to the generator. The x represents a sample from the real data and z a sample of noise; G(z) is the generated (fake) image produced from z. The D(x) is the discriminator's output for a real image, that is, its estimate of the probability that x is real, and D(G(z)) is its output for a generated image.

The discriminator loss while training on the real data is expressed as
$$ L\left(D(x),1\right)=\log\left(D(x)\right) $$
(1)

The discriminator loss while training on the fake data coming from the generator is expressed as

$$ L\left(D\left(G(z)\right),0\right)=\log\left(1-D\left(G(z)\right)\right) $$
(2)

For real data, the discriminator prediction should be close to 1. Thus, equation 1 should be maximized in order to get D(x) close to 1. Note that the log is an increasing function, so log(D(x)) grows as D(x) grows.

For the second equation, the discriminator prediction on the generated fake data should be close to zero. In order to maximize the second equation, we have to drive the value of D(G(z)) toward zero. Thus, we need to maximize both quantities for the discriminator. The total for the discriminator is the sum of the two terms given by equations 1 and 2, so the combined total will also be maximized.

The generator loss is expressed as
$$ L^{(G)}=\min\left[\log\left(D(x)\right)+\log\left(1-D\left(G(z)\right)\right)\right] $$
(3)

We need to minimize this loss during the generator training.
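To get a feel for the numbers, the value function can be evaluated by hand for a single real sample and a single fake sample. This is only an illustrative sketch: the function name and the probabilities 0.99, 0.5, and 0.01 are made-up values, not outputs of any trained model.

```python
import math


def value_function(d_real, d_fake):
    """V(D, G) for one real and one fake sample.

    d_real stands for D(x) and d_fake for D(G(z)); both are
    probabilities strictly between 0 and 1.
    """
    return math.log(d_real) + math.log(1.0 - d_fake)


# Discriminator is right on both samples: V is close to its maximum of 0.
confident = value_function(0.99, 0.01)
# Discriminator cannot tell real from fake: V equals 2 * log(0.5).
undecided = value_function(0.5, 0.5)
# Generator fools the discriminator: the second term becomes very negative.
fooled = value_function(0.99, 0.99)

print(confident, undecided, fooled)
```

The discriminator pushes the value up toward 0, while the generator pushes it down by making D(G(z)) approach 1.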

Digit Generation

In this project, we will use the popular MNIST dataset that ships with Keras. As you know, the dataset consists of images of handwritten digits. We will create a GAN model to create additional images that look similar to these images. This could be helpful to somebody who wants to increase the size of a training dataset in future developments.

Creating Project

Create a Colab project and rename it to DigitGen-GAN. Import the required libraries.
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import time
from tensorflow import keras
from tensorflow.keras import layers
import os

Loading Dataset

You load the datasets into your code using the following statement:
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.mnist.load_data()
The training and testing datasets are loaded into separate numpy arrays. As we are interested in generating a single digit, say 9, we will extract all images containing 9 from the train dataset.
digit9_images = []
for i in range(len(train_images)):
    if train_labels[i] == 9:
        digit9_images.append(train_images[i])
train_images = np.array(digit9_images)
train_images.shape

The shape is (5949, 28, 28). Thus, we have 5949 images of size 28x28. This is a reasonably large collection. Our model will try to produce images of this size, matching the looks of these images.

You can verify that you have images only for digit 9 by displaying a few of them using the following code:
n = 10
f = plt.figure()
for i in range(n):
    f.add_subplot(1, n, i + 1)
    plt.subplot(1, n, i+1 ).axis("off")
    plt.imshow(train_images[i])
plt.show()
You will see the output in Figure 13-4.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig4_HTML.jpg
Figure 13-4

Sample images

We will now prepare the dataset for training.

Preparing Dataset

First, we reshape the data using the following statement, which adds a channel dimension of 1 (grayscale) and converts the pixels to floats; note that the image size in our dataset is 28x28 pixels.
train_images = train_images.reshape(
    train_images.shape[0], 28, 28, 1).astype('float32')

As each pixel value in the image is in the range of 0 through 255, we normalize these values to the scale of –1 to 1 for better learning. The midpoint of the range is 127.5, and thus the following equation will normalize the values in the range –1 to +1, which matches the range of the tanh activation used in the last layer of the generator. You could alternatively have divided by 255 to normalize the values between 0 and 1.

train_images = (train_images - 127.5) / 127.5
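The scaling can be checked on individual pixel values. The sketch below uses plain Python numbers rather than the image array; the helper names normalize and denormalize are illustrative and do not appear in the project code.

```python
def normalize(pixel):
    """Map a pixel value in [0, 255] to the range [-1, 1]."""
    return (pixel - 127.5) / 127.5


def denormalize(value):
    """Inverse mapping: take a value in [-1, 1] back to [0, 255]."""
    return value * 127.5 + 127.5


print(normalize(0), normalize(127.5), normalize(255))
print(denormalize(normalize(200)))
```

The inverse mapping is the same expression that is applied later in the chapter when the generated images are displayed.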
We create a batch dataset for training by calling the from_tensor_slices method.
train_dataset = tf.data.Dataset.from_tensor_slices(
    train_images).shuffle(
    train_images.shape[0]).batch(32)

Next comes the important part of our application, and that is defining the model for our generator.

Defining Generator Model

The generator’s purpose is to create images containing digit 9 which look similar to the images in our training dataset.

You will use the Keras sequential model for creating our generator.
gen_model = tf.keras.Sequential()
You will add the Keras Dense layer as the first layer. Optionally, you could make Conv2D layer as your first layer.
gen_model.add(tf.keras.layers.Dense(7*7*256,
                                    use_bias=False,
                                    input_shape=(100,)))

The input to this layer is specified as 100 because later on we will use a noise vector of dimension 100 as an input to this GAN model. The layer has 7*7*256 = 12544 units, which we will shortly reshape into a 7x7 image with 256 channels. We will start with an image size of 7x7 and keep upscaling it to a final size of 28x28. The third dimension, 256, specifies the number of filters (channels); it is eventually reduced to 1 for our final grayscale image.

Next, we add a batch normalization layer to the model for providing stability.
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())

We add the activation layer as a leaky ReLU.

We reshape the output to 7x7x256:
gen_model.add(tf.keras.layers.Reshape((7, 7, 256)))
We now use the Conv2DTranspose layer to upscale the generated image.
gen_model.add(tf.keras.layers.Conv2DTranspose
                    (128, (5, 5),
                     strides=(1, 1),
                     padding='same',
                     use_bias=False))

The first parameter is the dimensionality of the output space, that is, the number of output filters in the convolution. The second parameter is the kernel_size that specifies the height and width of the convolution filter. The third parameter specifies the strides of the convolution along the height and width. To understand strides, consider a filter being moved across the image from left to right and top to bottom, 1 pixel at a time. This movement is referred to as the stride. In an ordinary convolution, a stride of (2, 2) moves the filter 2 pixels at a time and halves the height and width. A transposed convolution works the other way round: with a stride of (2, 2), it doubles the height and width, upscaling the image by 2x2.

Since the stride is specified as (1, 1), the output of this layer will be the image of size 7x7 – the same as its input. The padding ensures that the dimensions remain the same. The false value in the use_bias parameter indicates that the layer does not use a bias vector. This layer is then followed by the batch normalization and activation layers as before:
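With padding set to 'same', a transposed convolution multiplies the height and width of its input by the stride. The short sketch below traces the image size through the three transposed convolutional layers of this generator; the helper name is illustrative and is not a Keras function.

```python
def transposed_conv_same(size, stride):
    """Output height/width of a transposed convolution with padding='same'."""
    return size * stride


# Strides of the three Conv2DTranspose layers used in the generator.
layer_strides = (1, 2, 2)

size = 7            # side length after the Reshape layer
sizes = [size]
for stride in layer_strides:
    size = transposed_conv_same(size, stride)
    sizes.append(size)

print(sizes)  # [7, 7, 14, 28]
```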
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
Next, we add the Conv2DTranspose layer with strides set to (2, 2) followed by the batch normalization and activation layers.
gen_model.add(tf.keras.layers.Conv2DTranspose
                     (64, (5, 5),
                      strides=(2, 2),
                      padding='same',
                      use_bias=False))
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())

The output image will now be of size 14x14.

We now add the last Conv2DTranspose layer with strides equal to (2, 2), thus further upscaling the image to size 28x28. This is our final desired image size.
gen_model.add(tf.keras.layers.Conv2DTranspose
                      (1, (5, 5),
                       strides=(2, 2),
                       padding='same',
                       use_bias=False,
                       activation='tanh'))

The last layer uses tanh activation, which keeps the pixel values between –1 and +1, and a filter count of 1, thus giving us a single-channel (grayscale) output image.

The model plot as generated by the plot utility is shown in Figure 13-5.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig5_HTML.jpg
Figure 13-5

Generator architecture

The model summary is shown in Figure 13-6.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig6_HTML.jpg
Figure 13-6

Generator model summary

Testing Generator

You will test the generator with a random input vector and display it on the console using the following code:
noise = tf.random.normal([1, 100])
#giving random input vector
generated_image = gen_model(noise, training=False)
plt.imshow(generated_image[0, :, :, 0], cmap="gray")
The image will look like the one shown in Figure 13-7.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig7_HTML.jpg
Figure 13-7

Random image generated by the generator

Check the shape of the image:
generated_image.shape
It gives the following output:
TensorShape([1, 28, 28, 1])

The output indicates that the image has dimensions of 28x28, as desired by us.

Next, we will define the discriminator.

Defining Discriminator Model

We define our discriminator as follows:
discri_model = tf.keras.Sequential()
discri_model.add(tf.keras.layers.Conv2D
                        (64, (5, 5),
                         strides=(2, 2),
                         padding='same',
                         input_shape=[28, 28, 1]))
discri_model.add(tf.keras.layers.LeakyReLU())
discri_model.add(tf.keras.layers.Dropout(0.3))
discri_model.add(tf.keras.layers.Conv2D
                        (128, (5, 5),
                         strides=(2, 2),
                         padding='same'))
discri_model.add(tf.keras.layers.LeakyReLU())
discri_model.add(tf.keras.layers.Dropout(0.3))
discri_model.add(tf.keras.layers.Flatten())
discri_model.add(tf.keras.layers.Dense(1))

The discriminator uses just two convolutional layers. The output of the last convolutional layer is of type (batch size, height, width, filters). The Flatten layer in our network flattens this output to feed it to our last Dense layer in the network.

The model plot is shown in Figure 13-8.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig8_HTML.jpg
Figure 13-8

Discriminator architecture

The model summary is given in Figure 13-9.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig9_HTML.jpg
Figure 13-9

Discriminator model summary

Note that the discriminator has only around 200,000 trainable parameters.
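The parameter count can be reproduced by hand from the layer definitions given earlier. This is a sketch of the arithmetic only, using the standard formulas for convolutional and dense layers with biases; LeakyReLU, Dropout, and Flatten add no trainable parameters. The helper names are illustrative.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


conv1 = conv2d_params(5, 5, 1, 64)      # 28x28x1  -> 14x14x64
conv2 = conv2d_params(5, 5, 64, 128)    # 14x14x64 -> 7x7x128
flattened = 7 * 7 * 128                 # size of the Flatten output
dense = dense_params(flattened, 1)      # single logit output

total = conv1 + conv2 + dense
print(conv1, conv2, dense, total)  # 1664 204928 6273 212865
```

Almost all of the parameters sit in the second convolutional layer.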

Testing Discriminator

You may test the discriminator by feeding it with our earlier generated image.
decision = discri_model(generated_image)
The discriminator will give a negative value if the image is fake and a positive value if it is real. Print the discriminator’s decision for this test image.
print (decision)
It prints out the following decision:
tf.Tensor([[0.0033829]], shape=(1, 1), dtype=float32)

The decision value is 0.0033, a small positive number, which nominally indicates that the image is real. Because the last Dense layer has no activation, this value is a raw logit rather than a probability; a value this close to zero means the model is essentially undecided. The larger the positive value, the more certain the model is that the image is real. If you generate another image using our generator and test the output of the discriminator, you may get a negative result. This is because we have not yet trained our generator and discriminator.
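A raw output of this kind can be turned into a probability with the sigmoid function. The sketch below applies it to the value printed above, using only the Python standard library; the function name is illustrative.

```python
import math


def sigmoid(logit):
    """Convert a raw discriminator output (logit) to a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


decision = 0.0033829          # value printed by the untrained discriminator
probability_real = sigmoid(decision)
print(probability_real)       # just above 0.5
```

A probability this close to 0.5 shows that the untrained discriminator has no real opinion about the image.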

Defining Loss Functions

We will now define the loss function for our generator and discriminator.
cross_entropy = tf.keras.losses.BinaryCrossentropy(
    from_logits=True)

We use Keras’s binary cross-entropy function for our purpose, with from_logits set to True because our discriminator outputs raw values rather than probabilities. Note that we have two classes: one (1) for a real image and the other (0) for a fake one. We compute our loss for both these classes, which makes ours a binary classification problem. Thus, we use binary cross-entropy for the loss function.

We define a function to compute the generator loss as follows:
def generator_loss(generated_output):
    return cross_entropy(tf.ones_like
                (generated_output),generated_output)

The return value of the function is the quantification of how well the generator is able to trick the discriminator. If the generator does its job well, the discriminator will classify the fake image as real, returning a decision of 1. Thus, the function compares the discriminator’s decision on the generated image with an array of ones.

We define the discriminator loss function as follows:
def discriminator_loss(real_output,
                        generated_output):
    # compute loss considering the image is real [1,1,...,1]
    real_loss = cross_entropy(tf.ones_like
                (real_output),real_output)
    # compute loss considering the image is fake [0,0,...,0]
    generated_loss = cross_entropy(tf.zeros_like
                (generated_output),
                 generated_output)
    # compute total loss
    total_loss = real_loss + generated_loss
    return total_loss

We first let the discriminator consider that the given image is real and then compute the loss with respect to an array of ones. We then let the discriminator consider that the image is fake and then ask it to calculate the loss with respect to an array of zeros. The total loss as determined by the discriminator is the sum of these two losses.
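To see what these two loss functions compute, here is a plain-Python sketch of binary cross-entropy on logits, averaged over a batch. It is an illustration under the assumption of mean reduction, not the Keras implementation; the names ending in _sketch are hypothetical, and the logit values are made up.

```python
import math


def bce_from_logit(label, logit):
    """Numerically stable binary cross-entropy for one raw logit."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mean_bce(labels, logits):
    """Average the per-sample losses over the batch."""
    losses = [bce_from_logit(y, z) for y, z in zip(labels, logits)]
    return sum(losses) / len(losses)


def generator_loss_sketch(fake_logits):
    # The generator wants its fakes to be labeled as real (1).
    return mean_bce([1.0] * len(fake_logits), fake_logits)


def discriminator_loss_sketch(real_logits, fake_logits):
    # Real images should be labeled 1 and fake images 0.
    real_loss = mean_bce([1.0] * len(real_logits), real_logits)
    fake_loss = mean_bce([0.0] * len(fake_logits), fake_logits)
    return real_loss + fake_loss


# Made-up logits: the discriminator is fairly sure about both batches.
print(generator_loss_sketch([-3.0, -2.0]))
print(discriminator_loss_sketch([3.0, 2.0], [-3.0, -2.0]))
```

When the discriminator is confident and correct, its own loss is small while the generator loss is large, and the reverse holds when the generator manages to fool it.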

We now define the optimizers for both generator and discriminator, which are set to Adam for both.
gen_optimizer = tf.optimizers.Adam(1e-4)
discri_optimizer = tf.optimizers.Adam(1e-4)

You will now write a few utility functions, which are used during training.

Defining a Few Functions for Training

First, we declare a few variables:
epoch_number = 0
EPOCHS = 100
noise_dim = 100
seed = tf.random.normal([1, noise_dim])

We will train the model for 100 epochs. You can always change this variable to your choice. A higher number of epochs would generate a better image for the digit 9, which we are trying to generate. The noise dimension is set to 100 which is used while creating the random image for our first input to the generator network. The seed is set to random data for one image.

Checkpoint Setup

As the training may take a long time, we provide the checkpoint facility in the training so that the intermediate states of the generator and discriminator are saved to a local file.
checkpoint_dir = '/content/drive/My Drive/GAN1/Checkpoint'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
checkpoint = tf.train.Checkpoint(
    generator_optimizer=gen_optimizer,
    discriminator_optimizer=discri_optimizer,
    generator=gen_model,
    discriminator=discri_model)

In case of disconnection, you can continue the training from the last checkpoint. I will show you how to do this.

Setting Up Drive

The checkpoint data is saved to a folder called GAN1/Checkpoint in your Google Drive. So, before running the code, make sure that you have created this folder structure in your Google Drive.

Mount the drive in your project using the following code:
from google.colab import drive
drive.mount('/content/drive')
Change the current folder to the new location so that the checkpoint files are stored at this location.
%cd '/content/drive/My Drive/GAN1'

Next, we write a gradient_tuning function.

Model Training Step

Both our generator and discriminator models will be trained in several steps. We will write a function for these steps.

We will use the gradient tape (tf.GradientTape) for automatic differentiation on both the generator and discriminator. The automatic differentiation computes the gradient of a computation with respect to its input variables. The operations executed in the context of a gradient tape are recorded onto a tape. The reverse mode differentiation is then used to compute the new gradients. You need not get concerned with these operations for understanding how the models are trained.

At each step, we will give a batch of images to the function as an input. We will ask the discriminator to produce outputs for both the training and the generated images. We will call the output on the training images real and the output on the generated images fake. We will calculate the generator loss on the fake output and the discriminator loss on both the real and fake outputs. We will use the gradient tapes to compute the gradients of these losses for both networks and then apply the new gradients to the models. The full function definition along with the comments on each line is given in Listing 13-1.
def gradient_tuning(images):
    # create a batch of 16 noise vectors
    noise = tf.random.normal([16, noise_dim])
    # use gradient tapes for automatic differentiation
    with tf.GradientTape() as generator_tape, \
            tf.GradientTape() as discriminator_tape:
        # ask the generator to generate images from the noise
        generated_images = gen_model(noise, training=True)
        # ask the discriminator to evaluate the real images
        real_output = discri_model(images, training=True)
        # ask the discriminator to evaluate the generated (fake) images
        fake_output = discri_model(generated_images, training=True)
        # calculate generator loss on fake data
        gen_loss = generator_loss(fake_output)
        # calculate discriminator loss as defined earlier
        disc_loss = discriminator_loss(real_output, fake_output)
    # calculate gradients for generator
    gen_gradients = generator_tape.gradient(
        gen_loss, gen_model.trainable_variables)
    # calculate gradients for discriminator
    discri_gradients = discriminator_tape.gradient(
        disc_loss, discri_model.trainable_variables)
    # use optimizer to process and apply gradients to variables
    gen_optimizer.apply_gradients(
        zip(gen_gradients, gen_model.trainable_variables))
    # same as above for the discriminator
    discri_optimizer.apply_gradients(
        zip(discri_gradients, discri_model.trainable_variables))
Listing 13-1

Gradient tuning function

We now write one more function for generating the image of digit 9, our desired output, and saving it to your Google Drive.
def generate_and_save_images(model, epoch,
                             test_input):
    global epoch_number
    epoch_number = epoch_number + 1
    # set training to false to ensure inference mode
    predictions = model(test_input,
                        training=False)
    # display and save image
    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.imshow(predictions[i, :, :, 0] *
                   127.5 + 127.5, cmap="gray")
        plt.axis('off')
    plt.savefig('image_at_epoch_{:01d}.png'.format(
        epoch_number))
    plt.show()

The function uses a global epoch_number to track the epochs, so that the numbering continues after a disconnection and restart. The test_input to the model will always be our random seed. The inference is done on this seed with training set to False, so that batch normalization uses its learned moving statistics rather than the statistics of the current batch. We then display the image on the user’s console and also save it to the drive with epoch_number added as a suffix.

Using these functions, you will now write code to train the models and generate some output.

Model Training

To train the generator and discriminator models, you need to set up a simple for loop inside a train function. The train function accepts the dataset of true images as its first parameter. I have made this a parameter so that you can experiment with different datasets of images and/or different sizes. The second parameter is the number of epochs for which you want to perform the training. We call gradient_tuning for each batch of data in our dataset. At the end of each epoch, we generate and save the image to the user’s drive. Also, the network state is saved as a checkpoint so that, in case of disconnection, we can continue the training from the last checkpoint. The time taken to execute each epoch is tracked and printed on the user console. The function for the model training is given in Listing 13-2.
def train(dataset, epochs):
  for epoch in range(epochs):
    start = time.time()
    for image_batch in dataset:
      gradient_tuning(image_batch)
    # Produce images as we go
    generate_and_save_images(gen_model,
                             epoch + 1,
                             seed)
    # save checkpoint data
    checkpoint.save(file_prefix = checkpoint_prefix)
    print ('Time for epoch {} is {} sec'.format
                                  (epoch + 1,
                                  time.time()-start))
Listing 13-2

Model training function

The model training is now started with a call to this train method.
train(train_dataset, EPOCHS)
As the training progresses, the checkpoints are saved and the images are generated at each epoch. The images are displayed on the console and are also saved to your Google Drive. The output of my run for 100 epochs is shown in Table 13-1.
Table 13-1

Images for digit 9 generated by the program

../images/495303_1_En_13_Chapter/495303_1_En_13_Figa_HTML.gif

As you can see from the output, the network is able to create an acceptable output after about 20 to 30 epochs, and from epoch 70 onward, the quality is at its best.

During training and in case of disconnection, you can restore the network state from a previous known checkpoint as shown in the statement here and continue the training.
#run this code only if there is a runtime disconnection
try:
    checkpoint.restore(
        tf.train.latest_checkpoint(checkpoint_dir))
except Exception as error:
    print("Error loading in model : {}".format(error))
train(train_dataset, EPOCHS)

In my run of this application, it took approximately 10 seconds for each epoch on a GPU. Many times, for more complex image generation, it may take several hours to get an acceptable output. In such cases, checkpoints would be useful in restarting the training.

Full Source

The full source code for generating images for handwritten digits is given in Listing 13-3.
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import time
from tensorflow import keras
from tensorflow.keras import layers
import os
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.mnist.load_data()
digit9_images = []
for i in range(len(train_images)):
    if train_labels[i] == 9:
        digit9_images.append(train_images[i])
train_images = np.array(digit9_images)
train_images.shape
n = 10
f = plt.figure()
for i in range(n):
    f.add_subplot(1, n, i + 1)
    plt.subplot(1, n, i+1 ).axis("off")
    plt.imshow(train_images[i])
plt.show()
train_images = train_images.reshape(
    train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5
train_dataset = tf.data.Dataset.from_tensor_slices(
    train_images).shuffle(
    train_images.shape[0]).batch(32)
gen_model = tf.keras.Sequential()
# Feed network with a 7x7 random image
gen_model.add(tf.keras.layers.Dense(7*7*256,
                                    use_bias=False,
input_shape=(100,)))
# Add batch normalization for stability
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
# reshape the output
gen_model.add(tf.keras.layers.Reshape((7, 7, 256)))
# Apply (5x5) filter and shift of (1,1).
# The image output is still 7x7.
gen_model.add(tf.keras.layers.Conv2DTranspose
                    (128, (5, 5),
                     strides=(1, 1),
                     padding='same',
                     use_bias=False))
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
# apply stride of (2,2). The output image is now 14x14.
gen_model.add(tf.keras.layers.Conv2DTranspose
                     (64, (5, 5),
                      strides=(2, 2),
                      padding='same',
                      use_bias=False))
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
# another stride of (2,2) upscales the image to 28x28, which is our final size.
gen_model.add(tf.keras.layers.Conv2DTranspose
                      (1, (5, 5),
                       strides=(2, 2),
                       padding='same',
                       use_bias=False,
                       activation='tanh'))
gen_model.summary()
tf.keras.utils.plot_model(gen_model)
noise = tf.random.normal([1, 100])
#giving random input vector
generated_image = gen_model(noise, training=False)
plt.imshow(generated_image[0, :, :, 0], cmap="gray")
generated_image.shape
discri_model = tf.keras.Sequential()
discri_model.add(tf.keras.layers.Conv2D
                        (64, (5, 5),
                         strides=(2, 2),
                         padding='same',
                         input_shape=[28, 28, 1]))
discri_model.add(tf.keras.layers.LeakyReLU())
discri_model.add(tf.keras.layers.Dropout(0.3))
discri_model.add(tf.keras.layers.Conv2D
                        (128, (5, 5),
                         strides=(2, 2),
                         padding='same'))
discri_model.add(tf.keras.layers.LeakyReLU())
discri_model.add(tf.keras.layers.Dropout(0.3))
discri_model.add(tf.keras.layers.Flatten())
discri_model.add(tf.keras.layers.Dense(1))
discri_model.summary()
tf.keras.utils.plot_model(discri_model)
decision = discri_model(generated_image)
print (decision)
# creating loss function
cross_entropy = tf.keras.losses.BinaryCrossentropy(
    from_logits=True)
def generator_loss(generated_output):
    return cross_entropy(tf.ones_like
                (generated_output),generated_output)
def discriminator_loss(real_output,
                        generated_output):
# compute loss considering the image is real [1,1,...,1]
    real_loss = cross_entropy(tf.ones_like
                (real_output),real_output)
# compute loss considering the image is fake[0,0,...,0]
    generated_loss = cross_entropy(tf.zeros_like
                (generated_output),
                 generated_output)
    # compute total loss
    total_loss = real_loss + generated_loss
    return total_loss
gen_optimizer = tf.optimizers.Adam(1e-4)
discri_optimizer = tf.optimizers.Adam(1e-4)
epoch_number = 0
EPOCHS = 100
noise_dim = 100
seed = tf.random.normal([1, noise_dim])
checkpoint_dir = '/content/drive/My Drive/GAN1/Checkpoint'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
checkpoint = tf.train.Checkpoint(
    generator_optimizer=gen_optimizer,
    discriminator_optimizer=discri_optimizer,
    generator=gen_model,
    discriminator=discri_model)
from google.colab import drive
drive.mount('/content/drive')
%cd '/content/drive/My Drive/GAN1'
def gradient_tuning(images):
    # create a batch of 16 noise vectors
    noise = tf.random.normal([16, noise_dim])
    # use gradient tapes for automatic differentiation
    with tf.GradientTape() as generator_tape, \
            tf.GradientTape() as discriminator_tape:
        # ask the generator to generate images from the noise
        generated_images = gen_model(noise, training=True)
        # ask the discriminator to evaluate the real images
        real_output = discri_model(images, training=True)
        # ask the discriminator to evaluate the generated (fake) images
        fake_output = discri_model(generated_images, training=True)
        # calculate generator loss on fake data
        gen_loss = generator_loss(fake_output)
        # calculate discriminator loss as defined earlier
        disc_loss = discriminator_loss(real_output, fake_output)
    # calculate gradients for generator
    gen_gradients = generator_tape.gradient(
        gen_loss, gen_model.trainable_variables)
    # calculate gradients for discriminator
    discri_gradients = discriminator_tape.gradient(
        disc_loss, discri_model.trainable_variables)
    # use optimizer to process and apply gradients to variables
    gen_optimizer.apply_gradients(
        zip(gen_gradients, gen_model.trainable_variables))
    # same as above for the discriminator
    discri_optimizer.apply_gradients(
        zip(discri_gradients, discri_model.trainable_variables))
    def generate_and_save_images(model, epoch,
                                   test_input):
        global epoch_number
        epoch_number = epoch_number + 1
        # set training to false to ensure inference mode
        predictions = model(test_input,
                                training = False)
        # display and save image
        fig = plt.figure(figsize=(4,4))
        for i in range(predictions.shape[0]):
            plt.imshow(predictions[i, :, :, 0] *
                       127.5 + 127.5, cmap="gray")
            plt.axis('off')
        plt.savefig('image_at_epoch_{:01d}.png'.format
                              (epoch_number))
        plt.show()
def train(dataset, epochs):
  for epoch in range(epochs):
    start = time.time()
    for image_batch in dataset:
      gradient_tuning(image_batch)
    # Produce images as we go
    generate_and_save_images(gen_model,
                             epoch + 1,
                             seed)
    # save checkpoint data
    checkpoint.save(file_prefix = checkpoint_prefix)
    print ('Time for epoch {} is {} sec'.format
                                  (epoch + 1,
                                  time.time()-start))
train(train_dataset, EPOCHS)
#run this code only if there is a runtime disconnection
try:
    checkpoint.restore(
        tf.train.latest_checkpoint(checkpoint_dir))
except Exception as error:
    print("Error loading in model : {}".format(error))
train(train_dataset, EPOCHS)
Listing 13-3

DigitGen-GAN.ipynb

In this example, you trained a model to generate handwritten digits. In the next example, I will show you how to generate handwritten letters.

Alphabet Generation

Just as Kaggle provides the dataset for digits, a dataset of handwritten letters is available in a package called extra-keras-datasets. You will use this dataset to generate handwritten letters. The generator and discriminator models, their training, and inference all remain the same as in digit generation. So, I will only give you the code for loading the letters dataset from this package, along with the generated images. The full project source is available in the book’s repository under the name emnist-GAN.

Downloading Data

The dataset is available in a separate package that you can install by running the pip utility.
!pip install extra-keras-datasets
Import the package into your project.
from extra_keras_datasets import emnist
Load the data into your project and display one image and the corresponding label.
(train_images, train_labels), (test_images, test_labels) = \
        emnist.load_data(type='letters')
plt.imshow(train_images[1])
print ("label: ", train_labels[1])
The output is shown in Figure 13-10.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig10_HTML.jpg
Figure 13-10

Image for the handwritten G character

As in the digits dataset, each image is 28x28 pixels, so we can reuse our previous generator model, which produces images of this size. Conveniently for our experiment, each label is the position of its letter in the alphabet. For example, the letter a has a label value of 1, the letter b has a label value of 2, and so on.
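The label convention can be checked with a few lines of plain Python. The helper names letter_to_label and label_to_letter are hypothetical, introduced here only for illustration; they are not part of the extra-keras-datasets package.

```python
import string

def letter_to_label(letter):
    """Return the 1-based alphabet position used as the label (a -> 1)."""
    return string.ascii_lowercase.index(letter.lower()) + 1

def label_to_letter(label):
    """Inverse mapping: 1 -> 'a', 2 -> 'b', and so on."""
    return string.ascii_lowercase[label - 1]

print(letter_to_label("G"))   # 7
print(label_to_letter(7))     # g
```

This is why the value 7 is used when selecting the images of the letter G.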

Creating Dataset for a Single Alphabet

As in the digit generation example, we will train the model to produce a single letter. Thus, we need to create a dataset of images containing only the desired letter. This is done using the following code:
letter_G_images = []
for i in range(len(train_images)):
    if train_labels[i] == 7:
        letter_G_images.append(train_images[i])
train_images = np.array(letter_G_images)
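The same selection can also be written without an explicit loop by using a NumPy boolean mask. The sketch below uses small synthetic stand-in arrays (demo_images and demo_labels, both hypothetical) so that it runs on its own; with the real data, you would apply the mask to train_images and train_labels instead.

```python
import numpy as np

# Synthetic stand-ins: six 28x28 "images" with made-up labels
demo_images = np.arange(6 * 28 * 28).reshape(6, 28, 28)
demo_labels = np.array([7, 1, 7, 3, 26, 7])

# True wherever the label is 7 (the letter G)
mask = demo_labels == 7
# Indexing with the mask keeps only the matching images
letter_G_demo = demo_images[mask]

print(letter_G_demo.shape)   # (3, 28, 28)
```

Both forms give the same result; the mask version is usually faster on large arrays because the comparison runs inside NumPy.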
You can verify that you have extracted only the images of the letter G by running a small for loop as follows:
n = 10
f = plt.figure()
for i in range(n):
    f.add_subplot(1, n, i + 1)
    plt.subplot(1, n, i+1 ).axis("off")
    plt.imshow(train_images[i])
plt.show()
The output is shown in Figure 13-11.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig11_HTML.jpg
Figure 13-11

Sample handwritten character images

Okay, now your dataset is ready. The rest of the code for preprocessing the data, defining models, loss functions, optimizers, training, and so on remains the same as in the digit generation application. I will not reproduce the code here; I will simply show you the final output of my run.
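As a reminder of what the preprocessing step does, the pixel scaling used in the listings maps the range 0 to 255 onto -1 to 1, which matches the tanh output of the generator, and the display code reverses it. The following is a minimal sketch on single values; the function names to_model_range and to_pixel_range are hypothetical.

```python
def to_model_range(pixel):
    """Scale a pixel value from [0, 255] to [-1, 1]."""
    return (pixel - 127.5) / 127.5

def to_pixel_range(value):
    """Scale a generator output from [-1, 1] back to [0, 255]."""
    return value * 127.5 + 127.5

for pixel in (0, 127.5, 255):
    scaled = to_model_range(pixel)
    print(pixel, "->", scaled, "->", to_pixel_range(scaled))
```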

Program Output

The program output at various epochs is shown in Table 13-2.
Table 13-2

Images generated at various epochs

../images/495303_1_En_13_Chapter/495303_1_En_13_Figb_HTML.gif

As in the case of digit generation, notice that the model learns quickly during the first few epochs, and by the end of 100 epochs, the output is of good quality.

Full Source

The full source code for the character image generation is given in Listing 13-4.
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import time
from tensorflow import keras
from tensorflow.keras import layers
import os
!pip install extra-keras-datasets
from extra_keras_datasets import emnist
(train_images, train_labels), (test_images, test_labels) = \
        emnist.load_data(type='letters')
plt.imshow(train_images[1])
print ("label: ", train_labels[1])
letter_G_images = []
for i in range(len(train_images)):
    if train_labels[i] == 7:
        letter_G_images.append(train_images[i])
train_images = np.array(letter_G_images)
n = 10
f = plt.figure()
for i in range(n):
    f.add_subplot(1, n, i + 1)
    plt.subplot(1, n, i+1 ).axis("off")
    plt.imshow(train_images[i])
plt.show()
train_images = train_images.reshape (
    train_images.shape[0], 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5
train_dataset = tf.data.Dataset.from_tensor_slices(
    train_images).shuffle(
    train_images.shape[0]).batch(32)
gen_model = tf.keras.Sequential()
# Feed network with a 7x7 random image
gen_model.add(tf.keras.layers.Dense
                           (7*7*256,
                           use_bias=False,
                           input_shape=(100,)))
# Add batch normalization for stability
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
# reshape the output
gen_model.add(tf.keras.layers.Reshape((7, 7, 256)))
# Apply (5x5) filter and shift of (1,1).
# The image output is still 7x7.
gen_model.add(tf.keras.layers.Conv2DTranspose
                                (128, (5, 5),
                                strides=(1, 1),
                                padding='same',
                                use_bias=False))
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
# apply stride of (2,2). The output image is now 14x14.
gen_model.add(tf.keras.layers.Conv2DTranspose
                               (64, (5, 5),
                                strides=(2, 2),
                                padding='same',
                                use_bias=False))
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
# another stride of (2,2) upscales the image to 28x28, which is our final size.
gen_model.add(tf.keras.layers.Conv2DTranspose
                                (1, (5, 5),
                                strides=(2, 2),
                                padding='same',
                                use_bias=False,
                                activation='tanh'))
gen_model.summary()
tf.keras.utils.plot_model(gen_model)
noise = tf.random.normal([1, 100])#giving random input vector
generated_image = gen_model(noise, training=False)
plt.imshow(generated_image[0, :, :, 0], cmap="gray")
generated_image.shape
discri_model = tf.keras.Sequential()
discri_model.add(tf.keras.layers.Conv2D
                          (64, (5, 5),
                           strides=(2, 2),
                           padding='same',
                           input_shape=[28, 28, 1]))
discri_model.add(tf.keras.layers.LeakyReLU())
discri_model.add(tf.keras.layers.Dropout(0.3))
discri_model.add(tf.keras.layers.Conv2D
                          (128, (5, 5),
                           strides=(2, 2),
                           padding='same'))
discri_model.add(tf.keras.layers.LeakyReLU())
discri_model.add(tf.keras.layers.Dropout(0.3))
discri_model.add(tf.keras.layers.Flatten())
discri_model.add(tf.keras.layers.Dense(1))
discri_model.summary()
tf.keras.utils.plot_model(discri_model)
decision = discri_model(generated_image)
print (decision)
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)
#creating loss function
def generator_loss(generated_output):
    return cross_entropy(tf.ones_like(generated_output),
                           generated_output)
def discriminator_loss(real_output,
                           generated_output):
    # compute loss considering the image is real [1,1,...,1]
    real_loss = cross_entropy(
        tf.ones_like(real_output), real_output)
    # compute loss considering the image is fake[0,0,...,0]
    generated_loss = cross_entropy(
        tf.zeros_like(generated_output), generated_output)
    # compute total loss
    total_loss = real_loss + generated_loss
    return total_loss
gen_optimizer = tf.optimizers.Adam(1e-4)
discri_optimizer = tf.optimizers.Adam(1e-4)
epoch_number = 0
EPOCHS = 100
noise_dim = 100
seed = tf.random.normal([1, noise_dim])
checkpoint_dir = '/content/drive/My Drive/GAN2/Checkpoint'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
checkpoint = tf.train.Checkpoint(
    generator_optimizer=gen_optimizer,
    discriminator_optimizer=discri_optimizer,
    generator=gen_model,
    discriminator=discri_model)
from google.colab import drive
drive.mount('/content/drive')
%cd '/content/drive/My Drive/GAN2'
def gradient_tuning(images):
    # create a noise vector
    noise = tf.random.normal([16, noise_dim])
    # use gradient tapes for automatic differentiation
    with tf.GradientTape() as generator_tape, \
         tf.GradientTape() as discriminator_tape:
        # ask the generator to generate random images
        generated_images = gen_model(noise, training=True)
        # ask the discriminator to evaluate the real images
        real_output = discri_model(images, training=True)
        # ask the discriminator to evaluate the generated (fake) images
        fake_output = discri_model(generated_images,
                                   training=True)
        # calculate generator loss on fake data
        gen_loss = generator_loss(fake_output)
        # calculate discriminator loss as defined earlier
        disc_loss = discriminator_loss(real_output,
                                       fake_output)
    # calculate gradients for generator
    gen_gradients = generator_tape.gradient(
        gen_loss, gen_model.trainable_variables)
    # calculate gradients for discriminator
    discri_gradients = discriminator_tape.gradient(
        disc_loss, discri_model.trainable_variables)
    # use optimizer to process and apply gradients to variables
    gen_optimizer.apply_gradients(
        zip(gen_gradients, gen_model.trainable_variables))
    # same as above for the discriminator
    discri_optimizer.apply_gradients(
        zip(discri_gradients,
            discri_model.trainable_variables))
# defined at module level so that train() can call it
def generate_and_save_images(model, epoch, test_input):
    global epoch_number
    epoch_number = epoch_number + 1
    # set training to false to ensure inference mode
    predictions = model(test_input, training=False)
    # display and save image
    fig = plt.figure(figsize=(4,4))
    for i in range(predictions.shape[0]):
        plt.imshow(predictions[i, :, :, 0] * 127.5 + 127.5,
                   cmap='gray')
        plt.axis('off')
    plt.savefig(
        'image_at_epoch_{:01d}.png'.format(epoch_number))
    plt.show()
def train(dataset, epochs):
  for epoch in range(epochs):
    start = time.time()
    for image_batch in dataset:
      gradient_tuning(image_batch)
    # Produce images as we go
    generate_and_save_images(gen_model,
                             epoch + 1,
                             seed)
    # save checkpoint data
    checkpoint.save(file_prefix =
                           checkpoint_prefix)
    print ('Time for epoch {} is {} sec'.format
                          (epoch + 1,
                          time.time()-start))
train(train_dataset, EPOCHS)
#run this code only if there is a runtime disconnection
try:
    checkpoint.restore(
        tf.train.latest_checkpoint(checkpoint_dir))
except Exception as error:
    print("Error loading in model : {}".format(error))
train(train_dataset, 100)
Listing 13-4

emnist-GAN.ipynb
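To see what the two loss functions in the listing compute, it can help to evaluate them by hand on a few logits. The sketch below is a plain-Python re-implementation written for illustration, not the Keras code itself. It assumes the usual definition of binary cross-entropy on logits, averaged over the batch, and the example logits are made up.

```python
import math

def bce_from_logits(labels, logits):
    """Mean binary cross-entropy, computed from raw logits in a stable form."""
    total = 0.0
    for z, x in zip(labels, logits):
        total += max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

def generator_loss(fake_logits):
    # The generator wants its fakes to be labelled real (1)
    return bce_from_logits([1.0] * len(fake_logits), fake_logits)

def discriminator_loss(real_logits, fake_logits):
    # Real images should be labelled 1, generated images 0
    real_loss = bce_from_logits([1.0] * len(real_logits), real_logits)
    fake_loss = bce_from_logits([0.0] * len(fake_logits), fake_logits)
    return real_loss + fake_loss

real_logits = [2.0, 1.5]     # discriminator is fairly sure these are real
fake_logits = [-2.0, -1.0]   # and fairly sure these are fake

print(generator_loss(fake_logits))
print(discriminator_loss(real_logits, fake_logits))
```

With these made-up values, the discriminator is winning, so its loss is small and the generator loss is large. A logit of 0 means the discriminator is undecided, and each term then equals ln 2, about 0.693.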

Printed to Handwritten Text

You may train the preceding network for your own handwriting by feeding your handwritten images for a–z, A–Z, and digits 0–9. With this trained model, anybody who has access to the model will be able to convert any printed text to a personalized handwritten text authored by you. I used a few characters from such a trained network to personalize the writing of the word “tensor,” which is shown in Figure 13-12.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig12_HTML.jpg
Figure 13-12

Sample text created by combining images
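One way to assemble such a word, once you have one generated image per character, is to place the 28x28 arrays side by side. This is only a sketch: the glyphs dictionary is a hypothetical stand-in filled with constant-valued arrays, where in practice each entry would be an image produced by a generator trained on that character. The function name render_word is likewise hypothetical.

```python
import numpy as np

# Hypothetical stand-in: one 28x28 image per character. Each array is
# filled with a constant here; real entries would come from trained generators.
glyphs = {ch: np.full((28, 28), i, dtype=np.float32)
          for i, ch in enumerate("enorst")}

def render_word(word, glyphs):
    """Concatenate the character images left to right into one wide image."""
    return np.hstack([glyphs[ch] for ch in word])

word_image = render_word("tensor", glyphs)
print(word_image.shape)   # (28, 168)
```

The resulting array can be displayed with plt.imshow(word_image, cmap="gray"), in the same way as the single-character images.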

Next comes the creation of a more complex image.

Color Cartoons

So far, you have created images of handwritten digits and alphabets. What about creating complex color images like cartoons? The techniques that you have learned so far can be applied to create complex color images. And that is what I am going to demonstrate in this project.

Downloading Data

A good number of anime character datasets are available on the Kaggle site. We have kept the dataset for this project ready for your use on the book’s download site. Download the data into your project using the wget utility.
! wget --no-check-certificate -r 'https://drive.google.com/uc?export=download&id=1z7rXRIFtRBFZHt-Mmti4HxrxHqUfG3Y8' -O tf-book.zip
Unzip the contents of the downloaded file.
!unzip tf-book.zip

Creating Dataset

Write a function to create a dataset:
def load_dataset(batch_size, img_shape,
                       data_dir = None):
    # Create a tuple of size(30000,64,64,3)
    sample_dim = (batch_size,) + img_shape
    # Create an uninitialized array of shape (30000,64,64,3)
    sample = np.empty(sample_dim, dtype=np.float32)
    # Extract all images from our file
    all_data_dirlist = list(glob.glob(data_dir))
    # Randomly select an image file from our data list
    sample_imgs_paths = np.random.choice(
        all_data_dirlist, batch_size)
    for index, img_filename in enumerate(sample_imgs_paths):
        # Open the image
        image = Image.open(img_filename)
        # Resize the image
        image = image.resize(img_shape[:-1])
        # Convert the input into an array
        image = np.asarray(image)
        # Normalize data
        image = (image/127.5) -1
        # Assign the preprocessed image to our sample
        sample[index,...] = image
    print("Data loaded")
    return sample
The function is self-explanatory and fully commented for ease of understanding. You now call this function to create a dataset:
x_train=load_dataset(30000,(64,64,3),
      "/content/tf-book/chapter13/anime/data/*.png")
BUFFER_SIZE = 30000
BATCH_SIZE = 256
train_dataset = tf.data.Dataset.from_tensor_slices(
    x_train).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
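As a quick arithmetic check, the number of batches that the training loop sees in each epoch follows from the two constants above. The sketch uses only the standard library and relies on the fact that batch() keeps the final partial batch unless drop_remainder=True is passed.

```python
import math

BUFFER_SIZE = 30000   # number of images in x_train
BATCH_SIZE = 256

# The last batch is smaller because 30000 is not a multiple of 256
batches_per_epoch = math.ceil(BUFFER_SIZE / BATCH_SIZE)
last_batch_size = BUFFER_SIZE - (batches_per_epoch - 1) * BATCH_SIZE

print(batches_per_epoch)   # 118
print(last_batch_size)     # 48
```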

Displaying Images

You can check that the dataset is correctly loaded by printing a few images from the set.
n = 10
f = plt.figure(figsize=(15,15))
for i in range(n):
    f.add_subplot(1, n, i + 1)
    plt.subplot(1, n, i+1 ).axis("off")
    plt.imshow(x_train[i])
plt.show()
The output is shown in Figure 13-13.
../images/495303_1_En_13_Chapter/495303_1_En_13_Fig13_HTML.jpg
Figure 13-13

Sample anime images

Check the shape of your training data.
x_train.shape
The shape will be printed as follows:
(30000, 64, 64, 3)

There are 30,000 RGB images, each of size 64x64. At this point, you are ready to define your models, train them, and do inference. The rest of the code follows the same pattern as the earlier two projects, with the generator and discriminator resized for 64x64 color images, so I am not reproducing it here. You can look up the entire source code in the project download. I will present only the output at different epochs.
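The main change for 64x64 images is that the generator needs more upsampling steps. With padding='same', a Conv2DTranspose layer multiplies the height and width of its input by the stride, so you can trace the image size through a stack of layers with a few lines of Python. The starting size of 4 and the strides (1, 2, 2, 2, 2) are the ones used in Listing 13-5, and the second call corresponds to the 28x28 generator of the earlier projects. The helper name trace_sizes is hypothetical.

```python
def trace_sizes(start, strides):
    """Return the spatial size after each 'same'-padded transposed convolution."""
    sizes = [start]
    for stride in strides:
        sizes.append(sizes[-1] * stride)
    return sizes

print(trace_sizes(4, [1, 2, 2, 2, 2]))   # [4, 4, 8, 16, 32, 64]
print(trace_sizes(7, [1, 2, 2]))         # [7, 7, 14, 28]
```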

Output

The generated images at various epochs are shown in Table 13-3.
Table 13-3

Sample images generated at various epochs

../images/495303_1_En_13_Chapter/495303_1_En_13_Figc_HTML.gif

You can see that by about 1000 epochs, the network has learned a good deal about reproducing the original cartoon images. To train the model to generate realistic images, you would need to run the code for 10,000 epochs or more. Each epoch took me about 16 seconds to run on a GPU. What I wanted to show you here is that the GAN technique that we developed for creating simple handwritten digits can be applied as is to generate complex, larger images too.
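To get a feel for what that means in wall-clock time, you can multiply the figures quoted above. This is a rough estimate: it assumes that every epoch takes the same 16 seconds and ignores the time spent saving images and checkpoints.

```python
SECONDS_PER_EPOCH = 16      # figure quoted in the text
EPOCHS = 10000

total_seconds = SECONDS_PER_EPOCH * EPOCHS
total_hours = total_seconds / 3600

print(total_seconds)            # 160000
print(round(total_hours, 1))    # 44.4
```

A run of this length is likely to be interrupted at some point, which is why the training loop saves a checkpoint after every epoch.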

Full source

The full source code for generating anime images is given in Listing 13-5.
import tensorflow as tf
import numpy as np
import sys
import os
import cv2
import glob
from PIL import Image
import matplotlib.pyplot as plt
import time
from tensorflow import keras
from tensorflow.keras import layers
from keras.layers import UpSampling2D, Conv2D
! wget --no-check-certificate -r 'https://drive.google.com/uc?export=download&id=1z7rXRIFtRBFZHt-Mmti4HxrxHqUfG3Y8' -O tf-book.zip
!unzip tf-book.zip
def load_dataset(batch_size, img_shape,
                       data_dir = None):
    # Create a tuple of size(30000,64,64,3)
    sample_dim = (batch_size,) + img_shape
    # Create an uninitialized array of shape (30000,64,64,3)
    sample = np.empty(sample_dim, dtype=np.float32)
    # Extract all images from our file
    all_data_dirlist = list(glob.glob(data_dir))
    
    # Randomly select an image file from our data list
    sample_imgs_paths = np.random.choice(
        all_data_dirlist, batch_size)
    for index, img_filename in enumerate(sample_imgs_paths):
        # Open the image
        image = Image.open(img_filename)
        # Resize the image
        image = image.resize(img_shape[:-1])
        # Convert the input into an array
        image = np.asarray(image)
        # Normalize data
        image = (image/127.5) -1
        # Assign the preprocessed image to our sample
        sample[index,...] = image
    print("Data loaded")
    return sample
x_train=load_dataset(30000,(64,64,3),
      "/content/tf-book/chapter13/anime/data/*.png")
BUFFER_SIZE = 30000
BATCH_SIZE = 256
train_dataset = tf.data.Dataset.from_tensor_slices(
    x_train).shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
n = 10
f = plt.figure(figsize=(15,15))
for i in range(n):
    f.add_subplot(1, n, i + 1)
    plt.subplot(1, n, i+1 ).axis("off")
    plt.imshow(x_train[i])
plt.show()
x_train.shape
gen_model = tf.keras.Sequential()
# seed image of size 4x4
gen_model.add(tf.keras.layers.Dense
                        (64*4*4,
                        use_bias=False,
                        input_shape=(100,)))
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
gen_model.add(tf.keras.layers.Reshape((4,4,64)))
# size of output image is still 4x4
gen_model.add(tf.keras.layers.Conv2DTranspose
                       (256, (5, 5),
                        strides=(1, 1),
                        padding='same',
                        use_bias=False))
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
# size of output image is 8x8
gen_model.add(tf.keras.layers.Conv2DTranspose
                       (128, (5, 5),
                       strides=(2, 2),
                       padding='same',
                       use_bias=False))
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
# size of output image is 16x16
gen_model.add(tf.keras.layers.Conv2DTranspose
                       (64, (5, 5),
                       strides=(2, 2),
                       padding='same',
                       use_bias=False))
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
# size of output image is 32x32
gen_model.add(tf.keras.layers.Conv2DTranspose
                       (32, (5, 5),
                       strides=(2, 2),
                       padding='same',
                       use_bias=False))
gen_model.add(tf.keras.layers.BatchNormalization())
gen_model.add(tf.keras.layers.LeakyReLU())
# size of output image is 64x64
gen_model.add(tf.keras.layers.Conv2DTranspose
                       (3, (5, 5),
                       strides=(2, 2),
                       padding='same',
                       use_bias=False,
                       activation='tanh'))
gen_model.summary()
noise = tf.random.normal([1, 100])
generated_image = gen_model(noise, training=False)
plt.imshow(generated_image[0, :, :, 0] )
discri_model = tf.keras.Sequential()
discri_model.add(tf.keras.layers.Conv2D
                    (128, (5, 5), strides=(2, 2),
                    padding='same',
                    input_shape=[64,64,3]))
discri_model.add(tf.keras.layers.LeakyReLU())
discri_model.add(tf.keras.layers.Dropout(0.3))
discri_model.add(tf.keras.layers.Conv2D(
                    256, (5, 5), strides=(2, 2),
                    padding='same'))
discri_model.add(tf.keras.layers.LeakyReLU())
discri_model.add(tf.keras.layers.Dropout(0.3))
discri_model.add(tf.keras.layers.Flatten())
discri_model.add(tf.keras.layers.Dense(1))
discri_model.summary()
tf.keras.utils.plot_model(discri_model)
decision = discri_model(generated_image)
# The discriminator scores the generated image: a negative value
# means it judges the image fake, a positive value means real.
print (decision)
cross_entropy = tf.keras.losses.BinaryCrossentropy(
    from_logits=True)
def generator_loss(generated_output):
    return cross_entropy(tf.ones_like(generated_output),
                           generated_output)
def discriminator_loss(real_output,
                       generated_output):
    # compute loss considering the image is real [1,1,...,1]
    real_loss = cross_entropy(
        tf.ones_like(real_output), real_output)
    # compute loss considering the image is fake[0,0,...,0]
    generated_loss = cross_entropy(
        tf.zeros_like(generated_output), generated_output)
    # compute total loss
    total_loss = real_loss + generated_loss
    return total_loss
gen_optimizer = tf.optimizers.Adam(1e-4)
discri_optimizer = tf.optimizers.Adam(1e-4)
epoch_number = 0
EPOCHS = 10000
noise_dim = 100
seed = tf.random.normal([1, noise_dim])
checkpoint_dir = '/content/drive/My Drive/GAN3/Checkpoint'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
checkpoint = tf.train.Checkpoint(
    generator_optimizer=gen_optimizer,
    discriminator_optimizer=discri_optimizer,
    generator=gen_model,
    discriminator=discri_model)
from google.colab import drive
drive.mount('/content/drive')
%cd '/content/drive/My Drive/GAN3'
def gradient_tuning(images):
    # create a noise vector
    noise = tf.random.normal([16, noise_dim])
    # use gradient tapes for automatic differentiation
    with tf.GradientTape() as generator_tape, \
         tf.GradientTape() as discriminator_tape:
        # ask the generator to generate random images
        generated_images = gen_model(noise, training=True)
        # ask the discriminator to evaluate the real images
        real_output = discri_model(images, training=True)
        # ask the discriminator to evaluate the generated (fake) images
        fake_output = discri_model(generated_images,
                                   training=True)
        # calculate generator loss on fake data
        gen_loss = generator_loss(fake_output)
        # calculate discriminator loss as defined earlier
        disc_loss = discriminator_loss(real_output,
                                       fake_output)
    # calculate gradients for generator
    gen_gradients = generator_tape.gradient(
        gen_loss, gen_model.trainable_variables)
    # calculate gradients for discriminator
    discri_gradients = discriminator_tape.gradient(
        disc_loss, discri_model.trainable_variables)
    # use optimizer to process and apply gradients to variables
    gen_optimizer.apply_gradients(
        zip(gen_gradients, gen_model.trainable_variables))
    # same as above for the discriminator
    discri_optimizer.apply_gradients(
        zip(discri_gradients,
            discri_model.trainable_variables))
# defined at module level so that train() can call it
def generate_and_save_images(model, epoch, test_input):
    global epoch_number
    epoch_number = epoch_number + 1
    # set training to false to ensure inference mode
    predictions = model(test_input, training=False)
    # display and save image
    fig = plt.figure(figsize=(4,4))
    for i in range(predictions.shape[0]):
        # rescale from [-1, 1] to [0, 1] and show all three
        # color channels rather than a single gray channel
        plt.imshow(predictions[i] * 0.5 + 0.5)
        plt.axis('off')
    plt.savefig(
        'image_at_epoch_{:01d}.png'.format(epoch_number))
    plt.show()
def train(dataset, epochs):
  for epoch in range(epochs):
    start = time.time()
    for image_batch in dataset:
      gradient_tuning(image_batch)
    # Produce images as we go
    generate_and_save_images(gen_model,
                             epoch + 1,
                             seed)
    # save checkpoint data
    checkpoint.save(file_prefix = checkpoint_prefix)
    print ('Time for epoch {} is {} sec'.format
                             (epoch + 1,
                             time.time()-start))
train(train_dataset, EPOCHS)
#run this code only if there is a runtime disconnection
try:
    checkpoint.restore(
        tf.train.latest_checkpoint(checkpoint_dir))
except Exception as error:
    print("Error loading in model : {}".format(error))
train(train_dataset, EPOCHS)
Listing 13-5

CS-Anime.ipynb

Summary

The Generative Adversarial Network (GAN) provides a novel way of mimicking any given image. A GAN consists of two networks, a generator and a discriminator, which are trained simultaneously by an adversarial process. In this chapter, you learned how to construct a GAN and used it to create handwritten digits, letters, and even anime characters. Training a GAN requires substantial processing resources, yet it produces very satisfactory results. Today, GANs are used successfully in many applications. For example, GANs are used for creating large image datasets like the handwritten digits provided on the Kaggle site that you used in the first example in this chapter. They have been successful in creating human faces of celebrities. They can be used for generating cartoon characters, as in the anime example in this chapter. They have also been applied in the areas of image-to-image and text-to-image translation. They can be used for creating emojis from photos and for aging faces in photos. The possibilities are endless; you should explore further to see for yourself how people have used GANs in creating many interesting applications. With the techniques that you have learned in this chapter, you will be able to implement your own ideas and add to this repository of endless GAN applications.
