Training the model

Training a model takes several steps:

  1. Create two tensors with the real and fake labels. These will be required during the training of the generator and the discriminator. Use label smoothing, which is covered in Chapter 1, Introduction to Generative Adversarial Networks:
# Smoothed labels: 0.9 for real images and 0.1 for fake images
real_labels = np.ones((batch_size, 1), dtype=float) * 0.9
fake_labels = np.zeros((batch_size, 1), dtype=float) + 0.1
  2. Next, create a for loop that runs for the specified number of epochs, as follows:
for epoch in range(epochs):
    print("========================================")
    print("Epoch is:", epoch)
    print("Number of batches:", int(X_train.shape[0] / batch_size))

    gen_losses = []
    dis_losses = []
  3. After that, calculate the number of batches and write a for loop that runs for that number of batches:
    number_of_batches = int(X_train.shape[0] / batch_size)
    for index in range(number_of_batches):
        print("Batch:{}".format(index + 1))
  4. Sample a batch of data (a mini-batch) for the current iteration. Create a noise vector, select a batch of images and a batch of embeddings, and normalize the images:
        # Create a batch of noise vectors
        z_noise = np.random.normal(0, 1, size=(batch_size, z_dim))
        image_batch = X_train[index * batch_size:(index + 1) * batch_size]
        embedding_batch = embeddings_train[index * batch_size:(index + 1) * batch_size]

        # Normalize images to [-1, 1] to match the generator's tanh output
        image_batch = (image_batch - 127.5) / 127.5
  5. Next, generate fake images using the generator model by passing it the embedding_batch and z_noise:
        fake_images, _ = stage1_gen.predict([embedding_batch, z_noise], verbose=0)

This will generate a batch of fake images conditioned on a batch of embeddings and a batch of noise vectors.

  6. Use the compressor model to compress the embedding. Spatially replicate it to convert it to a tensor with a shape of (batch_size, 4, 4, 128):
        compressed_embedding = embedding_compressor_model.predict_on_batch(embedding_batch)
        compressed_embedding = np.reshape(compressed_embedding, (-1, 1, 1, condition_dim))
        compressed_embedding = np.tile(compressed_embedding, (1, 4, 4, 1))
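In case you need a reference for the compressor, a minimal sketch is shown here. Treat it as an assumption rather than the definitive implementation: it takes the text embedding dimension to be 1,024 and condition_dim to be 128, so a single dense layer maps each embedding down to 128 values:

from keras.layers import Input, Dense
from keras.models import Model

def build_embedding_compressor_model(embedding_dim=1024, condition_dim=128):
    # Compress a high-dimensional text embedding to condition_dim values
    input_layer = Input(shape=(embedding_dim,))
    x = Dense(condition_dim, activation='relu')(input_layer)
    return Model(inputs=[input_layer], outputs=[x])

embedding_compressor_model = build_embedding_compressor_model()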
  7. Next, train the discriminator model on the fake images generated by the generator, the real images from the dataset, and the wrong images (real images paired with mismatched embeddings, created here by shifting the embedding batch by one position):
        dis_loss_real = stage1_dis.train_on_batch([image_batch, compressed_embedding],
                                                  np.reshape(real_labels, (batch_size, 1)))
        dis_loss_fake = stage1_dis.train_on_batch([fake_images, compressed_embedding],
                                                  np.reshape(fake_labels, (batch_size, 1)))
        dis_loss_wrong = stage1_dis.train_on_batch([image_batch[:(batch_size - 1)], compressed_embedding[1:]],
                                                   np.reshape(fake_labels[1:], (batch_size - 1, 1)))

We have now successfully trained the discriminator on three sets of data: real images, fake images, and wrong images. Let's now train the adversarial model:

  8. Next, train the adversarial model. Provide it with three inputs and the corresponding target values. This operation calculates the gradients and updates the weights for one batch of data:
        g_loss = adversarial_model.train_on_batch(
            [embedding_batch, z_noise, compressed_embedding],
            [np.ones((batch_size, 1)) * 0.9, np.ones((batch_size, 256)) * 0.9])
  9. Next, calculate the losses and store them for evaluation purposes. It is a good idea to keep printing the different losses so that you can track the training:
        # Weighted sum: the real loss gets a weight of 0.5, and the
        # wrong and fake losses get a weight of 0.25 each
        d_loss = 0.5 * np.add(dis_loss_real, 0.5 * np.add(dis_loss_wrong, dis_loss_fake))

        print("d_loss:{}".format(d_loss))
        print("g_loss:{}".format(g_loss))

        dis_losses.append(d_loss)
        gen_losses.append(g_loss)
  10. After the completion of each epoch, store the average losses to TensorBoard:
    write_log(tensorboard, 'discriminator_loss', np.mean(dis_losses), epoch)
    write_log(tensorboard, 'generator_loss', np.mean(gen_losses), epoch)
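Here, write_log() is a small helper that writes a scalar to the TensorBoard log. If you don't already have one, a minimal sketch follows; it assumes TensorFlow 1.x and that tensorboard is a Keras TensorBoard callback on which set_model() has already been called, so its writer attribute is available:

import tensorflow as tf

def write_log(callback, name, value, batch_no):
    # Wrap the scalar value in a TF summary and write it through
    # the callback's FileWriter
    summary = tf.Summary()
    summary_value = summary.value.add()
    summary_value.simple_value = value
    summary_value.tag = name
    callback.writer.add_summary(summary, batch_no)
    callback.writer.flush()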
  11. After each epoch, to evaluate the progress, generate images from the test embeddings and save them in the results directory:
    z_noise2 = np.random.normal(0, 1, size=(batch_size, z_dim))
    embedding_batch = embeddings_test[0:batch_size]
    fake_images, _ = stage1_gen.predict_on_batch([embedding_batch, z_noise2])

    # Save the first ten generated images
    for i, img in enumerate(fake_images[:10]):
        save_rgb_img(img, "results/gen_{}_{}.png".format(epoch, i))

Here, save_rgb_img() is a utility function, defined as follows:

def save_rgb_img(img, path):
    """
    Save an RGB image.
    """
    # Rescale from the generator's tanh range of [-1, 1] to [0, 1],
    # which is what imshow() expects for float images
    img = (img + 1) / 2.0

    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1)
    ax.imshow(img)
    ax.axis("off")
    ax.set_title("Image")

    plt.savefig(path)
    plt.close()
  12. Once training is complete, save the weights for each model in Stage-I of the StackGAN:
stage1_gen.save_weights("stage1_gen.h5")
stage1_dis.save_weights("stage1_dis.h5")

Congratulations, we have successfully trained Stage-I of the StackGAN. We now have a trained generator network that can generate images with dimensions of 64x64x3. These images have basic colors and primitive shapes. In the next section, we will train Stage-II of the StackGAN.
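To sanity-check the saved weights later, you can rebuild the generator, load the weights, and sample a few images. The following is a minimal sketch, not part of the training script: it assumes build_stage1_generator() is the builder function used to create stage1_gen earlier in the chapter, and that batch_size, z_dim, embeddings_test, and save_rgb_img() are defined as above:

# Rebuild the generator and restore the trained weights (assumed builder)
stage1_gen = build_stage1_generator()
stage1_gen.load_weights("stage1_gen.h5")

# Sample noise vectors and pair them with the test embeddings
z_noise = np.random.normal(0, 1, size=(batch_size, z_dim))
embedding_batch = embeddings_test[0:batch_size]
fake_images, _ = stage1_gen.predict([embedding_batch, z_noise])

# Save a few samples for visual inspection
for i, img in enumerate(fake_images[:5]):
    save_rgb_img(img, "results/inference_{}.png".format(i))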
