Example of WGAN with TensorFlow

This example can be considered a variant of the previous one because it uses the same dataset, generator, and discriminator. The only significant difference is that the discriminator (together with its variable scope) has been renamed critic():

import tensorflow as tf

def critic(x, is_training=True, reuse_variables=True):
    with tf.variable_scope('critic', reuse=reuse_variables):
        ...

At this point, we can move directly to the creation of the Graph containing all of the placeholders, operations, and loss functions:

import tensorflow as tf

graph = tf.Graph()

with graph.as_default():
    input_x = tf.placeholder(tf.float32, shape=(None, width, height, 1))
    input_z = tf.placeholder(tf.float32, shape=(None, code_length))
    is_training = tf.placeholder(tf.bool)

    gen = generator(z=tf.reshape(input_z, (-1, 1, 1, code_length)), is_training=is_training)

    r_input_x = tf.image.resize_images(images=input_x, size=(64, 64))

    # The same critic (shared variables) scores real and generated samples
    crit_1_l = critic(x=r_input_x, is_training=is_training, reuse_variables=False)
    crit_2_l = critic(x=gen, is_training=is_training, reuse_variables=True)

    loss_c = tf.reduce_mean(crit_2_l - crit_1_l)
    loss_g = tf.reduce_mean(-crit_2_l)

    variables_g = [variable for variable in tf.trainable_variables() if variable.name.startswith('generator')]
    variables_c = [variable for variable in tf.trainable_variables() if variable.name.startswith('critic')]

    with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS)):
        optimizer_c = tf.train.AdamOptimizer(0.00005, beta1=0.5, beta2=0.9).minimize(loss=loss_c, var_list=variables_c)

        # Clip the critic variables only after the optimizer step has completed
        with tf.control_dependencies([optimizer_c]):
            training_step_c = tf.tuple(tensors=[tf.assign(variable, tf.clip_by_value(variable, -0.01, 0.01))
                                                for variable in variables_c])

        training_step_g = tf.train.AdamOptimizer(0.00005, beta1=0.5, beta2=0.9).minimize(loss=loss_g, var_list=variables_g)

As can be seen, the placeholder section, the definition of the generator, and the image resizing to the target dimensions of 64 × 64 are unchanged. In the next block, we define the two Critic instances (which are perfectly analogous to the discriminator instances declared in the previous example).

The two loss functions are simpler than those of a standard GAN, as they work directly with the Critic outputs, computing the sample mean over a batch. In the original paper, the authors suggest using RMSProp as the standard optimizer, in order to avoid the instabilities that a momentum-based algorithm can produce. However, Adam, with lower forgetting factors (μ1 = 0.5 and μ2 = 0.9) and a learning rate η = 0.00005, is faster than RMSProp, and doesn't lead to instabilities. I suggest testing both options, trying to maximize the training speed while preventing mode collapse. Contrary to the previous example, in this case we need to clip all of the Critic variables after each training step. To prevent the internal concurrency from altering the order of some operations, it's necessary to employ a nested dependency control context manager. In this way, the actual training_step_c (responsible for clipping and reassigning the values to each variable) will be executed only after the optimizer_c step has completed.
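As a rough illustration of the two losses and the clipping step described above, the following NumPy sketch reproduces the same arithmetic outside of TensorFlow. The critic outputs and the weight arrays are hypothetical values chosen only for the example, and the helper names (critic_loss, generator_loss, clip_weights) are not part of the original code:

```python
import numpy as np

def critic_loss(critic_real, critic_fake):
    # Same arithmetic as loss_c: batch mean of (score on generated - score on real)
    return np.mean(critic_fake - critic_real)

def generator_loss(critic_fake):
    # Same arithmetic as loss_g: negative batch mean of the scores on generated samples
    return np.mean(-critic_fake)

def clip_weights(weights, clip_value=0.01):
    # Same effect as tf.clip_by_value(variable, -0.01, 0.01) on every critic variable
    return [np.clip(w, -clip_value, clip_value) for w in weights]

# Hypothetical critic outputs for a batch of 4 real and 4 generated samples
critic_real = np.array([0.8, 1.2, 1.0, 1.0])
critic_fake = np.array([0.1, -0.1, 0.2, -0.2])

loss_c = critic_loss(critic_real, critic_fake)
loss_g = generator_loss(critic_fake)

# Hypothetical critic weights as they might look right after an optimizer step
weights = [np.array([0.5, -0.003, 0.02]), np.array([[-1.0, 0.005]])]
clipped = clip_weights(weights)

print(loss_c, loss_g)
print(clipped)
```

The critic loss becomes more negative as the critic separates real from generated scores, while values already inside the interval [-0.01, 0.01] are left untouched by the clipping.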

Now, we can create the InteractiveSession, initialize the variables, and start the training process, which is very similar to the previous one:

import numpy as np
import tensorflow as tf

nb_epochs = 200
nb_critic = 5
batch_size = 64
nb_iterations = int(nb_samples / batch_size)

session = tf.InteractiveSession(graph=graph)
tf.global_variables_initializer().run()

samples_range = np.arange(nb_samples)

for e in range(nb_epochs):
    c_losses = []
    g_losses = []

    for i in range(nb_iterations):
        # Train the critic nb_critic times on freshly sampled batches
        for j in range(nb_critic):
            Xi = np.random.choice(samples_range, size=batch_size)
            X = np.expand_dims(X_train[Xi], axis=3)
            Z = np.random.uniform(-1.0, 1.0, size=(batch_size, code_length)).astype(np.float32)

            _, c_loss = session.run([training_step_c, loss_c],
                                    feed_dict={
                                        input_x: X,
                                        input_z: Z,
                                        is_training: True
                                    })
            c_losses.append(c_loss)

        # Train the generator once (input_x is not used by loss_g, so zeros are fed)
        Z = np.random.uniform(-1.0, 1.0, size=(batch_size, code_length)).astype(np.float32)

        _, g_loss = session.run([training_step_g, loss_g],
                                feed_dict={
                                    input_x: np.zeros(shape=(batch_size, width, height, 1)),
                                    input_z: Z,
                                    is_training: True
                                })

        g_losses.append(g_loss)

    print('Epoch {}) Avg. critic loss: {} - Avg. generator loss: {}'.format(e + 1, np.mean(c_losses), np.mean(g_losses)))
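To make the batch handling in the loop above more concrete, here is a minimal NumPy sketch that builds one critic batch and counts the updates performed in a single epoch. The dataset is replaced by a hypothetical array of zeros, and the values chosen for nb_samples, width, height, and code_length are assumptions made only for this illustration:

```python
import numpy as np

# Hypothetical stand-ins for the values defined in the previous example
nb_samples, width, height = 640, 28, 28
code_length = 100
X_train = np.zeros((nb_samples, width, height), dtype=np.float32)

batch_size = 64
nb_critic = 5
nb_iterations = int(nb_samples / batch_size)

samples_range = np.arange(nb_samples)

# np.random.choice samples with replacement by default,
# so the same image can appear more than once in a batch
Xi = np.random.choice(samples_range, size=batch_size)

# Add the trailing channel axis expected by the input_x placeholder
X = np.expand_dims(X_train[Xi], axis=3)

# Latent codes drawn uniformly from [-1, 1)
Z = np.random.uniform(-1.0, 1.0, size=(batch_size, code_length)).astype(np.float32)

# Number of parameter updates per epoch for each network
critic_updates = nb_iterations * nb_critic
generator_updates = nb_iterations

print(X.shape, Z.shape, critic_updates, generator_updates)
```

With these assumed values, each epoch performs five critic updates for every generator update, each on an independently sampled batch.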

The main difference is that, in this case, the Critic is trained nb_critic times before each generator training step. The result of the generation of 50 random samples is shown in the following figure:

Samples generated by a WGAN trained with the Fashion MNIST dataset

As can be seen, the quality is slightly higher than in the previous example, and the samples are smoother. I invite the reader to also test this model with an RGB dataset, because the final quality is normally excellent.

When working with these models, the training time can be very long. To avoid waiting until the end to see the initial results (and to perform the required tuning), I suggest using Jupyter. In this way, it's possible to stop the learning process, check the generator's ability, and restart training without any problem. Of course, the graph must remain the same, and the variable initialization must be performed only at the beginning.