What is so powerful about CycleGAN?

One of the best parts of CycleGAN is that it does not require paired input and output data. In many applications of style transfer, paired data is a critical piece of the training. CycleGAN was one of the first GAN implementations that was able to break that mold and reliably train models without paired input:

Unpaired input data from the CycleGAN paper
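To make "unpaired" concrete, here is a minimal sketch of how batches might be drawn independently from the two image collections. The file names and the helper sample_unpaired_batches are hypothetical and are not taken from the paper; the only point is that nothing ties an item in one batch to an item in the other.

```python
import random
from typing import List, Tuple


def sample_unpaired_batches(
    domain_a: List[str],
    domain_b: List[str],
    batch_size: int,
    rng: random.Random,
) -> Tuple[List[str], List[str]]:
    """Draw one batch from each domain independently.

    No item in the A batch corresponds to any item in the B batch,
    and the two collections may have different sizes.
    """
    batch_a = rng.sample(domain_a, batch_size)
    batch_b = rng.sample(domain_b, batch_size)
    return batch_a, batch_b


# Hypothetical file names standing in for two image collections.
photos = [f"photo_{i}.jpg" for i in range(10)]
paintings = [f"painting_{i}.jpg" for i in range(7)]

batch_a, batch_b = sample_unpaired_batches(photos, paintings, 4, random.Random(0))
```

With paired data, by contrast, each input would have to ship with its own hand-matched output image.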

CycleGAN uses fairly simple Convolutional Neural Networks (CNNs) for the generator and the discriminator. The real secret sauce of this paper is in the architecture: how we stitch these networks together and train them to learn the mapping from A to B and from B to A. The pseudocode is fairly straightforward but does require us to pay attention to our bookkeeping (we'll have a lot of generators!):

Initialize the two generators, one that goes from A to B and one that goes from B to A:

# Build the models from the paper
init generator_A_to_B
init generator_B_to_A

Initialize one discriminator for each image style:

init discriminatorA
init discriminatorB

Put all of the networks into one adversarial training architecture and initialize the combined model for training:

init GAN()

Train the networks by grabbing the batches for A and B, training each discriminator and training the GAN model in an adversarial mode:

while batches available:
    grab BatchA
    grab BatchB

    train discriminatorA(BatchA)
    train discriminatorB(BatchB)

    train GAN(BatchA, BatchB)
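The bookkeeping in the pseudocode above can be sketched in Python as follows. This is only a structural sketch: ToyNetwork and ToyGAN are hypothetical stand-ins that count updates instead of running gradient steps, and the assumption that training the combined GAN model is what updates the two generators is ours, not a detail stated in the pseudocode.

```python
from typing import Dict, Iterable, List

Batch = List[float]  # stand-in for a batch of images


class ToyNetwork:
    """Hypothetical stand-in for a network; it only counts its updates."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.updates = 0

    def train_on_batch(self, batch: Batch) -> None:
        self.updates += 1  # a real model would run a gradient step here


class ToyGAN:
    """Hypothetical combined model: training it updates both generators."""

    def __init__(self, g_a_to_b: ToyNetwork, g_b_to_a: ToyNetwork) -> None:
        self.g_a_to_b = g_a_to_b
        self.g_b_to_a = g_b_to_a

    def train_on_batch(self, batch_a: Batch, batch_b: Batch) -> None:
        self.g_a_to_b.train_on_batch(batch_a)
        self.g_b_to_a.train_on_batch(batch_b)


def train(batches_a: Iterable[Batch], batches_b: Iterable[Batch]) -> Dict[str, int]:
    # Build the models from the paper
    generator_a_to_b = ToyNetwork("generator_A_to_B")
    generator_b_to_a = ToyNetwork("generator_B_to_A")
    discriminator_a = ToyNetwork("discriminatorA")
    discriminator_b = ToyNetwork("discriminatorB")
    gan = ToyGAN(generator_a_to_b, generator_b_to_a)

    # zip stops as soon as either domain runs out of batches
    for batch_a, batch_b in zip(batches_a, batches_b):
        discriminator_a.train_on_batch(batch_a)
        discriminator_b.train_on_batch(batch_b)
        gan.train_on_batch(batch_a, batch_b)

    networks = (generator_a_to_b, generator_b_to_a, discriminator_a, discriminator_b)
    return {net.name: net.updates for net in networks}


# Three batches for A but only two for B, so two training steps run.
update_counts = train([[0.1], [0.2], [0.3]], [[0.9], [0.8]])
```

Keeping all four networks in one function makes it easy to see which model is touched at each step, which is the bookkeeping the pseudocode asks us to watch.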

As you can see, the design is fairly simple; the secret sauce is built into the GAN architecture itself. The CycleGAN paper includes a few great graphics that show how the architecture should be built.

There are three steps to cover in the overall design:

First, to translate between two different styles, we'll need two generators and two discriminators. The generator G will translate from X to Y and be checked by discriminator Y (D_Y). Likewise, the generator F will translate from Y to X and be checked by discriminator X (D_X):

Basic CycleGAN architecture
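To illustrate what "checked by" means, here is a minimal sketch of the log-likelihood form of the adversarial objective for G and D_Y, using single numbers as toy images. The functions d_y, g_good, and g_identity are hypothetical stand-ins chosen for this example; real generators and discriminators are CNNs.

```python
import math
from typing import Callable, Sequence

Image = float  # toy stand-in: one number per "image"


def adversarial_objective(
    generator: Callable[[Image], Image],
    discriminator: Callable[[Image], float],
    source_batch: Sequence[Image],
    target_batch: Sequence[Image],
) -> float:
    """Mean log D(y) over real targets plus mean log(1 - D(G(x))) over translations.

    The discriminator tries to make this value large;
    the generator tries to make it small.
    """
    real_term = sum(math.log(discriminator(y)) for y in target_batch) / len(target_batch)
    fake_term = sum(
        math.log(1.0 - discriminator(generator(x))) for x in source_batch
    ) / len(source_batch)
    return real_term + fake_term


def d_y(y: Image) -> float:
    """Hypothetical D_Y: a sigmoid that scores values near 1 as real Y."""
    return 1.0 / (1.0 + math.exp(-4.0 * (y - 0.5)))


def g_good(x: Image) -> Image:
    """Hypothetical G that moves X samples (near 0) into the Y range (near 1)."""
    return x + 1.0


def g_identity(x: Image) -> Image:
    """Hypothetical G that does no translation at all."""
    return x


batch_x = [0.0, 0.1]  # toy domain X sits near 0
batch_y = [1.0, 0.9]  # toy domain Y sits near 1

fooling = adversarial_objective(g_good, d_y, batch_x, batch_y)
caught = adversarial_objective(g_identity, d_y, batch_x, batch_y)
```

The generator that lands its outputs in the Y range gets the lower objective value, because D_Y scores its translations as real. The same check applies to F and D_X with the roles of X and Y swapped.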

Second, one of the key features of the CycleGAN paper is going from x to the translated image G(x) and then on to the reconstructed image F(G(x)). Comparing that reconstruction with the original x gives you a solid metric on which to base your learners:

X to Y then reconstructed X CycleGAN architecture

Third, as with the second step, we will go from y to the style-transferred image F(y) and then on to the reconstructed image G(F(y)). By going in both directions, we are able to define an architecture that can evaluate the translated photo and the reconstructed photo in the adversarial steps:

Y to X then reconstructed Y CycleGAN architecture
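The two reconstruction steps can be sketched as a cycle-consistency loss: the mean absolute difference between each original and its round-trip reconstruction, summed over both directions. This is a toy sketch with single numbers as images; the functions g, f_inverse, and f_mismatched are hypothetical and stand in for trained networks.

```python
from typing import Callable, Sequence

Image = float  # toy stand-in: one number per "image"
Translator = Callable[[Image], Image]


def mean_abs_error(originals: Sequence[Image], reconstructions: Sequence[Image]) -> float:
    """Average L1 distance between originals and their reconstructions."""
    return sum(abs(o - r) for o, r in zip(originals, reconstructions)) / len(originals)


def cycle_consistency_loss(
    g: Translator,
    f: Translator,
    batch_x: Sequence[Image],
    batch_y: Sequence[Image],
) -> float:
    """Forward cycle x -> G(x) -> F(G(x)) plus backward cycle y -> F(y) -> G(F(y))."""
    forward = mean_abs_error(batch_x, [f(g(x)) for x in batch_x])
    backward = mean_abs_error(batch_y, [g(f(y)) for y in batch_y])
    return forward + backward


def g(x: Image) -> Image:
    """Hypothetical generator from X to Y."""
    return 2.0 * x + 1.0


def f_inverse(y: Image) -> Image:
    """Hypothetical generator from Y to X that exactly undoes g."""
    return (y - 1.0) / 2.0


def f_mismatched(y: Image) -> Image:
    """Hypothetical generator from Y to X that does not undo g."""
    return y / 2.0


batch_x = [0.0, 0.5, 1.0]
batch_y = [1.0, 3.0]

# Exact inverses reconstruct every input, so the loss is 0.0.
consistent_loss = cycle_consistency_loss(g, f_inverse, batch_x, batch_y)

# A mismatched pair leaves a reconstruction error in both directions.
inconsistent_loss = cycle_consistency_loss(g, f_mismatched, batch_x, batch_y)
```

A pair of generators that undo each other drives this loss to zero, which is what gives training a usable signal even though no paired examples exist.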
