Stack-II

The main components of the Stage-II StackGAN are a generator network and a discriminator network. The generator is an encoder-decoder network. The random noise z is not used during this stage, on the assumption that the randomness has already been preserved by s_0, where s_0 is the image generated by the generator network of Stage-I.
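To make the encoder-decoder idea concrete, the following sketch only traces tensor shapes through a Stage-II-style generator; it does not build or train a network. The function name `stage2_generator_shapes` and every default value (a 64x64 input, a 128-dimensional conditioning vector, 128 base channels, two downsampling steps and four upsampling steps) are illustrative assumptions and are not taken from the text above.

```python
def stage2_generator_shapes(low_res=64, cond_dim=128, base_channels=128,
                            num_down=2, num_up=4):
    """Trace (height, width, channels) through a Stage-II-style generator.

    All default sizes are assumptions chosen for illustration.
    """
    trace = []
    size, channels = low_res, 3
    trace.append(("stage1_image", (size, size, channels)))

    # Encoder: one stride-1 convolution, then stride-2 convolutions that
    # halve the spatial size and double the channel count.
    channels = base_channels
    trace.append(("encoder_conv", (size, size, channels)))
    for i in range(num_down):
        size //= 2
        channels *= 2
        trace.append((f"downsample_{i + 1}", (size, size, channels)))

    # The text conditioning vector is replicated at every spatial position
    # and concatenated along the channel axis. No noise vector z is added.
    channels += cond_dim
    trace.append(("concat_text_conditioning", (size, size, channels)))

    # Residual blocks keep the spatial size; they are modelled here as a
    # single step that fuses the channels back to the encoder width.
    channels = base_channels * 2 ** num_down
    trace.append(("residual_blocks", (size, size, channels)))

    # Decoder: each upsampling step doubles the spatial size and halves
    # the channel count, followed by a final projection to RGB.
    for i in range(num_up):
        size *= 2
        channels //= 2
        trace.append((f"upsample_{i + 1}", (size, size, channels)))
    trace.append(("output_image", (size, size, 3)))
    return trace


for name, shape in stage2_generator_shapes():
    print(f"{name:26s} {shape}")
```

The only inputs in this trace are the Stage-I image and the text conditioning vector, which reflects the absence of a separate noise input at this stage.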

We start by using the pre-trained text encoder to generate the Gaussian conditioning variables ĉ. Because both stages share the same pre-trained encoder, this produces the same text embedding φ_t as in Stage-I. However, the Stage-I and Stage-II Conditioning Augmentations have different fully connected layers, which generate different means and standard deviations. As a result, the Stage-II GAN can learn to capture useful information in the text embedding that is omitted by the Stage-I GAN.
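The sketch below illustrates this arrangement in plain Python: two Conditioning Augmentation modules receive the same text embedding but own separate fully connected layers, so they produce different means and standard deviations. The class name `ConditioningAugmentation`, the helper functions, the toy dimensions, the random initialisation, and the log-variance parameterisation are all assumptions made for illustration; a real implementation would use a deep learning framework and learned weights.

```python
import math
import random


def make_linear(in_dim, out_dim, rng):
    """Create a randomly initialised dense layer as (weights, biases)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    biases = [0.0] * out_dim
    return weights, biases


def linear(layer, x):
    """Apply a dense layer to the vector x."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


class ConditioningAugmentation:
    """Map a text embedding to a sampled Gaussian conditioning vector."""

    def __init__(self, embed_dim, cond_dim, seed):
        self.rng = random.Random(seed)
        self.cond_dim = cond_dim
        # Assumption: one dense layer emits the mean and the log-variance.
        self.fc = make_linear(embed_dim, 2 * cond_dim, self.rng)

    def __call__(self, embedding):
        out = linear(self.fc, embedding)
        mean = out[:self.cond_dim]
        log_var = out[self.cond_dim:]
        std = [math.exp(0.5 * lv) for lv in log_var]
        # Reparameterisation: sample = mean + std * eps, with eps ~ N(0, 1).
        eps = [self.rng.gauss(0.0, 1.0) for _ in range(self.cond_dim)]
        sample = [m + s * e for m, s, e in zip(mean, std, eps)]
        return sample, mean, std


# The same (toy) text embedding feeds both stages.
embedding = [0.5, -1.0, 0.25, 2.0]

# Each stage owns its fully connected layer; distinct seeds stand in for
# independently learned weights.
ca_stage1 = ConditioningAugmentation(embed_dim=4, cond_dim=3, seed=1)
ca_stage2 = ConditioningAugmentation(embed_dim=4, cond_dim=3, seed=2)

c_stage1, mean1, std1 = ca_stage1(embedding)
c_stage2, mean2, std2 = ca_stage2(embedding)
print("Stage-I mean: ", mean1)
print("Stage-II mean:", mean2)
```

Because the two modules do not share weights, each stage is free to emphasise different aspects of the same embedding.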

Images generated by the Stage-I GAN can lack vivid object parts, may contain shape distortions, and may omit details that are essential for generating photo-realistic images. The Stage-II GAN is built upon the output of the Stage-I GAN. It is conditioned on both the low-resolution image generated by the Stage-I GAN and the text description, and it produces high-resolution images by correcting these defects.
