Architecture of StackGAN

StackGAN is a two-stage network. Each stage has its own generator and its own discriminator, so the full model contains two generators and two discriminators. StackGAN is made up of several networks, which are as follows:

  • Stack-I GAN: text encoder, Conditioning Augmentation network, generator network, discriminator network, embedding compressor network
  • Stack-II GAN: text encoder, Conditioning Augmentation network, generator network, discriminator network, embedding compressor network

Source: arXiv:1612.03242 [cs.CV]

The preceding image shows both stages of the StackGAN network. The first stage generates low-resolution images with dimensions of 64x64. The second stage takes these low-resolution images, together with the text embedding, and generates high-resolution images with dimensions of 256x256. In the next few sections, we will explore the different components of the StackGAN network. Before doing this, however, let's get familiar with the notation used in this chapter:

Notation                  Description

t                         A text description drawn from the true data distribution.
z                         A noise vector randomly sampled from a Gaussian distribution p_z.
φ_t                       The text embedding of the given text description, generated by a pre-trained encoder.
ĉ0                        The Gaussian text conditioning variable, sampled from the distribution N(μ0(φ_t), Σ0(φ_t)). It captures the different meanings of φ_t.
N(μ0(φ_t), Σ0(φ_t))       The conditioning Gaussian distribution.
N(0, I)                   The standard normal distribution.
Σ0(φ_t)                   The diagonal covariance matrix.
p_data                    The true data distribution.
p_z                       The Gaussian distribution from which z is sampled.
D1                        The Stage-I discriminator.
G1                        The Stage-I generator.
D2                        The Stage-II discriminator.
G2                        The Stage-II generator.
N_z                       The dimensionality of the random noise vector z.
ĉ                         The Gaussian latent variables for the Stack-II GAN.
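
To make the notation more concrete, here is a minimal sketch of how a text conditioning variable can be drawn from a Gaussian with a diagonal covariance matrix and then joined with a noise vector sampled from N(0, I). It uses the common reparameterization mean + sigma * eps, with eps drawn from N(0, I). The function names, the log-sigma parameterization, and the toy vector sizes are assumptions made for illustration only; they are not taken from the StackGAN reference code, and in a real model the mean and log-sigma would come from the Conditioning Augmentation network rather than being written by hand.

```python
import math
import random


def sample_conditioning(mu, log_sigma, rng):
    """Draw one sample from N(mu, diag(sigma^2)) as mu + sigma * eps, eps ~ N(0, I).

    mu and log_sigma are plain lists of equal length (hypothetical outputs of a
    Conditioning Augmentation network); sigma is recovered as exp(log_sigma).
    """
    return [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mu, log_sigma)]


def stage1_generator_input(mu, log_sigma, noise_dim, rng):
    """Concatenate the conditioning variable with a noise vector z ~ N(0, I)."""
    c_hat = sample_conditioning(mu, log_sigma, rng)
    z = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return c_hat + z


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Toy values standing in for the mean and log standard deviation of one caption.
mu = [0.5, -1.0, 2.0, 0.0]
log_sigma = [-2.0, -2.0, -2.0, -2.0]

g_input = stage1_generator_input(mu, log_sigma, noise_dim=6, rng=rng)
print(len(g_input))  # 4 conditioning dimensions + 6 noise dimensions = 10
```

Because the covariance matrix is diagonal, each dimension of the conditioning variable is sampled independently, which is why a per-element loop is enough here. Sampling the same caption several times yields slightly different conditioning vectors around the same mean, which is how one text embedding can be mapped to several plausible variations.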
