Architecture of StackGAN

StackGAN is a two-stage network. Each stage has its own generator and its own discriminator, so the full model contains two generators and two discriminators. Each stage is made up of several networks, which are as follows:

Stack-I GAN: text encoder, Conditioning Augmentation network, generator network, discriminator network, embedding compressor network
Stack-II GAN: text encoder, Conditioning Augmentation network, generator network, discriminator network, embedding compressor network

Source: arXiv:1612.03242 [cs.CV]

The preceding image shows both stages of the StackGAN network. The first stage generates low-resolution images with dimensions of 64x64. The second stage then takes these low-resolution images and generates high-resolution images with dimensions of 256x256. In the next few sections, we will explore the different components in the StackGAN network. Before doing this, however, let's get familiar with the notations that are used in this chapter:

Notation	Description
t	This is a text description drawn from the true data distribution.
z	This is a noise vector randomly sampled from a Gaussian distribution Pz.
φt	This is a text embedding of the given text description generated by a pre-trained encoder.
ĉ0	This text conditioning variable is a Gaussian conditioning variable sampled from the distribution N(μ0(φt), Σ0(φt)). It captures the different meanings of φt.
N(μ(φt), Σ(φt))	This is a conditioning Gaussian distribution.
N(0,I)	This is a standard normal distribution (zero mean, identity covariance).
Σ(φt)	This is a diagonal covariance matrix.
pdata	This is the true data distribution.
Pz	This is the Gaussian distribution from which the noise z is sampled.
D1	This is the Stage-I discriminator.
G1	This is the Stage-I generator.
D2	This is the Stage-II discriminator.
G2	This is the Stage-II generator.
Nz	This is the dimension of the random noise variable z.
ĉ	These are the Gaussian latent variables for the Stack-II GAN.
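
To make the notation more concrete, the following is a minimal NumPy sketch of how the Gaussian conditioning variable can be sampled from a text embedding and combined with the noise vector to form the Stage-I generator input. It is an illustration under stated assumptions, not the implementation used later in this chapter: the function name conditioning_augmentation, the single linear layer with random weights, the choice to predict the log of the standard deviation, and the sizes (1024 for the embedding, 128 for the conditioning variable, 100 for the noise) are all assumptions made for the example. The text embedding itself is replaced by random numbers, because no pre-trained encoder is loaded here.

```python
import numpy as np


def conditioning_augmentation(phi_t, w, b, rng):
    """Sample a conditioning variable from a Gaussian with a diagonal covariance.

    phi_t: text embeddings, shape (batch, embed_dim)
    w, b:  weights and bias of one linear layer (hypothetical, untrained here)
    """
    # One linear layer predicts both the mean and the log standard deviation.
    stats = phi_t @ w + b
    mu, log_sigma = np.split(stats, 2, axis=-1)
    sigma = np.exp(log_sigma)  # per-dimension standard deviation, always positive

    # Reparameterization: c_hat = mu + sigma * eps, with eps ~ N(0, I).
    eps = rng.standard_normal(mu.shape)
    c_hat = mu + sigma * eps
    return c_hat, mu, sigma


rng = np.random.default_rng(0)

# Assumed sizes, chosen only for illustration.
batch_size, embed_dim, cond_dim, noise_dim = 4, 1024, 128, 100

# Stand-in for the output of a pre-trained text encoder.
phi_t = rng.standard_normal((batch_size, embed_dim))

# Untrained layer parameters (hypothetical); a real model would learn these.
w = rng.standard_normal((embed_dim, 2 * cond_dim)) * 0.01
b = np.zeros(2 * cond_dim)

c_hat, mu, sigma = conditioning_augmentation(phi_t, w, b, rng)

# Noise vector z sampled from N(0, I).
z = rng.standard_normal((batch_size, noise_dim))

# The Stage-I generator receives the conditioning variable concatenated with z.
g1_input = np.concatenate([c_hat, z], axis=1)
print(g1_input.shape)  # (4, 228)
```

Sampling the conditioning variable instead of feeding the embedding directly means that the same text description can yield slightly different conditioning vectors, which is what the table means by capturing the different meanings of the text embedding.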
