The purpose of the CA network is to convert a text embedding vector into a conditioning latent variable. In the CA network, the text embedding vector is passed through a fully connected layer with a nonlinearity, which produces the mean and the diagonal covariance matrix of a Gaussian distribution from which the conditioning variable is sampled.
The following code shows how to create a CA network:
- Start by creating a fully connected layer with 256 nodes and LeakyReLU as the activation function:
input_layer = Input(shape=(1024,))
x = Dense(256)(input_layer)
mean_logsigma = LeakyReLU(alpha=0.2)(x)
The input shape is (batch_size, 1024), and the output shape is (batch_size, 256).
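As a quick shape check, this dense layer maps 1024 features to 256, so it holds 1024 * 256 weights plus 256 biases. The following NumPy sketch mimics the layer with randomly initialized weights (the names W and b and the batch size of 4 are illustrative, not part of the original code):

```python
import numpy as np

batch = np.random.randn(4, 1024)  # (batch_size, 1024) text embeddings
W = np.random.randn(1024, 256)    # stand-in for the Dense(256) kernel
b = np.zeros(256)                 # stand-in for the Dense(256) bias

out = batch @ W + b
print(out.shape)        # (4, 256)
print(W.size + b.size)  # 262400 trainable parameters
```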
- Next, split mean_logsigma into mean and log_sigma tensors:
mean = mean_logsigma[:, :128]
log_sigma = mean_logsigma[:, 128:]
This operation creates two tensors of dimensions (batch_size, 128) and (batch_size, 128).
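The split can be sketched with plain NumPy slicing, which behaves the same way as the Keras tensor slicing above (the batch size of 4 is arbitrary):

```python
import numpy as np

# Stand-in for the (batch_size, 256) activation from the dense layer.
mean_logsigma = np.random.randn(4, 256)

# The first 128 columns become the mean; the rest become log_sigma.
mean = mean_logsigma[:, :128]
log_sigma = mean_logsigma[:, 128:]

print(mean.shape, log_sigma.shape)  # (4, 128) (4, 128)
```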
- Next, calculate the text conditioning variable using the following code. Refer to the Conditional Augmentation (CA) block section inside the Architecture of StackGAN subsection for more information on how to generate text conditioning variables:
stddev = K.exp(log_sigma)
epsilon = K.random_normal(shape=K.shape(mean))
c = stddev * epsilon + mean
This produces a tensor of shape (batch_size, 128), which is our text conditioning variable. The complete code for the CA network is as follows:
def generate_c(x):
    # Split the (batch_size, 256) input into mean and log-variance halves
    mean = x[:, :128]
    log_sigma = x[:, 128:]
    # Sample c = mean + stddev * epsilon (the reparameterization trick)
    stddev = K.exp(log_sigma)
    epsilon = K.random_normal(shape=K.shape(mean))
    c = stddev * epsilon + mean
    return c
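The sampling step is the reparameterization trick: instead of sampling c directly from a Gaussian with the predicted mean and standard deviation, we sample epsilon from a standard normal and scale and shift it, which keeps the operation differentiable with respect to mean and log_sigma. A framework-free NumPy sketch of generate_c (the batch size of 2 is chosen only for illustration):

```python
import numpy as np

def generate_c_np(x):
    """NumPy version of generate_c for a (batch_size, 256) input."""
    mean = x[:, :128]
    log_sigma = x[:, 128:]
    stddev = np.exp(log_sigma)
    epsilon = np.random.randn(*mean.shape)  # standard normal noise
    return stddev * epsilon + mean

x = np.random.randn(2, 256)
c = generate_c_np(x)
print(c.shape)  # (2, 128)
```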
The entire code for the conditioning block looks as follows:
def build_ca_model():
    input_layer = Input(shape=(1024,))
    x = Dense(256)(input_layer)
    mean_logsigma = LeakyReLU(alpha=0.2)(x)
    c = Lambda(generate_c)(mean_logsigma)
    return Model(inputs=[input_layer], outputs=[c])
In the preceding code, the build_ca_model() function creates a Keras model with one fully connected layer, LeakyReLU as the activation function, and a Lambda layer that applies generate_c to sample the conditioning variable.
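End to end, the CA block is just a dense projection, a LeakyReLU, and the sampling step. The whole forward pass can be sketched in NumPy without Keras; the weights W and b here are hypothetical random initializations, not trained values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical randomly initialized weights for Dense(256) on a 1024-d input.
W = rng.standard_normal((1024, 256)) * 0.01
b = np.zeros(256)

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def ca_forward(embedding):
    """Map (batch_size, 1024) embeddings to a (batch_size, 128) conditioning variable."""
    mean_logsigma = leaky_relu(embedding @ W + b)
    mean = mean_logsigma[:, :128]
    log_sigma = mean_logsigma[:, 128:]
    epsilon = rng.standard_normal(mean.shape)
    return np.exp(log_sigma) * epsilon + mean

c = ca_forward(rng.standard_normal((4, 1024)))
print(c.shape)  # (4, 128)
```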