Downsampling blocks

This block takes a low-resolution image of the dimensions 64x64x3 from the generator of Stage-I and downsamples it to generate a tensor with a shape of 16x16x512. The image goes through a series of 2D convolution blocks.

In this section, we will write the implementation for the downsampling blocks.

Start by creating with the first downsampling block. This block contains a 2D convolution layer with ReLU as the activation function. Before applying the 2D convolution, pad the input with zeros on all sides. The configurations for the different layers in this block are the following:
- Padding size: (1, 1)
- Filters: 128
- Kernel size: (3, 3)
- Strides: 1
- Activation: ReLU

x = ZeroPadding2D(padding=(1, 1))(input_lr_images)
x = Conv2D(128, kernel_size=(3, 3), strides=1, use_bias=False)(x)
x = ReLU()(x)

Next, add a second convolution block with the following configurations:
- Padding size: (1, 1)
- Filters: 256
- Kernel size: (4, 4)
- Strides: 2
- Batch normalization: Yes
- Activation: ReLU

x = ZeroPadding2D(padding=(1, 1))(x)
x = Conv2D(256, kernel_size=(4, 4), strides=2, use_bias=False)(x)
x = BatchNormalization()(x)
x = ReLU()(x)

After that, add another convolution block with the following configurations:
- Padding size: (1, 1)
- Filters: 512
- Kernel size: (4, 4)
- Strides: 2
- Batch normalization: Yes
- Activation: ReLU

x = ZeroPadding2D(padding=(1, 1))(x)
x = Conv2D(512, kernel_size=(4, 4), strides=2, use_bias=False)(x)
x = BatchNormalization()(x)
x = ReLU()(x)

The downsampling block generates a tensor with a shape of 16x16x512. After that, we have a series of residual blocks. Before feeding this tensor to the residual block, we need to concatenate it to the text-conditioning variable. The code to do this is as follows:

# This block will extend the text conditioning variable and concatenate it to the encoded images tensor.
def joint_block(inputs):
    c = inputs[0]
    x = inputs[1]

    c = K.expand_dims(c, axis=1)
    c = K.expand_dims(c, axis=1)
    c = K.tile(c, [1, 16, 16, 1])
    return K.concatenate([c, x], axis=3)
# This is the lambda layer which we will add to the generator network
c_code = Lambda(joint_block)([c, x])

Here, the shape of c is (batch_size, 228) and the shape of x is (batch_size, 16, 16, 512). The shape of the c_code will be (batch_size, 640).

Table of Contents for Downsampling blocks

Create new playlist

Sign In

Sign Up

Table of Contents for
Downsampling blocks