© Poornachandra Sarang 2021
P. Sarang, Artificial Neural Networks with TensorFlow 2, https://doi.org/10.1007/978-1-4842-6150-7_12

12. Style Transfer

Poornachandra Sarang1  
(1)
Mumbai, India
 

Introduction

Ever wish you could paint like Picasso or the famous Indian painter M.F. Husain? It seems neural networks have made that wish come true. In this chapter, you will learn a technique that uses neural networks to compose a picture you have taken in the style of a famous artist, or indeed in any style of your choice. The technique is called neural style transfer and is described in Leon A. Gatys' well-known paper "A Neural Algorithm of Artistic Style." Though the paper is a great read, you will not need all of its details to understand this chapter.

Neural style transfer is an optimization technique that renders the contents of one image in the style of another image. TensorFlow Hub, which you learned to use in a previous chapter, contains a pre-trained model for style transfer. First, I will show you how to use this model to get started quickly. This will be followed by a do-it-yourself example that teaches you how to extract the content and the style from two different images and then transform the content image to create a new stylized image.

To give you a quick view of what you are going to achieve, look at Table 12-1.
Table 12-1

The content, style, and stylized images

 ../images/495303_1_En_12_Chapter/495303_1_En_12_Figa_HTML.gif

The first image on the left is the content image, the image in the middle is the style image, and the image on the right is the stylized image. Note how the style of the middle image is applied to the contents of the image on the left to produce a new stylized image.

The theory behind style transfer is straightforward, and I will cover it in the custom style transfer program later in this chapter.

So, let us get started with fast style transfer.

Fast Style Transfer

TF Hub provides a pre-trained model for doing quick style transfers. The module name is "arbitrary-image-stylization-v1-256/2". It performs a fast artistic style transfer compared to the original work on artistic style transfer with neural networks and works with arbitrary painting styles. The model is based on the technique proposed by Ghiasi et al. in their paper "Exploring the structure of a real-time, arbitrary neural artistic stylization network" (https://arxiv.org/abs/1705.06830). The proposed model combines the flexibility of the neural algorithm of artistic style with the speed of fast style transfer networks, facilitating real-time stylization with any content/style image pair.

You will be using this pre-trained model hosted on TensorFlow Hub for getting a quick understanding of the effects of style transfer.

Creating Project

Create a new Colab project and rename it to TFHubStyleTransfer. Import the required libraries.
import tensorflow as tf
import re
import urllib
import numpy as np
import matplotlib.pyplot as plt
import PIL.Image
import tensorflow_hub as hub
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from matplotlib import gridspec
from IPython import display
from PIL import Image

Downloading Images

The project requires two images at any given time: the content image and the style image. The content image will be modified to adopt the style given in the style image. For your testing, I have uploaded a few images to the book's repository. The URL for an image takes the following form:
https://raw.githubusercontent.com/Apress/artificial-neural-networks-with-tensorflow-2/main/ch12/ferns.jpg
Write a function to extract the filename from the URL, download the file, and return the local path to the image. The function definition is shown in Listing 12-1.
def download_image_from_URL(imageURL):
  imageName = re.search('[a-z0-9-]+.(jpe?g|png|gif|bmp|JPG)',
                        imageURL, re.IGNORECASE)
  imageName = imageName.group(0)
  urllib.request.urlretrieve(imageURL, imageName)
  imagePath = "./" + imageName
  return imagePath
Listing 12-1

Function for downloading an image from a URL

Call this function to create a path to the target image.
# This is the path to the image you want to transform.
target_url = "https://raw.githubusercontent.com/Apress/artificial-neural-networks-with-tensorflow-2/main/ch12/ferns.jpg"
target_path = download_image_from_URL(target_url)
Likewise, download the image for styling and set its path.
# This is the path to the style image.
style_url = "https://raw.githubusercontent.com/Apress/artificial-neural-networks-with-tensorflow-2/main/ch12/on-the-road.jpg"
style_path = download_image_from_URL(style_url)
The user interface as seen in the Colab notebook is shown in Figure 12-1.
../images/495303_1_En_12_Chapter/495303_1_En_12_Fig1_HTML.jpg
Figure 12-1

Colab interface for selecting image files

Select the desired target and style images from the drop-down list on the right-hand side of the cell.

You can display both the selected images using the matplotlib imshow function with the following code fragment:
content = Image.open(target_path)
style = Image.open(style_path)
plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(content)
plt.title('Content Image')
plt.subplot(1, 2, 2)
plt.imshow(style)
plt.title('Style Image')
plt.tight_layout()
plt.show()
The output is shown in Figure 12-2.
../images/495303_1_En_12_Chapter/495303_1_En_12_Fig2_HTML.jpg
Figure 12-2

The content and style images

So, we have two images with different dimensions.

Preparing Images for Model Input

The tfhub module that performs the image transformation requires the images to be in a specific format to produce good results. First, we transform the style image into a tensor using the following function:
def image_to_tensor_style(path_to_img):
    img = tf.io.read_file(path_to_img)
    img = tf.image.decode_image(img, channels=3,
                                dtype=tf.float32)
    img = tf.image.resize(img, [256, 256])
    img = img[tf.newaxis, :]
    return img

The function reads the image data by calling the read_file method and decodes it into three RGB channels by calling decode_image. The image is resized to 256x256 because the pre-trained model uses this particular size for the style image. Finally, the image data is returned to the caller as a tensor of shape (1, 256, 256, 3); the added leading dimension is the batch dimension used later when processing a batch of images.
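As a quick sanity check, you can inspect the shape of the tensor returned for the style image you downloaded earlier (the variable name style_tensor below is my own and is only for illustration):
style_tensor = image_to_tensor_style(style_path)
print(style_tensor.shape)   # expected: (1, 256, 256, 3)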

Likewise, we write a function to convert the target image into a tensor as follows:
def image_to_tensor_target(path_to_img, image_size):
    img = tf.io.read_file(path_to_img)
    img = tf.image.decode_image(img, channels=3,
                                dtype=tf.float32)
    img = tf.image.resize(img, [image_size, image_size],
                          preserve_aspect_ratio=True)
    img = img[tf.newaxis, :]
    return img

In addition to the image path, this function takes an extra parameter: the user-defined image size. Large images take up a lot of memory during processing, so this parameter lets you reduce the size of the image while maintaining its aspect ratio; note the preserve_aspect_ratio argument in the resize call. The aspect ratio is the ratio of the width to the height of an image. When the 1600x1200 target image that you loaded earlier is resized with a target size of 400, the output tensor takes the shape (1, 300, 400, 3): the longer side is scaled down to 400, and the shorter side shrinks in proportion.
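To see where the shape (1, 300, 400, 3) comes from, here is the arithmetic that tf.image.resize performs when preserve_aspect_ratio is set, sketched in plain Python for our 1600x1200 image (illustrative only):
width, height = 1600, 1200                               # original image (width x height)
target_size = 400
scale = min(target_size / width, target_size / height)   # fit inside a 400x400 box
new_width, new_height = int(width * scale), int(height * scale)
print(new_width, new_height)                             # 400 300 -> tensor shape (1, 300, 400, 3)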

You will now convert both images to tensors by calling the two methods defined earlier:
output_image_size = 400
target_image = image_to_tensor_target(target_path, output_image_size)
style_image = image_to_tensor_style(style_path)

Performing Styling

To apply a new style to the target image, you will need to load the pre-trained module from the tfhub. You do this using the following statement:
hub_module = hub.load('https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2')
The module name is arbitrary-image-stylization-v1-256/2. After the module is loaded, you perform the transformation by simply passing the two tensors as inputs to the module:
outputs = hub_module(tf.constant(target_image),
                     tf.constant(style_image))
stylized_image = outputs[0]

The outputs holds the results of the transformation, and outputs[0], our stylized_image, is a tensor of shape (1, 300, 400, 3). This is the data for the transformed image. Note the size 300x400, which is the scaled-down version of our original 1600x1200 image.

Displaying Output

To display the image, we will need to convert the tensor to the image format using the following code:
tensor = stylized_image*256
tensor = np.array(tensor, dtype=np.uint8)
tensor = tensor[0]
PIL.Image.fromarray(tensor)
We scale the image data up because stylized_image contains values in the range 0 to 1, whereas an 8-bit image expects values in the range 0 to 255. The output image is shown in Figure 12-3.
../images/495303_1_En_12_Chapter/495303_1_En_12_Fig3_HTML.jpg
Figure 12-3

Stylized image
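If you plan to repeat this conversion for other results, you could wrap it in a small helper. The function name tensor_to_image is my own; this is only a minimal sketch:
def tensor_to_image(tensor):
    # scale the 0-1 float data to the 0-255 range
    # (255 avoids wrap-around when a value is exactly 1.0)
    array = np.array(tensor * 255, dtype=np.uint8)
    if array.ndim == 4:
        array = array[0]   # drop the batch dimension
    return PIL.Image.fromarray(array)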

If you want to display a scaled-down image, shown in Figure 12-4, simply call the imshow method as follows:
plt.imshow(tensor)
../images/495303_1_En_12_Chapter/495303_1_En_12_Fig4_HTML.jpg
Figure 12-4

Scaled-down stylized image

Some More Results

I did a few more transformations on the images loaded in the project. The results are shown in Table 12-2.
Table 12-2

Model inference on different images

../images/495303_1_En_12_Chapter/495303_1_En_12_Figb_HTML.gif

Full Source

The full source code for TFHubStyleTransfer is given in Listing 12-2.
import tensorflow as tf
import re
import urllib
import numpy as np
import matplotlib.pyplot as plt
import PIL.Image
import tensorflow_hub as hub
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from matplotlib import gridspec
from IPython import display
from PIL import Image
def download_image_from_URL(imageURL):
  imageName = re.search('[a-z0-9-]+.(jpe?g|png|gif|bmp|JPG)',
                        imageURL, re.IGNORECASE)
  imageName = imageName.group(0)
  urllib.request.urlretrieve(imageURL, imageName)
  imagePath = "./" + imageName
  return imagePath
# This is the path to the image you want to transform.
target_url = "https://raw.githubusercontent.com/Apress/artificial-neural-networks-with-tensorflow-2/main/ch12/ferns.jpg"
target_path = download_image_from_URL(target_url)
# This is the path to the style image.
style_url = "https://raw.githubusercontent.com/Apress/artificial-neural-networks-with-tensorflow-2/main/ch12/on-the-road.jpg"
style_path = download_image_from_URL(style_url)
content = Image.open(target_path)
style = Image.open(style_path)
plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(content)
plt.title('Content Image')
plt.subplot(1, 2, 2)
plt.imshow(style)
plt.title('Style Image')
plt.tight_layout()
plt.show()
def image_to_tensor_style(path_to_img):
    img = tf.io.read_file(path_to_img)
    img = tf.image.decode_image(img, channels=3,
                                dtype=tf.float32)
    img = tf.image.resize(img, [256,256])
    img = img[tf.newaxis, :]
    return img
def image_to_tensor_target(path_to_img, image_size):
    img = tf.io.read_file(path_to_img)
    img = tf.image.decode_image(img, channels=3,
                                dtype=tf.float32)
    img = tf.image.resize(img,
                          [image_size,image_size],
                          preserve_aspect_ratio=True)
    img = img[tf.newaxis, :]
    return img
output_image_size = 400
target_image = image_to_tensor_target(target_path, output_image_size)
style_image = image_to_tensor_style(style_path)
hub_module = hub.load('https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2')
outputs = hub_module(tf.constant(target_image),
                     tf.constant(style_image))
stylized_image = outputs[0]
tensor = stylized_image*256
tensor = np.array(tensor, dtype=np.uint8)
tensor = tensor[0]
PIL.Image.fromarray(tensor)
plt.imshow(tensor)
Listing 12-2

TFHubStyleTransfer full source

Do It Yourself

Having learned how to do a quick style transfer, it is time to learn the technique behind such projects. The principle behind style transfer is to extract the style of an image, usually a famous painting, and apply it to the contents of an image of your choice. Thus, there are two input images: the content image and the style image. The newly generated image is generally called the stylized image; it contains the same contents as the content image but acquires a style similar to that of the style image. As you might guess, this is not done by simply superimposing the images. Our program must be able to distinguish between the content and the style of a given image. That is where we will use the pre-trained VGG16 network model to extract this information and build our own network to create a stylized image from these inputs. Android apps such as Prisma and Lucid perform such style transfers. Though this chapter will not teach you to develop a similar Android application, this project will teach you the internals of such apps.

Let us first look at the VGG16 architecture to understand how to extract the content and style from an image.

VGG16 Architecture

Gatys et al. (2015) came up with the core idea behind style transfer: CNNs (convolutional neural networks) pre-trained for image classification already encode perceptual and semantic information about an image. Many such pre-trained CNNs are available. We will use VGG16 to extract the features of an image and then work independently on its content and style. The original paper uses the 19-layer VGG network model from Simonyan and Zisserman (2015). The VGG16 model architecture is shown in Figure 12-5.
../images/495303_1_En_12_Chapter/495303_1_En_12_Fig5_HTML.jpg
Figure 12-5

The VGG16 architecture (Image credits: researchgate.​net)

As we are not doing image classification and are only interested in feature extraction, we do not need the fully connected layers or the final softmax classifier of the VGG network; we only need part of the model. So, how do we extract only a certain portion of the model? Fortunately for us, this is a trivial task because Keras provides a pre-trained VGG16 model in which you can separate out the layers. Keras also provides many other models, including the deeper VGG19. To remove the topmost fully connected layers, you set the include_top argument to False when loading the model.
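For instance, you can load just the convolutional base and confirm that the fully connected layers and the softmax classifier are gone; this quick check is not part of the project code:
from tensorflow.keras.applications import vgg16
feature_extractor = vgg16.VGG16(weights='imagenet', include_top=False)
feature_extractor.summary()   # only the convolutional blocks are listed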

Creating Project

Create a new Colab project and rename it to CustomStyleTransfer. Install the following two packages:
!pip install keras==2.3.1
!pip install tensorflow==2.1.0
Note

At the time of this writing, the code in this project was found to run with the Keras and TensorFlow versions specified above and does not yet support newer releases.

Import the required libraries.
import tensorflow as tf
import re
import urllib
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from matplotlib import pyplot as plt
from IPython import display
from PIL import Image
import numpy as np
from tensorflow.keras.applications import vgg16
from tensorflow.keras import backend as K
from keras import backend as K
from scipy.optimize import fmin_l_bfgs_b

Downloading Images

As in the earlier project, you will write a download function and call it to download the two images required for the project. The code is given in Listing 12-3.
def download_image_from_URL(imageURL):
  imageName = re.search('[a-z0-9-]+.(jpe?g|png|gif|bmp|JPG)',
                        imageURL, re.IGNORECASE)
  imageName = imageName.group(0)
  urllib.request.urlretrieve(imageURL, imageName)
  imagePath = "./" + imageName
  return imagePath
# This is the path to the image you want to transform.
target_url = "https://raw.githubusercontent.com/Apress/artificial-neural-networks-with-tensorflow-2/main/ch12/blank-sign.jpg"
target_path = download_image_from_URL(target_url)
# This is the path to the style image.
style_url = "https://raw.githubusercontent.com/Apress/artificial-neural-networks-with-tensorflow-2/main/ch12/road.jpg"
style_path = download_image_from_URL(style_url)
Listing 12-3

Function for downloading images

We will scale the target image to have a height of 400 pixels. To maintain the aspect ratio, we recompute the width as follows:
width, height = load_img(target_path).size
img_height = 400
img_width = int(width * img_height / height)

Displaying Images

To display the two images, we use code similar to that in the previous project. The code is given in Listing 12-4.
content = Image.open(target_path)
style = Image.open(style_path)
plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(content)
plt.title('Content Image')
plt.subplot(1, 2, 2)
plt.imshow(style)
plt.title('Style Image')
plt.tight_layout()
plt.show()
Listing 12-4

Displaying content and style images

The output is shown in Figure 12-6.
../images/495303_1_En_12_Chapter/495303_1_En_12_Fig6_HTML.jpg
Figure 12-6

Content and style images

Preprocessing Images

As mentioned earlier, you will be using the VGG16 model to extract the features of an image. We need to process our image data in the same way as the VGG training data. Fortunately, Keras provides this preprocessing not just for VGG16 but for many other popular models such as ResNet, Inception, and DenseNet. The library provides a function called preprocess_input that takes a tensor or NumPy array encoding a batch of images and returns a preprocessed NumPy array or tf.Tensor of type float32. The function converts the images from RGB to BGR and zero-centers each channel. Note that the VGG networks were trained on BGR (Blue/Green/Red) images with each channel zero-centered using mean = [103.939, 116.779, 123.68]. The code in Listing 12-5 performs this preprocessing on a given image.
def preprocess_image(image_path):
    img = load_img(image_path,
            target_size=(img_height, img_width))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = tf.keras.applications.vgg16.preprocess_input(img)
    return img
Listing 12-5

Preprocessing image for the VGG16 network
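To see concretely what preprocess_input does, you can feed it a single white pixel; this is an illustrative check, not part of Listing 12-5:
sample = np.ones((1, 1, 1, 3), dtype='float32') * 255.0
print(tf.keras.applications.vgg16.preprocess_input(sample))
# roughly [[[[151.061 138.221 131.32]]]] -- channels reordered to BGR, means subtracted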

If you wish to view the generated outputs, we need to reverse this preprocessing and clip all values to the 0–255 range. We do this in the following function definition:
def deprocess_image(x):
    # Remove zero-center by mean pixel
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    # 'BGR'->'RGB'
    x = x[:, :, ::-1]
    x = np.clip(x, 0, 255).astype('uint8')
    return x

You will now build the model based on the VGG16 model.

Model Building

To build the model, we feed VGG16 with our image tensor data and extract the feature maps, that is, the content and style representations. The model is loaded with pre-trained ImageNet weights. The model-building code is given here:
target = K.constant(preprocess_image(target_path))
style = K.constant(preprocess_image(style_path))
# This placeholder will contain our generated image
combination_image = K.placeholder((1, img_height, img_width, 3))
# We combine the 3 images into a single batch
input_tensor = K.concatenate([target,
                              style,
                              combination_image],
                                  axis=0)
# Build the VGG16 network with our batch of 3 images as input.
model = vgg16.VGG16(input_tensor=input_tensor,
                    weights='imagenet',
                    include_top=False)

In this code, we first build the input tensors for our content and style images by calling the preprocess method defined earlier. We create a placeholder for the generated image and then stack the three images into a single tensor by calling the concatenate method. This tensor is passed as the input_tensor parameter to the VGG16 constructor to build our feature-extraction model.
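If you want to confirm what is being fed to VGG16, you can inspect the static shape of the combined batch (a quick check):
print(K.int_shape(input_tensor))
# (3, img_height, img_width, 3): content, style, and the placeholder for the generated image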

You can now examine the model summary.
model.summary()
The summary output is shown in Figure 12-7.
../images/495303_1_En_12_Chapter/495303_1_En_12_Fig7_HTML.jpg
Figure 12-7

Model summary

Content Loss

We compute the content loss at the chosen content layer. At each iteration, the input batch is fed through the model, and the Keras backend computes the loss along with its gradients. The content loss indicates how similar the generated image (G) is to the content image (C). It is computed as follows.

Assume that we choose a hidden layer l of the pre-trained VGG network to compute the loss. Let P and X represent the original image and the generated image, and let P^l and F^l be their feature representations in layer l. Then, the content loss is defined as follows:
$$ L_{content}\left(\vec{P},\vec{X},l\right)=\frac{1}{2}\sum_{i,j}\left(F_{ij}^{l}-P_{ij}^{l}\right)^2 $$
We code this formula of the content loss as follows:
def content_loss(base, combination):
    return K.sum(K.square(combination - base))
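Note that the code drops the constant 1/2 from the formula; that scaling is absorbed by the content weight defined later. A tiny check with hand-picked tensors, purely illustrative:
a = K.constant([[1.0, 2.0], [3.0, 4.0]])
b = K.constant([[1.0, 2.0], [3.0, 6.0]])
print(K.eval(content_loss(a, b)))   # 4.0, i.e., (6 - 4) squared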

Style Loss

To compute the style loss, we first need the Gram matrix. The Gram matrix is an additional preprocessing step that captures the correlations between the different channels of a layer's feature maps; these correlations are later used as the measure of the style itself.

We define the Gram matrix as follows:
def gram_matrix(x):
    features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
    gram = K.dot(features, K.transpose(features))
    return gram
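The Gram matrix has one entry for every pair of channels, so its shape depends only on the channel count, not on the spatial size of the feature map. A quick shape check on a toy feature map (illustrative only):
fmap = K.constant(np.random.rand(4, 5, 3))     # a 4x5 feature map with 3 channels
print(K.eval(gram_matrix(fmap)).shape)         # (3, 3)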
The style loss computes the Gram matrix for both the style image and the generated image and then returns the cost to the caller. The cost is the squared difference between the Gram matrix of the style image and the Gram matrix of the generated image, normalized by the size of the feature maps. Mathematically, this is expressed as
$$ L_{GM}\left(S,G,l\right)=\frac{1}{4N_l^2M_l^2}\sum_{i,j}\left(GM\left[l\right](S)_{ij}-GM\left[l\right](G)_{ij}\right)^2 $$
The definition of the style loss function is given here:
def style_loss(style, combination):
    S = gram_matrix(style)
    C = gram_matrix(combination)
    channels = 3
    size = img_height * img_width
    return K.sum(K.square(S - C)) / (4. * (channels ** 2) * (size ** 2))

Total Variation Loss

To regularize the output for smoothness, we define a total variation loss in adjacent pixels as follows:
def total_variation_loss(x):
    a = K.square(x[:, :img_height - 1, :img_width - 1, :] -
                 x[:, 1:, :img_width - 1, :])
    b = K.square(x[:, :img_height - 1, :img_width - 1, :] -
                 x[:, :img_height - 1, 1:, :])
    return K.sum(K.pow(a + b, 1.25))
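A perfectly flat image has no variation between neighboring pixels, so its total variation loss is zero; this is an easy way to convince yourself that the function behaves as expected (illustrative only):
flat = K.constant(np.ones((1, img_height, img_width, 3)))
print(K.eval(total_variation_loss(flat)))   # 0.0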

Computing Losses for Content and Style

We first select the VGG16 content and style layers. I have used the layers defined in Johnson et al. (2016) rather than the ones suggested by Gatys et al. (2015) because this produces a better end result.

First, we map all the layers to a dictionary.
# Dict mapping layer names to activation tensors
outputs_dict = dict([(layer.name, layer.output)
                       for layer in model.layers])
We extract the content layer:
# Name of layer used for content loss
content_layer = 'block5_conv2'
We extract the style layers:
# Name of layers used for style loss;
style_layers = ['block1_conv1',
                'block2_conv1',
                'block3_conv1',
                'block4_conv1',
                'block5_conv1']
We define a few weight variables to be used when computing the weighted sum of the loss components. Think of them as hyperparameters that decide how much weight the content, style, and total variation terms carry in the final loss.
total_variation_weight = 1e-4
style_weight = 10.
content_weight = 0.025
We compute the total loss by adding all components.
# Define the loss by adding all components to a `loss` variable
loss = K.variable(0.)
layer_features = outputs_dict[content_layer]
target_features = layer_features[0, :, :, :]
combination_features = layer_features[2, :, :, :]
loss = loss + content_weight * content_loss(target_features,
                                            combination_features)
for layer_name in style_layers:
    layer_features = outputs_dict[layer_name]
    style_reference_features = layer_features[1, :, :, :]
    combination_features = layer_features[2, :, :, :]
    sl = style_loss(style_reference_features,
                    combination_features)
    loss += (style_weight / len(style_layers)) * sl
loss += total_variation_weight * total_variation_loss(combination_image)

Evaluator Class

Lastly, we define a class called Evaluator that computes the loss and the gradients in one pass and caches them; this is needed because the SciPy optimizer we use later requests the loss and the gradients through two separate function calls.
grads = K.gradients(loss, combination_image)[0]
# Function to fetch the values of the current loss and the current gradients
fetch_loss_and_grads = K.function([combination_image],
                                  [loss, grads])
class Evaluator(object):
    def __init__(self):
        self.loss_value = None
        self.grads_values = None
    def loss(self, x):
        assert self.loss_value is None
        x = x.reshape((1, img_height, img_width, 3))
        outs = fetch_loss_and_grads([x])
        loss_value = outs[0]
        grad_values = outs[1].flatten().astype('float64')
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value
    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values
evaluator = Evaluator()
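The Evaluator exists because scipy's fmin_l_bfgs_b expects two separate callables, one returning the loss and one returning the gradient as a float64 array, while our backend function computes both values in a single pass. The calling convention it expects is shown here on a toy quadratic, purely for illustration:
def toy_loss(x):
    return float((x[0] - 3.0) ** 2)
def toy_grad(x):
    return np.array([2.0 * (x[0] - 3.0)], dtype='float64')
x_opt, f_opt, info = fmin_l_bfgs_b(toy_loss, np.array([0.0]), fprime=toy_grad)
print(x_opt, f_opt)   # x_opt is approximately [3.], f_opt close to 0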

Generating Output Image

Now that we have all the utility functions ready, it is time to generate the stylized image. We initialize the optimization with the preprocessed content image (the original Gatys et al. approach starts from random noise, but starting from the content image converges faster) and use the L-BFGS (Limited-memory Broyden-Fletcher-Goldfarb-Shanno) algorithm for optimization. The algorithm uses second-order (curvature) information to minimize the function and converges significantly faster than standard gradient descent. The training loop is shown here:
iterations = 50
x = preprocess_image(target_path)
x = x.flatten()
for i in range(1, iterations):
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss,
                                     x,
                                     fprime=evaluator.grads,
                                     maxfun=10)
    print('Iteration %0d, loss: %0.02f' %
                          (i, min_val))
img = x.copy().reshape((img_height, img_width, 3))
img = deprocess_image(img)

At the end of the training, we copy the final output into a variable, reshape it, and deprocess it to make it ready for display.

Displaying Images

We now display all three images using the following code:
plt.figure(figsize=(50, 50))
plt.subplot(3,3,1)
plt.imshow(load_img(target_path, target_size=(img_height, img_width)))
plt.subplot(3,3,2)
plt.imshow(load_img(style_path, target_size=(img_height, img_width)))
plt.subplot(3,3,3)
plt.imshow(img)
plt.show()
The output image is shown in Figure 12-8.
../images/495303_1_En_12_Chapter/495303_1_En_12_Fig8_HTML.jpg
Figure 12-8

The content, style, and stylized images

Full Source

The full source code for CustomStyleTransfer is given in Listing 12-6.
!pip install keras==2.3.1
!pip install tensorflow==2.1.0
import tensorflow as tf
import re
import urllib
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from matplotlib import pyplot as plt
from IPython import display
from PIL import Image
import numpy as np
from tensorflow.keras.applications import vgg16
from tensorflow.keras import backend as K
from keras import backend as K
from scipy.optimize import fmin_l_bfgs_b
def download_image_from_URL(imageURL):
  imageName = re.search('[a-z0-9-]+.(jpe?g|png|gif|bmp|JPG)',
                        imageURL, re.IGNORECASE)
  imageName = imageName.group(0)
  urllib.request.urlretrieve(imageURL, imageName)
  imagePath = "./" + imageName
  return imagePath
# This is the path to the image you want to transform.
target_url = "https://raw.githubusercontent.com/Apress/artificial-neural-networks-with-tensorflow-2/main/ch12/blank-sign.jpg"
target_path = download_image_from_URL(target_url)
# This is the path to the style image.
style_url = "https://raw.githubusercontent.com/Apress/artificial-neural-networks-with-tensorflow-2/main/ch12/road.jpg"
style_path = download_image_from_URL(style_url)
# Dimensions for the generated picture.
width, height = load_img(target_path).size
img_height = 400
img_width = int(width * img_height / height)
content = Image.open(target_path)
style = Image.open(style_path)
plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(content)
plt.title('Content Image')
plt.subplot(1, 2, 2)
plt.imshow(style)
plt.title('Style Image')
plt.tight_layout()
plt.show()
# Preprocess the data as per VGG16 requirements
def preprocess_image(image_path):
    img = load_img(image_path,
                target_size=(img_height, img_width))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = tf.keras.applications.vgg16.preprocess_input(img)
    return img
def deprocess_image(x):
    # Remove zero-center by mean pixel
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    # 'BGR'->'RGB'
    x = x[:, :, ::-1]
    x = np.clip(x, 0, 255).astype('uint8')
    return x
target = K.constant(preprocess_image(target_path))
style = K.constant(preprocess_image(style_path))
# This placeholder will contain our generated image
combination_image = K.placeholder((1, img_height, img_width, 3))
# We combine the 3 images into a single batch
input_tensor = K.concatenate([target, style, combination_image], axis=0)
# Build the VGG16 network with our batch of 3 images as input.
model = vgg16.VGG16(input_tensor=input_tensor,
                    weights='imagenet',
                    include_top=False)
model.summary()
# compute content loss for the generated image
def content_loss(base, combination):
    return K.sum(K.square(combination - base))
def gram_matrix(x):
    features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))
    gram = K.dot(features, K.transpose(features))
    return gram
def style_loss(style, combination):
    S = gram_matrix(style)
    C = gram_matrix(combination)
    channels = 3
    size = img_height * img_width
    return K.sum(K.square(S - C)) / (4. * (channels ** 2) * (size ** 2))
def total_variation_loss(x):
    a = K.square(x[:, :img_height - 1, :img_width - 1, :] -
                 x[:, 1:, :img_width - 1, :])
    b = K.square(x[:, :img_height - 1, :img_width - 1, :] -
                 x[:, :img_height - 1, 1:, :])
    return K.sum(K.pow(a + b, 1.25))
# Dict mapping layer names to activation tensors
outputs_dict = dict([(layer.name, layer.output)
                       for layer in model.layers])
# Name of layer used for content loss
content_layer = 'block5_conv2'
# Name of layers used for style loss;
style_layers = ['block1_conv1',
                'block2_conv1',
                'block3_conv1',
                'block4_conv1',
                'block5_conv1']
# Weights in the weighted average of the loss components
total_variation_weight = 1e-4
style_weight = 10.
content_weight = 0.025
# Define the loss by adding all components to a `loss` variable
loss = K.variable(0.)
layer_features = outputs_dict[content_layer]
target_features = layer_features[0, :, :, :]
combination_features = layer_features[2, :, :, :]
loss = loss + content_weight * content_loss(target_features,
                                            combination_features)
for layer_name in style_layers:
    layer_features = outputs_dict[layer_name]
    style_reference_features = layer_features[1, :, :, :]
    combination_features = layer_features[2, :, :, :]
    sl = style_loss(style_reference_features,
                    combination_features)
    loss += (style_weight / len(style_layers)) * sl
loss += total_variation_weight * total_variation_loss(combination_image)
grads = K.gradients(loss, combination_image)[0]
# Function to fetch the values of the current loss and the current gradients
fetch_loss_and_grads = K.function([combination_image],
                                  [loss, grads])
class Evaluator(object):
    def __init__(self):
        self.loss_value = None
        self.grads_values = None
    def loss(self, x):
        assert self.loss_value is None
        x = x.reshape((1, img_height, img_width, 3))
        outs = fetch_loss_and_grads([x])
        loss_value = outs[0]
        grad_values = outs[1].flatten().astype('float64')
        self.loss_value = loss_value
        self.grad_values = grad_values
        return self.loss_value
    def grads(self, x):
        assert self.loss_value is not None
        grad_values = np.copy(self.grad_values)
        self.loss_value = None
        self.grad_values = None
        return grad_values
evaluator = Evaluator()
iterations = 50
x = preprocess_image(target_path)
x = x.flatten()
for i in range(1, iterations):
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss,
                                     x,
                                     fprime=evaluator.grads,
                                     maxfun=10)
    print('Iteration %0d, loss: %0.02f' %
                          (i, min_val))
img = x.copy().reshape((img_height, img_width, 3))
img = deprocess_image(img)
plt.figure(figsize=(50, 50))
plt.subplot(3,3,1)
plt.imshow(load_img(target_path, target_size=(img_height, img_width)))
plt.subplot(3,3,2)
plt.imshow(load_img(style_path, target_size=(img_height, img_width)))
plt.subplot(3,3,3)
plt.imshow(img)
plt.show()
Listing 12-6

CustomStyleTransfer full source

Summary

In this chapter, you learned another important neural network technique: neural style transfer. The technique lets you render the contents of an image of your choice in the style of another image. You learned how to do this style transfer in two different ways. The first was to use a pre-trained model provided on TF Hub to perform a fast artistic style transfer; style transfers using this method are quick and produce very good results. The second approach was to build your own network to perform the style transfer at the core level. We used a VGG16 model pre-trained for image classification to extract the contents and style of images. You then learned to create a network that applies the style to the given contents over several iterations. This method allows you to perform your own experiments on style transfer.

In the next chapter, you will learn how to generate images using Generative Adversarial Networks (GANs).
