Introduction
Ever wish you could paint like Picasso or the famous Indian painter M.F. Husain? Neural networks may have made that wish come true. In this chapter, you will learn a technique that uses neural networks to compose a picture you have taken in the style of a famous artist, or indeed in any style of your choice. The technique is called neural style transfer and is described in Leon A. Gatys’ famous paper, “A Neural Algorithm of Artistic Style.” Though the paper is a great read, you will not need all of its details to understand this chapter.
Neural style transfer is an optimization technique that renders the content of one image in the style of another. TensorFlow Hub, which you learned to use in a previous chapter, contains a pre-trained model for style transfer. First, I will show you how to use this model so you can get started on style transfer quickly. This will be followed by a do-it-yourself example that teaches you how to extract the content and the style from two different images and then transform the content image to create a new stylized image.
The content, style, and stylized images
The image on the left is the content image, the image in the middle is the style image, and the image on the right is the stylized image. Note how the style of the middle image is applied to the content of the left image to produce the new stylized image.
The theory behind style transfer is straightforward, and I will cover it in the custom style transfer program later in this chapter.
So, let us get started with fast style transfer.
Fast Style Transfer
The TF Hub provides a pre-trained model for doing quick style transfers. The module name is “arbitrary-image-stylization-v1-256/2”. It performs a fast artistic style transfer, in contrast to the slower, optimization-based original work on artistic style transfer with neural networks, and it works with arbitrary painting styles. The model is based on the technique proposed by Ghiasi et al. in their paper “Exploring the structure of a real-time, arbitrary neural artistic stylization network” (https://arxiv.org/abs/1705.06830). The proposed model combines the flexibility of the neural algorithm of artistic style with the speed of fast style transfer networks, facilitating real-time stylization with any content/style image pair.
You will use this pre-trained model, hosted on TensorFlow Hub, to get a quick feel for the effects of style transfer.
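To make this concrete, here is a minimal sketch of loading the module; the handle below is the standard TF Hub URL for the module named above:

```python
import tensorflow_hub as hub

# Load the pre-trained arbitrary image stylization module from TF Hub.
hub_module = hub.load(
    'https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2')
```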
Creating Project
Downloading Images
Function for creating an image URL
Select the desired target and style images from the drop-down list on the right-hand side of the cell.
So, we have two images with different dimensions.
Preparing Images for Model Input
The function reads the image data by calling the read_file method and decodes it into three RGB channels by calling decode_image. The image is resized to 256x256, the size the pre-trained model expects for styling. Finally, the image data is returned to the caller as a tensor of shape (1, 256, 256, 3); the added leading dimension is used later when processing a batch of images.
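A sketch of such a function, with load_image as a placeholder name (the actual listing may differ slightly):

```python
import tensorflow as tf

def load_image(image_path):
    # Read the raw bytes and decode them into an RGB (3-channel) tensor.
    image = tf.io.read_file(image_path)
    image = tf.image.decode_image(image, channels=3)
    # Convert pixel values to float32 in [0, 1].
    image = tf.image.convert_image_dtype(image, tf.float32)
    # Resize to the 256x256 size the pre-trained model uses for styling.
    image = tf.image.resize(image, (256, 256))
    # Add a leading batch dimension: final shape (1, 256, 256, 3).
    return image[tf.newaxis, :]
```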
In addition to the image path, this function takes a second parameter: a user-defined image size. Large images take up a lot of memory during processing, so this parameter lets you reduce the size of an image while maintaining its aspect ratio; note the preserve_aspect_ratio parameter in the image resize method call. The aspect ratio is the ratio of width to height of an image: if you set the width to 400, the height is scaled in proportion. So when the target image you loaded earlier, which measures 1600x1200, is resized to 400, the output tensor takes the shape (1, 300, 400, 3), and the aspect ratio is maintained.
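A sketch of this variant, again with a placeholder name:

```python
def load_scaled_image(image_path, target_dim):
    image = tf.io.read_file(image_path)
    image = tf.image.decode_image(image, channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)
    # preserve_aspect_ratio scales the image to fit inside a
    # target_dim x target_dim box while keeping width and height in
    # their original proportion: a 1600x1200 image with target_dim=400
    # comes out as 400x300, i.e., a tensor of shape (300, 400, 3).
    image = tf.image.resize(image, (target_dim, target_dim),
                            preserve_aspect_ratio=True)
    return image[tf.newaxis, :]
```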
Performing Styling
The output is a tensor of shape (1, 300, 400, 3). This is the data for our transformed image. Note the 400x300 size (width by height), which is the scaled-down version of our original 1600x1200 image.
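For reference, the call that produces this output looks roughly like this, assuming the images were loaded with the helpers sketched above:

```python
import tensorflow as tf

# content_image: (1, 300, 400, 3); style_image: (1, 256, 256, 3).
outputs = hub_module(tf.constant(content_image), tf.constant(style_image))
stylized_image = outputs[0]   # tensor of shape (1, 300, 400, 3)
```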
Displaying Output
Some More Results
Model inference on different images
Full Source
TFHubStyleTransfer full source
Do It Yourself
Having learned how to do a quick style transfer, it is time to learn the techniques behind such projects. The principle behind style transfer is to extract the style of one image, usually a famous painting, and apply it to the contents of another image of your choice. Thus, there are two input images: a content image and a style image. The newly generated image is usually called the stylized image. It contains the same contents as the content image but acquires a style similar to that of the style image. As you can guess, this is obviously not done by simply superimposing the two images; our program must be able to distinguish between the content and the style of a given image. That is where the VGG16 pre-trained network model comes in: we will use it to extract this information and build our own network to create a stylized image from these inputs. Android apps such as Prisma and Lucid perform exactly this kind of style transfer. Though you will not be taught to develop a similar Android application, this project will teach you the internals of such apps.
Let us first look at the VGG16 architecture to understand how to extract the content and style from an image.
VGG16 Architecture
As we are not doing image classification and are interested only in feature extraction, we do not need the fully connected layers or the final softmax classifier of the VGG network; we need only part of the model. So, how do we extract only a certain portion of the model? Fortunately for us, this is a trivial task, as Keras provides a pre-trained VGG16 model whose layers you can separate out. Keras provides many other models as well, including the later VGG19. To remove the topmost fully connected layers, set the include_top parameter to False when loading the model.
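As a minimal illustration (using the tf.keras import path; the exact imports in this project may differ depending on the Keras version used):

```python
from tensorflow.keras.applications import VGG16

# Load VGG16 with ImageNet weights, dropping the fully connected
# classifier layers at the top (include_top=False).
model = VGG16(weights='imagenet', include_top=False)

# Individual convolutional layers can now be looked up by name:
print([layer.name for layer in model.layers])
```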
Creating Project
At the time of writing, the pre-trained VGG16 model used in this project runs with the Keras and TensorFlow versions specified above and does not yet support newer versions.
Downloading Images
Function for downloading images
Displaying Images
Displaying content and style images
Preprocessing Images
Preprocessing image for the VGG16 network
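VGG16 expects its own input preprocessing: the image is converted from RGB to BGR, and the ImageNet channel means are subtracted. A sketch of such a helper (the function and parameter names here are placeholders):

```python
import numpy as np
from tensorflow.keras.applications import vgg16
from tensorflow.keras.preprocessing.image import load_img, img_to_array

def preprocess(image_path, img_h, img_w):
    # Load the image at the working resolution and add a batch axis.
    image = load_img(image_path, target_size=(img_h, img_w))
    image = np.expand_dims(img_to_array(image), axis=0)
    # RGB -> BGR conversion and ImageNet mean subtraction, as the
    # pre-trained VGG16 weights expect.
    return vgg16.preprocess_input(image)
```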
You will now build the model based on the VGG16 model.
Model Building
In this code, we first build the input tensors for our content and style images by calling the preprocess method defined earlier. We create a placeholder for the generated image and then combine the three images into a single tensor by calling the concatenate method. This tensor is then passed as a parameter to the VGG16 constructor to create our desired model.
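A sketch of this construction using the symbolic Keras backend API (the mention of a placeholder implies graph mode, so in TF 2.x eager execution would have to be disabled for this exact form; img_h and img_w are the working image dimensions):

```python
from tensorflow.keras import backend as K
from tensorflow.keras.applications import VGG16

content_tensor = K.constant(preprocess(content_path, img_h, img_w))
style_tensor = K.constant(preprocess(style_path, img_h, img_w))
# Placeholder for the stylized image we are going to generate.
generated = K.placeholder((1, img_h, img_w, 3))

# Stack the three images into one batch and feed it through VGG16.
input_tensor = K.concatenate([content_tensor, style_tensor, generated],
                             axis=0)
model = VGG16(input_tensor=input_tensor,
              weights='imagenet', include_top=False)
```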
Content Loss
We will compute the content loss at each desired layer and add them up. At each iteration, we feed our input image to the model. The model computes all the content losses, and because eager execution is enabled, all gradients are computed as well. The content loss is an indication of how similar the randomly generated noisy image (G) is to the content image (C). The content loss is computed as follows.
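In the usual formulation (the exact scaling factor varies across implementations), it is the sum of squared differences between the feature maps of the two images at a chosen layer; a sketch in code:

```python
def content_loss(content_features, generated_features):
    # Sum of squared differences between the feature maps of the
    # content image (C) and the generated image (G) at one layer.
    return K.sum(K.square(generated_features - content_features))
```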
Style Loss
To compute the style loss, we first need to compute the Gram matrix. The Gram matrix is an additional preprocessing step that captures the correlations between the different feature channels; these correlations will later serve as the measure of the style itself.
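A sketch of the Gram matrix and the style loss built on it, following the usual formulation (the normalization constant comes from Gatys et al.; names are placeholders):

```python
def gram_matrix(features):
    # Reshape the (height, width, channels) feature map to
    # (channels, height * width) and take the inner product; entry
    # (i, j) measures how strongly channels i and j fire together.
    flat = K.batch_flatten(K.permute_dimensions(features, (2, 0, 1)))
    return K.dot(flat, K.transpose(flat))

def style_loss(style_features, generated_features, num_channels, size):
    S = gram_matrix(style_features)
    G = gram_matrix(generated_features)
    # Normalized squared distance between the two Gram matrices.
    return K.sum(K.square(S - G)) / (4.0 * (num_channels ** 2) * (size ** 2))
```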
Total Variation Loss
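Total variation loss is a regularizer on the generated image itself: it penalizes large differences between neighboring pixels, keeping the output spatially smooth. One common formulation, sketched here:

```python
def total_variation_loss(x, img_h, img_w):
    # Squared differences between vertically and horizontally
    # adjacent pixels of the generated image x.
    a = K.square(x[:, :img_h - 1, :img_w - 1, :] - x[:, 1:, :img_w - 1, :])
    b = K.square(x[:, :img_h - 1, :img_w - 1, :] - x[:, :img_h - 1, 1:, :])
    return K.sum(K.pow(a + b, 1.25))
```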
Computing Losses for Content and Style
We first select the VGG16 content and style layers. I have used the layers defined in Johnson et al. (2016) rather than the ones suggested by Gatys et al. (2015) because they produce a better end result.
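Johnson et al. take the content representation from relu2_2 and the style representation from relu1_2, relu2_2, relu3_3, and relu4_3 of VGG16. In Keras's layer naming, that maps to something like the following (treat the exact names as an assumption to check against the full source):

```python
content_layer = 'block2_conv2'
style_layers = ['block1_conv2', 'block2_conv2',
                'block3_conv3', 'block4_conv3']
```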
Evaluator Class
Generating Output Image
At the end of the training, we copy the final output image into a variable and deprocess it (undoing the VGG16 preprocessing) to make it ready for display.
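A sketch of such a deprocessing helper, the mirror image of the preprocess helper above:

```python
def deprocess(x, img_h, img_w):
    x = x.reshape((img_h, img_w, 3))
    # Add back the ImageNet channel means...
    x[:, :, 0] += 103.939
    x[:, :, 1] += 116.779
    x[:, :, 2] += 123.68
    # ...and convert BGR back to RGB.
    x = x[:, :, ::-1]
    return np.clip(x, 0, 255).astype('uint8')
```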
Displaying Images
Full Source
CustomStyleTransfer full source
Summary
In this chapter, you learned another important neural network technique: neural style transfer. The technique allows you to render the contents of an image of your choice in the style of another image. You learned how to do this style transfer in two different ways. The first was to use a pre-trained model provided on TensorFlow Hub to perform a fast artistic style transfer; style transfers using this method are quick and do a very good job. The second approach was to build your own network that performs the style transfer at the core level. We used a VGG16 model pre-trained for image classification to extract the contents and style of the images. You then learned to create a network that learns, over several iterations, how to apply the style to the given contents. This method allows you to perform your own experiments on style transfer.
In the next chapter, you will learn how to generate images using Generative Adversarial Networks (GANs).