Building your model using TensorFlow

Now that we have seen several ways of obtaining the images we need (or, where none exist, creating our own), we will use TensorFlow to create the classification model for our flower use case:

  1. Creating the folder structure: To start with, let's create the folder structure that's required for our flower classification use case. First, create a main folder called image_classification. Within the image_classification folder, create two folders: images and tf_files. The images folder will contain the images that are required for model training, and the tf_files folder will hold all the generated TensorFlow-specific files during runtime.
  2. Downloading the images: Next, we need to download the images that are specific to our use case. For our flowers example, the images come from the VGG datasets page we discussed earlier.
Please feel free to use your own datasets, but make sure that each category is in its own separate folder. Place the downloaded image dataset within the images folder.

For example, the complete folder structure will look like this:
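
As a minimal sketch, the layout described above can be created and listed with a few lines of Python. The category names daisy and roses are placeholders (assumptions for illustration); use whichever category folders your dataset contains. The example works in a temporary directory so that it can be run safely.

```python
import os
import tempfile

# Placeholder category names (assumptions); use the folders from your dataset.
EXAMPLE_CATEGORIES = ["daisy", "roses"]


def create_project_layout(root, categories=EXAMPLE_CATEGORIES):
    """Create image_classification/ with images/<category>/ and tf_files/."""
    project = os.path.join(root, "image_classification")
    for category in categories:
        os.makedirs(os.path.join(project, "images", category), exist_ok=True)
    os.makedirs(os.path.join(project, "tf_files"), exist_ok=True)
    return project


def list_directories(project):
    """Return every directory below project as sorted relative paths."""
    found = []
    for current, dirs, _files in os.walk(project):
        for name in dirs:
            full = os.path.join(current, name)
            found.append(os.path.relpath(full, project).replace(os.sep, "/"))
    return sorted(found)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in list_directories(create_project_layout(tmp)):
            print(path)
```

Each category folder under images holds the training images for that one class, and tf_files starts out empty.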

  3. Creating the Python script: In this step, we will create the TensorFlow code that is required to build our model. Create a Python file named retrain.py within the main image_classification folder.

Once this is complete, copy the code that follows into the file. We have broken the script out into several steps in order to describe what is taking place:

  1. The script that goes into retrain.py begins with the following imports and constants:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import argparse
import collections
from datetime import datetime
import hashlib
import os.path
import random
import re
import sys
import tarfile
import numpy as np
from six.moves import urllib
import tensorflow as tf
from tensorflow.python.framework import graph_util
from tensorflow.python.framework import tensor_shape
from tensorflow.python.platform import gfile
from tensorflow.python.util import compat
FLAGS = None
MAX_NUM_IMAGES_PER_CLASS = 2 ** 27 - 1 # ~134M
  2. Next, we need to prepare the images so that they can be used for training, validation, and testing:
result = collections.OrderedDict()
sub_dirs = [
    os.path.join(image_dir, item)
    for item in gfile.ListDirectory(image_dir)]
sub_dirs = sorted(item for item in sub_dirs
                  if gfile.IsDirectory(item))
for sub_dir in sub_dirs:
    ...  # per-folder processing continues in the full retrain.py

The first thing we do is retrieve the images from the directory path where they are stored. We then create the model graph from the model that you previously downloaded and installed.
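
The part of the script that assigns each image to the training, testing, or validation set is not shown in this section. The following is only a sketch of one way to do it, assuming a split based on a hash of the file name, so that an image always lands in the same set even when more images are added later. The function name assign_split and the _nohash_ naming convention are assumptions made for illustration.

```python
import hashlib
import re

MAX_NUM_IMAGES_PER_CLASS = 2 ** 27 - 1  # same constant as defined in retrain.py


def assign_split(file_name, testing_percentage, validation_percentage):
    """Return 'training', 'testing', or 'validation' for one image file name."""
    # Ignore anything after '_nohash_' so that close variants of one photo
    # stay in the same set (assumed naming convention).
    hash_name = re.sub(r"_nohash_.*$", "", file_name)
    digest = hashlib.sha1(hash_name.encode("utf-8")).hexdigest()
    # Map the hash onto a value between 0 and 100.
    percentage_hash = ((int(digest, 16) % (MAX_NUM_IMAGES_PER_CLASS + 1))
                       * (100.0 / MAX_NUM_IMAGES_PER_CLASS))
    if percentage_hash < validation_percentage:
        return "validation"
    if percentage_hash < testing_percentage + validation_percentage:
        return "testing"
    return "training"
```

Because the decision depends only on the file name, running the script twice on the same folder produces the same split both times.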

The next step is to initialize the bottleneck arrays by creating what are known as bottleneck files. Bottleneck is an informal term for the layer just before the final output layer that does the actual classification. (TensorFlow Hub calls this an image feature vector.) This layer has been trained to output a set of values that is good enough for the classifier to use in order to distinguish between all the classes it has been asked to recognize. This means that it must be a meaningful and compact summary of the images, since it must contain enough information for the classifier to make a good choice in a very small set of values.

It's important that we have bottleneck values for each image. If the bottleneck values aren't available for an image, we have to create them, because these values are required later when training on the images. It is highly recommended to cache these values in order to speed up processing later. Because every image is reused multiple times during training, and calculating each bottleneck takes a significant amount of time, caching the bottleneck values on disk avoids repeated recalculation. By default, bottlenecks are stored in the /tmp/bottleneck directory (unless a different directory is specified as an argument).
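
The caching idea can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the book's implementation: the function name get_or_create_bottleneck, the compute_fn callback (which stands in for running the image through the network), and the comma-separated text format are all hypothetical choices.

```python
import os


def get_or_create_bottleneck(image_path, bottleneck_dir, compute_fn):
    """Return cached bottleneck values for image_path, computing on a miss."""
    os.makedirs(bottleneck_dir, exist_ok=True)
    # One cache file per image, keyed by the image's file name.
    cache_path = os.path.join(
        bottleneck_dir, os.path.basename(image_path) + ".txt")
    if os.path.exists(cache_path):
        with open(cache_path) as cache_file:
            text = cache_file.read()
        return [float(x) for x in text.split(",") if x]
    values = compute_fn(image_path)  # the expensive step we want to avoid
    with open(cache_path, "w") as cache_file:
        cache_file.write(",".join(str(v) for v in values))
    return values
```

The first call for an image pays the full cost of compute_fn. Every later call for the same file name only reads a small text file.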

When we retrieve the bottleneck values, we do so based on the filenames of the images that are stored in the cache. If distortions are applied to the images, the cached values can no longer be used. The biggest disadvantage of enabling distortions in our script is that bottleneck caching is no longer useful, since input images are never reused exactly. This directly lengthens the training process, so we recommend enabling distortions only once you have a model that you are reasonably happy with. Should you experience problems, the GitHub repository for this book includes a method for getting the bottleneck values of distorted images.

Please note that we materialize the distorted image data as a NumPy array first.

Next, we run inference on the image. This requires a trained model and involves two memory copies.

Our next step will be to apply distortion to the images. Distortions such as cropping, scaling, and brightness are supplied as percentage values that control how much of each distortion is applied to each image. It's reasonable to start with values of 5 or 10 for each of them and then experiment from there to see which ones help and which do not.
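
To make the percentage values concrete, here is a small sketch of how they might be turned into multipliers for a single image. The function name distortion_factors and the exact interpretation of each percentage are assumptions for illustration. The distortion code used by the script itself operates on TensorFlow tensors and is not shown in this section.

```python
import random


def distortion_factors(random_crop, random_scale, random_brightness,
                       rng=random):
    """Turn percentage settings into concrete multipliers for one image."""
    # Crop: enlarge the image by random_crop percent, then crop back down.
    margin_scale = 1.0 + (random_crop / 100.0)
    # Scale: an extra random enlargement of up to random_scale percent.
    resize_scale = rng.uniform(1.0, 1.0 + (random_scale / 100.0))
    # Brightness: multiply pixels by a factor within +/- random_brightness %.
    brightness = rng.uniform(1.0 - (random_brightness / 100.0),
                             1.0 + (random_brightness / 100.0))
    return margin_scale * resize_scale, brightness
```

With all three settings at 0, both returned factors are 1.0, which leaves the image unchanged. With a brightness setting of 10, pixel values are multiplied by a random factor between 0.9 and 1.1.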

We next need to summarize our model based on accuracy and loss, and we will use TensorBoard visualizations to analyze it. TensorBoard is a suite of visualization tools offered by TensorFlow that allows you to visualize your TensorFlow graph, plot variables about its execution, and show additional data such as the images that pass through it. The following is an example TensorBoard dashboard:

Our next step will be to save the model to a file and to set up a directory path where summaries are written for TensorBoard.
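
Setting up the summaries directory can be sketched with the standard library alone. The function name prepare_summary_dirs and the train and validation subfolder names are assumptions for illustration. The idea is simply to start every run with empty folders so that old and new curves do not get mixed together.

```python
import os
import shutil


def prepare_summary_dirs(summaries_dir):
    """Start each run with empty train/ and validation/ summary folders."""
    if os.path.exists(summaries_dir):
        shutil.rmtree(summaries_dir)  # discard summaries from earlier runs
    train_dir = os.path.join(summaries_dir, "train")
    validation_dir = os.path.join(summaries_dir, "validation")
    os.makedirs(train_dir)
    os.makedirs(validation_dir)
    return train_dir, validation_dir
```

TensorBoard can then be pointed at the parent folder through its --logdir option, and it will show the training and validation runs side by side.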

At this point, we should look at the create_model_info function, which returns the model information. In the following example, we handle both the MobileNet and Inception_v3 architectures. You will see later how we handle any architecture other than these:

def create_model_info(architecture):
    architecture = architecture.lower()
    if architecture == 'inception_v3':
        # pylint: disable=line-too-long
        data_url = 'http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz'
        # pylint: enable=line-too-long
        bottleneck_tensor_name = 'pool_3/_reshape:0'
        bottleneck_tensor_size = 2048
        input_width = 299
        input_height = 299
        input_depth = 3
        resized_input_tensor_name = 'Mul:0'
        model_file_name = 'classify_image_graph_def.pb'
        input_mean = 128
        input_std = 128
    elif architecture.startswith('mobilenet_'):
        # Expected form: mobilenet_<version>_<size>[_quantized]
        parts = architecture.split('_')
        if len(parts) != 3 and len(parts) != 4:
            tf.logging.error("Couldn't understand architecture name '%s'",
                             architecture)
            return None
        version_string = parts[1]
        if (version_string != '1.0' and version_string != '0.75' and
                version_string != '0.50' and version_string != '0.25'):
            tf.logging.error(
                """The Mobilenet version should be '1.0', '0.75', '0.50', or '0.25',
but found '%s' for architecture '%s'""",
                version_string, architecture)
            return None
        size_string = parts[2]
        if (size_string != '224' and size_string != '192' and
                size_string != '160' and size_string != '128'):
            tf.logging.error(
                """The Mobilenet input size should be '224', '192', '160', or '128',
but found '%s' for architecture '%s'""",
                size_string, architecture)
            return None
        if len(parts) == 3:
            is_quantized = False

If the name has only three parts, the model is not quantized. Otherwise, as the following code block shows, the fourth part must be the suffix quantized; any other suffix is logged as an error and the function returns None. The rest of the MobileNet branch then builds the download URL and the tensor details for version 1 of MobileNet. If the architecture is neither MobileNet nor Inception_v3, the final else branch logs an error and raises a ValueError:

        else:
            if parts[3] != 'quantized':
                tf.logging.error(
                    "Couldn't understand architecture suffix '%s' for '%s'", parts[3],
                    architecture)
                return None
            is_quantized = True
        data_url = 'http://download.tensorflow.org/models/mobilenet_v1_'
        data_url += version_string + '_' + size_string + '_frozen.tgz'
        bottleneck_tensor_name = 'MobilenetV1/Predictions/Reshape:0'
        bottleneck_tensor_size = 1001
        input_width = int(size_string)
        input_height = int(size_string)
        input_depth = 3
        resized_input_tensor_name = 'input:0'
        if is_quantized:
            model_base_name = 'quantized_graph.pb'
        else:
            model_base_name = 'frozen_graph.pb'
        model_dir_name = 'mobilenet_v1_' + version_string + '_' + size_string
        model_file_name = os.path.join(model_dir_name, model_base_name)
        input_mean = 127.5
        input_std = 127.5
    else:
        tf.logging.error("Couldn't understand architecture name '%s'", architecture)
        raise ValueError('Unknown architecture', architecture)
    return {
        'data_url': data_url,
        'bottleneck_tensor_name': bottleneck_tensor_name,
        'bottleneck_tensor_size': bottleneck_tensor_size,
        'input_width': input_width,
        'input_height': input_height,
        'input_depth': input_depth,
        'resized_input_tensor_name': resized_input_tensor_name,
        'model_file_name': model_file_name,
        'input_mean': input_mean,
        'input_std': input_std,
    }
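
To see which MobileNet names pass these checks, the naming rules can be restated as a small standalone function that needs no TensorFlow. This is a simplified sketch: parse_mobilenet_name is a hypothetical helper, not part of retrain.py, and it returns None for every rejected name instead of logging an error.

```python
def parse_mobilenet_name(architecture):
    """Split a name such as 'mobilenet_1.0_224_quantized' into its parts."""
    parts = architecture.lower().split("_")
    if parts[0] != "mobilenet" or len(parts) not in (3, 4):
        return None
    version, size = parts[1], parts[2]
    if version not in ("1.0", "0.75", "0.50", "0.25"):
        return None
    if size not in ("224", "192", "160", "128"):
        return None
    if len(parts) == 4 and parts[3] != "quantized":
        return None
    return {
        "version": version,
        "size": int(size),
        "is_quantized": len(parts) == 4,
    }


if __name__ == "__main__":
    for name in ("mobilenet_1.0_224", "mobilenet_0.50_160_quantized",
                 "mobilenet_2.0_224", "mobilenet_1.0_224_fast"):
        print(name, "->", parse_mobilenet_name(name))
```

The first two names are accepted. The third is rejected because 2.0 is not a listed version, and the fourth because fast is not a recognized suffix.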

Another important point to note is that we need to decode the JPEG image data before the model can process it. The following function, add_jpeg_decoding, does this by calling the tf.image.decode_jpeg function, and then resizes and normalizes the decoded image:

def add_jpeg_decoding(input_width, input_height, input_depth, input_mean,
                      input_std):
    jpeg_data = tf.placeholder(tf.string, name='DecodeJPGInput')
    decoded_image = tf.image.decode_jpeg(jpeg_data, channels=input_depth)
    decoded_image_as_float = tf.cast(decoded_image, dtype=tf.float32)
    decoded_image_4d = tf.expand_dims(decoded_image_as_float, 0)
    resize_shape = tf.stack([input_height, input_width])
    resize_shape_as_int = tf.cast(resize_shape, dtype=tf.int32)
    resized_image = tf.image.resize_bilinear(decoded_image_4d,
                                             resize_shape_as_int)
    offset_image = tf.subtract(resized_image, input_mean)
    mul_image = tf.multiply(offset_image, 1.0 / input_std)
    return jpeg_data, mul_image
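
The last two operations in add_jpeg_decoding subtract input_mean and multiply by 1.0 / input_std. The following sketch applies the same arithmetic to a single pixel value in plain Python, to show the range the network ends up seeing. The helper name normalize_pixel is an assumption for illustration.

```python
def normalize_pixel(value, input_mean, input_std):
    """Apply the offset-and-scale step of add_jpeg_decoding to one value."""
    return (value - input_mean) * (1.0 / input_std)


if __name__ == "__main__":
    # Inception_v3 settings from create_model_info: mean 128, std 128.
    print(normalize_pixel(0, 128, 128))    # -1.0
    print(normalize_pixel(128, 128, 128))  # 0.0
    print(normalize_pixel(255, 128, 128))  # 0.9921875
```

With the MobileNet settings (mean and std both 127.5), the same step maps the 0 to 255 pixel range onto approximately -1 to 1.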

And here is our main function. In the portion shown, we do the following:

  • Set our logging level to INFO
  • Prepare the file system for usage
  • Create our model information
  • Download and extract the pretrained model data
  • Create the model graph
  • Build the lists of images for training, testing, and validation
def main(_):
    tf.logging.set_verbosity(tf.logging.INFO)
    prepare_file_system()
    model_info = create_model_info(FLAGS.architecture)
    if not model_info:
        tf.logging.error('Did not recognize architecture flag')
        return -1
    maybe_download_and_extract(model_info['data_url'])
    graph, bottleneck_tensor, resized_image_tensor = (
        create_model_graph(model_info))
    image_lists = create_image_lists(FLAGS.image_dir, FLAGS.testing_percentage,
                                     FLAGS.validation_percentage)
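
The main function reads its settings from FLAGS, which the script fills in from the command line. The following is a minimal sketch of how that could be wired up with argparse. The flag names come from the FLAGS attributes used in main, but the default values and help texts are assumptions, and the full script defines more flags than the four shown here.

```python
import argparse


def build_parser():
    """Build a parser for the four flags that main() reads above."""
    parser = argparse.ArgumentParser(description="Sketch of retrain.py flags")
    parser.add_argument("--image_dir", type=str, default="",
                        help="Path to the folders of labeled images.")
    parser.add_argument("--architecture", type=str, default="inception_v3",
                        help="For example inception_v3 or mobilenet_1.0_224.")
    parser.add_argument("--testing_percentage", type=int, default=10,
                        help="Percentage of images used as a test set.")
    parser.add_argument("--validation_percentage", type=int, default=10,
                        help="Percentage of images used for validation.")
    return parser


if __name__ == "__main__":
    # parse_known_args keeps unrecognized arguments instead of failing.
    FLAGS, unparsed = build_parser().parse_known_args()
    print(FLAGS.image_dir, FLAGS.architecture,
          FLAGS.testing_percentage, FLAGS.validation_percentage)
```

A typical invocation would pass the images folder and an architecture name, for example --image_dir=images --architecture=mobilenet_1.0_224.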

The preceding retrain.py file is available for download as part of the assets within this book.
