Chapter 8: Best Practices for Model Training and Performance

Training a supervised machine learning model well typically requires large volumes of training data. In this chapter, we will look at common examples and patterns for handling input data, and specifically at how to access training data regardless of its size and feed it into the model. Large volumes of training data are, however, no guarantee of a well-trained model: to prevent overfitting, we often need to apply regularization during training. We will examine a number of such techniques, starting with the classic Lasso (L1), Ridge (L2), and elastic net regularization, before moving on to a more recent technique known as adversarial regularization. With these techniques at our disposal, we will be in a good position to reduce overfitting during training.

When it comes to regularization, there is no straightforward way to determine which method works best; it depends on factors such as the distribution and sparsity of the features and the volume of data. The purpose of this chapter is to provide a variety of examples and give you several options to try during your own model training process. In this chapter, we will cover the following topics:

  • Input handling for loading data
  • Regularization to reduce overfitting

Input handling for loading data

Many common examples that we typically see tend to focus on the modeling aspect, such as how to build a deep learning model using TensorFlow with various layers and patterns. In these examples, the data used is almost always loaded into the runtime memory directly. This is fine as long as the training data is sufficiently small. But what if it is much larger than your runtime memory can handle? The solution is data streaming. We have been using this technique to feed data into our model in the previous chapters, and we are going to take a closer look at data streaming and generalize it to more data types.

The streaming data technique is very similar to a Python generator. Data is ingested into the model training process in batches, meaning that all the data is not sent at one time. In this chapter, we are going to use an example of flower image data. Even though this data is not big by any means, it is a convenient tool for our teaching and learning purposes in this regard. It is multiclass and contains images of different sizes. This reflects what we usually have to deal with in reality, where available training images may be crowdsourced or provided at different scales or dimensions. In addition, an efficient data ingestion workflow is needed as the frontend to the model training process.
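
To make the idea concrete before we turn to TensorFlow's own utilities, here is a minimal sketch of a hand-rolled batch generator. The function and its file list are hypothetical; the ImageDataGenerator API introduced next wraps this pattern with resizing, normalization, and shuffling built in:

import numpy as np
from PIL import Image

def image_batch_generator(file_paths, labels, batch_size=32):
    """Yield (images, labels) batches so the full dataset never sits in memory."""
    while True:  # loop indefinitely; the training loop decides when to stop
        for start in range(0, len(file_paths), batch_size):
            batch_files = file_paths[start:start + batch_size]
            batch_labels = labels[start:start + batch_size]
            images = np.stack([
                np.asarray(Image.open(f).resize((224, 224)), dtype=np.float32) / 255.0
                for f in batch_files
            ])
            yield images, np.asarray(batch_labels)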

Working with the generator

When it comes to the generator, TensorFlow now has a very convenient ImageDataGenerator API that greatly simplifies and speeds up the code development process. From our experience in using pretrained models for image classification, we have seen that it is often necessary to standardize image dimensions (height and width as measured by the number of pixels) and normalize image pixel values to within a certain range (from [0, 255] to [0, 1]).

The ImageDataGenerator API provides optional input parameters to make these tasks almost routine and reduce the work of writing your own functions to perform standardization and normalization. So, let's take a look at how to use this API:

  1. Organize raw images. Let's begin by setting up our image collection. For convenience, we are going to use the flower images directly from the tf.keras API:

    import tensorflow as tf

    import tensorflow_hub as hub

    data_dir = tf.keras.utils.get_file(

        'flower_photos', 'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',

        untar=True)

    In the preceding code, we use the tf.keras API to download the images of five flower types.

  2. Next, we will set up ImageDataGenerator and streaming objects with flow_from_directory. In this step, several operations are defined:

    a. Image pixel intensity is rescaled from [0, 255] to [0, 1], and a cross-validation fraction is set aside. The ImageDataGenerator API provides the optional rescale and validation_split input arguments for this. We collect the rescale (normalization) factor and the cross-validation fraction together in a dictionary, datagen_kwargs, so they can be passed as keyword arguments.

    b. The image height and width are both resized to 224 pixels. The flow_from_directory API accepts the optional target_size, batch_size, and interpolation arguments. We collect these in a second dictionary, dataflow_kwargs, to set the standardized image size, the batch size, and the resampling interpolation algorithm.

    c. The preceding settings are passed to the generator instance. We then pass these into ImageDataGenerator and flow_from_directory:

    pixels = 224

    BATCH_SIZE = 32

    IMAGE_SIZE = (pixels, pixels)

    NUM_CLASSES = 5

    datagen_kwargs = dict(rescale=1./255, validation_split=.20)

    dataflow_kwargs = dict(target_size=IMAGE_SIZE,

                           batch_size=BATCH_SIZE,

                           interpolation="bilinear")

    valid_datagen = tf.keras.preprocessing.image.ImageDataGenerator(

        **datagen_kwargs)

    valid_generator = valid_datagen.flow_from_directory(

        data_dir, subset="validation", shuffle=False, **dataflow_kwargs)

    train_datagen = valid_datagen

    train_generator = train_datagen.flow_from_directory(

        data_dir, subset="training", shuffle=True, **dataflow_kwargs)

    The preceding code demonstrates a typical workflow for creating image generators as a means of ingesting training data into a model. Two dictionaries are defined to hold the arguments we need. Then the ImageDataGenerator API is invoked, followed by the flow_from_directory API, first for the validation subset and then for the training subset. As a result, we have an ingestion workflow for training and cross-validation data through train_generator and valid_generator.

  3. Retrieve mapping for labels. Since we use ImageDataGenerator to create a data pipeline for training, we may use it to retrieve image labels as well:

    labels_idx = train_generator.class_indices

    idx_labels = dict((v,k) for k,v in labels_idx.items())

    print(idx_labels)

    In the preceding code, idx_labels is a dictionary that maps the classification model output, which is an index, to the flower class. This is idx_labels:

    {0: 'daisy', 1: 'dandelion', 2: 'roses', 3: 'sunflowers', 4: 'tulips'}

    Since this is a multiclass classification problem, our model prediction will be an array of five probabilities. Therefore, we want the position of the class with the highest probability, and then we will map the position to the name of the corresponding class using idx_labels.
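
    For example, here is a minimal sketch of that mapping, using a made-up probability vector in place of an actual model prediction:

    import numpy as np

    # Hypothetical output of the classifier for one image
    probs = np.array([0.05, 0.70, 0.10, 0.10, 0.05])
    predicted_class = idx_labels[int(np.argmax(probs))]
    print(predicted_class)  # 'dandelion'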

  4. Build and train the model. This step is the same as we performed in the previous chapter, Chapter 7, Model Optimization, where we will build a model by means of transfer learning. The model of choice is a ResNet feature vector, and the final classification layer is a dense layer with five nodes (NUM_CLASSES is defined to be 5, as indicated in step 2), and these five nodes output probabilities for each of the five classes:

    mdl = tf.keras.Sequential([

        tf.keras.layers.InputLayer(input_shape=IMAGE_SIZE + (3,)),

    hub.KerasLayer("https://tfhub.dev/google/imagenet/resnet_v1_101/feature_vector/4", trainable=False),

        tf.keras.layers.Dense(NUM_CLASSES,

                 activation='softmax', name = 'custom_class')

    ])

    mdl.build([None, 224, 224, 3])

    mdl.compile(

      optimizer=tf.keras.optimizers.SGD(lr=0.005,

                                               momentum=0.9),

      loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True, label_smoothing=0.1),

      metrics=['accuracy'])

    steps_per_epoch = train_generator.samples // train_generator.batch_size

    validation_steps = valid_generator.samples // valid_generator.batch_size

    mdl.fit(

        train_generator,

        epochs=5, steps_per_epoch=steps_per_epoch,

        validation_data=valid_generator,

        validation_steps=validation_steps)

    The preceding code shows the general flow of setting up a model's architecture through training. We started by building the mdl model using the tf.keras sequential API. Once the loss function and optimizer were designated, we compiled the model. Since we want to include cross-validation as part of the training routine, we set up steps_per_epoch, which is the number of data batches the generator yields per epoch, and repeat the same calculation for the cross-validation data. Then we call the fit API to launch the training process for five epochs.

The preceding steps demonstrate how to start with ImageDataGenerator to build a pipeline that flows image data from the image directory via flow_from_directory, and we are also able to handle image normalization and standardization routines as input arguments.
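
As a quick sanity check, you can also pull a single batch from the generator to confirm the shapes being fed to the model (the shapes in the comments assume the settings above):

x_batch, y_batch = next(train_generator)
print(x_batch.shape)  # (32, 224, 224, 3), pixel values rescaled to [0, 1]
print(y_batch.shape)  # (32, 5), one-hot encoded labels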

TFRecord dataset – ingestion pipeline

Another means of streaming training data into the model during the training process is through the TFRecord dataset. TFRecord stores data as serialized protocol buffers, and data stored in this format may be consumed from Python, Java, and C++. In enterprise or production systems, this format provides versatility and promotes reusability of data across different applications. Another advantage of TFRecord is that if you wish to use a TPU as your compute target and ingest training data through a pipeline, TFRecord is the way to achieve it: currently, TPUs do not work with generators, so the only way to stream data through a pipeline is by means of TFRecord. Again, the size of this dataset does not actually require TFRecord; it is used here only for learning purposes.
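
To give a feel for the format, the following sketch writes a single serialized tf.train.Example to a TFRecord file. The feature keys and file name here are illustrative only and are not the schema used by the dataset in this chapter:

import tensorflow as tf

# One record: a (dummy) encoded image and an integer class label
example = tf.train.Example(features=tf.train.Features(feature={
    'image/encoded': tf.train.Feature(
        bytes_list=tf.train.BytesList(value=[b'<jpeg bytes>'])),
    'image/class/label': tf.train.Feature(
        int64_list=tf.train.Int64List(value=[3])),
}))

with tf.io.TFRecordWriter('sample.tfrecord') as writer:
    writer.write(example.SerializeToString())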

We are going to start with a TFRecord dataset already prepared. It contains the same flower images and classes as seen in the previous section. In addition, this TFRecord dataset is partitioned into training, validation, and test datasets. This TFRecord dataset is available in this book's GitHub repository. You may clone this repository with the following command:

git clone https://github.com/PacktPublishing/learn-tensorflow-enterprise.git

Once this command completes, navigate to the following path inside the cloned repository:

learn-tensorflow-enterprise/chapter_07/train_base_model/tf_datasets/flower_photos

You will see the following TFRecord datasets:

image_classification_builder-train.tfrecord-00000-of-00002

image_classification_builder-train.tfrecord-00001-of-00002

image_classification_builder-validation.tfrecord-00000-of-00001

image_classification_builder-test.tfrecord-00000-of-00001

Make a note of the file path where these datasets are stored.

We will refer to this path as <PATH_TO_TFRECORD>. This could be the path in your local system or any cloud notebook environment where you uploaded and mounted these TFRecord files:

  1. Set up the file path. As you can see, in this TFRecord collection, there are multiple parts (two) of train.tfrecord. We will use the wildcard (*) symbol to denote multiple filenames that follow the same naming pattern. We may use glob to keep track of the pattern, pass it to list_files to create a list of files, and then let TFRecordDataset create a dataset object.
  2. Recognize and encode the filename convention. We want to have a pipeline that can handle the data ingestion process. Therefore, we have to create variables to hold the file path and naming convention:

    import tensorflow as tf

    import tensorflow_hub as hub

    import tensorflow_datasets as tfds

    AUTOTUNE = tf.data.experimental.AUTOTUNE

    root_dir = '<PATH_TO_TFRECORD>'

    train_file_pattern = "{}/image_classification_builder-train*.tfrecord*".format(root_dir)

    val_file_pattern = "{}/image_classification_builder-validation*.tfrecord*".format(root_dir)

    test_file_pattern = "{}/image_classification_builder-test*.tfrecord*".format(root_dir)

    Here, we encoded text string representations of the file path to training, validation, and test data in the train_file_pattern, val_file_pattern, and test_file_pattern variables. Notice that we used the wildcard operator * to handle multiple file parts, if any. This is an important way to achieve scalability in data ingestion pipelines. It doesn't matter how many files there are, because now you have a way to find all of them by means of the path pattern.
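
    To quickly verify the pattern matching, you can print the expanded file list; the exact paths shown will of course depend on where you stored the files:

    print(tf.io.gfile.glob(train_file_pattern))
    # For example:
    # ['<PATH_TO_TFRECORD>/image_classification_builder-train.tfrecord-00000-of-00002',
    #  '<PATH_TO_TFRECORD>/image_classification_builder-train.tfrecord-00001-of-00002']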

  3. Create a file list. To create an object that can handle multiple parts of TFRecord files, we will use list_files to keep track of these files:

    train_all_files = tf.data.Dataset.list_files( tf.io.gfile.glob(train_file_pattern))

    val_all_files = tf.data.Dataset.list_files( tf.io.gfile.glob(val_file_pattern))

    test_all_files = tf.data.Dataset.list_files( tf.io.gfile.glob(test_file_pattern))

    In the preceding code, we use the tf.io API to make a reference to the training, validation, and test files. The path to these files is defined by train_file_pattern, val_file_pattern, and test_file_pattern.

  4. Create a dataset object. We will use TFRecordDataset to create dataset objects from training, validation, and test list objects:

    train_all_ds = tf.data.TFRecordDataset(train_all_files, num_parallel_reads = AUTOTUNE)

    val_all_ds = tf.data.TFRecordDataset(val_all_files, num_parallel_reads = AUTOTUNE)

    test_all_ds = tf.data.TFRecordDataset(test_all_files, num_parallel_reads = AUTOTUNE)

    The TFRecordDataset API reads the TFRecord file referenced by the file path variables.

  5. Inspect the sample size. So far, there is no quick way to establish the sample size in each TFRecord. The only way to do so is by iterating it:

    print("Sample size for training: {0}".format(sum(1 for _ in tf.data.TFRecordDataset(train_all_files)))

         ,' ', "Sample size for validation: {0}".format(sum(1 for _ in tf.data.TFRecordDataset(val_all_files)))

         ,' ', "Sample size for test: {0}".format(sum(1 for _ in tf.data.TFRecordDataset(test_all_files))))

    The preceding code prints and verifies the sample sizes in each of our TFRecord datasets.

    The output should be as follows:

    Sample size for training: 3540

    Sample size for validation: 80

    Sample size for test: 50

    Since we are able to count samples in TFRecord datasets, we know that our data pipeline for TFRecord is set up correctly.
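
    As a side note, the record count is not stored in the TFRecord file's metadata, which is why tf.data cannot report it directly; asking for the cardinality of such a dataset returns the unknown sentinel value:

    print(tf.data.experimental.cardinality(train_all_ds))
    # tf.Tensor(-2, shape=(), dtype=int64), that is, UNKNOWN_CARDINALITY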

TFRecord dataset – feature engineering and training

When we used a generator as the ingestion pipeline, the generator took care of batching and matching data and labels during the training process. However, unlike the generator, in order to use the TFRecord dataset, we have to parse it and perform some necessary feature engineering tasks, such as normalization and standardization, ourselves. The creator of TFRecord has to provide a feature description dictionary as a template for parsing the samples. In this case, the following feature dictionary is provided:

features = {

    'image/channels' :  tf.io.FixedLenFeature([], tf.int64),

    'image/class/label' :  tf.io.FixedLenFeature([], tf.int64),

    'image/class/text' : tf.io.FixedLenFeature([], tf.string),

    'image/colorspace' : tf.io.FixedLenFeature([], tf.string),

    'image/encoded' : tf.io.FixedLenFeature([], tf.string),

    'image/filename' : tf.io.FixedLenFeature([], tf.string),

    'image/format' : tf.io.FixedLenFeature([], tf.string),

    'image/height' : tf.io.FixedLenFeature([], tf.int64),

    'image/width' : tf.io.FixedLenFeature([], tf.int64)

}

We will go through the following steps to parse the dataset, perform feature engineering tasks, and submit the dataset for training. These steps follow the completion of the TFRecord dataset – ingestion pipeline section:

  1. Parse TFRecord and resize the images. We will use the preceding dictionary to parse each TFRecord example in order to extract a single image and its corresponding label. We will define a decode_and_resize function for this purpose:

    def decode_and_resize(serialized_example):
        # The resized image should be [224, 224, 3], with pixel values
        # in the range [0, 255]. The label is the integer index of the class.
        parsed_features = tf.io.parse_single_example(
            serialized_example,
            features={
                'image/channels': tf.io.FixedLenFeature([], tf.int64),
                'image/class/label': tf.io.FixedLenFeature([], tf.int64),
                'image/class/text': tf.io.FixedLenFeature([], tf.string),
                'image/colorspace': tf.io.FixedLenFeature([], tf.string),
                'image/encoded': tf.io.FixedLenFeature([], tf.string),
                'image/filename': tf.io.FixedLenFeature([], tf.string),
                'image/format': tf.io.FixedLenFeature([], tf.string),
                'image/height': tf.io.FixedLenFeature([], tf.int64),
                'image/width': tf.io.FixedLenFeature([], tf.int64)
            })
        image = tf.io.decode_jpeg(parsed_features['image/encoded'], channels=3)
        label = tf.cast(parsed_features['image/class/label'], tf.int32)
        label_txt = tf.cast(parsed_features['image/class/text'], tf.string)
        label_one_hot = tf.one_hot(label, depth=5)
        resized_image = tf.image.resize(image, [224, 224], method='nearest')
        return resized_image, label_one_hot

    The decode_and_resize function takes a dataset in TFRecord format, parses it, extracts the metadata and actual image, and then returns the image and its label.

    At a more detailed level inside this function, each TFRecord sample is parsed with tf.io.parse_single_example into parsed_features, which is how we extract the different metadata from the dataset. The image is decoded by the decode_jpeg API and resized to 224 x 224 pixels. As for the label, it is extracted and one-hot encoded. Finally, the function returns the resized image and the corresponding one-hot label.

  2. Normalize the pixel values. We also need to normalize the pixel values from the range [0, 255] to the range [0, 1]. Here, we define a normalize function to do this:

    def normalize(image, label):

        #Convert `image` from [0, 255] -> [0, 1.0] floats

        image = tf.cast(image, tf.float32) / 255.

        return image, label

    Here, the image is first cast to float32 and then rescaled, pixel-wise, to the range [0, 1.0] by dividing each pixel value by 255. The function returns the rescaled image with its label.

  3. Execute these functions. These functions (decode_and_resize and normalize) are designed to be applied to each sample within TFRecord. We use a map to accomplish this:

    resized_train_ds = train_all_ds.map(decode_and_resize, num_parallel_calls=AUTOTUNE)

    resized_val_ds = val_all_ds.map(decode_and_resize, num_parallel_calls=AUTOTUNE)

    resized_test_ds = test_all_ds.map(decode_and_resize, num_parallel_calls=AUTOTUNE)

    resized_normalized_train_ds = resized_train_ds.map(normalize, num_parallel_calls=AUTOTUNE)

    resized_normalized_val_ds = resized_val_ds.map(normalize, num_parallel_calls=AUTOTUNE)

    resized_normalized_test_ds = resized_test_ds.map(normalize, num_parallel_calls=AUTOTUNE)

    Here, we apply decode_and_resize to all the datasets, and then normalize the dataset at a pixel-wise level.
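
    As an optional sanity check, you can pull a single element from the mapped dataset to confirm the shapes and value range (the values shown in the comments are what we expect from the settings above):

    for image, label in resized_normalized_train_ds.take(1):
        print(image.shape, image.dtype)  # (224, 224, 3) float32, values in [0, 1]
        print(label.numpy())             # one-hot vector of length 5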

  4. Batch datasets for training processes. The final step to be performed on the TFRecord dataset is batching. We will define a few variables for this purpose, and define a function, prepare_for_model, for batching:

    pixels =224

    IMAGE_SIZE = (pixels, pixels)

    TRAIN_BATCH_SIZE = 32

    # Validation and test data are small. Use all in a batch.

    VAL_BATCH_SIZE = sum(1 for _ in tf.data.TFRecordDataset(val_all_files))

    TEST_BATCH_SIZE = sum(1 for _ in tf.data.TFRecordDataset(test_all_files))

    def prepare_for_model(ds, BATCH_SIZE, cache=True, TRAINING_DATA=True, shuffle_buffer_size=1000):

      if cache:

        if isinstance(cache, str):

          ds = ds.cache(cache)

        else:

          ds = ds.cache()

      ds = ds.shuffle(buffer_size=shuffle_buffer_size)

      if TRAINING_DATA:

        # Repeat forever

        ds = ds.repeat()

      ds = ds.batch(BATCH_SIZE)

      ds = ds.prefetch(buffer_size=AUTOTUNE)

      return ds

    Cross-validation and test data are not separated into batches. Therefore, the entire cross-validation data is a single batch, and likewise for test data.

    The prepare_for_model function takes a dataset, caches it in memory, shuffles it, batches it, and prefetches it. If the function is applied to training data, it also repeats the dataset indefinitely to make sure you don't run out of data during the training process.

  5. Execute batching. We now apply the batching setup to our datasets. prepare_for_model is used for the test data, while the training and validation datasets are repeated, shuffled, batched, and prefetched directly:

    NUM_EPOCHS = 5

    SHUFFLE_BUFFER_SIZE = 1000

    prepped_test_ds = prepare_for_model(resized_normalized_test_ds, TEST_BATCH_SIZE, False, False)

    prepped_train_ds = resized_normalized_train_ds.repeat(100).shuffle(buffer_size=SHUFFLE_BUFFER_SIZE)

    prepped_train_ds = prepped_train_ds.batch(TRAIN_BATCH_SIZE)

    prepped_train_ds = prepped_train_ds.prefetch(buffer_size = AUTOTUNE)

    prepped_val_ds = resized_normalized_val_ds.repeat(NUM_EPOCHS).shuffle(buffer_size=SHUFFLE_BUFFER_SIZE)

    prepped_val_ds = prepped_val_ds.batch(VAL_BATCH_SIZE)

    prepped_val_ds = prepped_val_ds.prefetch(buffer_size = AUTOTUNE)

    The preceding code sets up batches of training, validation, and test data. These are ready to be fed into the training routine. We have now completed the data ingestion pipeline.

  6. Build and train the model. This part does not vary from the previous section. We will build and train a model with the same architecture as in the generator-based workflow:

    FINE_TUNING_CHOICE = False

    NUM_CLASSES = 5

    IMAGE_SIZE = (224, 224)

    mdl = tf.keras.Sequential([

        tf.keras.layers.InputLayer(input_shape=IMAGE_SIZE +

                                   (3,), name='input_layer'),

        hub.KerasLayer("https://tfhub.dev/google/imagenet/resnet_v1_101/feature_vector/4", trainable=FINE_TUNING_CHOICE, name = 'resnet_fv'),

        tf.keras.layers.Dense(NUM_CLASSES,

                 activation='softmax', name = 'custom_class')

    ])

    mdl.build([None, 224, 224, 3])

    mdl.compile(

      optimizer=tf.keras.optimizers.SGD(lr=0.005,

                                               momentum=0.9),

      loss=tf.keras.losses.CategoricalCrossentropy(

                      from_logits=True, label_smoothing=0.1),

      metrics=['accuracy'])

    mdl.fit(

        prepped_train_ds,

        epochs=5, steps_per_epoch=100,

        validation_data=prepped_val_ds,

        validation_steps=1)

    Notice that the training and validation datasets are passed into the model as prepped_train_ds and prepped_val_ds, respectively. In this regard, it is no different to how we passed generators into the model for training. However, the extra work we had to do in terms of parsing, standardizing, and normalizing these datasets is substantially more complex compared to generators.

The benefit of TFRecord is that if you have a large dataset, then breaking it up and storing it as TFRecord in multiple parts will help you stream the data into the model faster than using a generator. Also, if your compute target is TPU, then you cannot stream training data using a generator; you will have to use the TFRecord dataset to stream training data into the model for training.
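
For example, sharded TFRecord files can be read in parallel with interleave, which is where much of the speed advantage comes from. This is a sketch rather than code used elsewhere in this chapter, and the file pattern is the placeholder path introduced earlier:

import tensorflow as tf

AUTOTUNE = tf.data.experimental.AUTOTUNE

files = tf.data.Dataset.list_files(
    '<PATH_TO_TFRECORD>/image_classification_builder-train*.tfrecord*')
ds = files.interleave(
    tf.data.TFRecordDataset,
    cycle_length=4,              # read four shards concurrently
    num_parallel_calls=AUTOTUNE)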

Regularization

During the training process, the model is learning to find the best set of weights and biases that minimize the loss function. As the model architecture becomes more complex, or simply starts to take on more layers, the model is being fitted with more parameters. Although this may help to produce a better fit during training, having to use more parameters may also lead to overfitting.

In this section, we will dive into some regularization techniques that can be implemented in a straightforward fashion in the tf.keras API.

L1 and L2 regularization

Traditional methods to address the concern of overfitting involve introducing a penalty term in the loss function. This is known as regularization. The penalty term is directly related to model complexity, which is largely determined by the number of non-zero weights. To be more specific, there are three traditional types of regularization used in machine learning:

  • L1 regularization (also known as Lasso): Here is the loss function with L1 regularization:

    Loss = Loss_original + λ * Σ|w_i|

It uses the sum of the absolute values of the weights, w, multiplied by a user-defined penalty value, λ, to measure complexity (that is, the more parameters that are fitted to the model, the more complex it is). The idea is that the more parameters, or weights, that are used, the higher the penalty applied. We want the best model with the fewest parameters.

  • L2 regularization (also known as Ridge): Here is the loss function with L2 regularization:

    Loss = Loss_original + λ * Σ(w_i)^2

It uses the sum of the squares of the weights, w, multiplied by a user-defined penalty value, λ, to measure complexity.

  • Elastic net regularization: Here is the loss function with both L1 and L2 regularization:

    Loss = Loss_original + λ1 * Σ|w_i| + λ2 * Σ(w_i)^2

It uses a combination of L1 and L2 terms to measure complexity. Each regularization term has its own penalty factor.

(Reference: pp. 38-39, Antonio Gulli and Sujit Pal, Deep Learning with Keras, Packt 2017, https://www.tensorflow.org/api_docs/python/tf/keras/regularizers/)

The following keyword input parameters are available when defining model layers, including dense and convolutional layers such as Conv1D, Conv2D, and Conv3D:

  • kernel_regularizer: A regularizer applied to the weight matrix
  • bias_regularizer: A regularizer applied to the bias vector
  • activity_regularizer: A regularizer applied to the output of the layer

(Reference: p. 63, Antonio Gulli and Sujit Pal, Deep Learning with Keras, Packt 2017, https://www.tensorflow.org/api_docs/python/tf/keras/regularizers/Regularizer)

Now we will take a look at how to implement some of these parameters. As an example, we will leverage the model architecture built in the previous section, namely, a ResNet feature vector layer followed by a dense layer as the classification head:

KERNEL_REGULARIZER = tf.keras.regularizers.l2(l=0.1)

ACTIVITY_REGULARIZER = tf.keras.regularizers.L1L2(l1=0.1,l2=0.1)

mdl = tf.keras.Sequential([

    tf.keras.layers.InputLayer(input_shape=IMAGE_SIZE + (3,)),

    hub.KerasLayer("https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/4",trainable=FINE_TUNING_CHOICE),

    tf.keras.layers.Dense(NUM_CLASSES

                         ,activation='softmax'

                         ,kernel_regularizer=KERNEL_REGULARIZER

                          ,activity_regularizer =

                          ACTIVITY_REGULARIZER

                          ,name = 'custom_class')

])

mdl.build([None, 224, 224, 3])

Notice that we define the regularizers of interest in named variables outside the layer. This makes it easy to adjust the hyperparameters (l1, l2) that determine how strongly the regularization terms penalize the loss function for potential overfitting:

KERNEL_REGULARIZER = tf.keras.regularizers.l2(l=0.1)

ACTIVITY_REGULARIZER = tf.keras.regularizers.L1L2(l1=0.1,l2=0.1)

This is followed by the addition of these regularizer definitions in the dense layer definition:

tf.keras.layers.Dense(NUM_CLASSES

                      ,activation='softmax'

                      ,kernel_regularizer=KERNEL_REGULARIZER

                      ,activity_regularizer =

                                           ACTIVITY_REGULARIZER

                      ,name = 'custom_class')

These are the only changes that are required to the code used in the previous section.

Adversarial regularization

An interesting technique known as adversarial learning emerged in 2014 (if interested, read the seminal paper by Goodfellow et al., 2014). The idea stems from the fact that a machine learning model's accuracy can be greatly compromised, producing incorrect predictions, if the inputs are made only slightly noisier than expected. Such noise is known as an adversarial perturbation. Therefore, if the training dataset is augmented with adversarially perturbed versions of the data, we can use this technique to make our model more robust.
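
For intuition only (the AdversarialRegularization API used below generates such perturbations for you during training), here is a sketch of the fast gradient sign method from the Goodfellow et al. paper, which nudges an input in the direction that most increases the loss. The model, image, and label variables are assumed to exist already, and the epsilon value is purely illustrative:

import tensorflow as tf

epsilon = 0.01  # illustrative perturbation magnitude

with tf.GradientTape() as tape:
    tape.watch(image)  # image: batched float tensor in [0, 1]; label: matching one-hot labels
    prediction = model(image)
    loss = tf.keras.losses.categorical_crossentropy(label, prediction)

gradient = tape.gradient(loss, image)
adversarial_image = image + epsilon * tf.sign(gradient)
adversarial_image = tf.clip_by_value(adversarial_image, 0.0, 1.0)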

TensorFlow's AdversarialRegularization API (part of the Neural Structured Learning library) is designed to complement the tf.keras API and simplify the model building and training process. We are going to reuse the TFRecord dataset downloaded earlier as the original training data, apply adversarial augmentation to it during training, and finally train the model. To do so, follow these steps:

  1. Download and unzip the training data (if you didn't do so at the start of this chapter). You need to download flower_tfrecords.zip, the TFRecord dataset that we will use from Harvard Dataverse (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/1ECTVN). Put it in the compute node you intend to use. It may be your local compute environment or a cloud-based environment such as JupyterLab in Google AI Platform, or Google Colab. Unzip the file once you have downloaded it, and make a note of its path. We will refer to this path as <PATH_TO_TFRECORD>. In this path, you will see these TFRecord datasets:

    image_classification_builder-train.tfrecord-00000-of-00002

    image_classification_builder-train.tfrecord-00001-of-00002

    image_classification_builder-validation.tfrecord-00000-of-00001

    image_classification_builder-test.tfrecord-00000-of-00001

  2. Install the library. We need to make sure that the neural structured learning module is available in our environment. If you haven't done so yet, you should install this module using the following pip command:

    !pip install --quiet neural-structured-learning

  3. Create a file pattern object for the data pipeline. There are multiple files (two). Therefore, we may leverage the file naming convention and wildcard * qualifier during the data ingestion process:

    import tensorflow as tf

    import neural_structured_learning as nsl

    import tensorflow_hub as hub

    import tensorflow_datasets as tfds

    AUTOTUNE = tf.data.experimental.AUTOTUNE

    root_dir = '<PATH_TO_TFRECORD>'

    train_file_pattern = "{}/image_classification_builder-train*.tfrecord*".format(root_dir)

    val_file_pattern = "{}/image_classification_builder-validation*.tfrecord*".format(root_dir)

    test_file_pattern = "{}/image_classification_builder-test*.tfrecord*".format(root_dir)

    For convenience, the path to these TFRecord files is designated as the following variables: train_file_pattern, val_file_pattern, and test_file_pattern. These paths are represented as text strings. The wildcard symbol * is used to handle multiple file parts, in case there are any.

  4. Take an inventory of all the filenames. We may use the glob API to create a dataset object that tracks all parts of the file:

    train_all_files = tf.data.Dataset.list_files( tf.io.gfile.glob(train_file_pattern))

    val_all_files = tf.data.Dataset.list_files( tf.io.gfile.glob(val_file_pattern))

    test_all_files = tf.data.Dataset.list_files( tf.io.gfile.glob(test_file_pattern))

    Here, we use the tf.io API to refer to the file paths indicated in the previous step. The filenames referred to by the glob API of tf.io are then encoded in a list of filenames by the list_files API of tf.data.

  5. Establish the loading pipeline. Now we may establish the reference to our data source via TFRecordDataset:

    train_all_ds = tf.data.TFRecordDataset(train_all_files, num_parallel_reads = AUTOTUNE)

    val_all_ds = tf.data.TFRecordDataset(val_all_files, num_parallel_reads = AUTOTUNE)

    test_all_ds = tf.data.TFRecordDataset(test_all_files, num_parallel_reads = AUTOTUNE)

    Here, we use the TFRecordDataset API to create respective datasets from our source.

  6. To check whether we have total visibility of the data, we will count the sample sizes in each dataset:

    train_sample_size = sum(1 for _ in tf.data.TFRecordDataset(train_all_files))

    validation_sample_size = sum(1 for _ in tf.data.TFRecordDataset(val_all_files))

    test_sample_size = sum(1 for _ in tf.data.TFRecordDataset(test_all_files))

    print("Sample size for training: {0}".format(train_sample_size)

         ,' ', "Sample size for validation: {0}".format(validation_sample_size)

         ,' ', "Sample size for test: {0}".format(test_sample_size))

    Currently, the way to find out how many samples are in a TFRecord file is by iterating through it. In the code:

    sum(1 for _ in tf.data.TFRecordDataset(train_all_files))

    We use the for loop to iterate through the dataset, and sum up the iteration count to obtain the final count as the sample size. This coding pattern is also used to determine validation and test dataset sample sizes. The sizes of these datasets are then stored as variables.

    The output of the preceding code will look like this:

    Sample size for training: 3540

    Sample size for validation: 80

    Sample size for test: 50

  7. As regards data transformation, we need to transform all images to the same size, which is 224 pixels in height and 224 pixels in width. The intensity level of each pixel should be in the range [0, 1]. Therefore, we need to divide each pixel's value by 255. We need these two functions for these transformation operations:

    def decode_and_resize(serialized_example):
        # The resized image should be [224, 224, 3], with pixel values
        # in the range [0, 255]. The label is the integer index of the class.
        parsed_features = tf.io.parse_single_example(
            serialized_example,
            features={
                'image/channels': tf.io.FixedLenFeature([], tf.int64),
                'image/class/label': tf.io.FixedLenFeature([], tf.int64),
                'image/class/text': tf.io.FixedLenFeature([], tf.string),
                'image/colorspace': tf.io.FixedLenFeature([], tf.string),
                'image/encoded': tf.io.FixedLenFeature([], tf.string),
                'image/filename': tf.io.FixedLenFeature([], tf.string),
                'image/format': tf.io.FixedLenFeature([], tf.string),
                'image/height': tf.io.FixedLenFeature([], tf.int64),
                'image/width': tf.io.FixedLenFeature([], tf.int64)
            })
        image = tf.io.decode_jpeg(parsed_features['image/encoded'], channels=3)
        label = tf.cast(parsed_features['image/class/label'], tf.int32)
        label_txt = tf.cast(parsed_features['image/class/text'], tf.string)
        label_one_hot = tf.one_hot(label, depth=5)
        resized_image = tf.image.resize(image, [224, 224], method='nearest')
        return resized_image, label_one_hot

    The decode_and_resize function takes a serialized sample in TFRecord format, parses it, extracts the metadata and the actual image, and returns the image and its label. At a more detailed level, inside this function, the sample is parsed with tf.io.parse_single_example into parsed_features, which is how we extract the different metadata from the dataset. The image is decoded by the decode_jpeg API and resized to 224 by 224 pixels. As for the label, it is extracted and one-hot encoded.

  8. Finally, decode_and_resize returns the resized image and the corresponding one-hot label. Next, we define a normalize function to rescale the pixel values:

    def normalize(image, label):

        #Convert `image` from [0, 255] -> [0, 1.0] floats

        image = tf.cast(image, tf.float32) / 255.

        return image, label

    This function takes a decoded image, casts it to tf.float32, and normalizes the pixel values to the range [0, 1.0] by dividing each pixel by 255. It returns the normalized image with its corresponding label.

  9. Execute data transformation. We will use the map function to apply the preceding transformation routines to each element in our dataset:

    resized_train_ds = train_all_ds.map(decode_and_resize, num_parallel_calls=AUTOTUNE)

    resized_val_ds = val_all_ds.map(decode_and_resize, num_parallel_calls=AUTOTUNE)

    resized_test_ds = test_all_ds.map(decode_and_resize, num_parallel_calls=AUTOTUNE)

    resized_normalized_train_ds = resized_train_ds.map(normalize, num_parallel_calls=AUTOTUNE)

    resized_normalized_val_ds = resized_val_ds.map(normalize, num_parallel_calls=AUTOTUNE)

    resized_normalized_test_ds = resized_test_ds.map(normalize, num_parallel_calls=AUTOTUNE)

    In the preceding code, we apply decode_and_resize to each dataset, and then rescale every image by applying the normalize function to each sample.

  10. Define the parameters for training. We need to specify the batch size for our dataset, as well as the parameters that define the epochs:

    pixels = 224

    IMAGE_SIZE = (pixels, pixels)

    TRAIN_BATCH_SIZE = 32

    VAL_BATCH_SIZE = validation_sample_size

    TEST_BATCH_SIZE = test_sample_size

    NUM_EPOCHS = 5

    SHUFFLE_BUFFER_SIZE = 1000

    FINE_TUNING_CHOICE = False

    NUM_CLASSES = 5

    prepped_test_ds = resized_normalized_test_ds.batch(TEST_BATCH_SIZE).prefetch(buffer_size = AUTOTUNE)

    prepped_train_ds = resized_normalized_train_ds.repeat(100).shuffle(buffer_size=SHUFFLE_BUFFER_SIZE)

    prepped_train_ds = prepped_train_ds.batch(TRAIN_BATCH_SIZE)

    prepped_train_ds = prepped_train_ds.prefetch(buffer_size = AUTOTUNE)

    prepped_val_ds = resized_normalized_val_ds.repeat(NUM_EPOCHS).shuffle(buffer_size=SHUFFLE_BUFFER_SIZE)

    prepped_val_ds = prepped_val_ds.batch(VAL_BATCH_SIZE)

    prepped_val_ds = prepped_val_ds.prefetch(buffer_size = AUTOTUNE)

    In the preceding code, we defined the parameters required to set up the training process. The datasets are also batched and prefetched for consumption.

    Now we have built our dataset pipeline that will fetch a batch of data at a time to ingest into the model training process.

  11. Build your model. We will build an image classification model using the ResNet feature vector:

    mdl = tf.keras.Sequential([

        tf.keras.layers.InputLayer(input_shape=IMAGE_SIZE + (3,)),

        hub.KerasLayer("https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/4",trainable=FINE_TUNING_CHOICE),

        tf.keras.layers.Dense(NUM_CLASSES, activation='softmax', name = 'custom_class')

    ])

    mdl.build([None, 224, 224, 3])

    We use the tf.keras sequential API to build an image classification model. It first uses an input layer to accept training images of 224 by 224 by 3 pixels. Then we leverage the ResNet_V2_50 feature vector as the middle layer, used as is, since trainable=FINE_TUNING_CHOICE and FINE_TUNING_CHOICE was set to False in the previous step (you may set it to True if you wish, but this would increase your training time significantly). Finally, the output layer is a dense layer with five nodes (NUM_CLASSES = 5), where each node outputs a probability value for the respective flower type.

    So far, there is nothing specific to adversarial regularization. Starting with the next step, we will begin by building a configuration object that specifies adversarial training data and launch the training process.

  12. Convert the training samples to a dictionary. A particular requirement for adversarial regularization is to have training data and labels combined as a dictionary and then streamed into the training process. This can easily be accomplished with the following function:

    def examples_to_dict(image, label):

      return {'image_input': image, 'label_output': label}

    This function accepts the image and corresponding label, and then reformats these as key-value pairs in a dictionary.

  13. Convert the data and label collection into a dictionary. For the batched dataset, we may use the map function again to apply examples_to_dict to each element in the dataset:

    train_set_for_adv_model = prepped_train_ds.map(examples_to_dict)

    val_set_for_adv_model = prepped_val_ds.map(examples_to_dict)

    test_set_for_adv_model = prepped_test_ds.map(examples_to_dict)

    In this code, each sample in the dataset is also converted to a dictionary. This is done via the map function. The map function applies the examples_to_dict function to each element (sample) in the dataset.

  14. Create an adversarial regularization object. Now we are ready to create an adv_config object that specifies the adversarial configuration. Then we wrap the mdl base model we created in a previous step with the AdversarialRegularization API, passing adv_config:

    adv_config = nsl.configs.make_adv_reg_config()

    adv_mdl = nsl.keras.AdversarialRegularization(mdl,

    label_keys=['label_output'],

    adv_config=adv_config)

    Now we have a model, adv_mdl, that contains the base model structure as defined by mdl. adv_mdl includes knowledge of the adversarial configuration, adv_config, which will be used to create adversarial images during the training process.
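
    If you want more control, make_adv_reg_config accepts tuning parameters such as the weight of the adversarial loss term and the perturbation step size; the values below are illustrative only:

    adv_config = nsl.configs.make_adv_reg_config(
        multiplier=0.2,          # weight of the adversarial loss relative to the base loss
        adv_step_size=0.05,      # magnitude of the adversarial perturbation
        adv_grad_norm='infinity')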

  15. Compile and train the model. This part is similar to what we did previously. It is no different to training the base model, except for the input dataset:

    adv_mdl.compile(optimizer=tf.keras.optimizers.SGD(lr=0.005, momentum=0.9),

        loss=tf.keras.losses.CategoricalCrossentropy(

                      from_logits=True, label_smoothing=0.1),

        metrics=['accuracy'])

    adv_mdl.fit(

        train_set_for_adv_model,

        epochs=5, steps_per_epoch=100,

        validation_data=val_set_for_adv_model,

        validation_steps=1)

    Notice that the inputs to the fit function are now train_set_for_adv_model and val_set_for_adv_model, which are datasets that stream each sample as a dictionary into the training process.

It doesn't take a lot of work to set up adversarial regularization with tf.keras and adversarial regularization APIs. Basically, an extra step is required to reformat the sample and label into a dictionary. Then, we wrap our model using the nsl.keras.AdversarialRegularization API, which encapsulates the model architecture and adversarial regularization object. This makes it very easy to implement this type of regularization.
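
As a quick follow-up sketch, the held-out test split that we converted to dictionaries earlier (test_set_for_adv_model) can be passed to the wrapped model's evaluate method in the usual Keras fashion:

results = adv_mdl.evaluate(test_set_for_adv_model)
print(dict(zip(adv_mdl.metrics_names, results)))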

Summary

This chapter presented some common practices for enhancing and improving your model building and training processes. One of the most common issues in handling training data is streaming or fetching it in an efficient and scalable manner. In this chapter, you have seen two methods to help you build such an ingestion pipeline: generators and TFRecord datasets. Each has its strengths and purposes: generators manage data transformation and batching with minimal code, while the TFRecord dataset approach is required when a TPU is the compute target.

We also learned how to implement various regularization techniques, from the traditional L1, L2, and elastic net regularization to a modern technique known as adversarial regularization, demonstrated here on image classification. Adversarial regularization also manages data transformation and augmentation on your behalf, saving you the effort of generating noisy images yourself. These APIs and capabilities enhance the TensorFlow Enterprise user experience and help save development time.

In the next chapter, we are going to see how to serve a TensorFlow model with TensorFlow Serving.
