Using TensorFlow with ML.NET

In this chapter, we will be using a pre-trained TensorFlow model, specifically the Inception model, and we'll integrate the model into a Windows Presentation Foundation (WPF) application. We will take the pre-trained model and apply transfer learning by adding some pictures of food and bodies of water. After the transfer learning has been performed, we will then allow the user to select their own images. By the end of the chapter, you should have a firm grasp of what it takes to integrate a TensorFlow model into your ML.NET application.

The following topics will be covered in this chapter:

  • Breaking down Google's Inception model
  • Creating the image classification desktop application
  • Exploring additional production application enhancements

Breaking down Google's Inception model

Google's Inception model (https://github.com/google/inception) has been trained on millions of images to help with one of the growing questions in our society: what is in my image? The types of applications wanting to answer this question include facial matching, automatic detection of weapons or unwanted objects, sports branding in game pictures (such as the brand of sneakers), and image archivers that let users search their images without manual tags, to name just a few.

This type of question is typically answered with object recognition. An application of object recognition that you might already be familiar with is optical character recognition (OCR). OCR is when an image of characters can be interpreted as text, such as what is found in Microsoft's OneNote Handwriting to Text feature, or in a toll booth that reads license plates. The particular application of object recognition that we will be looking into specifically is called image classification.

The Inception model helps with this problem by using deep learning to classify images. The model was trained in a supervised manner on millions of images, with the output being a neural network. The advantage of this approach is that the pre-built model can be enhanced with a smaller subset of images, which is what we will be doing in the next section of this chapter. This approach of adding additional data and labels is called transfer learning, and it can also be helpful when creating customer-specific models.

Think of it like creating a branch from your master branch on GitHub; you might want to add just one class or modify one element without having to re-create the entire code base. In regard to models, take, for instance, an image classifier for automobiles. Let us assume that you obtain millions of images covering US and foreign cars, trucks, vans, and more. A new customer comes to you requesting a model to help monitor vehicles entering a government facility. The previous model should not be thrown away and won't need to be fully retrained; you would simply need to add more labeled images of commercial (or perhaps military) vehicles.

For a more in-depth dive into Google's image classification work, a good resource is their developer documentation, which can be found at https://developers.google.com/machine-learning/practica/image-classification/.

Creating the WPF image classification application

As mentioned earlier, the application that we will be creating is an image classification application, specifically one that allows the user to select an image and determine whether it contains food or water. This is achieved with the aforementioned, included pre-trained TensorFlow Inception model. The first time the application is run, the ML.NET version of the model is trained with the images and the tags.tsv file (to be reviewed in the next section).

As with previous chapters, the completed project code, sample dataset, and project files can be downloaded here: https://github.com/PacktPublishing/Hands-On-Machine-Learning-With-ML.NET/tree/master/chapter12.

Exploring the project architecture

In this chapter, we will dive into a WPF desktop application. As mentioned in the first section of this chapter, we will be using the WPF framework to create our application. You might be asking, why not a UWP application such as the browser application that we created in Chapter 10, Using ML.NET with UWP? The reasoning, at least at the time of writing, is that TensorFlow support, specifically for image classification, is not fully supported in a UWP application. Perhaps, in future versions of ML.NET, this will be added. For other non-image-based applications, you may be able to use TensorFlow in a UWP application.

Those who have done WPF development previously, and who are looking closely, will notice that the project utilizes .NET Core 3.1. In .NET Core 3.0, Microsoft added support for WPF and WinForms; therefore, you are no longer tied to the Windows-only .NET Framework for GUI development. Instead, this support is added through the Microsoft.WindowsDesktop.App.WPF NuGet package.
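As a rough illustration of what this looks like in practice, a minimal .NET Core 3.1 WPF project file could resemble the following sketch (the exact property and version values are assumptions for illustration; refer to the .csproj in the chapter's GitHub repository for the actual file):

```xml
<Project Sdk="Microsoft.NET.Sdk.WindowsDesktop">
  <PropertyGroup>
    <OutputType>WinExe</OutputType>
    <TargetFramework>netcoreapp3.1</TargetFramework>
    <!-- Pulls in the WPF portion of the Windows Desktop SDK -->
    <UseWPF>true</UseWPF>
    <!-- TensorFlow support is particular about CPU targets, hence x64 -->
    <PlatformTarget>x64</PlatformTarget>
  </PropertyGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.ML" Version="1.3.1" />
    <PackageReference Include="Microsoft.ML.ImageAnalytics" Version="1.3.1" />
    <PackageReference Include="Microsoft.ML.TensorFlow" Version="1.3.1" />
    <PackageReference Include="SciSharp.TensorFlow.Redist" Version="1.14.0" />
  </ItemGroup>
</Project>
```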

For this example, we will be using the Microsoft.ML (1.3.1) NuGet package, in addition to several other NuGet packages, to be able to utilize TensorFlow within our .NET application. These include the following:

  • Microsoft.ML.ImageAnalytics (1.3.1)
  • Microsoft.ML.TensorFlow (1.3.1)
  • SciSharp.TensorFlow.Redist (1.14.0)

By the time you read this, there may very well be newer versions of these packages, and they should work; however, the versions noted above are the ones that we are going to use in this deep dive, and the ones available in the GitHub repository.

In the following screenshot, you will find the Visual Studio Solution Explorer view of the solution. Due to the TensorFlow support being much more particular about project types and CPU targets, we have gone back to a single project, as opposed to the three-project architecture that was used in the previous several chapters:

The tags.tsv file (found in the assets/images folder in the code repository) contains eight rows, which map the included images to their preclassified labels:

ChickenWings.jpg food
Steak.jpg food
Pizza.jpg food
MongolianGrill.jpg food
Bay.jpg water
Bay2.jpg water
Bay3.jpg water
Beach.jpg water

If you want to experiment with your own classification, delete the included images, copy in your own, and update the tags.tsv file with your labels. I should note that all of the included images were taken by me on various vacations to California; feel free to use them as you wish.

The files in the assets/inception folder contain all of the Google pre-trained files (and license file).

Diving into the WPF image classification application

As discussed in the opening section, our desktop application is a WPF application. For the scope of this example, as in Chapter 10, Using ML.NET with UWP, we follow the standard approach for handling the application architecture: the Model-View-ViewModel (MVVM) design pattern.

The files that we will be diving into in this section are as follows:

  • MainWindowViewModel
  • MainWindow.xaml
  • MainWindow.xaml.cs
  • BaseML
  • ImageDataInputItem
  • ImageDataPredictionItem
  • ImageClassificationPredictor

The rest of the files inside the WPF project were untouched from the default Visual Studio .NET Core 3.1 WPF application template; for example, the App.xaml and AssemblyInfo.cs files.

The MainWindowViewModel class

The purpose of the MainWindowViewModel class is to contain our business logic and control the view, as shown here:

  1. The first thing we do is instantiate our previously discussed ImageClassificationPredictor class, so that it can be used to run predictions:
private readonly ImageClassificationPredictor _prediction = new ImageClassificationPredictor();
  2. The next block of code handles the power of MVVM for the classification string, and also stores the selected image. For each of these properties, we call OnPropertyChanged upon a change in value, which triggers the View's bindings to refresh any field that is bound to these properties:

private string _imageClassification;

public string ImageClassification
{
    get => _imageClassification;

    set
    {
        _imageClassification = value;
        OnPropertyChanged();
    }
}

private ImageSource _imageSource;

public ImageSource SelectedImageSource
{
    get => _imageSource;

    set
    {
        _imageSource = value;
        OnPropertyChanged();
    }
}
  3. Next, we define the Initialize method, which calls the predictor's Initialize method. The method will return a tuple, which indicates whether the model can't be loaded or whether it is not found, along with the exception (if thrown):
public (bool Success, string Exception) Initialize() => _prediction.Initialize();
  4. Then, we handle what happens when the user clicks the Select Image button. This method opens a dialog box prompting the user to select an image. If the user cancels the dialog, the method returns. Otherwise, we call the two helper methods to load the image into memory and classify the image:

public void SelectFile()
{
    var ofd = new OpenFileDialog
    {
        Filter = "Image Files(*.BMP;*.JPG;*.PNG)|*.BMP;*.JPG;*.PNG"
    };

    var result = ofd.ShowDialog();

    if (!result.HasValue || !result.Value)
    {
        return;
    }

    LoadImageBytes(ofd.FileName);

    Classify(ofd.FileName);
}
  5. The LoadImageBytes method takes the filename and loads the image into our MVVM-based ImageSource property so that, after selection, the image control is automatically updated with a view of the selected image:

private void LoadImageBytes(string fileName)
{
    var image = new BitmapImage();

    var imageData = File.ReadAllBytes(fileName);

    using (var mem = new MemoryStream(imageData))
    {
        mem.Position = 0;

        image.BeginInit();

        image.CreateOptions = BitmapCreateOptions.PreservePixelFormat;
        image.CacheOption = BitmapCacheOption.OnLoad;
        image.UriSource = null;
        image.StreamSource = mem;

        image.EndInit();
    }

    image.Freeze();

    SelectedImageSource = image;
}
  6. And lastly, the Classify method takes the path and passes it into the Predictor class. Upon the prediction returning, the classification and confidence are built into our MVVM ImageClassification property; therefore, the UI updates automatically:

public void Classify(string imagePath)
{
    var result = _prediction.Predict(imagePath);

    ImageClassification = $"Image ({imagePath}) is a picture of {result.PredictedLabelValue} with a confidence of {result.Score.Max().ToString("P2")}";
}

The last element of the MainWindowViewModel class is the same OnPropertyChanged method that we defined in Chapter 10, Using ML.NET with UWP, which allows the MVVM magic to happen. With our ViewModel class defined, let us move on to the MainWindow XAML file.
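For reference, a typical implementation of that pattern is sketched below; this is a common INotifyPropertyChanged base class, not necessarily the exact code from Chapter 10 (the class name here is an assumption for illustration):

```csharp
using System.ComponentModel;
using System.Runtime.CompilerServices;

public abstract class ObservableObject : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    // CallerMemberName fills in the calling property's name automatically,
    // which is why the setters above can call OnPropertyChanged() with no arguments
    protected void OnPropertyChanged([CallerMemberName] string propertyName = null) =>
        PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));
}
```

With a base class like this, any bound property that calls OnPropertyChanged in its setter will cause WPF's binding engine to re-read the property and refresh the UI.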

The MainWindow.xaml class

As discussed in the Breaking down UWP architecture section of Chapter 10, Using ML.NET with UWP, XAML markup is used to define your user interface. For the scope of this application, our UI is relatively simple: a Button, an Image control, and a TextBlock.

We will look at the code now:

  1. The first thing that we define is our grid. In XAML, a grid is a container similar to a <div> in web development. We then define our rows. Similar to Bootstrap (but easier to understand in my opinion), is the pre-definition of the height of each row. Setting a row to Auto will auto-size the height to the content's height, while an asterisk translates to using all remaining height based on the main container's height:
<Grid.RowDefinitions>
    <RowDefinition Height="Auto" />
    <RowDefinition Height="*" />
    <RowDefinition Height="Auto" />
</Grid.RowDefinitions>
  2. We first define our Button object, which will trigger the aforementioned SelectFile method in our ViewModel class:
<Button Grid.Row="0" Margin="0,10,0,0" Width="200" Height="35" Content="Select Image File" HorizontalAlignment="Center" Click="btnSelectFile_Click" />
  3. We then define our Image control, which is bound to our previously reviewed SelectedImageSource property that is found in our ViewModel class:
<Image Grid.Row="1" Margin="10,10,10,10" Source="{Binding SelectedImageSource}" />
  4. We then add the TextBlock control that will display our classification:
<TextBlock Grid.Row="2" Text="{Binding ImageClassification, Mode=OneWay}" TextWrapping="Wrap" Foreground="White" Margin="10,10,10,10" HorizontalAlignment="Center" FontSize="16" />

With the XAML aspect of our View defined, let us now dive into the code behind of the MainWindow class.

The MainWindow.xaml.cs file

The MainWindow.xaml.cs file contains the code behind the XAML view, which is discussed here:

  1. The first thing that we define is a wrapper property around the DataContext property, which is built into the base Window class:
private MainWindowViewModel ViewModel => (MainWindowViewModel) DataContext;
  2. Next, we define the constructor for MainWindow, in order to initialize the DataContext property to our MainWindowViewModel object. If the initialization fails, we do not want the application to continue. In addition, we need to let the user know why it failed, using a MessageBox object:

public MainWindow()
{
    InitializeComponent();

    DataContext = new MainWindowViewModel();

    var (success, exception) = ViewModel.Initialize();

    if (success)
    {
        return;
    }

    MessageBox.Show($"Failed to initialize model - {exception}");

    Application.Current.Shutdown();
}
  3. Lastly, we call the ViewModel's SelectFile method to handle the image selection and classification:
private void btnSelectFile_Click(object sender, RoutedEventArgs e) => ViewModel.SelectFile();

With the code behind of the MainWindow class covered, that concludes the WPF component. Let us now focus on the machine learning part of the example.

The BaseML class

The BaseML class, as used in most of the previous examples, exposes a base class for our ML.NET classes. In the case of this example, we actually streamlined the class due to the nature of using a pre-trained model. The class now simply initializes the MLContext property:

public class BaseML
{
    protected MLContext MlContext;

    public BaseML()
    {
        MlContext = new MLContext(2020);
    }
}

With the streamlined BaseML class reviewed, let us dive into the ImageDataInputItem class.

The ImageDataInputItem class

The ImageDataInputItem class is the input class that we pass into the model; the essential property is the ImagePath property:

public class ImageDataInputItem
{
    [LoadColumn(0)]
    public string ImagePath;

    [LoadColumn(1)]
    public string Label;
}

While this class is smaller than most of our input classes, the Inception model only requires these two properties. Now, let us dive into the output class, which is called ImageDataPredictionItem.

The ImageDataPredictionItem class

The ImageDataPredictionItem class contains the prediction response, including the predicted value string (Water or Food in the case of the included images) and the confidence of that prediction:

public class ImageDataPredictionItem : ImageDataInputItem
{
    public float[] Score;

    public string PredictedLabelValue;
}

Much like the input class, the output class has only two properties, similar to previous examples. With the input and output classes behind us, let us dive into the ImageClassificationPredictor class, which uses these classes for transfer learning and predictions.

The ImageClassificationPredictor class

The ImageClassificationPredictor class contains all of the code that is needed to load and predict against the Inception TensorFlow model:

  1. First, we need to define several helper variables to access the images and .tsv files:
// Training Variables
private static readonly string _assetsPath = Path.Combine(Environment.CurrentDirectory, "assets");
private static readonly string _imagesFolder = Path.Combine(_assetsPath, "images");
private readonly string _trainTagsTsv = Path.Combine(_imagesFolder, "tags.tsv");
private readonly string _inceptionTensorFlowModel = Path.Combine(_assetsPath, "inception", "tensorflow_inception_graph.pb");

private const string TF_SOFTMAX = "softmax2_pre_activation";
private const string INPUT = "input";

private static readonly string ML_NET_MODEL = Path.Combine(Environment.CurrentDirectory, "chapter12.mdl");
  2. Next, we define the settings that the pre-trained Inception model needs:

private struct InceptionSettings
{
    public const int ImageHeight = 224;
    public const int ImageWidth = 224;
    public const float Mean = 117;
    public const float Scale = 1;
    public const bool ChannelsLast = true;
}
  3. Next, we create our Predict method and an overload that simply takes the image file path. As in previous examples, we create the PredictionEngine with a call to our MLContext object, passing in our input class (ImageDataInputItem) and our output class (ImageDataPredictionItem), and then call the Predict method to get our model's prediction:

public ImageDataPredictionItem Predict(string filePath) =>
    Predict(new ImageDataInputItem
    {
        ImagePath = filePath
    });

public ImageDataPredictionItem Predict(ImageDataInputItem image)
{
    var predictor = MlContext.Model.CreatePredictionEngine<ImageDataInputItem, ImageDataPredictionItem>(_model);

    return predictor.Predict(image);
}

  4. Finally, we initialize and extend our pre-trained model with our own samples:

public (bool Success, string Exception) Initialize()
{
    try
    {
        if (File.Exists(ML_NET_MODEL))
        {
            _model = MlContext.Model.Load(ML_NET_MODEL, out DataViewSchema modelSchema);

            return (true, string.Empty);
        }

        ...
    }
    catch (Exception ex)
    {
        return (false, ex.ToString());
    }
}
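The elided training portion of Initialize lives in the repository. As a rough, hedged sketch, an ML.NET 1.3.x transfer-learning pipeline for Inception generally chains image loading, resizing, pixel extraction, TensorFlow scoring, and a multiclass trainer. The sketch below is modeled on Microsoft's public ML.NET image classification tutorial, not the book's exact code; the column names "input" and "softmax2_pre_activation" match the INPUT and TF_SOFTMAX constants defined earlier, while "LabelKey" and the trainer choice are assumptions:

```csharp
// Load the tags.tsv rows (ImagePath, Label) as the training data
var data = MlContext.Data.LoadFromTextFile<ImageDataInputItem>(_trainTagsTsv);

var pipeline = MlContext.Transforms.Conversion.MapValueToKey("LabelKey", "Label")
    // Load and pre-process each image to match Inception's expected input tensor
    .Append(MlContext.Transforms.LoadImages(INPUT, _imagesFolder, nameof(ImageDataInputItem.ImagePath)))
    .Append(MlContext.Transforms.ResizeImages(INPUT,
        InceptionSettings.ImageWidth, InceptionSettings.ImageHeight, INPUT))
    .Append(MlContext.Transforms.ExtractPixels(INPUT,
        interleavePixelColors: InceptionSettings.ChannelsLast,
        offsetImage: InceptionSettings.Mean))
    // Run the frozen Inception graph up to the penultimate layer and use its
    // output as features for a new, small classifier (the transfer-learning step)
    .Append(MlContext.Model.LoadTensorFlowModel(_inceptionTensorFlowModel)
        .ScoreTensorFlowModel(new[] { TF_SOFTMAX }, new[] { INPUT }, addBatchDimensionInput: true))
    .Append(MlContext.MulticlassClassification.Trainers
        .LbfgsMaximumEntropy("LabelKey", TF_SOFTMAX))
    .Append(MlContext.Transforms.Conversion.MapKeyToValue("PredictedLabelValue", "PredictedLabel"));

_model = pipeline.Fit(data);

// Persist the trained model so subsequent runs skip training entirely
MlContext.Model.Save(_model, data.Schema, ML_NET_MODEL);
```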

For the full code, please refer to the following GitHub repository link: https://github.com/PacktPublishing/Hands-On-Machine-Learning-With-ML.NET/blob/master/chapter12/chapter12.wpf/ML/ImageClassificationPredictor.cs. With the Initialize method completed, that concludes the code deep dive. Let us now run the application!

Running the image classification application

Since we are using a pre-trained model, we can just run the application from Visual Studio. Upon running the application, you will be presented with a mostly empty window:

Clicking on the Select Image File button and then selecting an image file will trigger the model to run. In my case, I selected a picture from a recent vacation to Germany, which came back with a 98.84% confidence score:

Feel free to try various files on your machine to see the confidence score and classification—if you start noticing issues, add more samples to the images folder and tags.tsv file, as noted in the earlier section. Be sure to delete the chapter12.mdl file prior to making these changes.

Additional ideas for improvements

Now that we have completed our deep dive, there are a couple of additional elements that could possibly further enhance the application. A few ideas are discussed here.

Self-training based on the end user's input

One of the advantages, as noted in the opening section of this chapter, is the ability to utilize transfer learning in dynamic applications. Unlike the previous example applications reviewed in this book, this application could actually allow the end user to select a series (or folder) of images and, with a few code changes, build a new .tsv file and train a new model. For a web application or commercial product, this would provide high value and would also reduce the burden on you to, for instance, obtain images of every type, which is a daunting, and more than likely futile, goal.
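As a hedged sketch of that idea, regenerating the tags.tsv file from user-supplied images could look like the following; the helper name and the user-supplies-one-label-per-batch flow are assumptions for illustration, and the row format matches the tags.tsv shown earlier:

```csharp
using System.Collections.Generic;
using System.IO;

public static class TrainingDataBuilder
{
    // Hypothetical helper: copy user-selected files into the images folder
    // and append matching "<fileName>\t<label>" rows to tags.tsv
    public static void AddTrainingSamples(IEnumerable<string> filePaths, string label,
        string imagesFolder, string tagsTsvPath)
    {
        var newRows = new List<string>();

        foreach (var filePath in filePaths)
        {
            var fileName = Path.GetFileName(filePath);

            File.Copy(filePath, Path.Combine(imagesFolder, fileName), overwrite: true);
            newRows.Add($"{fileName}\t{label}");
        }

        File.AppendAllLines(tagsTsvPath, newRows);
    }
}
```

After appending samples, you would also delete the cached chapter12.mdl file so that the next call to Initialize retrains the model with the new data.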

Logging

As mentioned in the Logging section of Chapter 10, Using ML.NET with UWP, having a desktop application has its pros and cons. The biggest con, which necessitates logging, is that your desktop application could be installed on any number of configurations from Windows 7 to Windows 10, with an almost unlimited number of permutations. As mentioned previously, logging with NLog (https://nlog-project.org/) or a similar open source project is highly recommended, coupled with a remote logging solution such as Loggly, so that you can get error data from your users' machines. Given the GDPR and the recent CCPA, we need to ensure that we disclose what data leaves the end user's machine and that these logs do not include personal data (or actual images uploaded to a remote server via the logging mechanism).

Utilizing a database

Similar to the performance optimization suggestion in Chapter 10, Using ML.NET with UWP, if a user selects the same image more than once, especially if this application was being used in a kiosk or converted to a web application, the performance advantages of storing the classification could be fairly significant. A quick and easy method for achieving this would be to compute a SHA256 hash of the image and check that hash against a database. Depending on the number of users, and whether they are going to be concurrent, I would suggest one of two options:

  • If the users are going one at a time and the application is going to remain a WPF application, using the previously mentioned lightweight database—LiteDB (http://www.litedb.org/)—would be recommended.
  • If you are launching a large web application in production, then MongoDB or a horizontally scalable database, such as Microsoft's Cosmos DB, would be recommended in order to ensure that the database lookups wouldn't be slower than simply re-performing the model prediction.
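To make the hash-then-lookup idea concrete, here is a minimal sketch; the in-memory dictionary stands in for whichever store you choose (LiteDB, MongoDB, or Cosmos DB), and the class and method names are assumptions for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public class CachedClassifier
{
    // Stand-in for a persistent store keyed by the image's SHA256 hash
    private readonly Dictionary<string, string> _cache = new Dictionary<string, string>();

    private static string HashImage(string filePath)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(filePath))
        {
            return BitConverter.ToString(sha.ComputeHash(stream)).Replace("-", "");
        }
    }

    // predict is whatever delegate produces the classification string for an image
    public string Classify(string filePath, Func<string, string> predict)
    {
        var key = HashImage(filePath);

        if (_cache.TryGetValue(key, out var cached))
        {
            return cached; // cache hit: skip the model entirely
        }

        var result = predict(filePath);
        _cache[key] = result;
        return result;
    }
}
```

Because the hash is computed from the file's bytes, the same image selected twice (even from different paths) hits the cache instead of re-running the prediction.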

Summary

Over the course of this chapter, we have deep-dived into what goes into creating a WPF application using a pre-trained TensorFlow model. We also reviewed and looked closely into Google's image classification Inception model. In addition, we learned how to take that model and integrate it in order to perform image classification on user-selected images. Lastly, we also discussed some ways to further enhance the example application.

In the next and last chapter, we will focus on using a pre-trained ONNX model in a WPF application for object detection.
