The Classify method

Now that we understand a little about CNNs, we are ready to put this knowledge into practice by creating an asynchronous classification method. TensorFlow can classify images supplied in a number of formats, so we declare our method to accept only the appropriate types:

public async Classify(image: tf.Tensor3D | ImageData | HTMLImageElement | 
  HTMLCanvasElement | HTMLVideoElement):
    Promise<TensorInformation[] | null> {
}

Only one of these types, Tensor3D, is specific to TensorFlow. All the others are standard DOM types, so the method can be consumed in a web page without first having to convert the image into a special format.

We haven't introduced our TensorInformation interface yet. When we receive classifications back from MobileNet, each one consists of a classification name and a confidence level for that classification. The classification operation resolves to an array of objects, each holding a className and a probability, so we describe that shape with a named interface that is more meaningful for our consuming code:

export interface TensorInformation {
  className: string;
  probability: number;
}
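To make the shape concrete, here is a minimal sketch of how consuming code might turn these values into display text. The describeClassification helper and the sample values are hypothetical; they are not part of the project or of MobileNet.

```typescript
// Mirrors the TensorInformation interface from the text.
interface TensorInformation {
  className: string;
  probability: number;
}

// Hypothetical helper: formats one classification for display,
// converting the 0..1 probability into a percentage.
function describeClassification(info: TensorInformation): string {
  const percent = (info.probability * 100).toFixed(1);
  return `${info.className} (${percent}%)`;
}

// Made-up sample values, purely for illustration.
const sampleClassifications: TensorInformation[] = [
  { className: 'tabby cat', probability: 0.875 },
  { className: 'tiger cat', probability: 0.125 },
];

const displayLabels = sampleClassifications.map(describeClassification);
```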

We now know that we are going to be returning an array of classifications, each pairing a class name with a probability (the confidence level). Getting back to our Classify method, we need to load MobileNet if it has not previously been loaded. This operation can take a while, which is why we cache the model so that we don't have to reload it the next time we call this method:

if (!this.model) {
  this.model = await mobilenet.load();
}

We have accepted the defaults for the load operation. There are a number of options that we could have supplied if we needed to:

version: This sets the MobileNet version number, and defaults to 1. Right now, there are two values that can be set: 1 means that we use MobileNetV1, and 2 means that we use MobileNetV2. Practically, for us, the difference between versions relates to the accuracy and performance of the model.
alpha: This can be set to 0.25, 0.5, 0.75, or 1. Surprisingly, this has nothing to do with the alpha channel on an image. Instead, it refers to the width of the network that will be used, effectively trading accuracy for performance. The higher the number, the greater the accuracy but the slower the performance. The default for the alpha is 1.
modelUrl: If we wanted to work with a custom model, we could supply this here.
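The options above can be sketched as a small typed object. The LoadOptions type and the withDefaults helper below are hypothetical stand-ins written for this illustration; they are not imported from the MobileNet package. The commented-out call at the end assumes a package version whose load function accepts an options object, and other releases may expect the values to be passed differently.

```typescript
// Hypothetical local types mirroring the three options described above.
type MobileNetVersion = 1 | 2;
type MobileNetAlpha = 0.25 | 0.5 | 0.75 | 1;

interface LoadOptions {
  version?: MobileNetVersion;
  alpha?: MobileNetAlpha;
  modelUrl?: string;
}

// Hypothetical helper: fills in the defaults the text describes
// (version 1 and alpha 1) for any option the caller leaves out.
function withDefaults(options: LoadOptions = {}): LoadOptions {
  return { version: 1, alpha: 1, ...options };
}

// A narrower, faster network that gives up some accuracy.
const fastOptions = withDefaults({ alpha: 0.25 });

// Assumption: inside Classify, the options would be handed to the loader:
//   this.model = await mobilenet.load(fastOptions);
```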

If the model loads successfully, then we can now perform the image classification. This is a straightforward call to the classify method, taking in the image that has been passed into our method. Following the completion of this operation, we return the array of classification results:

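if (this.model) {
  const result = await this.model.classify(image);
  // Spread into a new array; spreading into an object literal would not
  // satisfy the TensorInformation[] return type.
  return [...result];
}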

The model.classify method returns three classifications by default, but if we wanted to, we could pass a parameter to return a different number of classifications. If we wanted to retrieve the top five results, we would change the model.classify line, as follows:

const result = await this.model.classify(image, 5);
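Conceptually, the second argument limits the result to the most probable classes. The following is a minimal sketch of that idea using made-up scores; the topK function is our own illustration and is not how MobileNet implements the selection internally.

```typescript
interface ScoredClass {
  className: string;
  probability: number;
}

// Illustrative only: returns the k most probable entries, highest first.
// The input array is copied so that the caller's data is left untouched.
function topK(scores: ScoredClass[], k: number): ScoredClass[] {
  return [...scores]
    .sort((a, b) => b.probability - a.probability)
    .slice(0, k);
}

// Made-up scores, purely for illustration.
const allScores: ScoredClass[] = [
  { className: 'a', probability: 0.2 },
  { className: 'b', probability: 0.7 },
  { className: 'c', probability: 0.1 },
];

const bestTwo = topK(allScores, 2);
```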

Finally, in the unlikely event that the model failed to load, we return null. With this in place, our completed Classify method looks like this:

public async Classify(image: tf.Tensor3D | ImageData | HTMLImageElement |
  HTMLCanvasElement | HTMLVideoElement):
    Promise<TensorInformation[] | null> {
  // Load MobileNet on first use and cache it for later calls.
  if (!this.model) {
    this.model = await mobilenet.load();
  }
  if (this.model) {
    const result = await this.model.classify(image);
    // Return a new array of classifications.
    return [...result];
  }
  // The model failed to load.
  return null;
}
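To see the caching behavior in isolation, the sketch below reproduces the same load-once pattern with a fake model, so that it runs without TensorFlow or a browser. The ImageClassifier class name, the injected loadModel function, and the fake loader are all assumptions made for this illustration; the real class calls mobilenet.load() directly.

```typescript
interface TensorInformation {
  className: string;
  probability: number;
}

// Minimal stand-in for the part of the model that Classify relies on.
interface ClassifierModel<TImage> {
  classify(image: TImage, topk?: number): Promise<TensorInformation[]>;
}

class ImageClassifier<TImage> {
  private model: ClassifierModel<TImage> | null = null;

  // The loader is injected so that a fake can stand in for mobilenet.load().
  constructor(
    private readonly loadModel: () => Promise<ClassifierModel<TImage>>,
  ) {}

  public async Classify(image: TImage): Promise<TensorInformation[] | null> {
    if (!this.model) {
      // Loading is slow, so it happens on the first call only.
      this.model = await this.loadModel();
    }
    if (this.model) {
      const result = await this.model.classify(image);
      return [...result];
    }
    return null;
  }
}

// Fake loader that counts how many times it has been called.
let loadCount = 0;
const fakeLoader = async (): Promise<ClassifierModel<string>> => {
  loadCount++;
  return {
    classify: async (image: string) => [
      { className: `fake:${image}`, probability: 1 },
    ],
  };
};
```

Calling Classify twice on the same ImageClassifier instance invokes the loader once; the second call reuses the cached model.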

TensorFlow really can be that simple. Obviously, behind the scenes, a great deal of complexity has been hidden, but that is the beauty of well-designed libraries. They should shield us from the complexities while leaving us with room to get into the more complex operations and customization if we need to.

So, that's our image classification component written. How do we use it in our Vue application, though? In the next section, we are going to see how we modify the HelloWorld component to use this class.
