Converting Keras Tiny YOLO to Core ML

In the previous section, we discussed the concepts of the model and algorithm we will be using in this chapter. In this section, we will be moving one step closer to realizing the example project for this chapter by converting a trained Keras model of Tiny YOLO to Core ML using Apple's Core ML Tools Python package; but, before doing so, we will quickly discuss the model and the data it was trained on.

YOLO was originally implemented in a neural network framework called darknet, which is not currently supported by the default Core ML Tools package; fortunately, the authors of YOLO and darknet have made the architecture and weights of the trained model publicly available on their website at https://pjreddie.com/darknet/yolov2/. There are a few variations of YOLO, trained on either the Common Objects in Context (COCO) dataset, which consists of 80 classes, or The PASCAL Visual Object Classes (VOC) Challenge 2007 dataset, which consists of 20 classes.

The official COCO website can be found at http://cocodataset.org, and the website for The PASCAL VOC Challenge 2007 at http://host.robots.ox.ac.uk/pascal/VOC/index.html.

In this chapter, we will be using the Tiny version of YOLOv2 and the weights from the model trained on The PASCAL VOC Challenge 2007 dataset. The Keras model we will be using was built from the configuration file and weights available on the official site (linked previously).

As usual, we will be omitting a lot of details of the model and will instead provide the model in its diagrammatic form, shown next. We'll then discuss some of the relevant parts before moving on to convert it into a Core ML model:

The first thing to notice is the shape of the input and output; this tells us what our model expects to be fed and what it will output for us to use. As shown previously, the input size is 416 x 416 x 3, which, as you might suspect, is a 416 x 416 RGB image. The output shape needs a little more explanation, and it will become clearer when we get to coding the example for this chapter.

The output shape is 13 x 13 x 125. The 13 x 13 tells us the size of the grid being applied, that is, the 416 x 416 image is broken into a grid of 13 x 13 cells, as follows:

As discussed previously, each cell encodes 5 bounding boxes, and each box is described by 25 values: the box's center and size, a confidence score for an object being present, and the probabilities across all 20 classes. This gives us 5 x 25 = 125 values per cell; visually, this is explained as follows:
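In addition to the diagram, the following is a minimal NumPy sketch of this encoding (it is not part of the book's Notebook, and the variable names are purely illustrative); it shows how a raw 13 x 13 x 125 output could be reshaped so that the 5 boxes, and the 25 values describing each box, become explicit:

import numpy as np

# Stand-in for the raw model output for a single image: 13 x 13 x 125.
output = np.random.rand(13, 13, 125).astype(np.float32)

# Reshape so each cell's 5 boxes and their 25 values are explicit:
# (grid_y, grid_x, box, [x, y, w, h, confidence, 20 class scores]).
grid = output.reshape(13, 13, 5, 25)

box_params = grid[..., 0:4]   # encoded center and size of each box
confidences = grid[..., 4]    # objectness score per box
class_scores = grid[..., 5:]  # scores across the 20 classes per box

print(box_params.shape, confidences.shape, class_scores.shape)
# (13, 13, 5, 4) (13, 13, 5) (13, 13, 5, 20)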

The final point about the model that I want to highlight is its simplicity; the bulk of the network is made up of convolutional blocks, each consisting of a convolutional layer, batch normalization, a LeakyReLU activation, and finally a max pooling layer. These blocks progressively increase the number of filters (depth) of the network while the max pooling reduces the spatial resolution; once the desired grid size is reached, the data is transformed using only the convolutional layer, batch normalization, and LeakyReLU activation, dropping the max pooling.

Now, we have introduced the terms batch normalization and leaky ReLU, which may be unfamiliar to some; here I will provide a brief description of each, starting with batch normalization. It's considered best practice to normalize the input layer before feeding it into a network. For example, we normally divide pixel values by 255 to force them into a range of 0 to 1. We do this to make it easier for the network to learn by removing any large values (or large variance in values), which may cause our network to oscillate when adjusting the weights during training. Batch normalization performs this same adjustment for the outputs of the hidden layers rather than the inputs.
ReLU is an activation function that sets anything below 0 to 0, that is, it doesn't allow negative values to propagate through the network. Leaky ReLU is a less strict variant of ReLU that allows a small non-zero gradient to pass through when the neuron is not active (that is, when its input is negative).
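To make the block structure described previously more tangible, here is a short Keras sketch of a single convolutional block in this style; it is not taken from the actual model definition, and the filter count and alpha value are illustrative only:

from keras.models import Sequential
from keras.layers import Conv2D, BatchNormalization, LeakyReLU, MaxPooling2D

# Illustrative only: one block in the style used by Tiny YOLO, that is,
# convolution -> batch normalization -> LeakyReLU -> max pooling.
model = Sequential()
model.add(Conv2D(16, (3, 3), padding='same', use_bias=False,
                 input_shape=(416, 416, 3)))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.summary()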

This concludes our brief overview of the model. You can learn more about it from the official paper YOLO9000: Better, Faster, Stronger by J. Redmon and A. Farhadi, available at https://arxiv.org/abs/1612.08242. Let's now turn our attention to converting the trained Keras model of the Tiny YOLO to Core ML.

As alluded to in Chapter 1, Introduction to Machine Learning, Core ML is more of a suite of tools than a single framework. One part of this suite is the Core ML Tools Python package, which assists in converting trained models from other frameworks to Core ML for easy and rapid integration. Currently, official converters are available for Caffe, Keras, LibSVM, scikit-learn, and XGBoost, but the package is open source, with many other converters being made available for other popular ML frameworks, such as TensorFlow.

At its core, the conversion process generates a model specification, a machine-interpretable representation of the learned model that is used by Xcode to generate the Core ML model. The specification consists of the following:

  • Model description: Encodes names and type information of the inputs and outputs of the model
  • Model parameters: The set of parameters required to represent a specific instance of the model (model weights/coefficients)
  • Additional metadata: Information about the origin, license, and author of the model
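For illustration, once we have a converted and saved model (which we will produce later in this section), this specification can be inspected directly from Python. The following is a small sketch using coremltools, assuming the file tinyyolo_voc2007.mlmodel already exists:

import coremltools

# Load the saved specification (assumes the conversion later in this
# section has already been run and produced tinyyolo_voc2007.mlmodel).
spec = coremltools.utils.load_spec('tinyyolo_voc2007.mlmodel')

# Model description: names and types of the inputs and outputs.
print(spec.description)

# Additional metadata: author, license, and short description.
print(spec.description.metadata)

# The model parameters (weights) live in the model-type-specific field;
# for a network converted from Keras this is typically the neuralNetwork message.
print(len(spec.neuralNetwork.layers))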

In this chapter, we present the simplest flow, but we will revisit the Core ML Tools package in Chapter 6, Creating Art with Style Transfer, to see how to deal with custom layers.

To avoid any complications when setting up the environment on your local or remote machine, we will leverage the free Jupyter cloud service provided by Microsoft. Head over to https://notebooks.azure.com and log in, or register if you haven't already.

Once logged in, click on the Libraries menu link from the navigation bar, which will take you to a page containing a list of all your libraries, similar to what is shown in the following screenshot:

Next, click on the + New Library link to bring up the Create New Library dialog:

Then click on the From GitHub tab and enter https://github.com/packtpublishing/machine-learning-with-core-ml in the GitHub repository field. After that, give your library a meaningful name and click on the Import button to begin the process of cloning the repository and creating the library.

Once the library has been created, you will be redirected to the root; from here, click on the Chapter5/Notebooks folder to open up the relevant folder for this chapter. Finally, click on the Notebook Tiny YOLO_Keras2CoreML.ipynb. To help ensure that we are all on the same page (pun intended), here is a screenshot of what you should see after clicking on the Chapter5/Notebooks folder:


With our Notebook now loaded, it's time to walk through each of the cells to create our Core ML model; all of the required code exists and all that remains is executing each of the cells sequentially. To execute a cell, you can either use the shortcut keys Shift + Enter or click on the Run button in the toolbar (which will run the currently selected cell), as shown in the following screenshot:

I will provide a brief explanation of what each cell does; ensure that you execute each cell as we walk through them so that we all end up with the converted model, which we will then download and use in the next section for our iOS project.

We start by ensuring that the Core ML Tools Python package is available in the environment, by running the following cell:

!pip install coremltools

Once installed, we make the package available by importing it:

import coremltools

The model architecture and weights have been serialized and saved to the file tinyyolo_voc2007_modelweights.h5; in the following cell, we will pass this into the convert function of the Keras converter, which will return the converted Core ML model (if no errors occur). Along with the file, we also pass in values for the parameters input_names, image_input_names, output_names, and image_scale. The input_names parameter takes in a single string, or a list of strings for multiple inputs, and is used to explicitly set the names that will be used in the interface of the Core ML model to refer to the inputs of the Keras model.

We also pass this input name to the image_input_names parameter so that the converter treats the input as an image rather than an N-dimensional array. Similar to input_names, the values passed to output_names will be used in the interface of the Core ML model to refer to the outputs of the Keras model. The last parameter, image_scale, allows us to apply a scaling factor to our input before it is passed to the model. Here, we are dividing each pixel by 255, which forces each pixel into a range of 0.0 to 1.0, a typical preprocessing step when working with images. There are plenty more parameters available, allowing you to tune and tweak the inputs and outputs of your model; you can learn more about them in the official documentation at https://apple.github.io/coremltools/generated/coremltools.converters.keras.convert.html. In the next snippet, we perform the actual conversion using what we have just discussed:

coreml_model = coremltools.converters.keras.convert(
    'tinyyolo_voc2007_modelweights.h5',
    input_names='image',
    image_input_names='image',
    output_names='output',
    image_scale=1./255.)

With reference to the converted model, coreml_model, we add metadata, which will be made available and displayed in Xcode's ML model views:

coreml_model.author = 'Joshua Newnham'
coreml_model.license = 'BSD'
coreml_model.short_description = 'Keras port of YOLOTiny VOC2007 by Joseph Redmon and Ali Farhadi'
coreml_model.input_description['image'] = '416x416 RGB Image'
coreml_model.output_description['output'] = '13x13 Grid made up of: [cx, cy, w, h, confidence, 20 x classes] * 5 bounding boxes'
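Although not part of the original Notebook, it can be worth printing the model at this point as a quick sanity check; this should display the model's interface along with the metadata we have just set:

# Optional sanity check: printing the model displays its inputs, outputs,
# and metadata so we can confirm they are what we expect before saving.
print(coreml_model)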

We are now ready to save our model; run the final cell to save the converted model:

coreml_model.save('tinyyolo_voc2007.mlmodel')

With our model now saved, we return to the previous tab showing the contents of the Chapter5/Notebooks directory and download the tinyyolo_voc2007.mlmodel file. We do so either by right-clicking on it and selecting the Download menu item, or by clicking on the Download toolbar item, as shown in the following screenshot:

With our converted model in hand, it's now time to jump into Xcode and work through the example project for this chapter.
