Creating and training a model

Thanks to the great effort by Apple's engineers, the process of creating common machine learning models is incredibly easy and will no doubt spark a new wave of intelligent apps over the coming months.

In this section, you will see just how easy it is as we walk through creating an image classifier for our application using Create ML.

Create ML is accessible from an Xcode Playground, so that is a good place to start. Open up Xcode and create a new Playground, ensuring that you select macOS as the platform, as shown here: 

Once in the playground, import CreateML and Foundation as follows: 

import CreateML
import Foundation

Next, create a URL that points to the directory that contains your training data:

let trainingDir = URL(fileURLWithPath: "/<PATH TO DIRECTORY WITH TRAINING DATA>")

The only thing left to do is to create an instance of our model, passing in the path to our training data (I did say it was incredibly easy):

let model = try MLImageClassifier(
    trainingData: .labeledDirectories(at: trainingDir))

Create ML offers you the flexibility of providing either a custom dictionary of labels and their associated files, or the convenience of an MLImageClassifier.DataSource. The data source can either be a hierarchical directory structure where images are organized into folders named after their class, MLImageClassifier.DataSource.labeledDirectories (as we have done in this example), or a flat directory where each file is named after its associated class, MLImageClassifier.DataSource.labeledFiles.
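If you prefer the dictionary route, here is a minimal sketch of what that might look like. The labels and paths are hypothetical placeholders, and the exact dictionary-based initializer signature is an assumption rather than something taken from this example:

// A hedged sketch of the dictionary-based alternative: map each class label
// to the image files that belong to it. The paths are hypothetical placeholders,
// and the [String: [URL]] initializer signature is assumed.
let labeledImages: [String: [URL]] = [
    "Strawberry": [URL(fileURLWithPath: "/<PATH TO A STRAWBERRY IMAGE>")],
    "<ANOTHER CLASS>": [URL(fileURLWithPath: "/<PATH TO ANOTHER IMAGE>")]
]

let dictionaryModel = try MLImageClassifier(trainingData: labeledImages)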

As soon as the model is instantiated, it will begin training. Once finished, it will output the accuracy achieved on your training set to the console, as shown in the following screenshot:

We are almost done; this tells us that our model has fit our training data well, but it doesn't tell us how well it will generalize, that is, how well it will work on images it hasn't seen before. It's possible (and common) for deep neural networks to simply memorize their training data, a problem commonly referred to as overfitting.

To avoid overfitting, and therefore make it more likely that you produce something usable in the real world, it's common practice to split your data into three buckets. The first bucket is used to train your model. The second bucket, called validation data, is used during training (typically at the end of each iteration/epoch) to see how well the model is generalizing; it also provides clues as to when the model starts overfitting (when the training accuracy and validation accuracy begin to diverge). The last bucket, known as the test data, is only used once you are satisfied with how your model performs on the validation data, and it is the real measure of how well your model works.

How much data should you reserve for validation and testing? For shallow learners, it was common to use a 70/20/10 (training, validation, and test) split. But deep learning normally implies big datasets, in which case reserving that much data for validation and testing may be excessive. So the answer really depends on how much data you have and what type of data it is.
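As an illustration only, here is a minimal sketch of such a split in plain Swift. Nothing here is part of Create ML; allImageUrls is a hypothetical array of image URLs that you would gather yourself:

// A hedged sketch: shuffle your collected image URLs and slice them into
// 70% training, 20% validation, and 10% test. This is not a Create ML API;
// allImageUrls is a hypothetical array you populate yourself.
let allImageUrls: [URL] = [] // fill with your own image URLs
let shuffled = allImageUrls.shuffled()
let trainEnd = Int(Double(shuffled.count) * 0.7)
let validationEnd = Int(Double(shuffled.count) * 0.9)

let trainingUrls = Array(shuffled[..<trainEnd])
let validationUrls = Array(shuffled[trainEnd..<validationEnd])
let testUrls = Array(shuffled[validationEnd...])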

Therefore, before deploying our model, we evaluate it on a dataset it hasn't seen during training. Once again, collect an equal amount of data for each of your classes and return here once you've done so.

As we did before, create a URL that points to the directory that contains your validation data:

let validationDir = URL(fileURLWithPath: "/<PATH TO DIRECTORY WITH VALIDATION DATA>")

Now it's simply a matter of calling evaluation on the model, as shown here:

model.evaluation(on: .labeledDirectories(at: validationDir))

This will perform inference on each of our validation samples and report the accuracy, which you can inspect via Quick Look:
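If you would rather inspect the result in code than via Quick Look, you can capture the returned metrics and derive the accuracy from the classification error. This is a minimal sketch, assuming the returned metrics expose a classificationError property:

// A hedged sketch: capture the evaluation metrics and print the accuracy,
// assuming the metrics object exposes a classificationError property.
let metrics = model.evaluation(on: .labeledDirectories(at: validationDir))
let accuracy = (1.0 - metrics.classificationError) * 100
print("Validation accuracy: \(accuracy)%")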

Satisfied with our validation accuracy, we are now ready to export our model, but just before we do so, let's perform a prediction on an individual image.

You can easily do this by calling the prediction method of your model instance (or predictions if you have multiple samples you want to perform inference on), as shown in this snippet:

let strawberryUrl = URL(
    fileURLWithPath: "/<PATH TO STRAWBERRY>")

print(try model.prediction(from: strawberryUrl))

If all goes well, then Strawberry should be output to your console. Now, feeling confident with our model, it's time to export it.
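Before we do, it's worth sketching the batch variant mentioned above. The paths are hypothetical placeholders, and the exact predictions signature (an array of URLs in, an array of labels out) is an assumption:

// A hedged sketch: classify several images in one call using the predictions
// method mentioned above. The paths are placeholders, and the exact signature
// is assumed.
let sampleUrls = [
    URL(fileURLWithPath: "/<PATH TO STRAWBERRY>"),
    URL(fileURLWithPath: "/<PATH TO ANOTHER SAMPLE>")
]

print(try model.predictions(from: sampleUrls))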

In keeping with the nature of Create ML, exporting is simply a single line of code: 

try model.write(toFile: "<PATH TO FILE>")
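Optionally, you can attach metadata (author, description, version, and so on) when exporting. This is a minimal sketch; the destination path is a placeholder, and the write(to:metadata:) overload and MLModelMetadata initializer shown here are assumptions:

// A hedged sketch: export the model with descriptive metadata attached.
// The destination path is a placeholder, and the write(to:metadata:) overload
// and MLModelMetadata initializer are assumed.
let metadata = MLModelMetadata(
    author: "<YOUR NAME>",
    shortDescription: "Image classifier trained with Create ML",
    version: "1.0")

try model.write(
    to: URL(fileURLWithPath: "/<PATH TO FILE>.mlmodel"),
    metadata: metadata)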

From here, it's just a matter of importing the Core ML model into your project, as we have seen many times throughout this book. 

We have almost concluded our brief introduction to Create ML; but before we move on, I want to quickly highlight a few things, starting with model parameters. 
