Bringing it all together

If you haven't done so already, pull down the latest code from the accompanying repository at https://github.com/packtpublishing/machine-learning-with-core-ml. Once downloaded, navigate to the directory Chapter8/Start/QuickDrawRNN and open the project QuickDrawRNN.xcodeproj. Once loaded, you will see a project that should look familiar, as it is almost a replica of the one we built in the previous chapter. For this reason, I won't go over the details here, but feel free to refresh your memory by skimming through that chapter.

Rather, I want to spend some time highlighting what I consider one of the most important aspects of designing and building interfaces between people and machine learning systems. Let's start with this and then move on to migrating our code across from the playground project.

I consider Quick, Draw! a great example of a major responsibility borne by the designer of any machine learning interface. What makes it stand out is not the clever preprocessing that makes it invariant to scale and translation, nor the sophisticated architecture that can effectively learn complex sequences, but rather the mechanism used to capture the training data. One major obstacle in creating intelligent systems is obtaining enough clean, labeled data to train our models. Quick, Draw! tackled this by, I assume, intentionally being a tool for capturing and labeling data through the façade of a compelling game, one compelling enough to motivate a large number of users to generate sufficient amounts of labeled data. Although some of the sketches are questionable, the sheer number of sketches dilutes these outliers.

The point is that machine learning systems are not static; we should design in opportunities for the user to correct the system, where applicable, and to capture new data, either implicitly (with the user's consent) or explicitly. Allowing a level of transparency between the user and the system, and letting the user correct the model when it is wrong, not only provides us with new data to improve our model but also, just as importantly, helps the user build a useful mental model of the system. This in turn builds intuition around the affordances of our system, which helps them use it correctly.

In our example project, we could easily expose the predictions and provide the means for the user to correct the model. But to keep this chapter concise, we will just look at how we obtain an updated model, which (remembering that Core ML is suited to inference rather than training) we would typically train off the device. In that scenario, you would upload the new data to a central server and fetch an updated model when one becomes available; here we will focus on the second half of that loop, fetching the updated model. Let's see how.

Previously I mentioned, and implied, that you would typically upload new training data and train your model off the device. This, of course, is not the only option; it's reasonable to perform training on the device, using the user's personal data to tune a model. The advantages of training locally are privacy and lower latency, but it comes at the cost of collective intelligence, that is, improving the model from collective behavior. Google proposed a clever solution that ensures privacy while still allowing collaboration. In a post titled Federated Learning: Collaborative Machine Learning without Centralized Training Data, they described a technique of training locally on the device using personalized data and then uploading only the tuned model to the server, where the weights from the crowd are averaged before updating a central model. I encourage you to read the post at https://research.googleblog.com/2017/04/federated-learning-collaborative.html.
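To make the idea concrete, the server-side step of federated averaging reduces to a weighted mean of the parameters each client sends up. The following is a toy sketch of that step only; the function name and the flat-array representation of the weights are illustrative and not part of any framework:

```swift
import Foundation

// Toy sketch of the server-side step of federated averaging: each client
// trains locally and uploads only its tuned weights; the server averages
// them (weighted by each client's sample count) to update the central model.
func federatedAverage(clientWeights: [[Double]], sampleCounts: [Int]) -> [Double] {
    precondition(clientWeights.count == sampleCounts.count && !clientWeights.isEmpty)
    let totalSamples = Double(sampleCounts.reduce(0, +))
    var averaged = [Double](repeating: 0.0, count: clientWeights[0].count)
    for (weights, count) in zip(clientWeights, sampleCounts) {
        let share = Double(count) / totalSamples
        for i in 0..<weights.count {
            averaged[i] += weights[i] * share
        }
    }
    return averaged
}

// Two clients with equal amounts of data: the result is the plain mean.
let central = federatedAverage(clientWeights: [[1.0, 2.0], [3.0, 4.0]],
                               sampleCounts: [100, 100])
// central == [2.0, 3.0]
```

The weighting by sample count matters in practice: a client that contributed ten times more data should pull the central model ten times harder.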

As you may have come to expect when using Core ML, the bulk of the work is not interfacing with the framework but rather the activities before and after it. Compiling and instantiating a model can be done in just two lines of code, as follows:

let compiledUrl = try MLModel.compileModel(at: modelUrl)
let model = try MLModel(contentsOf: compiledUrl)

Here, modelUrl is the URL of a locally stored .mlmodel file. Passing it to compileModel returns the URL of the compiled model (the .mlmodelc file), which can then be used to initialize an instance of MLModel. This instance provides the same capabilities as a model bundled with your application.

Downloading and compiling are time-consuming, so you not only want to do this off the main thread but also want to avoid performing the task unnecessarily; that is, cache the compiled model locally and only update it when required. Let's implement this functionality now. Click on the QueryFacade.swift file in the left-hand panel to bring it into focus in the main editor, then add a new extension to the QueryFacade class; this is where we will add the code responsible for downloading and compiling the model.

Our first task is to test whether we need to download the model. We do this by simply checking whether we already have the model and whether it is considered recent. We will use UserDefaults to keep track of the location of the compiled model as well as a timestamp of when it was last updated. Add the following code to your extension of QueryFacade, which will be responsible for checking whether we need to download the model:

private var SyncTimestampKey : String{
    get{
        return "model_sync_timestamp"
    }
}

private var ModelUrlKey : String{
    get{
        return "model_url"
    }
}

private var isModelStale : Bool{
    get{
        // the stored string is a file URL; test its path on disk
        if let modelUrlString = UserDefaults.standard.string(
            forKey: self.ModelUrlKey),
            let modelUrl = URL(string: modelUrlString){
            if !FileManager.default.fileExists(atPath: modelUrl.path){
                return true
            }
        }

        let daysToUpdate : Int = 10
        let lastUpdated = Date(timestamp:UserDefaults.standard.integer(
            forKey: SyncTimestampKey))

        guard let numberOfDaysSinceUpdate = Calendar.current.dateComponents(
            [.day], from: lastUpdated, to: Date()).day else{
            fatalError("Failed to calculate elapsed days since the model was updated")
        }
        return numberOfDaysSinceUpdate >= daysToUpdate
    }
}

As mentioned, we first check whether the model exists on disk; if it does, we then test how many days have elapsed since it was last updated, comparing this against an arbitrary threshold beyond which we consider the model stale.
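The staleness test itself is plain Calendar arithmetic; stripped of the UserDefaults plumbing, it reduces to something like the following sketch (modelIsStale is a hypothetical free function, and the 10-day threshold is the same arbitrary value used above):

```swift
import Foundation

// Returns true when at least `daysToUpdate` days have elapsed since
// `lastUpdated`, i.e. the cached model should be re-downloaded.
func modelIsStale(lastUpdated: Date, daysToUpdate: Int = 10, now: Date = Date()) -> Bool {
    guard let elapsedDays = Calendar.current.dateComponents(
        [.day], from: lastUpdated, to: now).day else {
        return true // if we can't compute it, err on the side of refreshing
    }
    return elapsedDays >= daysToUpdate
}

let elevenDaysAgo = Date(timeIntervalSinceNow: -11 * 24 * 60 * 60)
let twoDaysAgo = Date(timeIntervalSinceNow: -2 * 24 * 60 * 60)
// modelIsStale(lastUpdated: elevenDaysAgo) is true
// modelIsStale(lastUpdated: twoDaysAgo) is false
```

Note that dateComponents([.day], from:to:) counts whole elapsed days rather than comparing raw seconds, so the check behaves sensibly across time zones and daylight-saving changes.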

The next method we implement is responsible for downloading the model (the .mlmodel file). This should look familiar to most iOS developers; the only notable piece of code is the use of a semaphore to make the task synchronous, as the calling method will already be running off the main thread. Append the following code to your QueryFacade extension:

private func downloadModel() -> URL?{
    guard let modelUrl = URL(
        string:"https://github.com/joshnewnham/MachineLearningWithCoreML/blob/master/CoreMLModels/Chapter8/quickdraw.mlmodel?raw=true") else{
        fatalError("Invalid URL")
    }

    var tempUrl : URL?

    let sessionConfig = URLSessionConfiguration.default
    let session = URLSession(configuration: sessionConfig)

    let request = URLRequest(url:modelUrl)

    let semaphore = DispatchSemaphore(value: 0)

    let task = session.downloadTask(with: request) { (tempLocalUrl, response, error) in
        if let tempLocalUrl = tempLocalUrl, error == nil {
            tempUrl = tempLocalUrl
        } else {
            fatalError("Error downloading model \(String(describing: error?.localizedDescription))")
        }

        semaphore.signal()
    }
    task.resume()
    _ = semaphore.wait(timeout: .distantFuture)

    return tempUrl
}

The statements that make this task synchronous are the semaphore calls; essentially, calling semaphore.wait(timeout: .distantFuture) holds the current thread until it is signaled to move on via semaphore.signal(). If successful, this method returns the local URL of the downloaded file.
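The same wait/signal pattern can be seen in isolation in the following snippet, which blocks the calling thread until a background task hands back its result (a stand-in string here rather than a downloaded file):

```swift
import Foundation
import Dispatch

// Minimal illustration of using DispatchSemaphore to make an asynchronous
// task synchronous: wait(timeout:) blocks the calling thread until the
// background closure calls signal().
func fetchSynchronously() -> String? {
    var result: String?
    let semaphore = DispatchSemaphore(value: 0)

    DispatchQueue.global().async {
        // Stand-in for the download; in the chapter this work happens
        // inside the URLSession downloadTask completion handler.
        result = "downloaded"
        semaphore.signal()
    }

    _ = semaphore.wait(timeout: .distantFuture)
    return result
}

// fetchSynchronously() returns Optional("downloaded")
```

This is only safe because the caller is itself off the main thread; blocking the main thread with wait would freeze the UI.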

Our last task is to tie this all together. The next method we implement will be called when QueryFacade is instantiated (we will hook this up just after). It is responsible for checking whether we need to download the model, downloading and compiling it if necessary, and instantiating the instance variable model, which we use to perform inference. Append the final snippet of code to your QueryFacade extension:

private func syncModel(){
    queryQueue.async {

        if self.isModelStale{
            guard let tempModelUrl = self.downloadModel() else{
                return
            }

            guard let compiledUrl = try? MLModel.compileModel(
                at: tempModelUrl) else{
                fatalError("Failed to compile model")
            }

            let appSupportDirectory = try! FileManager.default.url(
                for: .applicationSupportDirectory,
                in: .userDomainMask,
                appropriateFor: compiledUrl,
                create: true)

            let permanentUrl = appSupportDirectory.appendingPathComponent(
                compiledUrl.lastPathComponent)
            do {
                if FileManager.default.fileExists(
                    atPath: permanentUrl.path) {
                    _ = try FileManager.default.replaceItemAt(
                        permanentUrl,
                        withItemAt: compiledUrl)
                } else {
                    try FileManager.default.copyItem(
                        at: compiledUrl,
                        to: permanentUrl)
                }
            } catch {
                fatalError("Error during copy: \(error.localizedDescription)")
            }

            UserDefaults.standard.set(Date.timestamp,
                                      forKey: self.SyncTimestampKey)
            UserDefaults.standard.set(permanentUrl.absoluteString,
                                      forKey: self.ModelUrlKey)
        }

        guard let modelUrl = URL(
            string: UserDefaults.standard.string(forKey: self.ModelUrlKey) ?? "")
            else{
            fatalError("Invalid model Url")
        }

        self.model = try? MLModel(contentsOf: modelUrl)
    }
}

We start by checking whether we need to download the model, and if so, proceed to download and compile it:

guard let tempModelUrl = self.downloadModel() else{
    return
}

guard let compiledUrl = try? MLModel.compileModel(
    at: tempModelUrl) else{
    fatalError("Failed to compile model")
}

To avoid repeating this step unnecessarily, we then store the compiled model permanently, saving its location and the current timestamp in UserDefaults:

let appSupportDirectory = try! FileManager.default.url(
    for: .applicationSupportDirectory,
    in: .userDomainMask,
    appropriateFor: compiledUrl,
    create: true)

let permanentUrl = appSupportDirectory.appendingPathComponent(
    compiledUrl.lastPathComponent)
do {
    if FileManager.default.fileExists(
        atPath: permanentUrl.path) {
        _ = try FileManager.default.replaceItemAt(
            permanentUrl,
            withItemAt: compiledUrl)
    } else {
        try FileManager.default.copyItem(
            at: compiledUrl,
            to: permanentUrl)
    }
} catch {
    fatalError("Error during copy: \(error.localizedDescription)")
}

UserDefaults.standard.set(Date.timestamp,
                          forKey: self.SyncTimestampKey)
UserDefaults.standard.set(permanentUrl.absoluteString,
                          forKey: self.ModelUrlKey)

Finally, we instantiate and assign an instance of MLModel to our instance variable model. The last task is to update the constructor of the QueryFacade class to kick off this process when instantiated; update the QueryFacade init method with the following code:

init() {
    syncModel()
}

At this stage, we have our model ready for performing inference; our next task is to migrate the code we developed in our playground to our project and then hook it all up. Given that we have spent the first part of this chapter discussing the details, I will skip the specifics here but rather include the additions for convenience and completeness.

Let's start with our extension to the CGPoint structure; add a new Swift file called CGPointRNNExtension.swift to your project and add the following code to it:

extension CGPoint{
    public static func getSquareSegmentDistance(
        p0:CGPoint,
        p1:CGPoint,
        p2:CGPoint) -> CGFloat{
        let x0 = p0.x, y0 = p0.y
        var x1 = p1.x, y1 = p1.y
        let x2 = p2.x, y2 = p2.y
        var dx = x2 - x1
        var dy = y2 - y1

        // project p0 onto the segment unless it is degenerate (a single point)
        if dx != 0.0 || dy != 0.0{
            let numerator = (x0 - x1) * dx + (y0 - y1) * dy
            let denom = dx * dx + dy * dy
            let t = numerator / denom

            if t > 1.0{
                x1 = x2
                y1 = y2
            } else{
                x1 += dx * t
                y1 += dy * t
            }
        }

        dx = x0 - x1
        dy = y0 - y1

        return dx * dx + dy * dy
    }
}

Next, add another new Swift file called StrokeRNNExtension.swift to your project and add the following code:

extension Stroke{

    public func simplify(epsilon:CGFloat=3.0) -> Stroke{

        var simplified: [CGPoint] = [self.points.first!]

        self.simplifyDPStep(points: self.points,
                            first: 0, last: self.points.count-1,
                            tolerance: epsilon * epsilon,
                            simplified: &simplified)

        simplified.append(self.points.last!)

        let copy = self.copy() as! Stroke
        copy.points = simplified

        return copy
    }

    func simplifyDPStep(points:[CGPoint],
                        first:Int,
                        last:Int,
                        tolerance:CGFloat,
                        simplified: inout [CGPoint]){

        var maxSqDistance = tolerance
        var index = 0

        for i in first + 1..<last{
            let sqDist = CGPoint.getSquareSegmentDistance(
                p0: points[i],
                p1: points[first],
                p2: points[last])

            if sqDist > maxSqDistance {
                maxSqDistance = sqDist
                index = i
            }
        }

        if maxSqDistance > tolerance{
            if index - first > 1 {
                simplifyDPStep(points: points,
                               first: first,
                               last: index,
                               tolerance: tolerance,
                               simplified: &simplified)
            }

            simplified.append(points[index])

            if last - index > 1{
                simplifyDPStep(points: points,
                               first: index,
                               last: last,
                               tolerance: tolerance,
                               simplified: &simplified)
            }
        }
    }
}
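If you would like to sanity-check the algorithm outside the app, the same Ramer-Douglas-Peucker logic can be expressed as a standalone sketch over plain coordinate pairs; the Point struct and function names below are stand-ins for the project's CGPoint and Stroke types, not part of it:

```swift
// Minimal stand-in for CGPoint so the sketch is self-contained.
struct Point: Equatable {
    var x: Double
    var y: Double
}

// Squared distance from p0 to the segment p1-p2.
func squareSegmentDistance(_ p0: Point, _ p1: Point, _ p2: Point) -> Double {
    var x1 = p1.x, y1 = p1.y
    var dx = p2.x - x1, dy = p2.y - y1
    if dx != 0 || dy != 0 {
        let t = ((p0.x - x1) * dx + (p0.y - y1) * dy) / (dx * dx + dy * dy)
        if t > 1 {
            x1 = p2.x; y1 = p2.y
        } else if t > 0 {
            x1 += dx * t; y1 += dy * t
        }
    }
    dx = p0.x - x1; dy = p0.y - y1
    return dx * dx + dy * dy
}

// Ramer-Douglas-Peucker: drop points closer than epsilon to the
// line between the retained neighbours.
func simplify(_ points: [Point], epsilon: Double = 3.0) -> [Point] {
    guard points.count > 2 else { return points }
    let sqEpsilon = epsilon * epsilon
    var keep = [points.first!]
    func step(_ first: Int, _ last: Int) {
        var maxSqDist = sqEpsilon
        var index = 0
        for i in (first + 1)..<last {
            let d = squareSegmentDistance(points[i], points[first], points[last])
            if d > maxSqDist { maxSqDist = d; index = i }
        }
        if maxSqDist > sqEpsilon {
            if index - first > 1 { step(first, index) }
            keep.append(points[index])
            if last - index > 1 { step(index, last) }
        }
    }
    step(0, points.count - 1)
    keep.append(points.last!)
    return keep
}

// Collinear mid-points are removed; a genuine corner survives.
let line = [Point(x: 0, y: 0), Point(x: 1, y: 0), Point(x: 2, y: 0), Point(x: 10, y: 0)]
let corner = [Point(x: 0, y: 0), Point(x: 5, y: 0), Point(x: 5, y: 40), Point(x: 5, y: 41)]
// simplify(line) keeps only the two endpoints
// simplify(corner) also keeps the corner at (5, 0)
```

Running it confirms the behavior we rely on in the app: straight runs of points collapse to their endpoints, while sharp corners, which carry the shape information, are preserved.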

Finally, we will add a couple of methods that we implemented in the playground to our StrokeSketch class to handle the required preprocessing. Start by adding a new Swift file called StrokeSketchExtension.swift and block out the extension as follows:

import UIKit
import CoreML

extension StrokeSketch{

}

Next, we copy and paste in the simplify method, which we implemented in the playground, as follows:

public func simplify() -> StrokeSketch{
    let copy = self.copy() as! StrokeSketch
    copy.scale = 1.0

    let minPoint = copy.minPoint
    let maxPoint = copy.maxPoint
    let scale = CGPoint(x: maxPoint.x-minPoint.x,
                        y: maxPoint.y-minPoint.y)

    var width : CGFloat = 255.0
    var height : CGFloat = 255.0

    // preserve the aspect ratio by shrinking the shorter side
    if scale.x > scale.y{
        height *= scale.y/scale.x
    } else{
        width *= scale.x/scale.y
    }

    // for each point, subtract the min and divide by the max
    for i in 0..<copy.strokes.count{
        copy.strokes[i].points = copy.strokes[i].points.map({
            (pt) -> CGPoint in
            let x : CGFloat = CGFloat(
                Int(((pt.x - minPoint.x)/scale.x) * width)
            )
            let y : CGFloat = CGFloat(
                Int(((pt.y - minPoint.y)/scale.y) * height)
            )

            return CGPoint(x:x, y:y)
        })
    }

    copy.strokes = copy.strokes.map({ (stroke) -> Stroke in
        return stroke.simplify()
    })

    return copy
}
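The scale-and-translate step above can be checked in isolation. The following standalone sketch normalizes a set of points into a box whose longest side spans 255 units while preserving the aspect ratio; plain Double pairs and the hypothetical normalize function stand in for the project's CGPoint and StrokeSketch types:

```swift
// Translate points so the minimum is at the origin, then scale so the
// longest side of the bounding box spans 255 units, preserving aspect.
func normalize(_ points: [(x: Double, y: Double)]) -> [(x: Double, y: Double)] {
    let minX = points.map { $0.x }.min()!
    let minY = points.map { $0.y }.min()!
    let spanX = points.map { $0.x }.max()! - minX
    let spanY = points.map { $0.y }.max()! - minY

    var width = 255.0, height = 255.0
    if spanX > spanY {
        height *= spanY / spanX  // wide sketch: shrink the vertical extent
    } else {
        width *= spanX / spanY   // tall sketch: shrink the horizontal extent
    }

    return points.map { pt in
        (x: (pt.x - minX) / spanX * width,
         y: (pt.y - minY) / spanY * height)
    }
}

// A 200x100 bounding box maps to 255x127.5: x is the longer side.
let box = normalize([(x: 0.0, y: 0.0), (x: 200.0, y: 100.0)])
// box[1] == (x: 255.0, y: 127.5)
```

The key property to verify is that both axes use the same effective scale factor, so the sketch is not distorted regardless of whether it is wider than it is tall.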

As a reminder, simplify() handles the preprocessing of a sequence of strokes, as described previously. Next, we add the static method preprocess to the StrokeSketch extension; it takes an instance of StrokeSketch and is responsible for packing its simplified state into a data structure we can pass to our model for inference:

public static func preprocess(_ sketch:StrokeSketch)
    -> MLMultiArray?{
    let arrayLen = NSNumber(value:75 * 3)

    let simplifiedSketch = sketch.simplify()

    guard let array = try? MLMultiArray(shape: [arrayLen],
                                        dataType: .double)
        else{ return nil }

    let minPoint = simplifiedSketch.minPoint
    let maxPoint = simplifiedSketch.maxPoint
    let scale = CGPoint(x: maxPoint.x-minPoint.x,
                        y: maxPoint.y-minPoint.y)

    // flatten each stroke into (x, y, end-of-stroke) triplets,
    // normalizing x and y to the range 0-1
    var data = Array<Double>()
    for i in 0..<simplifiedSketch.strokes.count{
        for j in 0..<simplifiedSketch.strokes[i].points.count{
            let point = simplifiedSketch.strokes[i].points[j]
            let x = (point.x-minPoint.x)/scale.x
            let y = (point.y-minPoint.y)/scale.y
            let z = j == simplifiedSketch.strokes[i].points.count-1 ?
                1 : 0

            data.append(Double(x))
            data.append(Double(y))
            data.append(Double(z))
        }
    }

    // replace each absolute position with the delta to the next point,
    // carrying the next point's end-of-stroke flag
    let dataStride : Int = 3
    for i in stride(from: dataStride, to: data.count, by: dataStride){
        data[i - dataStride] = data[i] - data[i - dataStride]
        data[i - (dataStride-1)] = data[i+1] - data[i - (dataStride-1)]
        data[i - (dataStride-2)] = data[i+2] // EOS
    }

    data.removeLast(3)

    // front-pad with zeros so the sequence fills the fixed-length array
    var dataIdx : Int = 0
    let startAddingIdx = max(array.count-data.count, 0)

    for i in 0..<array.count{
        if i >= startAddingIdx{
            array[i] = NSNumber(value:data[dataIdx])
            dataIdx = dataIdx + 1
        } else{
            array[i] = NSNumber(value:0)
        }
    }

    return array
}
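The in-place delta encoding above is easy to get wrong, so here is the same transformation written out over (x, y, eos) triplets; the deltaEncode function and the use of plain Doubles in place of the MLMultiArray are illustrative:

```swift
// Convert absolute (x, y, endOfStroke) triplets into the delta encoding the
// model expects: each entry becomes the offset to the next point, carrying the
// next point's end-of-stroke flag; the final absolute point is dropped.
func deltaEncode(_ points: [(x: Double, y: Double, eos: Double)])
    -> [(x: Double, y: Double, eos: Double)] {
    guard points.count > 1 else { return [] }
    var encoded = [(x: Double, y: Double, eos: Double)]()
    for i in 1..<points.count {
        encoded.append((x: points[i].x - points[i - 1].x,
                        y: points[i].y - points[i - 1].y,
                        eos: points[i].eos))
    }
    return encoded
}

let absolute = [(x: 0.0, y: 0.0, eos: 0.0),
                (x: 0.5, y: 0.25, eos: 0.0),
                (x: 1.0, y: 1.0, eos: 1.0)]
let deltas = deltaEncode(absolute)
// deltas == [(0.5, 0.25, 0.0), (0.5, 0.75, 1.0)]
```

Note the output has one fewer triplet than the input, which is exactly why the stride loop above is followed by data.removeLast(3).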

If anything looks unfamiliar, then I encourage you to revisit the previous section, where we delve into the details of what these methods do (and why).

We now have our model and functionality for preprocessing the input; our last task is to tie this all together. Head back to the QueryFacade class and locate the method classifySketch. As a reminder, this method is called via queryCurrentSketch, which in turn is triggered anytime the user completes a stroke. The method is expected to return a dictionary of category and probability pairs, which is then used to search and download related drawings of most likely categories. At this point, it's simply a matter of using the work we have previously done, with one little caveat. If you recall from previous chapters, when we imported our model into the project, Xcode would conveniently generate a strongly typed wrapper for our model and its associated inputs and outputs. A disadvantage of downloading and importing at runtime is that we forgo these generated wrappers and are left to do it manually.

Working backwards: after making a prediction, we expect an instance of MLFeatureProvider to be returned. Its featureValue(for:) method returns an instance of MLFeatureValue for a given output key (classLabelProbs here). The returned MLFeatureValue exposes the values set by the model during inference; we are interested in its dictionaryValue property, of type [String: Double] (each category and its associated probability).

To obtain this output, we need to call prediction(from:) on our model, which expects an instance conforming to the MLFeatureProvider protocol. Normally, a conforming input class is generated for us when the model is imported; given that in most cases you will have access to (and knowledge of) the model, the easiest way to recreate this wrapper is to import the model once at design time and extract the generated input class, which is exactly what we will do.

Locate the file CoreMLModels/Chapter8/quickdraw.mlmodel in the accompanying repository https://github.com/packtpublishing/machine-learning-with-core-ml, and drag the file into your project as we have done in previous chapters. Once imported, select it from the left-hand-side panel and click on the arrow button within the Model Class section, as shown in the following screenshot:

This will open the generated classes; locate the class quickdrawInput and copy and paste it into QueryFacade.swift, ensuring that it sits outside the QueryFacade class (and its extensions). Because we are only concerned with the strokeSeq input, we can strip the other variables; clean it up so you are left with something like the following:

class quickdrawInput : MLFeatureProvider {

    var strokeSeq: MLMultiArray

    var featureNames: Set<String> {
        get {
            return ["strokeSeq"]
        }
    }

    func featureValue(for featureName: String) -> MLFeatureValue? {
        if (featureName == "strokeSeq") {
            return MLFeatureValue(multiArray: strokeSeq)
        }
        return nil
    }

    init(strokeSeq: MLMultiArray) {
        self.strokeSeq = strokeSeq
    }
}

We are finally ready to perform inference; return to the classifySketch method within the QueryFacade class and add the following code: 

if let strokeSketch = sketch as? StrokeSketch,
    let x = StrokeSketch.preprocess(strokeSketch),
    let model = self.model,
    let modelOutput = try? model.prediction(
        from: quickdrawInput(strokeSeq: x)){

    if let classPredictions = modelOutput.featureValue(
        for: "classLabelProbs")?.dictionaryValue as? [String:Double]{

        let sortedClassPredictions = classPredictions.sorted(
            by: { (kvp1, kvp2) -> Bool in
                kvp1.value > kvp2.value
            })

        return sortedClassPredictions
    }
}

return nil

No doubt most of this looks familiar; we start by extracting the features via the preprocess method implemented earlier in this chapter. We then wrap them in an instance of quickdrawInput before passing it to the model's prediction(from:) method to perform inference. If successful, we extract the classLabelProbs output, as discussed previously, and finally sort the results by probability before returning them to the caller.
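The sort at the end is the standard pattern for ranking a classifier's output dictionary; in isolation, with a hypothetical set of probabilities, it looks like this:

```swift
import Foundation

// Rank a [category: probability] dictionary from most to least likely,
// as done with the classLabelProbs output above.
let classPredictions: [String: Double] = [
    "cat": 0.05, "airplane": 0.80, "bee": 0.15
]

let sortedClassPredictions = classPredictions.sorted { $0.value > $1.value }

let topCategory = sortedClassPredictions.first!.key
// topCategory == "airplane"
```

sorted(by:) turns the unordered dictionary into an array of (key, value) pairs, so the caller can take the first few elements as the most likely categories.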

With that complete, you are now in a good position to test. Build and deploy to the simulator or a device; if everything goes as planned, you should be able to test the accuracy of your model (or of your drawing, depending on how you look at it):

Let's wrap up this chapter by reviewing what we have covered. 
