Optimizing with batches

At the moment, our process iterates over the photos and performs inference on each one individually. With the release of Core ML 2, we now have the option of creating a batch and passing it to our model for inference in a single call. Much like the efficiencies gained through economies of scale, batching can deliver significant performance improvements, so let's walk through adapting our project to process our photos in a single batch rather than one at a time.
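At a high level, the change amounts to replacing a per-photo prediction loop with a single batch call; the following sketch (using placeholder names rather than our final code) shows the shape of both approaches:

import CoreML

// Before: one Core ML round trip per input
func predictIndividually(model: MLModel, inputs: [MLFeatureProvider]) -> [MLFeatureProvider] {
    return inputs.compactMap { try? model.prediction(from: $0) }
}

// After: a single round trip for the whole set, giving Core ML the
// chance to schedule the work more efficiently
func predictAsBatch(model: MLModel, inputs: [MLFeatureProvider]) -> MLBatchProvider? {
    let batch = MLArrayBatchProvider(array: inputs)
    return try? model.predictions(from: batch, options: MLPredictionOptions())
}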

Let's work our way up the stack, starting in our YOLOFacade class and moving up to the PhotoSearcher. For this, we will use our model directly rather than proxying through Vision, so our first task is to replace the model property of our YOLOFacade class with the following declaration:

let model = tinyyolo_voc2007().model

Now, let's rewrite the detectObjects method to handle an array of photos rather than a single instance; because this is where most of the changes reside, we will start from scratch. So, go ahead and delete the method from your YOLOFacade class and replace it with the following stub:

func detectObjects(photos: [UIImage], completionHandler: (_ result: [[ObjectBounds]]?) -> Void) {

    // TODO batch items (array of MLFeatureProvider)

    // TODO Wrap our items in an instance of MLArrayBatchProvider

    // TODO Perform inference on the batch

    // TODO (As we did before) Process the outputs of the model

    // TODO Return results via the callback handler
}

The signature now takes an array of UIImage and hands back a nested array of ObjectBounds (one array per photo), and the comments list our remaining tasks. The first is to create an array of MLFeatureProvider. If you recall from Chapter 3, Recognizing Objects in the World, when we import a Core ML model into Xcode, it generates interfaces for the model, its input, and its output. The input and output conform to the MLFeatureProvider protocol, so here we want to create an array of tinyyolo_voc2007Input, which can be instantiated with instances of CVPixelBuffer.
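To make this concrete, the generated input class has roughly the following shape; this is an approximation for illustration, as the exact code Xcode generates may differ slightly:

import CoreML

// Approximate shape of the Xcode-generated input class
class tinyyolo_voc2007Input: MLFeatureProvider {
    var image: CVPixelBuffer

    var featureNames: Set<String> {
        return ["image"]
    }

    func featureValue(for featureName: String) -> MLFeatureValue? {
        if featureName == "image" {
            return MLFeatureValue(pixelBuffer: image)
        }
        return nil
    }

    init(image: CVPixelBuffer) {
        self.image = image
    }
}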

To create this, we will transform the array of photos passed into the method, applying the required preprocessing steps (center cropping and resizing to 416 x 416) along the way. Replace the comment // TODO batch items (array of MLFeatureProvider) with the following code:

let X = photos.map({ (photo) -> tinyyolo_voc2007Input in
    guard let ciImage = CIImage(image: photo) else {
        fatalError("\(#function) Failed to create CIImage from UIImage")
    }

    let cropSize = CGSize(
        width: min(ciImage.extent.width, ciImage.extent.height),
        height: min(ciImage.extent.width, ciImage.extent.height))

    let targetSize = CGSize(width: 416, height: 416)

    guard let pixelBuffer = ciImage
        .centerCrop(size: cropSize)?
        .resize(size: targetSize)
        .toPixelBuffer() else {
        fatalError("\(#function) Failed to create CVPixelBuffer from CIImage")
    }

    return tinyyolo_voc2007Input(image: pixelBuffer)
})

// TODO Wrap our items in an instance of MLArrayBatchProvider

// TODO Perform inference on the batch

// TODO (As we did before) Process the outputs of the model

// TODO Return results via the callback handler

For reasons of simplicity and readability, we are omitting any kind of error handling; obviously, in production, you will want to handle failures appropriately.
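If you do want something sturdier than fatalError, one option is to fail soft and report the problem through the completion handler instead. The following is a minimal sketch of that approach (using the same centerCrop, resize, and toPixelBuffer helpers from our project), dropping photos that fail preprocessing and bailing out if any were dropped:

let X = photos.compactMap { photo -> tinyyolo_voc2007Input? in
    guard let ciImage = CIImage(image: photo) else { return nil }

    let side = min(ciImage.extent.width, ciImage.extent.height)

    guard let pixelBuffer = ciImage
        .centerCrop(size: CGSize(width: side, height: side))?
        .resize(size: CGSize(width: 416, height: 416))
        .toPixelBuffer() else { return nil }

    return tinyyolo_voc2007Input(image: pixelBuffer)
}

// If any photo failed preprocessing, the indices of X and photos no
// longer line up, so we surface the failure rather than return
// misaligned results
guard X.count == photos.count else {
    completionHandler(nil)
    return
}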

To perform inference on a batch, we need our input to conform to the MLBatchProvider protocol. Fortunately, Core ML provides a concrete implementation, MLArrayBatchProvider, that conveniently wraps an array. Let's use it now; replace the comment // TODO Wrap our items in an instance of MLArrayBatchProvider with the following code:

let batch = MLArrayBatchProvider(array:X)

// TODO Perform inference on the batch

// TODO (As we did before) Process the outputs of the model

// TODO Return results via the callback handler
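As an aside, MLArrayBatchProvider also offers a throwing convenience initializer, init(dictionary:), that builds a batch straight from raw values keyed by the model's input feature name ("image", in the case of our generated tinyyolo_voc2007Input). A quick sketch, assuming you already had an array of pixel buffers at hand:

// Alternative construction; the initializer throws if a value cannot
// be converted to an MLFeatureValue
let pixelBuffers: [CVPixelBuffer] = [] // assume these are populated elsewhere
if let batch = try? MLArrayBatchProvider(dictionary: ["image": pixelBuffers]) {
    // pass batch to the model, as shown next
}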

To perform inference, it's simply a matter of calling the predictions method on our model; as usual, replace the comment // TODO Perform inference on the batch with the following code:

guard let batchResults = try? self.model.predictions(
    from: batch,
    options: MLPredictionOptions()) else {
        completionHandler(nil)
        return
}

// TODO (As we did before) Process the outputs of the model

// TODO Return results via the callback handler

What we get back (if successful) is an instance of MLBatchProvider; this is more or less a collection of results, one for each of our samples (inputs). We can access a specific result via the batch provider's features(at:) method, which returns an instance of MLFeatureProvider (in our case, a tinyyolo_voc2007Output).
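For reference, MLBatchProvider is a tiny protocol; its shape is essentially the following, which is why count and features(at:) are all we need in the loop below:

public protocol MLBatchProvider {
    var count: Int { get }
    func features(at index: Int) -> MLFeatureProvider
}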

Here, we simply process each result as we did before to obtain the most salient object bounds; replace the comment // TODO (As we did before) Process the outputs of the model with the following code:

var results = [[ObjectBounds]]()

for i in 0..<batchResults.count {
    var iResults = [ObjectBounds]()

    if let features = batchResults.features(at: i)
        as? tinyyolo_voc2007Output {

        if let observationDetectObjects = self.detectObjectsBounds(
            array: features.output) {

            for detectedObject in observationDetectObjects.map(
                { $0.transformFromCenteredCropping(
                    from: photos[i].size,
                    to: self.targetSize) }) {

                iResults.append(detectedObject)
            }
        }
    }
    results.append(iResults)
}

// TODO Return results via the callback handler

The only difference from before is that we are iterating over a batch of outputs rather than processing a single one. The last thing we need to do is call the handler; replace the comment // TODO Return results via the callback handler with the following statement:

completionHandler(results)

This now completes the changes required to our YOLOFacade class; let's jump into the PhotoSearcher and make the necessary, and final, changes.

The big change here is that we now need to pass in all photos at once rather than passing each one individually. Locate the detectObjects method and replace its body with the following code:

var results = [SearchResult]()

yolo.detectObjects(photos: photos) { (photosObjectBounds) in
    if let photosObjectBounds = photosObjectBounds,
        photos.count == photosObjectBounds.count {
        for i in 0..<photos.count {
            results.append(SearchResult(
                image: photos[i],
                detectedObjects: photosObjectBounds[i],
                cost: 0.0))
        }
    }
}

return results

This is the same code as before, just organized a little differently to pass a batch of inputs to, and receive a batch of outputs from, the YOLOFacade class; note that returning results directly works here because our completion handler is called synchronously before detectObjects returns. Now is a good time to build, deploy, and run the application, paying particular attention to the efficiency gained from adopting batch inference.
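If you want to put a number on that efficiency gain, a crude but effective approach is to time the search before and after this change. The following is a minimal sketch; the measure helper is hypothetical and not part of our project:

import QuartzCore

// Hypothetical helper: times a block of work and prints the duration
func measure<T>(_ label: String, _ work: () -> T) -> T {
    let start = CACurrentMediaTime()
    let result = work()
    print("\(label): \(CACurrentMediaTime() - start) seconds")
    return result
}

// Usage sketch: wrap the call into YOLOFacade and compare the timings
// reported before and after switching to batch inference
// let bounds = measure("inference") {
//     // invoke detectObjects here
// }

When you return, we will conclude this chapter with a quick summary.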
