Adding features to our service

At this point, our API has two GET calls and one POST call. One of the GET calls estimates the time it will take for each type of complaint to be closed, based on the historic median for that type of complaint. However, this approach is very naive: it takes into account neither the location, nor the time, nor the number of similar complaints in the queue for the same area. To improve our estimate, let's use an ML model trained to predict the time to close, given a complaint type, location, and time. You can find all of the details on model training in the 311model.ipynb notebook. What is important is that the trained model is stored as a pickle file and expects four features (which we collected earlier): the type of complaint, latitude, longitude, and the time the complaint was filed.
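To see why the median baseline is limited, consider a toy sketch of it. The records below are made up purely for illustration (durations are in arbitrary time units), and median_by_type is a hypothetical helper, not code from the service:

```python
from statistics import median

# Made-up history of (complaint_type, time_to_close) pairs, for illustration only
history = [
    ("vehicle", 0.5), ("vehicle", 1.5), ("vehicle", 4.0),
    ("noise", 0.2), ("noise", 0.4), ("noise", 1.0),
]


def median_by_type(records):
    """Group closing times by complaint type and return the median of each group."""
    grouped = {}
    for complaint_type, duration in records:
        grouped.setdefault(complaint_type, []).append(duration)
    return {ctype: median(values) for ctype, values in grouped.items()}


baseline = median_by_type(history)
print(baseline["vehicle"])
```

Every vehicle complaint gets the same estimate here, regardless of where or when it was filed, which is exactly the limitation the model is meant to address.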

Let's now modify our code so that it will take those features and run a model:

First, we need to load a model from pickle in our code (we use joblib, which is a little more efficient for scikit-learn models, or any objects containing NumPy arrays, for that matter):

import joblib  # this line should be at the top

clf = joblib.load('model.joblib')

Now, because we use a custom transformer for time features, we need to import it into the file:

from ml import TimeTransformer  # this line should be at the top

One inconvenient property of pickle (and joblib) objects is that they don't store the code of their dependencies. Instead, they reference classes and functions by module path and name, and expect to find them under the same path when the object is loaded. In other words, everything the object uses has to be available exactly as it was when the object was stored.
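This behavior can be reproduced with the standard library's pickle module alone. The sketch below is only an illustration: ml_demo is a hypothetical throwaway module created in memory, and the toy TimeTransformer class is a stand-in, not the transformer from the notebook.

```python
import pickle
import sys
import types

# Hypothetical in-memory module standing in for a real file such as ml.py
ml_demo = types.ModuleType("ml_demo")


class TimeTransformer:
    """Toy stand-in for a custom transformer (not the book's implementation)."""

    def __init__(self, column):
        self.column = column


# Pretend the class lives in ml_demo, then register the module
TimeTransformer.__module__ = "ml_demo"
ml_demo.TimeTransformer = TimeTransformer
sys.modules["ml_demo"] = ml_demo

payload = pickle.dumps(TimeTransformer("created_date"))

# The pickle holds the module path and class name, not the class code
print(b"ml_demo" in payload, b"TimeTransformer" in payload)

# Loading works while the module can be found...
restored = pickle.loads(payload)

# ...and fails once it cannot (simulating a process without the module)
del sys.modules["ml_demo"]
try:
    pickle.loads(payload)
    load_error = None
except ModuleNotFoundError as exc:
    load_error = exc

print(type(load_error).__name__)
```

The same lookup happens when joblib restores a scikit-learn pipeline that contains a custom class, which is why the import path has to match.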

And finally, we can implement the method as follows:

@app.get('/predict/{complaint_type}', tags=['predict'])
def predict_time(complaint_type: ComplaintType, latitude: float, longitude: float, created_date: datetime):
    # Build a one-row DataFrame with the four features the model expects
    obj = pd.DataFrame([{'complaint_type': complaint_type.value,
                         'latitude': latitude, 'longitude': longitude,
                         'created_date': created_date}])
    # Keep the columns in an explicit order
    obj = obj[['complaint_type', 'latitude', 'longitude', 'created_date']]

    predicted = clf.predict(obj)
    logger.info(predicted)
    # Cast the NumPy scalar to a plain float for JSON serialization
    return {'estimated_time': float(predicted[0])}

Once the application is reloaded, go to the docs, and try to execute the method:

$ curl -X GET "http://127.0.0.1:8000/predict/vehicle?latitude=40.701258&longitude=-73.935493&created_date=2019-06-08%2018%3A00%3A10" -H "accept: application/json"

{
  "estimated_time": 0.59
}
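The percent-encoded created_date in the request above (%20 for the space, %3A for each colon) does not have to be written by hand. As a rough sketch, the same URL can be assembled with Python's standard library; build_predict_url is a hypothetical helper, and the base URL assumes the local development server used in the curl call:

```python
from datetime import datetime
from urllib.parse import quote, urlencode

BASE_URL = "http://127.0.0.1:8000"  # assumed local development server


def build_predict_url(complaint_type, latitude, longitude, created_date):
    """Return the /predict URL with percent-encoded query parameters."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        # Space-separated timestamp, matching the curl request
        "created_date": created_date.strftime("%Y-%m-%d %H:%M:%S"),
    }
    # quote_via=quote encodes the space as %20 (the default would emit "+")
    query = urlencode(params, quote_via=quote)
    return f"{BASE_URL}/predict/{quote(complaint_type)}?{query}"


url = build_predict_url("vehicle", 40.701258, -73.935493,
                        datetime(2019, 6, 8, 18, 0, 10))
print(url)
```

With the service running, the resulting string could be passed to any HTTP client, for example urllib.request.urlopen.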

Voilà! Our first ML model is up and running. Here is what it looks like via the OpenAPI page:

We were able to build a working API application that serves predictions from our pre-trained model. This application may be used by the government or by concerned citizens, eager to know how long it is likely to take 311 to review and close a complaint.

RESTful APIs are the bread and butter of data-driven software and services. Their technical look, however, may intimidate and confuse the inexperienced person. As an alternative, we could build and serve a web page that would be easy to read and understand for such an audience. Let's see how that works in the next section.
