Deploying and testing your API loads with Locust

Once the application is deployed, but before it is publicly announced or used, it is a good idea to estimate how many requests it can handle. Usually, you can roughly predict the requirements for the service by estimating the number of requests it needs to serve at peak periods, how long those periods last, how fast it should respond, and so on. Once you're clear on the requirements, you'll need to load-test your application.
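As a quick illustration of that estimate, a back-of-the-envelope calculation can turn expected peak traffic into a target request rate to test against. All of the numbers below are hypothetical placeholders, not figures from our 311 application:

```python
# Back-of-the-envelope capacity estimate; every number here is a
# made-up assumption -- replace them with your own traffic figures.
peak_requests = 100_000      # requests expected during the busiest window
peak_window_hours = 2        # length of that window, in hours
safety_factor = 3            # headroom for bursts above the average

avg_rps = peak_requests / (peak_window_hours * 3600)
target_rps = avg_rps * safety_factor
print(f"average: {avg_rps:.1f} rps, test target: {target_rps:.1f} rps")
```

With these placeholder numbers, the service should comfortably sustain roughly 14 requests per second on average, and a load test should push it to about 42 requests per second to cover bursts.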

Load tests should be performed on the actual, deployed server, not your localhost. Here, we skip over the whole topic of deploying your model. We also didn't use nginx or any similar gateway server, which could cache responses, boosting the performance of the API significantly. Deployment of the application deserves a separate book and can be achieved in many ways, depending on your existing infrastructure and resources and the importance of the application. One popular way is to generate a Docker container that can then be pulled and deployed by any cloud infrastructure platform. We will touch on containers in Chapter 20, Best Practices and Python Performance.

To run a load test, Locust requires a simple Python script, which it calls a locustfile. Let's see how to use it:

  1. The following code is the file we wrote for our 311 API:
from locust import HttpLocust, TaskSet, task

class WebsiteTasks(TaskSet):

    @task
    def predict(self):
        self.client.get("/predict/residential?latitude=40.675719430504&longitude=-73.860535138411&created_date=2019-06-14T00%3A02%3A11.000")

    @task
    def predict_async(self):
        self.client.get("/predict_async/residential?latitude=40.675719430504&longitude=-73.860535138411&created_date=2019-06-14T00%3A02%3A11.000")

    @task
    def dashboard(self):
        self.client.get("/dashboard/dashboard")

class WebsiteUser(HttpLocust):
    task_set = WebsiteTasks
    min_wait = 5000   # each simulated user waits 5-15 seconds between tasks
    max_wait = 15000

Here, we target two prediction endpoints and a dashboard, keeping all of the parameters the same for every request.

  2. For more rigorous testing, it would be a good idea to generate random request parameters on each call, but for now, let's keep it simple. With the file in place, let's fire up Locust:
locust --host=http://127.0.0.1:8000
  3. Now, head to http://127.0.0.1:8089.
  4. In the Locust initial form, specify the desired number of users to simulate and their growth rate (other parameters can be set in locustfile.py), then hit Run. For the following screenshot, we set a maximum of 5,000 users, with a growth rate of 10 per second.
  5. Once the simulation has started, you can monitor the resulting performance, as well as failures and thrown exceptions, in real time via the Locust dashboard:

Upon running, Locust will ask you for a plan of attack: how many users to emulate, and at what rate of growth. Once the values are defined, it will start generating traffic and will show you the results in real time. It's then up to you to monitor and analyze the results and give a final verdict: did your application pass the test?

Building a web endpoint is an exciting phase of the work. Indeed, you're making your work available for the world to use. Don't rush and skip the testing part! Making sure your application is written effectively, responds quickly, and can handle the expected traffic may save you a lot of time and nerves down the road.
