Achieving SLAs in distributed systems

Let's see an example of how to achieve SLAs in distributed systems, assuming the enterprise application operates in a high-performance scenario where it's crucial to meet guaranteed response times. The application synchronously communicates with one or more backend systems that provide the necessary information. The overall system needs to meet an SLA response time of 200 milliseconds.

In this scenario, the backend applications help meet the SLA time by applying backpressure and preemptively rejecting requests that won't meet the guaranteed response time. By doing so, the originating application has the chance to use another backend service that may respond in time.

In order to configure pooling appropriately, the engineers need to know the average response time of the backend system, here 20 milliseconds. The corresponding business functionality defines a dedicated thread pool by using a dedicated managed executor service, which can be configured individually.
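How such a dedicated, individually configurable pool could look is sketched below in plain Java, approximating the container-managed executor service with a bounded ThreadPoolExecutor exposed as a CDI producer. The class name and the split of 5 pool threads plus 5 queue slots are assumptions for illustration; in a real application server the managed executor service would be configured in the server itself.

import java.util.concurrent.*;
import javax.enterprise.context.ApplicationScoped;
import javax.enterprise.inject.Produces;

@ApplicationScoped
public class BackendExecutorExposer {

    // Dedicated pool for one business functionality (bulkhead):
    // at most 5 concurrent worker threads plus a bounded queue of 5 waiting
    // requests, i.e. at most n = 10 requests handled at a time.
    private final ExecutorService backendExecutor = new ThreadPoolExecutor(
            5, 5,                                   // core and maximum pool size
            0L, TimeUnit.MILLISECONDS,              // no keep-alive for excess threads
            new ArrayBlockingQueue<>(5),            // bounded queue of waiting requests
            new ThreadPoolExecutor.AbortPolicy());  // reject once pool and queue are full

    @Produces
    public ExecutorService backendExecutor() {
        return backendExecutor;
    }
}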

The configuration follows a few steps: the engineers configure the maximum thread pool size plus the maximum queue size so that the SLA time equals n times the average response time. Here n is 10, the maximum number of requests the system will handle at a time, consisting of the maximum pool size plus the maximum queue size. Any request that exceeds this number is immediately rejected with a service temporarily unavailable response. The reasoning is that if the number of requests currently being handled already exceeds n, a new request will likely exceed the calculated SLA time of 200 milliseconds.
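The following sketch shows how the rejection could surface at the boundary, assuming a JAX-RS resource that dispatches the backend call via the bounded executor shown above. The resource path, the OrdersResource class, and the BackendClient type are hypothetical; the point is that a saturated pool raises a RejectedExecutionException, which is answered with 503 Service Unavailable.

import java.util.concurrent.*;
import javax.inject.Inject;
import javax.ws.rs.*;
import javax.ws.rs.container.*;
import javax.ws.rs.core.Response;

@Path("orders")
public class OrdersResource {

    @Inject
    BackendClient backendClient;       // hypothetical client for the backend system

    @Inject
    ExecutorService backendExecutor;   // the bounded pool: at most n = 10 requests in flight

    @GET
    public void getOrders(@Suspended AsyncResponse response) {
        try {
            backendExecutor.submit(() -> {
                response.resume(backendClient.loadOrders());
            });
        } catch (RejectedExecutionException e) {
            // more than n = 10 requests in flight: the 200 ms SLA would likely be missed
            response.resume(Response.status(Response.Status.SERVICE_UNAVAILABLE).build());
        }
    }
}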

Immediately rejecting requests sounds like a harsh response, but it gives the client application the opportunity to retry a different backend without consuming the whole SLA budget in vain on a single invocation. This is a typical approach for high-performance scenarios with multiple backends where meeting SLAs has a high priority.

The implementation of this scenario is similar to the backpressure example in the previous chapter. The client uses different backends as a fallback if the first invocation fails with an unavailable service. This implicitly makes the client resilient, since it can fall back to multiple backends. The backend service implicitly applies the bulkhead pattern: a single functionality that is unavailable doesn't affect the rest of the application.
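A minimal sketch of the client-side fallback follows, assuming two interchangeable backend systems. The backend URLs and the OrderClient class name are illustrative; the idea is that a 503 response from one backend immediately triggers an attempt against the next one instead of wasting the SLA budget.

import java.util.List;
import javax.ws.rs.client.*;
import javax.ws.rs.core.Response;

public class OrderClient {

    private final Client client = ClientBuilder.newClient();

    // interchangeable backends, tried in order
    private final List<String> backendUrls = List.of(
            "http://backend-1.example.com/orders",
            "http://backend-2.example.com/orders");

    public String loadOrders() {
        for (String url : backendUrls) {
            Response response = client.target(url).request().get();
            if (response.getStatus() == 200) {
                return response.readEntity(String.class);
            }
            // 503 Service Unavailable: the backend applied backpressure, try the next one
        }
        throw new IllegalStateException("No backend could serve the request in time");
    }
}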
