Chapter 13: Using Service Tiers

After you have assigned service tiers for all your services, how do you use them? There are a few ways:

Expectations

What is the expected uptime for the service? What is its reliability? How many problems does it have? How often is it allowed to fail?

Responsiveness

How quickly should you respond to a problem, and what courses of action are available to you in resolving the issue?

Dependencies

What are the service tiers of your dependencies and those who depend on you, and how does that affect your service interactions?

Let’s look at each of these.

Expectations

Your service’s expectations are an important part of the service you provide to your customers. Service-level agreements (SLAs) are one way to manage these expectations. The topic is important enough that a separate chapter, Service-Level Agreements, is dedicated entirely to it.

Responsiveness

When a problem occurs in your system, your responsiveness to the issue depends on these two factors:

The severity of the issue

The tier of service that is having the issue

A high-severity problem on a Tier 1 service should be treated as more important than a high-severity problem on a Tier 3 service. That much is clear. But a medium-severity problem on a Tier 1 service might need a higher level of responsiveness than a high-severity problem on a Tier 3 service. Figure 13-1 demonstrates this.

Figure 13-1. Responsiveness for service tier versus problem severity

The higher the severity of the problem, or the higher the importance of the service (the lower its tier number), the faster and more critical the response to the problem becomes. The parallel lines in Figure 13-1 connect combinations that call for a similar level of response. A low- to medium-severity Tier 1 problem would require a similar response to an extremely high-severity Tier 3 problem. A Tier 4 problem almost never requires a critical response.

Furthermore, a low-severity Tier 2 problem would require a similar response to a high-severity Tier 4 problem.
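One way to read the parallel lines in Figure 13-1 is as a single score that combines tier and severity. The following is a minimal sketch, not the method behind the figure: the five-step severity scale, the weights, and the thresholds are all assumptions, chosen only so that the examples in the text land on the same line.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical five-step severity scale (not defined in the text)."""
    LOW = 1
    LOW_MEDIUM = 2
    MEDIUM = 3
    HIGH = 4
    EXTREME = 5


def response_score(tier: int, severity: Severity) -> int:
    """Combine tier and severity into one number; higher means more urgent.

    The weights (2 per severity step, 3 per tier step) are assumptions,
    picked so that the examples in the text receive equal scores.
    """
    if tier not in (1, 2, 3, 4):
        raise ValueError(f"unknown service tier: {tier}")
    return 2 * int(severity) + 3 * (4 - tier)


def response_level(tier: int, severity: Severity) -> str:
    """Map a score to a coarse response level (thresholds are assumptions)."""
    score = response_score(tier, severity)
    if score >= 13:
        return "critical"
    if score >= 8:
        return "elevated"
    return "routine"
```

With these assumed weights, a low- to medium-severity Tier 1 problem and an extremely high-severity Tier 3 problem receive the same score, as do a low-severity Tier 2 problem and a high-severity Tier 4 problem. A Tier 4 problem never reaches the "critical" level in this sketch; a real policy might leave room for exceptions.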

You can use this information to adjust many aspects of your responsiveness. For example, you can use the responsiveness level to determine the following:

Which types of problems, for which services, require that an immediate pager notification be sent.

The expected resolution SLAs.

The escalation path for slow response or slow resolution.

The schedule on which a response should be provided (24 × 7 or business hours only).

Whether emergency deployment or production changes are warranted.

The availability and responsiveness SLAs that your service should meet.
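The items in this list can be captured as a small policy table keyed by a coarse response level. The sketch below is only an illustration: the level names, the field names, and every value in the table are placeholders, not recommendations from this chapter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponsePolicy:
    page_immediately: bool          # send an immediate pager notification?
    coverage: str                   # "24x7" or "business-hours"
    resolution_target_hours: int    # expected resolution SLA
    escalate_after_minutes: int     # escalate if the response is too slow
    emergency_deploy_allowed: bool  # are emergency production changes OK?


# All values below are hypothetical placeholders.
POLICIES = {
    "critical": ResponsePolicy(True, "24x7", 4, 15, True),
    "elevated": ResponsePolicy(False, "24x7", 24, 60, False),
    "routine": ResponsePolicy(False, "business-hours", 72, 240, False),
}


def policy_for(level: str) -> ResponsePolicy:
    """Look up the policy for a response level, failing loudly on typos."""
    try:
        return POLICIES[level]
    except KeyError:
        raise ValueError(f"unknown response level: {level!r}") from None
```

Keeping the table in one place makes it easy to review whether the paging, escalation, and deployment rules are consistent with one another.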

Dependencies

If you are building a service, the relationship between the service tier you assign to your service and the service tiers of your dependencies is important. Figure 13-2 shows the relationship between your service level and that of a service dependency.

Figure 13-2. Service dependency criticality

If your service is a higher tier (lower number) than the service you depend on, that dependency is a critical dependency. If your service is a lower tier (higher number) than the service you depend on, that dependency is a noncritical dependency.
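The rule can be written as a simple comparison of tier numbers. This is a minimal sketch; the function name and the "same-tier" result are assumptions, since the text does not say how to treat a dependency at the same tier as your service.

```python
def classify_dependency(my_tier: int, dependency_tier: int) -> str:
    """Classify a dependency by comparing tier numbers (lower = more critical service)."""
    if my_tier < dependency_tier:
        # My service is the higher tier, so I must survive the dependency failing.
        return "critical"
    if my_tier > dependency_tier:
        # The dependency is held to a higher standard than my service is.
        return "noncritical"
    # Not covered by the text; flagged separately rather than guessed.
    return "same-tier"
```

For instance, a Tier 1 service calling a Tier 2 service gets "critical", and a Tier 4 service calling a Tier 1 service gets "noncritical".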

Critical Dependency

If you’ve determined that your dependency is critical, it is important that you, as a service developer, deal with failures of your dependency in a way that does not significantly affect your service.

Your service is responsible for performing as many of its capabilities as possible if a critical dependency fails. This is because the dependency is a lower tier (higher number), which means it likely will not have the same level of availability and reliability as your service requires.

As an example, looking at the application shown in the example online store figure, focus on the website frontend service, which is a Tier 1 service. When this service tries to display a specific product detail page to a customer, it needs to determine the current price of the product. To do this, it makes calls to the price & shipping cost calculator (PSCC) service to determine the price.

What if the PSCC service (a Tier 2 service) were down? The website frontend service (a Tier 1 service) still must function as best it can. So, what does it need to do?

It needs to gracefully handle failure messages (or a lack of response) from the PSCC service. As soon as it determines that the PSCC service is down, it needs to decide how to display the product detail page. There are a couple of options:

It could show a cached copy of the price on the page (if it had that available).

It could show the product detail page, but not show the current price. Instead, it could show a message such as “Not available,” or “Price not currently available,” or even “Add to cart to see current price.”

The customer can still see pictures of the product, customer reviews, and other product details. Although the experience is degraded, the customer can still complete some very important tasks on your site.

We call this graceful degradation (dealing with service failures was covered in greater detail in Dealing with Service Failures).
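The two options above can be sketched as a single fallback chain. This is a minimal illustration under stated assumptions: `fetch_price` is a hypothetical stand-in for the call to the PSCC service and is assumed to raise an exception on an error response or a timeout, the cache is a plain in-memory dictionary, and the dollar formatting is arbitrary.

```python
from typing import Callable


def price_text(
    product_id: str,
    fetch_price: Callable[[str], float],
    cache: dict[str, float],
) -> str:
    """Return the price text to show on a product detail page.

    fetch_price is a hypothetical wrapper around the PSCC call; it is
    assumed to raise on failure or timeout.
    """
    try:
        price = fetch_price(product_id)
    except Exception:
        # PSCC is down or too slow: degrade instead of failing the page.
        # (A real service would likely catch narrower exception types.)
        cached = cache.get(product_id)
        if cached is not None:
            return f"${cached:.2f}"  # option 1: last known price
        return "Price not currently available"  # option 2: placeholder
    cache[product_id] = price  # remember the last good price
    return f"${price:.2f}"
```

The rest of the page (pictures, reviews, other details) is rendered as usual; only the price element changes when the PSCC service cannot be reached.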

Noncritical Dependency

If you’ve determined that your dependency is noncritical, you can mostly ignore service failures of your dependency.

This is because the service you depend on, having a higher tier (lower number), will have higher levels of availability and responsiveness than your service requires.

As an example, consider the online store application illustrated in the example online store figure in the service tiers chapter, but this time focus on the weekly order report service, which is a Tier 4 service. For it to get the information it needs to generate its report, it makes calls to the order management service, which is a Tier 1 service.

What happens if the order management service is down? What should the weekly order report service do? Well, it’s probably reasonable for the weekly order report service to simply fail as well. Given that the order management service is a Tier 1 service, any problems it has will be dealt with very quickly, with high responsiveness and a high sense of urgency, much higher than would be needed to deal with the failure of the weekly order report service.

As such, the weekly order report does not need to do anything special to deal with an outage of the order management service, because it is OK for the weekly order report to simply not operate if the order management service is not available.
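In code, handling a noncritical dependency can be as plain as not handling it. The sketch below is hypothetical: `fetch_orders` stands in for the call to the order management service, and the report format and the "amount" field are invented for illustration.

```python
from typing import Callable


def generate_weekly_order_report(fetch_orders: Callable[[], list[dict]]) -> str:
    """Build the report; any error raised by fetch_orders simply propagates."""
    orders = fetch_orders()  # no fallback: if this raises, the report fails
    total = sum(order["amount"] for order in orders)
    return f"{len(orders)} orders, total {total:.2f}"
```

There is no cache, retry loop, or placeholder output here. If the order management service is unavailable, the report run fails and can be repeated once the Tier 1 service has been restored.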

Summary

Service tiers provide a convenient way of expressing the criticality of a service to the service’s owners, dependencies, and consumers. They make expectations between services simple to understand and communicate, and that simplicity reduces the chance of mistakes.
