HA best practices

When building highly available (HA) Kubernetes systems, keep in mind that availability is as often a function of people and process as it is of technology. Hardware and software fail often, but humans and their involvement in the process are a very predictable drag on the availability of all systems.

This book won't cover how to design a microservices architecture for failure, even though that is a huge part of coping with some (or all) system failures in a cluster scheduling and networking system such as Kubernetes.

There's another important concept to consider: graceful degradation.

Graceful degradation is the idea that you build functionality in layers and modules, so that even with the catastrophic failure of some pieces of the system, you're still able to provide some level of availability. Web design has a corresponding concept, progressive enhancement, but we won't be using that pattern here. Graceful degradation is an outcome of a system having fault tolerance, which is very desirable for mission-critical and customer-facing systems.

In Kubernetes, there are two methods of graceful degradation:

Infrastructure degradation: This kind of degradation relies on complex algorithms and software to handle unpredictable failures of hardware or software-defined hardware (think virtual machines, Software-Defined Networking (SDN), and so on). We'll explore how to make the essential components of Kubernetes highly available in order to provide graceful degradation in this form.
Application degradation: While this is largely determined by the microservice architecture best practices mentioned earlier, we'll explore several patterns here that keep the application usable for your users when parts of it fail.

In each of these scenarios, we aim to provide as much functionality as possible to the end user. If an application, a Kubernetes component, or the underlying infrastructure fails, the goal is still to give users some level of access and availability. We'll strive to completely abstract away underlying infrastructure failure using core Kubernetes strategies, and we'll build caching, failover, and rollback mechanisms to deal with application failure. Lastly, we'll build out Kubernetes components in a highly available fashion.
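
To make the idea of application degradation concrete, here is a minimal Python sketch of one possible caching fallback. It is an illustration only, not a pattern prescribed by this chapter: the `DegradingClient` class, its `fetch` callable, and the `max_stale_seconds` parameter are all hypothetical names. The client returns fresh data when the backend answers, falls back to a recent cached copy when it does not, and reports the service as unavailable only when no usable copy exists.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class DegradingClient:
    """Hypothetical client: fresh data if possible, stale cached data on failure."""

    def __init__(self, fetch: Callable[[str], str], max_stale_seconds: float = 300.0):
        self._fetch = fetch                      # assumed backend call; may raise
        self._max_stale = max_stale_seconds      # how old a cached value may be
        self._cache: Dict[str, Tuple[float, str]] = {}  # key -> (stored_at, value)

    def get(self, key: str, now: Optional[float] = None) -> Tuple[str, str]:
        """Return (value, status), where status is 'fresh', 'stale', or 'unavailable'."""
        now = time.monotonic() if now is None else now
        try:
            value = self._fetch(key)
        except Exception:
            # Broad catch for brevity; real code would catch specific errors.
            cached = self._cache.get(key)
            if cached is not None and now - cached[0] <= self._max_stale:
                return cached[1], "stale"        # degraded but still useful
            return "", "unavailable"             # nothing usable to serve
        self._cache[key] = (now, value)          # remember the last good answer
        return value, "fresh"
```

A caller can inspect the returned status to decide how to present the result, for example by showing a "data may be out of date" notice when the status is `stale` instead of returning an error page.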
