Although messaging allows for very loosely coupled communication, in many scenarios prolonged downtime or message loss is unacceptable, especially when delivery must be guaranteed. In the previous chapter, we described how RabbitMQ supports clustering and how clustering focuses on queue scalability rather than on providing high availability. In this chapter, we will explore mechanisms for establishing high availability at the level of the message broker.
Topics covered in the chapter:
When we design and develop large systems that need to be up and running most of the time, we need to consider what happens when a single component fails. The cause could be a hardware fault, a network fault, or any other type of failure. Some systems, for example, have an SLA (service level agreement) that specifies 99.99 percent uptime. High availability should therefore be considered for every component whose failure could bring the system down, including the message broker. Doing so helps you meet the SLAs defined for your system, which increases confidence in its reliability. It also minimizes the impact of a system that fails from time to time and stays down for a certain period, at least until some manual intervention brings it back up. Such outages carry the risk of losing money: the more users a failure affects, the more likely it is that your SLAs oblige you to pay out. General-purpose solutions exist for building high availability clusters around systems that have no built-in support for them. Fortunately, RabbitMQ provides its own mechanisms for high availability, as we will discover later in this chapter.
Moreover, we may want to perform upgrades without disrupting the users of our system, or to back up data while the system is running.
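To put the 99.99 percent uptime figure in perspective, the following minimal sketch converts an uptime percentage into the downtime it permits. The function name `allowed_downtime_minutes` and the 365-day year are assumptions made for illustration; a real SLA defines its own measurement period and its own rules for what counts as downtime.

```python
# Illustrative sketch: downtime budget implied by an uptime target.
# Assumption: the SLA is measured over a non-leap, 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(uptime_percent: float,
                             period_minutes: int = MINUTES_PER_YEAR) -> float:
    """Return the downtime, in minutes, permitted by an uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    # The unavailable fraction of the period is the downtime budget.
    return (100 - uptime_percent) / 100 * period_minutes


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        budget = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {budget:.2f} minutes of downtime per year")
```

Under these assumptions, a 99.99 percent target leaves less than an hour of downtime per year, which is little time for a manual recovery.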
High availability may be considered when: