Improving reliability with the cloud

Earlier sections showed examples of cloud workloads used as disaster recovery sites. Many organizations now choose the cloud for their disaster recovery sites to improve application reliability, because the cloud provides a wide range of building blocks. Cloud providers such as AWS also offer a marketplace where you can purchase a variety of ready-to-use solutions from third-party vendors.

The cloud puts data centers in many geographic locations at your fingertips. You can choose to create a reliability site on another continent without any hassle. With the cloud, you can also easily create infrastructure assets such as backups and machine images and track their availability.

In the cloud, straightforward monitoring and tracking help to make sure your application stays highly available as per the SLA. The cloud gives you fine-grained control over IT resources and cost, and lets you handle the trade-offs involved in meeting RPO/RTO requirements. Data recovery is critical for application reliability, so data resources and locations must align with your RTOs and RPOs.
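To make the RPO/RTO trade-off concrete, the following is a minimal sketch that checks a simple periodic-backup strategy against its objectives. The function names and the example figures are hypothetical, and the model assumes that a backup captures the state at the moment it starts and only becomes usable once it completes.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta,
                         backup_duration: timedelta) -> timedelta:
    """Upper bound on lost data for periodic backups.

    Assumption: if a failure hits just before a running backup completes,
    the newest usable copy is the one started a full interval earlier.
    """
    return backup_interval + backup_duration


def meets_objectives(backup_interval: timedelta,
                     backup_duration: timedelta,
                     estimated_restore_time: timedelta,
                     rpo: timedelta,
                     rto: timedelta) -> tuple:
    """Return (rpo_ok, rto_ok) for the given backup strategy."""
    rpo_ok = worst_case_data_loss(backup_interval, backup_duration) <= rpo
    rto_ok = estimated_restore_time <= rto
    return rpo_ok, rto_ok


# Hypothetical figures: backups every 6 hours that take 30 minutes,
# a 2-hour restore, an RPO of 4 hours, and an RTO of 3 hours.
result = meets_objectives(
    backup_interval=timedelta(hours=6),
    backup_duration=timedelta(minutes=30),
    estimated_restore_time=timedelta(hours=2),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=3),
)
print(result)  # (False, True): the RTO is met, but the RPO is not
```

In this example, shortening the backup interval to one hour would bring the worst-case data loss down to 90 minutes and satisfy the RPO, at the cost of more frequent backup runs and more storage.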

The cloud makes it easy to test your disaster recovery plan effectively. You inherit features available in the cloud, such as the logs and metrics mechanisms for various cloud services. Built-in metrics are a powerful tool for gaining insight into the health of your system. With all available monitoring capabilities, you can notify the team in case of any threshold breach or trigger automation for system self-healing. For example, AWS provides CloudWatch, which can collect logs and generate metrics while monitoring different applications and infrastructure components. It can also trigger automation, for example to scale your application.
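The idea of notifying the team or triggering automation on a threshold breach can be illustrated with a small, self-contained sketch. This is a simplified model of a threshold alarm, not CloudWatch's actual implementation. The function names, the sample CPU readings, and the rule that the alarm fires after a number of consecutive breaching datapoints are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List


def evaluate_alarm(datapoints: Iterable[float],
                   threshold: float,
                   periods: int) -> List[str]:
    """Return the alarm state after each datapoint.

    The state is "ALARM" while the most recent `periods` datapoints
    all exceed `threshold`, and "OK" otherwise.
    """
    states = []
    consecutive = 0  # number of breaching datapoints in a row
    for value in datapoints:
        consecutive = consecutive + 1 if value > threshold else 0
        states.append("ALARM" if consecutive >= periods else "OK")
    return states


def on_state_change(states: List[str],
                    notify: Callable[[int, str], None]) -> None:
    """Call notify(index, state) each time the state changes."""
    previous = "OK"
    for index, state in enumerate(states):
        if state != previous:
            notify(index, state)
        previous = state


# Hypothetical CPU utilization samples, in percent.
cpu = [40, 85, 90, 95, 60, 88]
states = evaluate_alarm(cpu, threshold=80, periods=3)

events = []
on_state_change(states, lambda i, s: events.append((i, s)))
print(events)  # [(3, 'ALARM'), (4, 'OK')]
```

Requiring several consecutive breaches before changing state is one way to avoid paging the team, or launching a self-healing action, on a single short spike.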

The cloud provides a built-in change management mechanism that helps to track provisioned resources. Cloud providers extend out-of-the-box capabilities to ensure applications and operating environments are running known software and can be patched or replaced in a controlled manner. For example, AWS provides AWS Systems Manager, which can patch and update cloud servers in bulk. The cloud has tools to back up data, applications, and operating environments to meet requirements for RTOs and RPOs. Customers can leverage cloud support or a cloud partner for their workload handling needs.

With the cloud, you can design a scalable system that adds and removes resources automatically to match the current demand. Data is one of the essential aspects of any application's reliability. The cloud provides out-of-the-box backup and replication tools for machine images, databases, and files. When these tools are in place, your data is already backed up and stored in the cloud in the event of a disaster, which helps the system to recover quickly.
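Adding and removing resources to match demand can be sketched as a small capacity calculation. This is a simplified, hypothetical model in the spirit of target-tracking scaling, not any provider's actual algorithm: it assumes that load spreads evenly across instances, and the function name and parameters are invented for illustration.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     minimum: int, maximum: int) -> int:
    """Estimate the instance count that brings `metric` back to `target`.

    Assumption: the metric (for example, average CPU percent) scales
    inversely with the number of instances sharing the load.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    wanted = math.ceil(current * metric / target)
    # Clamp to the configured bounds so the fleet never scales to zero
    # or grows without limit.
    return max(minimum, min(maximum, wanted))


# Four instances averaging 90% CPU against a 60% target: scale out.
print(desired_capacity(4, metric=90.0, target=60.0, minimum=2, maximum=10))  # 6

# The same fleet averaging 30% CPU: scale in, but not below the minimum.
print(desired_capacity(4, metric=30.0, target=60.0, minimum=2, maximum=10))  # 2
```

The minimum and maximum bounds reflect the cost and reliability trade-off: the floor keeps enough capacity to survive the loss of an instance, and the ceiling caps spending during a demand spike.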

Regular interaction between the application development and operations teams helps to address and prevent known issues and design gaps, which reduces the risk of failures and outages. Always architect your applications for resiliency and distribute them so that they can handle outages. Distribution should span different physical locations to achieve high levels of availability.
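The effect of distributing an application across physical locations can be estimated with a short calculation. The sketch below assumes that sites fail independently and that the application stays available as long as at least one site is up. Real deployments share dependencies, so treat the result as an optimistic bound. The function names and the 99% figure are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def combined_availability(site_availability: float, sites: int) -> float:
    """Availability of `sites` redundant locations with independent failures.

    The application is down only when every site is down at once, so the
    combined unavailability is the product of the per-site unavailabilities.
    """
    if sites < 1:
        raise ValueError("at least one site is required")
    return 1.0 - (1.0 - site_availability) ** sites


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per year for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


# One site at 99% versus two independent sites at 99% each.
single = combined_availability(0.99, 1)
double = combined_availability(0.99, 2)
print(round(downtime_minutes_per_year(single)))  # 5256
print(round(downtime_minutes_per_year(double)))  # 53
```

Under these assumptions, a second location cuts expected downtime from roughly 3.65 days per year to under an hour, which is why distribution across locations features so prominently in reliability design.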
