Planning the RTO and RPO

Any application needs to define its service availability as part of a Service-Level Agreement (SLA). Organizations define SLAs to ensure application availability and reliability for their users. For example, an SLA might state that the application should be 99.9% available in a given year, or that the organization can tolerate the application being down for 43 minutes per month. The RPO and RTO for an application are mostly driven by the defined SLA.
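The link between an availability percentage and a downtime allowance is simple arithmetic, and a short sketch makes it concrete. The function name `downtime_budget_minutes` and the choice of a 30-day month are assumptions made for illustration only; they are not part of any standard or library.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float) -> float:
    """Return the downtime allowed, in minutes, for an availability target.

    availability_pct: target availability as a percentage, for example 99.9
    period_days: length of the measurement period in days (assumed value)
    """
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    period_minutes = period_days * 24 * 60
    # The downtime budget is the fraction of the period that may be unavailable.
    return period_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    # A 30-day month at 99.9% availability allows roughly 43 minutes of downtime.
    print(round(downtime_budget_minutes(99.9, 30), 1))
    # The same target over a 365-day year.
    print(round(downtime_budget_minutes(99.9, 365), 1))
```

With a 30-day month, a 99.9% target works out to about 43 minutes of allowed downtime, which matches the monthly figure quoted above.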

The Recovery Point Objective (RPO) is the amount of data loss an organization can tolerate, measured in time. For example, an application may be fine if it loses 15 minutes' worth of data. The RPO helps to define a data backup strategy. The Recovery Time Objective (RTO) is about application downtime: how long the application may take to recover and function normally after a failure. The following diagram illustrates the difference between the RTO and RPO:

RTO and RPO

In the preceding diagram, suppose the failure occurs at 10 A.M. and the last backup was taken at 9 A.M. When you restore the system after the crash, you lose 1 hour of data, because backups were taken every hour. If the organization can live with that loss, the system RPO is 1 hour: the maximum data loss that can be tolerated is 1 hour's worth.

If it takes 30 minutes to restore from the backup and bring the system back up, that defines your RTO as half an hour: the maximum downtime that can be tolerated is 30 minutes. The RTO is the maximum time allowed to restore the entire system after a failure that causes downtime.
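The 9 A.M. backup, 10 A.M. failure, and 30-minute restore can be checked against the objectives with a small sketch. The function names and the calendar date are hypothetical and chosen only for illustration; the times come from the example above.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Data written after the last backup is lost when the failure occurs."""
    return failure - last_backup


def downtime(failure: datetime, restored: datetime) -> timedelta:
    """Time the system was unavailable, from failure until service is restored."""
    return restored - failure


def meets_objectives(last_backup: datetime, failure: datetime, restored: datetime,
                     rpo: timedelta, rto: timedelta) -> bool:
    """Return True if both the data loss and the downtime stay within the targets."""
    return (data_loss_window(last_backup, failure) <= rpo
            and downtime(failure, restored) <= rto)


if __name__ == "__main__":
    day = datetime(2024, 1, 1)  # arbitrary date, an assumption for the example
    last_backup = day.replace(hour=9)
    failure = day.replace(hour=10)
    restored = day.replace(hour=10, minute=30)

    print(data_loss_window(last_backup, failure))  # data lost, as a duration
    print(downtime(failure, restored))             # outage, as a duration
    print(meets_objectives(last_backup, failure, restored,
                           rpo=timedelta(hours=1), rto=timedelta(minutes=30)))
```

Note that hourly backups only just satisfy a 1-hour RPO in the worst case, when the failure happens immediately before the next backup is due. A tighter RPO would call for more frequent backups.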

An organization typically decides on an acceptable RPO and RTO based on the user experience and the financial impact on the business in the event of system unavailability. Organizations consider various factors when determining the RTO and RPO, including the loss of business revenue and the damage to their reputation caused by downtime. IT organizations then plan solutions that provide effective system recovery within the defined RTO and RPO.
