Three Types of Contingency Planning

One of the most important things about conducting an RA is that it forces an organization to think “What if?” An organization must plan for the risks that it accepts or cannot avoid, mitigate, or transfer. It creates contingency plans to limit the financial loss it might experience because of an adverse event. Contingency plans help to minimize the length of time that services and processes are interrupted. They also help minimize customer impact because of a serious event.

The three main types of contingency plans are:

  • Incident response (IR) plans
  • Disaster recovery (DR) plans
  • Business continuity (BC) plans

An organization creates these plans to respond to events that might negatively affect IT resources and business processes. Keep in mind that this chapter discusses these types of contingency plans from an information security perspective. These plans can have a much larger scope than just IT.

It is also important to remember that the most important goal of any type of contingency plan is to preserve human life. Other goals are secondary.

Incident Response Planning

An organization uses its incident response (IR) process to react to attacks against its IT infrastructure. Having an IR process is important because it helps make sure that an organization can recover from security incidents. Organizations that are able to recover quickly from incidents are more likely to be able to continue business operations. IR is also called incident handling.

NOTE

IR is a reactive term that describes how an organization responds to an incident. Incident handling is a proactive term that describes how an organization manages an incident. Organizations may use both terms interchangeably.

IR describes how an organization:

  • Detects information security incidents
  • Determines the cause of the incident
  • Mitigates the damage caused by the incident
  • Recovers from the incident

An incident is any event that involves the organization’s equipment, data, or other resources. An incident must adversely affect the confidentiality, integrity, and/or availability of the organization’s data and IT systems. The intent of the threat source does not matter. An incident includes malicious attacks, as well as the harmful acts of well-meaning employees. Incidents are usually violations of the organization’s policies, accepted security practices, or the law.

Organizations encounter several information security threats each day. Each one of these can be an incident if it adversely affects the security of an organization’s resources and data. You hear about these incidents in the media nearly every day. For example, in 2015 a student installed keystroke loggers on computers at his university to gather username and password information from university professors. The student then used that information to change his grades in the university’s computer system. The university discovered the hack through logging and audit review security measures. In 2018 the student was sentenced to 4 months in prison and 2 years of supervised probation, and was ordered to pay restitution to the university for unauthorized access and damage to its computer network.4

When the university discovered that records had been improperly modified, it would have declared an information security incident and formed a team to respond to the system intrusion. That team would have been responsible for determining the damage that the incident caused, finding the source of the incident, mitigating the damage caused by the incident, and recovering the computer systems affected by the incident.

The student hacker’s conduct was most likely a violation of the university’s acceptable use policy (AUP). Many organizations have these policies to define acceptable behaviors for the use of their IT resources. In this case, the student hacker violated university policy and also broke the law.

Based on publicly available information about this case, the university handled the incident as follows:

  • Detecting the incident—The university used normal information security practices such as logging and audit review to discover that information in an IT system had been changed.
  • Determining the cause of the incident—These same measures, and most likely computer forensic analysis, helped the university discover that the cause of the incident was a student hacker who had installed keystroke loggers.
  • Mitigating the damage caused by the incident—The university was able to restore the changed data because it made regular data backups.
  • Recovering from the incident—The university restored data and, likely, hardened its systems to prevent this type of incident from happening again in the future.

One of the most important parts of IR is documentation. An organization must document every incident that it encounters. That way it can refer to its documentation if it encounters a similar incident in the future.

Incident Response Team

The IR team is responsible for creating the organization’s IR policy and plan. This team has a different focus from that of the operational information security employees who will follow the IR plan and respond to incidents. Often the IR team will include many information security team members. Similar to an RA team, the IR planning team must include advisors from several departments across the organization.

The IR team may help draft the initial IR policy and create the plans that define the organization’s IR structure. Even though IR is often an information security responsibility, other departments may need to participate in an IR process. The information security team may have to interact with many of these departments when handling an incident.

For instance, the operational team will be involved if there is an incident involving a physical trespasser to the organization’s data center. The operational team most likely will involve the organization’s physical security personnel in handling the incident because the incident included actual trespass onto the organization’s property. IT personnel will be involved to see if any equipment has been stolen, and internal audit and legal counsel will be involved if any equipment or data are stolen. Human resources (HR) personnel could become involved if it appears that the trespasser is a former employee. Marketing and communications personnel could become involved if the facts of the incident trigger the laws that require security breach notification.

The IR team is responsible for making sure that the procedures are in place to help all of these different departments work together. They must be able to respond quickly and efficiently in the event of an incident. The operational information security team needs to know whom they can contact in each department for help. The IR team’s planning puts this structure in place.

The IR planning team should include information security and IT representatives. It also should have members from physical security, HR, and internal audit. Legal counsel should be included on the team to address any legal or regulatory issues.

The IR team will help the organization create its IR policy. These policies are very specific to an organization’s structure and culture. Similar to all information security policies, the IR policy is a statement of executive management’s commitment to the IR process. The policy should state the purpose and goals of IR. It also should define what an incident is. The policy must set forth, at a high level, the organization’s operational IR structure. It should define the roles and responsibilities within this structure. The IR policy also can contain information about how to measure the effectiveness of the IR process.

IR Plan Process

Once the organization approves its IR policy, the IR team can continue to define the IR plan. The IR policy contains the overall IR goals, whereas the IR plan contains the procedural elements that are necessary to meet those policy goals. Operational information security teams will follow the IR plan to fulfill their IR job duties. In this section, the term incident handlers will be used to refer to the operational teams that respond to an incident.

An IR plan is specific to a particular organization. However, most IR plans have five basic parts. They are:

  • Incident triage
  • Investigation
  • Containment or mitigation
  • Recovery
  • Review

The triage phase is the first phase in the IR process. In this phase, a potential incident is initially assessed. It is at this point that the primary handler will verify whether an adverse event meets the definition of an incident. The primary handler is the person who is in charge of coordinating an organization’s response to an information security incident. This person is often a member of the organization’s information security team.

NOTE

The word triage is most commonly associated with the medical profession. It is the process of sorting and prioritizing patient care based on the severity of a patient’s condition.

The Operational Incident Response Team

An organization often plans and coordinates IR at a high level. A cross-functional team, called the IR team, puts the IR plan into place. The responsibility for the daily operations of the IR plan often falls to the organization’s information security team and other IT personnel.

There are several different roles involved in IR, which are reviewed briefly here. You may find these terms used in resources describing how to plan and implement an IR program:

  • Victim—The person or resources that are targeted in an incident. The victim is often the organization and its IT resources and data.
  • Attacker—The person or mechanism that caused the incident.
  • Incident reporter—The first person or mechanism that reports an incident. The incident reporter does not have to be a person. An automated intrusion detection system (IDS) can be an incident reporter. A person who notices an unusual incident and reports it is also an incident reporter.
  • Primary handler—The person who is in charge of coordinating the response to a particular incident. This person is responsible for making sure that the IR process is documented. Often this person is a member of the organization’s information security department. If an organization does not have a dedicated information security department, this is the person with information security job duties.
  • Secondary handlers—These are the personnel involved in investigating, responding to, and recovering from an incident. Secondary handlers include technicians, analysts, and operational staff who take part in handling the incident. Legal counsel and an organization’s internal auditors also can be secondary handlers. The type of secondary handlers involved in IR depends on the nature of the incident.

Not all events are information security incidents. The event must have an adverse effect on the confidentiality, integrity, and/or availability of an organization’s IT resources or data. For example, it might be an incident if an employee mistakenly deletes a critical file needed for processing the organization’s weekly payroll because the employee has compromised the availability of needed data. It might not be an information security incident if the employee mistakenly deleted a non-critical file. (Although it certainly would be a business process issue.) The primary handler decides whether a reported event is actually an incident.
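The triage test described above can be sketched in a few lines of Python. This is only an illustration: the `ReportedEvent` record and its field names are hypothetical, and in practice the primary handler makes this judgment rather than a script.

```python
from dataclasses import dataclass


@dataclass
class ReportedEvent:
    """A reported event, as recorded by the primary handler (hypothetical record)."""
    description: str
    affects_confidentiality: bool = False
    affects_integrity: bool = False
    affects_availability: bool = False


def is_incident(event: ReportedEvent) -> bool:
    """An event is an incident only if it adversely affects
    confidentiality, integrity, and/or availability."""
    return (
        event.affects_confidentiality
        or event.affects_integrity
        or event.affects_availability
    )


# The two file-deletion examples from the text.
payroll_deletion = ReportedEvent(
    "Critical payroll file deleted by mistake", affects_availability=True
)
noncritical_deletion = ReportedEvent("Non-critical file deleted by mistake")

print(is_incident(payroll_deletion))      # True
print(is_incident(noncritical_deletion))  # False
```

The flags capture the handler’s assessment of the event. The function simply applies the definition of an incident to that assessment.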

If an event is an incident, the primary handler must classify it, which can be done in several ways. Incidents can be sorted based on threat source. An example would be whether the incident occurred because of an internal threat or an external threat. Incidents also can be sorted based upon the type of vulnerability or threat that is exploited.

Organizations may use any method for sorting incidents. Some organizations, however, must use guidance prepared by NIST.5 The U.S. Department of Homeland Security’s National Cybersecurity and Communications Integration Center (NCCIC) uses IR categories that are based on NIST guidance. The NCCIC is the federal government’s IR center and is sometimes referred to as the United States Computer Emergency Readiness Team (US-CERT). All federal agencies must report information security incidents to the NCCIC/US-CERT. The NCCIC/US-CERT’s incident categories are:

  • Category 1: Unauthorized Access—Unauthorized access is technical or physical access to an IT system without permission. An agency must report these incidents even if data is not compromised.
  • Category 2: Denial of Service (DoS)—Any event that prevents the normal operation of IT resources such that use of those resources is harmed.
  • Category 3: Malicious Code—Any event that involves the use of malicious code to successfully infect, breach, or compromise IT resources. These events include viruses, worms, and Trojan horses.
  • Category 4: Improper Use—Any event that is a violation of the agency’s AUP or other related policies.
  • Category 5: Scans, Probes, and Attempted Access—Any event where an IT resource is scanned or probed in an attempt to access or identify the agency’s IT systems.
  • Category 6: Investigation—This category is for unusual events that do not fall into one of the other categories. These incidents require more review because they are odd or potentially harmful.
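An incident tracking tool might represent the six categories above as an enumeration. The following Python sketch is illustrative only: the `IncidentRecord` class, its fields, and the `reclassify` method are hypothetical names, not part of any NCCIC/US-CERT system.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class IncidentCategory(IntEnum):
    """The six incident categories listed above."""
    UNAUTHORIZED_ACCESS = 1
    DENIAL_OF_SERVICE = 2
    MALICIOUS_CODE = 3
    IMPROPER_USE = 4
    SCANS_PROBES_ATTEMPTED_ACCESS = 5
    INVESTIGATION = 6


@dataclass
class IncidentRecord:
    """Hypothetical record that keeps a history of category changes."""
    summary: str
    category: IncidentCategory = IncidentCategory.INVESTIGATION
    history: list = field(default_factory=list)

    def reclassify(self, new_category: IncidentCategory, reason: str) -> None:
        # Keep the old category and the reason so the change is documented.
        self.history.append((self.category, new_category, reason))
        self.category = new_category


record = IncidentRecord("Unusual outbound traffic from a file server")
record.reclassify(IncidentCategory.MALICIOUS_CODE, "Worm found on the server")
print(record.category.value, record.category.name)  # 3 MALICIOUS_CODE
```

Starting new records in the Investigation category is an assumption made for this sketch. An organization could just as reasonably require the primary handler to choose a category at triage.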

The classification of an incident may change as the incident is investigated because the organization learns more about the incident as the investigation progresses. Incidents also should be sorted based upon potential severity. Severity is assessed based upon the perceived level of impact to the confidentiality, integrity, and availability of an organization’s IT resources or data. The severity of an incident also may change. Organizations often classify severity on a low-medium-high scale. An organization might classify severity as follows:

  • Low—The adverse effect on the confidentiality, integrity, or availability of the organization’s data or IT resources is limited. A low-impact event causes little or no damage.
  • Medium—The adverse effect on the confidentiality, integrity, or availability of the organization’s data or IT resources is moderate. A medium-impact event results in significant damage to assets.
  • High—The adverse effect on the confidentiality, integrity, or availability of the organization’s data or IT resources is severe. A high-impact event results in major damage to assets.

Classifying the nature and severity of an incident helps incident handlers know which incidents require priority handling. If the incident handlers must respond to multiple incidents, classification promotes efficiency. It also makes sure organizations respond to incidents according to the incident’s potential to hurt the organization. Classification also helps incident handlers know which incidents to escalate to management.
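As a rough illustration of how severity drives priority, the Python sketch below sorts open incidents so that the most severe are handled first and flags those that should go to management. The class names, the sample incidents and their severity ratings, and the escalation threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class OpenIncident:
    name: str
    severity: Severity


def handling_order(incidents):
    """Return incidents with the most severe first.
    sorted() is stable, so ties keep their reported order."""
    return sorted(incidents, key=lambda i: i.severity, reverse=True)


def needs_escalation(incident, threshold=Severity.HIGH):
    """Assumed rule: escalate anything at or above the threshold."""
    return incident.severity >= threshold


queue = [
    OpenIncident("Phishing email campaign", Severity.MEDIUM),
    OpenIncident("Port scan of public web server", Severity.LOW),
    OpenIncident("Self-propagating virus", Severity.HIGH),
]

for incident in handling_order(queue):
    print(incident.severity.name, incident.name, needs_escalation(incident))
```

Because the severity of an incident may change during an investigation, a real queue would need to be re-sorted whenever a rating is updated.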

Investigation is the second phase in the IR process. During this phase, the incident handlers learn about the incident and its source, as well as the impact that the incident is having on the organization. The incident handlers must find all the resources that are affected by the incident. They also must contact other areas as needed to fully understand the scope of the incident. The IR policy and plan let the incident handlers know whom they must contact.

The incident handlers must keep management informed of their IR activities. It is important that management be informed in case there are any regulatory requirements that need to be considered as the organization responds to the incident. For example, if an incident involves the disclosure of the protected health information of 500 or more people, HIPAA requires that the organization notify the Department of Health and Human Services about the disclosure.6 Executive management, in consultation with legal counsel, must make the final decisions about contacting third parties or the media.

It is also important during the investigation phase for the incident handlers to follow the organization’s own internal policies. If handlers are investigating an incident that might be a crime, it is important that the team follow good evidentiary practices. If an incident appears to be a crime, the incident handlers must contact law enforcement according to the terms of the IR plan.

The organization must begin the containment phase almost as soon as an incident is reported. During this phase, incident handlers must take steps to limit the damage caused by the incident. They will use different methods to contain an incident depending upon its nature. For example, if the incident is a self-propagating virus, incident handlers may remove an infected system from the organization’s network. If the incident is a particularly authentic-looking phishing email, incident handlers might issue an alert or other notification telling the organization’s employees not to respond to the phishing email. Incident handlers might use several different tactics to mitigate an incident.

The organization repairs and recovers its IT resources and data during the recovery phase. The IT resources and data should be repaired in such a way that they are not vulnerable to the same type of incident again. This is called hardening. In addition to hardening damaged IT resources, the organization must harden resources that are similar to the damaged resources. This makes sure that similar resources are not harmed by similar incidents.

After an IT resource is repaired, it should be tested for any additional vulnerabilities or weaknesses before it is put back into production. The recovery and repair method will depend upon the nature of the incident.

NOTE

As a general information security best practice, a repaired IT resource should not be tested by the same person who repairs or recovers it. This is a separation of duties best practice to make sure that vulnerabilities and weaknesses are not overlooked.

All stages in the IR process must be fully documented. The primary incident handler is responsible for making sure that each step in the process has been fully documented. This is important because the IR planning team can review the notes from the incident handlers to determine whether the IR plan worked as intended. The IR planning team will want to know which parts of the plan worked well and which parts need to be improved for the future. During the review stage, both the incident handlers and the IR planning team can review the documentation to learn:

  • Dollar amount spent in handling the incident
  • Dollar amount spent to prevent similar incidents in the future
  • Loss of staff time in handling the incident
  • How the response to the current incident compares with similar incidents in the past
  • Recommendations on policy and procedural changes because of lessons learned from the incident

The review phase is often overlooked because it is easy for an organization’s employees to go back to their normal operational duties after an incident. The IR planning team must make sure that the review process is formally required by policy.

The IR process is shown in FIGURE 14-2. As described in this section, many of the stages overlap with one another. The lines between each stage can be indistinct at times.

FIGURE 14-2
Incident response plan phases. The flow diagram runs from left to right: triage, investigation, containment, recovery, and review.
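Because the phases overlap and every phase must be documented, one simple way to support the primary handler is a log that accepts notes for any phase in any order and reports which phases still have no documentation. The Python sketch below assumes hypothetical names (`Phase`, `IncidentLog`) and a made-up incident identifier.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    """The five IR plan phases, in the order shown in the figure."""
    TRIAGE = "triage"
    INVESTIGATION = "investigation"
    CONTAINMENT = "containment"
    RECOVERY = "recovery"
    REVIEW = "review"


@dataclass
class IncidentLog:
    """Hypothetical per-incident documentation log."""
    incident_id: str
    notes: dict = field(default_factory=lambda: {p: [] for p in Phase})

    def document(self, phase: Phase, note: str) -> None:
        # Phases can overlap, so notes are accepted in any order.
        self.notes[phase].append(note)

    def undocumented_phases(self) -> list:
        """Phases with no notes yet, in phase order."""
        return [p for p in Phase if not self.notes[p]]


log = IncidentLog("INC-001")  # identifier format is an assumption
log.document(Phase.TRIAGE, "Event verified as an incident by the primary handler")
log.document(Phase.CONTAINMENT, "Infected system removed from the network")
print([p.value for p in log.undocumented_phases()])
# ['investigation', 'recovery', 'review']
```

The log deliberately does not enforce a strict sequence, since containment often begins before the investigation is complete.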

Disaster Recovery and Business Continuity Planning

DR and BC plans help an organization respond to a disaster. A disaster is a sudden, unplanned event. Disasters negatively affect the organization’s critical business functions for an unknown period. The difference between a disaster and an incident is subtle. A disaster severely affects the organization’s infrastructure and interrupts critical business functions. An incident tends to refer to service failures that affect the confidentiality, integrity, and/or availability of the organization’s data and IT systems. An incident may become a disaster in some situations.

NOTE

It helps to think of an incident as an event that an organization can deal with during its normal operations. A disaster is an event that completely disrupts those normal operations.

Examples of disasters include natural threats and deliberate, human-made threats. Natural threats are uncontrollable events. They include earthquakes, fires, and floods. Human-made threats include sabotage and terrorist activities. Threats that can evolve into a disaster also include equipment, electrical, and communications infrastructure failure. Disasters are not predictable. Organizations cannot control these types of threats. All they can do is take measures to try to limit the damage caused by these events. Organizations also can make plans to respond to these types of events.

DR plans focus on how an organization recovers its IT systems after a disaster. These plans focus on IT systems only. They are the organization’s immediate response to restoring critical IT resources after a devastating disaster or event. DR is largely a function of IT and is part of a larger BC plan.

BC plans focus on how the organization continues its business during and after a disaster. These plans tend to be more comprehensive. They cover all parts of a business, not just IT systems. These plans address the period between the disaster and a return to normal operations.

The formal distinction between DR and BC plans is eroding. This is because organizations rely on IT systems for many of their critical functions. Some organizations build their whole business model on their IT systems. Today, most organizations consider DR and BC to be the same thing. You may find the terms used interchangeably. Most references recognize that a DR plan must be part of a comprehensive BC plan.

In this chapter, DR and BC plans will be discussed together, unless it is necessary to differentiate between the two. If it is necessary to differentiate between the two types of plans, the distinction will be made clear in the text.

DR/BC Team

The DR/BC team is responsible for creating an organization’s DR/BC policy and plans. Similar to the RA and IR teams, this team must include members from many areas of the organization. The people who are going to be responsible for carrying out the plan in the event of a disaster also should be included on the team.

A DR/BC plan is an organization-wide plan; however, specific departments may have secondary DR/BC plans. These department-level plans must be consistent with the organization’s DR/BC plan. An organization’s overall DR/BC plan has several goals:

  • Ensure that the organization’s employees are safe.
  • Minimize the organization’s amount of loss.
  • Recover critical business systems and infrastructure within a certain period.
  • Resume critical business operations within a certain period.
  • Repair or replace damaged facilities.
  • Return to normal operations.

The DR/BC team will help the organization create its DR/BC policy. These policies are very specific to an organization’s structure and culture.

A DR/BC team must include members from IT, HR, executive management, physical security, and legal counsel. It also must include the managers or owners of critical business processes. The team also should include the employees, and their backups, who will be in charge of directing the organization’s activities in a real disaster. It is important that backup personnel be included in case the people with primary DR/BC responsibilities are not available in a disaster.

One thing to keep in mind for DR/BC plans is that disasters are not always limited to the organization. A disaster in a geographic region may mean that both the organization and its employees will be affected by it. Employees must personally respond to the same disaster. If employees must respond to the disaster at home with their own families, they may not be in a position to help the organization respond to the disaster as well. An organizational DR/BC plan must acknowledge this risk.

DR/BC Plan Development

Many of the steps in the DR/BC planning process are similar to the RA and IR planning processes. Contingency planning is a part of risk management. The steps in the DR/BC planning process are:

  • Develop the DR/BC policy.
  • Conduct a business impact analysis.
  • Identify threats and potential controls.
  • Determine recovery strategy.
  • Design and maintain the plan.

The DR/BC team must make sure that policies and plans are in place that help the organization complete these steps. The DR/BC policy is a statement of executive management’s commitment to DR/BC planning. It should state the purpose and goals of DR/BC and define DR/BC roles and responsibilities. The policy also should include the organization’s resource requirements for any DR/BC plan.

After a DR/BC policy is approved, the DR/BC team must conduct a business impact analysis (BIA). A BIA identifies key business operations. It also identifies the resources that support those operations. The DR/BC team uses a BIA to estimate how long those critical operations and resources can be offline before the organization’s entire business is negatively affected.

To complete a BIA, the DR/BC team must:

  • Identify critical business processes—The DR/BC team must identify the organization’s critical business processes. There is no master list of processes that are critical to all organizations. Each organization is different. Some critical processes might include payroll, attendance scheduling, and customer service activities.
  • Identify IT resources that support critical business processes—The DR/BC team must identify the resources that support the organization’s critical business processes. Resources can include the organization’s communications infrastructure. They also can include individual IT systems and components.
  • Determine how long IT resources can be offline—The DR/BC team must identify the effect on business operations if a resource is disrupted or damaged and a critical process cannot run.
  • Determine recovery criticality—The DR/BC team must prioritize how the organization will handle IT resources and business processes following a disaster.

A BIA closely resembles a risk assessment. Many of the same tools and techniques used in an RA are used to complete a BIA. The team can interview employees throughout the organization to learn about its many business processes. The team also could send a questionnaire to a sampling of employees. The questionnaire could ask questions about business processes and resources.

Once the DR/BC team determines the organization’s critical processes and resources, it must figure out how long those processes and resources can be offline before the organization experiences irreparable harm. This period is called maximum tolerable downtime (MTD). Some processes and systems are so critical that they can be down only for a few minutes before an organization suffers irreparable damage. If these processes are offline longer than the MTD, the organization might fail. Processes and systems that are not essential to business operations may have an MTD of days or weeks.

The DR/BC team must determine the order in which IT resources will be reviewed and restored following the disaster. It uses the results of the BIA to make this determination. If an organization must resume a business process within a short period, it will need to make sure that it can put people and processes in place to recover the process within that period. The priority list of processes and resources helps executive management make recovery strategy decisions.

NOTE

Some resources use the term maximum acceptable outage (MAO) in place of MTD. The two terms mean the same thing.
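The relationship between MTD and recovery priority can be illustrated with a short Python sketch. It makes a simplifying assumption: processes with the shortest MTD are restored first. A real BIA may weigh other factors as well. The process names come from the examples earlier in this section, but the MTD values are invented for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class BusinessProcess:
    name: str
    mtd: timedelta  # maximum tolerable downtime from the BIA


def recovery_order(processes):
    """Assumed rule: restore the processes with the shortest MTD first."""
    return sorted(processes, key=lambda p: p.mtd)


def exceeds_mtd(process: BusinessProcess, outage: timedelta) -> bool:
    """True if the outage has lasted longer than the process can tolerate."""
    return outage > process.mtd


# Hypothetical MTD values for illustration only.
processes = [
    BusinessProcess("Payroll", timedelta(days=2)),
    BusinessProcess("Customer service", timedelta(hours=4)),
    BusinessProcess("Attendance scheduling", timedelta(weeks=1)),
]

for p in recovery_order(processes):
    print(p.name, p.mtd)

print(exceeds_mtd(processes[1], timedelta(hours=6)))  # True
```

A priority list of this kind is the sort of output that helps executive management make recovery strategy decisions.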

After the BIA is complete, the DR/BC team must identify threats and potential controls. This step is very similar to a risk assessment. In fact, the organization may have completed this exercise as part of an RA. If so, the DR/BC team can use those results at this step. The DR/BC team must identify the threats to the organization that have disaster potential. It also must identify potential controls to respond to those threats. These controls try to reduce the possibility of the organization experiencing a disaster. If a disaster cannot be avoided, these controls may lessen the amount of damage to the organization.

For example, an organization’s data center may be located in an area prone to tornadoes. If the organization cannot move its data center, it may try to fortify it. The organization might try to make the data center more resistant to wind-related damage to protect its business processes.

Some common preventative controls that an organization can implement include:

  • Fire detection and suppression systems
  • Backup generators or uninterruptible power supplies
  • Offsite storage of system backup media
  • Frequent backups of critical data
  • Extra equipment inventories for critical IT resources

The DR/BC team also must determine a recovery strategy. It consults with executive management to do this. An organization’s recovery strategy addresses the resources that it must recover after a disaster and the order in which those resources must be recovered. An organization will have to consider a wide variety of recovery strategies.

An organization must prepare recovery strategies for its:

  • Critical business processes—The organization must plan for recovering its business processes. It must understand all the workflow steps needed to complete a business process. It must know the resources and supplies needed to support these processes.
  • Facilities and supplies—The organization must make sure that it has a plan to restore its main facility. It also must restore the utilities needed to support that facility. Utilities include telecommunications and electrical infrastructure.

    Backup Site Options

    It is rare for an organization to experience a disaster that forces it out of its main facility for a long time. However, the organization still must plan for this possibility. It must have a location to which it can move its operations, which is called a backup site. An organization has several planning options for a backup site. It is important for you to know the differences between them.

    A mirrored site, a fully operational backup site, actively runs the organization’s IT processes in parallel with the organization’s main facility. In this way, a mirrored site is a redundant facility. An organization can immediately transfer all of its IT operations to the mirrored site, which is already staffed with the organization’s employees. This is the most expensive type of backup site to maintain. This type of backup site is appropriate for organizations that have a low MTD for critical processes. This type of backup site supports high availability.

    A hot site is an operational backup site that has all of the equipment and infrastructure that an organization needs to continue its business operations. The equipment in the hot site is fully compatible with the organization’s main facility. A hot site can become operational within minutes to hours after a disaster. However, it is not staffed with people, and it does not process data in parallel with the main facility. If needed, the organization must bring data backups to the hot site facility. A hot site, although expensive to maintain, may be the best choice for an organization that can afford some, but not a lot, of downtime.

    A warm site is a compromise between a hot site and a cold site. A warm site is space that contains some, but not all, of the equipment that an organization will need to continue operations in the event of a disaster. The warm site is partially prepared for operations, in that it has electricity and network connectivity. This type of site is more expensive than a cold site.

    A cold site is a backup site that is little more than reserved space. It is the least expensive type of backup site, as it does not have any equipment or hardware set up. Although it will have electrical service, it most likely will not have network connectivity. An organization will have to acquire equipment and infrastructure to make the site operational, so it can take weeks to get a cold site ready for business operations.

  • Employee environment—The organization must have plans in place for supporting its employees during a disaster. This means making sure that it has ways to communicate with employees during a disaster. The organization also must have plans in place to manage employee responsibilities until the organization can return to normal operations.
  • IT operations—The organization must have plans in place to resume its IT operations. This means making sure that infrastructure components are in place so that business can resume. The organization will want to have contracts with its vendors so that it can get replacement equipment quickly.
  • Data recovery—The organization must have a way to recover its data and operational information, as well as retrieve data from offsite storage facilities. It also must have plans to retrieve paper-based information from its main facility.
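
As a rough illustration of how the backup site options described above trade cost against readiness, the following Python sketch picks the least expensive site type that can be operational within a given MTD. The readiness times are assumptions chosen for illustration only. The text says that a mirrored site allows immediate transfer, a hot site takes minutes to hours, and a cold site can take weeks; it gives no figure for a warm site. The names BackupSite, SITES, and cheapest_site_for are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class BackupSite:
    name: str
    ready_within: timedelta  # assumed worst-case time to become operational
    relative_cost: int       # 1 = least expensive, 4 = most expensive


# Listed from least to most expensive, following the cost ranking in the text.
# All durations are illustrative assumptions, not figures from the chapter.
SITES = [
    BackupSite("cold", timedelta(weeks=4), 1),    # text: "can take weeks"
    BackupSite("warm", timedelta(days=3), 2),     # assumed; text gives no figure
    BackupSite("hot", timedelta(hours=12), 3),    # text: "minutes to hours"
    BackupSite("mirrored", timedelta(0), 4),      # text: immediate transfer
]


def cheapest_site_for(mtd: timedelta) -> BackupSite:
    """Return the least expensive site type that is ready within the MTD."""
    if mtd < timedelta(0):
        raise ValueError("MTD must be non-negative")
    for site in SITES:  # cheapest first
        if site.ready_within <= mtd:
            return site
    # Unreachable: the mirrored site has a readiness time of zero.
    raise AssertionError("no site matched")


if __name__ == "__main__":
    for mtd in (timedelta(hours=1), timedelta(days=1), timedelta(weeks=1)):
        print(mtd, "->", cheapest_site_for(mtd).name)
```

Under these assumed figures, an MTD of one day selects a hot site and an MTD of one week selects a warm site. A real decision would also weigh staffing, data currency, and contract terms, which this sketch ignores.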

A recovery strategy also must include the people who will implement it. A team with specialized skills and knowledge must head each recovery area. For example, the organization’s storage administrators should serve on the team in charge of data recovery. Each team must have a leader.

Once an organization develops its DR/BC plan, it must monitor and update it in response to changing conditions. An organization must update its plan any time its business processes, technology, or key personnel change. Organizations put contingency plans in place so that they can respond to events that adversely affect them. It is not enough to create a plan and put it away on a shelf for use “just in case.”

A DR/BC plan helps an organization respond to a disaster. These plans are an important part of an organization’s risk management activities. The DR/BC planning process is shown in FIGURE 14-3.

FIGURE 14-3
DR/BC planning process.

Testing the Plan

Organizations must test their contingency plans on a regular basis to make sure that the plans account for all critical business functions and processes. They also should test their contingency plans to make sure that the plans do not have any deficiencies. An organization must correct any deficiencies that testing uncovers.

Contingency plan testing has several objectives. They include:

  • Help employees become familiar with and accept the DR/BC plan.
  • Train employees how to respond during an emergency.
  • Identify weaknesses or deficiencies within the plan.
  • Make sure that all of the checklists and procedures needed to implement the plan are created and in place.
  • Make sure that all the resources and supplies needed to implement the plan are in place and are operational.
  • Make sure that all communications mechanisms work properly.
  • Make sure that all DR/BC teams are able to work well together.

NOTE

In the DR/BC plan context, a single point of failure is a step in the plan or an assumption within the plan that is critical to the performance of the entire plan. If that step or the assumption fails, then a critical portion of the plan, or the entire plan, could fail. Identifying single points of failure is a critical part of testing contingency plans.

Through testing, an organization can learn that it is missing key business process areas. It also can identify single points of failure within the plan and take steps to correct them. Any changes that are made to the DR/BC plan as part of the test review must be fully documented. Changes to the plan also must be communicated to all members of the organization.

There are five ways to test DR/BC plans. These tests also could be used to test an organization’s IR capability.

A checklist test is one of the most basic types of DR/BC tests. In this type of test, the DR/BC team makes sure that supplies and inventory items that are needed to execute the DR/BC plan are in place. This type of test makes sure that sufficient supplies are stored at backup facilities. It also makes sure that the organization has enough reference copies of the DR/BC plan and that all copies have current information.
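
The checks in a checklist test can be sketched as a simple comparison of what the plan requires against what is actually in place. The following Python example is a minimal sketch under stated assumptions: the function name run_checklist_test, its parameters, and the idea of tracking plan copies by a version label are all hypothetical and are not taken from the text.

```python
def run_checklist_test(required_supplies, supplies_on_site,
                       plan_copy_versions, current_version, min_copies):
    """Return a list of deficiencies; an empty list means the check passed.

    All parameter names and data shapes are illustrative assumptions.
    """
    deficiencies = []

    # Supplies and inventory items must be present in sufficient quantity.
    for item, needed in required_supplies.items():
        found = supplies_on_site.get(item, 0)
        if found < needed:
            deficiencies.append(f"{item}: need {needed}, found {found}")

    # There must be enough reference copies of the DR/BC plan.
    if len(plan_copy_versions) < min_copies:
        deficiencies.append(
            f"plan copies: need {min_copies}, found {len(plan_copy_versions)}"
        )

    # Every copy must contain current information.
    outdated = sum(1 for v in plan_copy_versions if v != current_version)
    if outdated:
        deficiencies.append(f"plan copies: {outdated} out of date")

    return deficiencies


if __name__ == "__main__":
    findings = run_checklist_test(
        required_supplies={"laptops": 5, "radios": 2},
        supplies_on_site={"laptops": 3, "radios": 2},
        plan_copy_versions=["v3", "v2"],
        current_version="v3",
        min_copies=3,
    )
    for finding in findings:
        print(finding)
```

Returning a list of deficiencies rather than a pass/fail flag reflects the testing objective of identifying weaknesses so that they can be corrected and documented.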

A walk-through test is often used with a checklist test. In this type of test, the DR/BC team “walks through” the entire DR/BC plan. They study each area of the plan to make sure that all of the assumptions and tasks stated in it are correct. This type of test also helps the people who are responsible for executing the DR/BC plan become very familiar with it. This type of test is sometimes called a tabletop walk-through test or tabletop test because the members of the team will sit around a table as they study the plan.

A simulation test is a more realistic version of a walk-through test. In this type of test, the organization role-plays a disaster scenario. The scope of these types of tests has to be carefully defined so that they do not negatively affect normal business activities. This test is designed to measure the effectiveness of employee notification procedures. Depending upon the scope of the test, an organization might try to measure how fast it can set up its backup site. It also could measure how fast its vendors can provide additional equipment.

FYI

Organizations with critical business functions that affect many customers tend to be serious about DR/BC planning and testing. AT&T has a Network Disaster Recovery Team, which is responsible for restoring voice and data network communications to an area that is affected by a disaster. The team conducts four DR exercises each year. It was deployed in 2017 to respond to wildfires in California. You can read about the team’s efforts at https://www.business.att.com/solutions/family/network-services/network-disaster-recovery.html.

A parallel test is designed to test the organization’s IT recovery processes. In this type of test, the organization tests its ability to recover its IT systems and its business data. The organization brings its backup sites online. It then uses historical business data to test how those systems operate. In this test, the organization tests both data processing and data recovery. During the test, the organization continues normal business operations at its main facility.

A full interruption test is designed to test the organization’s entire DR/BC plan. This test involves a scenario that destroys or severely damages the organization’s main facility. The organization must transfer all business and IT functions to its backup site. In this type of test, all normal business operations stop. Operations are shut down at the main site. They must be transferred to the backup site using the processes stated in the DR/BC plan.

A full interruption test is the most expensive kind of contingency plan test. It can help the organization learn a lot about its DR/BC plan’s effectiveness. However, it also has the potential to negatively affect the organization’s business. If the organization cannot get business operations resumed at the backup site, then the test itself can create a disaster situation for the organization. Organizations undertake these types of tests with great care.
