Chapter 17

Business Continuity

“Well, here’s another nice mess you’ve gotten me into!”

—Stan Laurel

Learning Objectives

After studying this chapter, you should be able to:

  • Present an overview of business continuity concepts, including the operation of business continuity management systems, the objectives for business continuity, and the essential components for maintaining business continuity.

  • Understand the key elements of a business continuity program.

  • Explain the concept of resilience in the context of business continuity.

  • Outline the elements of a business continuity plan.

  • Discuss performance analysis of a business continuity management system.

  • Describe the phases of business continuity operation following a disruptive event.

  • Present an overview of business continuity best practices.

A fundamental concern for all organizations is business continuity. An organization needs to perform essential functions during an emergency situation that disrupts normal operations and resume normal operations in a timely manner after the emergency has ended.

The International Organization for Standardization (ISO) has published a family of standards for business continuity management that enterprise security managers should be familiar with:

  • ISO 22300, Security and Resilience—Vocabulary: Provides a glossary of relevant terms.

  • ISO 22301, Business Continuity Management Systems—Requirements: Specifies requirements for setting up and managing an effective business continuity management system (BCMS). This is the first international standard focused exclusively on business continuity.

  • ISO 22313, Business Continuity Management Systems—Guidance: Provides guidance, where appropriate, on the requirements specified in ISO 22301 and provides recommendations (“should”) and permissions (“may”) in relationship to them.

  • ISO 22317, Business Continuity Management Systems: Guidelines for Business Impact Analysis (BIA): Provides guidelines (based on good international practice) for performing a business impact analysis (BIA), which is a requirement of ISO 22301 (clause 8.2). It provides guidance for establishing, implementing, and maintaining a formal and documented process for business impact analysis. It is applicable to all organizations, regardless of type, location, size, and nature of the organization.

  • ISO 22318, Business Continuity Management Systems: Guidelines for Supply Chain Continuity: Provides guidelines for applying business continuity management to the supply chain.

Two additional useful guidance documents are:

  • National Institute of Standards and Technology (NIST) SP 800-34, Contingency Planning Guide for Federal Information Systems: Provides a detailed description of the planning process.

  • European Union Agency for Network and Information Security’s (ENISA’s) IT Business Continuity Management: An Approach for Small and Medium Sized Organizations: Provides a detailed list of controls for implementing business continuity plans.

Section 17.1 introduces key concepts of business continuity management (BCM). Figure 17.1 provides a useful three-layer model for BCM, covering governance and policy, readiness, and operations. Sections 17.2 through 17.4 address these areas in turn.

An illustration shows the elements of business continuity management.
FIGURE 17.1 Elements of Business Continuity Management

Figure 17.2 provides another view of BCM, showing the flow between the major elements, as suggested in ISO 22301.

An illustration shows the ISO 22301 methodology for BCM.
FIGURE 17.2 ISO 22301 Methodology for BCM

BCM is a broad area that deals with all sorts of disasters, including natural disasters, health and safety incidents, and cyber attacks. ISO 27002, Code of Practice for Information Security Controls, specifically focuses on threats to information assets and information and communications technology (ICT) systems. This chapter describes BCM in general terms, with specific reference to information assets and ICT systems, where appropriate.

17.1 Business Continuity Concepts

This section provides an overview of business continuity. After providing a number of important definitions, the section surveys threats to business continuity and takes a general look at the typical enterprise approach to business continuity. The following definitions, based on those in ISO 22300, are relevant for the discussion in this chapter:

  • Business: For purposes of discussing business continuity, the operations and services performed by an organization in pursuit of its objectives, goals, or mission. As such, it is equally applicable to large, medium, and small organizations operating in industrial, commercial, public, and not-for-profit sectors.

  • Business continuity: The capability of an organization to continue delivering products or services at acceptable predefined levels following a disruptive incident. Business continuity embraces all the operations in a company, including how employees function in compromised situations.

  • Business continuity management (BCM): A holistic management process that identifies potential threats to an organization and the impacts to business operations those threats, if realized, might cause, and that provides a framework for building organizational resilience with the capability of an effective response that safeguards the interests of its key stakeholders, reputation, brand, and value-creating activities.

  • Business continuity management system (BCMS): Part of an overall management system that establishes, implements, operates, monitors, reviews, maintains, and improves business continuity. The management system includes organizational structure, policies, planning activities, responsibilities, procedures, processes, and resources.

  • Business continuity manager: An individual who manages, designs, oversees, and/or assesses an enterprise’s business continuity capability to ensure that the enterprise’s critical functions continue to operate following disruptive events.

  • Business continuity plan (BCP): The documentation of a predetermined set of instructions or procedures that describe how an organization’s mission/business processes will be sustained during and after a significant disruption.

  • Business continuity program: An ongoing management and governance process supported by top management and appropriately resourced to implement and maintain business continuity management.

Threats

It should be clear that a high priority for every organization is the ability to prevent, if possible, and recover rapidly, if necessary, from a substantial disruption of operations and/or resource availability. The importance of planning for business continuity is evident when considering the broad range of threats to continuity. These threats can be grouped as natural disasters, systems problems, cyber attacks, and human-caused disasters. The following list is based on threats defined in ENISA IT Business Continuity Management: An Approach for Small and Medium Sized Organizations [ENIS10] and the Federal Financial Institutions Examination Council’s Business Continuity Planning [FFIE15].

Natural Disasters

Threats in the natural disaster category include the following:

  • Accidental fire: Sources include wildfires, lightning, wastebasket fires, and short-circuits.

  • Severe natural event: This category includes damage resulting from an earthquake, a hurricane, a tornado, or other severe weather, such as extreme heat, cold, humidity, wind, or drought.

  • Accidental flood: Causes of flooding include pipe leakage from air-conditioning equipment, leakage from a water room on the floor above, an open fire nozzle, accidental triggering of sprinkler systems, a broken water main, and a window left open during a rainstorm.

  • Accidental failure of air conditioning: Failure, shutdown, or inadequacy of the air-conditioning service may cause assets requiring cooling or ventilation to shut down, malfunction, or fail completely.

  • Electromagnetic radiation: Electromagnetic radiation originates from an internal or external device, such as a radar, a radio antenna, or an electricity-generating station. It can interfere with the proper functioning of equipment or with the quality of service of wireless transmission and reception.

  • Air contaminants: Some disasters produce a secondary problem by polluting the air for a wide geographic area. Natural disasters such as flooding can also result in significant mold or other contamination after the water has receded. Nearby discharge or release of hazardous materials may also produce an airborne threat. The severity of these contaminants can affect air quality at an institution and may even result in evacuation for an extended period of time (for example, in the case of volcanic eruptions).

Systems Problems

Systems problems include the following:

  • Software malfunction: A design error, an installation error, or an operating error committed during modification can cause incorrect execution.

  • Equipment malfunction/failure: Threats include a logical or physical event that causes an equipment item to malfunction, as well as problems caused by failure to follow equipment qualification procedures after updates or upgrades or by use of equipment under conditions outside its operating limits (such as temperature or humidity).

  • Breach of information system maintainability: Lack of expertise in the system may make retrofitting and upgrading impossible. Examples are inability to correct an operating problem or respond to new needs, failure of external software and hardware maintenance companies, and termination of a support contract leaving a lack of competency or resources for system upgrades.

Cyber Attacks

Chapter 14, “Technical Security Management,” discusses the numerous technical threats to ICT systems, and Chapter 15, “Threat and Incident Management,” discusses cyber attacks in some detail, again with a focus on ICT systems.

With reference to the broader issue of business continuity, management must also be aware of threats to cyber-physical systems. NIST SP 1500-201, Framework for Cyber-Physical Systems: Volume 1, Overview, defines a cyber-physical device as a device that has an element of computation and interacts with the physical world through sensing and actuation. It defines a cyber-physical system (CPS) as a smart system that includes engineered interacting networks of physical and computational components. CPS generally involve sensing, computation, and actuation. CPS involve traditional information technology (IT), as in the passage of data from sensors to the processing of those data in computation. CPS also involve traditional operational technology (OT) for control aspects and actuation. The combination of these IT and OT worlds, along with associated timing constraints, is a particularly new feature of CPS. As organizations come to rely on CPS, they need to consider a range of threats to both the IT and OT aspects of CPS. A discussion of this complex area is beyond the scope of this book; see NIST SP 1500-202, Framework for Cyber-Physical Systems: Volume 2, Working Group Reports, for details.

Human-Caused Disasters

Human-caused threats include the following:

  • Theft of equipment

  • Deliberate fire

  • Deliberate flood

  • Deliberate loss of power supply

  • Deliberate failure of air conditioning

  • Destruction of equipment or media

  • Unauthorized use of equipment

  • Vandalism and explosive discharges

Business Continuity in Operation

In essence, business continuity management is concerned with mitigating the effects of disasters. Figure 17.3, based on figures in ISO 22313, illustrates the two ways in which business continuity management achieves that mitigation. The relative distances depicted in the figure imply no specific time scales. The gray curve shows the pace of recovery from a disaster with a business continuity plan in place, and the black curve shows the typical recovery pace without a business continuity plan.

A line graph of the level of business operations versus time shows the effectiveness of business continuity management.
FIGURE 17.3 Effectiveness of Business Continuity Management

When a disaster occurs, the worst-case scenario is that it has the potential to bring some business processes or functions to a complete halt. A business continuity plan includes resilience properties and quick or instantaneous switchover mechanisms that mitigate this initial impact. A business continuity plan also calls for the implementation of capabilities and procedures that result in more rapid restoration of operational capability.

Figure 17.3 also shows that the recovery process goes through three overlapping stages. Section 17.4 discusses this process.

Business Continuity Objectives

Enterprises undertake business continuity planning to reduce the consequences of any disruptive event to a manageable level. The specific objectives of a particular organization’s continuity plan may vary, depending on its mission and functions, its capabilities, and its overall continuity strategy. The Federal Emergency Management Agency’s Continuity Guidance for Non-Federal Entities [FEMA09] outlines the following objectives for business continuity management:

  • Minimize loss of life, injury, and property damage.

  • Mitigate the duration, severity, or pervasiveness of disruptions that do occur.

  • Achieve timely and orderly resumption of essential functions and the return to normal operations.

  • Protect essential facilities, equipment, records, and assets.

  • Be executable with or without warning.

  • Meet the operational requirements of the respective organization. Continuity plans may need to be operational within minutes of activation, depending on the essential function or service, but certainly should be operational no later than 12 hours after activation.

  • Meet the sustainment needs of the respective organization. An organization may need to plan for sustained continuity operations for up to 30 days or longer, depending on resources, support relationships, and the respective continuity strategy adopted.

  • Ensure the continuous performance of essential functions that require additional considerations beyond traditional continuity planning (such as pandemic influenza).

  • Provide an integrated and coordinated continuity framework that takes into consideration other relevant organizational, governmental, and private-sector continuity plans and procedures.

Essential Components for Maintaining Business Continuity

An organization’s resilience is directly related to the effectiveness of its business continuity capability. An organization’s continuity capability rests on the following key components that are essential to maintaining business continuity:

  • Management: Continuity of management is critical to ensure continuity of essential functions. An organization should have a detailed contingency plan that indicates a clear line of succession so that designated backup individuals have the authority needed to maintain continuity when key managers are unavailable.

  • Staff: There is a twofold requirement with respect to staff. First, all staff should be trained on how to maintain continuity of operations (COOP) or restore operations in response to an unexpected disruption. Second, the organization should develop guidelines for vertical training and cross training so that staff can take on functions of peers and those above and below them in the reporting hierarchy, as needed.

    continuity of operations (COOP)

    An effort in an organization to ensure that it can continue to perform the essential business functions during a wide range of emergencies, including localized acts of nature, accidents, and technological or attack-related emergencies.

  • ICT systems: A top priority following a disruption is communications, both internal and external. Communication systems and technology should be interoperable, robust, and reliable. An organization should identify critical IT systems and have backup and rollover capabilities tested and in place.

  • Buildings and equipment: This component includes the buildings where essential functions are performed. Organizations should have separate backup locations available where management and business process functions can continue during disruptions that in some way disable the primary facility. This component also covers essential equipment and utilities.

17.2 Business Continuity Program

Recall from Chapter 2, “Security Governance,” that an information security program consists of the management, operational, and technical aspects of protecting information and information systems. It encompasses policies, procedures, and the management structures and mechanisms for coordinating security activity. A business continuity program, as defined at the beginning of this chapter, encompasses these considerations, although it is not limited to ICT systems but covers the broader business continuity area.

Governance

Business continuity governance is concerned with establishing and maintaining management structures and processes that provide a framework for maintaining business continuity in response to major security incidents and disasters.

A typical process for establishing the management framework includes the following tasks:

  • Executive management meets to define objectives and goals of a business continuity strategy and policy.

  • Senior management appoints a business continuity director and a BCM steering committee.

  • Business continuity specialists prepare a business/process effort chart, showing level of effort and time, as well as a project plan. Key items include:

    • Identifying key/critical services

    • Determining exclusions from the BCM scope

    • Determining implementation timeline goals

  • Executive management communicates to all directors and managers about the upcoming business continuity planning program.

  • Department heads commit to goals of business continuity planning.

  • Department directors and managers appoint point-of-contact individuals.

  • The business continuity director meets with department directors and managers to discuss objectives.

Business Impact Analysis

SP 800-34 defines business impact analysis (BIA) as analysis of an information system’s requirements, functions, and interdependencies used to characterize system contingency requirements and priorities in the event of a significant disruption. A BIA helps identify and prioritize information systems and components that are critical to supporting the organization’s mission/business processes.

A typical BIA includes the following steps:

  1. Inventory key business elements, including:

    • Business processes

    • Information systems/applications

    • Assets

    • Personnel

    • Suppliers

  2. Develop intake forms to gather consistent information, interview key experts throughout the business, and get information from inventories.

  3. Assess and prioritize all business functions and processes, including their interdependencies.

  4. Identify the potential impact of business disruptions resulting from uncontrolled, nonspecific events on the institution’s business functions and processes.

  5. Identify the legal and regulatory requirements for the institution’s business functions and processes.

  6. Determine the maximum tolerable downtime (MTD) for each business process.

  7. Calculate a reasonable recovery time objective (RTO) and recovery point objective (RPO) for each business process. The processes with the shortest MTD or RTO are the most critical business processes. Get agreement from senior management.

maximum tolerable downtime (MTD)

The amount of time after which an organization’s viability is irrevocably threatened if product and service delivery are not resumed.

recovery time objective (RTO)

The target time set for resumption of product, service, or activity delivery after an incident. It is the maximum allowable downtime that can occur without severely impacting the recovery of operations or the time in which systems, applications, or business functions must be recovered after an outage (for example, the point in time at which a process can no longer be inoperable).

recovery point objective (RPO)

The amount of data that can be lost without severely impacting the recovery of operations or the point in time in which systems and data must be recovered (for example, the date and time of a business disruption).

The result of a BIA is to identify time-sensitive processes and the requirements to recover them in the time frame that is acceptable to the entity.
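
The final steps of the BIA can be illustrated with a short Python sketch. It is not part of SP 800-34 or any standard; the class name ProcessBIA, the helper functions, and all sample figures are hypothetical. It assumes that every duration is expressed in hours, that a consistent analysis keeps each RTO at or below the corresponding MTD (which follows from the definitions above), and that "most critical" is ordered by shortest MTD with ties broken by shortest RTO, which is only one reasonable reading of step 7.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProcessBIA:
    """BIA results for one business process (hypothetical structure, hours)."""
    name: str
    mtd_hours: float  # maximum tolerable downtime
    rto_hours: float  # recovery time objective
    rpo_hours: float  # recovery point objective, as a data-loss window


def validate(p: ProcessBIA) -> List[str]:
    """Return consistency problems; an empty list means none were found."""
    problems = []
    if min(p.mtd_hours, p.rto_hours, p.rpo_hours) < 0:
        problems.append(f"{p.name}: durations must not be negative")
    # Assumption: a recovery target later than the MTD would defeat its purpose.
    if p.rto_hours > p.mtd_hours:
        problems.append(f"{p.name}: RTO exceeds MTD")
    return problems


def rank_by_criticality(items: List[ProcessBIA]) -> List[ProcessBIA]:
    """Most critical first: shortest MTD, ties broken by shortest RTO."""
    return sorted(items, key=lambda p: (p.mtd_hours, p.rto_hours))


# Illustrative figures only; real values come from interviews and inventories.
processes = [
    ProcessBIA("payroll", mtd_hours=72, rto_hours=48, rpo_hours=24),
    ProcessBIA("order entry", mtd_hours=4, rto_hours=2, rpo_hours=0.25),
    ProcessBIA("email", mtd_hours=24, rto_hours=8, rpo_hours=4),
]

for p in rank_by_criticality(processes):
    status = "; ".join(validate(p)) or "consistent"
    print(f"{p.name}: MTD={p.mtd_hours}h RTO={p.rto_hours}h RPO={p.rpo_hours}h ({status})")
```

A ranked list of this kind is a convenient artifact to present when seeking agreement from senior management, because it makes the proposed recovery priorities explicit.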

Risk Assessment

An organization needs to perform a risk analysis on each critical process to identify any vulnerabilities that exist, along with steps to mitigate those vulnerabilities. Chapter 3, “Information Risk Assessment,” discusses this process in detail for security management, and the same process applies for business continuity.

In essence, business continuity risk assessment addresses three questions: What can go wrong? What is the likelihood that the undesired event might occur? and What would be the impact should it occur? FEMA’s Continuity Guidance for Non-Federal Entities [FEMA09] defines the following critical steps in the risk assessment process:

  1. Inventory the essential functions provided by the organization. These are the functions whose interruption causes unacceptable business impact.

  2. Identify the threats that can impact delivery of the essential functions. This step includes exploring potential natural events, systems events, intentional human events, and non-intentional human-caused events that could adversely affect the ability of the organization to perform its essential functions.

  3. Develop continuity hazard scenarios. Perform all of these assessment steps within the context of a set of scenarios, each of which is a unique combination of a particular hazard and the organization’s essential functions. Within each scenario, consider risks to the four key elements (management, staff, ICT systems, and facilities), as appropriate. Scenario risk assessment includes the following steps:

    1. Determine the risk information needed to assess the risk. Describe the information necessary to assess the risk for each scenario. For each information item, specify the information type, precision, and certainty required, as well as the analysis resources available.

    2. Assess the risk. For each scenario, assess the threat, vulnerability, and consequence, where:

      • Threat is the likelihood of a type of attack that might be attempted or that the scenario will occur.

      • Vulnerability is the likelihood that an attacker would succeed with a particular attack type or that the scenario will result in the expected level of consequence.

      • Consequence is the potential impact of a particular attack or the negative impact of the scenario.

  4. Identify existing safeguards/countermeasures. For each scenario, identify the existing safeguards that are in place to reduce either the likelihood (for example, security countermeasures) or consequence (for example, redundant capabilities) of the hazard.
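
The scenario assessment described above can be sketched in Python as follows. The FEMA guidance names threat, vulnerability, and consequence but the text here does not prescribe how to combine them; multiplying the three is a common convention and is an assumption of this sketch. The Scenario class and every number in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One hazard paired with one essential function (hypothetical structure)."""
    hazard: str
    essential_function: str
    threat: float         # likelihood that the scenario occurs, 0.0 to 1.0
    vulnerability: float  # likelihood of the expected consequence, 0.0 to 1.0
    consequence: float    # impact on any consistent scale, e.g. estimated loss

    def __post_init__(self):
        # Likelihoods are treated as probabilities in this sketch.
        for field_name in ("threat", "vulnerability"):
            value = getattr(self, field_name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{field_name} must be between 0 and 1")

    def risk(self) -> float:
        """Assumed combination rule: threat x vulnerability x consequence."""
        return self.threat * self.vulnerability * self.consequence


# Illustrative values only; real inputs come from the assessment team.
scenarios = [
    Scenario("flood", "data center operations", 0.125, 0.5, 200_000),
    Scenario("ransomware", "order processing", 0.25, 0.5, 160_000),
    Scenario("power outage", "customer support", 0.5, 0.25, 40_000),
]

ranked = sorted(scenarios, key=Scenario.risk, reverse=True)
for s in ranked:
    print(f"{s.hazard} / {s.essential_function}: risk score {s.risk():,.0f}")
```

Existing safeguards identified in step 4 would enter such a model by lowering the threat or vulnerability value (for countermeasures) or the consequence value (for redundant capabilities).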

Table 17.1, from ENISA IT Business Continuity Management: An Approach for Small and Medium Sized Organizations [ENIS10], provides a set of categories that an assessment team can use to evaluate the organization’s risk profile.

TABLE 17.1 Risk Profile Evaluation Table

| Risk Area | High | Medium | Low |
| --- | --- | --- | --- |
| Legal and Regulatory | | | |
| Sensitive/personal customer data | Handles sensitive/personal customer data | Handles personal customer data | Does not handle personal customer data |
| Loss/destruction of such data | Will lead to significant legal fines | Will lead to legal fines | N/A |
| Failure to meet agreed service level agreements (SLAs) with customers | Will result in non-frivolous lawsuits | May result in non-frivolous lawsuits | N/A |
| Productivity | | | |
| Services and operational processes | Highly dependent on information systems, applications, and third-party services | Dependent on information systems, applications, and third-party services | Not directly dependent on information systems, applications, and third-party services |
| Interruptions to the provision of the aforementioned | Requires significant expenses and effort to resume business and recover from market loss | Organization can use backup procedures for a limited time without significant productivity effect | Organization can use backup procedures for a time without productivity effect |
| Financial Stability | | | |
| Unavailability of products of less than one day | Major one-time financial loss | Significant one-time financial loss | No financial loss |
| Revenues related to continuous provision of online services | Directly | Indirectly | Not related |
| Unavailability of online presence | Will lead to direct financial loss | Will not lead to direct financial loss | Will not lead to direct or indirect loss |
| Fines due to noncompliance with legal and regulatory requirements | May lead to intolerable financial loss | Possible but will not affect financial stability | No or marginal fines |
| Reputation and Loss of Customer Confidence | | | |
| Unavailability of service | Significant loss of customers | Considerable loss of customers | Marginally noticed by customers |
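
One way an assessment team might record its ratings against the four risk areas of Table 17.1 is sketched below in Python. The rule used to derive an overall profile (take the highest rating found in any risk area) is a conservative convention assumed for this sketch and is not stated in this chapter; the names Level, RISK_AREAS, and overall_profile are hypothetical, as are the sample ratings.

```python
from enum import IntEnum
from typing import Dict


class Level(IntEnum):
    """Rating columns of Table 17.1, ordered so that HIGH compares greatest."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


RISK_AREAS = (
    "Legal and Regulatory",
    "Productivity",
    "Financial Stability",
    "Reputation and Loss of Customer Confidence",
)


def overall_profile(ratings: Dict[str, Level]) -> Level:
    """Return the highest rating across all four risk areas.

    Assumption: the worst-rated area determines the overall profile.
    """
    missing = [area for area in RISK_AREAS if area not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {', '.join(missing)}")
    return max(ratings[area] for area in RISK_AREAS)


# Illustrative ratings for a hypothetical online retailer.
example = {
    "Legal and Regulatory": Level.MEDIUM,
    "Productivity": Level.HIGH,
    "Financial Stability": Level.MEDIUM,
    "Reputation and Loss of Customer Confidence": Level.LOW,
}

print(f"Overall risk profile: {overall_profile(example).name}")
```

Requiring a rating for every area guards against an incomplete assessment silently producing a low overall profile.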

Business Continuity Strategy

A business continuity strategy is a conceptual summary of preventive and recovery strategies that must be carried out between the occurrence of a disaster and the time when normal operations are restored. Strategy design involves understanding the requirements gathered during the business impact analysis and risk assessment and effectively translating them into actionable strategies. Furthermore, it involves considering the costs/benefits of any proposed strategy.

Figure 17.4 illustrates the type of trade-off that management needs to consider. The cost of disruption derives from the business impact analysis and risk assessment. Against that is the cost of resources to implement a business continuity program. Typically, the longer a disruption continues, the more costly it becomes for the organization. But the shorter the RTO, the higher the cost of the resources needed to achieve it. For example, for short recovery times, an organization may require a mirror data site that is always active and updated, whereas a longer RTO may enable the enterprise to rely on a less costly tape backup system.

A line graph shows the business continuity strategy.
FIGURE 17.4 Cost Balancing for Business Continuity Management
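
The trade-off can be made concrete with a small Python sketch. It assumes a deliberately simple model that is not taken from ISO 22301: disruption cost grows linearly with downtime, each option reliably achieves its stated RTO, and a known number of disruptions occurs per year. The option names (including the intermediate warm standby site), the RecoveryOption class, and all cost figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryOption:
    """A candidate recovery strategy (hypothetical structure and values)."""
    name: str
    rto_hours: int    # recovery time the option is expected to achieve
    annual_cost: int  # yearly cost of keeping the option ready


def total_annual_cost(option: RecoveryOption,
                      loss_per_hour: int,
                      disruptions_per_year: int) -> int:
    """Resource cost plus expected disruption cost under a linear-loss model."""
    disruption_cost = option.rto_hours * loss_per_hour * disruptions_per_year
    return option.annual_cost + disruption_cost


# Illustrative figures only; real inputs come from the BIA and risk assessment.
OPTIONS = [
    RecoveryOption("mirrored data site", rto_hours=1, annual_cost=500_000),
    RecoveryOption("warm standby site", rto_hours=24, annual_cost=150_000),
    RecoveryOption("tape backup", rto_hours=96, annual_cost=30_000),
]
LOSS_PER_HOUR = 4_000      # assumed cost of one hour of disruption
DISRUPTIONS_PER_YEAR = 1   # assumed frequency

best = min(OPTIONS,
           key=lambda o: total_annual_cost(o, LOSS_PER_HOUR, DISRUPTIONS_PER_YEAR))

for o in OPTIONS:
    cost = total_annual_cost(o, LOSS_PER_HOUR, DISRUPTIONS_PER_YEAR)
    print(f"{o.name}: RTO {o.rto_hours}h, total annual cost {cost:,}")
print(f"Lowest total cost: {best.name}")
```

With these assumed figures, neither the fastest nor the cheapest option minimizes total cost, which is the balance point that the figure is meant to convey. In practice the chosen RTO must also stay within the MTD identified in the business impact analysis.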

The business continuity strategy does not indicate specific security controls; that is the task of the business continuity plan. Rather, the business continuity strategy is at a higher, strategic, level and provides overall guidance.

ISO 22301 divides business continuity strategy into three categories: determination and selection, resource requirements, and protection and mitigation.

Determination and Selection

The first category of developing a business continuity strategy, and the first part of a business continuity document, consists of determining possible business continuity strategies based on the business impact analysis and risk assessment.

ISO 22301 calls out three areas to be considered in developing the strategy:

  • Protecting prioritized activities: For activities deemed significant for maintaining continuity, the organization needs to look at the general strategic question of how each activity is carried out. The goal is to determine a strategy that reduces the risk to the activity. The organization should also consider alternatives, including outsourcing (for example, using cloud services) or fundamentally altering the activity to avoid risk.

  • Stabilizing, continuing, resuming, and recovering prioritized activities and their dependencies and supporting resources: The organization should provide more detailed options for managing each prioritized activity during the business continuity process. Examples include:

    • Temporarily relocating an activity to a backup site or relocating some of the IT and other resources that support the activity

    • Using redundant equipment and other resources during the business continuity process

    • Making substitutions for the normal activity that may involve different personnel, different resources, and/or different processes

  • Mitigating and responding to impacts: Finally, the organization should spell out its strategies for containing the damage to the organization from disasters. These strategies may include insurance, preplanned replacement/repair service, and a plan for maintaining the company’s reputation.

Resource Requirements

The purpose of the resource requirements category is to determine the resources necessary to implement each of the business continuity strategy categories. ISO 22301 lists the following types of resources to consider:

  • People: Consider a number of questions, such as the following:

    • Does there need to be one or more dedicated business continuity officers?

    • What level of effort is required of other employees to participate in business continuity implementation and the business continuity process?

    • What resource commitment is needed for awareness programs and training?

  • Information and data: Estimate the resources needed to maintain backup and redundant copies of critical information assets.

  • Buildings, work environment, and associated utilities: Include the cost of hardening or protecting resources as well as the cost of any standby or fallback facilities that the organization maintains.

  • Facilities, equipment, and consumables: Include estimates for protecting and providing redundancy.

  • ICT systems: Estimate resources for protecting and providing redundancy for ICT systems.

  • Transportation: Consider the possibility of moving equipment and personnel for the duration of the response and recovery phase.

  • Finance: Determine options to ensure needed financing for the duration of an incident to meet extra expenses associated with response and recovery.

  • Partners and suppliers: Indicate what commitments are needed from partners and suppliers and what the cost to the organization will be.

Protection and Mitigation

Both ISO 22301 and ISO 22313 refer to protection and mitigation as part of developing a strategy. This category is best viewed as the culmination of developing a strategy, when those involved in developing the business continuity strategy submit the strategy options and recommendations to management for feedback, selection, and approval. With the information provided, management can evaluate the cost/benefit analysis to determine the optimal strategies, based on requirements and the organization’s risk appetite.

17.3 Business Continuity Readiness

Business continuity readiness refers to the capability of an organization and its assets to respond to, manage, and recover from a disruptive event. This section looks at the actions taken to prepare for such disruptive and disastrous events.

Awareness

An awareness program ensures that an organization’s personnel are aware of the importance of business continuity and understand their roles in maintaining business continuity. An organization should ensure that all staff learn about business continuity as part of the induction program for new hires and then on an ongoing basis. The objectives of an awareness program include:

  • Establishing objectives of a BCM awareness and training program

  • Identifying functional awareness and training requirements

  • Recognizing appropriate internal and external audiences

  • Developing awareness and training methodology

  • Identifying, acquiring, or developing awareness tools

  • Leveraging external awareness opportunities

  • Overseeing the delivery of awareness activities

  • Establishing the foundation for evaluating the program’s effectiveness

  • Communicating implications of not conforming to BCM program requirements

  • Ensuring continual improvement of the BCM program

  • Confirming that personnel are aware of their roles and responsibilities in the BCM program

An awareness session should cover the following topics:

  • An overview of what BCM is

  • Why BCM is important to the organization

  • The staff’s role in an emergency

  • What staff should do if the BCM plan is invoked

  • The emergency contact numbers

  • Identification and escalation of incidents

  • Triggers for incident response and activation of the business continuity plan(s)

  • How to respond to special events

  • Measures to be taken during site evacuation

Training

Training provides skills and familiarizes leadership and staff with the procedures and tasks to perform in executing continuity plans. FEMA’s Continuity Guidance for Non-Federal Entities [FEMA09] recommends that a training program include the following:

  • Annual training for personnel (including host or contractor personnel) who are assigned to activate, support, and sustain continuity operations

  • Annual training for the organization’s leadership on that organization’s essential functions, including training on individual position responsibilities

  • Annual training for all organization personnel who assume the authority and responsibility of the organization’s leadership if that leadership is incapacitated or becomes otherwise unavailable during a continuity situation

  • Annual training for all pre-delegated authorities for making policy determinations and other decisions, at the field, satellite, and other organizational levels, as appropriate

  • Personnel briefings on organization continuity plans that involve using or relocating to continuity facilities, existing facilities, or virtual offices

  • Annual training on the capabilities of communications and IT systems to be used during an incident

  • Annual training regarding identification, protection, and availability of electronic and hard copy documents, references, records, information systems, and data management software and equipment (including sensitive data) needed to support essential functions during a continuity situation

  • Annual training on an organization’s devolution option for continuity to address how each organization identifies and conducts its essential functions during an increased threat situation or in the aftermath of a catastrophic emergency

  • Annual training for all reconstitution plans and procedures to resume normal organization operations from the original or replacement primary operating facility

In terms of the specific content of training, the National Emergency Crisis and Disasters Management Authority’s Business Continuity Management Standard and Guide [NCEM12] recommends the following:

  • Include procedures for evacuation, shelter-in-place, check-in at the evacuation site, responsibility toward employees, activation and preparation of alternative work sites, and handling of requests for information by internal and external stakeholders

  • Provide response and recovery teams education and training on their responsibilities and duties, including how to interact with first responders

Resilience

Resilience of the infrastructure, assets, and procedures of an enterprise—referred to as information system resilience—improves the organization’s ability to withstand and recover from disruptive events.

information system resilience

The ability of an information system to continue to (1) operate under adverse conditions or stress, even if in a degraded or debilitated state, while maintaining essential operational capabilities and (2) recover to an effective operational posture in a time frame consistent with mission needs.

business resilience

The ability an organization has to quickly adapt to disruptions while maintaining continuous business operations and safeguarding people, assets, and overall brand equity. Business resilience goes a step beyond disaster recovery, offering post-disaster strategies to avoid costly downtime, shore up vulnerabilities, and maintain business operations in the face of additional, unexpected breaches.

The IBM white paper Resilient Infrastructure: Improving Your Business Resilience [GOBL02] defines elements of business resilience. The first three are primarily defensive in nature but are the common strategies used by enterprises and a necessary part of business continuity management:

  • Recovery: The provision for safe, rapid, offsite data recovery in the event of a disaster

  • Hardening: The fortification of all or part of an infrastructure to make it less susceptible to natural disaster, employee error, or malicious actions

  • Redundancy: The duplication of all or part of the infrastructure to supply hot, active backup service in the event of an unanticipated event

Resilient Infrastructure: Improving Your Business Resilience [GOBL02] also defines three offensive measures that go beyond traditional approaches to resilience:

  • Accessibility: If the primary work site is inaccessible, accessibility measures enable enterprise personnel, partners, and customers to access the infrastructure from other locations. These measures include the deployment of diverse communication technologies (for example, wireless, fax, email, instant messaging).

  • Diversification: In order to decrease the probability that a single disaster will significantly degrade business operations, diversification measures entail the physical distribution of resources (hard assets and people) and implementation of diverse communication pathways. These measures should create an operational infrastructure that is physically distributed but capable of being managed as if it were a single consolidated entity.

  • Autonomation: This refers to the inclusion of self-managed hardware and software components in the infrastructure. These products make decisions without human intervention or, at a minimum, bypass a problem and alert a human attendant to initiate appropriate action. Many such products are available today, and more will be introduced in the near future. As technology progresses, resilient infrastructures will contain more autonomic components with self-configuring, self-healing, self-protecting, and self-optimizing capabilities.

Control Selection

Control selection, as discussed in other chapters, is the selection of specific measures related to assets and operations that meet the security objective. ENISA IT Business Continuity Management: An Approach for Small and Medium Sized Organizations [ENIS10] provides a comprehensive set of controls in two categories: organizational continuity controls and asset-based continuity controls. There are 5 sets of organizational continuity controls, each containing a number of specific controls, for a total of 39 controls. These are the categories:

  • Business continuity management: Includes controls that require the organization’s business strategies to routinely incorporate business continuity considerations

  • Business continuity policy, plans, and procedures: Requires an organization to have a comprehensive set of documented, current business continuity policies, plans, and procedures that are periodically reviewed and updated

  • Test business continuity plan: Incorporates security controls in order to complete a test simulation of the continuity plan to ensure its smooth running if the time comes to implement it

  • Sustain business continuity management: Includes controls that require staff members to understand their security roles and responsibilities. Security awareness, training, and periodic reminders should be provided for all personnel

  • Service providers/third parties business continuity management: Includes security controls that enforce documented, monitored, and enforced procedures for protecting the organization’s information when working with external organizations

The following is an example of a control in the business continuity policy, plans, and procedures category:

The organization has a comprehensive business continuity plan, which is periodically reviewed and updated. The plan addresses key business continuity topic areas, including:

  • Critical Business Functions Priority List

  • Critical Business Functions IT Infrastructure Dependencies

  • Contact List(s) with Business Continuity Manager/Team

  • Critical Business Functions Protection & Recovery Strategy

  • Business Continuity Relative Procedures (Incident Response, Emergency, etc.)

  • Testing Reassessing and Maintaining Business Continuity Plan

  • Critical Suppliers List & Contact Details

The asset-based continuity controls are more extensive and comprise 92 controls in 5 sets:

  • Hardware and network: Covers resilience, backup, redundancy, and recovery actions

  • Application: Covers resilience, backup, and recovery actions

  • Data: Covers data storage, data backup, and recovery actions

  • People: Covers physical security, awareness and training, and recovery actions

  • Facilities: Covers IT site, environmental security, physical security, and recovery actions

The following is an example of a control in the application category:

Application Backup: Control requires that there is a documented backup procedure that is routinely updated, periodically tested, that calls for regularly scheduled backups of application software and requires periodic testing and verification of the ability to restore from backups. The control requires that the organization performs via the procedure a full Backup of the application files, database and any other available application modules. When this control is applied to a service (such as email or internet provisioning) then establishing an alternate backup service is required in addition to backing up any relevant data. When considering backup of services that produce or store data the ability to have a usable local copy of the data or to transfer existing data to the backup service has to be considered and evaluated.
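The periodic verification of the ability to restore from backups that this control calls for could be partly automated. The following is a minimal sketch, not part of the ENISA guidance: it assumes that a backup has been restored to a scratch directory and compares file checksums against the live application files. The function names and the sample files are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list[str]:
    """Return relative paths whose restored copy is missing or differs."""
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        restored = restored_dir / rel
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(rel.as_posix())
    return problems


# Demonstration with throwaway files (hypothetical application layout).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "app"
    (source / "db").mkdir(parents=True)
    (source / "app.cfg").write_text("mode=prod\n")
    (source / "db" / "data.sql").write_text("INSERT INTO t VALUES (1);\n")

    restored = root / "restored"
    shutil.copytree(source, restored)           # stands in for a real restore
    clean_result = verify_restore(source, restored)

    (restored / "db" / "data.sql").write_text("truncated")  # simulate damage
    damaged_result = verify_restore(source, restored)

print(clean_result)
print(damaged_result)
```

A check of this kind covers only file integrity. Whether the restored application actually starts and serves requests still has to be confirmed by a test of the restored system.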

An organization should use the business impact analysis and the risk assessments as inputs to a selection process for determining the cost/benefit of each control in order to make an optimal selection.
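One way to picture this selection process is as a budget-constrained ranking. The sketch below assumes that the business impact analysis and risk assessment yield, for each candidate control, an estimated annual cost and an estimated reduction in expected annual loss. The control names, figures, and the greedy ranking rule are illustrative assumptions; a greedy pick is a simple heuristic and is not guaranteed to find the best possible combination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Control:
    name: str
    annual_cost: float      # estimated cost to implement and operate (assumed > 0)
    loss_reduction: float   # estimated reduction in expected annual loss

    @property
    def net_benefit(self) -> float:
        return self.loss_reduction - self.annual_cost


def select_controls(candidates, budget):
    """Greedy selection by benefit/cost ratio within a budget.

    Controls whose cost exceeds their estimated benefit are dropped first.
    """
    chosen, remaining = [], budget
    worthwhile = [c for c in candidates if c.net_benefit > 0]
    ranked = sorted(worthwhile,
                    key=lambda c: c.loss_reduction / c.annual_cost,
                    reverse=True)
    for control in ranked:
        if control.annual_cost <= remaining:
            chosen.append(control)
            remaining -= control.annual_cost
    return chosen


# Hypothetical figures for illustration only.
candidates = [
    Control("Application backup", 20_000, 90_000),
    Control("Redundant network link", 35_000, 60_000),
    Control("Second data center", 400_000, 250_000),
    Control("Staff awareness training", 10_000, 30_000),
]
selected = select_controls(candidates, budget=60_000)
print([c.name for c in selected])
```

In practice, the organization’s risk appetite would also shape the choice, for example by making some controls mandatory regardless of their ratio.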

Business Continuity Plan

Whereas a business continuity strategy provides an overall view of an enterprise’s approach to business continuity management, a business continuity plan establishes documented procedures and resources for preparing for and responding to disruptive incidents.

ISO 22301 requires that an organization produce a business continuity plan or an interrelated set of business continuity plans. The organization should establish documented procedures for responding to a disruptive incident and identify how it will continue or recover its activities within a predetermined time frame. The plan or plans should address the requirements of those who will use them. ISO 22313 provides guidelines on the development of the plan or plans.

There is no single approach that all organizations can use to develop and document business continuity procedures. The end goal is to create a response structure, warning and communication procedures, and recovery plans that result in a repeatable, effective response and recovery process that can be invoked and executed without delay following the onset of a disruptive incident.

The Western Australian Government’s Business Continuity Management Guidelines [WAG15] provides useful guidance on the content of a set of plans that covers all phases of business continuity operations, consisting of an overview document, an emergency response plan, a crisis management plan, and a set of recovery and restoration plans (see Figure 17.5). The following sections provide plan outlines from Business Continuity Management Guidelines.

FIGURE 17.5 Components of BCM Plan Documentation

BCM Plan Overview

A BCM plan overview is a description of the framework, policy, processes, and overall strategies for providing for business continuity readiness and performing business continuity operations. The overview document does not provide specific guidance on dealing with disruptions. Rather, it documents the organization’s approach to business continuity.

As an example, per Business Continuity Management Guidelines [WAG15], a BCM plan overview document may contain the following sections:

  1. Version Control Information

  2. Distribution List

  3. Purpose of the BCM Plan

  4. Objectives of the BCM Plan

  5. BCM Policy

  6. BCM Process Overview

  7. Critical Business Activities

    1. Maximum Acceptable Outage

    2. Interdependencies

  8. Business Continuity Strategies and Requirements

    1. Broad Strategies

    2. Resource Requirements

    3. Systems and Applications Requirements

  9. Response Options

    1. Planning Parameters

    2. Business Continuity Site

  10. Response Plan

    1. Guiding Principles

    2. Crisis Management Organization

      1. Crisis Management Team

      2. On Scene Response Team

      3. Crisis Support Teams

      4. Business Continuity Teams

      5. IT Disaster Recovery Team

    3. Notification and Escalation Process

    4. Command Centre

  11. Training, Exercise and Maintenance

    1. Training Requirements and Protocols

    2. Exercise Requirements and Protocols

    3. Maintenance Requirements and Protocols

Emergency Response Plan

An emergency response plan covers actions that should take place immediately following a critical incident for the protection of people and assets.

According to Business Continuity Management Guidelines [WAG15], an emergency response plan document may contain the following sections:

  1. Introduction

    1.1. Definitions

    1.2. Purpose

  2. Emergency Reporting Procedures

    2.1. Basic Reporting Procedures

    2.2. Priorities of Directive

    2.3. Emergency Telephone Numbers

  3. Prevention

    3.1. Fire Prevention

    3.2. Accident Prevention

  4. First Aid

  5. Responding to Emergencies

    5.1. Fire Emergency

    5.2. Earthquake Emergency

    5.3. Bomb Threats

    5.4. Robberies and Hold-ups

    5.5. Kidnapping – Hostage Situation

Crisis Management Plan

A crisis management plan provides guidance on dealing with disruptive incidents after the initial emergency response. Such a plan should provide guidance on quickly developing an organized, systematic response that seeks to maintain some level of continuity.

A crisis management plan document may contain the following eight sections [WAG15]:

  1. Purpose

    1.1. Outlines the purpose of the plan and circumstances under which the plan is to be used

  2. Definition of Crisis Events

    2.1. Defines what constitutes a crisis event that leads to the activation of the Crisis Management Plan

  3. Crisis Management Team Structure

    3.1. Outlines the purpose and membership of the Crisis Management Team

    3.2. Describes the roles and responsibilities of the team members

  4. Notification and Escalation Process

    4.1. Outlines the process by which an incident is reported, assessed, and escalated through various levels of management, leading to the activation of the Crisis Management Team

  5. Command Centre

    5.1. Describes the purpose of the command center, its location and resources to be made available to support the Crisis Management Team

  6. Communications During a Crisis

    6.1. Describes the communications protocols and tools to be used, how events are to be tracked and recorded and how status updates are to be communicated in a crisis situation

  7. Contact Lists

    7.1. Contact lists of the Crisis Management Team members, senior management, key staff, service providers, emergency services and other stakeholders who need to be informed and/or are needed to provide assistance during a crisis situation

  8. Actions Checklists

    8.1. Checklists of issues and actions that the Crisis Management Team need to consider for crisis management response and business continuity. These serve as reminders to ensure that no critical issues or actions are forgotten in the confusion and chaos that may result in a crisis situation.

Recovery/Restoration Plans

Recovery/restoration plans are targeted at individual teams that are responsible for responding to certain types of disruption or supporting certain aspects of recovery and restoration. The objective is to define the procedures and needed resources for maintaining critical business activities and for recovering as quickly as possible in order to resume normal operations.

A recovery/restoration plan document may contain the following sections [WAG15]:

  1. Purpose

  2. Team Charter

  3. Team Composition

  4. Activities and Strategy

  5. Phase 1: Assessment and Notification

    5.1. Incidents during office hours

      1. Initial Alert

      2. Evacuation

      3. Initial Assessment

      4. Plan Invocation

    5.2. Incidents outside office hours

      1. Initial Alert

      2. Initial Assessment

      3. Plan Invocation

  6. Phase 2: Plan Activation

    6.1. Upon arrival at business continuity site

    6.2. Business resumption

      1. Within 1 day

      2. Within 3 days

      3. Within 5 days

      4. Within 10 days

  7. Phase 3: Return to Normalcy

    7.1. Damage assessment

    7.2. Salvage and restoration

    7.3. Relocation

  Appendix 1 Contact Lists

  Appendix 2 Resource Requirements

  Appendix 3 System/Application Requirements

  Appendix 4 Vital Records Requirements
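The business resumption windows in Phase 2 (within 1, 3, 5, and 10 days) can be related to the maximum acceptable outage recorded for each critical business activity in the BCM plan overview. The sketch below is one possible way to do this, under the assumption that an activity is assigned to the longest window that still falls within its maximum acceptable outage. The activity names and outage values are hypothetical.

```python
from bisect import bisect_right

# Resumption windows, in days, taken from the plan outline.
TIERS_DAYS = [1, 3, 5, 10]


def resumption_tier(mao_days: float) -> int:
    """Return the longest resumption window that does not exceed the
    activity's maximum acceptable outage (MAO), in days."""
    idx = bisect_right(TIERS_DAYS, mao_days) - 1
    if idx < 0:
        raise ValueError("MAO is shorter than the shortest resumption window")
    return TIERS_DAYS[idx]


def build_resumption_schedule(mao_by_activity: dict[str, float]) -> dict[int, list[str]]:
    """Group activities by the resumption window they must meet."""
    schedule = {tier: [] for tier in TIERS_DAYS}
    for activity, mao in sorted(mao_by_activity.items()):
        schedule[resumption_tier(mao)].append(activity)
    return schedule


# Hypothetical MAO values (days) for illustration only.
mao_by_activity = {
    "Customer support": 1,
    "Order processing": 3,
    "Payroll": 7,
    "Internal reporting": 14,
}
schedule = build_resumption_schedule(mao_by_activity)
for tier, activities in schedule.items():
    print(f"Within {tier} day(s): {activities}")
```

An activity whose maximum acceptable outage is shorter than one day raises an error here, signaling that the plan would need a faster recovery option for it than the listed windows provide.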

Exercising and Testing

Exercising and testing are essential for an organization to validate its ability to effectively respond to and recover from disruptive incidents in the time frame established by management. Exercising and testing must be ongoing to accommodate staff turnover as well as changes in facilities, equipment, and the threat environment.

ISO 22300 defines the terms exercise and test as follows:

  • Exercise: A process to train for, assess, practice, and improve performance in an organization. Exercises can be used for:

    • Validating policies, plans, procedures, training, equipment, and interorganizational agreements

    • Clarifying and training personnel in roles and responsibilities

    • Improving interorganizational coordination and communications

    • Identifying gaps in resources

    • Improving individual performance

    • Identifying opportunities for improvement and providing a controlled opportunity to practice improvisation

  • Test: A procedure for evaluation; a means of determining the presence, quality, or veracity of something.

    • A test may be referred to as a trial.

    • Testing is often applied to supporting plans.

    • A test is a unique and particular type of exercise, which incorporates an expectation of a pass or fail element within the goal or objectives of the exercise being planned.

An exercise focuses on the business continuity plan and tries to determine if the personnel, procedures, and equipment are all in place and in a state of readiness to respond to an incident. Testing focuses more on individual aspects of business continuity and ensures that equipment and procedures are maintained in a constant state of readiness to support continuity activation and operations.

Exercises

Exercises are needed to assure the organization that its business continuity procedures are reliable. Even for well-designed and analyzed procedures, exercises suggest areas for improvement and often uncover flaws in procedures.

The following is a list, in increasing order of complexity, of types of exercises:

  • Seminar exercise (or plan walkthrough): An exercise in which the participants are divided into groups to discuss specific issues.

  • Tabletop exercise: A facilitated exercise in which participants are given specific roles to perform, either as individuals or groups. This book’s document resource site provides an example of a business continuity tabletop exercise.

  • Simple exercise: A planned rehearsal of a possible incident designed to evaluate an organization’s capability to manage that incident and to provide an opportunity to improve the organization’s future responses and enhance the relevant competences of those involved.

  • Drill: Coordinated, supervised activities usually employed to exercise a single specific operation, procedure, or function in a single agency.

  • Simulation: An exercise in which a group of players, usually representing a control center or management team, react to a simulated incident notionally happening elsewhere.

  • Live play: An exercise activity that is as close as safely practicable to the expected response to a real incident. For the most comprehensive form of this exercise, referred to as full interruption, operations are shut down at the primary site and shifted to the recovery site in accordance with the disaster recovery plan.

Cybersecurity Book Resource Site https://app.box.com/v/ws-cybersecurity

Depending on the size and needs of an organization, management may choose to use one or more types of exercises. The responsible person or group, such as a business continuity manager, should determine one or more scenarios to guide exercise participants and encourage the usage of, review, and feedback on the business continuity plans. The following are examples of scenarios:

  • Loss of facility: Continuing the delivery of critical products and services following the loss of a key facility (for example, due to fire)

  • Loss of people: Continuing the delivery of critical products and services with a reduced workforce (for example, due to pandemic)

  • Loss of technology: Continuing the delivery of critical products and services without access to technology or systems (for example, due to data center failure)

  • Loss of equipment: Continuing the delivery of critical products and services following the loss of key equipment (such as a metal press)

  • Loss of suppliers: Continuing the delivery of critical products and services following the loss of a key supplier (such as a payroll processing provider)

Tests

The objective of testing is to identify and address business continuity plan deficiencies by validating one or more of the system components and the operability of the plan. Testing takes several forms and accomplishes several objectives. Make sure it is conducted in an environment that is as similar to the operating environment as possible. FEMA’s Continuity Guidance for Non-Federal Entities [FEMA09] lists the following as guidelines for testing:

  • Annual testing (at a minimum) of alert, notification, and activation procedures for continuity personnel

  • Annual testing of plans for recovering vital records, critical information systems, services, and data

  • Annual testing of primary and backup infrastructure systems and services (for example, for power, water, and fuel) at continuity facilities

  • Annual testing and exercising of required physical security capabilities

  • Testing and validating of equipment to ensure the internal and external interoperability and viability of communications systems

  • Annual testing of the capabilities required to perform an organization’s essential functions

  • A process for formally documenting and reporting tests and their results

  • Annual testing of internal and external interdependencies identified in an organization’s continuity plan, with respect to performance of the organization’s and other organizations’ essential functions
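Because most of these guidelines call for testing at least annually, an organization may find it useful to track when each test was last performed. The following is a minimal sketch of such a check, assuming a simple record of last test dates. The test names, dates, and the 365-day threshold are illustrative assumptions rather than FEMA requirements for record keeping.

```python
from datetime import date, timedelta


def overdue_tests(last_tested: dict[str, date],
                  today: date,
                  max_age_days: int = 365) -> list[str]:
    """Return the names of tests last run more than max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(name for name, when in last_tested.items() if when < cutoff)


# Hypothetical test log for illustration only.
last_tested = {
    "Alert and notification procedures": date(2024, 1, 15),
    "Vital records recovery": date(2023, 3, 10),
    "Backup power at continuity facility": date(2022, 11, 5),
    "Physical security capabilities": date(2023, 9, 1),
}
overdue = overdue_tests(last_tested, today=date(2024, 6, 1))
print(overdue)
```

The resulting list could feed the formal documentation and reporting process that the guidelines also call for.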

Planning for an Exercise or a Test

Exercise and test planning is dictated by the objectives for testing defined in the BCP. Each individual exercise or test plan should identify quantifiable measurements of the exercise or test objective. Include the following items in your plan for an exercise or a test:

  • Goal: Specifies the business continuity function or component of the BCP to be tested.

  • Objectives: Lists the anticipated results. Objectives should be challenging, specific, measurable, achievable, realistic, and timely.

  • Scope: Identifies the departments or organizations involved, the critical business function, the geographic area, and the test conditions and presentation.

  • Artificial aspects and assumptions: Defines which exercise aspects are artificial or assumed, such as background information, procedures to be followed, and equipment availability.

  • Participant instructions: Explains that the exercise provides an opportunity to test the BCP before an actual disaster.

  • Exercise or test narrative: Gives participants the necessary background information, sets the environment, and prepares participants for action. It is important to include factors such as time, location, method of discovery, and sequence of events, whether events are finished or still in progress, initial damage reports, and any external conditions.

  • Evaluation: Determines whether objectives were achieved, based on impartial monitoring. Participants’ performance, including attitude, decisiveness, command, coordination, communication, and control, is assessed. Debriefing is short yet comprehensive, explaining what did and did not work and emphasizing successes and opportunities for improvement. Be sure to include participant feedback in the exercise evaluation.
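The plan items listed above can be captured in a simple template so that an incomplete plan is easy to spot before the exercise or test is scheduled. The sketch below is one hypothetical way to represent such a template; the class name, field names, and sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ExercisePlan:
    """Template with one field per plan item listed in the text."""
    goal: str = ""
    objectives: list[str] = field(default_factory=list)
    scope: str = ""
    artificial_aspects_and_assumptions: str = ""
    participant_instructions: str = ""
    narrative: str = ""
    evaluation: str = ""

    def missing_items(self) -> list[str]:
        """Return the names of plan items that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical draft plan for illustration only.
draft = ExercisePlan(
    goal="Validate failover of the payroll system to the recovery site",
    objectives=["Restore payroll processing within 4 hours"],
    scope="Finance and IT departments; primary data center",
)
print(draft.missing_items())
```

A completeness check of this kind says nothing about the quality of each entry. Whether the objectives are measurable, for example, still requires review by the person responsible for the exercise.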

Performance Evaluation

Performance evaluation assesses the alignment of the BCMS (the operations of the BCMS as well as the planning process) to management requirements and the requirements in standards such as ISO 22301. ISO 22301 includes three key performance evaluation requirements:

  • Establish, monitor, analyze, evaluate, and update metrics to assess performance of the BCMS at regular intervals

  • Establish and maintain an internal audit process to ensure that the BCMS aligns with management expectations and ISO 22301

  • Communicate the performance of the BCMS and its solutions to program sponsors and other top management representatives through the management review process, with the objective of prioritizing continual improvement opportunities

Performance Metrics

Feedback derived from metrics guides management in prioritizing ongoing improvements and adjustments to business continuity procedures. Good business continuity metrics have the following characteristics:

  • Help senior managers (and/or their target audience) quickly see the performance of the response and recovery solutions based on risk to the organization’s products and services

  • Convey information that is important to senior managers

  • Focus on performance rather than exclusively on activities

  • Assist senior management in identifying problem areas to focus attention and remediation efforts

Table 17.2, based on FEMA’s Continuity Guidance for Non-Federal Entities [FEMA09], provides a list of metrics that an organization can use to measure its ability to meet its continuity requirements. For each of the seven continuity considerations, management should use a simple grading system to show status, as defined in the table, using green for success, yellow for mixed results, and red for unsatisfactory.

TABLE 17.2 Continuity Considerations and Metrics

Continuity requirement: Continue performing essential functions during any emergency for up to 30 days or until normal operations resume, and be capable of being fully operational at alternate sites as soon as possible after an emergency occurs, but no later than 12 hours after continuity of operations (COOP) activation.

  Key questions:

    • Is your organization able to perform its current essential functions during any emergency and for up to 30 days or until resumption of normal operations?

    • Is your organization able to be fully operational at an alternate site within 12 hours of COOP activation?

  Metrics:

    • Measure ability to perform essential functions through testing, training, and exercises, identifying gaps and solutions.

    • Measure capability to be fully operational at a COOP site within 12 hours through testing, training, and exercises, identifying gaps and solutions.

Continuity requirement: Plan and document succession orders and preplanned devolution of authorities that ensure the emergency delegation of authority in advance, in accordance with applicable law.

  Key questions:

    • Does your organization have accessible and complete orders of succession that are familiar to successors?

    • Does your organization have accessible and complete devolution of authorities known by those to whom they devolve?

  Metrics:

    • Document and train on succession orders.

    • Document and train on devolution of authorities.

Continuity requirement: Safeguard vital resources, facilities, and records.

  Key questions:

    • Are your vital resources safeguarded?

    • Are your facilities safeguarded?

    • Are your records safeguarded?

    • Will your continuity staff have official access to your vital resources, facilities, and records in an emergency?

  Metrics:

    • Document measures to safeguard vital resources, facilities, and records.

    • Document measures taken to ensure official access to vital resources, facilities, and records.

Continuity requirement: Make provisions for the acquisition of the resources necessary for continuity operations on an emergency basis.

  Key questions:

    • Have you identified emergency continuity resources?

    • Do you have agreements/contracts to acquire emergency continuity resources?

  Metrics:

    • Identify your emergency continuity resource requirements.

    • Identify what agreements/contracts you have made to meet these requirements.

    • Identify what additional agreements/contracts are needed.

Continuity requirement: Make provisions for the availability and redundancy of critical communications capabilities at alternate sites in order to support connectivity between and among key government leadership, internal elements, other executive departments and agencies, critical partners, and the public.

  Key questions:

    • Do you have critical communications capability at your alternate site(s)?

    • Do you have redundant communications capability at your alternate site(s)?

  Metrics:

    • Identify your current communications capability at your alternate site.

    • Identify what communications capability is necessary.

    • Identify the plan to improve communications at your alternate site in six months, one year, and two years.

Continuity requirement: Make provisions for reconstitution capabilities that allow for recovery from a catastrophic emergency and resumption of normal operations.

  Key questions:

    • What is your plan for ensuring your reconstitution capability?

  Metrics:

    • Identify your reconstitution capability plan.

Continuity requirement: Make provisions for the identification, training, and preparedness of personnel capable of relocating to continuity facilities to support the continuation of the performance of essential functions.

  Key questions:

    • Have you identified, trained, and prepared personnel to relocate to alternate sites to continue essential functions?

  Metrics:

    • Verify that staff are identified, trained, and prepared to relocate to alternate sites.
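The green, yellow, and red grading described before Table 17.2 can be derived mechanically from the answers to the key questions. The sketch below assumes one possible rule, which is not prescribed by FEMA: green when every key question for a consideration is answered yes, red when none is, and yellow otherwise. The short consideration labels and the sample answers are hypothetical.

```python
def grade(answers: list[bool]) -> str:
    """Map yes/no answers to the key questions onto a status color.

    Assumed rule: all yes -> green, all no -> red, mixed -> yellow.
    """
    if not answers:
        raise ValueError("at least one key question must be answered")
    if all(answers):
        return "green"
    if not any(answers):
        return "red"
    return "yellow"


# Hypothetical assessment results for three of the seven considerations.
assessment = {
    "Essential functions": [True, False],
    "Succession and devolution": [True, True],
    "Reconstitution": [False],
}
status = {name: grade(answers) for name, answers in assessment.items()}
print(status)
```

A summary of this kind gives senior managers a quick view of status, but the underlying gaps and proposed solutions still need to be documented for each consideration graded yellow or red.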

ISO 22313 lists the following requirements for monitoring performance:

  • Setting of performance metrics, including qualitative and quantitative measurements that are appropriate to the needs of the organization

  • Monitoring the extent to which the organization’s business continuity policy and objectives are met

  • Identifying when monitoring and measuring should take place

  • Assessing the performance of the processes, procedures, and functions that protect prioritized activities

  • Putting in place proactive measures of performance that monitor compliance of the BCMS with applicable legislative, statutory, and regulatory requirements

  • Putting in place reactive measures of performance to monitor failures, incidents, nonconformances (including near misses and false alarms), and other historical evidence of deficient BCMS performance

  • Recording data and results of monitoring and measurement sufficient to facilitate subsequent corrective action analysis

Internal Audit

An organization should implement an internal audit process for business continuity to evaluate the performance of the BCMS. This does not necessarily mean that the organization needs an internal audit department or that such a department must perform the audit. An audit needs to be performed by a knowledgeable person or group that is independent of the BCMS. An organization should scale the depth and frequency of audit activities and reporting to the assessed importance of business continuity. While the scope of audit activities and deliverables may vary, in all cases they must encompass an independent and objective evaluation of the effectiveness of the BCMS.

Key tasks for internal auditing are as follows:

  • Ensure that the audit program is capable of determining whether the BCMS conforms to requirements

  • Ensure that the audit program is capable of determining whether the BCMS conforms to the BC plan

  • Establish and implement the audit program

  • Ensure that top management reviews the effectiveness of the audit program

Management Review

Management review is an essential aspect of business continuity management. Such a review needs to evaluate readiness and conformance to requirements and standards. The management review needs to address the following points:

  • Results of BCM audits; post-emergency, crisis, or disaster reviews; and exercise results

  • BCM status of key suppliers and outsource partners, as available

  • Level of remaining and acceptable risks

  • Inadequately managed risks, including those identified in the entity’s previous risk assessment

  • Internal or external changes likely to affect the entity’s BCM capability

  • Results of exercises, tests, and self-assessments

  • Accomplishments of training and awareness programs

  • Follow-up procedures based on previous management reviews

  • Proposed recommendations for development of the entity’s BCM capability

An organization should include the following in the review document:

  • Scope of the review

  • Reasons for the review

  • People involved in the review

  • Areas where issues exist, especially any raised risks

  • Recommendations for corrective and preventive actions

  • Brief review of tests and exercises
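One way to keep review documents consistent is to treat the six items above as a checklist. The following is a minimal sketch under that assumption; the `ReviewDocument` class and its field names are hypothetical and simply mirror the list, and an empty field is treated as "not yet written", which is a simplification.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ReviewDocument:
    """Hypothetical container with one field per item in the list above."""
    scope: str = ""
    reasons: str = ""
    people_involved: list = field(default_factory=list)
    issue_areas: list = field(default_factory=list)
    recommendations: list = field(default_factory=list)
    tests_and_exercises_summary: str = ""

    def missing_sections(self):
        """Return the names of sections that are still empty, in list order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


draft = ReviewDocument(scope="Data center BCMS", reasons="Annual review")
print(draft.missing_sections())
```

A reviewer could run such a check before circulating the document, although a section such as issue areas may legitimately be empty and would then need an explicit "none found" entry.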

17.4 Business Continuity Operations

As depicted in Figure 17.1, business continuity operations constitutes the foundation layer for business continuity management. In response to a disruptive event, the business continuity process proceeds in three overlapping phases (see Figure 17.6):

  1. Emergency response: Focused on arresting or stabilizing an event

  2. Crisis management: Focused on safeguarding the organization

  3. Business recovery/restoration: Focused on fast restoration and recovery of critical business processes

    A graph of the level of effort versus time shows the phases involved in the business continuity process.
    FIGURE 17.6 Business Continuity Process

Figure 17.6 provides a rough indication of the typical level of effort and time scale for each phase. The relative values depend on the nature and severity of the incident, the complexity of the organization, and the organization’s state of readiness.
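For readers who want a compact reference, the three phases, their focus, and the typical durations given for each in this section can be captured in a small enumeration. This is only an illustrative sketch; the names are hypothetical, and the phases overlap in practice, so the ordering here should not be read as a strict sequence.

```python
from enum import Enum


class Phase(Enum):
    # Each member carries (focus, typical duration) as stated in this section.
    EMERGENCY_RESPONSE = ("arrest or stabilize the event", "minutes to hours")
    CRISIS_MANAGEMENT = ("safeguard the organization", "hours to days")
    RECOVERY_RESTORATION = (
        "restore and recover critical business processes",
        "days to weeks, possibly months",
    )

    def __init__(self, focus, typical_duration):
        self.focus = focus
        self.typical_duration = typical_duration


def describe(phase):
    """Return a one-line summary of a phase."""
    return f"{phase.name}: {phase.focus} ({phase.typical_duration})"


for phase in Phase:
    print(describe(phase))
```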

Emergency Response

An emergency response is an urgent response to a fire, flood, civil commotion, natural disaster, bomb threat, or other serious situation, with the intent of protecting lives, limiting damage to property, and minimizing disruption of system operations.

Generally, an emergency response is of limited duration—usually minutes to hours. Key tasks performed by designated emergency response personnel include:

  • Account for staff and visitors

  • Deal with casualties

  • Contain/limit damage

  • Assess damage

  • Invoke the business continuity plan by contacting the crisis management team point of contact

The nature of a security incident dictates which personnel are involved in the emergency response. For example, in the case of a fire alarm, all staff generally have been instructed on the procedure for evacuating to a safe gathering point. One person or one staff position per building or floor can be assigned the job of taking a roll call at the gathering point. There should also be a person designated to confirm that the fire department has received the alarm and is responding. From that point, a crisis manager/team leader may take over the task of coordinating the business continuity response.

As another example, to deal with a power outage, an emergency response team can be designated for the data center. One member of the team should verify that the backup generator is operating properly and check with the service provider for a status report. If power is restored before any further action is needed, then the incident is closed. If the power outage is prolonged, the emergency response team may inform the crisis management team, which then coordinates an offsite location with a current mirror image of the data center and carries out any other business continuity tasks required as the incident unfolds.
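The escalation logic in the power outage example can be sketched as a simple decision function. This is a hedged illustration, not a prescribed procedure: the text does not say how long an outage must last to count as prolonged, so the `prolonged_after_minutes` threshold and its default of 60 are assumptions, as are the function and enumeration names.

```python
from enum import Enum


class Action(Enum):
    CLOSE_INCIDENT = "close incident"
    KEEP_MONITORING = "keep monitoring"
    INFORM_CRISIS_TEAM = "inform crisis management team"


def next_action(power_restored, outage_minutes, prolonged_after_minutes=60.0):
    """Decide the emergency response team's next step during a power outage.

    prolonged_after_minutes is an assumed, organization-specific threshold.
    """
    if power_restored:
        # Power came back before further action was needed.
        return Action.CLOSE_INCIDENT
    if outage_minutes >= prolonged_after_minutes:
        # Prolonged outage: hand over to the crisis management team.
        return Action.INFORM_CRISIS_TEAM
    # Otherwise keep checking the generator and the service provider's status.
    return Action.KEEP_MONITORING
```

For instance, `next_action(False, 90)` returns `Action.INFORM_CRISIS_TEAM` with the default threshold, whereas `next_action(True, 90)` closes the incident.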

Crisis Management

Crisis management involves ensuring that processes, controls, and resources are available immediately following a disruption to ensure that the enterprise continues to deliver its critical business services.

Typically, crisis management occurs over a time frame of hours to days. Key tasks performed by the crisis management team include the following:

  • Contacting staff, customers, and suppliers, as needed

  • Performing the initial recovery of critical business processes to the extent possible

  • Rebuilding lost work in progress

A crisis management team needs to react quickly. This should be a small group of a dozen or fewer individuals who can easily coordinate among themselves. This team must include individuals who have the authority to provide corporate leadership and direct business continuity activities during times of crisis or in emergencies. Table 17.3, from Business Continuity Management Guidelines [WAG15], shows the makeup of a typical crisis management team.

TABLE 17.3 Crisis Management Team

Role

Responsibilities

Crisis manager/team leader

  • Provides overall leadership

  • Liaises with board and CEO

  • Allocates resources, sets priorities, and resolves conflicts

  • Briefs the company spokesperson

Command center coordinator

  • Keeps the command center functioning, including supporting technologies and resources

  • Maintains the status board for the crisis and call register

Corporate communications staff

  • Act as a single source of information to internal and external stakeholders and media

  • Provide media management

Human resources staff

  • Provide employee assistance, such as medical assistance, counseling, insurance claims, payroll duties, and so on

  • Handle emergency evacuation/repatriation

  • Liaise with victims’ families

  • Provide recruitment support

Corporate security staff

  • Ensure staff safety

  • Liaise with emergency services

  • Monitor emergency response

  • Provide for security of assets and staff

  • Communicate with external parties on security intelligence

Administration and logistics support staff

  • Facilitate and support recovery efforts, possibly consisting of food services, transport arrangements, mail duties, insurance, legal, finance requirements, and so on

Premises and facilities staff

  • Coordinate damage assessment, salvage and repair operations, and reconstruction

  • Support the insurance claim process

  • Plan for relocation to the primary site

Business recovery coordinator

  • Coordinates execution of business recovery plans

  • Provides status updates to the crisis management team

IT recovery coordinator

  • Coordinates execution of IT recovery plans

  • Resolves system, network, and application issues

  • Provides status updates to the crisis management team

Business Recovery/Restoration

Business recovery/restoration is aimed at getting the enterprise back to normal operation as soon as practical. Typically, business recovery/restoration occurs over a time frame of days to weeks or possibly even months. Key tasks performed by the crisis management team include the following:

  • Damage repair/replacement

  • Relocation to a permanent place of work

  • Restoration of normal IT operations

  • Recovery of costs from insurers

Business recovery/restoration may involve a number of teams, depending on the size and complexity of the organization, and the teams may be organized on functional or departmental lines. Table 17.4, from Business Continuity Management Guidelines [WAG15], shows the makeup of a typical business recovery/restoration team.

TABLE 17.4 Response/Recovery Team

Role

Responsibilities

Team leader

  • Provides overall leadership to the team

  • Ensures that critical activities are restored within the required time frames

  • Keeps the crisis management team apprised of business continuity progress

Alternate team leader

  • Acts as a backup to the team leader

BCM coordinator

  • Assists the team leader, as required

  • Coordinates communications within the team and liaises with other areas of the agency

  • Maintains a status board on the team’s business continuity progress

Team members

  • Carry out business continuity tasks in accordance with the team’s business continuity and recovery plan

Standby team members

  • Are on standby at home

  • Provide assistance with business continuity tasks when called upon

  • Support long-term recovery tasks when required

In addition to the activities that are specific to the various recovery/restoration teams and the crisis management team, all team leaders need to address the following common concerns:

  • Ensure that all local activities and dependencies are addressed by the plans.

  • Have administrative responsibility for the plans.

  • Coordinate routine updates to the detailed information supporting the crisis management and recovery response procedures (for example, risk assessments, contact lists, personnel assignments, hardware and software specifications, network diagrams, vital records, inventory lists, offsite backup schedules).

  • Coordinate electronic access to, and hard copy distribution of, the relevant plans and procedures to personnel who need them.

  • Ensure that relevant persons are aware of the plans and their role in any post-disruption activities identified by the plans.

  • Protect the confidentiality, integrity, and availability of the emergency response and business continuity plans and associated procedures.

  • Ensure that service agreements with other stakeholders, emergency services, and business continuity service providers are agreed and in place.

  • Ensure that out-of-hours emergency responsibilities are addressed and understood.

The flowchart in Figure 17.7 provides a general picture of the relationship between security incident management, emergency response, crisis management, and recovery/restoration.

A flowchart shows the relationship between incident response and business continuity.
FIGURE 17.7 Incident Response and Business Continuity

17.5 Business Continuity Best Practices

The Information Security Forum’s (ISF’s) Standard of Good Practice for Information Security (SGP) breaks down the best practices in the business continuity category into two areas and seven topics and provides detailed checklists for each topic. The areas and topics are as follows:

  • Business continuity framework: The objective of this area is to develop an organizationwide business continuity strategy and program that is supported by a resilient technical infrastructure and an effective crisis management capability.

    • Business continuity strategy: Provides a checklist of actions for developing a business continuity strategy similar in nature to the actions for developing an information security strategy.

    • Business continuity program: Provides guidance on defining the business continuity requirements for each business environment.

    • Resilient technical environments: Describes techniques for ensuring resilience, including resilient hardware/software, redundancy, isolation, and backups.

    • Crisis management: Provides a checklist of elements that should be part of a crisis management plan.

  • Business continuity process: The objective of this area is to develop, maintain, and regularly test business continuity plans and arrangements (sometimes referred to as disaster recovery plans) for critical business processes and applications throughout the organization.

    • Business continuity planning: Describes all the elements that should be included in a business continuity plan, including risk assessment, assignment of roles, metrics to be met, and the process of responding to an incident that threatens business continuity.

    • Business continuity arrangements: Lists the elements that should be included in a disaster recovery plan.

    • Business continuity testing: Describes an approach to testing business continuity plans.

17.6 Key Terms and Review Questions

Key Terms

After completing this chapter, you should be able to define the following terms:

accessibility

autonomic computing

awareness

business continuity

business continuity management (BCM)

business continuity management system (BCMS)

business continuity manager

business continuity plan

business continuity program

business continuity readiness

business continuity strategy

business impact analysis

business recovery/restoration

business resilience

continuity of operations (COOP)

crisis management

drill

diversification

emergency response

exercise

hardening

information system resilience

internal audit

live play

management review

maximum tolerable downtime (MTD)

performance evaluation

performance metrics

recovery

recovery point objective (RPO)

recovery time objective (RTO)

redundancy

resilience

risk assessment

simple exercise

simulation

tabletop exercise

training

Review Questions

Answers to the Review Questions can be found online in Appendix C, “Answers to Review Questions.” Go to informit.com/title/9780134772806.

1. Describe the three key elements of business continuity.

2. What natural disaster threats can disrupt business continuity?

3. What human-caused disasters can disrupt business continuity?

4. What are the four key business components that are critical for maintaining business continuity?

5. What are the key steps of business impact analysis?

6. According to ISO 22301, what are three key areas to consider while developing a business continuity strategy?

7. What are the key objectives of a business continuity awareness program?

8. Briefly define the term business resilience.

9. Define five sets of organizational continuity controls.

10. List some improvement exercises for participants in an ideal BCP.

11. What are the characteristics of good business continuity metrics?

12. What are the three phases of the business continuity process in response to a disruptive event?

17.7 References

ENIS10: European Union Agency for Network and Information Security, ENISA IT Business Continuity Management: An Approach for Small and Medium Sized Organizations. January 2010. https://www.enisa.europa.eu/publications/business-continuity-for-smes/at_download/fullReport

FEMA09: Federal Emergency Management Agency, Continuity Guidance for Non-Federal Entities (States, Territories, Tribal and Local Government Jurisdictions, and Private Sector Organizations). Continuity Guidance Circular 1 (CGC 1), January 21, 2009.

FFIE15: Federal Financial Institutions Examination Council, Business Continuity Planning. February 2015.

GOBL02: Goble, G., Fields, H., & Cocchiara, R., Resilient Infrastructure: Improving Your Business Resilience. IBM Global Service White Paper. September 2002.

NCEM12: National Emergency Crisis and Disasters Management Authority, Business Continuity Management Standard and Guide. United Arab Emirates Supreme Council for National Security Standard AE/HSC/NCEMA 7000, 2012. https://www.ncema.gov.ae/content/documents/BCM%20English%20NCEMA_29_8_2013.pdf

WAG15: Western Australian Government, Business Continuity Management Guidelines. June 2015. https://www.icwa.wa.gov.au/__data/assets/pdf_file/0010/6112/Business-Continuity-Management-Guidelines.pdf
