CHAPTER 4: BUSINESS CONTINUITY STRATEGY

BC strategy essentially means the identification of how the organisation is going to continue to meet the needs, and expectations, of its customers, clients or other stakeholders, in the event of some interruptive situation.

ISO22301 focuses on the protection, stabilisation and resumption of prioritised activities. This is fine if the direct resumption of an activity is a viable strategy. However, where the nature of the activity (or activities) is such that meeting customers’ needs in the shorter term is best done in other ways, some interpretation of the Standard’s requirements will be necessary.

In some respects, this chapter is an extension of the DR resource analysis in Chapter 3. There are some resources which can be replaced relatively easily, and others which cannot. It’s worth remembering here that the Standard assumes that all activities are capable of being recovered, or restored, within the RTO. In this chapter, we are also going to look at strategies for activities that simply could not be treated in this way.

IT disaster recovery

The DR industry provides replacement resources for many types of organisation, as long as what they do is essentially to process information. Such organisations have people who sit at desks and use computers and telephones.

Office space, desks, chairs, personal computers (PCs) and telephones, are all generic tools which are used by the majority of organisations, and which can be obtained in a variety of different ways, at relatively short notice. They also require little customisation, to suit the individual organisation’s needs.

For the type of organisation that uses only these ITDR resources, BC strategy should be relatively straightforward.

It is to be assumed that, should it be interrupted, every organisation with the slightest interest in this subject would wish to resume what it does. From this point, it becomes a matter of analysing the MTPD, or RTO, for each activity, as discussed in Chapter 3.

This process results in a resource timeline, similar to the activity recovery timeline in the previous chapter. This is a fairly simple process which can be executed manually, depending upon the complexity and size of the organisation. However, many activities can continue to function for a period of time without a particular resource that is normally available to them, so the RTO of the resource can be longer than the MTPD or RTO of the activity.

Ideally, the difference between the two, the ‘resource tolerance’, should be captured, along with other relevant information about the activity. The resulting resource timeline might look something like the example for IT applications in Figure 21.

Resource       RTO    RPO
E-mail         2D     2H
File server    2D     1D
Internet       2D     N/A
Database       2D     1D
MRP            2D     1H
HR system      14D    2D
Payroll        14D    2D

Figure 21: Resource timeline

RPO (recovery point objective) refers to the point in time to which data should be restored in the application in question. Depending upon the nature of the organisation, its products and services, and any regulatory or customer requirements, RPO may be in days, hours, or even minutes.
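
As a rough illustration of how a resource timeline can be derived, the sketch below adds each activity’s resource tolerance to its RTO and, where several activities share a resource, keeps the shortest result. The activity names, the hour values and the rule of taking the shortest figure for a shared resource are assumptions made for the example; they are not taken from the Standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    activity: str            # hypothetical activity name
    resource: str            # resource the activity uses
    activity_rto_hours: int  # RTO (or MTPD) of the activity
    tolerance_hours: int     # time the activity can function without the resource


def resource_timeline(dependencies):
    """Return {resource: RTO in hours}, shortest first.

    Assumption: resource RTO = activity RTO + resource tolerance, and a
    shared resource takes the shortest figure demanded by any activity.
    """
    timeline = {}
    for dep in dependencies:
        needed_within = dep.activity_rto_hours + dep.tolerance_hours
        current = timeline.get(dep.resource)
        if current is None or needed_within < current:
            timeline[dep.resource] = needed_within
    return dict(sorted(timeline.items(), key=lambda item: item[1]))


# Illustrative figures only.
dependencies = [
    Dependency("Order processing", "E-mail", 24, 24),
    Dependency("Customer service", "E-mail", 48, 24),
    Dependency("Monthly pay run", "Payroll", 168, 168),
]
timeline = resource_timeline(dependencies)
for resource, hours in timeline.items():
    print(f"{resource}: {hours // 24}D")
```

Recording the tolerance separately from the activity RTO keeps the reasoning visible, so that a support service provider can see why a resource has been given a longer recovery time than the activity it supports.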

This information should then enable the support service providers, such as IT, HR or facilities management, to establish contingencies against these timescale and recovery point requirements. Because the activity owner has expressed resource utilisation in terms of what is used, rather than how it is achieved, the support service provider can then arrange contingencies that will deliver the required resource.

A classic example of this is in IT applications. Ideally, users will state a requirement in terms of an IT application, as opposed to a particular server and other infrastructure. They would also state the requirement for user terminals, such as PCs, also in as generic terms as possible.

Availability

Disaster recovery services are often marketed as ‘availability’ or ‘high availability’ solutions. They do pretty much what their name suggests – make things available in a time of need. However, the cost and actual availability should be considered carefully – a very good reason to conduct a BIA.

The fact is that generic resources, such as servers, PCs and office space, can be acquired reasonably easily in the marketplace. Of course, we have to gamble that the things we need will be available exactly when required, which is one of the key reasons that the ITDR industry exists. DR providers generally give a level of assurance that the necessary resources will be available when required, though this is almost never guaranteed. Most DR providers’ contractual terms and conditions refer to ‘best efforts’, or ‘best endeavours’, in respect of the obligation upon them to provide the IT equipment, space and support services described in the contract.

If the BIA suggests that resources are not actually required immediately, but only after, say, three days, then one could speculate that it would be possible, in those three days, to find some serviced office accommodation, buy some IT equipment, and get the organisation’s activities running again within the RTO.

But, it’s a gamble. The ITDR industry naturally wants everyone to think that it’s too much of a gamble, and to believe that, in fact, the only reason that the required resources would not be available from them, would be because someone else got there first.

Some larger organisations have their own exclusive, virtually guaranteed, DR facilities. There are examples of stand-by ‘hot sites’, for the exclusive use of one organisation; but very few organisations can justify this level of availability.

Within the outsourced ITDR sector there are three fairly standard levels of availability (see Figure 22).

Availability   Resources provided                      Recovery requirements
Cold site      Space:                                  Server builds
               Seats (desks, etc.)                     Application installation
               IT hardware                             Desktop build/configure
               Telephones                              Data restore
                                                       Test and handover

Warm site      Items above, plus:                      Application installation
               Built servers                           Data restore
               Possibility of installed applications   Test and handover
               Possibility of configured desktops (PCs)

Hot site       Items above, plus:                      Data restore
               Applications installed in servers       Test and handover
               Configured desktops
               Possibility of replicated (live) application data
               Possibility of configured telephone switch/call management system

Figure 22: Standard levels of availability
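
The practical difference between the three levels is the amount of recovery work still outstanding at the time of invocation. The sketch below encodes the recovery requirements from Figure 22 and compares an estimated recovery time with an RTO. The step durations are hypothetical, as is the assumption that the steps run one after another; realistic figures would have to come from testing.

```python
# Recovery work still to be done at each level, as listed in Figure 22.
RECOVERY_STEPS = {
    "cold": [
        "server builds",
        "application installation",
        "desktop build/configure",
        "data restore",
        "test and handover",
    ],
    "warm": ["application installation", "data restore", "test and handover"],
    "hot": ["data restore", "test and handover"],
}

# Hypothetical durations in hours, for illustration only.
STEP_HOURS = {
    "server builds": 16,
    "application installation": 8,
    "desktop build/configure": 8,
    "data restore": 6,
    "test and handover": 2,
}


def estimated_recovery_hours(level, step_hours):
    """Sum the outstanding steps, assuming they are carried out in sequence."""
    return sum(step_hours[step] for step in RECOVERY_STEPS[level])


def levels_meeting_rto(rto_hours, step_hours):
    """Return the levels whose estimated recovery time fits within the RTO."""
    return [
        level
        for level in RECOVERY_STEPS
        if estimated_recovery_hours(level, step_hours) <= rto_hours
    ]


for level in RECOVERY_STEPS:
    print(level, estimated_recovery_hours(level, STEP_HOURS))
print(levels_meeting_rto(24, STEP_HOURS))
```

With these assumed durations, a 24-hour RTO rules out the cold site, which is the kind of comparison a BIA should inform before a contract is signed.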

Failover

This term refers to higher levels of availability for systems should they fail, usually offering continued access, not only to systems, but also to data. Failover amounts to another system automatically taking over from a failed ‘master’ system, so that there is little, or even no, interruption to system availability.
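
In outline, the idea can be sketched as a priority list of systems and a health check, with requests going to the first system that passes. This is a simplified illustration only; the system names and the health check are hypothetical, and real failover products also have to deal with data replication and with detecting failure reliably.

```python
def select_active(systems, is_healthy):
    """Return the first healthy system, checking in priority order (master first)."""
    for system in systems:
        if is_healthy(system):
            return system
    raise RuntimeError("no healthy system available")


# Hypothetical state: the master has failed and the standby is running.
health = {"master": False, "standby": True}
active = select_active(["master", "standby"], lambda name: health[name])
print(active)
```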

The ITDR marketplace

In the UK, the ITDR marketplace has become fairly polarised in the past five years, with a small number of large corporations having acquired their smaller competitors, so that the field of choice is now relatively narrow. This sector undoubtedly provides a very necessary service to the many organisations which could not reasonably create these types of contingencies for themselves.

ITDR providers achieve this by syndicating the resources they have, using a ‘retainer and right to use’ model. This works because the resources are generic and can be used by the majority of organisations.

Reciprocal and co-operative arrangements

For some organisations, the possibilities exist for setting up arrangements, whereby resources could be made available at another office or site within the organisation, or by another organisation. This latter approach is fairly uncommon, not least because of the competitive pressures between commercial organisations. However, there are some examples, particularly in the professional services and public sectors.

At the same time, this concept also represents a fairly common trap that many organisations fall into. It is tempting to say simply that, if Office A were unavailable, the staff could simply be transferred to Office B. This ‘plan’ is often based on untested and invalid assumptions, because it is unlikely that Office B will have the space, desks and IT equipment for everyone from Office A.

In many cases, however, a plan like this can be properly formulated. It could include arrangements for staff from Office B to be displaced, perhaps to work at home (the assumption that they would then be able to access the IT network would then need to be tested), or even from a customer’s or client’s premises.

Organisations that have more than one site can also consider building ‘resilience’ into their IT networks and data storage, where maintaining IT equipment on multiple sites is a feasible option. In some cases it is possible, not only to make alternative systems available if the main system should fail, but also to store back-up data in these, or similar systems, as opposed to the more traditional method of using tapes stored in a safe or off-site archive.

Go out and buy it

Many BCM practitioners express horror at such an idea, but for many organisations this can be a legitimate strategy for some, or even all, resources.

Computers, furniture and temporary accommodation are available in the market and, if one can set out a sensible plan based on what is likely to be available at any given time, then this can also be a very cost-effective strategy.

The ITDR market really provides the sort of availability required by many financial institutions, and others, where core activities must be up and running again within a day or less. But for organisations with longer MTPDs and RTOs, the acquisition of serviced office accommodation and suitable off-the-shelf IT equipment may well be possible within a number of days, or perhaps a week.

Some research of the IT equipment market would quickly reveal the likely delivery times for the hardware, and any software licences that might be required. Provided there is a reasonable range of suppliers, most of whom generally have stocks of the required products, then a list of these, with the appropriate detail, could prove to be a quite robust contingency plan.
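
A supplier list of this kind can be checked against the RTO quite simply. The sketch below is one possible way of doing so, and everything in it is assumed for the purpose of illustration: the supplier names, the delivery times, and the rule that at least two suppliers must be able to deliver in time before the plan is treated as robust.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Supplier:
    name: str               # hypothetical supplier name
    item: str               # what they can supply
    delivery_days: int      # typical delivery time found by market research
    usually_in_stock: bool  # whether they generally hold stock


def viable_suppliers(suppliers, item, rto_days):
    """Return suppliers that usually hold the item and can deliver within the RTO."""
    return [
        s
        for s in suppliers
        if s.item == item and s.usually_in_stock and s.delivery_days <= rto_days
    ]


def is_robust(suppliers, item, rto_days, minimum_suppliers=2):
    """Assumed rule of thumb: require a choice of suppliers, not just one."""
    return len(viable_suppliers(suppliers, item, rto_days)) >= minimum_suppliers


# Illustrative entries only.
suppliers = [
    Supplier("Supplier A", "laptop", 2, True),
    Supplier("Supplier B", "laptop", 3, True),
    Supplier("Supplier C", "laptop", 10, True),
    Supplier("Supplier D", "laptop", 1, False),
]
print([s.name for s in viable_suppliers(suppliers, "laptop", 5)])
print(is_robust(suppliers, "laptop", 5))
print(is_robust(suppliers, "laptop", 2))
```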

In these cases, as with ITDR services, each organisation must make its own judgement as to how likely it is that the resources will be available when needed.

The go-out-and-buy-it strategy may also be a suitable, or indeed the only, approach for other resources, such as industrial plant or stocks of materials.

As we shall see later, there are also many situations where resources simply could not be replaced in anything like the RTO or MTPD, and where, once again, different strategies would be required.

People

BCM and, before it, DR, seem largely to have always assumed that organisations will not be deprived of this most important resource. It has to be said that, because many organisations have a number of people, compared with having one building, or one IT room, or one factory, the people resource is generally thought to be more resilient than the rest. The threat of an influenza pandemic, however, brings the dependency on people in the context of BCM into focus. The strategic options for replacing people are, in reality, extremely limited, except in some cases where agency staff could be used.

Here, the strategy may have to be much more about adjusting how the organisation operates if it loses some of its critical people.

The rest of the resource spectrum

So far, we have considered the strategic options for organisations whose activities are largely information processing or, at least, are achieved with the sort of generic resources described above.

But what about manufacturing companies, hotels, logistics and distribution businesses, or even schools and hospitals?

There is virtually no DR provision for them, except for their IT systems. This is where the Standard, which arguably grew out of the financial services industry, somewhat sidesteps the issue. The code of practice (Part 1) refers to the following resource types:

  • People
  • Premises
  • Technology
  • Information
  • Supplies
  • Stakeholders.

The only one of these that comes close to addressing this type of organisation is ‘supplies’, but there is little in this section of the Standard to indicate what is expected.

A BCMS cannot be complete, or comprehensive, if it does not provide for other operational aspects of the organisation, so strategies should be developed in respect of these resources, or, perhaps more appropriately, for the activities that depend upon them.

In manufacturing, for example, much of the resource base is bespoke, special purpose, or simply not available on a replacement basis within the sort of timescale that might be required.

There are other resources that might fall into the same generic category as those provided by the ITDR industry, such as vehicles, and some industrial plant and equipment. All the same, it would seem that the ‘retainer and right to use’ model, adopted by the ITDR industry, has not found favour in these other areas, where a more conventional hire market generally exists.

It should also be noted that the possibilities for rapidly replacing operational resources for organisations that do things other than process information, are generally quite limited. To address this set of issues, we must look again at what BCM is for.

BCM objectives

The beginning of this chapter suggests that BCM is there to ensure that the organisation meets the expectations of its customers, or other stakeholders, in the event of an interruptive incident. In fact, the Standard is really about restoring the critical activities that support key products and services. This implicit objective does not seem to take into account the value of business interruption insurance products.

A typical manufacturing company may well have more than enough business interruption insurance cover, so that any loss of profits resulting from production being interrupted would be met by this insurance. If that company were supplying some sort of commodity product, then customers might not be inconvenienced by such an interruption; in this case there would probably be little or no damage to the company’s reputation. As soon as it was ready to reinstate its supplies into the market in question, the demand would remain and the company would continue to supply as before.

But for a great many products, and the vast majority of services, customers would be inconvenienced and, in many cases, incur losses as a result of the interruption to supplies. In the absence of any other arrangements, they might be forced to seek alternative supplies, and would then no longer demand what the company had previously supplied. Business interruption insurance typically does not cover this type of loss.

This is the scenario where the customer is let down by the supplier, even though it may not be the supplier’s fault, and it can often lead to the customer making a positive decision not to buy from that supplier again. It may well be the case that the new supplier does not have any BCM arrangements in place, but that does not help the manufacturer who has just lost a customer. The customer held an expectation that the supplier would continue to supply products and would not let the customer down.

Many companies are nervous about discussing with their customers the possibility that something could go wrong, even though the vast majority of purchasing professionals know very well that no supply chain is 100% guaranteed. But if BCM is discussed with customers, their expectations can ultimately be modified, and a range of options can be examined.

The point here is that even if the only option is that supplies to the customer are simply suspended in the event of an interruptive incident, the customer would have effectively ‘signed up’ to that option, and its expectation would be met. That is a strategy.

Deliverables

A key feature of all management systems is that the thinking that has gone into them is visible and therefore documented.

As well as a strategy document being the sort of evidence a certification assessor would be looking for, it is also the most logical way of ensuring consistency in the development of the system, and continuity, as people within the organisation change.

The strategy document, or documents, will need to address the following:

  • Ways of protecting activities against the effect of interruptive incidents; though these are likely, also, to exist as controls in the risk assessment and management section.
  • Strategies for activities that can be viably resumed and for those that cannot.
  • Information about the resources, interdependent activities, and other sources of help that might be relied upon, including:
      • People
      • Information systems and information itself
      • Other operating equipment and systems
      • Physical infrastructure
      • Transportation, including of people
      • Finance, for example for replacing stock, or rent for short-term workspace
      • Other organisations, including BaU, and emergency suppliers/providers
  • Ways of reducing the associated risk – again likely also to exist as controls in the risk assessment and management section.