Chapter 6: Designing Services to Be Fit for Use: Service Design Part 2: The Warranty Processes

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Chapter 6

Designing Services to Be Fit for Use: Service Design Part 2: The Warranty Processes

In This Chapter

Making sure you can provide what the customer wants

Checking you have enough capacity

Preparing for all eventualities

Keeping IT secure

Would you agree to doing something for someone without checking you have the ways and means of achieving it? If someone asks you for a lift, do you check whether your partner is using the car that day? If someone asks to borrow some money, do you check you have enough spare after you’ve paid your bills? In other words, do you check you have the right assets, and that the assets are properly configured and ready to go?

This chapter and Chapter 5 cover the service design stage of the ITIL service lifecycle. (I introduce the service lifecycle in Chapter 3.) This is the stage that covers the gathering and analysis of the service requirements, and the design of the service to meet the requirements. Chapter 5 covers the design coordination process and the relationship management processes: service level management, service catalogue management and supplier management. In this chapter, I cover what I call the warranty processes: availability, management, capacity management, IT service continuity management, and information security management.

Think of the ITIL warranty processes as being similar to the warranty you get when you buy a new washing machine. The warranty means you should be able to get the machine fixed quickly if it breaks, ensuring that you can wash your clothes as often as you like. Warranty closely relates to the service levels that are described in your service level agreements (SLAs) (more on these in Chapter 5). In this chapter, I cover the four areas of warranty: availability, capacity, continuity and security. They don’t wash your clothes for you, but they do allow your customers to get value from the IT service as often as agreed.

Making Sure the Service Is Available: Availability Management

What do the users do when the service breaks – that is, it becomes unavailable? They complain. Why? Because they can’t achieve their business goals – they can’t get the value that they expect from the service. Think how you feel when you’re at home and your broadband connection or satellite connection fails. I bet you use a few choice expressions to describe your providers!

So what do I mean by availability? The availability of what? The IT service. In accordance with what? The SLAs (see Chapter 5) or agreed business needs. Asking the business about availability is the equivalent of asking, ‘When do you want it?’ So the aim is to keep the users happy by ensuring that the service is actually there and working when they want it.

‘Availability’ is just a clever way of saying that the service is working and you can use it. Your phone works – so you can make and receive phone calls.

The purpose of the availability management process is to ensure that the level of availability delivered in all IT services meets the agreed availability needs and/or service level targets in a cost-effective and timely manner. Availability management is concerned with meeting both the current and future availability needs of the business.

To give you a better idea of what availability management does, here are some other objectives:

Ensure that service availability meets agreed targets, by managing services and resource-related availability performance.

Ensure that services are designed to meet agreed availability requirements.

Assist with the diagnosis and resolution of availability-related incidents and problems.

Produce and maintain an up-to-date availability plan.

Providing good availability of your service isn’t something that should happen by accident. Some IT departments provide an IT service that they think is about right and then tweak it when the business complains. But ideally, you start thinking about availability when designing the service – in the service design stage of the service lifecycle.

Seeing the process in action

Through the use of the service level management process, the IT department liaises with the business and identifies its needs. For example, if a project has been approved and set up for the development of a new IT service, then, at some point early in the project, the service level manager gathers the service level requirements (SLRs). These include the customer’s requirements for the availability of the service. I don’t know how your organisation measures availability, but often it’s expressed as a percentage – for example, the email service will be available 99 per cent of the time.

Using the availability management process, you perform the activities to review these requirements and check to see whether they are achievable. If the requirements are not achievable, then you make suggestions on how the availability of the service can be improved to meet them, and report back to the service level manager, who can discuss the requirements further with the business. This discussion includes considering who will pay for the necessary improvements.

Once the requirements are agreed, availability management ensures that the service is designed to meet these availability requirements. This is part of the service design stage. During the service transition stage, availability management can get involved in testing to ensure that the design works in the way expected. In the service operation stage, availability management monitors the availability of the service to ensure that the requirements have been met.

Beryl, the sales manager, has a requirement for the sales IT service. She has stated that ‘The service must be available from 8 a.m. until 6 p.m.; if it breaks, you will fix it within 30 minutes; it will not break more than twice in any one week. And by the way, Friday morning is our busiest time, so please keep the service going all morning.’ Your technical staff look at these requirements from an availability point of view. They may calculate the required availability of the service: 98 per cent (the maths isn’t that difficult!). A review of the infrastructure that supports the new service shows that availability is currently 95 per cent – so not enough. Your technical staff (technical architects) now redesign the service by building in greater resilience and more reliable components, and calculate that 98 per cent availability can be achieved. However, the increased cost will be £10,000. The service level manager (business relationship manager) discusses this with Beryl, who agrees that this is a worthwhile return on investment and agrees to make the money available.

Defining some availability management terms

Availability, reliability, maintainability, serviceability and resilience – these are terms you find in the availability management process. That’s a lot of jargon (much of it designed by someone with an -ability passion); but don’t worry, in the next sections I explain each term. To begin:

Availability: According to ITIL, this is ‘the ability of a service or component to perform its agreed function when required’. Availability refers to both the end-to-end service and the components. You normally refer to the availability of the overall service in your SLA. Availability management must consider two levels: service availability and component availability. Component availability depends upon reliability and maintainability.

Reliability: The ITIL books define reliability as ‘a measure of how long a service or component can perform its agreed function without interruption’. In other words, for how long will it work before breaking? Technical staff have a couple of ways of measuring this:

• Mean time between failures (MTBF): On average, how long does the component work before failing?

• Mean time between service incidents (MTBSI): On average, how frequently does the component fail?

Maintainability: ITIL defines maintainability as ‘a measure of how quickly and effectively a service or component can be restored to normal working after a failure’. So, when a component breaks, can you fix it quickly? For example, if a user’s PC breaks, how long will it take you to fix it and help the user get back to work? This is usually measured using mean time to restore service (MTRS): on average, how long does it take to fix this type of component?

Serviceability: In ITIL’s words this is ‘the ability of a third-party supplier to meet the terms of its contract’. So, for components or services that you outsource, you need to be sure that the supplier sticks to its part of the bargain. Serviceability is a bit of a strange term because it just refers to the contractual arrangements you have with a supplier, and you can’t measure this. However, you should be able to measure the agreed levels of availability, reliability and/or maintainability for a supporting service or component.

Resilience: ITIL’s definition of resilience is ‘the ability of a component or IT service to resist failure or to recover quickly following a failure’. For example, you may have two network routers sitting side by side, both doing the same job at the same time. If one fails, the other carries on as if nothing has happened. In this case, the user still receives the service and is blissfully unaware that anything has failed. A second example would be having a spare laptop ready to replace a user’s desktop system – an example of restoring the service very quickly.

Improving availability

Generally speaking, you improve the availability of a service in two ways: increase uptime and decrease downtime (no, these don’t mean the same):

Increasing uptime: Keeping the IT service there for as long as possible requires you to design for availability: use reliable components or build resilience into the design of the service so that if a single component fails, the whole service doesn’t fail. One form of resilience is to build in redundancy: have two components doing the same job, so if one fails the other takes over. If you apply this principle to the entire end-to-end service, you can eliminate any single points of failure, where the failure of a single component can affect many services (for example, a network switch failure can affect many services and many users). Building in a lot of resilience costs a lot of money, so you need to justify the business requirement and have someone who’s prepared to pay.

Decreasing downtime: The ability of your staff to restore the service in a timely fashion when it fails. This includes taking a look at your incident management and problem management processes (see Chapter 8) to ensure they take into account the needs of your critical services. You review your processes and procedures, especially how people contribute.

The speed of recovery relies on the following:

• Detecting the failure quickly: Don’t wait for the user to tell you an incident has occurred; use monitoring tools.

• Diagnosing the fault quickly: Have the right staff available at the right time who can point at a component and say, ‘That’s the one that’s failed.’ You can sometimes use monitoring tools.

• Fixing the fault quickly: Have spare parts available and staff that have practised various scenarios and know what to do to fix the fault and restore the service.

Looking at the activities of availability management

ITIL groups the activities of the availability management process into those that are proactive and those that are reactive.

Proactive activities

The proactive activities of availability management take place during the service design stage of the service lifecycle. This is where you ensure that you have clear requirements for the availability of the service, and the design activities focus on creating a service that meets these requirements. The activities are as follows:

Is your car available?

Would you think I was mad if I asked whether your car was available? You might wonder what I meant by this. Perhaps I’m asking whether your car is available for use. To be available for use, is it parked outside? Is there fuel in it? Is it roadworthy? In fact everything necessary for you or I to use the car as expected. In general terms, I imagine your car normally works the way you expect, as and when you want it to.

Let me ask you a question. What do you think the manufacturer considered when designing your car? More specifically, what did it consider when designing the components of your car? Surely it must select components that contribute to the successful operation of your car as a whole. I think there are two main factors. First, I expect it chose components that have good reliability, that is ones that work for a reasonable amount of time without failing. Second, I hope it chose components that are maintainable, that is ones that are easy to repair or replace when they break. Imagine you go into your local garage and ask the mechanic to replace your brake pads; I expect you would be surprised if the reply was ‘Sorry, but in your car I have to take the engine out to replace the brake pads.’ That’s not very maintainable.

This, of course, is an analogy. Your IT services are no different. For your whole service to be available, you have to ensure that each component of the service (for example servers, network equipment, PCs, software) has the appropriate level of reliability and maintainability.

Risk assessment and management. You review the assets and components that make up the service, and identify any potential risks to the availability of the service – what could cause a failure. Then you suggest cost-justifiable improvements to reduce the risks. (You can find out a bit more about risk management in Chapter 9.)

Planning and designing new or changed services. You get the requirements for new or changed services and ensure they translate into requirements of availability. Designing the service involves the factors I describe in the previous section ‘Improving availability’. This activity also involves the design and selection of the components and how they contribute to the overall service availability.

Implementing cost-justifiable countermeasures. The previous two activities yield many suggestions for improving the availability of your IT services. Each suggestion is likely to cost money, and you can’t justify every one. So you review the various options and choose the most appropriate.

Reviewing all new and changed services and testing all availability and resilience mechanisms. You continually review your services and regularly test the fail-over mechanisms you have in place. Only through such activities can you be sure everything will work the way you expect it to, when you want it to.

Continual review and improvement. Nothing remains stable for long: customer needs change and technology progresses. Availability management must be aware of changes coming from the business as well as from other parts of the IT provider organisation, and react accordingly. Regular reviews are an ideal way to ensure that changes don’t go unnoticed and opportunities for improvement and optimisation are captured and acted on.

Reactive activities

The reactive activities of availability management take place during the service operation stage of the service lifecycle. These activities aim to ensure that the targets for availability of the service are met and that any issues are spotted before outages occur:

Monitoring, measuring, analysing, reporting and reviewing service and component availability. You do this to ensure that availability targets are met. You provide much of this data to the service level management process (see Chapter 5) to contribute to the regular reports that are delivered to the customers, usually on a day-to-day basis, by your service operation staff.

Investigating all service and component unavailability and instigating remedial action. Here availability management gets involved with the incident management and problem management processes (see Chapter 8). You can think of this activity as providing expert help when there are availability-related incidents and problems. In addition, availability management ensures that previous problems influence the design of future services.

Have We Got Enough? Capacity Management

After availability (which I cover in the earlier section ‘Looking at the activities of availability management’), what annoys users most is IT services that run slowly. Oh, how they complain. But sometimes speed is a matter of perception: how do you know whether the IT service is running as fast as intended?

Asking the business about capacity is like asking, ‘How much do you want and how fast should it work?’ Capacity management is concerned with both the capacity and performance of the IT service. So the aim is to keep the users happy by ensuring that there is enough of the service to meet the needs of all the users, and that the service works fast enough.

The capacity of the service depends on the component parts. So this may include having enough network bandwidth, enough server processing power, enough data storage, powerful enough PCs. Also consider software. Poorly written software code can lead to an application that uses more resources than necessary. Then you have people. Having enough people may be a management issue not an IT service management issue, but you may be able to use some similar concepts or methods of capacity management to help.

The purpose of the capacity management process is to ensure that the capacity of IT services and the IT infrastructure meets the agreed capacity- and performance-related requirements in a cost-effective and timely manner. Capacity management is concerned with meeting both the current and future availability needs of the business.

Eh? You’ve got to know how much is needed then provide what you’ve agreed.

Other objectives of capacity management include:

Ensure that service performance achievements meet their agreed targets, by managing the performance and capacity of both services and resources.

Assist with the diagnosis and resolution of performance- and capacity-related incidents and problems.

Ensure that services are designed to meet agreed performance and capacity requirements.

Produce and maintain a capacity plan – in line with the budget lifecycle.

Much of capacity management is a balancing act – in fact, two balancing acts:

Cost against resources: Balancing the need to ensure that the appropriate capacity is present with making the most efficient use of capacity-related resources.

Supply versus demand: Ensuring that the available supply of IT processing power is matched to the demands made on it by the business.

In common with the availability management process (see the earlier section ‘Making Sure the Service is Available: Availability Management’), the requirements for capacity are gathered via the service level management process (see Chapter 5). Will the customer understand what is meant by the capacity of the IT service? I suspect not. But you should be able to ask the customer how many staff use the IT service and when are the busiest times of day. You then follow the same process that I outline in the earlier section ‘Seeing the process in action’.

Capacity management and demand management

A close relationship exists between capacity management and demand management. Demand management is a process described in the service strategy stage and is concerned with identifying the patterns of business activity or patterns of use of the IT services. The information gathered is a valuable input to the capacity management process because it represents the requirements that have to be met by the resources that contribute to the capacity of the IT service. If you want to know more about demand management, take a look at Chapter 4.

Beryl, the sales manager, has requirements for capacity and performance. In response to my questions, Beryl says, ‘I have 100 users in the department. The busiest time of day is between 10 and 12 a.m., and all 100 users may need to use the system. At lunchtime we get very few calls, but in the afternoon things pick up again, though not quite as much as in the morning. By the way, when all 100 users are using the service, the system should respond in less than two seconds when users hit the return key.’ Your technical staff look at these requirements from a capacity point of view. They calculate the technical resource required to provide this service, in other words how much network bandwidth, processor power and disc storage. They discover that they can meet the requirements on most days; however, on Wednesday afternoons the response time may be greater than two seconds if all 100 users use the service. This is because of processing that takes place in the finance department. The service level manager discusses this with Beryl, who agrees that this is acceptable because it’s unusual for the sales department to be that busy on a Wednesday afternoon.

A similarity between availability and capacity is that the capacity and performance of the service depends on the quality of the contributing components and assets. So, for a train company to run its service, it must have enough trains and they must run fast enough. There must also be enough staff, and they must be in the right place at the right time. This is exactly true of IT services as well.

Defining some capacity management terms

Here are a couple of bits of terminology:

Performance: Performance refers to how quickly the computer system processes your data and responds to requests from users. Performance is often measured by system response time or transaction response time. This is usually the time it takes from when the users press the return key to when something appears on screen.

example.eps Think about when you are using the Internet at home. Perhaps you are on your favourite Internet shopping site. How long will you wait for the web page to refresh – 30 seconds, 15 seconds, 5 seconds or just 1 second – before you give up and go to a competitor’s site? Response time is important to users; it can slow down their productivity or just simply frustrate them.

Capacity plan: You use this plan to manage the resources required to deliver IT services. The plan contains scenarios for different predictions of business demand, and costed options to deliver the agreed service level targets. You usually create the plan once a year in line with your budget cycle. The plan has many purposes. One is a method of asking the organisation for more money. Capacity costs money, therefore it makes sense to plan carefully any increase (or decrease) in capacity. The capacity plan documents the need for additional capacity for the coming year, and can be used to justify the expenditure.

Understanding capacity management sub-processes

Capacity management carries out many activities, but the purpose of the activities can vary depending upon which sub-process you consider:

Business capacity management (BCM): Translates business needs and plans into requirements for the service and IT infrastructure to ensure that the future business requirements for the IT services are planned and implemented. To do this, the BCM sub-process receives requirements from the business (via service level management or the service strategy) and reviews them to see whether anything will create a change in demand for the IT services (the process looks for things that may trip up IT).

example.eps Typically, market campaigns and promotions cause issues. I have heard of many cases where business units have ‘forgotten’ to mention a TV campaign that led to a large increase in usage of the IT service, and the IT department was unable to plan for it. Even when events are planned for, it’s sometimes difficult to forecast them correctly. For example, in 2011 the UK Government launched a website that allows the public to see crime figures for their region. So many people tried to use the site that it crashed and was taken down.

Service capacity management (SCM): Focuses on the end-to-end IT service and is concerned with whether the agreed service capacity levels are provided to the customers. This involves the management, control and prediction of the end-to-end performance and capacity of the live services. So the key is to monitor and measure against the targets in the SLAs.

Component capacity management (CCM): Focuses on the underlying technology of the service that provides the capacity. You need to understand how each component contributes to the service. Similarly, each component may become a bottleneck. So this sub-process includes the management, control and prediction of the performance utilisation and capacity of individual IT technology components.

Looking at the activities of capacity management

Here are the main activities of the capacity management process:

Reviewing current capacity and performance. When you receive a set of service requirements, you review them to see whether you can achieve them from a capacity and performance point of view. This involves reviewing the current design of your services.

Improving current service and component capacity. You can use many capacity management techniques to identify how to make improvements to your services; check out the ITIL service design book, which has plenty of suggestions.

Assessing, agreeing and documenting new requirements and capacity. Requirements for changes or new services will most likely be received through the service level management process (see Chapter 5). You translate the requirements into detailed capacity requirements. Capacity and resources cost money, so you carefully assess each requirement, and any improvements must be cost-justifiable.

Planning new capacity. After you review capacity requirements and decide how you are to meet the requirements, you must plan to make the resources available. The capacity plan contains your recommendations for achieving this, along with the costs. The plan should be regularly reviewed and kept up to date.

Being Prepared for Anything: IT Service Continuity Management

Imagine your main computer centre is located near a main road. One day a tanker carrying a toxic substance crashes on the main road and the police create an exclusion zone and evacuate the area. How long can you operate your IT service for? Your services are still available and they still work; technically, there has been no failure. The organisation’s business is evacuated along with the IT staff. How long is it before you need access, which you are denied by the police, to the services? Is this an occasion when you invoke your disaster recovery – or IT service continuity – plan?

IT service continuity management (ITSCM) is more or less another name for disaster recovery.

The purpose of the ITSCM process is to support the overall business continuity management process by ensuring that, by managing the risks that may seriously affect the IT services, the IT service provider can always provide the minimum agreed business-continuity-related service levels.

Business continuity management refers to the activities that your business performs to decide what to do in the event of a disaster or other large event that prevents the business from operating. So, the business comes first. It decides what business processes are essential to the viability of the organisation. The prime concern of an organisation when faced with some unexpected event is to remain in business. It should plan for this. This is business continuity management. When the business realises that one or many of these activities involve an IT service, the IT provider must get involved in planning how to recover the essential IT services. Once you’ve got the requirements from the customer, you then identify the appropriate way of recovering the service. This, of course, costs money.

The service level manager asks Beryl, the sales manager, what should happen if there is some sort of disaster. Her response is, ‘Because the sales department represents the main means of generating income for the business, the sales IT service must be recovered within 24 hours.’ Once your technical staff review these requirements, they discover that there is already a recovery site that supports one of the manufacturing department’s IT services which uses similar technology. They decide that, for a small amount of investment, the recovery site can be extended to support the sales IT service as well. The service level manager returns to Beryl with the good news, and Beryl says she agrees with this solution.

Defining some IT service continuity management terms

The following sections explain some of the terminology you come across in the ITIL service continuity management process.

Business impact analysis

In the event of a disaster, it’s important for the IT department to have a plan in place that tells it which IT services should be recovered and in what order. Who tells IT what to do first? Simple answer: the business. The severity of the disaster is determined by the size of the impact on the business.

An important part of IT service continuity is to perform a business impact analysis (BIA). This is, of course, an analysis of the impact of a disaster on the business.

Think of a supermarket that has to close unexpectedly for a day; what would be the business impacts? I’m sure you’ve just said ‘loss of revenue’. So straight away the supermarket is losing money because it isn’t selling anything. Second, there’s a loss of reputation: customers won’t be impressed and will shop elsewhere. Third, additional operational expense is incurred: store staff hanging around doing nothing; IT staff running around trying to fix things.

Disasters can have an impact on businesses in many ways, and not all impacts occur at the same time. If a bank’s main trading IT service were to fail, the impact would occur immediately and the bank would lose money. However, if the human resources system failed, the bank might be able to manage without it for several days.

A bank has just completed a BIA which involved members of the IT provider organisation. One of the outcomes is a set of requirements for IT stating that in the event of a disaster the IT services should be recovered in the following order:

Number 1: The trading service. To be recovered in ten minutes.

Number 2: The email service. To be recovered in four hours.

Number 3: The human resources service. To be recovered in two days.

Risk analysis

The BIA (see the previous section) helps you determine which services you have to recover. Now you can look more closely at these services to identify the risks to them. You may consider threats such as flood, fire and pestilence (disasters have been caused by rodents eating through cables). You can then consider how vulnerable your systems are to the threats. For example, the river that your IT data centre is near is not the only source of a threat of flooding. The threat could just as easily be that the pipe which feeds the coffee machine in the corridor situated over the top of the computer room bursts and floods your equipment.

After you identify and value risks, you must decide whether to take action. This may be risk reduction – in other words, move the coffee machine! Or the appropriate action may be to set up a recovery plan. You can find out a little more about risk by looking at Chapter 9.

IT service continuity strategy

Your IT service continuity strategy is your decision about how you intend to recover your IT services in the event of a disaster. There are a couple of extremes:

A complete duplicate of your computer room, or data centre: You have duplicate servers, data storage and network connections in another location some distance away from your main computer room. You ensure that the data is duplicated on both systems, so when you have a disaster you simply switch to your second system. Great, but very expensive.

An empty room available that you can use as your backup data centre: This must be far enough away from your main system so that both cannot be affected by the same disaster. So when you have a disaster, you have to acquire some hardware and software from your favourite computer store and take it to this empty room and rebuild your systems from scratch. In reality, I hope this is less ad hoc. You have a proper plan in place. However, it does describe the other extreme at which recovering your IT services is likely to take several days.

Other strategies exist in between these two extremes, and the clever bit is to decide which is most appropriate to your organisation. You make the decision based on the result of your BIA and risk analysis (see the preceding sections).

Looking at the activities of IT service continuity management

These activities aren’t a process for invoking disaster recovery but a process for recognising the need to plan for significant events and for setting up the appropriate facilities:

Initiation: Getting started. This requires recognition of the importance of business continuity management and ITSCM from senior management. Hopefully this is followed by getting approval, funding and support to establish business continuity management and ITSCM. To get started, someone must decide whether you need a project team, a department or just an individual to be responsible for ITSCM. Then you must create a plan to introduce ITSCM.

Requirements and strategy: This activity is split in two:

• Requirements: This is the point at which you must perform a BIA and a risk analysis (see the earlier sections on these). The results of these two activities tell you which IT services you must recover in the event of a disaster, in which order you must recover them, and how long you have to recover each IT service.

• Strategy: You take the output of the business impact analysis and risk analysis and decide the type of recovery site. You can decide to have no recovery system or a fully equipped duplicate system.

Implementation: This doesn’t mean implementing or invoking your continuity plan. Now you’ve decided what recovery mechanism you need, you set it up, which usually involves:

• Developing the ITSCM plans and procedures

• Organisational planning – deciding who’ll do what

• Building or acquiring the recovery site and systems

• Initial testing of the recovery systems

Ongoing management: Once you have your continuity plan in place, for how long will it remain up to date? Not long! As soon as someone makes a change to one of your IT services, your continuity plan is out of date. That’s life! Here are some ideas of things you can do to ensure the continuity plan remains up to date:

• Education, awareness and training of all those affected or involved

• Regular review and audit of your ITSCM processes

• Regular testing of your recovery systems

• Linking in to the change management processes (see Chapter 7) so that you can trap anything that may alter your recovery plans

Ensuring Security: Information Security Management

Where are your valuables stored? Don’t worry, I’m not going to come round and steal them! I hope they’re secure. I expect that your money is in the bank and your valuable possessions are stored somewhere safe. What do you do when young children come to your house? Do you put your favourite vase or ornaments out of reach so they don’t get damaged? I’m sure children don’t deliberately break things, but accidents happen. Mind you, do you set expectations? I still remember from my childhood, a friend’s parent telling me, ‘This is how you behave when you are in our house: be careful not to touch anything.’ So what has this got to do with information security management? The point is, we all have valuable possessions, and we do whatever we have to do to protect them. What’s more, one of the best ways to protect things from accidental damage is to set expectations and encourage people to act in a specific way when they’re in your domain.

Instead of the word possession, ITIL uses the word asset. An asset is something of value. Not all your assets have the same value. So here’s what you do:

1. Decide which are your most valuable assets. Start with the ones you can’t manage without.

2. Decide what the risks are to the assets. For example, could the assets be stolen by a burglar or damaged in a fire?

3. Decide the best way of protecting your assets.

According to ITIL, IT assets can be resources (things like infrastructure, applications, information and people) and capabilities (things like processes and knowledge). You can find out more about IT assets in Chapter 3.

One of your most valuable IT assets is likely to be the information and data that is stored on and used by your systems. This is one of the most important things to protect. The data is at the heart of your organisation. It may be data about the products you sell, what they cost, and the customers you sell them to. Each organisation’s data will be different. In IT terms, this data is stored on hard discs and data volumes scattered around your IT systems. The data must be protected against deliberate or accidental damage. In order to do this you also have to protect your IT infrastructure and applications. This is the same as locking your house when you go out in order to protect your valuables that are hidden inside. Your house is the equivalent of the infrastructure, and your valuables are the equivalent of your IT data.

The purpose of the information security management process is to align IT security with business security and ensure that the confidentiality, integrity and availability of the organisation’s assets, information, data and IT services always matches the agreed needs of the business.

So, the information security management process:

Provides strategic direction for security activities

Ensures that a management system is in place to manage all aspects of IT security

Manages information security risks

Protects the interests of those relying on information, and the systems and communications that deliver the information, from harm resulting from failures of availability, confidentiality and integrity

Relating this information to the other warranty aspects, the users cannot get value if the service is not there (unavailable), or it goes slow (lack of capacity), or if the data and information they use is missing, corrupt or unavailable due to a security incident.

So what is the point of view of Beryl – the sales manager on security? ‘We are bound by the Data Protection Act to protect our clients’ information. Also our discount calculations are company confidential and must be secure. All sales data must be protected and must be restorable in the event of a cyber-attack.’ The technical staff review these requirements. Happily, you discover that Beryl’s view conforms with the standard level of security you have in place, so the service level manager once again delivers the good news to Beryl.

Defining some information security management terms

The following sections Dummify some ITIL technical terms.

Confidentiality, integrity and availability

One way of considering the most appropriate way of protecting your IT assets is to think about confidentiality, integrity and availability:

Confidentiality: Making sure that only the right people know where your assets are, and that the information is seen by only those who have a right to know. IT has clever ways of doing this, including setting access (when you’re given a username and password), and encryption of highly confidential data (electronically coding it so that the recipient’s system has to know the code in order to read it).

Integrity: Making sure your assets don’t get damaged. IT data and information can easily get corrupted if not protected correctly, so you need to be sure that your information is complete, accurate and protected against unauthorised modification. This is what your anti-virus system does.

Availability: Making sure the assets are there when needed. Ensuring that information is available and usable when required, and systems can resist attacks and recover from or prevent failures.

Security risks

Once you’ve established the customer’s security requirements, you must identify any risks that may prevent you from achieving what the customer wants. This includes a review of the assets and components used to provide the IT service. The review identifies anything that’s a risk to security. Here are some examples of risks:

Systems that don’t have password control

Personal USB devices brought in from home and connected to a PC

Users who write their passwords on sticky notes and leave them on the side of the PC

In one organisation, part of a network failed at the same time every day, at about 7:30 every morning. After much investigation it was discovered that a staff member was unplugging a network router (positioned on a windowsill!) in order to plug in and charge a mobile phone. A risk certainly – possibly a security risk.

Getting things under control

Once you have established the sort of risks you are facing (see the preceding section) you need to put controls in place to deal with them.

What is a control? When you add a lock to your house, you’re adding a control. Similarly, when you take out insurance, this is also a control. So you must add similar control to protect your IT services. Here are some examples of security controls:

Installing firewalls to detect and prevent intruders

Taking regular data backups so you can restore corrupt data

Mechanisms that, when your systems are attacked, shut them down to prevent further harm

User access controls (usernames and passwords) to prevent the wrong people getting access to your systems

Information security policy

Once you understand the need for security in your organisation, don’t keep it a secret – tell everyone how important it is. The best way to do this is to create an information security policy: a document that makes it clear to the entire business how seriously the organisation takes security. Circulate the policy to everyone, not just within the IT provider organisation, but to the users and customers.

tip.eps

Here are some ideas for what to put in your information security policy:

Access control: Who is allowed to use your IT services and how they get access

Anti-virus policy: What the anti-virus system protects and how often it is updated

Email and Internet policies: What websites users can and can’t visit; what they can and can’t download

Use and misuse of IT assets: What users can and can’t do to their desktop system

Information security management system

The information security management system (ISMS) is a framework of activities. Have a look at Figure 6-1.

Figure 6-1: The information security management system.

9781119951186-fg0601.eps

This framework of activities wasn’t invented by ITIL but is based on stuff taken from a standard called ISO/IEC 27001. Best not to reinvent the wheel then. The ISMS provides a simple approach to planning and implementing the controls and measures needed for information security management:

Control: The controls needed to maintain information security to support the business needs; includes the organisational structures, roles and responsibilities needed to manage information security

Plan: Identification of the requirements of information security and the planning of how they will be achieved

Implement: Implementation of the controls and mechanisms for achieving the planned level of security

Evaluate: The review and measurement of the success of the controls and mechanisms

Maintain: The continual improvement of information security

If you think the approach of the ISMS looks familiar, you’re right. It’s based on a well-known quality management approach known as the Deming Cycle. The cycle is the plan, do, check, act method for gradual improvement in any organisation. You can find out more about the Deming Cycle in Chapter 9.

Looking at the activities of information security management

The activities of the information security management process are as follows:

Produce and maintain an information security policy. Does what it says on the tin.

Communicate, implement and enforce adherence to all security policies. You must make it clear how seriously security is taken by your organisation. Some organisations link not adhering to the security policy to disciplinary procedures.

Assess and categorise information assets, risks and vulnerabilities. This involves allocating some relative value to each security risk, so that you know which risks are most significant to your organisation.

Regularly assess, review and report security risks and threats. Things change, so you need to do this regularly and keep things up to date.

Impose and review risk security controls, and review and implement risk mitigation: Here you implement the controls and mechanisms that you’ve identified to support your information security policy.

Monitor and manage security incidents and breaches. Breaches must not go unnoticed. Your policy will not be taken seriously if you don’t police it.

Report, review and reduce security breaches and major incidents: You must learn from the things that happen in your organisation.

Identifying Service Design Roles

Each service management process should have a process owner and a process manager. I explain these two roles in Chapter 2. Some of the service design roles are listed in Chapter 5. For the warranty processes, the relevant process manager roles are:

Availability manager

Capacity manager

IT service continuity manager

Information security manager

Each role is responsible for the activities described in the appropriate section of this chapter.

Using technology for service design

In each of the five core ITIL publications (one for each lifecycle stage), a chapter is dedicated to the use of technology in helping you provide the IT services to your customers. IT services, of course, largely consist of technology. But technology is also used by the staff who manage and deliver the services.

Tools can be used in the service design stage for hardware design, software design, environmental design, process design and data design. Such tools can help by:

Speeding up the design process

Helping you and your colleagues stick to standards and conventions

Aiding prototyping, modelling and simulation

Validating designs

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for Chapter 6: Designing Services to Be Fit for Use: Service Design Part 2: The Warranty Processes

Create new playlist

Sign In

Sign Up

Table of Contents for
Chapter 6: Designing Services to Be Fit for Use: Service Design Part 2: The Warranty Processes