Chapter 13: Service management practices

Availability management

“The purpose of the availability management practice is to ensure that services deliver agreed levels of availability to meet the needs of customers and users.”

Availability is defined as “the ability of an IT service or other configuration item to perform its agreed function when required”.

Availability management activities include:

•Negotiating and agreeing targets for availability;

•Designing infrastructure and applications to meet agreed levels;

•Ensuring that service and component data is collected to measure availability;

•Monitoring, analysing and reporting on availability; and

•Planning improvements to availability.

Availability is related to how often a service fails, and how quickly it can be recovered after a failure. These are often referred to as mean time between failures (MTBF) and mean time to restore service (MTRS).
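As a rough illustration of how MTBF and MTRS combine, the sketch below uses the commonly quoted steady-state calculation, availability = MTBF / (MTBF + MTRS). The function name and the sample figures are assumptions made for this example, not values from any real service.

```python
def availability(mtbf_hours: float, mtrs_hours: float) -> float:
    """Return steady-state availability as a fraction between 0 and 1.

    mtbf_hours: mean time between failures, in hours.
    mtrs_hours: mean time to restore service, in hours.
    """
    if mtbf_hours <= 0 or mtrs_hours < 0:
        raise ValueError("MTBF must be positive and MTRS must not be negative")
    return mtbf_hours / (mtbf_hours + mtrs_hours)


# Hypothetical service: fails about every 500 hours, takes 2 hours to restore.
print(f"Availability: {availability(500, 2):.4%}")
```

The calculation shows why both figures matter: halving the restore time improves availability in much the same way as doubling the time between failures.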

Appropriate levels of availability need to be designed into each service. Technology changes including Software as a Service (SaaS) solutions and Cloud hosting platforms have led to significant increases in the availability of services.

Availability measurements could include:

•User outage minutes – incident duration multiplied by the number of users affected

•Number of lost transactions

•Business value lost

•User satisfaction
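The first measure in the list, user outage minutes, is simple arithmetic. The sketch below is one possible way to total it across several incidents; the `Outage` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    """One availability-affecting incident (hypothetical record layout)."""
    duration_minutes: float
    users_affected: int


def user_outage_minutes(outages: list[Outage]) -> float:
    """Sum of incident duration multiplied by the number of users affected."""
    return sum(o.duration_minutes * o.users_affected for o in outages)


# Illustrative data only: a long outage for a few users, a short one for many.
month = [Outage(duration_minutes=30, users_affected=200),
         Outage(duration_minutes=5, users_affected=1000)]
print(user_outage_minutes(month))
```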

If failover or recovery mechanisms are used as part of service availability planning, these need to be tested regularly.

Practice considerations

Many organisations have tried to make the business case for availability management, only for the management team to tell them, ‘Things are fine! Why should we spend any money?’

It’s important to understand that just because things are OK today, it doesn’t mean they’ll be OK tomorrow. Sadly, the aftermath of a major incident, or availability loss, can be a great time to get funding and management attention for the implementation of availability management.

Defining availability can also be tough. Services that are ‘up’ or ‘down’ are easy to identify, but what about intermittent availability, or degraded performance? Does that mean the service is available or not? The service provider organisation and the customer might have very different views here. Service level management and relationship management can help availability management to talk to the customer and get precise definitions for ‘service up’ and ‘service down’.

Finally, remember that availability management can be complex and technical. Be very wary of customers who have heard a figure they like the sound of (for example, 99.999% or ‘5 9s’) and ask for that as an availability target. It’s sensible to check that the customer knows exactly what they are asking for and why – and whether they are prepared to pay for it. It can be a good idea to create a table to show your customer exactly what these figures mean in terms of minutes of downtime per week, month or year.
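A downtime table of this kind can be generated in a few lines. The sketch below assumes the service is required 24x7, that a year has 365 days and that a month is one twelfth of a year; agreed service hours in a real agreement may be narrower, which changes the figures.

```python
# Assumptions (not from the text): 24x7 service hours, a 365-day year,
# and a month treated as one twelfth of a year.
MINUTES_PER_WEEK = 7 * 24 * 60
MINUTES_PER_YEAR = 365 * 24 * 60
MINUTES_PER_MONTH = MINUTES_PER_YEAR / 12


def allowed_downtime_minutes(availability_percent: float, period_minutes: float) -> float:
    """Minutes of downtime that a target permits within the given period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return period_minutes * (1 - availability_percent / 100)


def downtime_table(targets=(99.0, 99.9, 99.99, 99.999)) -> str:
    """Build a plain-text table of permitted downtime per week, month and year."""
    rows = [f"{'Target':>8} {'Week (min)':>12} {'Month (min)':>12} {'Year (min)':>12}"]
    for target in targets:
        week = allowed_downtime_minutes(target, MINUTES_PER_WEEK)
        month = allowed_downtime_minutes(target, MINUTES_PER_MONTH)
        year = allowed_downtime_minutes(target, MINUTES_PER_YEAR)
        rows.append(f"{target:>7}% {week:>12.2f} {month:>12.2f} {year:>12.2f}")
    return "\n".join(rows)


print(downtime_table())
```

Under these assumptions, ‘5 9s’ allows a little over five minutes of downtime in a whole year, which is usually enough to start a useful conversation about cost.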

Business analysis

“The purpose of the business analysis practice is to analyze a business or some element of it, define its associated needs, and recommend solutions to address these needs and/or solve a business problem, which must facilitate value creation for stakeholders. Business analysis enables an organization to communicate its needs in a meaningful way, express the rationale for change, and design and describe solutions that enable value creation in alignment with the organization’s objectives.”

Business analysis takes a holistic view, considering:

•Process

•Organisation change

•Technology

•Policies

•Information

•Strategic planning

Business analysis activities include:

•Analysing business systems, processes, services and architectures;

•Identifying and prioritising improvements to the SVS, products and services;

•Identifying and prioritising opportunities for innovation;

•Evaluating and proposing actions to deliver improvement;

•Documenting business requirements to enable improvements; and

•Recommending solutions and validating them with stakeholders.

Practice considerations

Business analysis is another area that is being affected by the shift to Agile working and the impact of digital transformation. Some organisations have adopted roles based on the Agile methodology, replacing business analysts with product owners. Other organisations are retaining the business analyst role but renaming it ‘agile business analyst’.

Whatever the role is called, having an area of the organisation that is actively working to address business needs and identify opportunities for improvement is essential. Technology can provide huge advantages, but business users may not know how to articulate their requirements.

Capacity and performance management

“The purpose of the capacity and performance management practice is to ensure that services achieve agreed and expected performance, satisfying current and future demands in a cost-effective way.”

Performance is defined as “a measure of what is achieved or delivered by a system, person, team, practice or service”.

Service performance is typically used to describe the number of service actions performed within a timeframe, and the time required to fulfil a service action at a given level of demand. Service performance relies on service capacity, which describes the maximum throughput of a service or service component.

Capacity and performance management activities include:

•Service performance and capacity analysis, including monitoring current performance and service modelling; and

•Service performance and capacity planning, including requirements analysis, demand forecasting and resource planning, and performance improvement planning.

Poor capacity and performance management capabilities can have a severe impact on service performance and, consequently, on customer satisfaction.

Practice considerations

Capacity and performance management is about more than just technology – it’s about putting policies and procedures in place to govern capacity.

Business sponsors do not always respond well to requests from IT for more capacity investment (or to being asked to use less capacity). Increased access to home computing and the falling price of storage media both contribute to a ‘just buy more space’ attitude from the business. When implementing capacity and performance management, it’s important to educate the business about the cost of capacity. This isn’t just the purchase price of storage, but the ongoing costs of housing, maintaining and backing up the data stored on it. This applies to both Cloud solutions and storage owned and managed by an organisation itself.

The information being stored by the business should be classified with a policy supporting sensible retention. Data can also be archived if it isn’t regularly used, with access available on request.

Investment in capacity should be made ‘just in time’. This means that capacity is bought when needed – it doesn’t mean that it’s bought at the last minute! If an organisation buys capacity too soon, it risks having idle, unused space. A delay in investment might benefit the organisation if the cost of technology falls significantly. If an organisation buys capacity too late, the performance of a service might have already been affected.
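One simple way to reason about ‘just in time’ is to compare how long the remaining capacity will last with how long it takes to obtain more. The sketch below assumes steady linear growth, which is rarely exactly true; the function names, lead time and safety margin are illustrative assumptions.

```python
from typing import Optional


def days_until_full(capacity_gb: float, used_gb: float,
                    growth_gb_per_day: float) -> Optional[float]:
    """Days of headroom left, assuming steady linear growth (None if not growing)."""
    if growth_gb_per_day <= 0:
        return None
    return max(0.0, (capacity_gb - used_gb) / growth_gb_per_day)


def should_order(capacity_gb: float, used_gb: float, growth_gb_per_day: float,
                 lead_time_days: float, safety_margin_days: float = 14) -> bool:
    """True once the remaining runway is no longer than lead time plus margin."""
    runway = days_until_full(capacity_gb, used_gb, growth_gb_per_day)
    return runway is not None and runway <= lead_time_days + safety_margin_days


# Hypothetical volume: 1,000 GB total, 700 GB used, growing 10 GB a day,
# with a three-week procurement lead time.
print(days_until_full(1000, 700, 10), should_order(1000, 700, 10, lead_time_days=21))
```

Ordering when the runway drops to the lead time plus a margin avoids both idle capacity bought too soon and a degraded service caused by buying too late.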

Cloud hosting services deliver significant benefits for capacity and performance management, providing scalable solutions that can be easily changed. They do, however, also introduce risks and need to be managed carefully to ensure ongoing value for money.

Change enablement

“The purpose of the change enablement practice is to maximize the number of successful service and product changes by ensuring that risks have been properly assessed, authorizing changes to proceed, and managing the change schedule.”

“A change is the addition, modification, or removal of anything that could have a direct or indirect effect on services.”

In ITIL, the person or group that authorises a change is referred to as the change authority. Change authority may be decentralised in organisations working at high speed and in agile environments, meaning peer review is more important and becomes an indicator of high performance. The change schedule is used to help plan changes, assist in communication, avoid conflicts and assign resources.

Change enablement must balance delivering benefits through successful changes and protecting live service from harmful changes.

Table 23 shows the change types:

Table 23: Types of Change

Standard change	Standard changes are low-risk, pre-authorised changes. They are well understood and documented so they can be implemented without additional authorisation. An example could be giving a new starter access to a piece of approved software.
Normal change	Normal changes need to be scheduled, assessed and authorised via the organisation’s defined process. Lower-risk changes will need less scrutiny than high-risk changes. Many organisations have tools in place that manage change request workflows, automating the process where it makes sense to do so.
Emergency change	Emergency changes need to be implemented as soon as possible, perhaps in response to an issue or a security breach. They are assessed and authorised when possible, but some steps (e.g. testing) might be left out if the level of urgency justifies it. There may be a separate change authority for emergency changes.
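Where a tool manages change request workflows, the three change types usually drive who authorises the change. The sketch below is a minimal illustration; the authority names and the two risk levels are hypothetical, and each organisation will define its own.

```python
from enum import Enum


class ChangeType(Enum):
    STANDARD = "standard"
    NORMAL = "normal"
    EMERGENCY = "emergency"


def change_authority(change_type: ChangeType, risk: str = "low") -> str:
    """Return the (hypothetical) change authority for a change request."""
    if change_type is ChangeType.STANDARD:
        # Low risk and well understood: no additional authorisation needed.
        return "pre-authorised"
    if change_type is ChangeType.EMERGENCY:
        # May be a separate, faster-moving authority.
        return "emergency change authority"
    # Normal changes: scrutiny scales with risk.
    return "change advisory review" if risk == "high" else "peer review"


print(change_authority(ChangeType.NORMAL, risk="high"))
```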

Each organisation will define its own scope for change enablement. This often includes:

•IT infrastructure

•Applications

•Documentation

•Processes

•Supplier relationships

•Any other relevant areas

Practice considerations

It is very rare to find an organisation with no change enablement whatsoever. Most organisations have felt the pain of a failed change, or changes, and put in place some kind of process to try to stop it happening again. If you’re looking at change enablement for your organisation, try to find out what already exists and see if you can build on that.

It is normal to face resistance to the introduction of more formal change oversight where there has been none before. Staff might feel they have to ask permission to do their job or perceive that their skills are being doubted. It’s important to sell the benefits of change enablement and communicate that every change is linked and has the potential to affect service elsewhere.

You will really only have one chance with change enablement. Get it wrong, and any future efforts will be met with ‘we tried that, and it was too bureaucratic/not effective’, etc. Think carefully about the level of control that’s actually needed – not too much and not too little.

The common response from technical teams during the implementation of change enablement is for them to ask ‘What about x? Is x a change?’ and ‘What about z? Is z a change?’

The short answer is anything that can affect services or customers is a change. You can sell the process to staff by pointing out how it will protect them, as well as protecting services; if a change goes wrong, the organisation will look at the failings in the change enablement process, not at a particular team member’s performance.

Agile/DevOps environments will often automate some or all of their change enablement. If code is being released frequently, an automated delivery pipeline with staged tests can ensure there is no impact on the live environment. If a test fails, the code is pulled out. Automated change enablement must still generate logs of some kind to show what changes have been made, and the tests being used will need continual review.
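The idea of staged tests that stop a change and still leave an audit trail can be sketched as follows. This is a simplified illustration rather than the behaviour of any particular pipeline tool; the stage names, the change identifier and the log layout are all assumptions.

```python
from datetime import datetime, timezone
from typing import Callable


def run_pipeline(change_id: str,
                 stages: list[tuple[str, Callable[[], bool]]],
                 change_log: list[dict]) -> bool:
    """Run each stage in order; stop at the first failure and log every step."""
    def record(event: str, result: str) -> None:
        change_log.append({
            "change": change_id,
            "event": event,
            "result": result,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    for name, check in stages:
        if check():
            record(name, "passed")
        else:
            record(name, "failed")
            record("outcome", "rejected")  # the code never reaches live
            return False
    record("outcome", "deployed")
    return True


# Hypothetical stages; real checks would run test suites, not lambdas.
log: list[dict] = []
ok = run_pipeline("CHG-0001",
                  [("unit tests", lambda: True),
                   ("integration tests", lambda: False),
                   ("smoke tests", lambda: True)],
                  log)
print(ok, [(entry["event"], entry["result"]) for entry in log])
```

Because every stage writes to the log whether it passes or fails, the record of what was changed survives even when no person was involved in the authorisation.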

Incident management

“The purpose of the incident management practice is to minimize the negative impact of incidents by restoring normal service operation as quickly as possible.”

An incident is “an unplanned interruption to or reduction in the quality of a service”.

Incidents need to be logged, prioritised and resolved within agreed timescales. They might be escalated to a support team for resolution, depending on the product or service affected and how quickly the resolution is required. Incident management needs to include quality, timely updates to the affected user(s), which requires a high level of collaboration between teams.

The incident management practice activities include:

•Design the incident management practice: the practice has to react appropriately to different incident types, depending on their impact. Major incidents and information security incidents might require special handling.

•Prioritise incidents: incidents with the highest impact and urgency need to be resolved first. Classifications and timescales are agreed with consumers.

•Use an incident management tool to log and manage incidents: the tool may provide links to changes, known errors and knowledge articles. It may also provide incident matching and links to problems.
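Prioritisation by impact and urgency is often implemented as a lookup matrix. The sketch below shows one possible matrix; the three levels, the priority numbers and the incident fields are assumptions, and real classifications should be agreed with consumers.

```python
# Hypothetical 3x3 matrix: (impact, urgency) -> priority, where 1 is highest.
PRIORITY_MATRIX = {
    ("high", "high"): 1, ("high", "medium"): 2, ("high", "low"): 3,
    ("medium", "high"): 2, ("medium", "medium"): 3, ("medium", "low"): 4,
    ("low", "high"): 3, ("low", "medium"): 4, ("low", "low"): 5,
}


def priority(impact: str, urgency: str) -> int:
    """Look up the priority for an incident's impact and urgency."""
    return PRIORITY_MATRIX[(impact, urgency)]


def work_queue(incidents: list[dict]) -> list[dict]:
    """Order incidents by priority, then by the time they were logged."""
    return sorted(incidents,
                  key=lambda i: (priority(i["impact"], i["urgency"]), i["logged_at"]))


# Illustrative incidents only.
queue = work_queue([
    {"id": "INC-1", "impact": "low", "urgency": "medium", "logged_at": "09:00"},
    {"id": "INC-2", "impact": "high", "urgency": "high", "logged_at": "09:30"},
    {"id": "INC-3", "impact": "medium", "urgency": "high", "logged_at": "09:10"},
])
print([i["id"] for i in queue])
```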

Swarming

Some organisations use an incident management technique called swarming. A group of stakeholders work together until it becomes clear who is best placed to continue with the incident. Collaboration like this supports information sharing and provides learning opportunities within teams. This approach differs from traditional incident management practices, which escalate from first-line support teams to second-line, third-line, etc.

Practice considerations

In many organisations, incident management is the responsibility of the service desk, and the relationship between the service desk and second-line support teams may be poor. The service desk might see second-line support as being arrogant techies who bounce incidents back unnecessarily, won’t talk to users and fix things in the order they feel like fixing them. Second-line support teams see the service desk as a nuisance, with unqualified staff who send poorly documented incidents that they could have fixed themselves. These relationship issues can be compounded when teams are part of different organisations (for example, second-line support is provided by an external organisation).

Neither of these opinions is entirely true!

Effective incident management can help to break down the barriers between the teams. Incident models (contributed to by second-line staff) make sure the right data is recorded and the incident is sent to the right place. Timescales and priorities make sure incidents are fixed in the right order, with no argument about what should be done first.

Incident management is often an ideal candidate for automation, and many mature tools exist for this process. When selecting a tool, look for ease-of-use functionality, e.g. integration with phone systems, the ability to auto-populate fields and excellent reporting.

Incident management can really help to build buy-in for a service management implementation. It can provide fast improvements for end users along with brilliant management information for IT. If you can start with a practice that delivers results quickly, it can be easier to make a case for investment in other service management processes.

One area that trips up a lot of organisations is the level of categorisation they introduce. Keep it simple, and control which roles are allowed to add categories. The category trees in some organisations have hundreds and hundreds of entries when actually a few dozen would do the job. Be very wary of introducing a ‘misc’ or ‘unknown’ category – you might find everything ends up there.

IT asset management

“The purpose of the IT asset management practice is to plan and manage the full lifecycle of all IT assets, to help the organization to:

•Maximize value

•Control costs

•Manage risks

•Support decision making about purchase, reuse, retirement and disposal of assets

•Meet regulatory and contractual requirements.”

An IT asset is “any financially valuable component that can contribute to the delivery of an IT product or service”.

The scope of asset management will usually include:

•Software

•Hardware

•Networking

•Cloud services

•Client devices

It might also include non-IT assets that are part of service delivery, for example a data centre where IT assets are housed.

Types of asset management include:

•IT asset management (ITAM)

•Software asset management (SAM)

•Hardware asset management (HAM)

IT asset management relies on accurate asset information, which is held in an asset register. Some organisations capture this information in a one-off exercise (an audit, or a stocktake-type exercise), but it’s better to update the information in the asset register regularly, using inputs from service management practices like release management that are involved with changes of asset states.

IT asset management activities vary according to the asset type:

•For hardware assets: labelling, location tracking, lifecycle management through to disposal in line with any relevant regulations or legislation.

•For software assets: protection from unlicensed use, licence tracking and management, lifecycle management.

•For Cloud-based assets: role management for users with access to Cloud assets, licence and information security management where relevant, cost management.

•For client assets: recording the individuals who own assets, lifecycle management and management of data on client devices.

The practice activities include:

•Defining, populating and maintaining the asset register;

•Providing storage facilities for assets and related media;

•Controlling asset lifecycles;

•Providing reports and data about assets as required; and

•Auditing assets as required, providing data for external audit activities.

Practice considerations

Asset management (and particularly software asset management) is a practice with a very strong business case. A good software asset manager can save an organisation many thousands of dollars in licence fees and help them avoid the significant costs associated with non-compliance. This is a very specialised area, and if your organisation is just starting out with asset management, I recommend you get external advice. SAM consultants will have licence models and suppliers with which they are particularly experienced, and it is often cheaper and easier to use their experience than to go through the learning curve yourself.

The introduction of Cloud services like Amazon Web Services (AWS) and Microsoft Azure has created new challenges for service management professionals. Some of the technical elements of service management (overall capacity and availability planning, for example) now sit with the Cloud services provider, but cost management, information security and access management still sit with the client organisation. I recommend supplier-specific training as well as doing your own online research if you’re responsible for Cloud services or are integrating these into your service management model.

Monitoring and event management

“The purpose of the monitoring and event management practice is to systematically observe services and service components, and record and report selected changes of state identified as events. This practice identifies and prioritizes infrastructure, services, business processes, and information security events, and establishes the appropriate response to those events, including responding to conditions that could lead to potential faults or incidents.”

An event is “any change of state that has significance for the management of a service or other configuration item. Events are typically recognized through notifications created by an IT service, configuration item or monitoring tool.”

This practice manages events through their lifecycle to prevent, minimise or eliminate any negative impact they might have. Monitoring focuses on observing services and service components to detect any potentially significant state changes or conditions. This is almost always done using automation. Event management focuses on recording and managing the outputs from monitoring that are classified as events. Each event will be assessed, and the correct control action initiated. A control action might be ‘do nothing’, continue monitoring, or initiate another practice like incident management. Not all the outputs from monitoring become events.

Events are classified in one of three ways:

•Informational – no action required, but may form part of a trend or data for analysis.

•Warning – action needs to be taken to avoid negative impact.

•Exception – action required; negative business impact might already have occurred.
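The three classifications can be driven by thresholds. The sketch below uses disk usage as an example metric; the threshold values and the control actions attached to each class are assumptions for illustration.

```python
def classify_event(disk_used_percent: float,
                   warning_at: float = 80.0,
                   exception_at: float = 95.0) -> str:
    """Classify a disk-usage reading as informational, warning or exception."""
    if disk_used_percent >= exception_at:
        return "exception"
    if disk_used_percent >= warning_at:
        return "warning"
    return "informational"


# Hypothetical control actions for each classification.
CONTROL_ACTIONS = {
    "informational": "record for trend analysis",
    "warning": "notify operations to act before impact occurs",
    "exception": "initiate incident management",
}

for reading in (42.0, 86.5, 97.0):
    event_class = classify_event(reading)
    print(reading, event_class, "->", CONTROL_ACTIONS[event_class])
```

Thresholds like these need regular review: set too low they flood operations teams with warnings, set too high they miss the conditions that lead to incidents.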

As well as classifying events, monitoring and event management processes and procedures must address:

•Identifying what to monitor;

•Implementing and maintaining monitoring;

•Establishing and maintaining thresholds and criteria to identify and classify events;

•Establishing and maintaining policies for event management; and

•Implementing and improving processes and automation for monitoring and event management.

Although many monitoring and event management activities are automated, human intervention is still essential. Organisations will use a mix of in-built monitoring capabilities in service components, custom monitoring tools, and human intervention and oversight.

Practice considerations

I’ve seen monitoring and event management implementations go wrong in two main ways. The first is when not enough information is collected. Even though processes are in place, tools aren’t picking up the right events and service is still being affected. This makes it very hard to justify further investment in the process.

The second is when too much information is collected and passed to operations teams; they end up swamped and start to ignore events – even the important ones. The process needs to constantly refine what is being collected and where it is sent, to make sure the important information is not missed.

When implementing monitoring and event management, it’s best to start with a simple set of measures, and then add to them as the practice matures. Incidents and problems can be used as learning opportunities to identify whether further monitoring would be beneficial.

It’s also very important to publicise the success of monitoring and event management. Because the practice works proactively and can prevent incidents before they occur, there’s a danger that its successes are rarely seen outside of operations. If monitoring and event management is proactively preventing incidents and improving the level of service on offer, this should be used to justify further investment.

Problem management

“The purpose of the problem management practice is to reduce the likelihood and impact of incidents by identifying actual and potential causes of incidents and managing workarounds and known errors.”

“A problem is a cause, or potential cause, of one or more incidents.”

Problems require investigation and analysis to identify the causes, develop workarounds, and recommend longer-term resolution to reduce the number and impact of future incidents.

“A workaround is a solution that reduces or eliminates the impact of an incident or problem for which a full resolution is not yet available. Some workarounds reduce the likelihood of incidents.”

Workarounds are documented in problem records, and then reviewed and improved as problem analysis progresses. A workaround can be as simple as asking a user to reboot a PC.

“A known error is a problem that has been analyzed but not yet resolved.”

Known errors are documented and made available to other practices, for example, the service desk.

Figure 12 shows the three phases of problem management:

Figure 12: The phases of problem management¹²

Problem identification activities identify and log problems.

They include:

•Trend analysis of data including incident records;

•Detecting recurring issues;

•Identifying whether major incidents might recur;

•Working with suppliers and partners; and

•Analysing information from developers, testing and project teams.
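Trend analysis of incident records does not need sophisticated tooling to get started. The sketch below counts incidents by service and category and flags anything that recurs; the field names and the threshold of three occurrences are assumptions.

```python
from collections import Counter


def recurring_issues(incidents: list[dict], min_occurrences: int = 3):
    """Return (service, category) pairs seen at least min_occurrences times,
    most frequent first, as candidates for a problem record."""
    counts = Counter((i["service"], i["category"]) for i in incidents)
    return [(key, n) for key, n in counts.most_common() if n >= min_occurrences]


# Illustrative incident records only.
records = [
    {"service": "email", "category": "login failure"},
    {"service": "email", "category": "login failure"},
    {"service": "vpn", "category": "timeout"},
    {"service": "email", "category": "login failure"},
]
print(recurring_issues(records))
```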

Problem control activities include:

•Problem analysis, prioritisation and management based on risk;

•Documenting workarounds; and

•Documenting known errors.

Error control activities include:

•Identifying potential permanent solutions;

•Ongoing management and reassessment of known errors; and

•Ongoing management and improvement of workarounds.

Problem management interfaces with other practices including change enablement and incident management as part of its role.

Practice considerations

Problem management needs to be implemented carefully. If it is set up as a separate team or group, there is a danger it will just become a dumping ground for everything the other support staff find too difficult – similar to the issues often experienced by ‘continual improvement’ teams.

Problem management should be viewed as a point of coordination for investigation and resolution of problems. Its role is to work with multiple support teams at once, making sure investigation is carried out in a sensible way and actions are documented so they are not duplicated.

Sometimes an organisation will experience a problem that no one wants to take responsibility for. The network team will insist it’s a server issue, and the server team will insist it’s an application fault. In this situation, problem management needs to have the authority to get the teams to work together, eliminating possible causes until the true cause is identified. In a multi-supplier environment, a service integrator might need to be involved with problem management across the supply chain.

Many organisations struggle to differentiate between incidents and problems. It is essential to define and clearly communicate the scope of the two practices. Just because an incident has been open for a long time doesn’t make it a problem.

Service management tools can significantly improve the performance and usefulness of problem management. If workarounds are easily searchable and problems can be linked to incident records, the practice will work well. If it’s too hard to find a workaround, the incident management staff won’t bother, and problem management won’t deliver value.

You don’t need to have a mature incident management practice and lots of incident records to do problem management. If your incident records are poor, I recommend conducting a user and IT staff survey to find out what they think the top ten recurring incidents are, and work from that.

Release management

“The purpose of the release management practice is to make new and changed services and features available for use.”

A release is “a version of a service or other configuration item, or a collection of configuration items, that is made available for use”.

A release can include:

•Infrastructure components

•Application components

•Documentation

•Training

•Updated or new processes

•Updated or new tools

•Any other required components

Releases may be very minor (for example, one changed feature) or major, such as a new service. A release plan is used to describe the release, release timing and its components. The release schedule then documents the exact timing for releases, based on times that have been negotiated and agreed with customers where necessary. After a release, a post-implementation review is carried out to identify learning and improvement opportunities, and to make sure customers are satisfied with the release outcomes.

Release management activities will be different in waterfall and Agile environments.

Table 24: Release Management Activities – Agile and Waterfall

Agile	Release management activity may take place after deployment, which is delivered in small increments.
Waterfall	Most planning work is done before the release. Deployment and release management activities may be combined into a single process.

Release management may be ‘big bang’ with all changes taking place at once, or ‘staged’, with pilot releases used to test the release before the full rollout.

In a DevOps environment, release management may be integrated into a continuous integration/continuous delivery toolchain. Releases will often contain components provided by external suppliers, so release management needs to work across organisational boundaries.

Practice considerations

Release management relies on effective design activities. If the product or service design has not defined a transition approach, remedial release management work will be needed to make suitable decisions about how to deploy the new or changed product or service.

Release management also requires close links with change enablement, IT asset management and configuration management. Change enablement provides authorisation for release activities, and configuration management provides information to support planning.

In many organisations, I’ve seen situations where release build and test activities are not allocated enough time and resources in the project plan. Once development is complete, these organisations’ normal way of working is to rush the release, worrying about testing, training and documentation once it has gone live. This is obviously not acceptable and, if this is the case, cultural change may also be required to change the priority from meeting a deadline to delivering a quality product.

Release management will be more challenging in an environment with one or more external suppliers involved in service delivery. It is important to make sure that external suppliers align with any release policies implemented within the organisation and to track their compliance.

The organisational attitude to early life support can also affect release management. Early life support should end when operations confirm they will accept the service, because it is performing as it should. In many organisations, early life support is defined by a time period, not by quality standards. Early life support ends after (for example) two weeks, whether the service is performing well or not. This can lead to a rift between transition and operation staff. Early life support must be based around service performance, not time.
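An exit decision based on performance rather than elapsed time could be expressed as a short set of checks. The criteria below are hypothetical examples; each organisation would agree its own with operations.

```python
def can_exit_early_life_support(open_major_incidents: int,
                                availability_percent: float,
                                target_availability_percent: float,
                                operations_accepted: bool) -> bool:
    """True only when the service meets its quality criteria and operations
    have confirmed acceptance. Elapsed time is deliberately not an input."""
    return (open_major_incidents == 0
            and availability_percent >= target_availability_percent
            and operations_accepted)


# Two weeks may have passed, but this service is not yet ready to hand over.
print(can_exit_early_life_support(open_major_incidents=1,
                                  availability_percent=99.2,
                                  target_availability_percent=99.5,
                                  operations_accepted=False))
```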

The challenges associated with release management (and many other service management practices) will evolve as technology evolves. Cloud computing and SaaS may make the process simpler, if applications are hosted centrally. Bring your own device (BYOD) gives users more control of the hardware they access services from, and so will make the process more complicated.

Service catalogue management

“The purpose of the service catalogue management practice is to provide a single source of consistent information on all services and service offerings, and to ensure that it is available to the relevant audience.”

The service catalogue contains a list of services that are available to customers (a subset of the information in the service portfolio). The catalogue might be a document or spreadsheet, an online portal or a sophisticated tool; the important functionality is to be able to communicate the right information to its users.

Service catalogue management activities include publishing, updating and editing service and product descriptions and any related information. More sophisticated service catalogues will offer ‘views’ of data depending on the role of the person viewing the catalogue:

•User view – shows information about services and how to request them.

•Customer view – adds service level, financial and performance data.

•IT to IT customer view – shows technical, information security and process information for use in service delivery.

•Request catalogue – shows service requests associated with services, and allows requests for new and existing services.
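Views like these can come from a single catalogue record that is filtered by the role of the person looking at it. The sketch below illustrates the idea; the service entry, field names and role names are all hypothetical.

```python
# Hypothetical catalogue entry holding every field for one service.
EMAIL_SERVICE = {
    "name": "Email",
    "description": "Corporate email and calendar",
    "how_to_request": "Self-service portal",
    "service_level": "99.9% during business hours",
    "cost_per_user": "5.00 per month",
    "technical_owner": "Messaging team",
    "security_classification": "Internal",
}

USER_FIELDS = ("name", "description", "how_to_request")

# The customer view adds to the user view; the IT view is technical.
VIEW_FIELDS = {
    "user": USER_FIELDS,
    "customer": USER_FIELDS + ("service_level", "cost_per_user"),
    "it": ("name", "technical_owner", "security_classification"),
}


def catalogue_view(service: dict, role: str) -> dict:
    """Return only the fields of a service that the given role should see."""
    return {field: service[field] for field in VIEW_FIELDS[role]}


print(catalogue_view(EMAIL_SERVICE, "user"))
```

Keeping one record and many views means an update is made once and every audience sees consistent information.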

Practice considerations

It is hard to implement a service catalogue and, once it’s been implemented, it can be hard to keep the information in it up to date.

Service catalogue implementation must start with a clear definition of what a ‘service’ actually is. The service provider organisation must consult its customers during this stage – their view of the services provided may be radically different from what the service provider organisation thinks it provides. Where consultation with consumers is not possible, the service catalogue may develop iteratively over time as feedback from consumers is received.

Once the high-level services have been agreed and the service catalogue is populated, it needs to be kept up to date. Service catalogue management needs to link to practices including change enablement and service portfolio management to help it to do this. If these practices are not in place, service catalogue management will have to work much harder to keep track of service changes and updates.

Don’t rush straight in and buy an expensive toolset as part of the process implementation. Service catalogues often start as a simple spreadsheet or matrix, before evolving to views that are more complex. It’s the information in the catalogue and its accessibility that’s important – not the fanciness of the tool used to support it.

Service configuration management

“The purpose of the service configuration management practice is to ensure that accurate and reliable information about the configuration of services, and the configuration items that support them, is available when and where it is needed. This includes information on how configuration items are configured and the relationships between them.”

A configuration item (CI) is “any component that needs to be managed in order to deliver an IT service”.

Service configuration management collects and manages information about CIs including:

•Hardware

•Software

•Networks

•Buildings

•People

•Suppliers

•Documentation

•Services

Service configuration management differs from IT asset management because it collects information on the CIs that support each service (often using data from IT asset management as one of its sources) and the relationships between them. Figure 13 shows an example of a simplified service model for an IT service.

Figure 13: Simplified service model for a typical IT service¹³

The value provided by service configuration management is indirect; it achieves most value when other practices are using the information it supplies. The effort and cost associated with collecting configuration data must be balanced against the benefit realised by having the data.

Configuration information needs to be stored in a controlled way, often in a configuration management system (CMS). The CMS is a “set of tools, data and information that is used to support service configuration management”. Ideally, the CMS will be part of the same toolset or have interfaces to the tools used by other practices, such as incident management and change enablement.

Configuration management needs to be able to:

•Identify new CIs and add them to the CMS;

•Update configuration data when changes are made;

•Verify configuration records are correct; and

•Audit applications and infrastructure to identify any undocumented CIs.
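The four capabilities above, together with the relationships between CIs, can be sketched as a small in-memory structure. This is a toy model under stated assumptions, not how any particular CMS product works: the class name, method names and sample CIs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class CMS:
    """A toy configuration management system held in memory."""
    cis: Dict[str, dict] = field(default_factory=dict)             # CI name -> attributes
    depends_on: Dict[str, Set[str]] = field(default_factory=dict)  # CI name -> CIs it relies on

    def identify(self, name: str, **attributes) -> None:
        """Add a new CI to the CMS."""
        self.cis[name] = dict(attributes)
        self.depends_on.setdefault(name, set())

    def relate(self, ci: str, dependency: str) -> None:
        """Record that `ci` depends on `dependency`."""
        self.depends_on[ci].add(dependency)

    def update(self, name: str, **changes) -> None:
        """Update configuration data after a change has been made."""
        self.cis[name].update(changes)

    def verify(self, name: str, observed: dict) -> dict:
        """Return attributes whose recorded value differs from the observed one."""
        recorded = self.cis[name]
        return {key: (recorded.get(key), value)
                for key, value in observed.items()
                if recorded.get(key) != value}

    def audit(self, discovered: Set[str]) -> Set[str]:
        """Return CIs found in the environment that are not documented."""
        return discovered - set(self.cis)

    def impacted_by(self, name: str) -> Set[str]:
        """Return every CI that depends, directly or indirectly, on `name`."""
        impacted: Set[str] = set()
        frontier = [name]
        while frontier:
            current = frontier.pop()
            for ci, dependencies in self.depends_on.items():
                if current in dependencies and ci not in impacted:
                    impacted.add(ci)
                    frontier.append(ci)
        return impacted


cms = CMS()
cms.identify("email-service", type="service")
cms.identify("mail-server", type="hardware", os_version="1.0")
cms.identify("switch-01", type="network")
cms.relate("email-service", "mail-server")
cms.relate("mail-server", "switch-01")

print(cms.impacted_by("switch-01"))
print(cms.audit({"mail-server", "switch-01", "printer-07"}))
```

The `impacted_by` lookup is where the indirect value shows up: incident management or change enablement can ask which services sit on top of a failing or changing component.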

Practice considerations

Service configuration management is, theoretically, an essential capability for service management, because it underpins and supports so many other service management practices. How can we manage our infrastructure if we don’t know what we’ve got, where it is or if it’s working? However, in reality, there are many service provider organisations with incomplete configuration management information – because it’s seen as too complicated, or the business case for configuration management is not clear.

It’s likely that there will be some asset or configuration management happening in parts of your organisation already. See what exists, and whether you can use anything that’s already in place.

Major projects can be a good opportunity to implement service configuration management. If there’s (for example) an office move or hardware refresh happening, that’s a good time to start collecting data and develop a baseline. Once you have this information, the focus moves to keeping it up to date.

The business case for service configuration management will always focus on indirect, rather than direct, process benefits. Think about how much quicker incidents could be resolved, or how many more successful changes there might be. These are where you will really see the process add value.

It’s very unusual to find service configuration management implemented in isolation. Normally, it will be implemented in conjunction with, or after, change enablement and release management. Without change enablement, there is very little hope of keeping the CMS up to date.

Service continuity management

“The purpose of the service continuity management practice is to ensure that the availability and performance of a service are maintained at sufficient levels in the case of a disaster. The practice provides a framework for building organizational resilience with the capability of providing an effective response that safeguards the interests of key stakeholders and the organization’s reputation, brand and value-creating activities.”

Service continuity management supports overall business continuity management (BCM) and planning, by making sure IT and IT services can be resumed following a disaster or a crisis. It is triggered by a disruption or risk that is outside the scope of normal response practices like incident and problem management.

Each organisation needs to define what ‘disaster’ means to it. The Business Continuity Institute defines a disaster as “a sudden unplanned event that causes great damage or serious loss to an organization. It results in an organization failing to provide critical business functions for some predetermined period of time.”

Disaster sources could be:

•Supply chain failure

•Terrorism

•Weather

•Cyber attack

•Political event

They can affect any of an organisation’s stakeholders, and their impact can include:

•Loss of income

•Reputational damage

•Breach of regulations/fines

•Loss of market share

•Insolvency

Table 25 shows some key definitions for service continuity management.

Table 25: Key Definitions for Service Continuity Management

Term	Definition
Recovery time objective (RTO)	“The maximum acceptable period of time following a service disruption that can elapse before the lack of business functionality severely impacts the organization. This represents the maximum agreed time within which a product or an activity must be resumed, or resources must be recovered.”
Recovery point objective (RPO)	“The point to which information used by an activity must be restored to enable the activity to operate on resumption.”
Disaster recovery plans	“A set of clearly defined plans related to how an organization will recover from a disaster as well as return to a pre-disaster condition, considering the four dimensions of service management.”
Business impact analysis (BIA)	“A key activity in the practice of service continuity management that identifies vital business functions (VBFs) and their dependencies. These dependencies may include suppliers, people, other business processes, and IT services. BIA defines the recovery requirements for IT services. These requirements include RTOs, RPOs, and minimum target service levels for each IT service.”
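A short worked example may help to separate RTO from RPO. The sketch below assumes that both objectives are expressed as durations (for example, 'no more than one hour of data lost'), which is a common convention but not the only one. The function name, dates and targets are invented for illustration.

```python
from datetime import datetime, timedelta


def check_recovery(disruption_start: datetime,
                   service_restored: datetime,
                   last_recoverable_data: datetime,
                   rto: timedelta,
                   rpo: timedelta) -> dict:
    """Compare one recovery (real or rehearsed) with the agreed RTO and RPO."""
    downtime = service_restored - disruption_start
    data_loss_window = disruption_start - last_recoverable_data
    return {
        "downtime": downtime,
        "rto_met": downtime <= rto,
        "data_loss_window": data_loss_window,
        "rpo_met": data_loss_window <= rpo,
    }


# Made-up test: service lost at 09:00, restored at 12:30, last backup at 03:00.
result = check_recovery(
    disruption_start=datetime(2024, 3, 4, 9, 0),
    service_restored=datetime(2024, 3, 4, 12, 30),
    last_recoverable_data=datetime(2024, 3, 4, 3, 0),
    rto=timedelta(hours=4),
    rpo=timedelta(hours=1),
)
print(result["rto_met"], result["rpo_met"])  # True False
```

In this example the service came back within the four-hour RTO, but six hours of data were lost against a one-hour RPO. The two objectives are independent, and a recovery can meet one while failing the other.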

Practice considerations

Service continuity management is another service management practice that often starts with a blaze of glory. Ironically, the best time to get funding to start this process is directly after a period of extreme disruption, when the business realises what it has to lose and why it needs to plan (in a similar way to availability management).

In the initial excitement, consultants will be hired, and documents drawn up, but organisations also need to plan for what happens when the consultants leave.

Has a practice owner been put in place? Is there a genuine commitment to the process? Will ongoing testing be seen as a priority, or gradually dropped in favour of the day job?

The level of commitment to service continuity management will always be linked to the type of organisation and the markets in which it operates. If the organisation operates in a risk-averse, time-critical, heavily regulated market, it is much more likely to invest in service continuity management. Imagine a bank having a service loss and saying ‘sorry, we can’t recover your information’. In fact, you might have seen stories very similar to this make the news!

Service continuity management also tends to become more of a concern as organisations mature. Immature organisations are usually much more focused on day-to-day activities and just hope things won’t happen. Mature organisations will consider the risks more carefully and start to plan for when things go wrong. As organisations grow, they may fall within the scope of legislation and regulation that requires them to have more formal service continuity management in place.

Service design

“The purpose of the service design practice is to design products and services that are fit for purpose, fit for use, and that can be delivered by the organization and its ecosystem. This includes planning and organizing people, partners and suppliers, information, communication, technology, and practices for new or changed products and services, and the interaction between the organization and its customers.”

Poor design leads to services and products that don’t meet customer needs or facilitate value creation. Many organisations follow an iterative and incremental approach to service design, to allow products and services to continually adapt as organisational and customer requirements evolve.

Poorly designed products and services can also be expensive to run and may not operate as expected. Improvement programmes may only be able to patch over the cracks; effective design can get things right first time. Service design needs to focus on customer experience (CX) and user experience (UX). This can be achieved by involving customers and users in the design process, and will lead to:

•Customer-centric products and services;

•A holistic approach to service design;

•Better estimates for design projects (time, cost, resources, etc.);

•Higher volumes of successful changes;

•Creation of effective, reusable design methods;

•Increased confidence in the ability to deliver new or changed products and services; and

•Maintainable and effective products and services.

Holistic service design needs to consider:

•Other products and services

•All relevant stakeholders

•Existing architectures

•Technology (current and future required)

•Service management practices and processes

•Measurements and metrics

Design thinking

Design thinking is a “practical and human-centred approach that accelerates innovation”. Design thinking activities include:

•Inspiration and empathy, through observation of people, how they work and how they interact with products and services;

•Ideation, which combines divergent and convergent thinking;

•Prototyping, to test, iterate and refine ideas;

•Implementation, to bring concepts to life; and

•Evaluation, to measure performance and identify opportunities for improvement.

Practice considerations

Product and service design practices have benefited from a lot of attention and innovation in recent years. Evolving ways of working (for example, Agile software development) have changed the way organisations approach design activities. Maintaining a focus on customer and user experience forces design teams to walk in their customers’ shoes.

Service design is more effective when there is input from many different stakeholders, including support teams, customers, users and suppliers. Consider where design activities are taking place in your organisation and challenge any siloed approaches to service design.

Lessons to be learned from design projects can be lost if teams are broken up and quickly reassigned to new work, so if possible, I recommend adding a learning activity to each design project before people move on to new activities. As lessons are learned and shared, good design practices will become automatic. DevOps thinking encourages keeping the same people involved in the design and running of products and services; this is also a way to make sure knowledge stays within the team.

The service desk

“The purpose of the Service Desk practice is to capture demand for incident resolution and service requests. It should also be the entry point and single point of contact for the service provider with all of its users.”

The service desk captures and funnels demand. For each contact, it needs to:

•Acknowledge: the user needs to know that their contact has been received; for example, issues reported via email could receive an auto-acknowledgement.

•Classify: classification helps the service desk to understand what it is dealing with and how important it is.

•Own: ownership ensures no issue or request gets ‘lost’ between teams or systems.

•Act: resolving things to the user’s satisfaction.
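The four steps above can be sketched as a very small workflow. This is a simplified illustration: the class and field names are hypothetical, and the keyword-based classification is a deliberately crude assumption standing in for whatever rules or judgement a real service desk applies.

```python
from dataclasses import dataclass, field
from typing import List

# Crude, assumed keyword list; real classification is far richer than this.
INCIDENT_WORDS = ("broken", "error", "cannot", "down", "failed")


@dataclass
class Contact:
    user: str
    text: str
    reference: int = 0
    kind: str = "unclassified"
    owner: str = "unassigned"
    status: str = "new"


@dataclass
class ServiceDesk:
    contacts: List[Contact] = field(default_factory=list)

    def acknowledge(self, contact: Contact) -> str:
        """Log the contact and tell the user it has been received."""
        self.contacts.append(contact)
        contact.reference = len(self.contacts)
        contact.status = "acknowledged"
        return f"Thank you {contact.user}, your reference is {contact.reference}."

    def classify(self, contact: Contact) -> None:
        """Decide what the desk is dealing with."""
        text = contact.text.lower()
        is_incident = any(word in text for word in INCIDENT_WORDS)
        contact.kind = "incident" if is_incident else "service request"

    def own(self, contact: Contact, agent: str) -> None:
        """Give the contact a named owner so it cannot get lost."""
        contact.owner = agent
        contact.status = "in progress"

    def act(self, contact: Contact, user_satisfied: bool) -> None:
        """Close the contact only when the user is satisfied."""
        contact.status = "resolved" if user_satisfied else "in progress"


desk = ServiceDesk()
contact = Contact(user="Sam", text="My laptop screen is broken")
print(desk.acknowledge(contact))
desk.classify(contact)
desk.own(contact, agent="Priya")
desk.act(contact, user_satisfied=True)
print(contact.kind, contact.owner, contact.status)
```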

Possible service desk channels include:

•Telephone

•Service portals

•Mobile applications

•Live chat and chatbots

•Email

•Walk-in

•Text messages and social media messaging

•Public and private discussion forums

Service desks may be centralised or virtual:

•Centralised: the service desk is a team working in a single location.

•Virtual: agents can work from multiple locations, using technology to allow them to collaborate.

Some service desk staff are very technical, whereas others are less technical and work more closely with technical teams within the organisation. Service desk staff skills include:

•Empathy

•Emotional intelligence

•Effective communication

•Customer service skills

•Understanding of business priority, incident analysis and prioritisation

Technologies that support service desks include:

•Intelligent telephony systems

•Workforce management/resource planning systems

•Call recording and quality control

•Dashboard and monitoring tools

•Workflow systems

•Knowledge base

•Remote access tools

•Configuration management systems

Practice considerations

Organisational attitudes to the service desk can be surprising. On one hand, it’s the ‘shop window’ of the organisation and vitally important that users get a good experience. On the other hand, staff can work long hours under pressure, in poorly paid roles with little opportunity for advancement.

If your service desk is important to your organisation, the staff who work there need to be treated with respect. Many people use a service desk role as a way to access jobs in the IT industry – it’s a great place to learn about IT quickly before advancing to other roles. Where possible, I recommend allowing service desk staff to rotate and spend time with other teams, and having clear progression plans in place for people to move away from the service desk. This allows them to progress their careers without the organisation losing all of their valuable knowledge.

Organisations like the Service Desk Institute (SDI) and HDI (formerly the Helpdesk Institute) provide support, resources and events for service desk and support staff.

Service level management

“The purpose of the service level management practice is to set clear business-based targets for service levels, and to ensure that delivery of services is properly assessed, monitored, and managed against those targets.”

A service level is “one or more metrics that define expected or achieved service quality”.

A service level agreement is “a documented agreement between a service provider and a customer that identifies both services required and the expected level of service”.

Service level agreements (SLAs) are used to measure the service performance from the customer’s point of view.

Successful SLAs need to:

•Be related to a defined service so the scope is clear;

•Relate to outcomes, not just operational metrics like ‘99% availability’;

•Reflect an agreement between the customer and service provider organisation; and

•Be simple to read and understand.

Service level management provides end-to-end visibility of an organisation’s services:

•It establishes a shared view of services and target service levels.

•It collects, analyses, stores and reports on relevant metrics.

•It performs service reviews and identifies improvement opportunities.

•It captures and reports on service issues.
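The collection and review of metrics described above can be sketched as a simple comparison of measured values against agreed targets. The metric names, targets and measurements below are assumptions made up for the example; note that one target is an outcome measure (invoices processed) and one is operational (time to restore).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Target:
    metric: str
    target: float
    higher_is_better: bool = True


def service_review(targets: List[Target], measured: Dict[str, float]) -> Dict[str, str]:
    """Report, for each agreed target, whether it was met or breached."""
    report = {}
    for item in targets:
        value = measured[item.metric]
        if item.higher_is_better:
            met = value >= item.target
        else:
            met = value <= item.target
        report[item.metric] = "met" if met else "breached"
    return report


targets = [
    Target("invoices_processed_same_day_pct", 95.0),
    Target("average_minutes_to_restore", 60.0, higher_is_better=False),
]
measured = {
    "invoices_processed_same_day_pct": 97.2,
    "average_minutes_to_restore": 75.0,
}
print(service_review(targets, measured))
# {'invoices_processed_same_day_pct': 'met', 'average_minutes_to_restore': 'breached'}
```

A breached target in the report is the starting point for a service review conversation, not the end of one.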

Key skills for service level management include:

•Relationship management

•Business liaison

•Business analysis

•Supplier management

Service level management will collect information from:

•Business metrics, which measure business activities such as making a sale, or processing an invoice;

•Operational metrics, which help build a picture of overall performance and whether outcomes are being met;

•Customer feedback, including surveys and defined business-related measures; and

•Customer engagement, including initial conversations and listening, discovery and information capture, measurement and ongoing process discussions, and asking simple, open-ended questions.

Practice considerations

Service level management (SLM) isn’t something that an organisation can implement as a project and then forget about. SLAs need to be reviewed and kept up to date as the organisation grows and services change. Many organisations hire consultants to help them implement SLM, because this gives them access to the consultant’s skills and templates. Once the consultants leave, the documents are dropped into a cupboard, or online document store, and not referenced again. Implementing successful SLM means putting the processes in place to manage the agreements, not just creating documents.

It is also tempting to create huge SLAs that are complex, wordy and cover every eventuality – don’t! Start simply – more information can be added later if needed.

SLAs need to represent consensus between the service provider organisation and its consumers. If one side doesn’t engage with the process, it will fail. For example, in some organisations, customers try to implement SLAs to control a service provider organisation or function they perceive to be failing, but the service provider may simply ignore the targets.

For some services (for example, a Cloud hosting service), the organisation providing the service offers only standard options, with no option for customers to modify or adapt the SLA. Here, the consumer has a simple choice: to accept the SLA or not.

Finally, be clear about what SLM is trying to achieve. Some service provider organisations will adopt an ‘easy’ set of targets that they know they can deliver, even if failures occur, because they are frightened of being punished if they miss a target. Accept that targets will sometimes be breached, and treat each breach as an opportunity for improvement, not a reason for a fight.

Service request management

“The purpose of the service request management practice is to support the agreed quality of a service by handling all pre-defined, user-initiated service requests in an effective and user-friendly manner.”

A service request is “a request from a user or a user’s authorized representative that initiates a service action which has been agreed as a normal part of service delivery”.

Service requests are different from incidents because they are part of normal service delivery. Nothing has failed. They are handled using predefined and pre-agreed procedures, liaising with change enablement where necessary.

Common types of service request include:

•Request for a service delivery action

•Request for information

•Request for provision of a resource or service

•Request for access to a resource or service

•Feedback, compliments and complaints

Successful service request management relies on these considerations:

•Service request management should be automated and standardised as much as possible.

•Continual improvement should be applied to service request management.

•Policies should be used to allow requests to be fulfilled with appropriate authorisation.

•User expectations should be clearly set.

•Requests that are actually incidents or changes need to be redirected to the appropriate practice.

Service requests can have simple or complex workflows. The steps in the workflows should be well-known and proven. The service provider organisation will agree fulfilment times and provide clear status communication to users. Some service requests can be fulfilled via self-service; for example, requesting a new piece of software or access to a printer.

Practice considerations

As with many service management practices, when you implement service request management (SRM), keep it simple. Most end users have some experience of Internet shopping, which will help them understand the concept of ordering things via a web page. However, if your users only have basic IT literacy, keep things as simple as possible.

The request models you define will be very helpful when you create your workflows and user interface for SRM. For example, you might be able to create one simple request template for ‘new user’, rather than expecting your requestor to know exactly what type of chair, desk, mouse, mouse mat, keyboard, base unit, monitor, etc. to ask for.
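As a rough sketch of this idea, a request model can be thought of as a template that expands one simple request into the individual fulfilment tasks behind it. The model contents and function name below are hypothetical; your own 'new user' model will list whatever your organisation actually provides.

```python
from typing import Dict, List

# Hypothetical request model: one template stands for everything a starter needs.
REQUEST_MODELS: Dict[str, List[str]] = {
    "new user": ["user account", "laptop", "monitor", "keyboard",
                 "mouse", "desk", "chair"],
}


def expand_request(model: str, requested_for: str) -> List[dict]:
    """Turn one catalogue request into the individual fulfilment tasks."""
    return [
        {"task": f"Provide {item}", "for": requested_for, "status": "pending"}
        for item in REQUEST_MODELS[model]
    ]


tasks = expand_request("new user", "Sam Jones")
print(len(tasks))  # 7
print(tasks[0])
```

The requestor only has to choose 'new user'; the detail of what that means is maintained once, by the service provider.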

When it comes to SRM within your own organisation, you need management support and you need your managers to walk the walk. It’s no use trying to force everyone down a central process, if your management team are walking round with the latest shiny toys they paid for out of their own budget. If the organisational decision is that SRM offers two types of laptop as standard, that’s what your management team need to be using.

The SRM workflows will also need to get the right level of approval in the right place. If a line manager has to sign off a mobile device for a user, this authorisation has to be granted before procurement can start. If you don’t manage authorisation carefully, you might find you’ve spent a lot of money and aren’t able to recoup it from the business.
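One way to picture this is as a gate in the workflow: procurement simply cannot run until approval has been recorded. The sketch below is illustrative only, with invented names and a single approver; real workflows may need several approval levels.

```python
from dataclasses import dataclass
from typing import Optional


class AuthorisationMissing(Exception):
    """Raised when procurement is attempted before approval is recorded."""


@dataclass
class DeviceRequest:
    user: str
    item: str
    cost: float
    approved_by: Optional[str] = None
    status: str = "awaiting approval"


def approve(request: DeviceRequest, line_manager: str) -> None:
    """Record who authorised the spend."""
    request.approved_by = line_manager
    request.status = "approved"


def procure(request: DeviceRequest) -> None:
    """Start procurement, but only for authorised requests."""
    if request.approved_by is None:
        raise AuthorisationMissing(
            f"{request.item} for {request.user} has not been approved"
        )
    request.status = "ordered"


request = DeviceRequest(user="Sam", item="mobile phone", cost=400.0)
try:
    procure(request)
except AuthorisationMissing as problem:
    print(f"Blocked: {problem}")

approve(request, line_manager="Alex")
procure(request)
print(request.status)  # ordered
```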

Service validation and testing

“The purpose of the service validation and testing practice is to ensure that new or changed products and services meet defined requirements. The definition of service value is based on input from customers, business objectives, and regulatory requirements, and is documented as part of the value chain activity of design and transition. These inputs are used to establish measurable quality and performance indicators that support the definition of assurance criteria and testing requirements.”

Service validation focuses on the creation and agreement of deployment and release management acceptance criteria. These will address utility and warranty, and must be based on customer requirements. The acceptance criteria are then measured via testing, based on the organisation’s test strategy.
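To show how acceptance criteria and test results fit together, here is a minimal sketch in which each criterion is tagged as utility or warranty and evaluated against a set of test results. The criteria, result names and thresholds are invented for the example and would, in practice, come from customer requirements.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AcceptanceCriterion:
    description: str
    kind: str                       # "utility" or "warranty"
    passes: Callable[[dict], bool]  # evaluated against the test results


def validate(criteria: List[AcceptanceCriterion], results: dict) -> Dict[str, List[str]]:
    """Return the descriptions of failed criteria, grouped by kind."""
    failures: Dict[str, List[str]] = {"utility": [], "warranty": []}
    for criterion in criteria:
        if not criterion.passes(results):
            failures[criterion.kind].append(criterion.description)
    return failures


criteria = [
    AcceptanceCriterion("User can place an order", "utility",
                        lambda r: r["order_placed"]),
    AcceptanceCriterion("95% of pages load within 500 ms", "warranty",
                        lambda r: r["p95_response_ms"] <= 500),
]
results = {"order_placed": True, "p95_response_ms": 850}
print(validate(criteria, results))
# {'utility': [], 'warranty': ['95% of pages load within 500 ms']}
```

Here the service does what it should (utility) but not well enough (warranty), so it would not yet meet its acceptance criteria.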

Test types can include:

•Utility/functional tests: unit test, system test, integration test and regression test

•Warranty/non-functional tests: performance and capacity tests, security test, compliance test and operational test

Practice considerations

The independence of testing is crucial. It’s good practice to use separate resources for development and testing of a new or changed service where practical. Developers know what a service is meant to do, so may be able to work around unexpected results in ways a user could not. They might also be tempted to ignore failures in favour of meeting a deadline.

The test environment is an area that requires careful consideration. It needs to reflect the live environment, so it must be updated when changes are approved and implemented. This needs to be considered when the scope of change enablement is being agreed.

Data in the test environment must be protected in the same way that live data is, and licensing also needs to be addressed – some software vendors provide licences for test environments free of charge, but others do not.

From an operations perspective, the test environment can look like an extension of the spares storage area. When a piece of the live infrastructure breaks, it’s tempting to grab something from the test environment that isn’t currently in use. This is dangerous. The test environment must be protected so that the testing carried out delivers valid results.

¹² ITIL® Foundation, ITIL 4 edition, figure 5.23.

¹³ ITIL® Foundation, ITIL 4 edition, figure 5.29.
