Capacity and Availability Management: A Project and Work Management Process Area at Maturity Level 3

Purpose

The purpose of Capacity and Availability Management (CAM) is to ensure effective service system performance and ensure that resources are provided and used effectively to support service requirements.



Introductory Notes

The Capacity and Availability Management process area involves establishing and maintaining capacity and availability at a justifiable cost and with an efficient use of resources. Capacity and availability management activities can be performed at different levels of the organization, including across different services.



The Capacity and Availability Management process area involves the following activities:

• Establishing and maintaining a capacity and availability management strategy

• Providing and allocating resources appropriately

• Monitoring, analyzing, understanding, and reporting on current and future demand for services, use of resources, capacity, service system performance, and service availability

• Determining corrective actions to ensure appropriate capacity and availability while balancing costs against resources needed and supply against demand

“Capacity” is the degree to which one thing can support, hold, process, or produce another thing. In the context of services, capacity can refer to the maximum amount of service delivery or maximum number of service requests that a service system can handle successfully within a fixed period of time. Capacity is a quality attribute. The definition and measurement of capacity can differ for different types of services and service systems and can be defined in the service agreement. In addition, capacity definitions and measures can be derived from service agreements, rather than reflected there. If the service agreement has no explicit capacity requirements, it may still imply derived capacity requirements for the service or service system. For some services, capacity can be the maximum size, volume, or throughput of service system components.



As part of establishing the capacity and availability management strategy, the following are determined:

• Resources appropriate to manage

• Aspects of the service system that affect service availability and should be measured, monitored, analyzed, and managed



“Availability” is the degree to which something is accessible and usable when needed. In the context of services, availability can refer to the set of times, places, and other circumstances in which services are to be delivered, service requests are to be honored, or other aspects of a service agreement are to be valid. Availability is a quality attribute. Different work groups can have different definitions and measurements of availability for different types of services and service systems and for various perspectives of availability (e.g., business perspective, end-user perspective, customer perspective, service provider perspective). The definition of availability requires an understanding of how service system components support service requirements for availability, which can be defined in the service agreement. In addition, availability requirements and measures can both depend on and affect other closely related quality attribute requirements, such as maintainability, reliability, sustainability, and security.



Availability is one of the most visible indicators of service quality in the eyes of the end user and customer. For some services, understanding how attributes such as reliability and maintainability relate to availability is important to managing availability.



“Capacity management” is focused on how best to provide resources to meet service requirements. “Availability management” is focused on delivering a sustained level of availability to meet service requirements. However, at a high level, many of the best practices for capacity management and availability management are similar enough to be combined, and they become closely coupled. Capacity management provides the means for achieving sustained availability to meet service requirements. (For some services, it provides spare capacity and resilience as well.)

The simultaneous production and consumption of services is one of the unique characteristics of services. This characteristic presents some challenges for managing the capacity and availability of services. If the capacity and availability to provide the service is not present when demand occurs, the customer must wait, resulting in costs of one kind or another (e.g., lower customer satisfaction, lost business as customers give up on waiting, financial penalties). Costs can also be associated with excess capacity when estimated demand does not occur (e.g., cost of staff on the payroll sitting idle, purchasing costs of excess capacity).



Capacity and availability management includes establishing service system representations and using these representations for the following:

• Supporting negotiation of appropriate service agreements

• Planning

• Making decisions

• Considering corrective actions

• Providing and allocating resources to meet current and future service requirements

“Service system representations,” such as models, simulations, diagrams, maps, and prototypes, provide insight into how a service system will behave given specific work volumes and varieties. These representations can be built using spreadsheets, commercial off-the-shelf (COTS) tools (e.g., simulation packages), or tools developed in house. For some services, the representations can be known as historical baselines, trend analyses, analytical models, analysis of waiting times in queues, simulation models, statistical models (e.g., regression models, time series models), causal models (e.g., probabilistic networks), or application sizing.
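As one illustration of the "analysis of waiting times in queues" mentioned above, the following Python sketch evaluates a simple analytical model. It assumes the service system behaves like an M/M/c queue (random arrivals, exponentially distributed handling times, identical servers). That is a modeling assumption made for this example, not something the process area prescribes, and the function names are hypothetical.

```python
from math import factorial


def wait_probability(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Erlang C formula: probability that an arriving request has to wait.

    arrival_rate: requests per hour arriving at the service system
    service_rate: requests per hour that ONE server can complete
    servers:      number of identical servers (e.g., agents, machines)
    """
    offered_load = arrival_rate / service_rate  # measured in Erlangs
    if offered_load >= servers:
        return 1.0  # demand meets or exceeds capacity, so every request waits
    utilization = offered_load / servers
    waiting_term = offered_load ** servers / factorial(servers) / (1 - utilization)
    no_wait_terms = sum(offered_load ** k / factorial(k) for k in range(servers))
    return waiting_term / (no_wait_terms + waiting_term)


def mean_wait_hours(arrival_rate: float, service_rate: float, servers: int) -> float:
    """Average time a request spends waiting in the queue before service starts."""
    if arrival_rate >= servers * service_rate:
        return float("inf")  # the queue grows without bound
    p_wait = wait_probability(arrival_rate, service_rate, servers)
    return p_wait / (servers * service_rate - arrival_rate)


if __name__ == "__main__":
    # Hypothetical work volume: 2 requests/hour, each server completes 2/hour.
    for servers in (2, 3, 4):
        print(servers, round(mean_wait_hours(2.0, 2.0, servers), 4))
```

Running the model for several server counts shows how waiting time responds to added capacity, which is the kind of insight a representation is meant to provide when negotiating service agreements or allocating resources.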

The scope of capacity and availability management can be one service system or multiple service systems. If the service provider is operating multiple service systems, capacity and availability management processes can be performed independently on each discrete service system, but the organization may realize reduced value.


Capacity, Availability, and Service System Representations

Since capacity and availability are distinct attributes of service systems and their components, a question naturally arises: Why did the CMMI for Services model team place the management of these important properties in the same process area? The introductory notes in this process area suggest two different answers to this question: “... at a high level, many of the best practices for capacity management and availability management are similar enough to be combined, and they become closely coupled. Capacity management provides the means for achieving sustained availability to meet service requirements.” So, capacity and availability are handled together by the model because they share some common goals and practices, and because they are managing a common collection of entities (resources) to achieve objectives that are distinct but intrinsically intertwined.

An additional reason for integrating capacity management with availability management is that both depend on the use of explicit service system representations. These representations may be integrated in ways yielding both capacity and availability information, and capacity and availability estimates derived from these representations may be dependent on each other. Overall capacity of a service system can often be increased by extending the availability of key resources, either by adding resources or by extending their in-service cycle length. Conversely, overall availability of a service can often be increased by providing enlarged capacity. There are too many contexts in which it simply doesn’t make sense to manage capacity and availability independently of each other.

Some CMMI for Services reviewers have also questioned why the model team chose to use the somewhat fuzzy term service system representations rather than the more intuitive term service system models in CAM. The informative material in the CAM process area discusses service system representations at length in SP 1.3 without ever addressing this question, although it is careful to explain why service system representations and process performance models are distinct.

The need for that careful explanation is a hint of the real answer: The term model is already overloaded in the CMMI context with two distinct and specialized meanings (one related to process performance and one related to the collection of CMMI best practices). The CMMI for Services model team concluded that it would create too much confusion to establish a third distinct specialized meaning for the word model when referring to artifacts that describe components, relationships, and properties of service systems. The team selected the term representation rather than model as a way of preventing this potential significant confusion (at the price of occasional questions raised about service system representations).


Related Process Areas

Refer to the Incident Resolution and Prevention process area for more information about identifying, controlling, and addressing incidents.

Refer to the Service Continuity process area for more information about establishing and maintaining plans to ensure continuity of services during and following any significant disruption of normal operations.

Refer to the Service Delivery process area for more information about maintaining the service system.


SSD Add

Refer to the Service System Development process area for more information about developing service systems.


Refer to the Strategic Service Management process area for more information about establishing strategic needs and plans for standard services.

Refer to the Measurement and Analysis process area for more information about specifying measures.

Refer to the Work Planning process area for more information about establishing the service strategy and developing a work plan.



Specific Practices by Goal

SG 1 Prepare for Capacity and Availability Management

Preparation for capacity and availability management is conducted.

Preparation for capacity and availability management includes the following activities:

• Establishing and maintaining a strategy for managing capacity and availability to meet service requirements

• Selecting measures and analytic techniques to support availability and capacity management objectives

• Establishing and maintaining service system representations to understand current capacity, availability, and service system performance (i.e., describe what the normal capacity, availability, and service levels are)

Thresholds are established and maintained to define exception conditions in the service system, recognize breaches or near breaches of service requirements, and identify service incidents. In addition to understanding the capacity and availability of the current service system, capacity, availability, and service levels are estimated based on trends in service resource use, service system performance, and expected service requirements.

SP 1.1 Establish a Capacity and Availability Management Strategy

Establish and maintain a strategy for capacity and availability management.

A strategy for capacity and availability management is based on service requirements, failure and change request trend analysis, current resource use, and service system performance. Service system representations can help to develop a strategy for capacity and availability management. A strategy can address the minimum, maximum, and average use of services (i.e., service resources) over the short, medium, and long term as appropriate for the duration of the service.

It may be appropriate for some services to identify, plan for, and manage the availability of surge capacity or “reach-back” resources to respond to sudden, unexpected increases in demand. For some service types, the management of the obsolescence of certain resources and services factors into the strategy for capacity and availability management.

Service system design documentation can help to determine resources and aspects of the service system to be measured, monitored, analyzed, and managed. However, design documents may not be available or may not accurately and comprehensively reflect all aspects of the live service environment that affect capacity and availability. Therefore, it is important to monitor and analyze actual capacity and availability data. Service strategies, information from day-to-day service delivery and monitoring, and service requirements from current service agreements can assist with these determinations.

Refer to the Service Delivery process area for more information about establishing service agreements.

Refer to the Service System Transition process area for more information about preparing for service system transition.

Refer to the Strategic Service Management process area for more information about establishing standard services.

The strategy for capacity and availability management can reflect factors such as constraints due to limited customer funding and the customer’s acceptance of certain risks related to capacity and availability.

The service provider may not be able to influence or control demand and resource adjustments but is still required to formulate a strategy that best meets service requirements. If the service provider can influence or control demand and resource adjustments, the strategy can be more sophisticated than in situations in which the service provider cannot exercise such influence or control.

Example Work Products

1. Capacity and availability management strategy

Subpractices

1. Document resource and service use, performance, and availability.

2. Estimate future resource and service capacity and availability requirements.

3. Develop a capacity strategy that meets service requirements, meets the demand for resources and services, and addresses how resources are provided, used, and allocated.

4. Develop an availability strategy that meets service requirements and addresses delivering a sustained level of availability.

It may be appropriate for some services to include in the strategy an availability testing schedule, a service system maintenance strategy, and planned service outages.

Refer to the Service Continuity process area for more information about preparing for service continuity.

Refer to the Service Delivery process area for more information about maintaining the service system.

Refer to the Service System Transition process area for more information about preparing for service system transition.

5. Document monetized costs and benefits of the strategy and any assumptions.

6. Periodically revise the strategy.

It may also be necessary to revise the strategy on an event-driven basis.

SP 1.2 Select Measures and Analytic Techniques

Select measures and analytic techniques to be used in managing the capacity and availability of the service system.

The measures specified for managing capacity and availability can require the collection of business data, financial data, service data, technical data, service resource use data, performance data, and other data about the capacity and availability of the service system. Measurement objectives and the selection of measures and analytic techniques for capacity and availability management are largely influenced by the service agreement and specific properties of the service system.

Considerations for selection of measures also include which activities are being supported, reporting requirements, and how the information will be used. Supplier agreements should reflect or support the selected measures and analytic techniques as appropriate.

Refer to the Service Delivery process area for more information about establishing service agreements.

Refer to the Measurement and Analysis process area for more information about aligning measurement and analysis activities.

Refer to the Supplier Agreement Management process area for more information about establishing supplier agreements.



Example Work Products

1. Operational definitions of capacity and availability measures

2. Traceability of capacity and availability measures to service requirements

3. Tools to support collection and analysis of capacity and availability data

4. Target measures or ranges to be met for selected measured attributes

Subpractices

1. Identify measures from organizational process assets that support capacity and availability management objectives.

2. Identify and specify additional measures that may be needed to support achieving capacity and availability management objectives for the service.

3. Analyze the relationship between identified measures and service requirements, and derive objectives that state specific target measures or ranges to be met for each measured attribute.

This analysis can provide input to the descriptions of standard services and service levels.

Refer to the Strategic Service Management process area for more information about establishing standard services.

SP 1.3 Establish Service System Representations

Establish and maintain service system representations to support capacity and availability management.

Service system representations provide insight into how the service system will behave given specific work volumes and varieties. These insights are used to support decision making about resource allocation, changes to the service system, service agreements, and other aspects of service management and delivery.

For many services, demand fluctuates widely. Managing services in the face of widely fluctuating demand is one of the unique challenges characteristic of services. Depending on the patterns of fluctuation, the representations can focus on small or medium time intervals (e.g., by hour of the day for work shift scheduling, day of the week, month of the year) or longer time intervals (e.g., seasons of the year, bi-annually, annually).
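A representation that focuses on short time intervals can be as simple as a historical baseline grouped by hour of the day. The sketch below is one possible way to build such a baseline in Python. The input format (pairs of hour and request count) and the function name are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def hourly_baseline(observations):
    """Summarize demand by hour of day.

    observations: iterable of (hour_of_day, request_count) pairs collected
                  over many days (hypothetical input format).
    Returns {hour: (average_count, peak_count)} ordered by hour.
    """
    counts_by_hour = defaultdict(list)
    for hour, count in observations:
        counts_by_hour[hour].append(count)
    return {
        hour: (mean(counts), max(counts))
        for hour, counts in sorted(counts_by_hour.items())
    }


if __name__ == "__main__":
    # Hypothetical samples: two days of counts for 09:00 and one day for 10:00.
    samples = [(9, 100), (9, 120), (10, 80)]
    for hour, (average, peak) in hourly_baseline(samples).items():
        print(f"{hour:02d}:00 average={average} peak={peak}")
```

The same grouping could be applied to day of the week or month of the year when the fluctuation pattern calls for longer intervals.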

Estimated growth of the use of service resources is formulated using collected capacity and availability data, estimated service requirements, and service system representations.

Measurement objectives and specific properties of the service system determine the nature and extent of a service system representation. (The service agreement has a major influence on the measurement objectives.) Experience, historical data, modeling expertise, and current resource use can also influence the nature of a service system representation.

Refer to the Measurement and Analysis process area for more information about establishing measurement objectives and specifying analysis procedures.

Representations can be used to analyze the impact of change requests that are likely to affect availability and capacity. Representations can also be used to characterize the range of future demand that can be met and the impact of required service levels on the service system. Before representations of future behavior or service system performance can be established, descriptions of the normal use of service resources and service system performance should be established.



Service system representations can be established to provide input to support development of the service agreement and descriptions of standard services and service levels.

Refer to the Service Delivery process area for more information about establishing service agreements.

Refer to the Strategic Service Management process area for more information about establishing standard services.

Service system representations can be established during design of the service system. However, even if great care is taken during the design and development of the service system to ensure that it can meet service requirements over a wide range of operating conditions, service management and delivery should sustain the required levels of service system performance and quality during transition and operation.


SSD Add

Refer to the Service System Development process area for more information about developing service systems.


Service system representations are maintained throughout the service lifecycle.

Service system representations are generally not the same as the process performance baselines and models established in Organizational Process Performance (OPP) at levels 4 and 5. Several things distinguish representations from process performance baselines and models:

• OPP process performance models and baselines involve the use of statistical techniques to assist in developing an understanding of the performance or predicted performance of processes. Service system representations are not typically required to be developed in this way.

• Representations established in CAM are not required to be based on data collected from using the organization’s set of standard processes.

• The focus of OPP is on process performance baselines and models. In addition to process data, the focus of CAM’s service system representations includes non-process data, people, and other parts of the service system such as infrastructure and automated systems.

• Service system representations are established to support capacity and availability analysis specifically. This scope is narrower than the scope of OPP practices.

Refer to the Organizational Process Performance process area for more information about establishing performance baselines and models.

Although not required for capacity and availability management, representations provide opportunities to use statistical techniques such as statistical process control. These techniques can be used to quantitatively manage service system performance and quality and to improve service system capability.

Refer to the Quantitative Work Management process area for more information about quantitatively managing the work to achieve the established quality and process performance objectives for the work.

Example Work Products

1. Representations of resource and service use

2. Representations of service levels

3. Data on the use of resources and services

4. Data on current service levels delivered

5. Thresholds that define exception conditions and breaches

Subpractices

1. Collect measurements on the use of resources and services and the current service levels delivered.

2. Establish and maintain descriptions of the normal use of service resources and service system performance.

For some services, it may be advisable to establish general systems flow charts to identify the service system and its processes before determining the service system’s current capacity, which can require determining the capacity of service system components.

3. Establish and maintain service system representations from collected measurements and analyses.

For some services, it may be advisable to estimate the capacity of the service system at peak work volumes.

4. Review and get agreement with relevant stakeholders about the descriptions of the normal use of service resources, service system performance, and service system representations.

5. Make available the descriptions of the normal use of service resources, service system performance, and service system representations.

6. Establish and maintain thresholds associated with demand, workload, use of service resources, and service system performance to define exception conditions in the service system and breaches or near breaches of service requirements.

Thresholds are typically set below the level at which an exception condition or breach of service requirement occurs to allow corrective action to prevent the breach of service requirement, over-use of resources, or poor service system performance.
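The idea of setting a threshold below the breach level can be sketched in a few lines of Python. The state names, the function name, and the sample values are hypothetical. An actual service system would take its levels from the service agreement and from its own descriptions of normal use.

```python
def classify_use(value: float, warning_threshold: float, breach_level: float) -> str:
    """Classify a measured value (e.g., resource utilization) against two levels.

    warning_threshold: level that triggers corrective action, set BELOW the breach
    breach_level:      level at which the service requirement is breached
    """
    if warning_threshold >= breach_level:
        raise ValueError("the warning threshold must be below the breach level")
    if value >= breach_level:
        return "breach"
    if value >= warning_threshold:
        return "near breach"
    return "normal"


if __name__ == "__main__":
    # Hypothetical levels: act at 80% utilization, breach at 95%.
    for utilization in (0.50, 0.85, 0.97):
        print(utilization, classify_use(utilization, 0.80, 0.95))
```

The gap between the two levels is the time window available for corrective action, so it is typically chosen with the speed of the available adjustments in mind.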

SG 2 Monitor and Analyze Capacity and Availability

Capacity and availability are monitored and analyzed to manage resources and demand.

The contribution of each service system component to meeting service requirements is analyzed to successfully manage the capacity and availability of services. The efficient use of resources is managed according to the capacity and availability management strategy, which is developed to meet service requirements. It might not be possible for a service organization to influence demand for services, and the requirement to do so is not implied by the phrase “manage resources and demand.” Efficient use of resources can include both reactive and proactive responses. Proactive responses are possible in situations in which the service provider can influence demand.

Actual capacity and availability data are monitored regularly. The actual data are also compared regularly with thresholds, descriptions of normal and expected use, and business objectives. These comparisons identify exception conditions in the service system, breaches or near-breaches of service requirements, and changes in the patterns of use of service system resources that can indicate trends. For example, regular monitoring of actual service resource use against estimated service resource use might reveal a pending breach of service requirements.

SP 2.1 Monitor and Analyze Capacity

Monitor and analyze capacity against thresholds.

The use of each service resource is documented as well as the use of each resource by each service (i.e., the extent or degree of use by each service for a given service resource). The impact of service component failures on resources is analyzed.

It can be appropriate for some services to monitor use of surge capacity or “reach-back” resources and determine whether corrective actions are needed such as adjustments to resources provided, adjustments to thresholds, or adjustments to descriptions of the normal use of service resources and service system performance.

The need for corrective actions can be identified as a result of monitoring and analyzing capacity and availability or in response to service incidents, change requests, or changes to service requirements (current and future). Corrective actions can also be identified to improve service system performance or to prevent breaches of the service agreement.

Refer to the Measurement and Analysis process area for more information about specifying data collection and storage procedures.

Example Work Products

1. Service resource use data

2. Growth analysis of service use

3. List of resources not used as estimated

Subpractices

1. Monitor the use of service resources against thresholds, descriptions of normal use, and service system performance.

Refer to the Work Monitoring and Control process area for more information about monitoring work planning parameters.

2. Monitor service response times.

3. Identify breaches of thresholds and exception conditions.

Breaches of thresholds and exception conditions can constitute or indicate an incident.

Refer to the Incident Resolution and Prevention process area for more information about identifying, controlling, and addressing incidents.

Refer to the Service Delivery process area for more information about operating the service system.

4. Determine the corrective action to be taken.

Corrective actions include adjustments to resources and services to prevent performance problems or improve service performance. Adjustments can be automated, performed manually, or both.


SSD Add

Refer to the Service System Development process area for more information about developing service systems.


Refer to the Work Monitoring and Control process area for more information about managing corrective action to closure.

5. Estimate future changes (either growth or reduction) in the use of resources and services.

Methods and tools for estimating service system behavior include trend analysis, analytical modeling, simulation modeling, baseline models, and application sizing.

Estimates of growth in the use of resources can be based on collected capacity and availability data, estimated service requirements, and service system representations.

6. Store capacity and availability data, specifications, analysis results, and monitoring data.
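The trend analysis named in subpractice 5 can be illustrated with a least-squares line fitted to past resource use. This Python sketch assumes equally spaced measurement periods and roughly linear growth. Both are simplifying assumptions, and the function names are hypothetical.

```python
def fit_linear_trend(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ...

    Returns (slope, intercept), where slope is the change in use per period.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two measurements are needed to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_threshold(values, threshold):
    """Estimate how many periods after the last measurement the fitted
    trend reaches the threshold. Returns None if use is flat or shrinking."""
    slope, intercept = fit_linear_trend(values)
    if slope <= 0:
        return None
    crossing_period = (threshold - intercept) / slope
    return max(0.0, crossing_period - (len(values) - 1))


if __name__ == "__main__":
    # Hypothetical monthly storage use in terabytes, with a threshold of 70.
    print(periods_until_threshold([10, 20, 30, 40], 70))
```

An estimate like this is only as good as the assumption of steady growth, so it is best compared against estimated service requirements and the other service system representations before acting on it.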

SP 2.2 Monitor and Analyze Availability

Monitor and analyze availability against targets.

To prevent the failure of service system components and support the availability of the system, the service system must be monitored. At a minimum, availability is monitored. Other quality attributes can be appropriate to monitor depending on the type of service provided. Reliability and maintainability are other quality attributes that can be appropriate to monitor for many types of service systems. Resilience of the service system to service component failure can also be monitored and the impacts of specific failures on service system availability can be identified.

Example Work Products

1. Alarm data

2. Availability data

3. Reliability data

4. Maintainability data

Subpractices

1. Monitor availability, reliability, and maintainability against their requirements.

2. Analyze trends in availability, reliability, and maintainability.

For some services, it may be advisable to perform failure trend analysis as well.

3. Identify breaches of availability, reliability, and maintainability requirements.

Refer to the Incident Resolution and Prevention process area for more information about identifying, controlling, and addressing incidents.

4. Determine the corrective actions to be taken.

Refer to the Service Delivery process area for more information about maintaining the service system.

Refer to the Work Monitoring and Control process area for more information about managing corrective action to closure.
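Reliability and maintainability are often summarized with mean time between failures (MTBF) and mean time to restore (MTTR). These two measures are common industry conventions rather than terms defined in this process area, so the Python sketch below should be read as one possible operational definition. The function name and the sample outage log are hypothetical.

```python
def reliability_and_maintainability(period_hours: float, outage_hours: list):
    """Derive MTBF and MTTR from an outage log for one observation period.

    period_hours: length of the observation period
    outage_hours: duration of each outage within that period
    Returns (mtbf_hours, mttr_hours), or (None, None) when no failure occurred.
    """
    downtime = sum(outage_hours)
    if period_hours <= 0 or downtime > period_hours:
        raise ValueError("outages must fit inside a positive observation period")
    failures = len(outage_hours)
    if failures == 0:
        return None, None  # nothing to average over
    uptime = period_hours - downtime
    mtbf = uptime / failures    # reliability: average operating time per failure
    mttr = downtime / failures  # maintainability: average time to restore service
    return mtbf, mttr


if __name__ == "__main__":
    # Hypothetical 30-day month (720 hours) with outages of 2 and 4 hours.
    print(reliability_and_maintainability(720, [2, 4]))
```

Tracking these values period over period supports the trend analysis in subpractice 2 and makes breaches of reliability or maintainability requirements easier to identify.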

SP 2.3 Report Capacity and Availability Management Data

Report capacity and availability management data to relevant stakeholders.

Reports are provided to relevant stakeholders that summarize information about capacity and availability. These reports support monitoring against the service agreement and service reviews. How data are reported strongly influences how much benefit is derived from capacity and availability management.

Refer to the Work Monitoring and Control process area for more information about monitoring the work against the plan.

Service agreements and supplier agreements can define the information to be reported, to whom it should be delivered, and how it is provided (e.g., format, detail, distribution, media). The information should be appropriate to the audience, which means it should be understandable (e.g., not overly technical) and it may need to address multiple perspectives. These perspectives can include business, end user, customer, or service provider perspectives.

Capacity and availability reports can be regular or ad hoc, depending on what is in the service agreement. For some services, reporting can be greatly simplified by the use of databases offering automated reporting features. Organizational reporting standards should be followed and standard tools and techniques should be used when they exist to support the integration and consolidation of information in the reports.

Refer to the Service Delivery process area for more information about establishing service agreements.

Refer to the Organizational Process Definition process area for more information about establishing standard processes.

Refer to the Supplier Agreement Management process area for more information about establishing supplier agreements.

Availability is often reported as a percentage. In addition to reporting availability, some service providers also report on reliability (e.g., reliability of the service, reliability of service system components) because it is required in the service agreement. The service agreement can also require reporting on maintainability and other quality attributes.
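When availability is reported as a percentage, it helps to translate between the percentage and the downtime it represents. The following Python sketch shows both directions. It assumes availability is measured as the share of an agreed service period during which the service was usable, which is one common definition among several, and the function names are hypothetical.

```python
def availability_percent(period_hours: float, downtime_minutes: float) -> float:
    """Share of the agreed service period during which the service was usable."""
    period_minutes = period_hours * 60
    if period_minutes <= 0 or not 0 <= downtime_minutes <= period_minutes:
        raise ValueError("downtime must lie within a positive service period")
    return 100 * (1 - downtime_minutes / period_minutes)


def allowed_downtime_minutes(target_percent: float, period_hours: float) -> float:
    """Downtime budget implied by an availability target for a given period."""
    if not 0 <= target_percent <= 100:
        raise ValueError("the target must be a percentage between 0 and 100")
    return (1 - target_percent / 100) * period_hours * 60


if __name__ == "__main__":
    # Hypothetical target of 99.9% over a 30-day month (720 hours).
    budget = allowed_downtime_minutes(99.9, 720)
    print(f"downtime budget: {budget:.1f} minutes")
    print(f"achieved: {availability_percent(720, 30):.3f}%")
```

Reporting the downtime budget alongside the percentage can make the figure more understandable to a non-technical audience.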

Example Work Products

1. Service system performance reports

2. Service resource use reports

3. Service resource use projections

4. Service availability reports

Subpractices

1. Report the performance and use of resources and services.

2. Report exception conditions in the service system and breaches of service requirements.

3. Report data from monitoring against growth estimates in resource and service use.

4. Report the availability, reliability, and maintainability of resources and services.
