Chapter 15. Capacity Planning

Introduction

The focus of this chapter is how to plan for the adequate capacity of computer resources within an infrastructure. Just as your perception of whether a cup is either half full or half empty may indicate whether you are an optimist or a pessimist, so also may a person’s view of resource capacity indicate his or her business perception of IT. For example, a server operating at 60-percent capacity may be great news to a performance specialist who is trying to optimally tune response times. But, to an IT financial analyst trying to optimize resources from a cost standpoint, this may be disturbing news of unused resources and wasted costs. This chapter explains and bridges these two perspectives.

We start as usual with a formal definition of the process. This is followed by some of the reasons that capacity planning is seldom done very well in most infrastructures. Next is a list of the key steps necessary for developing and implementing an effective capacity planning process. Included within this discussion is the identification of the resources most commonly involved with capacity planning, itemized in approximate priority order. Additional benefits, helpful hints, and hidden costs of upgrades round out the remainder of this chapter.

Definition of Capacity Planning

As its name implies, the systems management discipline of capacity planning involves the planning of various kinds of resource capacities for an infrastructure.

As we will see, ensuring adequate capacity involves four key elements that are underscored in this definition:

• The type of resource capacities required, such as servers, disk space, or bandwidth

• The size or quantities of the resource in question

• The exact timing of when the additional capacity is needed

• Decisions about capacity that are based on sound, thorough forecasts of anticipated workload demands

Later in this chapter we will look at the steps necessary to design an effective capacity planning program. These four elements are an integral part of such a process. But first we will discuss why capacity planning is seldom done well in most infrastructure organizations.

Why Capacity Planning Is Seldom Done Well

There are two activities in the management of infrastructures that historically are not done well, if at all. These are documentation and capacity planning. The reason for poor, little, or no documentation is straightforward. Few individuals have the desire or the ability to produce quality technical writing. Managers do not always help the situation—many of them do not emphasize the importance of documentation, so the writing of procedures drops to a low priority and is often overlooked and forgotten until the time when it is needed in a critical situation.

But what about capacity planning? Almost every infrastructure manager and most analysts will acknowledge the importance of ensuring that adequate capacity is planned for and provided. There is nothing inherently difficult or complex about developing a sound capacity planning program. So why is it so seldom done well?

In my experience, there are seven primary reasons why many infrastructures fail at implementing an effective capacity planning program (as detailed in the following list). We will discuss each of these reasons and suggest corrective actions.

  1. Analysts are too busy with day-to-day activities.
  2. Users are not interested in predicting future workloads.
  3. Users who are interested cannot forecast accurately.
  4. Capacity planners may be reluctant to use effective measuring tools.
  5. Corporate or IT directions may change from year to year.
  6. Planning is typically not part of an infrastructure culture.
  7. Managers sometimes confuse capacity management with capacity planning.

1. Analysts are Too Busy with Day-To-Day Activities

The two groups of people who need to be most involved with an effective capacity planning process are systems analysts from the infrastructure area and programmer analysts from the application development area. But these two groups of analysts are typically the ones most involved with the day-to-day activities of maintenance, troubleshooting, tuning, and new installations. Little time is set aside for planning activities.

The best way to combat this focus on the tactical is to assign a group within the infrastructure to be responsible for capacity planning. It may start out with only one person designated as the process owner. This individual should be empowered to negotiate with developers and users on capacity planning issues, always being assured of executive support from the development side.

2. Users are Not Interested in Predicting Future Workloads

Predicting accurate future workloads is one of the cornerstones of a worthwhile capacity plan. But just as many IT professionals tend to focus on tactical issues, so also do end-users. Their emphasis is usually on the here and now, not on future growth in workloads.

Developers can help capacity planners mitigate this tendency in two ways:

  1. Explaining to end-users how accurate workload forecasts assist in justifying additional computer capacity to ensure acceptable system performance in the future.
  2. Working with capacity planners to simplify the future workload worksheet to make it easier for users to understand it and to fill it out.

3. Users Who are Interested Cannot Forecast Accurately

Some end-users clearly understand the need to forecast workload increases to ensure acceptable future performance, but they do not have the skills, experience, or tools to do so. Joint consultations with both developers and capacity planners who can show users how to do this can help alleviate this drawback.

4. Capacity Planners May be Reluctant to Use Effective Measuring Tools

Newly appointed capacity planners are sometimes reluctant to use new or complex measurement tools that they may have just inherited. Cross-training, documentation, consultation with the vendor, and turnover from prior users of the tool can help overcome this reluctance.

5. Corporate or IT Directions May Change From Year to Year

One of the most frequent reasons I hear for the lack of comprehensive capacity plans is that strategic directions within a corporation and even an IT organization change so rapidly that any attempt at strategic capacity planning becomes futile. While it is true that corporate mergers, acquisitions, and redirections may dramatically alter a capacity plan, the fact is that the actual process of developing the plan has inherent benefits. I will discuss some of these benefits later in this chapter.

6. Planning is Typically Not Part of an Infrastructure Culture

My many years of experience with infrastructures bear this out. Most infrastructures I worked with were created to manage the day-to-day tactical operations of an IT production environment. What little planning was done was usually at a low priority and often focused mainly on budget planning.

Many infrastructures today still have no formal planning activities chartered within their groups, leaving all technical planning to other areas inside IT. This is slowly changing with world-class infrastructures realizing the necessity and benefits of sound capacity planning. A dedicated planning group for infrastructures is suggested.

7. Managers Sometimes Confuse Capacity Management with Capacity Planning

Capacity management involves optimizing the utilization or performance of infrastructure resources. Managing disk space to ensure that maximum use is occurring is a common example, but this is not capacity planning. Capacity management is a tactical activity that focuses on the present. Capacity planning is a strategic activity that focuses on the future. Understanding this difference should help minimize confusion between the two.

How to Develop an Effective Capacity Planning Process

The following list details the nine major steps associated with implementing a sound capacity planning process. A thorough discussion of each of them follows.

  1. Select an appropriate capacity planning process owner.
  2. Identify the key resources to be measured.
  3. Measure the utilizations or performance of the resources.
  4. Compare utilizations to maximum capacities.
  5. Collect workload forecasts from developers and users.
  6. Transform workload forecasts into IT resource requirements.
  7. Map requirements onto existing utilizations.
  8. Predict when the shop will be out of capacity.
  9. Update forecasts and utilizations.

Step 1: Select an Appropriate Capacity Planning Process Owner

The first step in developing a robust capacity planning process is to select an appropriately qualified individual to serve as the process owner. This person is responsible for designing, implementing, and maintaining the process and is empowered to negotiate and delegate with developers and other support groups.

First and foremost, this individual must be able to communicate effectively with developers because much of the success and credibility of a capacity plan depends on accurate input and constructive feedback from developers to infrastructure planners. This person must also be knowledgeable about systems and network software and components, as well as software and hardware configurations.

Several other medium- and lower-priority characteristics are recommended in selecting the capacity planning process owner (see Table 15-1). These traits and their priorities obviously vary from shop to shop, depending on the types of applications provided and services offered.

Table 15-1. Prioritized Characteristics for a Capacity Planning Process Owner

Step 2: Identify the Key Resources to be Measured

Once the process owner is selected, one of his or her first tasks is to identify the infrastructure resources that must have their utilizations or performance measured. This determination is made based on current knowledge about which resources are most critical to meeting future capacity needs. In many shops, these resources revolve around network bandwidth, the number and speed of server processors, or the number, size, or density of disk volumes comprising centralized secondary storage. A more complete list of possible resources follows:

  1. Network bandwidth
  2. Centralized disk space
  3. Centralized processors in servers
  4. Channels
  5. Tape drives
  6. Centralized memory in servers
  7. Centralized printers
  8. Desktop processors
  9. Desktop disk space
  10. Desktop memory

Step 3: Measure the Utilizations or Performance of the Resources

The resources identified in Step 2 should now be measured as to their utilizations or performance. These measurements provide two key pieces of information.

  1. A utilization baseline from which future trends can be predicted and analyzed.
  2. The quantity of excess capacity available for each component.

For example, a critical server may be running at an average of 60-percent utilization during peak periods on a daily basis. These daily figures can be averaged and plotted on a weekly and monthly basis to enable trending analysis.
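The averaging described above can be sketched in a few lines of Python. This is only an illustration: the function name weekly_averages and the sample figures are hypothetical, and a real shop would pull the daily peak values from its own measurement tools.

```python
from statistics import mean


def weekly_averages(daily_peaks, days_per_week=7):
    """Average daily peak utilizations (percent) into consecutive weekly values."""
    weeks = []
    for start in range(0, len(daily_peaks), days_per_week):
        chunk = daily_peaks[start:start + days_per_week]
        weeks.append(round(mean(chunk), 1))
    return weeks


# Hypothetical peak-period utilizations (percent) for 14 consecutive days.
daily_peaks = [58, 61, 60, 62, 59, 55, 54, 60, 63, 62, 64, 61, 57, 56]
weekly = weekly_averages(daily_peaks)
print(weekly)
```

The same function can produce monthly figures by passing a larger days_per_week value, which gives the baseline points needed for trending.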

Resource utilizations are normally measured using several different tools. Each tool contributes a different component to the overall utilization matrix. One tool may provide processor and disk channel utilizations. Another may supply information on disk-space utilization; still another may provide insight into how much of that space is actually being used within databases.

This last tool can be very valuable. Databases are often pre-allocated by database administrators to a size that they feel supports growth over a reasonable period of time. Knowing how full those databases actually are, and how quickly they are filling up, provides a more accurate picture of disk space utilization. In environments where machines are used as database servers, this information is often known only to the database administrators. In these cases, it is important to establish an open dialog between capacity planners and database administrators and to obtain access to a tool that provides this crucial information.

Step 4: Compare Utilizations to Maximum Capacities

The intent here is to determine how much excess capacity is available for selected components. The utilization or performance of each component measured should be compared to the maximum usable capacity. Note that the maximum usable capacity is almost always less than the maximum possible capacity. The maximum usable server capacity, for example, is usually only 80 to 90 percent of the maximum possible. Similar limitations apply for network bandwidth and cache storage hit ratios. By extrapolating the utilization trending reports and comparing them to the maximum usable capacity, the process owner should now be able to estimate at what point a given resource is likely to exhaust its excess capacity.
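One simple way to perform this extrapolation is a straight-line fit through the monthly utilization averages. The sketch below assumes linear growth and a usable ceiling of 85 percent. Both are illustrative assumptions, as are the function name and the sample data, and real workloads may grow in a less regular way.

```python
def months_until_exhausted(monthly_util, usable_max=85.0):
    """Fit a least-squares line to monthly utilization (percent) and return the
    number of months after the last sample until the line reaches usable_max.
    Returns None when the trend is flat or declining."""
    n = len(monthly_util)
    if n < 2:
        raise ValueError("need at least two monthly samples")
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(monthly_util) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, monthly_util))
             / sum((x - x_mean) ** 2 for x in xs))
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean
    crossing_month = (usable_max - intercept) / slope
    return max(0.0, crossing_month - (n - 1))


# Hypothetical monthly averages of peak-period utilization (percent).
history = [60, 62, 64, 66, 68, 70]
excess = 85.0 - history[-1]            # excess capacity remaining today
months_left = months_until_exhausted(history)
print(excess, months_left)
```

With these sample figures the server has 15 percentage points of excess capacity and, at the current growth rate, reaches the usable ceiling in about seven and a half months.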

Step 5: Collect Workload Forecasts from Developers and Users

This is one of the most critical steps in the entire capacity planning process, and it is the one over which you have the least control. Developers are usually asked to help users complete IT workload forecasts. As in many instances of this type, the output is only as good as the input. Working with developers and some selected pilot users in designing a simple yet effective worksheet can go a long way to easing this step. Figure 15-1 shows a sample user workload forecast worksheet. This should be customized as much as possible to meet the unique requirements of your particular environment.

Figure 15-1. Sample User Workload Forecast Worksheet

Step 6: Transform Workload Forecasts into IT Resource Requirements

After the workload forecasts are collected, the projected changes must be transformed into IT resource requirements. Sophisticated measurement tools or a senior analyst’s expertise can help in changing projected transaction loads, for example, into increased capacity of server processors. The worksheets also allow you to project the estimated time frames during which workload increases will occur. For major application workloads, it is wise to utilize the performance centers that key suppliers of the servers, database software, and enterprise applications now offer.
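As a rough illustration of such a transformation, the sketch below converts a forecasted increase in peak-hour transactions into added processor utilization. It assumes a simple linear model in which each transaction consumes a fixed, previously measured amount of processor time. The function name, the 0.2 processor-seconds per transaction, and the four-processor server are all hypothetical values chosen for the example.

```python
def added_cpu_utilization(extra_tx_per_hour, cpu_seconds_per_tx, processors):
    """Percent of total processor capacity consumed by the extra transactions
    during one peak hour, assuming a fixed processor cost per transaction."""
    cpu_seconds_available = processors * 3600
    cpu_seconds_needed = extra_tx_per_hour * cpu_seconds_per_tx
    return 100.0 * cpu_seconds_needed / cpu_seconds_available


# Hypothetical forecast: 18,000 additional transactions in the peak hour,
# each measured at 0.2 processor-seconds, on a four-processor server.
added = added_cpu_utilization(18_000, 0.2, 4)
print(round(added, 1))
```

A senior analyst or a supplier's performance center would refine this kind of estimate, since memory, channel, and disk effects are ignored here.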

Step 7: Map Requirements onto Existing Utilizations

The projected resource requirements derived from the workload projections of the users in Step 6 are now mapped onto the charts of excess utilization from Step 4. This mapping shows the quantity of new capacity that will be needed by each component to meet expected demand.

Step 8: Predict When the Shop Will Be Out of Capacity

The mapping of the quantity of additional capacity needed to meet projected workload demands also pinpoints the time frame during which these upgraded resources will be required.
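Steps 7 and 8 can be sketched together: add the projected requirements for each period to the measured baseline and report the first period in which the total exceeds the maximum usable capacity. The quarterly figures, the 85 percent ceiling, and the function name below are assumptions made for illustration only.

```python
def first_period_over_capacity(baseline_util, added_by_period, usable_max=85.0):
    """Accumulate forecasted additions (percent) period by period.
    Returns (label, projected utilization) for the first period that exceeds
    usable_max, or None if capacity is adequate for every period given."""
    projected = baseline_util
    for label, added in added_by_period:
        projected += added
        if projected > usable_max:
            return label, projected
    return None


# Hypothetical baseline from Step 3 and requirements derived in Step 6.
baseline = 60.0
forecast = [("Q1", 5.0), ("Q2", 10.0), ("Q3", 12.0), ("Q4", 4.0)]
result = first_period_over_capacity(baseline, forecast)
print(result)
```

In this example the server would run out of usable capacity in the third quarter, so the upgrade needs to be planned, funded, and installed before then.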

Step 9: Update Forecasts and Utilizations

The process of capacity planning is not a one-shot event but rather an ongoing activity. Its maximum benefit is derived from continually updating the plan and keeping it current. The plan should be updated at least once per year. Shops that use this methodology best are the shops that update their plans every quarter. Note that the production acceptance process also uses a form of capacity planning when determining resource requirements for new applications.

Additional Benefits of Capacity Planning

Along with enabling analysts to assess when, how much, and what type of additional hardware resources will be needed, a comprehensive capacity planning program offers other benefits as well. Four of these advantages are as follows:

  1. Strengthens relationships with developers and end-users
  2. Improves communications with suppliers
  3. Encourages collaboration with other infrastructure groups
  4. Promotes a culture of strategic planning as opposed to tactical firefighting

1. Strengthens Relationships with Developers and End-Users

The process of identifying and meeting with key users to discuss anticipated workloads usually strengthens the relationships between IT infrastructure staff and end-user customers. Communication, negotiation, and a sense of joint ownership can all combine to nurture a healthy, professional relationship between IT and its customers.

2. Improves Communications with Suppliers

Suppliers are generally not unlike any other support group in that they do not enjoy last-minute surprises. Involving key suppliers and support staffs with your capacity plans can promote effective communications among these groups. It can also make their jobs easier in meeting deadlines, reducing costs, and offering additional alternatives for capacity upgrades.

3. Encourages Collaboration with Other Infrastructure Groups

A comprehensive capacity plan by necessity involves multiple support groups. Network services, technical support, database administration, operations, desktop support, and even facilities may all play a role in capacity planning. In order for the plan to be thorough and effective, all these various groups must support and collaborate with each other.

Real Life Experience—Two Sides to Every Story

An executive at a marketing company knew each of his 12 departments would be generating enough reports and memos to justify at least two printers for each group.

To reduce costs, he encouraged his staff to print everything on both sides of the paper and ordered only half as many printers as originally planned.

The idea had merit in theory, but it failed miserably in execution. Few users were willing or able to use two-sided reports, and eventually more printers had to be purchased at costs greater than if the original larger order had been placed.

4. Promotes a Culture of Strategic Planning as Opposed to Tactical Firefighting

By definition, capacity planning is a strategic activity. To do it properly, one must look forward and focus on the plans of the future instead of the problems of the present. One of the most significant benefits of developing an overall and ongoing capacity planning program is the institutionalizing of a strategic planning culture.

Helpful Hints for Effective Capacity Planning

Developing a comprehensive capacity plan can be a daunting challenge at the outset; it requires dedication and commitment to maintain it on an ongoing basis. The following hints can help minimize this challenge:

  1. Start small.
  2. Speak the language of your customers.
  3. Consider future platforms.
  4. Share plans with your suppliers.
  5. Anticipate nonlinear cost ratios.
  6. Plan for occasional workload reductions.
  7. Prepare for the turnover of personnel.
  8. Strive to continually improve the process.
  9. Evaluate the hidden costs of upgrades.

1. Start Small

Many a capacity planning effort fails after a few months because it encompassed too broad a scope too early on. This is especially true for shops that have had no previous experience in this area. In these instances, it is wise to start with just a few of the most critical resources—say, processors or bandwidth—and gradually expand the program as more experience is gained.

2. Speak the Language of your Customers

When requesting workload forecasts from your developers, and especially your end-user customers, discuss these in terms that the developers and customers understand. For example, rather than asking for estimated increases in processor utilization, inquire as to how many additional concurrent users are expected to use the application or how many transactions of a specific type are likely to be executed during peak periods.

3. Consider Future Platforms

When evaluating tools to be used for capacity planning, keep in mind new architectures that your shop may be considering and select packages that can be used on both current and future platforms. Some tools that appear well-suited for your existing platforms may have little or no applicability to planned architectures.

4. Share Plans with Suppliers

If you plan to use your capacity planning products across multiple platforms, it is important to inform your software suppliers of your plans. During these discussions, make sure that add-on expenses—the costs for drivers, agents, installation time and labor, copies of licenses, updated maintenance agreements, and the like—are identified and agreed upon up-front. Reductions in the costs for license renewals and maintenance agreements can often be negotiated based on all of the other additional expenses.

5. Anticipate Nonlinear Cost Ratios

One of my esteemed college professors was fond of saying that indeed we live in a nonlinear world. This is certainly the case when it comes to capacity upgrades. Some upgrades will be linear in the sense that doubling the amount of a planned increase in processors, memory, channels, or disk volumes will double the cost of the upgrade. But if the upgrade approaches the maximum number of cards, chips, or slots that a device can hold, a relatively modest increase in capacity may end up costing an immodest amount for additional hardware.
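A small worked example shows how this kind of step in cost arises. The sketch below assumes that memory modules fit into a limited number of free slots and that exceeding them requires an expansion chassis. All prices, slot counts, and names are hypothetical.

```python
import math


def upgrade_cost(modules_needed, free_slots, module_cost,
                 chassis_cost, slots_per_chassis):
    """Cost of adding modules when any overflow beyond the free slots
    forces the purchase of one or more expansion chassis."""
    overflow = max(0, modules_needed - free_slots)
    new_chassis = math.ceil(overflow / slots_per_chassis)
    return modules_needed * module_cost + new_chassis * chassis_cost


# Hypothetical server: 4 free slots, $1,000 per module,
# $20,000 per expansion chassis holding 8 modules.
for modules in (2, 4, 5):
    print(modules, upgrade_cost(modules, 4, 1_000, 20_000, 8))
```

Going from two modules to four doubles the cost, but adding a fifth module raises the cost from $4,000 to $25,000 because the device has run out of slots.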

6. Plan for Occasional Workload Reductions

A forecasted change in workload may not always cause an increase in the capacity required. Departmental mergers, staff reductions, and productivity gains may result in the reduction of some production workloads. Similarly, development workloads may decrease as major projects become deployed. While increases in needed capacity are clearly more likely, reductions are possible. A good guideline to use when questioning users about future workloads is to emphasize changes, not just increases.

7. Prepare for the Turnover of Personnel

Over time, all organizations experience some degree of personnel turnover. To minimize the effects of this on capacity planning efforts, ensure that at least two people are familiar with the methodology and that the process is fully documented.

8. Strive to Continually Improve the Process

One of the best ways to continually improve the capacity planning process is to set a goal to expand and improve at least one part of it with each new version of the plan. Possible enhancements could include the addition of new platforms, centralized printers, or remote locations. A new version of the plan should be created at least once a year and preferably every six months.

9. Evaluate the Hidden Costs of Upgrades

Most upgrades to infrastructure hardware resources have many hidden costs associated with them. We’ll look at these additional expenses more thoroughly in the next section.

Uncovering the Hidden Costs of Upgrades

Even the most thorough technical and business analysts occasionally overlook an expense associated with a capacity upgrade. Identifying, understanding, and quantifying these hidden costs is critical to the success and credibility of a capacity planning program. The following list details many of these unseen expenses:

  1. Hardware maintenance
  2. Technical support
  3. Software maintenance
  4. Memory upgrades
  5. Channel upgrades
  6. Cache upgrades
  7. Data backup time
  8. Operations support
  9. Offsite storage
  10. Network hardware
  11. Network support
  12. Floor space
  13. Power and air conditioning

1. Hardware Maintenance

Some hardware maintenance agreements allow for minimal upgrades at the same annual rate as the original contract. These tend to be the exception. Most agreements have escalation clauses that drive up annual hardware maintenance costs.

2. Technical Support

Multiprocessors, larger cache memories, and additional units of disk volumes usually require operating-system modifications from technical support.

3. Software Maintenance

Modified or upgraded operating systems can result in increased license fees and maintenance costs.

4. Memory Upgrades

Additional processors can eventually saturate main memory and actually slow down online response rather than improve it, especially if memory utilization was high to start with and high-powered processors are added. Memory upgrades may be needed to balance this out.

5. Channel Upgrades

Additional disks can sometimes saturate channel utilization, causing memory or processor requests to wait on busy channels. Upgrades to channels may be needed to correct this.

6. Cache Upgrades

Additional disks can also undermine the benefits of cache storage by decreasing hit ratios due to increased requests for disks. Expanded cache may be required to address this.

7. Data Backup Time

Increasing the amount of data in use by adding more disk space may substantially increase data backup time, putting backup windows at risk. Faster channels or more tape drives may be needed to resolve this.
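The effect on the backup window is easy to estimate with simple arithmetic. The sketch below assumes that tape drives run in parallel and that throughput scales evenly with the number of drives, which is an idealization. The data volumes, drive speed, and window length are hypothetical.

```python
import math


def backup_hours(data_gb, drives, gb_per_hour_per_drive):
    """Elapsed hours to back up data_gb using identical drives in parallel."""
    return data_gb / (drives * gb_per_hour_per_drive)


def drives_needed(data_gb, window_hours, gb_per_hour_per_drive):
    """Smallest number of drives that completes the backup within the window."""
    return math.ceil(data_gb / (window_hours * gb_per_hour_per_drive))


# Hypothetical shop: 10-hour backup window, two drives at 400 GB per hour each.
before = backup_hours(6_000, 2, 400)     # current data volume
after = backup_hours(12_000, 2, 400)     # after doubling disk space
needed = drives_needed(12_000, 10, 400)
print(before, after, needed)
```

Here a backup that once finished comfortably inside the window would overrun it by five hours after the disk upgrade, and a third tape drive would be needed to bring it back within the window.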

8. Operations Support

Increasing backup windows and generally adding more data center equipment may require more operations support.

9. Offsite Storage

Additional tapes that have to be stored offsite due to increased amounts of data to back up may result in additional expense for offsite storage.

10. Network Hardware

Increasing the amount of tape equipment for more efficient data backup processing may require additional network hardware.

11. Network Support

Additional network hardware to support more efficient tape backup processing may require additional network support.

12. Floor Space

Additional boxes (such as servers, tape drives, or disk controllers) require data center floor space, which eventually translates into added costs.

13. Power and Air Conditioning

Additional data-center equipment requires air conditioning and electrical power; this eventually can translate into increased facilities costs.

Assessing an Infrastructure’s Capacity Planning Process

The worksheets shown in Figures 15-2 and 15-3 present a quick-and-simple method for assessing the overall quality, efficiency, and effectiveness of a capacity planning process. The first worksheet is used without weighting factors, meaning that all 10 categories are weighted evenly for the assessment of a capacity planning process. Sample ratings are inserted to illustrate the use of the worksheet. In this case, the capacity planning process scored a total of 20 points for an overall nonweighted assessment score of 50 percent. The second sample worksheet compiled a weighted assessment score of 47 percent.

Figure 15-2. Sample Assessment Worksheet for Capacity Planning Process

Figure 15-3. Assessment Worksheet for Capacity Planning Process with Weighting Factors

One of the most valuable characteristics of these worksheets is that they are customized to evaluate each of the 12 processes individually. The worksheets in this chapter apply only to the capacity planning process. However, the fundamental concepts applied in using these evaluation worksheets are the same for all 12 disciplines. As a result, the detailed explanation on the general use of these worksheets presented near the end of Chapter 7, “Availability,” also applies to the other worksheets in the book. Please refer to that discussion if you need more information.
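The arithmetic behind the two assessment scores can be sketched as follows. The sketch infers a maximum rating of 4 per category from the figures quoted earlier (10 categories, 20 points, 50 percent). The weighted formula and the individual ratings and weights shown are assumptions for illustration and are not taken from the worksheets themselves.

```python
def nonweighted_score(ratings, max_rating=4):
    """Percentage score when every category counts equally."""
    return 100.0 * sum(ratings) / (max_rating * len(ratings))


def weighted_score(ratings, weights, max_rating=4):
    """Percentage score when each category rating is multiplied by a weight
    (assumed formula: weighted points earned over weighted points possible)."""
    if len(ratings) != len(weights):
        raise ValueError("ratings and weights must have the same length")
    earned = sum(r * w for r, w in zip(ratings, weights))
    possible = max_rating * sum(weights)
    return 100.0 * earned / possible


# Hypothetical ratings for the 10 categories (they total 20 points)
# and hypothetical weighting factors.
ratings = [3, 2, 2, 1, 2, 3, 1, 2, 2, 2]
weights = [5, 3, 3, 1, 3, 5, 1, 3, 1, 5]
plain = nonweighted_score(ratings)
weighted = weighted_score(ratings, weights)
print(plain, round(weighted, 1))
```

The nonweighted result reproduces the 50 percent quoted for a 20-point total. The weighted result depends entirely on the weights chosen, so it will differ from the sample worksheet.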

Measuring and Streamlining the Capacity Planning Process

We can measure and streamline the capacity planning process with the help of the assessment worksheet shown in Figure 15-2. We can measure the effectiveness of a capacity planning process with service metrics such as the number of instances of poor response due to inadequate capacity on servers, disk devices, or the network. Process metrics (such as the number of instances of poor response due to inadequate capacity on servers, disk devices, or the network) help us gauge the efficiency of this process. We can streamline the capacity planning process by automating certain actions: for example, the notification to analysts when utilization thresholds are exceeded, the submittal of user forecasts, and the conversion of user-workload forecasts into capacity requirements.
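The first of these automations, notifying analysts when utilization thresholds are exceeded, might be sketched as follows. The resource names, threshold values, and function name are hypothetical, and the actual delivery of the notification (e-mail, paging, or a ticket) is left out.

```python
def resources_over_threshold(utilizations, thresholds, default_threshold=80.0):
    """Return the names of resources whose current utilization (percent)
    exceeds their threshold; resources without an entry use the default."""
    return sorted(
        name
        for name, util in utilizations.items()
        if util > thresholds.get(name, default_threshold)
    )


# Hypothetical current readings and per-resource thresholds (percent).
current = {"server_cpu": 87.0, "disk_space": 72.0, "network_bandwidth": 65.0}
limits = {"server_cpu": 85.0, "disk_space": 90.0, "network_bandwidth": 60.0}
alerts = resources_over_threshold(current, limits)
print(alerts)
```

A scheduled job could run a check like this against the measurement tools from Step 3 and pass the resulting list to whatever notification mechanism the shop already uses.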

Summary

Capacity planning is a strategic activity that focuses on planning for the future. It is sometimes confused with capacity management, which is a tactical activity that focuses on issues in the present. This chapter discussed these differences, starting with a formal definition of capacity planning followed by some of the more common reasons this process is typically done poorly in many infrastructures.

The major part of this chapter presented the nine key steps required to develop a robust capacity planning process. Included within this discussion were recommended characteristics for a process owner, resources to consider for inclusion, a sample user worksheet for workload forecasts, and an example of utilization mapping.

A short section describing several additional benefits of a sound capacity planning process followed next, along with numerous helpful hints I have accumulated over the years implementing these processes. Next came a description of various hidden costs of upgrades that are frequently overlooked. Similar to previous chapters, this one concluded with explanations and worksheets on how to assess, measure, and streamline the capacity planning process.

Test Your Understanding

1. The ratio of upgrade costs to upgrade capacities is almost always a linear, or constant, relationship. (True or False)

2. In order to quickly establish a new capacity planning effort, one should start with as broad a scope as possible. (True or False)

3. Which of the following is not a primary reason for infrastructures failing at implementing an effective capacity planning program?

a. analysts are too busy with day-to-day activities

b. users are not interested in predicting future workloads

c. capacity planners may be reluctant to use effective measuring tools

d. corporate or IT directions may remain static from year to year

4. Two activities in the management of infrastructures that historically are not done well are ____________ and ____________.

5. Why do workload forecasts need to be translated from a business perspective into an IT perspective?
