CHAPTER 19
Risk Management

Traditional risk management is an effective tool for project management. The PMBOK® Guide defines project risk as “an uncertain event or condition that, if it occurs, has a positive or a negative effect on at least one project objective, such as time, cost, scope, or quality.” The PMBOK® Guide also defines the objectives of Project Risk Management as “to increase the probability and impact of positive events, and decrease the probability and impact of events adverse to the project.”

To encompass maintenance, we define risk simply as events or conditions in the future that could affect your desired outcome. The effect can be positive, although most people focus on the negative. Risk Management is the discipline of thinking through how future variables could influence your desired outcome and taking the steps necessary to secure it.

Risks Versus Issues

Sometimes there is confusion between risks and issues. As a general rule:

•   Risks: Something that COULD have an effect

•   Issues: Something that IS having an effect

Issues also carry certainty. For example, is a storm somewhere in the state of Florida over the course of a year a risk or an issue? It is an issue, not a risk: multiple storms are certain to occur in any given year.

Issues must be managed as part of a project or maintenance team’s normal operations. Certain events will happen, and issues will come up. Issues can be tracked separately or in the Work Tracking Tool. In the example of the Florida storms throughout the year, the impact must be managed if the issue is a factor in your desired outcome.

On the other hand, a storm hitting Orlando, Florida, on March 15 between 11:00 a.m. and noon is a risk, not an issue. If this event were a factor in your desired outcome, you would handle it differently, since there is a high probability that it wouldn’t occur. This is where Risk Management kicks in.

The PMBOK® Guide provides a detailed explanation of Risk Management. For our purposes, we will condense it to three steps: Anticipate, Prevent, and Overcome. To Anticipate, list the events or conditions that could impact your desired outcome, the amount of impact they could have, and the likelihood (probability) that they will happen. To Prevent, plan actions that you can perform now to keep the risk event from happening. To Overcome, plan actions that you can perform if the risk event does happen, so that you can overcome any negative impact.

Figure 19-1 provides a typical Risk Matrix Template that can be used for maintenance. For the best results, involve others when you develop your risk matrix. Your team and peers will provide additional viewpoints and insights.

Use the following steps to manage risk using the Risk Matrix Template in Figure 19-1:

1.   Anticipate potential risks. To help identify all risks, consider risks from the project management Triple Constraint: Scope, Schedule, and Cost. Fill in the Risk Name and Risk Description columns for each significant maintenance risk.

Figure 19-1: Risk Matrix Template

2.   Anticipate the impact and probability of each risk. Determine the impact each risk would have on the desired outcome (0 if no impact; 3 if a high impact). Also determine the probability that the risk will happen (0 if the team has total control; 3 if the team has no control). Update the matrix columns.

3.   Add the two numbers together and enter the sum in both the Current Significance and Original Significance columns. The sums give you a relative ranking of the risks; the lower the number, the better. Over time the Current Significance value will change, and you can compare it with the Original Significance value.

4.   Develop preventive actions that you can take now to keep the event from happening.

5.   Develop actions to overcome the event if it happens.
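The scoring in steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration, not part of the Risk Matrix Template: the class name `Risk`, its field names, the helper `rank_risks`, and the sample ratings are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Risk:
    """One row of a risk matrix (class and field names are illustrative)."""
    name: str
    description: str
    impact: int       # 0 = no impact, 3 = high impact
    probability: int  # 0 = team has total control, 3 = team has no control
    original_significance: int = field(init=False, default=0)

    def __post_init__(self):
        for label, value in (("impact", self.impact),
                             ("probability", self.probability)):
            if not 0 <= value <= 3:
                raise ValueError(f"{label} must be between 0 and 3, got {value}")
        # Original Significance is recorded once, when the risk is first rated.
        self.original_significance = self.impact + self.probability

    @property
    def current_significance(self) -> int:
        # Current Significance always reflects the latest ratings.
        return self.impact + self.probability


def rank_risks(risks):
    """Return the risks ordered from most to least significant."""
    return sorted(risks, key=lambda r: r.current_significance, reverse=True)


# Hypothetical ratings, for illustration only.
risks = [
    Risk("New Governmental Regulations",
         "A new regulation affects system design", impact=2, probability=1),
    Risk("Loss of Maintenance Knowledge",
         "Key team members leave the team", impact=3, probability=2),
]

for risk in rank_risks(risks):
    print(risk.current_significance, risk.name)
```

Keeping the original sum as stored data and the current sum as a computed value means a later re-rating changes only the Current Significance, so the two can still be compared.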

Risk Management for Maintenance

This section will help you get started with your Risk Matrix development. Every maintenance team should anticipate risks like the hot topics discussed in this section. This section also presents potential prevention actions and actions to overcome events if they happen.

The risks can be grouped into two categories: (1) Team Operations and (2) External Factors. We will cover the following risks related to maintenance.

Team Operations:

•   Loss of Maintenance Knowledge

•   Loss of Business Continuity

•   System Disaster

External Factors:

•   Security Attack / Loss of Sensitive Data

•   New Governmental Regulations

The preventive and overcoming actions for these risks, and for any risk, should be a measured response that meets minimum requirements. The preventive actions should only be enough to “prevent” the risk, no more. The “overcome” actions should only be enough to overcome the negative impact, no more.

This may not sit well with everyone. Meeting minimum requirements may sound like mediocre work. But as Neal Whitten wrote in his book Neal Whitten’s No-Nonsense Advice for Successful Projects, “Products or services that meet minimum requirements provide the essentials—the mission-critical requirements—for your client’s success. One of the most common problems with projects is taking on too much work; that is, attempting to exceed requirements rather than meet minimum requirements.”

Meeting minimum requirements is good business. Going beyond minimum requirements will cost your company resources and has a questionable return on investment. Of course, not meeting minimum requirements will also cause problems. Your judgment and the judgment of your department’s leadership are key.

The following sections provide more detail on each maintenance risk.

Risk: Loss of Maintenance Knowledge
Risk Description

One day you may no longer have all the knowledgeable team members you need on your maintenance team. Current team members may leave the company or transfer to another department.

Preventive Actions

Cross-train team members so there is at least one backup person for each role. Develop a Coverage Matrix; refer to Chapter 10, “Team Management.”
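As a rough illustration of the "at least one backup person for each role" rule, the check below flags roles covered by fewer than two people. The layout used here (a mapping from role to the people who can cover it) is an assumption made for the sketch and may differ from the Coverage Matrix described in Chapter 10; the role and person names are hypothetical.

```python
def roles_without_backup(coverage):
    """Return the roles covered by fewer than two distinct people."""
    return sorted(
        role for role, people in coverage.items() if len(set(people)) < 2
    )


# Hypothetical coverage data: role -> people able to perform it.
coverage = {
    "Database support": ["Avery", "Jordan"],
    "Batch scheduling": ["Avery"],
    "Security administration": [],
}

gaps = roles_without_backup(coverage)
print(gaps)
```

Any role that appears in the result is a candidate for cross-training.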

Capture knowledge in appropriate documentation so new team members can readily find system support information. Refer to Chapter 9, “Documentation.”

Overcoming Actions

If possible, keep a line of communication open to the person who left so that you can consult him or her for information until new team members develop their own knowledge.

For a short period of time, bring in contractors who have the knowledge you need.

Risk: Loss of Business Continuity
Risk Description

The IT business may not be able to function normally due to an office building disaster, geographical disaster, or pandemic. Such events can include the loss of key team members.

Preventive Actions

Consider having your team spread out at multiple locations. Doing this may be impractical, but some companies have their teams geographically dispersed for other business reasons.

Overcoming Actions

Develop a Business Continuity Plan that addresses replacement personal computers (PCs), replacement team members, remote network access, and alternate office locations, including people working from home.

Risk: System Disaster
Risk Description

The computer system hardware may be destroyed or unusable.

Preventive Actions

Produce proper backup of the system and data. Maintain hardware and facilities appropriately.

Overcoming Actions

Have a manual process for performing tasks that are normally automated.

Have backup system and data media available off-site. Know where to purchase replacement hardware. Have an alternate computer center available with identical system hardware running the current system. There are vendors that provide these types of facilities and services. Choose the level of response that is just sufficient to meet minimum business needs.

Risk: Security Attack
Risk Description

An unauthorized person (internal or external) gains access to your network and computer systems, resulting in system damage or loss of data. Stolen data can include information that would place your company at a competitive disadvantage if obtained by a competitor. Data about people might be included, too, such as information about customers and employees, including Social Security numbers and financial account numbers. Identity theft is becoming more widespread, and these events are becoming more commonplace.

There is growing concern about the security of computer systems and their data, due in part to the exposure of having internal networks connected to the Internet. Protect data from unauthorized use and ensure use of systems is restricted appropriately.

Preventive Actions

The essence of security is control. All types of access should be limited to employees, contractors, and vendors that have a specific business need for access. Administration of these controls must be clearly documented.

Monitor security events on key computer operating systems and databases. Vendors provide this service, which includes monitoring, filtering event logs, and sending alerts on critical events.

Limit employee access to production systems. A developer can have access to the production system for troubleshooting purposes, but sensitive data should be masked. This feature would have to be built into the system design.

In cases where projects are outsourced, provide test data that does not contain any real sensitive data. For example, you can scrub the data by changing Social Security numbers to random numbers. Have an IT team dedicated to evaluating the industry’s best practices for security and implementing those best practices.
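The scrubbing step mentioned above, replacing Social Security numbers with random numbers, might look like the following sketch. It assumes the numbers appear in the dashed `NNN-NN-NNNN` form inside free text; the function names `scrub_ssns` and `_random_ssn` are hypothetical, and real test-data preparation would also have to cover other formats and other sensitive fields.

```python
import random
import re

# Assumption: SSNs are stored in the dashed form NNN-NN-NNNN.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def _random_ssn(rng):
    """Build a random nine-digit value in SSN layout."""
    digits = "".join(str(rng.randint(0, 9)) for _ in range(9))
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def scrub_ssns(text, rng=None):
    """Replace every SSN-shaped value in text with a random one.

    The same original value always maps to the same replacement within
    one call, so records that shared a number still match after scrubbing.
    """
    if rng is None:
        rng = random.Random()
    replacements = {}

    def replace(match):
        original = match.group(0)
        if original not in replacements:
            candidate = _random_ssn(rng)
            # Never hand back the original value by chance.
            while candidate == original:
                candidate = _random_ssn(rng)
            replacements[original] = candidate
        return replacements[original]

    return SSN_PATTERN.sub(replace, text)


sample = "Jane 123-45-6789; John 987-65-4321; Jane 123-45-6789"
print(scrub_ssns(sample))
```

A random value could coincide with some other real person's number, so a stricter scheme might be preferred where that matters.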

Overcoming Actions

Develop actions to evaluate and isolate any attack. Include which groups should be involved in responding to the attack.

Review the legal and contractual reporting requirements your company is obligated to perform if sensitive data is lost. Develop an action plan that incorporates fulfilling these obligations. Include in the plan communications to the public, customers, and employees if data about individuals is lost.

A public relations company that specializes in crisis-management communications could be useful. Identify and contract with one in advance, and include its contact information in your action plan.

Risk: New Governmental Regulations
Risk Description

New governmental regulations similar to the Sarbanes-Oxley Act (SOX) of 2002 could be enacted that would have a direct impact on system design and maintenance.

Preventive Actions

Be on the lookout for new regulations that could affect your industry or financial controls. Attend trade conferences and read IT periodicals.

Overcoming Actions

If a regulatory requirement was missed, address the resulting audit findings.

Managing Risks Going Forward

After you have completed your first Risk Matrix, capture all your Preventive Actions into one plan with an owner and due date identified for each action. These actions must be completed to prevent the negative impacts you defined in your Risk Matrix.
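One possible way to hold such a plan is a simple list of actions, each with an owner and a due date, plus a check for actions that have slipped. The sketch below is illustrative only: `PreventiveAction`, `overdue_actions`, the owner names, and the dates are assumptions, and the same information could just as easily be kept in a spreadsheet or in the Work Tracking Tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PreventiveAction:
    """One line of the preventive action plan (names are illustrative)."""
    risk_name: str
    action: str
    owner: str
    due: date
    done: bool = False


def overdue_actions(plan, today):
    """Return unfinished actions whose due date has passed, oldest first."""
    late = [a for a in plan if not a.done and a.due < today]
    return sorted(late, key=lambda a: a.due)


# Hypothetical plan entries.
plan = [
    PreventiveAction("Loss of Maintenance Knowledge",
                     "Cross-train a backup for each role", "Avery", date(2024, 3, 1)),
    PreventiveAction("System Disaster",
                     "Verify off-site backups", "Jordan", date(2024, 2, 1)),
    PreventiveAction("Security Attack",
                     "Document access controls", "Morgan", date(2024, 6, 1)),
    PreventiveAction("System Disaster",
                     "Service the hardware", "Jordan", date(2024, 1, 15), done=True),
]

for item in overdue_actions(plan, today=date(2024, 4, 1)):
    print(item.due, item.owner, item.action)
```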

If you are not going to perform these actions, there is no point in developing a Risk Matrix in the first place. If you don’t perform them, admit to yourself that your Risk Management consists of wishful thinking. This chapter presented a useful tool for evaluating negative impacts on your desired outcomes, but the tool is useful only if you actually use it.

Lastly, you need to keep your Risk Matrix updated by conducting a formal review of it. Determine how frequently you will conduct your review and faithfully update the Risk Matrix at that frequency. Because conditions in maintenance do not change as rapidly as they do in projects (due to the shorter life cycle of projects), a less frequent review can be justified for a Risk Matrix for maintenance. We recommend reviewing your Risk Matrix at least quarterly. One way to help ensure that the update will not be forgotten is to tie its date to some other regularly scheduled event such as a quarterly customer survey.
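If you track the review schedule in a script or tool, the next quarterly review date can be computed from the last one. The helper below is a minimal sketch under that assumption; the name `next_review_date` and the three-month default are choices made for the example, and a team might instead tie the date to an existing event as suggested above.

```python
import calendar
from datetime import date


def next_review_date(last_review, months=3):
    """Return the date `months` months after last_review.

    If the target month is shorter than the starting month, the day is
    clamped to the last day of the target month.
    """
    month_index = last_review.month - 1 + months
    year = last_review.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(last_review.day, last_day))


print(next_review_date(date(2024, 1, 15)))
```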

The review should include:

•   Identify what conditions have changed

•   List new preventive and overcoming actions

•   Add any new risks

•   Update significance columns

•   Most importantly, implement any changes
