CHAPTER 13
Incidents, Defects, and Enhancements

This chapter addresses emergency incidents, production defects, and enhancement requests. Defect fixes and enhancements are addressed together in this chapter because they both cause a change in production. Their priorities and the level of scrutiny involved in obtaining approvals will differ, but the overall process of migrating the change into production will be the same.

We will start by defining incidents, defects, and enhancements, and we will distinguish defects from enhancements. Then we will delve into the two processes for handling defects and enhancements. The enhancement process is more complicated, since there is more discretion on approvals and prioritizations.

The IT maintenance manager doesn’t have full authority to make all the decisions identified in this chapter. The system owner should be consulted about how priorities are set and how approvals are sought. The business customer may include a large number of people, but the system owner is the specific person or small group that has the final say on system changes.

Incident Definition

An incident is a production issue that could include, but is not limited to, a defect in the software. A non-defect incident can include (1) a data change request such as a user wanting to override an entry, which the system will not allow, or (2) a system administration request such as adding a user, or (3) a user question.

Defect Definition

Production software defects are commonly called bugs. We call them defects in this book and define them as instances in which the software system does not function as designed. A defect can involve clearly defined functionality or implied functionality. Some IT professionals use the term “undocumented feature” to refer to behavior that is not defined in the software design, even though the customer considers it a defect. We will view the subject of defects from the customer’s perspective. Fixing a defect does not have to mean changing the executable code; it can be a data change, a reset of a parameter, or a rerun of a batch job.

Fixing defects is the most recognized work that a maintenance team performs. Defects in the system can cause customers to experience anxiety—sometimes to a pronounced degree. Fixing these defects is the major purpose of the maintenance team. However, business priority and severity, not the customer’s anxiety level, are the most important determinants of the appropriate response from your team. Even though a low-level user may consider a defect a high priority, it may actually have no significant business impact, and your team should therefore work on other priorities.

Enhancement Definition

Enhancements are modifications that change the system’s current functionality. They are sometimes referred to as perfective maintenance; in this book, we call them enhancements. Each request involves a business decision: whether the enhancement should be made at all, or whether the request actually describes a software defect that must be fixed quickly.

There are other production changes that don’t easily fit into the defect or enhancement category. These are referred to as adaptive maintenance and preventive maintenance.

Adaptive maintenance refers to changing a system so that it can be run on a different platform (e.g., changing a system that runs on Windows so that it runs on UNIX). The business would view this as a maintenance issue that your team needs to categorize and prioritize.

Preventive maintenance refers to improving the production environment so that problems are prevented. For example, a batch job can occasionally fail. To restart the failed batch job is incident management, but to keep the problem from ever occurring again is preventive maintenance. Some may view this example as a defect fix. Though that view is understandable, the change is actually preventive maintenance because its result is preventing any future batch failures. Such a change is prioritized just like an enhancement. So in this book we treat adaptive maintenance and preventive maintenance as enhancements.

Customer Categorizing Enhancements as Defects

As you will see, the process flows for enhancements and defect fixes differ. The customer will notice that the maintenance team responds differently to defects than it does to enhancements. For example, an enhancement may need more approvals, and a defect will usually receive a higher priority than an enhancement. A customer may thus want to categorize a change as a defect, when it’s actually an enhancement, in order to get faster service.

Effective project managers are used to dealing with scope creep. This is no time to let down your guard. You want to categorize defects and enhancements correctly for many reasons, including the following:

•   So that charges for work fall into the correct financial category

•   So that lower-priority work does not preempt work with higher business value

•   So that metrics are accurate and not manipulated

•   So that customers recognize the integrity of the maintenance process and its resistance to manipulation

How do you tell whether a request is for a defect fix or for an enhancement? You can test the request with the following questions:

•   Is the system currently working the way it was tested/accepted by the customer?

•   Is the system currently working the way it was working a year ago?

•   Is the system currently working as defined in the design documents?

•   Is the request a change to the system’s functionality?

If the answer to any of these questions is yes, the request is most likely for an enhancement. Of course, your final judgment is what counts. It may be more appropriate to just capitulate and categorize a request as a defect if the circumstances dictate it. If the request is categorized as an enhancement, be sure to let the customer know the process for prioritizing the work. Reassure the customer that the approved requests will be completed in a timely manner.
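The screening questions above can be captured as a simple checklist. The following Python sketch is illustrative only: the class, field, and function names are assumptions and do not come from any tool described in this book. The result is a suggestion that your own judgment can override.

```python
from dataclasses import dataclass


@dataclass
class ScreeningAnswers:
    """Yes/no answers to the four screening questions (hypothetical field names)."""
    works_as_tested_and_accepted: bool
    works_as_it_did_a_year_ago: bool
    works_as_designed: bool
    changes_functionality: bool


def suggest_category(answers: ScreeningAnswers) -> str:
    """Suggest 'enhancement' if any answer is yes, otherwise 'defect'."""
    any_yes = any((
        answers.works_as_tested_and_accepted,
        answers.works_as_it_did_a_year_ago,
        answers.works_as_designed,
        answers.changes_functionality,
    ))
    return "enhancement" if any_yes else "defect"
```

A request for which every answer is no is suggested as a defect; a single yes is enough to suggest an enhancement, which matches the "any of these questions" rule.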

Emergency/Defect Fix Process

The Emergency and Defect Fix Process usually begins when the customer contacts the maintenance team but sometimes begins when the maintenance team itself detects a defect. The emergency can also be a non-defect incident that has a high severity to the business. Figure 13-1 provides a modified version of the Workflow Tracking Process shown as Figure 11-1 in Chapter 11, “Workflow Tracking.” Figure 13-1 assumes an emergency case where the problem needs to be fixed immediately. If the reported problem is just an annoyance, you can use a process similar to the Enhancement Process shown in Figure 13-3.

The Ticket Statuses are not shown in Figure 13-1 since it is not practical to follow the normal process of updating the Work Tracking Tool in an emergency situation. The emergency should first be worked to resolution or until a workaround is introduced. After the emergency is over, the team member should create a ticket for the incident.

After responding to the incident, the team member should investigate the issue and determine what fix needs to be made. If the fix does not require code changes, the fix is simple to make (e.g., a data change or resetting the status of a record). The fix will be more involved if it requires a code change. If at all possible, the code change should go through the normal development life cycle of design, code, test, and migrate. Code changes are sometimes needed quickly, but making such changes too rapidly introduces the risk that they will have an adverse effect on production.

There are still two more tasks to complete after the incident is resolved. The first is to create a Work Tracking Tool Ticket set to the COMPLETE status to document the emergency. The second is to communicate by phone or e-mail to the appropriate parties that the problem is resolved. Individuals to notify can include the person who identified the problem, a main customer contact such as the system owner, and the IT maintenance manager.

The process flow in Figure 13-1 is generic, but your process flow should be specific for your situation and well documented. Consider writing a to-the-point incident response procedure that sets expectations for your team and training them on the procedure. The procedure needs to address the following items:

•   How fast a call should be answered

•   Process flow to follow to resolve the emergency

•   Any pre-approvals that are allowed for an emergency only

•   Whom to contact after the incident is resolved

•   Whom to contact if the problem is not resolved in a specified time, such as 30 minutes

•   Method of contacting people (e-mail, phone, etc.)

•   Expectation for updating the Work Tracking Tool
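One item in the list above, whom to contact if the problem is not resolved in a specified time, lends itself to a simple check. The sketch below is a minimal illustration: the function name and the default limit are assumptions, with 30 minutes used only because it is the example given in the list.

```python
from datetime import datetime, timedelta

# Example limit taken from the list above; replace it with your own procedure's value.
ESCALATION_LIMIT = timedelta(minutes=30)


def needs_escalation(started_at: datetime, now: datetime, resolved: bool,
                     limit: timedelta = ESCALATION_LIMIT) -> bool:
    """Return True when an unresolved emergency has run for at least `limit`."""
    return (not resolved) and (now - started_at) >= limit
```

A check like this only tells the team member that it is time to escalate. The procedure itself still has to say whom to contact and by what method.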

Figure 13-1: Emergency Defect Process


Throughout the process of fixing defects, your team must use any established Change Control Process to ensure that appropriate approvals are obtained. An effective Change Control Process should specify any acceptable deviations from the normal process for handling an emergency.

Severity Levels

We already mentioned setting expectations for the amount of time a team member has to respond to an emergency call. Now let’s consider how much time the person has to resolve the problem. This is where the business severities come into play. Response and resolution times should be defined in the Service Level Agreement (SLA) for each severity level.

Figure 13-2 provides an excerpt from the sample SLA presented in Chapter 5, “Service Level Agreement.” Severity levels apply to defects and production incidents but do not have meaning for enhancements.

The Initial Response Time and Recovery Time shown in Figure 13-2 set the expectations for team members for these functions. Of course, team members would rather have a Moderate incident than a Critical one, because there would be more time to create an appropriate fix or finish up whatever else they were working on.

When the severity to the business is Critical or Serious, an appropriate approach may be to provide a temporary fix. Doing this will not close the incident, but it allows the business to function, so the ticket can be downgraded in severity once the temporary fix is in place. The downgrade gives the team more time to provide a quality, permanent fix.

Figure 13-2: Response Times and Recovery Times for Severity Levels


As an example, the system may have failed over the weekend to the point that its functionality is not usable by the business. This situation meets the criteria for Severity Level 1, which requires an immediate response from the maintenance team. The team works on the problem, which turns out to be complicated, but it is able to provide a temporary fix so that the business can function again. The problem is still there, so the incident cannot be closed, but it now meets the criteria for Severity Level 3, which allows the maintenance team to work on it during normal business hours and engage the vendor if necessary.
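The downgrade in this example can be modeled as a change of severity on a ticket that stays open. The Python sketch below is an assumption-laden illustration: the Incident class and its methods are hypothetical, and the only rule taken from the text is that a lower number means a more severe incident.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """Hypothetical incident record; severity 1 is the most severe."""
    summary: str
    severity: int
    is_open: bool = True
    notes: list = field(default_factory=list)

    def apply_temporary_fix(self, new_severity: int, note: str) -> None:
        """Downgrade severity after a workaround; the incident stays open."""
        if not self.is_open:
            raise ValueError("incident is already closed")
        if new_severity <= self.severity:
            raise ValueError("a downgrade must use a larger severity number")
        self.notes.append(f"severity {self.severity} -> {new_severity}: {note}")
        self.severity = new_severity

    def close(self, note: str) -> None:
        """Close the incident once the permanent fix is in production."""
        self.notes.append(f"closed: {note}")
        self.is_open = False
```

For the weekend outage, the incident would start at severity 1, move to severity 3 when the temporary fix is applied, and be closed only after the permanent fix is delivered.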

Enhancement Process

The Enhancement Process usually begins when the customer sends a request to the maintenance team, but the maintenance team could create the request itself, because adaptive and preventive maintenance are included in the grouping of enhancements, as discussed earlier in this chapter. Figure 13-3 provides a modified version of the Workflow Tracking Process shown as Figure 11-1 in Chapter 11, “Workflow Tracking.”

Four Control Points are depicted in the Enhancement Process shown in Figure 13-3. These Control Points are “gates” at which the maintenance manager and the business customer determine whether (and when) the request should progress to the next phase. These gates control the amount of work that the maintenance team works on at any one time. We will describe the Control Points as we walk through the process steps shown in Figure 13-3.

Let’s start walking through the enhancement process. When the enhancement request is entered into the Work Tracking Tool, it has a status of NEW. The next step is to assign the enhancement to a person to investigate and estimate, but we now encounter Control Point #1, where the manager determines when the assignment will be made. The manager thus can make sure that the team members are not overloaded with work. The request can remain at NEW status while the manager decides when to make the assignment.

Figure 13-3: Enhancement Process


When the enhancement is assigned, the team member will investigate the request, obtain further clarification from the requestor, and estimate the effort needed to complete the enhancement. These results will be delivered to the maintenance manager, the requestor, and the system business owner. An accurate estimate is needed to help the business determine which enhancements to proceed with, and the decision to proceed or not is based in part on return on investment (ROI).

The system business owner will then approve the request, cancel the request, or ask for further investigation if the results are insufficient. Control Point #2 includes these decision options along with when the enhancement coding, if approved, will start.

After Control Point #2, the enhancement will be coded (programmed) and tested. The testing is deemed complete when the business owner approves the test (represented by Control Point #3).

Control Point #4 governs whether the enhancement can now be moved into production. Even though the enhancement is wanted and has passed testing, there may be business reasons to delay the migration, such as a need to wait until after a peak business period is over. The Migration process step represents typical change control methods; however, you should be aware of, and follow, any other change control requirements that your company imposes.

The last step will be to close the Work Tracking Ticket with a status of COMPLETE. As the work progresses through the process, the team member assigned to the enhancement should update the status field and the notes section in the Work Tracking Tool. Doing this provides useful information to you and to the customer.
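The walk-through above can be summarized as a small state machine in which each Control Point is a decision that moves the ticket to its next status. In this sketch, only NEW and COMPLETE are statuses named in the text. The intermediate status names, the decision names, and the advance function are assumptions made for illustration.

```python
# (current status, decision) -> next status. Intermediate names are hypothetical.
TRANSITIONS = {
    ("NEW", "assign"): "INVESTIGATING",                          # Control Point #1
    ("INVESTIGATING", "approve"): "IN_DEVELOPMENT",              # Control Point #2
    ("INVESTIGATING", "cancel"): "CANCELLED",                    # Control Point #2
    ("INVESTIGATING", "investigate_further"): "INVESTIGATING",   # Control Point #2
    ("IN_DEVELOPMENT", "approve_test"): "READY_TO_MIGRATE",      # Control Point #3
    ("READY_TO_MIGRATE", "migrate"): "COMPLETE",                 # Control Point #4
}


def advance(status: str, decision: str) -> str:
    """Return the next status, or raise if the decision skips a Control Point."""
    try:
        return TRANSITIONS[(status, decision)]
    except KeyError:
        raise ValueError(
            f"decision {decision!r} is not allowed from status {status!r}"
        ) from None
```

Because every move goes through the table, a request cannot jump from NEW straight to production without passing each gate. A ticket can also wait at any status until the manager or system owner makes the next decision.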

The Enhancement Process resembles a standard IT project management schedule. For small and medium-sized enhancements, you can assign the enhancement to a team member and manage the task effectively by tracking progress and status in the Work Tracking Tool. Doing this is helpful because you may need to track many small and medium-sized enhancements at once.

Large enhancements require more active attention and management. They may require multiple team members and resemble projects. Some enhancements could become full-fledged projects.

How do you know when to consider an enhancement a project? The following questions should be answered:

•   Are dedicated resources requested to perform the work?

•   Does implementation involve a high risk?

•   Is the schedule aggressive?

•   Are new technologies involved?

•   Are multiple vendors or customers involved?

Basic principles of project management and software engineering should be applied when an enhancement becomes a project.

Finally, unauthorized enhancement work must be prevented and eliminated. Even though the business owner and maintenance manager are in complete agreement on the need to control and limit enhancements as part of cost control, there can still be problems with the lower levels of the organization. Rogue requests can come in through business users contacting their maintenance team friends directly to have work performed unofficially. This type of hidden demand fulfillment needs to be eliminated by the manager.

Vendors

In some cases, the software vendor may have to be engaged to fix a defect or provide an enhancement. The vendor maintenance agreements and software licenses should provide clarity about when and how to engage the vendor, and these agreements should be followed.

Fixes and enhancements can take longer when a vendor is involved. Some vendor agreements allow customers to hold a copy of the source code so that they can implement a temporary fix for time-sensitive incidents until the vendor provides a permanent one.

Customers reporting a defect or requesting an enhancement may not fully understand the relationship between the IT maintenance team and the vendor, including how long the maintenance process could take. Extra care should thus be taken to communicate with customers to ensure that they have a correct understanding.

Grouping Fixes and Enhancements

Frequent defect fixes or enhancements can be disruptive to the business, even if the changes are necessary. It makes sense to group changes into a regular release timeframe, such as monthly or quarterly. The release can fall on a specific day of the month when the change would cause the least inconvenience to the business. This approach may delay an individual fix or enhancement, but it provides more stability to business operations.
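For a monthly release on a fixed day, the release date for a finished change is simple to compute. The sketch below assumes a hypothetical release day of the 15th and limits the day to 28 or less so that it exists in every month; both choices are illustrative, not recommendations from this chapter.

```python
from datetime import date


def next_monthly_release(ready_on: date, release_day: int = 15) -> date:
    """Return the first monthly release date on or after `ready_on`."""
    if not 1 <= release_day <= 28:
        raise ValueError("release_day must be between 1 and 28")
    if ready_on.day <= release_day:
        return ready_on.replace(day=release_day)
    # The release day has passed, so roll over to the following month.
    if ready_on.month == 12:
        return date(ready_on.year + 1, 1, release_day)
    return date(ready_on.year, ready_on.month + 1, release_day)
```

A change that is ready a few days after the release day waits almost a full month, which is the delay this approach trades for more stable business operations.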

Metrics

Monthly metrics on incidents, defects, and enhancements provide a useful management tool for checking the efficiency of the team, evaluating the throughput capacity of the team, and identifying troublesome system components. Effective use of the Work Tracking Tool will make tracking and reporting metrics quick, accurate, and painless.

Metrics should match what is addressed in the SLA. Some metrics to track per review period (typically monthly) are:

•   Number of New Defects Identified by System

•   Number of Defects Closed by System

•   Average Time to Close a Defect for Each Severity Level

•   Number of Non-Defect Incidents Identified by System

•   Number of Non-Defect Incidents Closed by System

•   Number of New Enhancements Requested by System

•   Number of Enhancements Completed by System
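If your Work Tracking Tool can export tickets, the metrics listed above reduce to counting and averaging over a review period. The Python sketch below is a minimal illustration: the Ticket fields, the kind labels, and the function names are assumptions, not the schema of any particular tool.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Ticket:
    """Hypothetical export row from a Work Tracking Tool."""
    system: str
    kind: str                           # "defect", "incident", or "enhancement"
    opened: datetime
    closed: Optional[datetime] = None
    severity: Optional[int] = None      # not meaningful for enhancements


def opened_by_system(tickets, kind, start, end):
    """Count tickets of one kind opened in [start, end), grouped by system."""
    return Counter(t.system for t in tickets
                   if t.kind == kind and start <= t.opened < end)


def closed_by_system(tickets, kind, start, end):
    """Count tickets of one kind closed in [start, end), grouped by system."""
    return Counter(t.system for t in tickets
                   if t.kind == kind and t.closed is not None
                   and start <= t.closed < end)


def average_hours_to_close(tickets, start, end):
    """Average hours from open to close for defects closed in the period, per severity."""
    hours = defaultdict(list)
    for t in tickets:
        if t.kind == "defect" and t.closed is not None and start <= t.closed < end:
            hours[t.severity].append((t.closed - t.opened).total_seconds() / 3600)
    return {severity: sum(h) / len(h) for severity, h in hours.items()}
```

Calling these functions with kind set to "defect", "incident", or "enhancement" covers the seven metrics in the list. The results are only as reliable as the tickets the team enters.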

Armed with this metric data, managers can answer accurately the fundamental questions whose answers they could otherwise only assume. By mining the metric data, you can determine:

•   Trending of the number of new defects per month, which can indicate the quality of the production system and the level of aggravation that users may feel

•   Trending of the Average Time to Close Defects, which can indicate responsiveness of the maintenance team

•   Whether there are a large number of defects for any one system or system component, which could indicate where you need to focus improvement to a system in order to reduce negative business impact

•   Whether enhancement requests are trending up, which can indicate that the business may push up the overall cost of maintenance

•   Whether enhancement requests are trending down, which may be an opportunity to decrease staffing on the maintenance team

•   The number of enhancements closed per month, which can indicate the productivity of the maintenance team

The diligence of your team in entering defect and enhancement tickets in a Work Tracking Tool can pay handsome dividends in the type of knowledge you can gather about your maintenance team operation. Sharing this information with business customers can enlighten them about what your team is doing on their behalf. It can also show them how they may be contributing to maintenance costs (if they are requesting many enhancements).
