Chapter 13. Keeping Defects in Check

Potentially shippable. You’ve heard this phrase before and you’ll hear it again. So what about defects then? Is a story potentially shippable if at the end of the sprint it has some defects that were left unaddressed? This then leads teams to wonder whether each project should have a sprint (or two) dedicated to fixing defects. How about a stabilization phase? After all, you need to reach the mythical code complete phase, right?

Defect management is one of the easiest things to forgo on any agile project. After all, teams have been conditioned to address defects at the end of projects. Failing to deal with these defects on an ongoing basis, though, results only in low-quality software and the need to “build quality into the system” during a defect-fixing phase or sprint.

So how should a team manage its defects? Dozens of books describe techniques for managing defects, but I have found a strategy that I particularly like and have successfully applied to a large majority of my projects, both big and small, new code and legacy systems. I’ll start by telling you the story that led to this pattern.

The Story

Miguel and his team were new to Scrum. Miguel was excited about his new role as product owner and had been reading books and practicing. Unfortunately, because the project started about one week after the team was formed, he hadn’t learned as much as he wanted yet.

About halfway through its first sprint, the team asked Miguel how he wanted to handle defects.

“Miguel, right now we’re two weeks into this one-month sprint and we’ve got about twenty defects,” said a team member. “Should we hold off and have a dedicated defect-fixing sprint or plan some time between sprints to fix defects?”

Miguel was a bit perplexed. Everything he had read led him to believe that a dedicated defect-fixing sprint was the wrong way to go, but he really wasn’t sure what to suggest as an alternative. As far as he knew, the only way defects were ever fixed at the company was in a dedicated defect-fixing phase.

“Just hold off and let’s see how it goes,” said Miguel. By the end of the sprint, the team had found even more defects—so many, in fact, that the sprint review meeting was cancelled.

The team went into the retrospective and asked Miguel to join. As they all started writing things on the wall, the one that emerged was the most obvious: we have too many defects.

Raul was the ScrumMaster, and he saw this as a big impediment, so he asked, “How should we handle the defects?”

The team sat there for a while, and then the ideas started coming.

“Have a defect-fixing sprint,” said one team member. Raul wrote it on the whiteboard.

“How about we just have a fixed amount of time each sprint to fix defects at the end, say a day or two?” said another. Raul again added it to the whiteboard.

“I read somewhere that teams put defects on the product backlog. I guess we can just do that,” added another team member.

Then the room was silent, until finally Miguel spoke up.

“Why don’t we just fix them in real time?” he asked.

The team looked at him like he was crazy.

“Real time? As in, we just fix them as we find them?”

“Yes,” said Miguel. “That way we keep our costs low. The other ideas we’ve identified so far all have the same pattern—pushing off the defects until some later date. The problem is, if we do that, then we’ll be writing code on buggy software. How will we know if the code we write days later, on top of a buggy API, let’s say, won’t have defects in it, too?”

The team sat there a bit stunned.

“What if we just do the second option, setting some time aside at the end?” someone asked.

“I was thinking about that,” said Miguel. “Since we never know how many defects we’ll have during a sprint, it’ll be nearly impossible to set time aside. What if we have too many defects and not enough time? Then we’re back to where we are today. I’m willing to try it as a last resort, but I’d rather try the real-time thing first.”

The team sat there, thinking.

“Yes, I think you’re right,” said a team member. “I’m willing to try real-time bug fixing, but are we going to fix every bug in real time? Even the low-priority stuff?”

“I think the most critical defects need to be fixed in real time. Any items that are lower priority can go back on the product backlog, and I’ll prioritize them,” said Miguel.

“That’s a fair compromise. I’m willing to try this, too,” said another team member.

Raul took some notes. “Great! Then it’s settled. We’ll do real-time bug fixing. Now, let’s figure out what that really means.”

The Model

I am a strong advocate for prioritizing quality above all else, when it makes sense. I also understand that even code that lives up to the highest quality standards will never be perfect—will never be forever free from defects and maintenance. As such, I accept that defects are a part of life. That doesn’t mean, however, that teams don’t need a technique for managing them, preferably in real time.

The first step in defect management is to realize what is important to the customers, the product owner, and the team. Customers want the team to deliver features that, at least in theory, can be released to production; they don’t want to pay for defects. Product owners want to meet the requests of their customers; they know that customers want defect-free code. Teams want the freedom to guard against defects when they realize a piece of their code is in bad shape.

The second step in defect management is to understand that frequent testing reduces overall project costs and defects [BECK] and that it’s just plain cheaper to fix defects sooner rather than later. How much cheaper? Industry-standard estimates of the cost to fix a defect follow the familiar 1:10:100 rule, where a defect identified on the team member’s desktop has a cost of one, and defects get exponentially more expensive as they move through the software life cycle. In Software Engineering Economics, Barry Boehm observed a 4:1 cost/fix ratio [BOEHM]. My favorite study, however, was done in 2002 and published by Johanna Rothman in an article on the StickyMinds website [ROTHMAN]. Based on the customers she works with, she concluded that a team that waits until the very end to fix defects will have a cost per defect 400 percent larger than a team that addresses defects in real time or near real time.
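To make the 1:10:100 rule concrete, here is a minimal Python sketch of the arithmetic. Only the ratios come from the rule; the stage names and the function are hypothetical labels chosen for illustration, and the count of twenty defects simply echoes the story above.

```python
# Relative cost multipliers under the 1:10:100 rule of thumb.
# The stage names are illustrative assumptions; only the ratios come from the rule.
RELATIVE_COST = {
    "desktop": 1,
    "later_in_life_cycle": 10,
    "production": 100,
}


def relative_fix_cost(defect_counts):
    """Sum the relative cost of fixing defects, given a count per stage."""
    return sum(RELATIVE_COST[stage] * count for stage, count in defect_counts.items())


# Twenty defects fixed at the desktop versus the same twenty fixed in production.
fixed_early = relative_fix_cost({"desktop": 20})
fixed_late = relative_fix_cost({"production": 20})
print(fixed_early, fixed_late)  # 20 2000
```

The units are relative, not dollars or hours. The point is only that the same twenty defects cost two orders of magnitude more when they are deferred to the last stage.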

All this data was enough to convince me that it is just better to fix defects in real time. From this, I created a simple model that gives teams the freedom to fix critical defects in real time, while allowing the product owner to prioritize less critical defects using the product backlog. In essence, all defects are rated on a priority scale of 0 to 3, where priority 0 and priority 1 are critical defects and priority 2 and priority 3 are less critical. The team has full authority to fix p0 and p1 defects as they see fit. The p2 and p3 defects are put onto the product backlog to be reviewed and prioritized.
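The routing rule in this model is small enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not a prescribed tool: the `Defect`, `TriageResult`, and `triage` names and the sample defect titles are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Priority(IntEnum):
    """The 0-to-3 priority scale: lower numbers are more critical."""
    P0 = 0
    P1 = 1
    P2 = 2
    P3 = 3


@dataclass
class Defect:
    title: str  # hypothetical field; real trackers carry far more detail
    priority: Priority


@dataclass
class TriageResult:
    fix_now: list = field(default_factory=list)          # team fixes these in real time
    product_backlog: list = field(default_factory=list)  # product owner prioritizes these


def triage(defects):
    """Route p0/p1 defects to the team and p2/p3 defects to the product backlog."""
    result = TriageResult()
    for defect in defects:
        if defect.priority <= Priority.P1:
            result.fix_now.append(defect)
        else:
            result.product_backlog.append(defect)
    return result


# Sample defects with made-up titles.
found = [
    Defect("Checkout crashes on submit", Priority.P0),
    Defect("Typo in footer", Priority.P3),
    Defect("Login fails for new accounts", Priority.P1),
]
result = triage(found)
print([d.title for d in result.fix_now])
print([d.title for d in result.product_backlog])
```

The sketch deliberately leaves out how a priority gets assigned. That judgment belongs to the person or pair who finds the defect, using whatever rating system the team has agreed on.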

At this point you may be wondering how to differentiate between p0, p1, p2, and p3. Determining these values is essential in guiding the team members as to when they should use their full authority to fix a bug. Table 13-1 illustrates how I rate these values. Use my definitions as considerations and not the default standard for your projects. In other words, talk about this with your team and decide on a model that works for you and your company.

TABLE 13-1 Defect Rating System

Once you and the team have determined a standardized way to prioritize defects, the team can begin ranking them during the sprints. When a new defect surfaces, the person or pair who discovered the defect must quickly triage the issue in real time and determine its priority using the scale in Table 13-1 (or the one you create). If the defect is a p2 or p3, the defect is immediately logged with as much detail and as many supporting files or logs as possible. The defect number is then put on the product backlog for review and prioritization by the product owner.

If, on the other hand, the defect is a p0 or p1, the person or pair has one hour to accomplish the following tasks:

• Stop the work they are doing.

• Identify the root cause of the defect.

• Fix the root cause.

• Update all tests (unit, integration, acceptance).

• Build or update any build verification tests (BVTs).

• Ensure that all tests (acceptance, unit, etc.) are passing.

• Check the code in.

• Release the software to, at least, an integration environment.

If the person or pair can do this within one hour, they are not required to open the defect tracking software to log the defect. However, if this cannot be done within one hour, they must take the following steps:

1. Stop at the end of one hour.

2. Log the defect in the tracking system.

3. Continue driving the defect to completion with the criteria listed above.

4. Once this is complete, open the defect in the tracking system, writing all steps taken to get it to completion, and then close the defect.

5. Create a new item in the sprint backlog for the defect that was fixed, along with the hours spent on driving the defect to completion.
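The one-hour rule above boils down to a single decision: did the fix fit inside the timebox or not? The following Python sketch shows that decision under stated assumptions. The function name, the step wording, and the choice to report hours to one decimal place are hypothetical, not part of any tracking tool.

```python
from datetime import timedelta

# The one-hour timebox for fixing a p0 or p1 defect.
TIMEBOX = timedelta(hours=1)


def follow_up_actions(time_spent):
    """Return the bookkeeping steps owed for a p0/p1 fix, given total time spent."""
    if time_spent <= TIMEBOX:
        # Fixed within the hour: no defect record is required.
        return []
    hours = time_spent.total_seconds() / 3600
    return [
        "log the defect in the tracking system",
        "record all steps taken, then close the defect",
        f"add a sprint backlog item with {hours:.1f} hours spent",
    ]


print(follow_up_actions(timedelta(minutes=45)))  # []
for step in follow_up_actions(timedelta(minutes=90)):
    print(step)
```

In practice the first step happens at the one-hour mark, before the fix is finished, and the other two happen once the fix is complete. The sketch only captures which steps are owed, not when they occur.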

This is a simple yet effective approach. With the team focused on quality throughout each sprint (and automated tests in place to catch issues when defects surface), you can release on a regular basis with extreme confidence. For more on automated tests, see Chapter 9, “Why Engineering Practices Are Important in Scrum.”

Keys to Success

Managing defects in any software project is a challenge. Teams have years of learned muscle memory that tells them to fix defects at the end, after development. As a result, as teams transition to agile, they naturally feel they should fix defects at the end of the sprint or have a bug-fixing sprint. Teams need to retrain their brains.

The key to a successful agile project is to deliver a potentially shippable product increment, where the code is tested and the technical debt is low, at the end of each sprint. Doing this requires a shift in mindset that is difficult for people, teams, and companies alike. Changing such ingrained habits requires discipline and effort.

Those of you working on a legacy system might believe that this approach could never work for you. Reconsider. You’re probably concerned about the sheer volume of issues you will find. I hear you. A friend of mine at a large software company spent two years fixing defects before he was able to write mainline code. This approach, however, allows you to move forward with new code while logging noncritical defects and fixing critical ones in real time. I realize that it could be days, weeks, or even months before you find a good rhythm and get out of defect-fixing mode. That’s okay. The important thing is that you start paying down all the technical debt that the legacy system has accrued while also demonstrating new functionality.

Whether you are doing this on a new project or with legacy code, this approach exposes technical debt and defects quickly. Before you begin, you need to explain to the customer, management, and stakeholders what you are doing and why it is important. Start by building a common understanding of what a defect means for the team and ideally the group or company. Next, educate your customers and stakeholders. Give them examples of what makes a good defect report and teach them how to write defect reports that give the team the information it needs to duplicate and fix defects. Once you have a system in place, communicate the priority of the defects in your backlog and make it visible and available.

By establishing a well-understood process and making defects visible, you and your company can break free of the waterfall attitude toward defects once and for all.

Additional Information

As the second edition was being written, I was looking at different bug management techniques. One of them by my friend Bill Hanlon of Microsoft is simply brilliant. Please read more at http://www.mitchlacey.com/blog/managing-bugs-in-scrum-and-agile-projects.

References

[BECK] Beck, Kent. 2005. Extreme Programming Explained, Second Edition. Upper Saddle River, NJ: Addison-Wesley.

[BOEHM] Boehm, Barry. 1982. Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall.

[ROTHMAN] Rothman, Johanna. StickyMinds Website. http://www.stickyminds.com/article/what-does-it-cost-fix-defect (accessed 30 June 2015).

Work Consulted

Ward, William T. 1991. The CBS Interactive Business Website. “Calculating the Real Cost of Software Defects.” http://findarticles.com/p/articles/mi_m0HPJ/is_n4_v42/ai_11400873/ (accessed 30 June 2011).
