Chapter 5. Speed

Deliver Fast

Speed is the absence of waste.

If you diligently work to eliminate waste, you will increase the percentage of time you spend adding value during each process cycle. And you will deliver faster—probably much faster.

This fundamental lean equation works in manufacturing. It works in logistics. It works in office operations. PatientKeeper provides a good example of how this works in software development.

PatientKeeper

Five years ago a killer application emerged in the health care industry: Give doctors access to patient information on a PDA. Today (2006) PatientKeeper appears to be winning the race to dominate this exploding market. It has overwhelmed its competition with the capability to bring new products and features to market just about every week. The company’s 60 or so technical people produce more software than many organizations several times larger, and their software is certified to manage life-critical data. They don’t show any sign that complexity is slowing them down even though they sell to a variety of very large health care organizations and support several platforms that integrate with multiple backend systems.

A key strategy that has kept PatientKeeper at the front of the pack is an emphasis on unprecedented speed in delivering features. For the past three years, PatientKeeper has delivered about 45 software releases a year to large health care organizations using simultaneous overlapping iterations (or Sprints) (see Figure 5.1). Every iteration ends in a live release at a customer site. Every one is delivered on time.

Figure 5.1 Simultaneous overlapping iterations running through one set of development teams¹

PatientKeeper CTO Jeff Sutherland explains how this works:²

• All Sprints result in a production release of software.

• QA starts testing as soon as development updates the first code. They can independently kick off the build process and direct it to any QA server.

• By mid-Sprint, the install team deploys Release Candidate 1 into the customer’s test environment. Now the customer is testing alongside internal QA.

• As soon as customer starts feeding back issues, they are addressed by development along with any other issues QA finds.

• Functionality which the customers view as essential to go live continues to surprise us right until the end of the Sprint. We embrace those surprises as it makes the product better faster for all customers.

• It typically takes two or three Release Candidates to go live. All development tasks are complete, all QA issues addressed, and all customer issues completed. QA has run a regression test on the entire system.

• Everyone goes live together at the end of the Sprint. Could be a multi-hospital system with hundreds of physician PDA users and thousands of Web users. We typically take 3–5 customers live at the end of a Sprint.

• Done means all customers are live with no outstanding critical issues. Note that in this scenario, the customer and installation teams are as tightly in the loop as QA and development.

At PatientKeeper, product managers are responsible for deciding exactly what customers want and creating fine-grain definitions of features that are ready for coding. That means, for example, that the product manager has verified a user interface capability through prototypes and possibly focus groups, and made final, detailed decisions on how the interface will work. No attempt is made to distinguish feature requests from defects; they all go into an automated backlog. The product manager is responsible for assigning backlog items to releases.

Development teams are assigned to releases, which consist of assigned backlog items. A developer takes an item from the backlog, breaks it into tasks, and estimates each task. The estimates are entered into the system and rolled up automatically to the backlog item. At the end of each day, it takes no more than a minute for each developer to enter into the tracking system the time spent on each task and its estimated percent complete. With this data in the tracking system, anyone in the company can obtain solid information about how much more time is necessary to complete every release under development.
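The roll-up described above can be sketched in a few lines of Python. This is a minimal illustration, not PatientKeeper’s actual system: the class names, the fields, the sample backlog item, and the rule for projecting remaining hours from time spent and percent complete are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    estimate_hours: float          # developer's original estimate
    spent_hours: float = 0.0       # entered at the end of each day
    percent_complete: float = 0.0  # 0..100, the developer's own judgment

    def remaining_hours(self) -> float:
        """Project the remaining effort from the daily entries (assumed rule)."""
        if self.percent_complete <= 0:
            return self.estimate_hours  # nothing reported yet
        if self.percent_complete >= 100:
            return 0.0
        # Effort so far implies a projected total; the rest is what remains.
        projected_total = self.spent_hours / (self.percent_complete / 100)
        return projected_total - self.spent_hours


@dataclass
class BacklogItem:
    title: str
    tasks: list[Task] = field(default_factory=list)

    def remaining_hours(self) -> float:
        # Task figures roll up automatically to the backlog item.
        return sum(t.remaining_hours() for t in self.tasks)


def release_remaining(items: list[BacklogItem]) -> float:
    """Total remaining hours for every backlog item assigned to a release."""
    return sum(item.remaining_hours() for item in items)


# Hypothetical backlog item with two tasks.
item = BacklogItem("Medication list view", [
    Task("UI layout", estimate_hours=8, spent_hours=4, percent_complete=50),
    Task("Sync with backend", estimate_hours=6),
])
print(item.remaining_hours())  # 10.0 (4 left on the first task, 6 untouched)
```

The point of the sketch is that a minute of data entry per developer per day is enough to answer "how much work is left in this release?" at any moment.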

The rule is that all releases must be on time, so if the system shows that there is too much work, backlog items are removed from the release to match required work to the capacity of the development teams. Because the tracking system gives accurate, up-to-date information about the time to complete any collection of backlog items, tradeoffs can be accurately identified, and valid decisions can be made. Priorities are resolved at weekly meetings which are attended by all the relevant decision makers, including the CEO. Program managers implement the decisions by changing the assignment of backlog items to a release, and development teams self-organize to resolve any problems.
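The arithmetic behind "remove items until the work fits" might look like the sketch below. At PatientKeeper these tradeoffs are made by people in the weekly meeting; the function name, the tuple layout, and the simple priority-order rule here are assumptions chosen only to illustrate matching work to capacity.

```python
def fit_release_to_capacity(items, capacity_hours):
    """Split backlog items into those that fit the release and those removed.

    items: list of (title, priority, remaining_hours) tuples, where a lower
    priority number means more important.
    Returns (kept_titles, removed_titles).
    """
    kept, removed = [], []
    used = 0.0
    # Walk the items from most to least important.
    for title, _priority, hours in sorted(items, key=lambda i: i[1]):
        if used + hours <= capacity_hours:
            kept.append(title)
            used += hours
        else:
            removed.append(title)  # goes back to the backlog
    return kept, removed


kept, removed = fit_release_to_capacity(
    [("A", 1, 40), ("B", 2, 30), ("C", 3, 50), ("D", 4, 20)],
    capacity_hours=100,
)
print(kept, removed)  # ['A', 'B', 'D'] ['C']
```

Note that this greedy rule lets a small low-priority item (D) stay in when a larger one (C) does not fit; a real decision meeting may well choose differently. What does not change is the constraint: the release date is fixed, so the scope must move.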

PatientKeeper is fast: It can deliver any application it chooses to develop in 90 days or less. It will not surprise anyone who understands lean that the company has to maintain superb quality in order to support such rapid delivery. Jeff Sutherland explains that rapid cycle time:³

• Increases learning tremendously.

• Eliminates buggy software because you die if you don’t fix this.

• Fixes the install process because you die if you have to install 45 releases this year and install is not easy.

• Improves the upgrade process because there is a constant flow of upgrades that are mandatory. Makes upgrades easy.

• Forces implementation of sustainable pace..... You die a death of attrition without it.

Although this mode of operation seems natural at PatientKeeper, it amazes outsiders. One of the keys is that everyone in the company works together in a spirit of trust, respect, commitment, and continuous improvement. Teams include product managers, developers, QA, and product support. The lead architect is the most experienced and trusted engineer in the company.

Time: The Universal Currency

Everything that goes wrong in a process shows up as a time delay. Defects add delay. Complexity slows things down. Low productivity shows up as taking more time. Change intolerance makes things go very slowly. Building the wrong thing adds a huge delay. Too many things in process create queues and slow down the flow.

Time, specifically cycle time, is the universal lean measurement that alerts us when anything is going wrong. A capable development process is one that transforms identified customer needs into delivered customer value at a reliable, repeatable cadence, which we call cycle time. It is this cycle time that paces the organization, causes value to flow, forces quality to be built into the product, and clarifies the capacity of the organization.

A lean organization makes sure that processes are both available when work arrives at the process, and capable of doing the job expected of the process.⁴ You can find out if your processes are available and capable by looking for the red flag called “expediting.” Expediting happens when work arrives at a process and gets stuck in a queue, but someone thinks the work is so important that they personally “push it through.” If requests are regularly pushed through the system by an “expediter,” something is wrong. Either the process is not available when work comes in, or it is not capable of doing the work.

What does this mean for software development? Consider a software maintenance department that has guaranteed response times of two hours for an emergency, one day for a normal problem, and two weeks for lower-priority changes. When an emergency occurs, the department can promise a two-hour maximum response time and probably deliver a lot faster. When other requests arrive, it can also promise a reliable response time compatible with the type of request. Since the department has a reliable and repeatable cadence of work, it has established a predictable level of work that can be done. When this threshold is reached, routine requests are turned down or backup capacity is engaged.

Lean organizations evaluate their operational performance by measuring the end-to-end cycle time of core business processes. The value stream maps in Chapter 4 mapped an end-to-end process that started and ended with a customer. They laid out the steps and totaled up the time from concept to launch (for products), or from customer request to deployed software. Excellent performance comes from completing this cycle with as little wasted time and effort as possible.

The best way to measure the quality of a software development process is to measure the average end-to-end cycle time of the development process. Specifically, what is the average time it takes to repeatedly and reliably go from concept to cash or from customer order to deployed software? The idea is not to measure one instance of this cycle time; measure the average time it takes your organization to go from customer need to need filled.
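As a rough sketch of this measurement, the snippet below averages the elapsed days between customer request and deployment over several completed items. The function names and the sample dates are invented for illustration; the only requirement is that you record when a need was identified and when it was filled.

```python
from datetime import date
from statistics import mean


def cycle_time_days(requested: date, deployed: date) -> int:
    """Elapsed calendar days from customer need to need filled."""
    return (deployed - requested).days


def average_cycle_time(history) -> float:
    """history: iterable of (requested, deployed) date pairs."""
    return mean(cycle_time_days(r, d) for r, d in history)


# Invented sample data: three completed requests.
history = [
    (date(2006, 1, 2), date(2006, 1, 20)),   # 18 days
    (date(2006, 1, 9), date(2006, 2, 10)),   # 32 days
    (date(2006, 2, 1), date(2006, 2, 11)),   # 10 days
]
print(average_cycle_time(history))  # 20
```

A single fast item proves little; it is the average over many items, and how steady it stays, that describes the process.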

But size varies all over the map. There is no such thing as “average” for us.

If you can treat your work like a product, the best approach is to establish a regular release cycle—every two weeks, every six weeks, every six months—whatever is practical for you. Make it as short a period as you can possibly manage. Then determine how much work you can do in a release, and don’t accept any more than that. Never delay a release because something isn’t ready—take out some features and be careful not to take on so much work next release. Pretty soon you will know how much you can accept in a release.

It helps to divide work within cycles into fine-grain items the same way PatientKeeper does. In Chapter 8 we discuss how to divide work into “stories,” which are one to three days’ worth of development work. Over time a development team will complete these stories at a reliable, repeatable velocity, which is the demonstrated capacity of the team.

If regular releases don’t make sense in your world, try very hard to put an upper limit on project size: six months is a good target, and twelve months should be the maximum. It has been reported that for many years, Wal-Mart and Dell have limited IT project size to approximately nine months; surely you can too.

Having set an upper limit on project size, group your projects into two or three groups: emergency, small, and large or some similar categories. Then try to reach a “service level” for each category.

Our projects are far too big and too unique to think about cycles. They go on for years.

No matter how big a development effort is, it gets done in small steps. We used to look at these steps sequentially and create a lot of partially done work at each step. Instead, figure out how to divide up that big effort differently. Divide it into increments of demonstrable value or minimum useful feature sets. In Chapter 8 we will show how a version of the Polaris submarine was launched just three years into a nine-year development program. Surely there is a version of your system that can be demonstrated in one third of the overall development time.

Next, try to establish smaller cycles—six weeks to three months—where everything available up to that point is integrated, any existing capability is demonstrated and evaluated, and appropriate decisions are made. Then you have a cycle of no more than three months, and your goal is to deliver a repeatable amount of working software, reliably every cycle.

Queuing Theory

Queuing theory is the study of waiting lines or queues. We certainly have queues in software development—we have lists of requests from customers and lists of defects we intend to fix. Queuing theory has a lot to offer in helping manage those lists.

Little’s Law

Little’s Law states that in a stable system, the average amount of time it takes something to get through a process is equal to the number of things in the process divided by their average completion rate (see Figure 5.2).

Figure 5.2 Little’s Law
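Written out as an equation, the relationship stated in the text is as follows (the term names are our paraphrase of that sentence, not a reproduction of the figure):

```latex
\[
\text{Average Cycle Time} \;=\; \frac{\text{Number of Things in Process}}{\text{Average Completion Rate}}
\]
```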

In the last section we said that the objective of a lean development organization is to reduce the cycle time. This equation gives us a clear idea of how to do that. One way to decrease cycle time is to get things done faster, that is, to increase the average completion rate. This usually means spending more money. If we don’t have extra money to spend, the other way to reduce cycle time is to reduce the number of things in process. This takes a lot of intellectual fortitude, but it usually doesn’t require much money.⁵
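A small worked example makes the two levers concrete. The numbers below are invented; the function simply applies Little’s Law as stated above and assumes a stable system.

```python
def cycle_time(things_in_process: float, completion_rate: float) -> float:
    """Little's Law: average time in process = items in process / completion rate."""
    if completion_rate <= 0:
        raise ValueError("completion rate must be positive")
    return things_in_process / completion_rate


# Invented numbers: 60 requests in process, 10 completed per week.
baseline = cycle_time(60, 10)   # 6.0 weeks
faster = cycle_time(60, 15)     # 4.0 weeks: raise the completion rate (costs money)
fewer = cycle_time(40, 10)      # 4.0 weeks: cut work in process (costs fortitude)
print(baseline, faster, fewer)  # 6.0 4.0 4.0
```

Both changes buy the same reduction in cycle time, but only the first one requires adding capacity.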

Variation and Utilization

Little’s Law applies to stable systems, but there are a couple of things that make systems unstable. First there is variation—stuff happens. Variation is often dealt with by reducing the size of batches moving through the system. For example, many stores have check-out lanes for “10 items or less” to reduce the variation in checkout time for that line. Let’s say you have some code to integrate into a system. If it’s six weeks’ worth of work, you can be sure there will be a lot of problems. But if it’s only 60 minutes of work, the amount of stuff that can go wrong is limited. If you have large projects, schedule variation will be enormous. Small projects will exhibit considerably less schedule variation.

High utilization is another thing that makes systems unstable. This is obvious to anyone who has ever been caught in a traffic jam. Once the utilization of the road goes above about 80 percent, the speed of the traffic starts to slow down. Add a few more cars and pretty soon you are moving at a crawl. When operations managers see their servers running at 80 percent capacity at peak times, they know that response time is beginning to suffer, and they quickly get more servers.

Since Google was organized by a bunch of scientists studying data mining, it’s not surprising that their server structure reflects a keen understanding of queuing theory. First of all, they store data in small batches. Instead of big servers with massive amounts of data on each one, Google has thousands upon thousands of small, inexpensive servers scattered around the world, connected through a very sophisticated network. The servers aren’t expected to be 100 percent reliable; instead, failures are expected and detected immediately. It’s not a big deal when servers fail, because data has been split into tiny pieces and stored in lots of places. So when servers fail and are automatically removed from the network, the data they held is found somewhere else and replicated once again on a working server. Users never know anything happened; they still get almost instantaneous responses.

If you have ever wondered why Google chose to dedicate 20 percent of its scientists’ and engineers’ time for work on their own projects, take a look at the graph in Figure 5.3. This figure shows that cycle time starts to increase at just above 80 percent utilization, and this effect is amplified by large batches (high variation). Imagine a group of scientists who study queuing theory for a living. Suppose they find themselves running a company that must place the highest priority on bringing new products to market. For them, creating 20 percent slack in the development organization would be the most logical decision in the world. It’s curious that observers applaud Google for redundant servers but do not understand the concept of slack in a development organization.

Figure 5.3 Queuing theory applies to development as well as traffic
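The shape of such a curve can be approximated with a standard single-server queuing formula. The sketch below uses Kingman’s approximation; that choice, the function name, and the parameter values are our assumptions for illustration and are not necessarily the model behind the figure. Variation is expressed as coefficients of variation (standard deviation divided by mean) for arrivals and for job size.

```python
def cycle_time_at_utilization(utilization, service_time=1.0,
                              cv_arrival=1.0, cv_service=1.0):
    """Approximate average cycle time (waiting plus service) for one server.

    Kingman's approximation for the wait in queue:
        wait ~= rho / (1 - rho) * (ca^2 + cs^2) / 2 * service_time
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    variability = (cv_arrival ** 2 + cv_service ** 2) / 2
    wait = utilization / (1 - utilization) * variability * service_time
    return wait + service_time


for rho in (0.5, 0.8, 0.9, 0.95):
    low = cycle_time_at_utilization(rho)                   # moderate variation
    high = cycle_time_at_utilization(rho, cv_service=2.0)  # large, uneven batches
    print(f"utilization {rho:.0%}: {low:5.1f} vs {high:5.1f} service times")
```

Under these assumptions, cycle time is about 2 service times at 50 percent utilization, about 5 at 80 percent, about 10 at 90 percent, and about 20 at 95 percent, and higher variation makes every one of those figures worse.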

Most operations managers would get fired for trying to get maximum utilization out of each server, because it’s common knowledge that high utilization slows servers to a crawl. Why is it that when development managers see a report saying that 90 percent of their available hours were used last month, their reaction is, “Oh look! We have time for another project!” Clearly these managers are not applying queuing theory to the looming traffic jam in their department.

You can’t escape the laws of mathematics, not even in a development organization. If you focus on driving utilization up, things will slow down. If you think that large batches of work are the path to high utilization, you will slow things down even further, and reduce utilization in the process. If, however, you assign work in small batches and concentrate on flow, you can actually achieve very good utilization—but utilization should never be your primary objective.

Reducing Cycle Time

Let’s agree, at least for the moment, that our objective is to reduce the average cycle time from concept to cash or from customer need to deployed software. How do we go about accomplishing this goal? Queuing theory gives us several textbook ways to reduce cycle time:

1. Even out the arrival of work

2. Minimize the number of things in process

3. Minimize the size of things in process

4. Establish a regular cadence

5. Limit work to capacity

6. Use pull scheduling

Even Out the Arrival of Work

At the heart of every lean process is an even level of work. In a factory, for example, a monthly plan to build 10,000 widgets translates into building one widget every minute. The factory work is then paced to produce at a steady rate of one widget per minute.
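The pace in this example follows from dividing available working time by demand. The sketch below assumes a single eight-hour shift and 21 working days in the month; neither figure is given in the text, but together they reproduce the one-widget-per-minute pace.

```python
def pace_minutes_per_unit(units_per_month, working_days=21, shift_hours=8.0):
    """Minutes available per unit when production is spread evenly over the month.

    working_days and shift_hours are assumptions, not figures from the text.
    """
    available_minutes = working_days * shift_hours * 60
    return available_minutes / units_per_month


print(round(pace_minutes_per_unit(10_000), 3))  # 1.008, about one widget per minute
```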

The budgeting and approval processes are probably the worst offenders when it comes to disrupting a steady flow of development work. Requests are queued for months at a time, and large projects may wait for the annual budgeting cycle for approval. Some think that by considering all proposals at the same time, an organization can make a better choice about how to spend its budget. However, this practice creates long queues of work to be done, and if all of the work is released at the same time, it wreaks havoc on the development organization. Moreover, it means that decisions are made well out of sync with need, and by the time the projects are started, the real need in the business will probably have experienced considerable change. Tying project approvals to the budgeting cycle is generally unnecessary and usually unrealistic in all but the slowest moving businesses.

All of our work is in the first half of the year.

I was having dinner with a fairly high-level manager of a very large IT organization. They had a practice of moving developers around to different projects, and I suggested that they might want to consider assigning teams to specific business areas so they could become familiar with their customers.

“We couldn’t do that,” he said. “The business leaders all want their work to be done in the first half of the year, from January to June, so we try to do the background stuff they aren’t interested in the second half of the year.”

“Why do they want all of the work done in the first half of the year?” I asked. “Is your business seasonal?”

“No,” he replied. And then after thinking for a moment, he added, “They get their budgets at the end of December, and so I suppose they want the work done as soon as possible after that.”

“I know you want to be responsive to the business leaders,” I said, “But you can’t let them get away with that. You have to make it clear what that kind of schedule does to your ability to support them. What if you took every business leader’s budget and allocated a team of people to their business based on their annual budget, then let the business leader decide how the team would spend their time throughout the year?”

“Well, that would make too much sense,” he grinned. “It would mean a lot of change....” And he promised to consider it further.

—Mary Poppendieck

Queues at the beginning of the development process may seem like a good place to hold work so that it can be released to the development organization at an even pace. But those queues should be no bigger than necessary to even out the arrival of work. Often we find that work arrives at a steady pace, and if that is the case, then long queues are really unnecessary.

Doctor’s Appointments

Here in the United States, doctors stop accepting new patients when their backlog of appointments starts to grow too long. In many clinics, the waiting list for an appointment is about two months; it is not allowed to get longer, but then again, it never seems to get any shorter either. This would be regarded as a stable system in queuing theory.

One clinic in Minnesota studied lean ideas and decided to see what would happen if they shortened the waiting time. For a while, most doctors worked an extra half day every week while the schedulers made sure that the arrival of work remained stable. Over the period of a half a year, the clinic reduced waiting times for an appointment to about two days. The clinic found that doctors still saw the same number of patients with the same mix of problems. Some doctors were surprised to find that they did not need a cushion of sixty days of appointments to keep them busy; in fact, they saw very little difference in their workload.

From a patient point of view, there was a dramatic difference—suddenly they could call up and get an appointment within a day or two. This was truly a “lean solution” for patients.

—Mary Poppendieck

Minimize the Number of Things in Process

In manufacturing, people have learned that a lot of in-process-inventory just gums up the works and slows things down. Somehow we don’t seem to have learned this same lesson in development. We have long release cycles and let stuff accumulate before releasing it to production. We have approval processes that dump work into an organization far beyond its capacity to respond. We have sequential processes that build up an amazing amount of unsynchronized work. We have long defect lists. Sometimes we are even proud of how many defects we’ve found. This partially done work is just like the inventory in manufacturing—it slows down the flow, it hides quality problems, it grows obsolete, usually pretty rapidly.

One of the less obvious offenders is the long list of customer requests that we don’t have time for. Every software development organization we know of has more work to do than it can possibly accommodate, but the wise ones do not accept requests for features that they cannot hope to deliver. Why should we keep a request list short? From a customer’s perspective, once something has been submitted for action, the order has been placed and our response time is being measured. Queues of work waiting for approval absorb energy every time they are estimated, reprioritized, and discussed at meetings. To-do queues often serve as buffers that insulate developers from customers; they can be used to obscure reality and they often generate unrealistic expectations.

But we have such a long list of things to do, how can we pare it down?

We generally find that long queues of work to do are unrealistic and unnecessary. Here are some ideas of how to deal with these queues.

1. Start by asking, “How many things in this queue are we realistically never going to get around to?” Cut all of the things you’ll never get to out of the queue immediately. Be honest. Just hit the delete key.

2. So, how many items did that exercise get rid of? Half? Now take the remaining items and do a Pareto analysis on them. Rate each one on a scale of 1 to 5. The critical items will rate a 5. The unimportant items will rate a 1. Now get rid of all except those that got 4s and 5s. Just hit delete. Don’t worry, if they turn out to be important, they’ll come back at you.

3. Now take the items that are left and calculate how many days, months, or years of work they represent. Will you have other things added to the list that will be more important? With that in mind, do you have the capacity to do the remaining items on the list in the near future? If not, should you add additional capacity?

4. If your list is still unrealistically long, there is probably some purpose that it is serving beyond making effective decisions on what to do and what not to do. For example, a long list might deflect undue attention or absorb frivolous requests. Break the list into two lists, one which will serve the exterior purpose, and the other which you will keep short and work off of.
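The first three steps are simple enough to sketch in code. The dictionary keys, the sample requests, and the use of a plain "items per month" rate to size the remaining work are assumptions for illustration; the hard part, being honest about the ratings, cannot be automated.

```python
def triage(requests, monthly_capacity):
    """Pare down a request queue and size what is left.

    requests: list of dicts with 'title', 'will_ever_do' (bool), 'rating' (1-5).
    monthly_capacity: items the team typically completes per month.
    Returns (short_list, months_of_work).
    """
    realistic = [r for r in requests if r["will_ever_do"]]   # step 1: delete the rest
    short_list = [r for r in realistic if r["rating"] >= 4]  # step 2: keep 4s and 5s
    months_of_work = len(short_list) / monthly_capacity      # step 3: size the list
    return short_list, months_of_work


# Hypothetical queue of four requests.
queue = [
    {"title": "Export to PDF", "will_ever_do": True, "rating": 5},
    {"title": "Dark theme", "will_ever_do": True, "rating": 2},
    {"title": "Fax integration", "will_ever_do": False, "rating": 4},
    {"title": "Audit log", "will_ever_do": True, "rating": 4},
]
short_list, months = triage(queue, monthly_capacity=2)
print([r["title"] for r in short_list], months)  # ['Export to PDF', 'Audit log'] 1.0
```

If the resulting number of months is still far beyond "the near future," the choice is the one posed in step 3: cut further or add capacity.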

Seven Years?

“We prioritize this list every week,” the manager said.

“Do you know about how many requests there are in the list?” I asked.

“Yes, there are about 750,” he said.

“Do you know how many of those you can do in, say, a month, on the average?”

“Yes, we average about nine every month,” he answered. The company kept good statistics.

“Wow!” someone else said. “That’s seven years of work!”

“Seven years!” the manager was astonished. “I had never looked at it that way.”

“And why do you keep so many on the list if you know you will never get to them?” I asked.

“Well, our process expert said we don’t want to lose track of anything. And we’ve gotten so we don’t spend much time on the list every week.”

“So you never get back to the customers and say, ‘No, sorry.’ And they probably keep expecting you to develop their features,” I said. “Don’t you want to be a bit more honest with your customers?”

“Yes, we used to tell them ‘No’ all the time—I was very aggressive about it when we were small and I was more involved. Customers seemed to appreciate the honesty. Maybe we should start doing that again....”

—Mary Poppendieck
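The arithmetic in this conversation is Little’s Law applied to a request list, using the two figures the manager quoted:

```latex
\[
\frac{750\ \text{requests}}{9\ \text{requests per month}} \approx 83\ \text{months} \approx 6.9\ \text{years}
\]
```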

Minimize the Size of Things in Process

The amount of unfinished work in an organization is a function of either the length of its release cycle or the size of its work packages. Keeping the release cycle short and the maximum work package size small is a difficult discipline. The natural tendency is to stretch out product releases or project durations, because the steps involved in releasing work to production seem to involve so much work. However, stretching out the time between releases is moving in exactly the wrong direction from a lean perspective. If a release seems to take a long time, don’t stretch out releases. Find out what is causing all the time and address it. If something is difficult, do it more often, and you’ll get a lot better at it.

Releases Take Too Long

After a talk at a company, the QA manager came to me and said, “I don’t see how we can follow your advice. We have a lot of pressure to put as many features as we can in each release, because they are so far apart.”

“Why not release more often?” I asked.

“We can’t possibly release more often because verification takes so long,” came the immediate reply.

“Why does verification take so long?” I wondered.

“We find lots of problems in verification that have to be fixed,” he said. I began to see a vicious circle.

“Can’t you find most of the problems before verification?” I asked. After all, he was the QA manager.

“Verification is supposed to be independent,” the QA manager replied. “If we verify the code while the developers are writing it, that would destroy our independence.”

I was really surprised. There are a lot of good reasons to delay final verification. In embedded systems the hardware usually isn’t ready until the end. When deploying to a customer site, you don’t have access to their environment until the very last moment. But this was a new one.

“I can understand independent verification,” I said, “but how is that related to long release cycles? Can’t verification be independent but test a smaller amount of code?”

“Well, I’ll have to think about it, but we might get too close to the developers that way.” He sounded reluctant. I guessed that their long release cycles were not going to get shorter any time soon.

—Mary Poppendieck

Oh, NOW I Get It!

We were teaching a class that had done current value stream maps and then future value stream maps. The last of the groups was presenting their future map. Suddenly someone from a different group exclaimed, “Oh, NOW I get it!” The presenter paused to see what this was all about.

“When I developed that future value stream map that I just finished presenting,” he said, “I cut the release cycle to a third of its former length, and I was really frustrated that I still had a rather low process cycle efficiency. What I just realized is that I’ve been trying to optimize utilization with releases. The whole concept of a release is what’s driving down my efficiency. If I could release as soon as a patch is ready instead of waiting for a release, the process cycle efficiency would be much better!”

The speaker was obviously proud of this blinding insight, but as he looked around he noticed that most people weren’t all that impressed. Then he said kind of sheepishly, “I guess this is what you’ve been trying to say all morning. It just took the idea this long to sink into my head.”

—Tom & Mary Poppendieck

Establish a Regular Cadence

Iterations are the cadence of a development organization. Every couple of weeks something gets done. After a short time people begin to count on it. They can make plans based on a track record of delivery. The amount of work that can be accomplished in an iteration quickly becomes apparent; after a short time people stop arguing about it. They can commit to customers with confidence. There is a steady heartbeat that moves everything through the system at a regular pace. A regular cadence produces the same effect as line leveling in manufacturing.

What should the cadence be? One friend favors one-week iterations. He finds it’s just long enough to get customers with emergencies to be sure the problem is real before his team dives in and just short enough to deliver very timely work. Another friend swears by 30 days, because it gives the team time to think things through before they start coding, yet is short enough that managers can wait until the next iteration to ask for changes.

The cadence is right when work flows evenly. If there is a big flurry of activity at the end of an iteration then the iteration length is probably too long; shorter iterations will help to even out the workload. Cadence should be short enough that customers can wait until the end of an iteration to ask for changes, yet long enough to allow the system to stabilize. This is best understood by considering a household thermostat. If the thermostat turns on the furnace the instant the temperature falls below the temperature setting, and turns it off the instant the temperature rises above the setting, the furnace will cycle on and off too frequently for its own good. So thermostats have a lag built into them. They wait for the temperature to drop a degree or two below the setting before turning on the furnace, and they wait until the temperature goes a degree or two above the setting before turning the furnace off. This lag in response is small enough so you don’t feel much difference, and big enough to keep the furnace from oscillating. Use the same concept when finding the right cadence for your situation.
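The thermostat behavior described above can be simulated in a few lines. The temperature readings, the setpoint, and the size of the lag are invented for the example; the comparison simply counts how often the furnace switches with and without a lag.

```python
def furnace_states(temperatures, setpoint, lag=1.5):
    """Return the furnace on/off state after each temperature reading.

    The furnace turns on only once the temperature is `lag` degrees below the
    setpoint and turns off only once it is `lag` degrees above it.
    """
    on = False
    states = []
    for t in temperatures:
        if t <= setpoint - lag:
            on = True
        elif t >= setpoint + lag:
            on = False
        # Inside the band the furnace keeps its previous state.
        states.append(on)
    return states


def switch_count(states):
    """Number of times the furnace changes between on and off."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


# Invented readings hovering around a setpoint of 20 degrees.
readings = [20.2, 19.8, 20.1, 19.9, 18.4, 19.0, 20.5, 21.6, 20.9]
print(switch_count(furnace_states(readings, 20, lag=0)))    # 4 switches, no lag
print(switch_count(furnace_states(readings, 20, lag=1.5)))  # 2 switches with lag
```

An iteration plays the role of the lag: small changes of mind inside the iteration do not redirect the team, while a real shift in need is picked up at the next boundary.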

Asynchronous Cadence

One embedded software department we know of found their release schedule was getting too complicated as hardware models proliferated. So they decided to create a single version of the software that would run on all hardware models. Once they established the platform, they added technology capabilities to the software at three-week intervals. As new hardware models were developed, their engineers could look at the plan for software “technology drops” and decide which drop to pick up. A new model might wait a couple weeks for a technology drop with a feature it really needed, or the software department might be convinced to change its technology drop schedule, but only if other hardware models agreed.

By uncoupling the software from the hardware, the software department was able to establish its own cadence, and it didn’t take long to discover that the new system was much more productive.

—Mary Poppendieck

Limit Work to Capacity

Far too often we hear that the marketing department or the business unit “has to have it all by such-and-such a date,” without regard for the development organization’s capacity to deliver. Not only does this show a lack of respect for the people developing the product, it also slows down development considerably. We know what happens to computer systems when we exceed their capacity: it’s called thrashing.

A Long Saturday in the Airport

We got to the Melbourne airport at 7:30 on a fine Saturday morning, plenty of time to catch our 10:00 a.m. flight to Auckland. A check-in desk had just opened up, so we put our luggage on the scale and tried to hand our tickets to the check-in agent. “Not so fast,” she said. “The computers are down.” As we looked around, we finally noticed the long lines everywhere.

“How long have they been down?” we asked.

“Oh, about an hour,” she said.

In most US airports, each airline desk is probably accessing a different computer system, but in airports like Melbourne, there is one computer system that everyone shares. So that meant the whole airport—both domestic and international terminals—had been down for an hour.

Shortly after that the phone rang. Someone was calling to spread the word that the computers were coming back up. Throughout the airport we could see dozens of people poised at their terminals. And then, everyone began typing furiously, all at the same time. It took about 15 seconds for the system to crash again. “It’s been like that for the last half hour,” the agent told us. “Every ten minutes they say the system is coming up, and then it crashes again.”

We could guess why: The system probably was not designed for hundreds of people to type at exactly the same time.

The computer system finally came up three hours later. We were rather late getting into Auckland.

—Mary & Tom Poppendieck

Time sometimes seems to be elastic in a development organization. People can and do work overtime, and when this happens in short bursts they can even accomplish more this way. However, sustained overtime is not sustainable. People get tired and careless at the end of a long day, and more often than not, working long hours will slow things down rather than speed things up. Sometimes an organization tries to work so far beyond its capacity that it begins to thrash. This can happen even if there appear to be enough people, if key roles are not filled and a critical area of development is stretched beyond its capacity to respond.

A Customer Service Problem

I was visiting a company that asked me to look at its customer service process. As we drew a value stream map on the white board, we got to the point where a customer service team was on site installing software. At that point, I was told, they had a problem. “How often does this happen?” I asked.

“Every single time,” they all agreed.

“Okay, so how does the problem get fixed?” I asked.

“It doesn’t.” General agreement again.

“It doesn’t?” I had to be sure I heard that right.

“Yes, a request is made to development, but it goes into a queue. It never comes out unless it’s one of our Top 3 priority customers. They’re just too busy.”

With what? I wondered, but instead I asked, “So what happens to the customer?”

“Well, the customer service people stay on site. They do other things. It can be weeks before they get any help.”

“But you said they only get help if they are one of the Top 3,” I said.

“Well, the customer eventually complains enough that they get moved up into the Top 3, and another one gets bumped out.”

“It would appear to me,” I said, “that you don’t have a customer service problem, you have a problem delivering code that can be counted on to work at a customer site.” In the ensuing discussion I could tell that this was a sore point.

“Why can’t the development department focus on figuring out what is causing this and fix it?” I asked.

“Our investors are very demanding,” they said. “We have a product map. We have to be developing new systems for new customers. We have to keep on adding more customers.”

“But if you can’t bring new customers up, why do you want them?” I asked. “It seems to me that you’re thrashing.”

“Yes, that’s a good word,” some agreed. But others didn’t seem to think that the problem was such a big deal. Which, in the end, was probably the source of the problem in the first place.

—Mary Poppendieck

Use Pull Scheduling

When a development team selects the work it will commit to for an iteration, the rule is that team members select only those (fine-grain) items they are confident that they can complete. During the first couple of iterations, they might guess wrong and select too much work. But soon they establish a team velocity, giving them the information they need to select only what is reasonable. In effect, the development team is “pulling” work from a queue. This pull mechanism limits the work expected of the team to its capacity. In the unlikely event that the team finishes ahead of time, more work can always be pulled out of the queue. Despite the fact that everyone always has work, the pull system has slack, because if emergencies arise or things go wrong, the team can adapt either by terminating the current iteration or by officially moving some items to the next iteration. Finally, since the team is working on the most important features from the customers’ perspective, they are working on the right things.
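
A minimal sketch of this pull rule follows, assuming that each backlog item carries a size estimate and that team velocity is expressed in the same units. The names `pull_iteration`, `backlog`, and `velocity`, and the item names and sizes, are hypothetical.

```python
from collections import deque


def pull_iteration(backlog, velocity):
    """Pull items from the front of a prioritized backlog until the next
    item would exceed the team's velocity (its demonstrated capacity)."""
    selected = []
    remaining = velocity
    while backlog and backlog[0][1] <= remaining:
        name, size = backlog.popleft()
        selected.append(name)
        remaining -= size
    return selected


# Backlog ordered by customer priority: (item name, estimated size).
backlog = deque([("A", 5), ("B", 3), ("C", 8), ("D", 2)])

selected = pull_iteration(backlog, velocity=10)
print(selected)       # items the team commits to this iteration
print(list(backlog))  # items left in the queue for later iterations
```

The sketch stops at the first item that does not fit instead of skipping ahead to a smaller one. That keeps the team on the most important work in priority order; a team could reasonably choose a different rule. If the team finishes early, calling `pull_iteration` again with the spare capacity pulls more work from the same queue.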

An Example of Pull Scheduling

I was visiting a medium-sized department in a large financial institution. Swen (not his real name) was responsible for business results and needed software changes to deliver them. He was very frustrated because all of his requests seemed to take forever. His frustration was matched by Karl (also not his real name), the senior manager in the IT department, who felt that Swen didn’t understand how difficult his requests were or the problems he created by constantly changing his mind.

Karl insisted that each request needed to have a rough cost/benefit analysis and, if it passed, a more detailed architectural review before being scheduled for implementation. This sounded fine to Swen. All he wanted was to have some control over what was done and an understanding of when things were going to be ready to use.

I sketched the queuing idea in Figure 5.4 as a management approach:

Figure 5.4 Pull system for managing a workflow

With this system, Swen agreed to be limited to a maximum of six requests at a time. Karl committed to having a rough cost/benefit analysis (that would take about four hours) done within a week of each request. At that point Swen could either put it into the architectural review queue or reject it, but he agreed to have no more than three architectural requests in the queue at a time. If the queue was already full, a new request would either replace one of the existing requests in the queue or be rejected. Swen agreed that when the architectural queue was full, he would not submit any more requests that were less important than the three in the queue.

Karl agreed to complete an architectural review and more detailed cost estimate within two weeks. Then Swen could either accept or reject the result. Accepted requests went into the backlog, and every two weeks the team would pull an iteration’s worth of work from the backlog. Swen agreed to limit the number of items in the backlog to no more than two iterations’ worth of work.

The important point is that Swen would “own” the queues. By keeping them short, he could tell at a glance approximately when any feature would be complete. Swen could reorganize the queues, add, or take away items at any time until the items were pulled by Karl’s teams. Karl’s organization would always be busy but never swamped, and they would always be working on exactly what Swen wanted.

Karl had a few other customers, but Swen accounted for 65 percent of his workload. Karl felt that he could integrate his other customers into this queuing system or else have separate teams work on their requests.

—Mary Poppendieck

Cascading queues (as shown in Figure 5.4) are possible, and are often used at organizational boundaries. Queues are a useful management tool, because they allow managers to change priorities and manage cycle time while letting the development teams manage their own work. But queues are not an ideal solution. When they are used, here are some general rules to follow:

1. Queues must be kept short—perhaps two cycles of work. It is the length of the queues that governs the average cycle time of a request through the development process.

2. Managers can reorganize or change items at any time while they are in a queue. But once teams start to work on an item, managers should not interfere with day-to-day development.

3. Teams pull work from a queue and work at a regular cadence until that work is done. It is this pull system that keeps teams busy at all times while limiting work to capacity.

4. Queues should not be used to mislead people into thinking that their requests are going to be dealt with if the team does not have the capacity to respond.
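
The rules above, together with the replace-or-reject agreement in the pull scheduling example, can be sketched as a short queue with a fixed limit. This is an illustration under stated assumptions, not a prescribed implementation: the class `BoundedQueue`, its methods `offer` and `pull`, and the numeric priorities (a lower number means more important) are all hypothetical.

```python
class BoundedQueue:
    """A short, priority-ordered queue that never grows past its limit."""

    def __init__(self, limit):
        self.limit = limit
        self.items = []  # (priority, name); a lower number is more important

    def offer(self, name, priority):
        """Add a request. When the queue is full, the new request replaces
        the least important item only if it is more important. Returns the
        name that was displaced or rejected, or None if nothing was."""
        if len(self.items) < self.limit:
            self.items.append((priority, name))
            self.items.sort()
            return None
        worst_priority, worst_name = self.items[-1]
        if priority < worst_priority:
            self.items[-1] = (priority, name)
            self.items.sort()
            return worst_name
        return name

    def pull(self):
        """The team pulls the most important item when it has capacity."""
        return self.items.pop(0)[1] if self.items else None


queue = BoundedQueue(limit=3)
results = [
    queue.offer("a", 2),
    queue.offer("b", 5),
    queue.offer("c", 3),
    queue.offer("d", 4),  # queue is full: "d" displaces the less important "b"
    queue.offer("e", 9),  # queue is full and "e" is least important: rejected
]
next_item = queue.pull()
print(results, next_item)
```

Because the limit is enforced when a request arrives, the requester learns immediately whether the request has a realistic chance of being done, which is the point of the fourth rule.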

Summary

The measure of a mature organization is the speed at which it can reliably and repeatedly execute its core processes. The core process in software development is the end-to-end process of translating a customer need into deployed product. Thus, we measure our maturity by the speed with which we can reliably and repeatedly translate customers’ needs into high quality, working software that is embedded in a product which solves the customers’ whole problem.

Try This

1. How many defects are in your defect queue? How fast do they arrive? At what rate do they get resolved? At that rate, how many days, weeks, months, or years of work do you have in your defect queue? How many of the defects in your queue do you have a reasonable expectation to resolve? How much time do you spend on managing and reviewing the queue? Is it worth it?

2. How many requests are on your list of things to do? At what rate do they arrive? How much time, on the average, has already been spent on each item? How much time do you spend on managing and reviewing the queue? How much work (in days, weeks, months, or years) do you have on your list of things to do? Do you keep things in the list that you will never get around to? Why? What percent of the queue does this represent?

3. Does high utilization of people’s available time cause logjams in your environment? Does your organization measure “resource” (i.e., people) utilization? If so, what kind of impact does this measurement have: Is it taken seriously? Does it drive behavior? Is that behavior beneficial?

4. What determines your batch size: Release schedule? Project size? Can you reduce time between releases or the size of projects? What is a reasonable target? What would it take to change to that target?

5. At a team meeting, review the list of ways to reduce cycle time:

• Even Out the Arrival of Work

• Minimize the Number of Things in Process

• Minimize the Size of Things in Process

• Establish a Regular Cadence

• Limit Work to Capacity

• Use Pull Scheduling

Which one of these shows the most promise for your environment? Experiment by implementing the most promising approach and measure what happens to cycle time.
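
For the first two questions, the arithmetic can be sketched as follows. The function name `queue_outlook` and all of the numbers are made up for illustration; substitute the counts and rates from your own queue.

```python
def queue_outlook(queued, arrivals_per_week, resolved_per_week):
    """Return (weeks of work now in the queue, net weekly growth).

    Weeks of work is the queue length divided by the completion rate.
    A positive net growth means the queue is getting longer every week.
    """
    weeks_of_work = queued / resolved_per_week
    net_growth = arrivals_per_week - resolved_per_week
    return weeks_of_work, net_growth


# Illustrative numbers: 240 open defects, 15 arriving and 12 resolved per week.
weeks, growth = queue_outlook(queued=240, arrivals_per_week=15, resolved_per_week=12)
print(weeks, growth)
```

In this made-up case the queue holds 20 weeks of work and grows by three items a week, so an item added today has little realistic chance of being resolved unless it displaces something else.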

Endnotes

1. From Jeff Sutherland, “Future of Scrum: Parallel Pipelining of Sprints in Complex Projects,” Research Report, Agile 2005. Used with permission.

2. Posted on [email protected] on September 25, 2005. The fourth point is from message 9404, August 5, 2005, message 8849. Used with permission.

3. Posted on [email protected] on November 21, 2004, message 5439. Used with permission.

4. See www.lean.org.

5. See Michael George and Stephen Wilson, Conquering Complexity in Your Business: How Wal-Mart, Toyota, and Other Top Companies Are Breaking Through the Ceiling on Profits and Growth, McGraw-Hill, 2004, p. 37.
