CHAPTER 5
How we stay trapped

There are several self-reinforcing beliefs that are not true but continue to be believed. Many of these fallacies may have been true at some point, but collectively they now do far more harm than good.

  1. “We need detailed requirements or we won’t get what we want”
  2. “It will cost more to reinvent the wheel”
  3. “Software development is analogous to construction or manufacturing”
  4. “Software projects can only be estimated by analogy”
  5. “Having one neck to choke is an advantage”
  6. “Each application has a positive ROI, therefore my IT portfolio must be returning far more than I am spending”
  7. “We’re not in the information systems business”

In the following sections, we will review where each of these ideas came from, what might have made them true at some point, why they are no longer true, and what we need to do to blunt their impact.

I have seen all of these. They often come up as clichés disguised as wisdom. One of the things that keeps clichés entrenched is the same dynamic that lies behind generalizations. A generalization may be based on ignorance or on a vast amount of knowledge. It may be based on a flimsy association or on rigorous scientific experiment. So, for instance, if I were to say that “Kombucha causes kidney stones,” you can’t tell whether this is the product of an association between the one time I drank Kombucha and the only kidney stone I’ve ever passed, or of a double-blind experiment with a testable hypothesis and a statistically significant population. All we can know for sure is that I will likely get a visit from the Kombucha lobby.

I bring this up because there is a meta-pattern behind these clichés that we need to be inoculated against. The people who perpetuate these clichés often have little or no experience building or integrating the kind of systems we’re talking about here, yet they are often in positions of influence or control over budgets and approaches. Often, right before you hear one of these fallacies declared, the speaker will wait for a pause in the conversation, and then, with a studied flair, remove his or her glasses, set them down on the conference table, and solemnly utter one of the seven fallacy clichés.

We need to keep our guard up. These fallacies are keeping us mired in a status quo that is slowly draining funds and energy. They are far more dangerous than they seem.

Fallacy # 1 “We need detailed requirements or we won’t get what we want”

Once upon a time, software was hard to change after it was built. Barry Boehm pointed this out in his classic Software Engineering Economics.30 Studies of software development at NASA and defense contractors demonstrated that changing a function in a software system at the beginning of a project costs very little. There is no rework, since no work has been done.

Introducing a change at design time incurs additional cost, particularly if a great deal of design must be redone to accommodate the change. Once coding starts, the cost of introducing a change ratchets up rapidly. Again, if the change is introduced early in the coding process, the impact is lower. Introducing a change late in the coding phase is very expensive, to the extent that it causes changes to other code.

Boehm’s book was written in a time when most software was developed in a strict waterfall methodology, and testing was generally not started until coding was complete. The cost to introduce a change to a function of the system once it has entered testing is often 10 times what the cost would have been had the change been introduced at the time of requirements gathering.

Finally, if the software is in use, in production, the cost of making a change explodes. The figure often cited from these studies is 40 times.31
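
To make these multipliers concrete, here is a minimal sketch in Python. The 10x and 40x values are the figures cited above; the phase names and the intermediate design and coding values are illustrative assumptions, not Boehm’s exact numbers.

    # Illustrative cost-of-change multipliers, keyed by the phase in which
    # a change is introduced. Testing (10x) and production (40x) are the
    # figures cited above; the design and coding values are assumptions.
    CHANGE_COST_MULTIPLIER = {
        "requirements": 1,   # no rework: nothing has been built yet
        "design": 3,         # assumed: some design rework
        "coding": 6,         # assumed: code plus design rework
        "testing": 10,       # the roughly 10x figure cited above
        "production": 40,    # the roughly 40x figure cited above
    }

    def cost_of_change(base_cost, phase):
        """Estimated cost of one change, given the phase it is introduced in."""
        return base_cost * CHANGE_COST_MULTIPLIER[phase]

    # A $1,000 requirements-time change becomes a $40,000 production fix.
    for phase in CHANGE_COST_MULTIPLIER:
        print(f"{phase:>12}: ${cost_of_change(1_000, phase):,}")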

Faced with this evidence, many prudent people collectively concluded that almost any amount of time spent on requirements would be worthwhile, as every function uncovered early would pay for itself 40 times over. Add to this the prevailing practice in most large firms and all government agencies of putting software out to bid: the scope of the project is the set of requirements it must satisfy.

There is a deep and long-running truth behind the desire to produce detailed requirements before launching a software development project. Nevertheless, there is also a downside.

In the pursuit of completeness, most projects dramatically over-specify their requirements. This has become more extreme in government agencies, largely because of their funding cycles. Excess of this kind is rampant in many state government-sponsored projects. We will detail an example from Child Support Enforcement.

Child Support Enforcement is a relatively small agency in most states, and as such has more modestly sized systems. Everything we will say about Child Support Enforcement is even truer for state Medicare and Medicaid systems, Department of Motor Vehicles systems, Transportation, State Patrol, Corrections, Employment Security, Pensions, and Education.

Every state has a Child Support Enforcement (CSE) system. Most were built in the 1970s or 1980s. Some are running on obsolete databases and programming languages. These systems have been inefficient for decades; the cost to make the simplest change is quite high. Most states soldiered on, though, as the capital costs to replace these systems were prohibitive. Eventually the cost and risk of continuing with the obsolete system motivate change, and a project is launched.

CSE systems are mostly funded by the federal Department of Health and Human Services (HHS). A state wishing to replace its aging system applies to HHS for approval, and when approval is granted, the state receives assurance that it will be reimbursed for 60-90% of the cost of the system.

HHS, with most of the money on the line, wishes to ensure it is getting good value, and has created standard procedures and architectures that the states must follow. One part of this process is a very detailed requirements analysis. These requirements studies typically take a year or more to complete, at a cost of $5 million or more. In addition to interviewing all the personnel and reviewing the existing system in depth, the contractors who prepare the requirements document have previous documents from other states to reference.

Each time a state initiates a project, the requirements get longer and more detailed. Although effort is made to rank these requirements, each iteration just keeps getting longer and includes more “must have” requirements.

Armed with these requirements documents, the agency goes out to bid. “This is a complex system. Look at all these ‘must have’ requirements,” say the bidders, and they bid accordingly.

We have been tracking these projects for about a decade (ever since we did a high-level design for the State of Colorado). Almost all the replacement projects done in the last decade have cost more than $100 million. Most go way over their original budget. Texas had a $200 million budget that ballooned to $300 million. The state threatened to sue the contractor for breach of contract, but in the end asked the federal government for an additional $100 million to finish the project.32 At $420 million, it is now “back on track.” California implemented its CSE system for $1.7 billion.

The sad thing is that these replacement systems are scarcely better than the systems they replaced, in terms of the cost to make subsequent changes and the cost to integrate with the state’s other systems. These systems are no more prepared to transition off their database and language than their predecessors were.

Government projects of all sorts are famous for this type of excess. Consider the Big Dig in Boston or the Chunnel, each of which ran over its original estimate fivefold.

But the difference between moving millions of cubic yards of dirt and building a system to track deadbeat dads is that in the first case there is a real physical lower limit to what it could cost. In the latter case, the lower limit approaches zero.

We know from our work with the Colorado CSE that a well-executed project could be completed by a dozen people in less than two years. Some of this is by judicious use of agile technology, which we will cover subsequently. Some of it is through data-centric design, which we will cover in the sequel to this book. But a great deal of it comes simply from replacing the detailed requirements, and the need to have them complete up front, with a process that prioritizes what it builds.

So the fallacy is literally true in a narrow sense: if you don’t have detailed requirements, you won’t get exactly what you want. But the converse is even truer: your detailed requirements will drive your project costs up 10-100 fold, increase the risk, and greatly prolong the time taken.

It often costs as much to document “all” your requirements as it does to build a working version of the system. The big difference is that if you build a system, you have something to work from, and your remaining requirements become clearer. If you write a multi-million dollar requirements document, you’re pretty much committed to building an overly expensive system.

Fallacy # 2 “It will cost more to reinvent the wheel”

There is a great deal of truth to this cliché, which again is what makes it so persistent. As we will see, though, this has led many companies astray.

Who wants to reinvent the wheel? Why invent something that has already been invented?

But that isn’t what this cliché is about in the business world. It is not about invention. It is really: “Why build what you could buy?” Here, at a first level of approximation, the answer should be “yeah, why would you build what you can buy?”

At the layman’s level, you wouldn’t build your own word processor or spreadsheet when they are readily available at modest prices. At the enterprise level, the decision gets murkier, and relying on metaphors from our personal buying habits may not serve us well.

Enterprise applications are expensive. Even the internal cost to procure them is high. We have had front-row seats and watched several organizations spend years, and one organization over a decade, trying to buy an ERP system. (Imagine what they could have accomplished in a decade if they hadn’t been pursuing this goal.) As the price and the long-term commitment to a packaged application increase, so does the proportional internal cost of making the right choice.

However, it’s not just the expense. As expensive as they are, few companies believe they could recreate the functionality for less by building it or by reusing existing well-tested components. Few companies believe it, but this is what we are already seeing and will soon see more of. Creating and maintaining application functionality, unmoored from proprietary legacy application packages, can be considerably more economical.

Before we elaborate on that claim, let’s spend a bit more time on the hidden costs of application software acquisition.

The first hidden cost of application software is the annual license or maintenance fee. Recently this has been ratcheting up to where it is often 20% of the purchase price. Over a twenty-year life, the annual maintenance will total four times the original price, making the cost five times the sticker price. Note that while a product you build yourself will have ongoing maintenance, it will mostly consist of features you actually want and/or adaptations to your changing application landscape.
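
A quick back-of-the-envelope sketch of that maintenance arithmetic. The 20% rate and twenty-year life are the figures above; the $1 million purchase price is hypothetical.

    # Total cost of ownership for a packaged application: purchase price
    # plus annual maintenance at 20% of that price over a 20-year life.
    def package_tco(purchase_price, maintenance_rate=0.20, life_years=20):
        maintenance = purchase_price * maintenance_rate * life_years
        return purchase_price + maintenance

    # A $1M package: $1M purchase + $4M maintenance = $5M over its life,
    # five times the sticker price.
    print(f"${package_tco(1_000_000):,.0f}")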

The second hidden cost of packaged applications is the implementation cost. You might think that the cost to implement a packaged solution and a custom solution would be comparable, but they are not. I have had the privilege of working on a custom system being implemented in one division of a client while another division was implementing a packaged solution. While the packaged solution saved some money on development, it more than made up for it in implementation. Let’s examine why this is so.

The cost of implementing a packaged application in an enterprise is mostly driven by these activities:

  • Configuration. As packages have become more and more complex, the act of turning various features on and off has become a dark art. In the world of health administration software, the battle between vendors is often fought on the “how many flags do you have?” front. The “flags” are options you can turn on or off in the system functionality; they essentially switch chunks of code within the system on and off. In some ways, configuration is the equivalent of coding. If there were just a few flags and they were isolated and independent, it would just be a matter of choosing between alternate bits of tested functionality. But when there are thousands, and they interfere with each other, configuration gets complex and should only be performed by experts.
  • Modification or extension. Most packages do not do exactly what you want them to. The alternatives are either to modify the software you’ve purchased or to extend it with additional programs. Extension is greatly preferred, because modifying third-party code makes it very hard to stay current with package updates. However, often modification is the only way to go. Here we get an interesting reuse of the concept introduced in the previous section. Recall that a change introduced late in a system’s lifecycle can cost 40 times as much as the same feature introduced early. Imagine you had 20 such changes to introduce. Adding them to an already existing system would cost 800 units of work (40 * 20), versus 20 units if they were added to a custom system early in its design stage: 40 times as much in total. Keep in mind you still have to build the base system to apply the 20 changes, but as we’ll see later, this is nowhere near as hard as it used to be.
  • Integration. In most enterprises, the cost of integrating the new system with existing systems exceeds the cost of the system itself. The cost of integration is roughly proportional to the number of touch points in the enterprise times the number of touch points in the new package, times the complexity of each (see the sketch following this list). As mentioned in Chapter 1, the US Department of Defense gave up on its packaged HR implementation when the integration cost hit a billion dollars and the project was only one quarter complete.
  • Data conversion. When you buy packaged software, you get a data model that is arbitrarily different from the one you are replacing. (Let’s hope you are replacing only one; many package implementations are additive: they add just a few extra features or datatypes that can’t be handled by the existing systems, but aren’t worth dismantling the existing systems for.) By itself, that isn’t a problem. The problem arises when you do a trial conversion and realize that, in addition to being structured differently, the new system requires new data and has different ideas about data integrity and validation. This often gives rise to a “data quality” project to “clean up” the data. Yes, your existing data had some weaknesses, but it mostly got the job done. The new system often has arbitrarily different ideas about quality (baked in from the vendor’s experience with other clients) to which you must now conform. Again, the data clean-up project is often several times larger than the cost of the software. To be sure, some of the clean-up is useful, but most of the work is conforming to a different model.
  • Change management. Most of your employees have grown up using the existing system. This is how they know your business and your industry. Whatever the labels are on the screens and reports, whatever workflow the system has imposed on them, over time it becomes their internal mental model of how a system should work. Many have made careers out of working around the idiosyncrasies of the existing system to get it to work. The new system has all new screens, new reports, new terms, new labels, new implicit and explicit workflow, and a need for a whole slew of new workarounds. The new system proclaims this state of affairs as “best practice.” It is just some averaging of the vendor’s original conception with whatever part of their implementation experience they could fit back into the base product. There is occasional streamlining. However, the jarring nature of the change leads to a need for “change management,” which is a euphemism for retraining people who had learned on the job and aren’t excited about a system that typically isn’t easier, just different. Again, the change management portion of a package implementation project can easily exceed the cost of the system.
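
To illustrate the integration point above, here is a minimal sketch of that rough proportionality. The function, its cost coefficient, and the touch-point counts are illustrative assumptions, not a calibrated model.

    # Rough integration-cost intuition: cost grows with the product of the
    # touch points on each side and their average complexity.
    def integration_cost(enterprise_touchpoints, package_touchpoints,
                         avg_complexity, cost_per_unit=50_000):
        return (enterprise_touchpoints * package_touchpoints
                * avg_complexity * cost_per_unit)

    # The same package is far costlier to land in a crowded landscape:
    print(f"${integration_cost(10, 5, 1.0):,}")    # modest enterprise: $2,500,000
    print(f"${integration_cost(100, 5, 1.0):,}")   # 10x the touch points: $25,000,000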

Early in my consulting career, we often selected and implemented packaged solutions for clients. Many of those worked out well. However, part of this was due to the relative simplicity of the systems in those days, and part was due to how few other systems there were to integrate with. Furthermore, part of it was that people were coming from no system to a system, rather than from system to system. The manual data entry of the master files was both data conversion and training all in one.

Now the landscape has changed. I remember, as packages started getting more complex in the early 1980s, sitting in on a pitch by a consulting firm to do a packaged implementation, and how they explained that the cost of implementing this type of system was typically three times the software acquisition cost. I would sit in on many more of these types of pitches over the years, and the multiplier climbed from three to five to ten. I’m not even sure what it is anymore.

However, even in that first meeting it struck me: what, if any, is the relationship between the professional services needed to implement a package and the cost of the package?

This thinking still prevails.33

I had a flashback to one of my first professional jobs, as a real estate agent for a company that specialized in selling taverns. I had just turned 21, and looked about 16, which made it difficult for me to get into the taverns to ask the proprietors whether they would like to list their establishments for sale with us. What I struggled with more than my youthful appearance was the firm’s approach to valuation. The owner of the realty firm had decided that a tavern should sell for about $10,000 times the number of kegs it went through per month.

I was a business major at the time, and this made no sense to me. It seemed to make sense to everyone else, because after a decade or so of promoting this idea, whenever I’d show someone a tavern and want to go over its P&L, they would ask how many kegs it went through in a month and decide the price was either high or low. A tavern could be losing money and still gaining value based on the number of kegs it went through. It certainly discouraged ancillary activity such as food, wine, or liquor, which typically have much higher margins but don’t contribute to the keg metric.

I saw the implementation metric the same way. What possible connection could there be? If the software company suddenly charged half their previous fee, would the consulting magically drop in half? No. I saw only two correlations. First, over time, the five cost drivers of package implementation kept climbing. Second, software that was more expensive tended to be more complex, and complexity is a prime contributor to all five of the above cost drivers.

Finally, each package (really each application, but packaged applications are both more numerous and more complex) adds to the overall entropy of the enterprise. An enterprise with 10 major application systems is a bit complex, and it will take effort to introduce an 11th application into that mix. However, introducing the same application into an enterprise with 100 or 1,000 applications is much more difficult, and each application added to the mix raises the cost for all subsequent applications.

We’ve laid out the many hidden costs of packaged implementations, but this still doesn’t necessarily make them more expensive than their custom alternatives. We have just seen that the total cost of implementing and owning a packaged application is often 10-20 times its purchase price, which makes it incumbent on us to look very hard at the build-versus-buy decision before retreating to simple clichés.

The agile software industry has pretty much proven empirically what I’m suggesting here logically. It is often much cheaper and faster to build functionality than to buy it.

When you should buy packaged software

You should buy packaged software when:

  • You can use it as is, with no modification and only simple customization
  • It contains internal logic that far exceeds your own organization’s ability to develop
  • It has simple or uniform interfaces that could be made to conform to your firm’s data model, rather than forcing you to adopt yet another variation
  • The purchase price is no more than one tenth of your estimate of what it would cost to build the functionality in house
  • You have real schedule pressure that makes time to implement a key factor. Bear in mind, though, that while custom development adds time to your timetable, you may gain it back through reduced time to implement.

When you should build software

There are two scenarios for opting for the build route:

  • The small application
  • The many large applications

An example of the small application scenario is automating some low-volume or low-complexity business area for your firm. Maybe you want to automate the supply rooms in your offices or track employee progress through a training curriculum. These are projects where your needs and requirements are often modest.

If you have even a very small band of agile developers, you’d be surprised how quickly they can create a first working version of these types of systems, and how rapidly they can iterate toward a completed state.

The “many large applications” scenario is much more nuanced. When you have many large (legacy) systems, it is tempting to want to upgrade them. “Legacy modernization” is the current term of art for replacing your legacy systems with neo-legacy systems: systems built with more modern programming languages, databases, and the like, but which nonetheless have most of the undesirable features of the original legacy systems, such as a high cost of change.

When you have dozens of legacy systems, it is tempting to try to knock them off one at a time. There is a very important flaw in this thinking: the boundaries of what is in one application or another are not logical. They are based on the sequence of events that led to this particular configuration of applications. Even two firms with the same applications will often use them for different subsets of their business, purely by historical accident.

If you are committed to getting out of the application quagmire, it will require some strategy and at least some custom development. You will need a long-term plan that allows you to build new functionality that partially relies on your new more permanent and stable data structures and partially reaches into your existing systems for functionality and data you haven’t had time to convert yet. As you can probably appreciate, this kind of functionality is almost the opposite of packaged application functionality. It is about lightweight functionality riding lightly and changing frequently.

If you are in this situation, we recommend two sub-strategies with regard to build versus buy:

  • The functionality that you will need to bridge the new with the old will be mostly built. You may buy an architecture or environment that makes it easier to build this, but the combinations you have to construct have never been encountered before, and are de facto custom.
  • When you get ready to tackle a component that has the potential to be a package, estimate what it would cost to custom build the functionality you need before you go out to bid. Take an Accounts Receivable system, for instance. There is very little logic in an AR system. Aging of receivables is just categorizing based on due date versus today’s date (a sketch of the aging logic follows this list). Calculating late fees and early payment discounts is just a few simple rules, based on balances and cutoff dates. Recurring invoices reuse templating and scheduling features. Most of the rest of the functionality is records management and form population. If you’re still on the fence, have one of your agile teams build an MVP (minimum viable product) prototype of the needed system. You will know, in less time than it takes to launch an RFP, whether they will rapidly converge on the features you need.
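
As an example of how little logic there really is, here is a minimal sketch of receivables aging, with hypothetical bucket boundaries; it is just bucketing by days overdue.

    # Aging of receivables: categorize an invoice by due date versus
    # today's date. Bucket boundaries are hypothetical.
    from datetime import date

    def age_bucket(due_date, today=None):
        days_overdue = ((today or date.today()) - due_date).days
        if days_overdue <= 0:
            return "current"
        if days_overdue <= 30:
            return "1-30 days"
        if days_overdue <= 60:
            return "31-60 days"
        if days_overdue <= 90:
            return "61-90 days"
        return "over 90 days"

    # An invoice due January 15, viewed on March 1 (46 days overdue):
    print(age_bucket(date(2024, 1, 15), today=date(2024, 3, 1)))  # 31-60 days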

In my opinion the build/buy decision, whether for a small standalone application or as an overall modernization strategy, deserves a great deal more thought than just throwing out the cliché “Let’s not reinvent the wheel.”

Fallacy # 3 “Software development is analogous to construction or manufacturing”

Once upon a time, writing software was an art form, more akin to poetry than engineering. However, things got big and complex, and eventually the engineering analogy took hold. In general, this was a good thing: as systems became larger, more discipline was needed. Along the way, though, another analogy took hold. This one was aided and abetted by the Zachman Framework.34

The analogy is this: construction and manufacturing take designs and turn them into real things.

In the case of construction, there are generally blueprints, but often not to an extremely precise level of detail. House blueprints generally specify where the walls are but not where the nails go. Manufacturing often specifies the build process in detail, but this is in order to make the exact same thing repeatedly. Perhaps the best analogies, and the ones that got used the most, were those for products that combine many features of construction and manufacturing: aircraft and submarines.

Here is how the analogy works.

A complex thing, like an airplane or a submarine, starts life as a conceptual design. The design is gradually transformed from one rendition to another, each at a greater level of detail and precision. At each level, different constraints apply. In the airframe business, this high-level design is called the “air transport system,” which determines if the design is feasible (Can it fly?) and if we have the right features (Are customers more interested in range or fuel efficiency?). There is no point investing engineering time on detailing features that won’t work or aren’t wanted.

The second level in the airframe business is called the industrial process. At this point, they work out issues like the propulsion system, level of technology and risk, and key suppliers and partners. There are design decisions made at this level, including preliminary materials selection and more elaborate designs.

The third level in airframe design is the engineering level. At this level, the design splits into many specialized sub-disciplines, each of which creates a more detailed model subject to its own constraints, and then the various models are reintegrated. This is also the level where the direct operating costs are established.

The fourth level is subsystem design where all the subcomponents are optimized.

In the software analogy, Zachman has proposed six levels:

  • Executive Perspective (scope)
  • Business Management Perspective (business concepts)
  • Architect Perspective (systems logic)
  • Engineer Perspective (technology physics)
  • Technician Perspective (tool components)
  • Enterprise Perspective (operational instances)

Across each of the levels there are six “interrogatives” (what, how, where, who, when, and why) that further subdivide the subdomains of interest.

While this was originally justified as an analogy to other complex construction and manufacturing processes, the analogy breaks down in many ways.

Firstly, it is inescapable that design changes late in the manufacturing or construction process have huge cost impacts (it’s pretty easy to move a bearing wall before a building is built, very costly to do so afterward). And while early-generation software had many of these same characteristics (expensive to change in the field), modern distribution methods coupled with agile development methods mean that changes can be propagated late in the delivery process. Amazon can roll out a system change in minutes. Amazon has gotten so good at continuous deployment that they now roll out five changes to their production systems per second.

Now that the cost of change has changed so profoundly, we can see that the basis for the analogy of software to construction has completely broken down. Airplane designs are on paper or electronic, but the planes themselves are made of aluminum (and other materials). With software, there is no such materiality shift: today the designs are electronic artifacts, and so is the final system. As such, there is no essential need to transform from level to level.

In the old days, there was no way to implement a conceptual model, so it seemed natural to think this was an initial phase in the design of the system. In the same fashion, logical models were not directly implemented and table and column names were changed as the design became more physical. Indexes and other implementation considerations were introduced at the bottom of the matrix.

In systems built from architectures like this (including TOGAF, which is an open-standards-based architecture framework), there is a tendency for the conceptual models to get left behind. If you create a conceptual model and then derive a logical model from it, in the process you will learn new things and alter your logical model. It is a great deal of extra work to go back and keep the conceptual model up to date. This can be somewhat alleviated if the logical model can be completely generated from the conceptual model, but if that is possible, you should ask yourself whether there is any need for two models. In any event, in our observation, logical and physical models are rarely kept in sync with the conceptual and logical models that spawned them.

In our opinion, the separation by level is an artifact of older design techniques that actually hampers application development.

In software, the artifacts at the conceptual level can be the same as those at the implementation level, as with more modern databases such as graph and document databases.

The analogy breaks down further in that software and data are reusable in ways that physical things are not. If you decide that an airplane should have two engines, you aren’t done when you’ve built one engine. In software, however, if two processes need one shared routine, the second one is complete when the first one builds the routine. There is essentially no cost to reusing software.

The interrogatives are semantic distinctions, to be sure, but in an information system they all end up as data. There is data about the:

  • what (physical things)
  • how (process templates and process instances)
  • where (at one level geospatial location of the physical things and at another the geospatial location of the data centers and data itself)
  • who (identification and in some cases authentication of individuals referred to or participating in the system)
  • when (the framework focuses on cycle time, but other temporal aspects are important such as temporal relationships)
  • why (goals and metrics are the data instantiation of motivation)

So it’s all data: different types of data. There is data about types of items (e.g., Employees and Patients) and there is data about individual things (e.g., Nurse Nichols and Jeff). Most of the type information may be developed “higher” in the framework, and most of the individuals come into existence after the system is put into practice.

We see little value in sticking with this analogy. It encourages a multitude of models, when a few will do. It encourages transformation, when sub-setting will do. It encourages new identifiers when the concepts and identities have not changed.

We have found that abandoning this analogy allows us to build conceptual models and then populate and use them. There is something akin to levels, but there is no need for transforms between them. When we want to change an implementation, we change the configuration at the implementation level. If it involves something at the conceptual level, we introduce it at the conceptual level and it is immediately available at the implementation level, with no need for transformation. We will pursue this in more detail in the companion book.

Overall, we find that this analogy hampers rather than helps the development of systems.

Fallacy # 4 “Software projects can only be estimated by analogy”

After abandoning the construction or manufacturing analogy for software building, you may think it odd that we embrace it for estimating.

Let’s retrace the history of software estimation to see how we got to where we are. What we will see is that for the most part, software project estimates have been getting larger for the same functionality, at the same time that they are getting less accurate.

Early on, people had no idea how long it would take to write and debug a program. Luckily, over time people ended up writing programs that were incredibly similar, and it became like a craft industry. In a craft industry, once you know how long it will take to form, fire, glaze, seal, and finish a bowl, you can pretty well estimate how long it will take to do the next one. You might make adjustments for necks, handles, and lids, but you will have a base estimate with parametric adjustments.

Software development was in that craft stage. Once you had written the inventory receipt program, you had a good idea how long a cash application program was going to take. In those days, and even today, few people understand that this is essentially the same program, and are happy to be craftspeople, knocking out lots of similar hand-crafted pieces of art, each only slightly different from the others.

At this point, some project manager, tired of overseeing a group of craftspeople whose output he or she couldn’t predict or manage, decided to start measuring and predicting. The first attempts focused on the output. This is a reasonable approach if you are going to measure ditch diggers or tailors: you might measure cubic yards of trench, or linear feet of seams sewn. However, this approach falls down in other domains. Measuring light by wattage is one such example: as long as all light bulbs were equally inefficient, we could correlate light output (luminosity) with power consumption (wattage). With compact fluorescents and LEDs, that correlation is out the window. Similarly, if we measured our potters by the pound, I could guarantee that we would get many larger and heavier bowls.

Lines of code

This was also the case with software. The primary early metric was LOC (lines of code). The belief was that a program twice as long must have solved a problem twice as difficult. Moreover, a project manager could estimate how large the system would be (in LOC) and measure progress toward that goal.

Only this backfired massively. Once programmers knew they were being measured by LOC, they started writing more and more lines. Reusing a subroutine was detrimental to getting credit for writing more lines. I recall watching project managers recoil in horror as their LOC counts exploded while the project hardly moved any closer to completion. The number of lines of code continued to grow, actually harming productivity.

While the number of lines of code is barely correlated with the size of the problem being solved, it is highly correlated with something else: defects.35 Different industries and different development approaches lead to different defect rates, but there are patterns:

  • There are inherent defect introduction rates, ranging from about 1-3% of all human decisions
  • Defects can be introduced in any phase of the development process (requirements, design, coding)
  • Most software defects are found before product release
  • Different development styles (for instance coding with complex conditional branching logic) introduce defects at different rates
  • Different defect detection methods (testing, inspection, etc.) have different discovery rates

One thing they all converge on: the more code you write, the more bugs you produce. From the studies I have reviewed, the number of latent bugs in released software ranges from 1 to 15 per KLOC (thousand lines of code). Your program with a million lines of code has at least a thousand defects. Unless you have a particularly rigorous defect detection method, you may have 15,000 defects in your released product.

If you allow or encourage your developers to solve the same problem with more lines of code, not only are you ruining their real productivity (while enhancing their perceived productivity), you are increasing your defect count. Let’s say you have a latent defect rate of 1% (10 defects per KLOC). If you have a problem that could have been solved with 1 million lines of code, you should expect to ship it with 10,000 defects. If that doesn’t bother you, consider what happens if your perverse incentive program of rewarding lines of code produced actually “worked.” Imagine you solved the same problem with 2 million lines of code. For the sake of argument, let’s say your incentive system resulted in twice the “productivity.” You would be done at the same time, with a system that now has 20,000 defects. Of course, if you actually did achieve twice the average daily output of code per developer, you should expect an even higher defect rate; haste makes waste, as they say.
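
The defect arithmetic above, as a minimal sketch. The densities are the cited range of 1-15 latent defects per KLOC; the line counts are the ones from the example.

    # Latent defects scale with code size at a given defect density.
    def latent_defects(lines_of_code, defects_per_kloc):
        return lines_of_code // 1000 * defects_per_kloc

    print(latent_defects(1_000_000, 1))    # 1,000: the best cited density
    print(latent_defects(1_000_000, 10))   # 10,000: the 1% rate above
    print(latent_defects(2_000_000, 10))   # 20,000: same problem, twice the code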

This is not a theoretical concern. Anyone who delivers software knows that these defects come back to haunt.

As such, a line of code turns out to be not a very good metric for estimating project size, but a very good metric for predicting defects. Hold that thought for later fallacies.

Function points and user stories

It is much more productive to size the problem rather than the solution. The function point methodology does this by assigning points to various aspects of the problem to be solved. There are points for interactive forms, with more points for more fields and more complexity. There are points for interface transactions, points for reports, and points for tables and columns.

What is important is that there are no function points for lines of code. In fact, the methodology flips the productivity question on its head and allows you to ask: how many lines of code did it take to build this? It allows you to compare programming languages for their productivity.

A study by QSM reports the average number of lines of code per function point for different programming languages.36

It expresses what we would expect: some programming languages are more expressive, and therefore can deliver the same functionality in fewer lines.

The agile movement has done something even more upstream: estimation by user stories. A user story is a narrative description of a problem to be solved. Team members consider the size, complexity, and risk of the request, relative to where the product is at that time, and create subjective estimates of “points” per user story. There is no uniform definition of the size of a story point or of the amount of time a given team will take to deliver one, but most teams converge on consistent values for both their points and their velocity. Velocity is the agile term for the rate at which you turn user stories into tested software.
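
A minimal sketch of velocity-based forecasting as described above. The backlog sizes and the velocity are made-up numbers; real teams calibrate both from their own history.

    # Story points in the remaining backlog, as sized by the team.
    backlog_points = [5, 3, 8, 2, 13, 5, 8]
    # Velocity: points of tested software completed per sprint, from history.
    observed_velocity = 20

    total = sum(backlog_points)               # 44 points
    sprints_left = total / observed_velocity  # ~2.2 sprints
    print(f"{total} points -> about {sprints_left:.1f} sprints remaining")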

Both function points and story points have proven to be accurate ways of estimating software development projects. However, this is not how most large software development projects are estimated.

Detailed work plans

Another approach to estimating is to prepare a detailed work plan of all the tasks needed to complete the project. While this has the look of a metric-based estimate, it is mostly judgment-based, relying on the estimator’s judgment about the size of each task.

I remember one of my first major projects was to do a high-level design of a custom payroll system for a large forest products company, and to then estimate the cost to build and implement said system.

This was just before spreadsheet software was invented, so the work plan was handwritten on dozens of large-format manual spreadsheets. I remember sitting down with my boss to review my estimates before we presented them to our client. There were hundreds of tasks. He started with some of the first ones and challenged my estimates: “80? No, I think we can do this considerably faster than that; let’s say 30.” Then on to the next one: “240? Let’s make it 150.” On it went for another ten minutes or so, until we got to one that I couldn’t see cutting, and I protested: “There is no way we could complete that task in 10 hours.” “10 hours? I thought these estimates were in days.”

That says a lot about the subjectivity of estimating, even detailed task-level estimates. And anyone who has ever had to manage to these “estimates” realizes that what you actually do in the project bears only a remote relation to the original work plan.

Estimation by (gross) analogy

The subjectivity just ramps up as the scale ramps up. My observation is that large IT projects are estimated by analogy. The large systems integrators that have done “similar” projects size up the gross characteristics of the system, such as the size of the client, the number of employees affected, and the number of systems to be integrated, and then rely on their experience with similar projects to produce estimates.

It is great that they have experience, but this experience is impossible to challenge or question, or even to quantify or qualify. Generally, the reasoning is something like: “We did a similar project for another company in this industry; it took three years and cost $100 million, but we lost money on that project, so we should bid this at $120 million.” I’ve done estimating, so I know there is more behind their estimates than this; there is always a work plan behind the estimate, but it more often exists to justify the estimate than to determine it.

I was recently on a raft trip with a manager from a large (100,000 employees) second-tier consulting firm. He was relating the difficulty they were having in convincing a client that they were the right firm for an $800 million customer support systems project; they had only ever done a $200 million customer support project. He was worried that the client would be swayed to go with a tier 1 firm that actually had experience with projects of that scale. The real question is this: how did the client (and the consultants) become convinced that this was an $800 million project? Because once convinced, it is a self-fulfilling prophecy.

Although I haven’t seen the justification for the $800 million customer support project, I suspect, based on project work-ups I’ve reviewed, that the substantiation is in a very voluminous report, with scores of Excel spreadsheets, and appears very detailed and authoritative. In the cases where I have dug in deep, I’ve found that most of the estimate is based on a few very high-level assumptions, which are then extended, multiplied, and variously factored up to get the detailed work plans (almost always with two or three alternative approaches that have been similarly grossed up for comparison). The human mind does better with relative amounts than absolute amounts, so when you put an $800 million project plan up against an alternative approach at $925 million, the $800 million wins every time.

In addition to contributing to project inflation, the other difficulty with this type of estimate is that it is completely unhelpful for making any decisions about course of action as the project progresses.

I once reviewed the detailed justification for a packaged implementation for a state agency. The key part of the spreadsheet consisted of 150 rows and 17 columns (mostly out-year projections). However, if you drilled down on many of the key cells in the spreadsheet, they eventually traced back to a couple of numbers far off to the right of the spreadsheet, out of sight of most anyone reviewing the analysis. These two numbers were the cost to upgrade the current system ($84 million) and the cost to implement an ERP system ($115 million). There was no other source for these numbers; the rest of the spreadsheet just allocated and spread them around by year and by expense category. We were working on another project in parallel, and had an opportunity to review our own estimates of the project with some knowledgeable stakeholders. Our back-of-the-envelope estimate was that the implementation would be closer to $1 billion than $100 million. The stakeholders concurred. In the ensuing 10 years the project has still not been attempted, so we’ll have to wait to see whether the precise but spurious estimate was correct or the back-of-the-envelope one.

Summary of estimation

Unless you have a way to estimate the cost of introducing a new function or feature to your system, you will have no basis for making decisions. You will have no way of evaluating your contractor. You will discover way after the fact that “this was way harder than we thought.”

There are ways to describe the functions you are loading on to your system. There are ways to prioritize them. There are ways to evaluate whether a given team is more or less productive than another.

You need to embrace these methods and turn your back on megaproject estimation, which never reveals any low-level quantification of the basis of the estimates (other than experience). If your contractors are basing their estimates on their experience, and are not managing against anything that will get better over time, their estimates and productivity will continue to get worse over time.

Fallacy # 5 “Having one neck to choke is an advantage”

We hear this often. It comes from the experience of hiring several consulting firms to solve a problem, and then having them blame each other for problems on the project.

One analogy may be your basement remodel. You could hire an architect to design the space, a plumber to redo the bathroom and plumb in the sink, a cabinet maker for the cabinets, a tiler for the shower and backsplash, and a carpet layer for the floor, and select all your fixtures and appliances yourself. You have some choices as to how to coordinate this. You could be the project manager yourself; this is mostly a question of time and inclination. You could hire a general contractor, who will coordinate the subcontractors but who isn’t liable for their work. Or you could hire one firm to do the whole thing.

A basement remodel is pretty tangible, and it is easy to imagine the tradeoffs among these management models. Having just completed a basement remodel, let me share my assessment. Our project cost about $50,000. About half was materials, about a third was various skilled and unskilled laborers, and about a sixth was for the general contractor (we went with the middle model).

We opted to have the concrete floor ground and stained. When the floor contractors screwed up, our general helped us hold their feet to the fire, but there was never any question as to whose fault it was and whose responsibility. We were not going to require the general to redo their work.

In the “one neck to choke” model, we would have said: “You’re responsible for everything; hire whichever subs you want, but you have to make good.” I’m sure, had this been an option, the price would have ballooned. The general is only going to hire the best contractors (cost inflation can easily creep in here), is going to pad the estimate to make sure they can handle any claims, and will be very rigorous about the scope.

My guess is this sort of arrangement could easily have doubled the cost of our basement project. The cost gets away from you because there is no built-in market for each part of the project. The way we did it, we could price different concrete treatment subcontractors, and if that looked like it was getting unreasonable, we could shift to wood or carpet. Once you go for an all-in price, you lose a lot of the opportunity for constructive change.

Unlike a basement project, if you’re launching a systems project, you have no idea how much it will cost. My observation is that getting a sole source for a large project almost guarantees it will cost five to ten times what it would cost if it were more proactively managed.

J. Paul Getty once said:

If you owe the bank $100 that’s your problem. If you owe the bank $100 million, that’s the bank’s problem.37

This analogy is apt for large projects. If you are a hundred million dollars into a hundred-and-ten-million-dollar project, it is your problem and not your contractor’s.

There may be some comfort in knowing whose neck it is you want to choke, but the actual likelihood of getting anything from your choking effort is slim.

Before you settle in on a “one neck” contracting strategy, think through what each aspect of your system could cost, if it were done using modern approaches and contractors who were not incented to pad the bill.

We like to start this thought process in two ways. One is to decompose the work into chunks that can be considered independently (like the floor and the cabinets in the basement analogy); the other is to consider what the best and worst cases look like for each chunk.

Most system implementation projects involve:

  • Application software (buying, building, adapting, etc.). Sometimes it helps to decompose the problem a bit, as some of the parts are easier to think about independently. The Child Support Enforcement system I mentioned earlier decomposes nicely into four major subsystems, each of which is quite different and might be better addressed with a different type of technology. There is an intake portion, which is very similar to most case management systems and is primarily a workflow and form-filling system. There is a financial system, including transactions, audit trails, and reconciliations, in place to prove that all funds were dispersed according to policy. The remedies subsystem is mostly about rules (how many days’ notice is given before you revoke someone’s chauffeur’s or hunting license) and external interfaces (how you automate the sending of the license revocation to the appropriate agency). Finally, there are business intelligence and analytics functions that need to be addressed. When you divide it up like this, it’s much easier to estimate and to know what you need to create an adaptable system.
  • Data. There is usually data to convert, data to acquire, data to create, and often data to clean up. It is possible to sample your data well ahead of time to get an idea of the overall quality and appropriateness of your data and the effort it will take to prepare it for the new system.
  • Processes. New systems often involve changes to established processes. The general errors that people make in estimating these are:
    • Underestimating the time and cost for your users to learn the new system. The larger the workforce affected, the more time should be spent making the system intuitive, rather than relying on training to make the transition.
    • Being overly optimistic that the users will accept the changed processes. Often the new system proposes changes to which the users are either actively or passively opposed.
    • Too much detail in proposed process flows. Designers often believe they can and should flowchart every possibility and train people for them. While this sounds like a good idea, it can easily get out of hand, and many hypothetical special processes get designed and built that aren’t necessary or are so rare as to not warrant codifying.
  • People. Some systems require people to obtain new skills or learn new approaches. These are hard to estimate and should be decoupled from implementation. For instance, if I were to redo an airline’s internal system, I would create a system flexible enough to mimic the existing system (with all the keyboard shortcuts and manic typing), and in parallel introduce more intuitive, graphical, drag-and-drop interfaces. Although it would cost more, it would take the massive retraining off the critical path for the project, and allow it to come in with a much more phased approach.

When you’ve chunked the main portions of your project into the above buckets, start thinking about what each is likely to cost and how long it’s likely to take. I like to think about the cost of each chunk in three ways:

  • Would cost
  • Should cost
  • Could cost

Would cost

In this breakdown, “would cost” is what you would expect to pay if you let things go their normal route. “Would cost” is what you’ll get if you go with the “one neck” theory. If you bring in consultants to estimate project costs, they will base their estimates on analogies to other projects they have done. These are almost always analogy estimates, not parametric estimates.

We work a lot with a state agency that had contracted a firm to estimate the cost of an ERP implementation. After surveying a number of states, the firm came up with the number $125 million +/- $20 million.

Our independent assessment, based on other work we had done for this agency and our knowledge of the scope, suggests that a realistic range of costs for the project runs from $5 million to over $1 billion. As strange as it sounds, both of those extremes are as likely as $125 million. Actually, they are more likely. As we’ll discuss, the $125 million number (even with the +/- $20 million) is almost impossible to hit.

The problem with a $125 million project is that you staff up for it. Most companies don’t have the resources to staff a $125 million project, so they go outside. Consultants may add some additional software to the total, but it’s hard to spend much more than a few tens of millions on software. The real purpose of adding software to an implementation project is to give the staff something to do.

External consultants typically charge $150-$200 per hour, or about $300K-$400K per year. You’ll need 300-400 person-years to hit a $125 million budget. It’s just not possible to get that many people coordinated around a one-year project; it will be 3-5 years, with 100 people.
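
Checking that staffing arithmetic: the hourly rates are the ones quoted above, and the 2,000-hour billing year is an assumption.

    # Converting a $125M consulting budget into person-years.
    BUDGET = 125_000_000
    HOURS_PER_YEAR = 2_000  # assumed billing year

    for hourly_rate in (150, 200):
        annual_cost = hourly_rate * HOURS_PER_YEAR   # $300K and $400K per year
        print(f"${hourly_rate}/hr -> {BUDGET / annual_cost:,.0f} person-years")
    # Roughly 310-420 person-years: e.g., 100 people for 3-4+ years.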

When 100 people try to solve a complex, highly interconnected problem, they usually end up in the same place: a certain percentage of the team is discovering the “real” requirements (which changes work done to date) and another percentage is uncovering new constraints (many from the software that was selected to try to solve the problem). This effort, while well intentioned, causes churn on the work done to date. No one wants to admit that this is happening until it is too late to deny it.

By this time, it’s time to re-baseline the project.

Why do I say it’s just as likely to hit either extreme? At the high end you have only to look at Healthcare.gov, the California Child Support Enforcement System, or the DOD DIMHRS system to see how medium-sized systems projects can easily become billion-dollar blowouts.

Most of these systems are not complex. A team could tackle the whole thing in an agile fashion and be done in a fraction of the time and at a fraction of the cost.

Therefore, “would cost” is what happens if you accept the advice of the “best practice” practitioners: budget $100 million and spend $200 million.

Should cost

Decomposing the project into its chunks allows you to start thinking about “should cost.” “Should cost” is what it would cost you if you knew, before you started, most of the problems that were going to cause the price to balloon, and addressed them proactively. This is the cost if you used traditional technology and executed as well as possible. If we used case management software for the intake portion of the child support system discussed earlier, a simple accounting system for the financial distributions, and a flexible rule and integration system for the enforcement part, it’s not hard to imagine each of these sub-projects coming in at less than $20 million, with far less risk.

Could cost

The “could cost” goes one step further. If you knew what was likely to change the most (forms, and therefore database structure, in the intake; allocation rules in the financial system; legislated notification requirements on the enforcement side), you’d lean toward approaches that make those kinds of changes easy. In the same way that Amazon has streamlined its cost of introducing changes to its online sales system, your cost analysis should favor architectures that make the parts of the system with the most components, variation, and change the easiest to change.

If you knew that the cost driver would be the number of interactive user interfaces (UIs) that needed to be built, instead of rolling up your collective sleeves and coding them, you would ask: “What would need to be true to get the cost of building a UI down to under a day, and the cost of changing a UI down to hours?”
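One plausible answer (an illustration only, not a claim about any particular product) is to generate UIs from declarative metadata instead of hand-coding each one, so that adding or changing a form becomes a data change rather than a coding project. A toy sketch in Python, with an invented schema format:

    # A toy metadata-driven form generator. The schema format and the
    # field names are invented for illustration.
    intake_form = [
        {"label": "Parent name",        "type": "text",  "required": True},
        {"label": "Monthly income",     "type": "money", "required": True},
        {"label": "Number of children", "type": "int",   "required": False},
    ]

    def render(form):
        """Emit a crude HTML form from the schema; a new field is one more row."""
        rows = []
        for field in form:
            star = " *" if field["required"] else ""
            rows.append(f'<label>{field["label"]}{star}</label> '
                        f'<input data-type="{field["type"]}">')
        return "<form>\n" + "\n".join(rows) + "\n</form>"

    print(render(intake_form))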

When you make those changes, you get little apparent progress for the first half of the project, and then it all completes so rapidly that the effects of change never have the opportunity to take hold.

The “should cost” and “could cost” analysis may take you away from the one neck theory.

However, if you go with “one neck,” you should do two things:

  • Make sure the portions of the project are loosely coupled, so that you could swap out the approach to one part of the project without endangering the whole (see the sketch after this list).
  • Make sure you are tracking progress and productivity in a way that you can align with your “should cost” or “could cost” analysis, and make changes if the project gets out of hand.
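Here is a minimal sketch of what loose coupling between portions means in practice, with invented names: the rest of the system depends only on a narrow interface, so one portion’s implementation can be swapped without endangering the whole.

    # Loose coupling between project portions, with illustrative names.
    # Callers depend only on the interface, never on a concrete vendor.
    from abc import ABC, abstractmethod

    class EnforcementModule(ABC):
        @abstractmethod
        def notify(self, case_id: str) -> None: ...

    class VendorEnforcement(EnforcementModule):
        def notify(self, case_id: str) -> None:
            print(f"vendor package sends notification for {case_id}")

    class InHouseEnforcement(EnforcementModule):
        def notify(self, case_id: str) -> None:
            print(f"in-house rules engine sends notification for {case_id}")

    def run_enforcement(module: EnforcementModule, case_id: str) -> None:
        module.notify(case_id)    # no caller names a concrete implementation

    run_enforcement(VendorEnforcement(), "case-42")
    run_enforcement(InHouseEnforcement(), "case-42")  # swapped, callers unchanged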

The “could cost” for projects of these types is often a few million to tens of millions (it’s usually the interfaces and the legacy systems that are the hardest to improve).

If you believe this, as I do, you would realize that you could launch several projects aimed at the “could cost” model, and even if some of them failed, you’d come out far ahead.

However, launching several projects, or even launching one project with several potentially redundant parts, seems wasteful. It is tempting to be “conservative”: take the highest estimate, select the most qualified contractor, make them the prime contractor, and be content that you have “one neck to choke.” In Chapter 7 we introduce an alternative contracting strategy called “leader/follower,” which effectively gives you two necks to choke.

Fallacy # 6 “Each application has a positive ROI, therefore my IT portfolio must be returning far more than I am spending”

Most projects these days go through a Return on Investment (ROI) review process. Often these analysis spreadsheets have the precision of a long-term bond option valuation (a great deal of the analysis deals with how much we should discount future savings against current costs).
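The discounting arithmetic itself is trivial; the apparent precision comes entirely from the inputs. A minimal net-present-value sketch in Python, with invented figures:

    # The net-present-value arithmetic behind a typical ROI spreadsheet.
    # The formula is standard; the benefit figures are the guesses.
    def npv(rate, cashflows):
        """Discount yearly cashflows (year 0 first) back to today."""
        return sum(cf / (1 + rate) ** year for year, cf in enumerate(cashflows))

    cost_today = -10_000_000
    claimed_yearly_benefit = 3_000_000     # a "soft" saving, rarely measured
    flows = [cost_today] + [claimed_yearly_benefit] * 5

    print(f"NPV at 8%: ${npv(0.08, flows):,.0f}")   # ~ $1,978,000

Shave a third off the claimed benefit and the same spreadsheet flips from attractive to underwater, which is why the discount-rate debate is mostly a distraction.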

However, if you look closely (and I’ve looked at hundreds of these) you will see one of two things:

  • A long list of pretty dubious claims that have been dutifully costed out, and/or
  • Some minor benefits that have been costed and a “strategic” reason the project must be done

I’m not opposed to doing systems for strategic reasons. What I’m opposed to is using the strategic reason as an excuse.

Once upon a time back in the 1950s, which is the only time this exchange could feasibly have happened, a husband came home and greeted his wife with, “Hi honey, what’s for dinner?”

“Oh nothing.”

“Nothing, how come?”

“Because the library burned down today.”

“Oh….Wait a minute, what does the library burning down have to do with our dinner?”

“Nothing, really. But when you really don’t want to do something, one excuse is as good as another.”

This interaction reminded me of a project one of our clients was involved in: the replacement of an enterprise-wide human resources system. I read the justification statement. There were a few costed benefits, such as reassigning several legacy programmers and discontinuing some license payments, but nothing that would justify the project’s $18 million price tag. The benefit that put it over the top was support for “collective bargaining.” Apparently, a big union renegotiation was coming up, and the new system was sold on its ability to solve this problem.

Of course, the project ran over budget several-fold (and most of the overrun wasn’t even recorded because it got pushed out to user groups’ budgets). Nevertheless, this system did eventually go live.

I was talking to one of the survivors of the project a year or so after the conversion, lamenting the overrun. I said, “Well it did run over both budget and schedule, but at least you got the collective bargaining support.” The project member just shook his head. “It turned out the project was so late we couldn’t wait for it, and we had to implement the collective bargaining features in the old system.”

This is a classic example of the library burning down. If the real strategic need was collective bargaining, they would have known ahead of time that that requirement could be satisfied some other way. But because collective bargaining sounded like a good excuse, it could get attached to the solution the promoters of the new system wanted all along.

So my first problem with portfolio theory as applied to systems projects is that the costs are usually “hard costs” (the money will actually get spent and leave the organization), and they are usually underestimated by 50-100%. Meanwhile, the benefits are mostly soft, if they are quantified at all (“an improvement in productivity of 3% will yield a significant saving”). These soft savings only become hard if you do something effective with your 3% productivity improvement, such as layoffs, reassignments, or increased throughput.
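To see how this plays out, here is the same mismatch in a few lines of Python, with illustrative numbers:

    # Underestimated hard costs vs. unrealized soft benefits.
    # All numbers are illustrative.
    estimated_cost = 10_000_000
    actual_cost = estimated_cost * 1.75    # "low by 50-100%"

    payroll = 50_000_000
    claimed_benefit = payroll * 0.03       # a 3% productivity gain, on paper
    realized_benefit = 0                   # unless reassignments etc. happen

    print(f"cash out: ${actual_cost:,}; cash in: ${realized_benefit:,} "
          f"(claimed: ${claimed_benefit:,.0f})")

Real money leaves; the claimed $1.5 million a year arrives only if someone actually acts on the productivity gain.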

It is very rare that anyone ever even attempts to measure the professed benefits after the project is over. It is almost a cliché now for ERP implementation projects to have a disclaimer that most of the benefits will not be realized until more than a year or two after the implementation is complete, which is usually enough time for the consultants to get out of town.

Therefore, we have underestimated hard costs and overestimated soft gains or strategic benefits that are rarely measured. This seems like a problem.

But this isn’t the real problem.

The real problem is the externalities. Economists define an externality as a cost borne by the environment rather than by the perpetrator. When a manufacturer pollutes a river, it has a real cost to the community: consumers of water may have to spend additionally on purification, sickness incurs additional healthcare costs, and the like. Removing the pollutants from the water before returning it to the river would cost the manufacturer additional money.

In the case of information systems, the externalities often outweigh the benefits the system was meant to deliver. In this case, the externalities include:

  • Increasing the size and complexity of the “data scape.” The data scape is the set of distinct concepts that must be managed in order to manage all the data being used. Each application that adds another 10,000 attributes to a company’s collective set of attributes has increased the complexity, and therefore the cost, for all the other systems that now must deal with them.
  • Increasing the legacy technical debt burden. Each additional application system adds more technical debt to the firm. Technical debt is the agile term for unnecessary code complexity that raises the cost of subsequent system changes.
  • Legacy “lock-in.” Every system that gets built increases the opportunity for legacy “lock-in.” Lock-in of system A occurs when system B becomes dependent on some aspect of system A: a data feed, an API, or a process to retrieve information from system A. Every incidence of this form of lock-in makes it harder to dislodge system A when its time finally comes (see the sketch after this list). Making it more difficult to replace a legacy system is one of the largest drivers of cost in corporations today. This is why the mere implementation of a system increases the externality costs.
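Lock-in, at least, can be made visible. Here is a toy sketch that counts the inbound dependencies that would have to be unwound before a system could be retired; the dependency graph is invented:

    # A toy measure of legacy lock-in: how many systems break if we
    # decommission one? The graph is invented for illustration.
    dependencies = {          # consumer -> the providers it depends on
        "payroll":   ["hr_system"],
        "reporting": ["hr_system", "erp"],
        "portal":    ["hr_system"],
    }

    def lock_in(system):
        """Count the systems that depend on `system` in some way."""
        return sum(system in providers for providers in dependencies.values())

    print(f"hr_system lock-in: {lock_in('hr_system')} dependent systems")  # 3

Every new feed or API call another team takes against a system raises that count, and with it the eventual cost of retirement.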

One of our prescriptions when proposing a new application is to set up a reclamation budget that includes the total cost of decommissioning the system. The idea comes from the regulation of the mining industry, where mining companies are now required to set aside a reclamation budget to restore the environment of a mine site to something resembling what it was before the mine was operated.
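A minimal sketch of what that means for the approval arithmetic, with illustrative figures: the project is judged against its full lifecycle cost, decommissioning included.

    # Approve against lifecycle cost, not build cost. Figures illustrative.
    build_cost = 18_000_000
    yearly_run_cost = 2_000_000
    expected_life_years = 10
    decommission_cost = 5_000_000    # data migration, unwinding integrations

    lifecycle_cost = (build_cost
                      + yearly_run_cost * expected_life_years
                      + decommission_cost)
    print(f"approve against ${lifecycle_cost:,}, not ${build_cost:,}")  # $43M vs $18M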

So if you survey the wreckage of the systems you have in place and ask yourself, “Where did all these horribly inefficient applications come from?”, understand that they are the product of a selection process that rewards “magical thinking” about hard costs and soft benefits, while ignoring the growing cost of piling externality upon externality.

Fallacy # 7 “We’re not in the information systems business”

Many companies use this line to explain why they should not build (or should not even run) their own systems. This line was used to justify a great deal of the outsourcing of information systems.

But is it even remotely true? Once upon a time, when we were mostly an agrarian economy, you could say that most businesses were only marginally in the information business. Even an agrarian society needs some information about relative demand for products and basic information about the inputs needed to optimally grow and harvest their produce, as well as information about weather and markets.

However, ever since we entered the manufacturing age, we have been information processing companies. Most processes were initially manual, but eventually most were automated, to the point that even the smallest café or bed and breakfast relies on information systems to stay in business.

But the question remains. If you are not primarily in the systems business (that is, if you do not sell information systems, but merely use them), are you really in the information systems business? I would suggest that you very much are.

Even if you do not create software, you use it. The act of using software, especially software that mediates your key processes, puts you in the systems business. If you contemplate changing your business processes, this necessitates changing your information system. If you don’t have the ability to change your own software, you are at the mercy of whoever does.

This is one of the lessons learned by companies that have outsourced their information systems. When you outsource your information systems, you turn over to someone else the exclusive right to change that software, and you will pay a premium for every change.

This idea that you pay a premium for changes to outsourced systems arises from the same forces that lead contractors to charge a premium for change orders on projects. First, once a contract or an outsourcing arrangement is in place, change orders are the primary source of new revenue for the contractor. Second, changes to complex legacy systems are costly and risky; most contractors estimate conservatively, to ensure that most of the requested changes complete for less than the estimated cost. Third, your outsourcer now has a monopoly on your system change business. Your only real option is to not make a change. However, not changing your system is equivalent to not changing your business, and a business that is not continually evolving is one that is under threat.

We understand why people say they are not in the information systems business: mostly, they don’t want to be in the information systems business. Often, too, it’s because they were not very good at building and implementing information systems. However, wishing to leave the systems business is not the same as leaving the systems business.

Think of the small restaurant owner. He or she may not want to be in the advertising business, or the business of recruiting servers, or the business of menu management and pricing, but failing to master any of these is a recipe for failure in the restaurant business.

In the same way, declaring that you are not in the information systems business is a prescription for being in a business that is unable to change gracefully, and ultimately unable to cope.

IDC has come to the same conclusion:

Enterprises are turning away from traditional vendors and toward cloud providers. They’re increasingly leveraging open source. In short, they’re becoming software companies.38

Summary

Stop and think before blindly adhering to clichés. In the chain of software architecture decision-making and approval, I have encountered many people whose thinking goes no deeper than these clichés. Unfortunately, these clichés rule our lives in place of real thinking and real evaluation of alternatives.

We need to think deeply and critically about software implementation, because the problems are deep and systemic. Relying on clichés is the opposite of thinking.
