CHAPTER 1
How bad is it?

In this section, we will put some parameters on the size of the problem. These estimates may seem extreme on first reading, but in reality, what you read here is most likely an understatement.

Waste in the information systems industry

Lean manufacturing and the Toyota Production System have dramatically improved productivity and quality in the physical goods industries. Enterprise IT has barely scratched the surface. This book examines how rampant waste has crept into our industry.

We can learn a lot from the many decades over which Lean manufacturing has systematically examined and removed waste from manufacturing processes. What I like about Lean (and the reason it applies well to the information systems industry) is that practitioners become almost rabid about seeking out, identifying, and eliminating waste. We have so much to learn.

As Taiichi Ohno put it:

Waste is anything other than the minimum amount of equipment, materials, parts, and working time which is absolutely essential to add value to the product or services.

Lean separates waste into three main sources:

  • Mura (unevenness)
  • Muri (overburden)
  • Muda (activities that do not add value)

Muda is what most people think of when they think of Lean. Muda comprises these “seven wastes,” per Taiichi Ohno:

  • Transport
  • Inventory
  • Motion
  • Waiting
  • Over-Processing
  • Overproduction
  • Defects

To this list of the original seven wastes, some people also add the following:

  • Talent
  • Resources
  • By-Products

It is tempting to make a direct analogy between manufacturing and software production and implementation, but the analogy is poor. Manufacturing creates physical goods from raw materials and other resources. However, the creation of a new software artifact from an existing one (via copying) is practically free. When you download the latest version of Linux, the only charge is a bit of network connectivity. In the physical world, even making a bad knock-off of a product consumes resources and entails the possibility of quality mishaps.

This isn’t to say that there isn’t waste in software implementation, but we need to explore the nature of software economics (which we will do in Chapter 2) before we can have a more detailed discussion of waste in information systems.

Later in this section, we will make a macro estimate of the degree of waste possible in the enterprise IT industry. First, though, we’ll examine three important sub-sectors of the industry.

Industries that clean up the waste

There are whole industries specializing in cleaning up some of the more egregious bits of waste. Under this umbrella, three sub-industries are generally recognized:

  • Legacy Software Industry
  • Neo-Legacy Software Industry
  • Legacy Modernization Industry

Legacy software industry

I had the misfortune of being on a project that probably bore the last non-pejorative use of the term “legacy” in connection with a software project. I helped build a five-year strategic information plan for Johns Manville Corporation in 1990. We believed this plan would leave a legacy for the sponsors, and accordingly named it the “Legacy Plan.” Shortly afterward, the monolithic mainframe programs that everyone had and no one could get rid of became known collectively as “legacy systems.” No one advertises themselves as being in the legacy software industry, but many are. HackerRank estimates there are 220 billion lines of COBOL code in production, with 1.5 billion more added every year.4

Two things make a legacy system the albatross that everyone disdains. The first is that it is built in languages or on databases that are no longer supported. The second is that changing it is cost-prohibitive. A surprisingly large percentage of newly implemented systems are still expensive and difficult to change. In this day and age, there is no excuse for this. What we have instead are reputations, habits, and perverse incentives.

Reputation comes into play when customers want to deal with established companies. The software vendor that entered the market first is often the most established, with the largest installed base and the most customers. It also often has the oldest architecture.

Habit comes in on both the buyer and seller side. Software developers and implementers are most comfortable with the tools they have the most experience using. The more senior (and typically decision-making) developers are often most comfortable with the oldest technology, and will revert to their comfort zone.

Vinnie Mirchandani’s excellent book, SAP Nation,5 describes how the SAP ecosystem has become addicted to complexity, and how SAP’s various attempts to rein it in have been mostly rebuffed by the firms that make a living at implementing it.

Neo-legacy software industry

… In so doing, they are creating the legacy systems of the future.

Malcolm Chisholm6

The legacy systems of tomorrow are being written today.

Martin Fowler7

Just as no one claims to be in the legacy software industry, no one claims to be in the neo-legacy industry. The neo-legacy industry is what Martin Fowler was talking about (20 years ago!).

Almost the entire enterprise application software industry is neo-legacy. How to spot a neo-legacy application:

  • High cost of making simple changes
  • Added cost to integrate with other systems
  • Future data migrations are baked in

Most of this book is about understanding how our attitude toward application software is setting the stage for us to create problems faster than we solve them.

Legacy modernization industry

The legacy modernization industry is the only one of the three that will admit to being in a legacy industry. It is relatively small, and is composed of companies that specialize in converting client groups away from their legacy environments. Sometimes, they are converting clients to a neo-legacy environment, but often it is a step in the right direction.

There are legacy-understanding companies, whose role is to help firms understand either what is hidden in their legacy code or what is hidden in their legacy data. Then there are legacy conversion companies. Most find ways to emulate legacy code in modern languages, or legacy data structures in modern databases.

Both are often necessary. The legacy conversion companies can often get a company off a “burning platform” in a short amount of time, buying time for a further conversion. A “burning platform” is one where a parent company or agency has announced it will no longer support a language, database, operating system, or hardware line; where a vendor has gone out of business and the product can no longer be supported; or where it has become impossible to find the skills to continue supporting the product.

The legacy conversion often isn’t the end stage. However, compared to a rewrite or package implementation that might take five to ten years, a conversion that leaves you with the same functionality (often with lower licensing costs) is attractive.

However, the legacy-understanding sub-industry may be more important. We have watched legacy conversions stall out, either due to internal sabotage (the incumbents do not want their monopoly on understanding to be undermined) or sheer conservatism (“we just don’t know what else may break”). These stalls could be avoided if the sponsors of the conversion were armed with complete, definitive information about what was in the legacy system.

The deep secret is that the legacy system has very few important business rules, and a lot of special logic devoted to conditions that are no longer of interest to the sponsoring organization.

A thought experiment on waste

If we wanted to get a handle on the overall level of waste in the transportation industry (and thereby gauge the size of the opportunity that awaits its solution), we might do something like the following (for just the personal terrestrial commute sub-market):

  • Inventory the number of trips consumers make in their cars, buses, and trains each year
  • Add up the total capital costs invested in supporting this (all the cars, trains, roads, and rails)
  • Add up all the operating costs (fuel, tires, maintenance)
  • Calculate the average cost per trip
  • Create a model of what the cost per trip could be (assume, as ride sharing has demonstrated, that private automobiles could go from 5% utilization to 75%+, and that the whole gas-guzzling fleet is swapped for electric vehicles); a sketch of this arithmetic follows
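
To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python. Every number in it is a hypothetical placeholder, not real data; the point is the shape of the calculation, not the inputs.

    # A back-of-the-envelope model of waste in the personal commute market.
    # Every number below is a hypothetical placeholder, not real data.
    trips_per_year = 400e9       # assumed annual trips by car, bus, and train
    capital_costs = 1.2e12       # assumed annualized capital: vehicles, roads, rails
    operating_costs = 0.8e12     # assumed fuel, tires, maintenance

    cost_per_trip = (capital_costs + operating_costs) / trips_per_year

    # Model the future state: ride sharing pushes private-vehicle utilization
    # from ~5% to 75%+, and an electric fleet halves operating costs (assumed).
    future_capital = capital_costs * (0.05 / 0.75)   # roughly 1/15th the fleet
    future_operating = operating_costs * 0.5
    future_cost_per_trip = (future_capital + future_operating) / trips_per_year

    waste_fraction = 1 - future_cost_per_trip / cost_per_trip
    print(f"current ${cost_per_trip:.2f}/trip, possible "
          f"${future_cost_per_trip:.2f}/trip, implied waste {waste_fraction:.0%}")

With these made-up inputs, roughly three-quarters of the spend shows up as waste, which is consistent with the estimate below.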

If you did this exercise, I predict you would conclude that at least half of the $2 trillion we currently spend is waste – or to put it another way, opportunity. This is exactly the sort of disruption currently being ushered in by the ride-sharing and autonomous vehicle industries.

The economics in the software industry are so much more extreme.

Because the software industry has no physicality, and the cost to replicate code or data is close to zero, the potential savings are staggering. If you observe closely, you will see that most of the software industry consists of writing the same application code over and over again, in different contexts, to different data models and in different computer languages.

The analogy with construction falls apart here. Building the same house over and over again makes sense, because not everyone can live in the same house. But we can, in principle, run our companies on a limited amount of code. It’s not as simple as it first sounds, but it is possible.

Business application systems only perform a few dozen different functions. These functions may include displaying a field on a form, validating a field, detecting a particular change in a value, and sending a message. Many people think that sending an email is a different function from sending a text message, or that detecting that an account balance is overdue is different from detecting that an inventory item has reached its reorder point. However, these are not different functions; they are merely parametric differences of the same function being applied.

Building a system is a matter of design (what types of data to store, what processes to employ, how to package functionality into usable interactions, etc.), and implementation is a matter of supplying the parametric settings to each of the functions needed to implement the system.
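
To make this concrete, here is a minimal sketch (in Python, with invented names throughout) of what “the same function with different parameters” might look like. Detecting an overdue balance and detecting a reorder point collapse into one threshold check; email versus text becomes a parameter of one notification function.

    # Sketch of "one function, many parametric uses." All names are
    # illustrative inventions, not any particular product's API.

    def threshold_crossed(value, threshold, direction="below"):
        """Generic change detector: overdue balances, reorder points, and
        credit limits are all parametric uses of this one check."""
        return value < threshold if direction == "below" else value > threshold

    def notify(recipient, message, channel):
        """Generic notification: email vs. text is a parameter, not a
        separate function. A real system would plug in channel adapters."""
        print(f"[{channel}] to {recipient}: {message}")

    # "Inventory item hit its reorder point" ...
    if threshold_crossed(value=12, threshold=20, direction="below"):
        notify("purchasing@example.com", "Reorder widget #4711", "email")

    # ... is the same function as "account balance is overdue."
    if threshold_crossed(value=-250.00, threshold=0, direction="below"):
        notify("+1-555-0100", "Your account is past due", "sms")

Implementation, in this view, is mostly a matter of supplying those parameters.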

How much code does the world need? The logical answer is “just enough.” We need enough code to implement those few functions, plus some infrastructure to hold it all together, as well as some code that adapts all the various bits of hardware linked with these systems. This includes the Internet of Things (IoT), which will need adapters for all the sensors and actuators that can be attached to it.
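
The adapter code is the one place where variety is unavoidable, because every device speaks its own dialect; but each adapter is small, and the rest of the system codes against one shared interface. A sketch of that shape, with hypothetical device names:

    # Sketch of a minimal adapter layer: one shared contract, one small
    # adapter per device family. Device names and readings are invented.
    from abc import ABC, abstractmethod

    class TemperatureSensor(ABC):
        """The single interface the rest of the system codes against."""
        @abstractmethod
        def read_celsius(self) -> float: ...

    class VendorAThermometer(TemperatureSensor):
        def read_celsius(self) -> float:
            raw = 2150                  # pretend device reports centi-degrees
            return raw / 100.0

    class VendorBThermometer(TemperatureSensor):
        def read_celsius(self) -> float:
            raw_f = 70.7                # pretend device reports Fahrenheit
            return (raw_f - 32) * 5 / 9

    # Everything above the adapters is written exactly once.
    for sensor in (VendorAThermometer(), VendorBThermometer()):
        print(f"{sensor.read_celsius():.1f} C")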

But we only need one copy of each, which could be copied and cached where needed for performance reasons.

The entire application software industry could conceivably become a $0 billion industry, with what little remains being professional services: designing and configuring more appropriate systems for each company, and potentially for each employee and citizen.

A large enterprise with over $1 billion in annual IT spend should create a vision for the long-term economics of its information system infrastructure. Once you understand where the dis-economy lies, it is not that hard to conceive of a future free from waste, at a fraction of the current spend.

This will take time.

While it is just a thought experiment, it is already beginning to happen. As we will discuss later in this book, the spread of cloud computing is dropping the cost of hardware and making the cost of licensing software more noticeable. By using open source operating systems such as Linux, and open source databases such as Postgres, firms are escaping the burden of paying licenses per use. Moreover, Linux and Postgres are examples of something that only had to be written once and widely distributed.

The application software industry is currently far from this state. The few open source options for application software packages that exist are pretty bad knockoffs of the current neo-legacy offerings. However, all this is poised to change.

Sometimes, to understand a pathology, it helps to look at some of its most extreme forms. For that, we will look at some examples of projects that have spent 100 to 1,000 times more than even current technology would suggest is necessary, and, taking my earlier thought experiment to its extreme, on the order of 10,000 times more than they potentially could cost.

How to spend a billion dollars on a million-dollar system

Social scientists and economists are dismayed by the difficulty of performing controlled studies. While there is no shortage of hypotheses about human behavior or complex systems, it is often too expensive or unethical to set up and execute true side-by-side controlled studies. Nevertheless, every now and then the world serves up “natural experiments.” A natural experiment occurs when, by coincidence, most of the features of a controlled experiment happen spontaneously.

One of my favorites is the hypothesis that a market-driven, capitalist society provides higher degrees of prosperity than a socialist or communist economic system. Most studies that attempt to demonstrate this are pure correlations. We can correlate GDP with key features of the economic model, but this strategy risks confounding cause and effect: perhaps the more affluent choose market economics, rather than the market economy driving affluence.

What is needed is a controlled experiment. In a controlled experiment for this hypothesis, you would take a population that is as similar as possible. Many people believe that certain ethnic or cultural groups have advantages in professions that benefit from market economies, so ideally you would like a population as ethnically and culturally homogeneous as possible.

Many people believe that economic prosperity comes largely from access to resources, such as land, minerals, or oil. So the experiment would divide the population into two study groups, each given similar natural resources and similar access to trade routes, ports, or navigable waters.

There are obvious differences between individual people, and many economic effects arise from complex systems requiring the interplay of thousands of organizations and individuals. Individuals, groups, and organizations need to be able to freely enter and exit various roles in the economy. For an experiment to be truly valid, it would need to involve millions of people. This is one of the reasons that experiments of this type are so rarely performed.

Finally, these differences don’t manifest overnight. It would be preferable to run the experiment for decades, or better yet, for generations. The barriers to conducting such an experiment are immense. Nevertheless, every now and then, circumstances serve up all the prerequisites.

After World War II, the Korean Peninsula (which had been occupied by Japan during the war) was temporarily partitioned. The north was aligned with Communist China and Russia, and the south with the West, especially the United States. From 1950 to 1953, the forces of West and East clashed on the Korean Peninsula. At various times in the war, each side seemed close to complete victory, but in the end, after massive losses, an armistice was signed and the country was partitioned along the 38th parallel.

At the time, North Korea and South Korea had ethnically and culturally homogeneous populations with similar work ethics. The two landmasses were similar in size and had similar access to natural resources and harbors.

The primary difference between the “control group” in the north and the “experimental group” in the south was that the south adopted the Western model of democracy, market economy, and capitalism. The experiment has now run for over 60 years, and it underscores the need for patience when experimenting on an economy: for the first twenty years, the differences were relatively small. However, small differences compounded over time result in extreme differences.

We could use statistics to make the point (and the statistics are compelling8), but the difference is so stark that it can be seen from space (courtesy of NASA). In the following satellite picture, taken at night, Japan is the crescent shape on the right, with Tokyo the bright spot on the right edge. The lights in the upper left corner are part of Liaoning Province in China. The bright light on the left and the lights around it are Seoul and the rest of South Korea. If you didn’t know otherwise, you might think the dark area between South Korea and Liaoning Province was water, with just a few pinpricks of light that might be fishing trawlers.

But not so. That region (20% larger than South Korea) is the nearly lightless North Korea.

I bring this up to suggest that sometimes the world serves up natural experiments so much more compelling than reams of correlations or pontifications that we should recognize them when we see them and take the learnings to heart.

A tale of two projects

If someone has a $100 million project, the last thing that would occur to them would be to launch a second project in parallel, using different methods, to see which works better. That would seem insane, almost asking for the price to be doubled. Besides, most project sponsors believe they already know the best way to run such a project.

However, setting up and running such a competition would establish once and for all which processes work best for large-scale application implementations. There would be some logistical issues, to be sure, but it would be well worth it. To the best of my knowledge, though, this hasn’t happened.

Thankfully, the next best thing has happened: we have recently encountered a “natural experiment” in the world of enterprise application development and deployment. We are going to mine this natural experiment for as much as we can.

President Barack Obama signed the Affordable Care Act into law on March 23, 2010. The project to build Healthcare.gov was awarded to CGI Federal, a division of the Canadian company CGI, for $93.7 million. I’m always amused at the spurious precision the extra $0.7 million implies. It sort of signals that somebody knows exactly how much the project is going to cost, when it is really just the end product of some byzantine negotiating process. The site was slated to go live in October 2013. (I was blissfully unaware of this for the entire three years the project was in development.)

One day in October 2013, one of my developers came into my office and told me he had just heard of an application system comprising over 500,000,000 lines of code. He couldn’t fathom what you would need 500,000,000 lines of code to do. He had been working for us for several years since graduating from college, and had written a few thousand lines of elegant architectural code. We were running major parts of our company on those few thousand lines, so he was understandably puzzled at what this system could be.

We sat down at my monitor, and I said, “Let’s see if we can work out what they are doing.”

This was the original, much-maligned rollout of Healthcare.gov. We were among the few that first week who managed to log in and try our luck (99% of the people who tried to access Healthcare.gov in its first two weeks were unable to complete a session).

As each screen came up, I’d ask, “What do you think this screen is doing behind the scenes?” We would postulate, guess a bit as to what else it might be doing, and jot down notes on the effort needed to recreate it. For instance, on the screen where we entered our fake address (our first run was aborted when we entered a Colorado address, as Colorado was running a state exchange), we asked, “What would it take to write address validation software?” This was easy, as he had just built an address validation routine for our software.
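
For a sense of scale, a serviceable first cut at address validation really is tiny. The following sketch (invented rules, an abbreviated state table, and no real postal database) suggests the scale of such a routine:

    # A toy US address validator, to illustrate scale rather than to be
    # production quality. A real routine would check against postal data.
    import re

    US_STATES = {"CA", "CO", "NY", "TX", "WA"}   # abbreviated; list all in practice
    ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")

    def validate_address(street, city, state, zip_code):
        """Return a list of problems; an empty list means the address
        passes these deliberately simple checks."""
        problems = []
        if not re.match(r"^\d+\s+\S+", street.strip()):
            problems.append("street should start with a number")
        if not city.strip():
            problems.append("city is required")
        if state.upper() not in US_STATES:
            problems.append(f"unknown state code: {state!r}")
        if not ZIP_RE.match(zip_code.strip()):
            problems.append("ZIP must look like 12345 or 12345-6789")
        return problems

    print(validate_address("123 Main St", "Denver", "CO", "80202"))  # []
    print(validate_address("Main St", "", "ZZ", "8020"))             # four problems

A few dozen lines like these, not millions, are what the screen in front of us implied.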

After we completed the very torturous process, we compiled our list of how much code would be needed to recreate something similar. We settled on perhaps tens of thousands of lines of code (if we were especially verbose). But nothing in the functionality of the system gave any evidence of a need for 500,000,000 lines of code.

Meanwhile, news was leaking that the original $93 million project had now ballooned to $500 million.

In the following month, I had a chance encounter with the CEO of Top Coder, a firm that organizes the equivalent of X Prizes for difficult computer programming challenges. We discussed Healthcare.gov. My contention was that this was not the half-billion-dollar project it had already become, but was likely closer to the coding challenges that Top Coder specialized in. We agreed that it would make for a good Top Coder project and began looking for a sponsor.

Life imitates art, and shortly after this exchange, we came across HealthSherpa.com. The HealthSherpa user experience was a joy compared to Healthcare.gov. But I was more interested in the small team that had rebuilt the equivalent for a fraction (a tiny fraction) of the cost.

From what I could tell from a few published papers, a small team of three or four people, in two to three months, had built functionality equivalent to what hundreds of professionals had spent years laboring over. It wasn’t exactly equivalent: it was much better in some ways, and fell a bit short in a few others.

In the ensuing years, I’ve used this as a case study of what is possible in the world of enterprise (or larger) applications, and over those four years I’ve been tracking both sides of this natural experiment from afar.

I looked on in horror as the train wreck of the early rollout of Healthcare.gov ballooned from half a billion dollars to $1 billion (many firms have declared victory in “fixing” the failed install for a mere incremental half billion), and more recently to $2.1 billion. By the 2015 enrollment period, Healthcare.gov had adopted the HealthSherpa user experience, which they now call “Marketplace Lite.” Meanwhile, HealthSherpa persists, having enrolled over 800,000 members, and at times handles 5% of the traffic for the ACA.

The writing of this book prompted me to dig deeper, in order to crisp up this natural experiment playing out in front of us. I interviewed George Kalogeropoulos, CEO of HealthSherpa, several times in 2017, and have reviewed all the available public documentation for Healthcare.gov and HealthSherpa.

The natural experiment that has played out here concerns the hypothesis that there are application development and deployment processes that can change resource consumption and costs by a factor of 1,000. As with the Korean Peninsula, you can nominate either side as the control group. In the Korea example, we could say that communism was the control group and market democracy the experiment; the hypothesis would be that the experiment leads to increased prosperity. Alternatively, you could pose it the other way around: market democracy is the control, and dictatorial communism is the experiment that leads to reduced prosperity.

If we say that spending a billion dollars for a simple system is the norm (which it often is these days), then that becomes the control group, and agile development becomes the experiment. The hypothesis is that adopting agile principles can improve productivity by several orders of magnitude. In many settings, the agile solution is not the total solution, but in this one (as we will see), it was sufficient.

This is not an isolated example – it is just one of the best side-by-side comparisons. What follows is more evidence that application development and implementation are far from waste-free.

Canadian firearms

A program to register firearms in Canada was originally projected to have a net cost of $2 million. The actual write-up was that it would cost $119 million to develop, offset by $117 million in net new revenue, a couple of curiously precise estimates.

The real cost was $2 billion.9 They have registered 5.6 million firearms. This is a textbook example of “this was way harder than we thought.” Except that there is no reason it should have been. There is nothing inherently complex about registering guns. You have to work at it to make it complex enough to spend $2 billion of someone else’s money.

Sporting goods manufacturer

There is a $2 billion sporting goods manufacturer with two main product lines, and thousands of products within each line. In 2007, they embarked on a project to replace their aging systems with a state-of-the-art ERP system. Enterprise Resource Planning systems, as we’ll discuss in Chapters 2 and 3, are integrated packaged software systems meant to cover a large percentage of a firm’s functionality. The ERP system chosen was one of the top three mid-tier systems, and they had the best help to implement it.

They went live in 2015. Total project costs over the eight-year period are estimated at approximately $50 million. This was considered a success, but the point I’d like to make in this book is that most of this cost and time was driven by the tools and approaches that are accepted as best practice now. In the companion book, we describe a future state where implementing a system such as this would be a $1 to $2 million project, taking about a year, with much less risk.

Insurance conglomerate

We have reviewed the 10-Ks of an insurance conglomerate and confirmed the story we were told.

A certain company (which shall remain nameless, as they have mostly managed to avoid publicity) decided to employ a major systems integrator to help them “integrate” the many companies they had acquired. Seemed like a good idea.

Somehow the project, rather than seeking early small wins, went for the big integrated solution. Bummer. After $250 million invested, nothing was implemented, and the subsidiaries are as unintegrated as they ever were.

Major bank

We have become privy to the story of a major bank that opted for a project that was going to implement all their systems in a “big bang” integrated project implementation.

Incredibly, they spent $1 billion before cancelling the project (and before implementing anything). It is hard to figure out which is more incredible: 1) that a publicly held company could spend a billion dollars on a failed experiment and keep it out of the news, or 2) that it didn’t occur to anyone to do something incremental along the way.

DIMHRS

In 2006, the Department of Defense launched a project to consolidate its many human resource systems into one. They chose to base the new system on packaged software (see Chapter 5 for why this is a mistake).

They cancelled the project after spending a billion dollars and realizing they had virtually nothing implemented.

Cancer research institute

There is a cancer research institute that got involved in a major initiative that cost them $63 million before they pulled the plug with nothing to show for it.

Summary

One of the lessons here is this: organize your projects such that every few million dollars or so, you could cancel the project and still have something to show for it. A $60 million project should have thrown off many interim implementations, each worth more than the investment to date.

You may scoff that a few bad systems implementation projects are to be expected, given the number of projects being executed every day.

Yes, these are the extreme cases. But these wouldn’t have occurred if the norm were to build these systems using agile methods and composition from reusable parts. In order to get these extreme cases, the norm has to be in a very abnormal place.

As of this writing, the norm for “large” system implementation projects is:

  • Initial budget: $100 million
  • Initial time frame: 3-5 years
  • Cancellation rate: 50%
  • “Successful” projects: typically 50%-100% over budget and years behind schedule

If that is the norm, then yes, a few will run over by 1000%, just as a few bad apples will spoil the bunch.

The point of this book is that that should not be the norm. All those apples are bad.

By the way, if you make your living selling or implementing these bloated systems, you should probably just throw this book down now. It will only make you angry.

“It is difficult to get a man to understand something when his salary depends on his not understanding it.”10

Upton Sinclair

If you are on the buying side of the equation, this book could change the way you think of systems.
