Chapter 3
Stabilize Your System

New software emerges like a new college graduate: full of optimistic vigor, suddenly facing the harsh realities of the world outside the lab. Things happen in the real world that just do not happen in the lab—usually bad things. In the lab, all the tests are contrived by people who know what answer they expect to get. The challenges your software encounters in the real world don’t have such neat answers.

Enterprise software must be cynical. Cynical software expects bad things to happen and is never surprised when they do. Cynical software doesn’t even trust itself, so it puts up internal barriers to protect itself from failures. It refuses to get too intimate with other systems, because it could get hurt.

The airline’s Core Facilities project discussed in Chapter 2, Case Study: The Exception That Grounded an Airline, was not cynical enough. As so often happens, the team got caught up in the excitement of new technology and advanced architecture. It had lots of great things to say about leverage and synergy. Dazzled by the dollar signs, it didn’t see the stop sign and took a turn for the worse.

Poor stability carries significant real costs. The obvious cost is lost revenue. The retailer from Chapter 1, Living in Production, loses $1,000,000 per hour of downtime, and that’s during the off-season. Trading systems can lose that much in a single missed transaction!

Industry studies show that it costs up to $150 for an online retailer to acquire a customer. Suppose the site sees 5,000 unique visitors per hour and 10 percent of them, turned away by an outage, walk away for good. That is 500 lost customers, or $75,000 in wasted marketing for every hour of downtime.[2]
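The arithmetic behind that figure can be written out as a minimal sketch. The function and parameter names are hypothetical, chosen only for illustration; the inputs are the numbers quoted above, and treating the result as a per-hour loss is an assumption drawn from the per-hour visitor count.

```python
# Hypothetical helper: names are illustrative, not taken from any real system.
def wasted_acquisition_spend(visitors_per_hour, walk_away_percent, acquisition_cost):
    """Marketing dollars lost per hour on visitors who leave for good."""
    # Integer arithmetic keeps the result exact for whole-number percentages.
    lost_customers = visitors_per_hour * walk_away_percent // 100
    return lost_customers * acquisition_cost


# Figures from the text: 5,000 visitors/hour, 10 percent lost, $150 per customer.
hourly_loss = wasted_acquisition_spend(5_000, 10, 150)
print(f"${hourly_loss:,} in wasted marketing per hour")
```

This counts only acquisition spend; lost revenue, such as the retailer's $1,000,000 per hour of downtime, comes on top of it.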

Less tangible, but just as painful, is lost reputation. Tarnish to the brand might be less immediately obvious than lost customers, but try having your holiday-season operational problems reported in Bloomberg Businessweek. Millions of dollars in image advertising—touting online customer service—can be undone in a few hours by a batch of bad hard drives.

Good stability does not necessarily cost a lot. When building the architecture, design, and even low-level implementation of a system, many decision points have high leverage over the system’s ultimate stability. Confronted with these leverage points, two paths might both satisfy the functional requirements (that is, both would pass QA). One will lead to hours of downtime every year, while the other will not. The amazing thing is that the highly stable design usually costs the same to implement as the unstable one.
