Chapter 4. The Rise of the Security Breach

In 1854, London was in the grip of a cholera epidemic. Disease spreads faster and affects more people in the hothouse of a city, and an understanding of the causes of cholera was hard to come by. Physician John Snow had published a theory that cholera was waterborne, but that was only one theory among many about the causes of the disease. In the epidemic of 1854, Snow plotted cholera deaths on a map and determined that one public water pump was the source of the outbreak. That work is considered a pioneering event in the science of epidemiology, and it led to the creation of public health as a discipline.

We have argued that objective data that would enable good security decisions is in short supply. John Snow had the simple remedy of removing a handle from the pump of an infected well. We may not have such a straightforward solution, but better and more data will help. Ideally, we will get it through less traumatic events than an epidemic, but in information security, there may be an invisible epidemic of mistakes being made. As we said in the preceding chapter, failures of information security are not usually widely discussed.

People dislike talking about security mistakes. They fear the consequences. Mistakes are seen as embarrassing failures and a source of potential liability. Although much can be learned from information sharing, it is hard to break out of the pack and be the first to discuss what is happening to you. Opportunities for errors abound, as ChoicePoint, a company based in the Atlanta suburbs, learned in February 2005.

Nigerian con man Olatunji Oluwatosin opened several accounts with ChoicePoint in October 2004. He claimed to be a variety of financial companies that required access to the data ChoicePoint held on consumers. Over the following months, Oluwatosin accessed the records of more than 163,000 individuals, choosing some 800 of them to fraudulently impersonate and subject to identity theft.

ChoicePoint was one of the first companies to fall under the scope of a new California law that requires organizations that lose control of people’s personal data to notify the persons affected. That law, California Senate Bill 1386, is usually referred to simply as SB1386. Just two years later, it is now one of more than thirty-five state laws of its type. As required by SB1386, ChoicePoint notified 30,000 or so Californians of the security breach. MSNBC reporter Bob Sullivan broke the story. Shortly afterward, thirty-eight state attorneys general wrote to ChoicePoint, asking that their citizens be notified if they were also affected by the incident. Information about the breach from ChoicePoint was fragmented, contradicted by statements from law enforcement, and played itself out over a long background of conflict between ChoicePoint and privacy advocates.

It’s ironic that ChoicePoint, a company that provides background checks and so-called “identity verification services,” fell victim to such a fraud. That irony contributed to how the issue unfolded, capturing the attention of reporters, analysts, and bloggers. Data breaches became a major subject of stories and analysis. The Privacy Rights Clearinghouse began to maintain a running list of data breaches. One researcher and blogger realized that reports to New York State were subject to Freedom of Information Act requests, so he requested those reports and placed them in the public domain. Those people and groups acted because the data seemed interesting and worth capturing. For them it was a natural response. Those efforts made possible the creation of academic papers in law, economics, and security.

Back to ChoicePoint. It did not want to talk about what happened. Like everyone who has ever suffered from a security breach, ChoicePoint was concerned about the many costs, from the possibility of lost customers to bad public relations. But California’s new law compelled ChoicePoint to disclose its breach, so it did. The new laws have shown that all sorts of groups have experienced breaches. As we write, over 700 organizations ranging from schools to governments, from commercial pygmies to giant multinationals, have reported errors. Reports have come not only from the U.S., but also from Canada, the U.K., Bermuda, and Japan. These breach notices offer the first public equivalent of the mortality notices studied by John Snow. Over the next few years, we hope to get more data, both by getting more data per breach and by getting data about more breaches. We hope that breach data will be reported to central resources that will share it, and that this will put the focus on data analysis and alleviate the effort involved in collecting and collating it.

People don’t like to talk about any sort of mistake, but in a wide swath of industries they must. From small pills to large airliners, failures are discussed. In aviation, every error is announced and dissected. It is no coincidence that flying is among the safest ways to travel: injury and fatality rates are exceptionally low by mile traveled, by passenger trip, and by total deaths per year. This tight feedback loop is understood to benefit not just passengers, but also the airline industry, which can promote flying as a safe way to travel.

The shame of having made mistakes will recede as more and more organizations admit to them. We predict that the cost of notifying people will drop, and that the effectiveness of remedies offered will improve. New types of solutions more effective than credit monitoring will emerge. The disincentives for companies to disclose will decline, and by the time this book is published, the thunderous headlines will likely have gone away, replaced by a steady rumbling of reports inside the business or local interest sections of the newspaper. The public face of data breaches will diminish (except in extraordinary circumstances), but the data will increasingly become available.

Despite the desire to avoid scrutiny, liability, and other feared effects of breaches, they offer the best opportunity to gather and share objective data about failures in information security today. They also provide the best chance to transform cultural attitudes about discussing security issues. Breach data has less bias than many other possible sources of information about computer security. This is because the mandatory nature of reporting in many circumstances means that we hear about issues that would otherwise be suppressed. Breach data stands in stark contrast with the one-off nature of journalistic reports about security incidents. The public nature of breach data also means that anyone can pick it up and analyze it. More people will take note of the availability of breach data and study it. What follows is a relatively short analysis. As more data becomes available, more detailed analyses will be performed.

How Do Companies Lose Data?

Companies have lost data to fraudsters and crackers, they have lost it by throwing it away accidentally, and they have lost (or had stolen) backup tapes and laptops. We can broadly categorize such failures into two buckets: deliberate and accidental. Examples of deliberate failures are a con man signing up for an account—as with the ChoicePoint incident—and an intruder breaking into a company’s wireless network. Accidental failures are events such as lost laptops or backup tapes.

When the breach data that is available today is categorized according to the number of records exposed, the majority of incidents can be seen to relate to lost or stolen equipment or media. This is perhaps not so surprising. An adage in the networking realm says to never underestimate the bandwidth of a truck full of tapes. A list of the names and social security numbers of every American takes up less than 10 gigabytes. This is larger than most of today’s thumb drives, but easily within the capacity of a laptop, an iPod, or a couple of DVDs. Most breaches don’t involve data that is quite so optimized, so the data may occupy more space, but the largest breaches have been because of lost backup tapes, DVDs, or laptops. There is some irony that auditors carried many of the DVDs and laptops lost in these incidents.
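The back-of-envelope claim above can be checked with a few lines of arithmetic. This sketch assumes a U.S. population of roughly 300 million and about 32 bytes per name-plus-SSN record; both figures are our illustrative assumptions, not data from any breach repository:

```python
# Rough estimate of the space needed to hold a name and Social Security
# number for every American. Population and per-record size are assumed
# round numbers, chosen only to show the order of magnitude.
population = 300_000_000        # approximate U.S. population
bytes_per_record = 32           # ~20-25 bytes of name plus a 9-digit SSN

total_bytes = population * bytes_per_record
total_gb = total_bytes / 10**9  # decimal gigabytes

print(f"{total_gb:.1f} GB")     # under the 10 GB figure cited above
```

Even with generous padding per record, the total stays in the single-digit-gigabyte range, which is why a single lost laptop or DVD spindle can plausibly carry an entire national data set.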

Only between 1% and 2% of total records were lost or stolen as a result of being exposed online. Only 0.5% of records were compromised due to so-called “insider” incidents. This would seem to contradict the conventional wisdom about the perceived magnitude of “the insider threat.” Either insiders are responsible for relatively few incidents, or they’re detected at a far lower rate than other events.

It is also possible to examine trends that relate to organizations that have suffered a security breach. (Here we can only talk about the experiences of U.S.-based organizations, because the U.S. is the only country in which this sort of research has been done.) According to a Harris Research poll released in November 2006, just over one American in five has received at least one notification that his or her private information has been compromised. According to another survey, 62% of Americans had been notified as of mid-2007. Of these, just under half reported that the notification was from a government agency, and just under a third reported that the notification came from a financial company. Less than one in eight reported some other commercial concern, and roughly one in twenty received a notification from an educational institution or health-care facility. These numbers make a fair amount of sense, even though we are highly skeptical of poll data. By its own mandate, the government keeps vast collections of personal details about every citizen. Citizens can scarcely choose not to interact with the government, so the motivation to minimize the collection of data might be smaller. The motivation to protect government data may be slightly higher in the United States, because an inspector general rates each American government agency on its security activities in accordance with the Federal Information Security Management Act (FISMA). Other countries may have similar laws or different civil service traditions. Financial companies collect a great deal of data, with much of that data collection mandated by government. The finance sector has also become highly interconnected, so a breach at a single financial organization might affect more people than a breach at an average hospital, school, or college.

The breaches with the highest potential for headlines and perhaps harm will continue to come from organizations with the most data. It seems likely, however, that the amount of media attention will be proportional to the size of the company, not the size of the breach. Joe’s Bait and Tackle Shop losing twenty credit card numbers makes for a poor story, but ChoicePoint losing twenty credit card numbers would be newsworthy because of the slew of earlier stories to refer to. Coca-Cola losing twenty credit card numbers would be noteworthy, because no words need to be wasted explaining who Coke is. Besides, the headlines (“I’d like to buy the world a Coke—with your credit card number”) would write themselves. A final factor in what will be reported on heavily is instances in which an organization broke the rules in its collection of data. This was a factor in the stories about the board of Hewlett-Packard spying on reporters, as well as the publicity around the CardSystems Solutions breach (discussed in Chapter 2). At the time of the story, CardSystems was among the largest breaches reported, with up to 40 million credit cards exposed. The press also fed on the fact that CardSystems broke the rules set by Visa and MasterCard in storing data it should never have stored.

Companies have begun to understand that storing personal data is risky. A rational response is to try to reduce the amount of data they collect and to understand where data is stored. A new category of products, branded as “data loss prevention” or “extrusion prevention,” monitors corporate communications such as email and file transfers. These products block transmissions when it appears that restricted company data is involved. Encryption software is selling briskly to protect data on laptops from being revealed should a laptop be lost or stolen. In some cases there is a blending of motivations, between wanting to protect data and wanting to avoid mandatory breach disclosure. Some ambiguities in existing breach law make it unclear whether the loss of encrypted records must be disclosed.

As science-fiction writer William Gibson said, “The future is already here; it’s just unevenly distributed.” Breach reporting is also here, just unevenly distributed. Organizations around the world are reporting on failures, but they are doing so unevenly because of legal loopholes and scattered breach laws of varying strengths. But breaches will, for the near term, continue to impact organizations of all sizes and kinds, offering us windows into various worlds. When compared to the potential sources of evidence described in the preceding chapter, breach data is broader and less biased. Breach data that is freely available can be “sliced” by industry, by loss type, by data type, or whatever other measure seems interesting in support of making better security decisions. Naming names enables follow-up research, unlike research based on anonymous data. Examples include studies by Carnegie Mellon researchers on the impact of breach notices on stock prices, and the research we’ve done on mentions of breaches in Securities and Exchange Commission filings.
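A minimal sketch of the kind of “slicing” described above, using a small hand-built sample of breach records. The fields, industries, and record counts here are illustrative assumptions, not data drawn from any real repository; in practice, records like these would be parsed from public sources such as attrition.org/dataloss:

```python
from collections import Counter

# Hypothetical breach records, for illustration only.
breaches = [
    {"industry": "finance",    "loss_type": "stolen laptop",    "records": 120_000},
    {"industry": "education",  "loss_type": "online exposure",  "records": 4_500},
    {"industry": "finance",    "loss_type": "lost backup tape", "records": 2_000_000},
    {"industry": "government", "loss_type": "stolen laptop",    "records": 26_500_000},
]

# Slice 1: count incidents per industry.
incidents_by_industry = Counter(b["industry"] for b in breaches)

# Slice 2: sum records exposed per loss type.
records_by_loss = Counter()
for b in breaches:
    records_by_loss[b["loss_type"]] += b["records"]

print(incidents_by_industry)
print(records_by_loss)
```

The same few lines generalize to any field in the data (data type, year, disclosure state), which is the point of the argument: once breach reports are public and machine-readable, anyone can cut the data along whatever axis supports the decision at hand.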

Disclose Breaches

Having said all this, an organization that experiences a breach will naturally have questions: Should we disclose? Can’t we just cover this up and hope it goes away? The short answer is no. Covering up is a bad short-term strategy and a risky long-term one. As we’ve discussed, there are also broader implications for society.

The simplest argument for disclosing a breach is that in many cases, it is legally required. We’re not lawyers, but if you have customers in any of thirty-eight states or in Japan, disclosure is probably required. It may be possible to spend a good deal of money on legal advice and notify customers in only certain states. Given the increased interest in disclosed breaches, it’s possible that the media will start to notice partial notifications, much like what happened to ChoicePoint. Not an experience anyone wants to repeat.

Even in some places where disclosure is not explicitly required by law, authorities have publicly stated that they interpret their data-protection law to include breach notification. (Canada, Australia, and New Zealand fall into this category.) In some places, there might be no duty to notify, but there is an emerging duty of care. Laws, rulings from privacy commissioners, and the argument around duty of care are rapidly moving most of the English-speaking world toward greater disclosure.

There is a recurring theme in news stories about data breaches in which the organizations involved did not quickly disclose the incident. Customers and the general public affected by breaches are shocked that they were not informed sooner. A strong tone of outrage surrounds the idea that consumers and citizens who could be hurt should be notified in a timely fashion. Not only should organizations disclose, but they also should disclose sooner rather than later as a deliberate strategy. When faced with a breach, an organization should aim to get ahead of the news by notifying customers at once with the fullest details available. While lawyers may be able to argue for partial notice or no notice, such analysis would be more expensive and riskier than disclosure: the risk is that news of the breach comes out later anyway, magnifying the outrage.

There are other possible reasons to embrace breach disclosure. Organizations might promise to disclose as a signal that they are confident in the quality of their security. Companies that commit to disclose all breaches to their customers are making a statement about the investments they have made in security. An organization that lacks confidence in its security would not make such a commitment. This creates a new opportunity for companies to compete on security or privacy.

Many businesses will need to change their external reporting activity in response to the legal requirements for breach reporting. These changes might initially be uncomfortable, and it might be tempting to respond by trying to roll back breach notification mandates. Not notifying customers, or lobbying against mandatory breach disclosure, would show those companies to be scornful of their customers. That sort of hostility seems a far better way to lose customers than telling them there was a mistake. (This is a hypothesis, and one that could be tested.)

Most companies have made a commitment to treat the data they collect with care. Organizations in sectors such as finance, government, and health may even be under strict legal obligation to do so. A company that promises to protect the information provided to it, fails to do so, and then fails to report that fact, may open itself to a charge of deceptive business practices. What constitutes a deceptive business practice is for lawyers to argue. Working to cover up a mistake runs counter to transparent and honest business practice. It was not the crime, but rather the cover-up, that forced Nixon to resign. Attempts to cover up mistakes are likely to result in harsh penalties in the future. In the event of a lawsuit, data is likely to come out as a result of the legal discovery process in any case. So, yes, mistakes happen, and mature organizations own up to them.

It may be possible to temporarily roll back mandatory disclosure, but this is an inefficient, short-term risk trade-off. If the way we approach security today is inefficient because we lack the data to help us improve, obtaining data is worth a great deal. It is true that the costs of breach disclosure are currently distributed inefficiently. Disclosure offers no immediate payback. In an ideal world, the benefits would be better aligned with the costs, but years of security failures have passed without our learning much from them. Breach data offers our best chance to overcome this. The future consists not only of mandatory disclosure, but of companies, governments, universities, and trade organizations all studying the reports, learning from them, and refining their processes.

Possible Criticisms of Breach Data

Breach data came to us as a happy accident of SB1386 and the ChoicePoint incident. However, the breach data that is available today is not ideal for our full set of needs. We will discuss these limitations; then we’ll address the common criticisms.

The type of breach data that is available today represents only a subset of what we’d like to know. Some security experts note that the number of incidents reported seems low. U.S. government reports, driven by Congressional investigation, have revealed incidents that were not publicly known. Freedom of Information Act requests routinely reveal new breaches that have been missed by researchers scouring the news wires. These newly revealed incidents are both mistakes by government agencies and mistakes reported to them. Other researchers have compared laptop loss figures to reports of lost data and commented that the amount of sensitive personal data on laptops should have led to a lot more reports. Ultimately, we’d like to know about all computer crime, much like the (American) National Crime Victimization Survey or the FBI crime statistics provide insight into all conventional crimes that are reported. Breach notices only provide us with data about privacy breaches, and we don’t yet know what proportion of security breaches involve personal data. What we do know is that in each of these cases someone tried to protect the data and failed. We also know that other incidents are interesting but are not breaches. For example, denial of service attacks that lead to failures of availability seem to fall outside the meaning of “breach.”

Mandatory breach notification laws are an expensive route to the data we’d like. Sending notices to customers affected by a breach costs money, as do other aspects of response, such as staffing a phone bank and perhaps providing credit monitoring or locking services. Much of the published analysis focuses on financial costs such as expected revenue loss from lost customers. It appears, however, that these costs are one-time events and that the stock market has already learned to discount them. Data breaches do not seem to have much of an effect on stock market valuations.

Some people believe that admitting to a security breach will drive away customers. There is research showing that in most breaches, no more than a small percentage of customers will leave. This research has been published by companies that sell services for responding to breach events, so its numbers should be read with some care. As people come to expect that mistakes happen, they will be less likely to withdraw their business because of them. This leads us to believe that the churn rate for customers who leave in the event of a breach is probably far smaller today than early surveys of consumers suggested. In the aftermath of one of the largest breaches to date (at the TJX companies, discussed in Chapter 1), sales actually went up in the two quarters following the announcements. Over time, we expect that only repeated mistakes and those that are attributed to carelessness might have an impact on share price and customer churn.

The claim is sometimes made that data that is lost by companies won’t hurt anyone, so breach notification serves no purpose. However, over half of identity fraud victims have no idea how a criminal got hold of their personal data. (For the half who believe they know, we have no data about how accurate their beliefs are.) That gap in our knowledge can be closed only by analyzing breach data and its correlation to cases of identity fraud. The argument that data that is lost won’t hurt consumers is not a valid reason for not disclosing breaches. Logically, if there is no chance of harm, we should be able to discuss each incident openly.

A similar (but contradictory) argument is that because customers have no effective means to protect themselves from identity theft and the other ill effects of the disclosure of personal information, mandatory breach disclosure does not serve the best interests of the general public. This may be somewhat true today, but new businesses are emerging that may help address the problem. If today’s technologies cannot or do not help, new businesses with new approaches will emerge. Businesses that help customers “lock” their credit files against account fraud have already appeared. We expect to see a raft of new businesses arise to help companies report breaches, analyze breach data, and protect consumers against the broad set of threats associated with the disclosure of personal information. To the extent that they’re effective, we welcome them.

Because of differences in breach disclosure law, the structure and content of breach reports are not standardized. Different data points, levels of detail, and terminology are used to describe the same things. This is somewhat of a challenge, but by no means an insurmountable one. John Snow faced similar challenges. Cholera is a vicious disease, and absent intravenous hydration, it kills young and old alike. The fear of cholera drove people away, and anyone who had a place to flee to during the epidemic did. This meant that sometimes deaths caused by cholera happened somewhere else, so a complete data set was not available. Deaths were a tragic and expensive route to the data. Just about any other data point would have been preferable, but those might not have led to enough information to test Snow’s theory. The breach data that is available today is sufficiently detailed to allow us to draw many conclusions. As it becomes more standardized, detailed, and comprehensive, we will be able to draw even more conclusions and do so more efficiently.

Some might say that we don’t need detailed or comprehensive breach data because many stories in the popular press discuss high-profile breaches, and we can draw conclusions from those accounts. Again, answering this question has parallels to Snow’s work. He immersed himself in his subject at great personal risk. He did not simply read newspaper reports and graph data. He went into the infected areas and talked to people about what was happening. He got up close and personal with a deadly disease that no one understood. What enabled him to select the information that mattered? Some of it was his orientation as a physician, which led to his theory about the disease. Some of it was the depth of his immersion in the question. The key lesson is that analyzing breach data is far different from reviewing copy written by reporters. Breach data is both richer and broader. Today we get our data from web sites such as attrition.org/dataloss and pogowasright.org.

An analysis of the stories in the press shows that they don’t line up with those in the state repositories of breach notices. So, what is in the press is not an accurate summary of the underlying facts. The lessons present within breach data exceed what we can learn from headlines alone. The public availability of breach data also means that our analysis is not limited by the people before us who have sifted, selected, and interpreted the information. There is a strong similarity to historians’ preference for original source material. While an early historian may have done a great job of sifting through the documents to put together a narrative, another will do better by returning to the source, seeing the documents themselves, and deciding if other interpretations exist.

The final potential criticism we will note is that breach notices are believed to be career-limiting for the executives at fault, and that, therefore, there is a strong incentive not to report. Again, this belief is widespread, but by and large it is false. (If management is at fault, however, it may be both true and appropriate.) More often, what happens is a mistake, and it is seen as a mistake rather than a failure of judgment. As we learn more, what can be described as a mistake versus what can be considered a failure of judgment will become clearer. In the absence of objective data such as breach data, a scattershot security strategy has been the default. So breach data does pose a risk to decision-makers. As breach data begins to illuminate the nature of mistakes that are being made, executives will open themselves up to criticism if they do not use breach data to guide their spending on security in the most appropriate manner. There will be security leaders who oppose these new rules—not just because they fear the effects of mistakes, but because they haven’t yet joined the New School. They have not yet learned to focus on observation and objective measurement. Those who have know that nothing built by human hands is perfect. They have the maturity that comes from having faced failure and learned from it.

Moving from Art to Science

Breach reports tell us who lost control of data, how it was lost, and how much data was lost per incident. These data points allow us to consider with broad strokes the issues that lead to public reporting. Ideally, we want to be able to evaluate security strategies in detail and understand how various approaches might help prevent incidents. This would also help us identify what methods unequivocally do not help so that we can abandon them. Do people who attend one security training class do better than those who attend another? Is one particular type of security product superior to another? There are many other questions that we cannot answer without evidence.

We don’t want to minimize the difficulties involved in answering such questions. We can’t arrange a set of companies in test tubes, add heat, and see what comes out. In that respect, our data sources are more like those of astrophysicists or sociologists than those that a chemist or physicist might create by careful design. But this doesn’t mean we can’t learn from observation. Careful observation and analysis have led to our understanding quite a bit about how the universe is shaped, where it comes from, and what is happening far away. Some of the most interesting observations are even accidental. (When Penzias and Wilson found the cosmic background radiation left by the Big Bang, they weren’t looking for it; they had built a radio that picked up a persistent noise that they couldn’t initially explain.) It is true that computer security consists of a fog of moving parts, and that lurking near any simplistic analysis is the truism that correlation does not prove causation. At the same time, complex problems do get solved. Investigators bring a broad set of analytic techniques, ranging from explanatory psychological stories that can be seen to jibe with the data, to complex economic models. Some of these may enable hypotheses to be put forth and tested in a scientific manner. Underlying them all is the possibility of collecting objective data against which predictions can be made.

This core aspect of scientific research—the ability to gather objective data against which to test hypotheses—has been largely missing from information security. It is likely that many security strategies that are prevalent today and that were developed as a “best guess” in the absence of objective data will not stand up to scrutiny. Absent measurement, we are reminiscent of medieval scholars arguing about how many teeth a horse has, by reference to Aristotle, rather than by actually looking at a horse. So we need to study horses. “Fortunately,” they can now be seen all around us. For a wide variety of interesting questions, it is possible to gather data about the causes of security breaches. It becomes possible to compare insider attacks to outsider attacks, to compare password failures to patch failures. We can apply that data to the questions that matter to us, and chart a new path with a greater degree of confidence than ever before.

Get Involved

We will speak here about how organizations and individuals can embrace the use of breach data. We will build on these thoughts in Chapter 8, when we discuss the application of the New School overall.

Jane Goodall made a career of studying chimpanzees. She spent years in the wilderness observing chimp behavior and group dynamics. Those deep observations made her an expert worth hearing. In a similar fashion, information security professionals should take time to gather and analyze security breach data, and be familiar with the analysis that others are doing. (Fortunately, you can do this from the comfort of your desk.) A security professional making planning decisions on topics related to data loss can better make those decisions by referring to the lessons of breach data. An organization that wants the focus of its security program to be on preventing data breaches can align its spending to protect against the types of incidents that are reported elsewhere. We will discuss in Chapter 6 how security budgets today tend to be set using rules of thumb. But because breach data is now available, spending strategies can be made much more efficient. Managers who approach a problem by investigating and gathering data and then propose solutions based on that data will succeed more often than those who look to generic, entrenched thinking. The greatest value any individual or company can extract from breach data comes from bringing their own questions and orientations to the data and studying it themselves. Studying data doesn’t require any theories. It simply requires data. As John Snow’s work shows, getting involved is a great place to start.

As lessons are learned from breach data, those lessons must be communicated to the larger world. Every organization and individual who spends time analyzing breach data should communicate what they are studying and why. People with a variety of backgrounds will be able to bring fresh perspectives to breach data for quite some time, leading to interesting talks, journal papers, and other career-enhancing activity. This variety of backgrounds has been referred to as hybrid vigor, a term Dan Geer imported to the security field from biology.

To continue the momentum behind mandatory disclosure, the conversation about data-gathering must also continue. The requirement to report breaches is new, and some people and organizations are deeply uncomfortable with the new. They want things to be as they were, and they are working to roll back the new rules around breach disclosure. The more vibrant the conversation around breach data, the more individuals and organizations will become aware of the positive effects of the availability of that data. The more people who are aware of those benefits, the less likely that forward progress will be slowed. We’re confident that the more we learn about breaches, the more value we will see, and the more questions we’ll want to ask. It’s the natural progression of data collection and analysis.

In Conclusion

When there are things that are not well understood, great scientists know that there is no alternative to getting into the details and being there. That is why scientists are willing to travel to a long list of uncomfortable and dangerous places to get near the things they want to observe. It’s why natural historians and anthropologists observe not for a day or a week but for years. Seeing what’s normal and what’s abnormal often requires an immersion in the subject and a willingness to follow trails, not knowing where they might lead or if they’ll be a waste. This is what allowed John Snow to succeed: immersing himself in the environment that gave rise to cholera. Fundamentally, medicine and public health involve observation and measurement before theories and diagnoses can be made.

We have identified a source of objective data for information security that is both new and important. We’ve discussed how that data can transform the way in which we discuss and approach security issues, and we have taken an initial stab at analyzing that data.

Breach data is bringing us more and better objective data than any past information-sharing initiative in the field of information security. Breach data allows us to see more about the state of computer security than we’ve been able to see with the traditional sources of information. It has also allowed us to demonstrate that some of the reasons why we don’t share data are more fears than realities. Crucially, breach data allows us to understand what sorts of issues lead to real problems, and this can help us all make better security decisions.

The concept of gathering data, analyzing data, and being willing to adjust behavior based on the message of the data is at the heart of the New School.
