Chapter 2
A Measurement Primer for Cybersecurity

Success is a function of persistence and doggedness and the willingness to work hard for twenty-two minutes to make sense of something that most people would give up on after thirty seconds.

—Malcolm Gladwell, Outliers1

Before we can discuss how literally anything can be measured in cybersecurity, we need to discuss measurement itself, and we need to address early the objection that some things in cybersecurity are simply not measurable. The fact is that a series of misunderstandings about the methods of measurement, the thing being measured, or even the definition of measurement itself will hold back many attempts to measure.

This chapter will be mostly redundant for readers of the original How to Measure Anything: Finding the Value of “Intangibles” in Business. It has been edited from the original, with the examples geared slightly more in the direction of cybersecurity. If you have already read the original book, you might prefer to skip this chapter. Otherwise, you will need to read on to understand these critical basics.

We propose that there are just three reasons why anyone ever thought something was immeasurable—cybersecurity included—and all three are rooted in misconceptions of one sort or another. We categorize these three reasons as concept, object, and method. Various forms of these objections to measurement will be addressed in more detail later in this book (especially in Chapter 5). But for now, let’s review the basics:

  1. Concept of measurement. The definition of measurement itself is widely misunderstood. If one understands what “measurement” actually means, a lot more things become measurable.
  2. Object of measurement. The thing being measured is not well defined. Sloppy and ambiguous language gets in the way of measurement.
  3. Methods of measurement. Many procedures of empirical observation are not well known. If people were familiar with some of these basic methods, it would become apparent that many things thought to be immeasurable are not only measurable but may have already been measured.

A good way to remember these three common misconceptions is by using a mnemonic like “howtomeasureanything.com,” where the c, o, and m in “.com” stand for concept, object, and method. Once we learn that these three objections are misunderstandings of one sort or another, it becomes apparent that everything really is measurable.

The Concept of Measurement

As far as the propositions of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.

—Albert Einstein

Although this may seem a paradox, all exact science is based on the idea of approximation. If a man tells you he knows a thing exactly, then you can be safe in inferring that you are speaking to an inexact man.

—Bertrand Russell (1872–1970), British mathematician and philosopher

For those who believe something to be immeasurable, the concept of measurement—or rather the misconception of it—is probably the most important obstacle to overcome. If we incorrectly think that measurement means meeting some nearly unachievable standard of certainty, then few things will be measurable even in the physical sciences.

If you ask a manager or cybersecurity expert what measurement means, you will usually get answers like “to quantify something,” “to compute an exact value,” “to reduce to a single number,” or “to choose a representative amount.” Implicit or explicit in all of these answers is the idea that a measurement is a single, exact number with no room for error. If that were really what the term meant, then, indeed, very few things would be measurable.

Perhaps the reader has heard—or said—something like, “We can’t measure the true impact of a data breach because some of the consequences can’t be known exactly.” Or perhaps, “There is no way we can put a probability on being the target of a massive denial-of-service attack because there is too much uncertainty.” These statements reveal a presumed definition of measurement that is both unrelated to real decision making and unscientific. When scientists, actuaries, or statisticians perform a measurement, they are using a different de facto definition.

A Definition of Measurement

For all practical decision-making purposes, we need to treat measurement as observations that quantitatively reduce uncertainty. A mere reduction, not necessarily an elimination, of uncertainty will suffice for a measurement. Even if some scientists don’t articulate this definition exactly, the methods they use make it clear that, to them, measurement is inherently a probabilistic exercise. Certainty about real-world quantities is usually beyond their reach. The fact that some amount of error is unavoidable but can still be an improvement on prior knowledge is central to how experiments, surveys, and other scientific measurements are performed.

The practical differences between this definition and the most popular definitions of measurement are enormous. Not only does a true measurement not need to be infinitely precise to be considered a measurement, but the lack of reported error—implying the number is exact—can be an indication that empirical methods, such as sampling and experiments, were not used (i.e., it’s not really a measurement at all). Measurements that would pass basic standards of scientific validity would report results with some specified degree of uncertainty, such as, “There is a 90% chance that an attack on this system would cause it to be down somewhere between 1 and 8 hours.”

This conception of measurement might be new to many readers, but there are strong mathematical foundations—as well as practical reasons—for looking at measurement this way. A measurement is, ultimately, just information, and there is a rigorous theoretical construct for information. A field called “information theory” was developed in the 1940s by Claude Shannon, an American electrical engineer and mathematician. In 1948, he published a paper titled “A Mathematical Theory of Communication,”2 which laid the foundation for information theory and, ultimately, much of the world of information technology that cybersecurity professionals work in.

Shannon proposed a mathematical definition of “information” as the amount of uncertainty reduction in a signal, which he discussed in terms of the “entropy” removed by a signal. To Shannon, the receiver of information could be described as having some prior state of uncertainty. That is, the receiver already knew something, and the new information merely removed some, not necessarily all, of the receiver’s uncertainty. The receiver’s prior state of knowledge or uncertainty can be used to compute such things as the limits to how much information can be transmitted in a signal, the minimal amount of signal to correct for noise, and the maximum data compression possible.
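
To make the notion of “entropy removed” a bit more concrete, here is a minimal Python sketch—our own illustration, not Shannon’s—that computes the entropy of a simple yes/no question before and after an observation. The probabilities are invented; the difference between the two entropies is the information gained, in bits.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a yes/no event with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Illustrative numbers: prior belief that a system will be breached this year,
# and a hypothetical revised belief after some observation (e.g., a pen test).
prior_p = 0.5        # maximum uncertainty for a yes/no question
posterior_p = 0.2    # assumed revised probability after the observation

gain = entropy(prior_p) - entropy(posterior_p)
print(f"Prior entropy:     {entropy(prior_p):.3f} bits")
print(f"Posterior entropy: {entropy(posterior_p):.3f} bits")
print(f"Uncertainty reduced by about {gain:.3f} bits")
```

Even a modest revision like this—from 50% to 20%—removes about 0.28 bits of entropy, which is exactly the sense in which an observation carries information.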

This “uncertainty reduction” point of view is what is critical to business. Major decisions made under a state of uncertainty—such as whether to approve large information technology (IT) projects or new security controls—can be made better, even if just slightly, by reducing uncertainty. Sometimes even small uncertainty reductions can be worth millions of dollars.

A Taxonomy of Measurement Scales

Okay, so measuring cybersecurity is like any other measurement in the sense that it does not require certainty. Various types of measurement scales can push our understanding of measurement even further. Usually, we think of measurements as involving a specific, well-defined unit of measure such as dollars per year in the cybersecurity budget or minutes of duration of system downtime.

But could a scale like “high,” “medium,” or “low” constitute a proper measurement? Cybersecurity professionals will recognize scales like this as common in many standards and practices in all areas of risk assessment. It is common to see quantities like “impact” or “likelihood” assessed subjectively on a scale of 1 to 5 and then for those scales to be combined further to assess risk as high, medium, or low. These are deceptively simple methods that introduce a series of issues that will be discussed in further detail later in this book. For now, let’s talk about where it might make sense to use scales other than conventional units of measure.

Note that the definition we offered says that a measurement quantitatively reduces uncertainty. The uncertainty, at least, has to be quantified, but the subject of observation might not be a quantity itself—it could be entirely qualitative, such as membership in a set. For example, we could “measure” something where the answer is yes or no—like whether a data breach will occur this year or whether a cyberinsurance claim will be made—while still satisfying our precise definition of measurement. But our uncertainty about those observations must still be expressed quantitatively (e.g., there is a 15% chance of a data breach this year, there is a 20% chance of making a cyberinsurance claim, etc.).

The view that measurement applies to questions with a yes/no answer or other qualitative distinctions is consistent with another accepted school of thought on measurement. In 1946, the psychologist Stanley Smith Stevens wrote an article called “On the Theory of Scales of Measurement.”3 In it he describes four different scales of measurement: nominal, ordinal, interval, and ratio scales. If the reader is thinking of Celsius or dollars as a measurement, they are thinking of an interval scale and a ratio scale, respectively. These scales both have a well-defined “unit” of a regular size. In both cases we can say a 6 is 2 more than a 4 (6 degrees Celsius or $6). An interval scale, however, doesn’t allow us to say that a 6 is “50% more” than a 4 or “twice as much” as a 3. For example, 6 degrees Celsius is not “twice as hot” as 3 degrees Celsius (since the “zero” position on the Celsius scale is set arbitrarily at the freezing point of water). But $6 million is twice as much as $3 million. So there are some mathematical operations, like multiplication and division, that we cannot do with interval scales.
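
As a toy illustration of that last point (ours, not Stevens’s), the snippet below shows why ratio statements work for dollars but mislead on Celsius, which only supports ratios once converted to a true ratio scale such as Kelvin.

```python
def celsius_to_kelvin(c):
    """Convert an interval-scale reading (Celsius) to a ratio scale (Kelvin)."""
    return c + 273.15

# "6 is twice 3" is meaningful for dollars (a ratio scale)...
print(6 / 3)  # 2.0 -- $6 million really is twice $3 million

# ...but not for Celsius, whose zero point is arbitrary.
print(celsius_to_kelvin(6) / celsius_to_kelvin(3))  # ~1.01 -- 6 C is not "twice as hot" as 3 C
```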

Nominal and ordinal scales are even more limited. A nominal scale has no implied order or magnitude—like gender, location, or whether a system has a given feature. A nominal scale expresses a state without saying that one state is twice as much as another or even, for that matter, more or less than another—each state is simply different, not higher or lower. Ordinal scales, on the other hand, denote an order but not by how much. We can say, for example, that someone with admin rights has more privilege than a regular user, but we cannot say it is five times the privilege of a regular user or twice the privilege of some other user. So most mathematical operations—other than basic logic or set operations—are not applicable to nominal or ordinal scales.

Still, it is possible for nominal and ordinal scales to be informative even though they vary from more conventional measurement scales like kilograms and seconds. To a geologist, it is useful to know that one rock is harder than another, without necessarily having to know by how much. The method they use for comparing hardness of minerals—called the Mohs hardness scale—is an ordinal scale.

So the use of ordinal scales like those often found in cybersecurity is not strictly a violation of measurement concepts, but how they are constructed, what they are applied to, and what is done with the values afterward often does violate basic principles and can cause a lot of problems. Geologists don’t multiply Mohs hardness values by the rock’s color. And while the Mohs scale is a well-defined measurement, the uses of ordinal scales in cybersecurity often are not.

We will show later that measures based on well-defined quantities—like the annual probability of an event and a probability distribution of potential losses—are preferable to the types of ordinal scales typically used in cybersecurity. In fact, nothing in science and engineering really relies on an ordinal scale. Even the Mohs hardness scale has been replaced in many uses. (Outside of geology, the Vickers scale, a proper ratio scale, is considered more suitable for materials science and engineering problems.)

These are all important distinctions about the concept of measurement that contain many lessons for managers in general as well as cybersecurity specialists. The commonplace notion that presumes measurements are exact quantities ignores the usefulness of simply reducing uncertainty, if eliminating uncertainty is not possible or economical. And not all measurements even need to be about a conventional quantity. Measurement applies to discrete, nominal points of interest like “Will we experience a major data breach?” as well as continuous quantities like “How much will it cost if we do have a data breach?” In business, decision makers make decisions under uncertainty. When that uncertainty is about big, risky decisions, then uncertainty reduction has a lot of value—and that is why we will use this definition of measurement.

Bayesian Measurement: A Pragmatic Concept for Decisions

Therefore the true logic for this world is the calculus of Probabilities, which takes account of the magnitude of the probability which is, or ought to be, in a reasonable man’s mind.

—James Clerk Maxwell, 1850

When we talk about measurement as “uncertainty reduction,” we imply that there is some prior state of uncertainty to be reduced. And since this uncertainty can change as a result of observations, we treat uncertainty as a feature of the observer, not necessarily the thing being observed.4 When we conduct a penetration test on a system, we are not changing the state of the application with this inspection; rather, we are changing our uncertainty about the state of the application.

We quantify this initial uncertainty and the change in uncertainty from observations by using probabilities. This means that we are using the term “probability” to refer to the state of uncertainty of an observer or what some have called a “degree of belief.” If you are almost certain that a given system will be breached, you can say there is a 99% probability. If you are unsure, you may say there is a 50% probability (as we will see in Chapter 7, assigning these probabilities subjectively is actually a skill you can learn). Likewise, if you are very uncertain about the duration of an outage from a denial of service attack, you may say there is a 90% probability that the true value falls between 10 minutes and 2 hours. If you had more information, you might give a much narrower range and still assign a 90% probability that the true value falls within that range.
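
For readers who want to see how a subjective 90% interval like that can be turned into a full probability distribution, here is a minimal sketch. It assumes a lognormal shape is reasonable for a duration—a common modeling choice, and a judgment the analyst would have to make—and uses scipy for the distribution functions; the 10-minute-to-2-hour bounds come from the example above.

```python
import math
from scipy import stats

# Stated subjective 90% confidence interval for outage duration, in minutes.
lower, upper = 10.0, 120.0

# Work in log space: the 5th and 95th percentiles of a normal distribution sit
# about 1.645 standard deviations from the mean, so back out mu and sigma.
z90 = stats.norm.ppf(0.95)
mu = (math.log(lower) + math.log(upper)) / 2
sigma = (math.log(upper) - math.log(lower)) / (2 * z90)

outage = stats.lognorm(s=sigma, scale=math.exp(mu))
print(f"Median outage:               {outage.median():.0f} minutes")
print(f"Check 5th/95th percentiles:  {outage.ppf(0.05):.0f} to {outage.ppf(0.95):.0f} minutes")
print(f"Chance of exceeding 4 hours: {outage.sf(240):.1%}")
```

Notice that the fitted distribution now answers questions the bare interval could not, such as the probability that an outage exceeds four hours.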

This view of probabilities is called the “subjectivist” or sometimes the “Bayesian” interpretation. The original inspiration for the Bayesian interpretation, Thomas Bayes, was an eighteenth-century British mathematician and Presbyterian minister whose most famous contribution to statistics would not be published until after he died. His simple formula, known as Bayes’s theorem, describes how new information can update prior probabilities. “Prior” could refer to a state of uncertainty informed mostly by previously recorded data, but it can also refer to a point before any objective and recorded observations. At least for the latter case, the prior probability often needs to be subjective.
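
The theorem itself is compact enough to sketch in a few lines. The numbers below are invented: a prior probability that a host is compromised gets updated after a monitoring alert, given assumed detection and false-positive rates for the alerting tool.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes's theorem: P(H | E) = P(E | H) * P(H) / P(E)."""
    p_evidence = p_evidence_if_true * prior + p_evidence_if_false * (1 - prior)
    return p_evidence_if_true * prior / p_evidence

# Purely illustrative numbers:
prior = 0.02                   # prior probability a given host is compromised
p_alert_if_compromised = 0.90  # assumed true-positive rate of the alerting tool
p_alert_if_clean = 0.05        # assumed false-positive rate

posterior = bayes_update(prior, p_alert_if_compromised, p_alert_if_clean)
print(f"Probability of compromise after the alert: {posterior:.1%}")  # about 26.9%
```

The structure is the point: a prior state of uncertainty, combined with new information, yields a revised (but still uncertain) state of knowledge.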

For decision making, this is the most relevant use of the word “probability.” It is not just something that must be computed based on other data. A person represents uncertainty by stating a probability. Being able to express a prior state of uncertainty is an important starting point in all practical decisions. In fact, you usually already have a prior uncertainty—even though you might not explicitly state probabilities. Stating priors even allows us to compute the value of additional information since, of course, the value of additional information is at least partly dependent on your current state of uncertainty before you gather the information. The Bayesian approach does this while also greatly simplifying some problems and allowing us to get more use out of limited information.

This is a distinction that cybersecurity professionals need to understand. Those who think of probabilities as only being the result of calculations on data—and not also a reflection of personal uncertainty—are, whether they know it or not, effectively presuming a particular interpretation of probability. They are choosing the “frequentist” interpretation, and while they might think of this as “objective” and scientific, many great statisticians, mathematicians, and scientists would beg to differ. (The original How to Measure Anything book has an in-depth exposition of the differences.)

So, there is a fundamental irony when someone in cybersecurity says they lack the data to assign probabilities. We use probability because we lack perfect information, not in spite of it. This position was stated best by the widely recognized father of the field of decision analysis, Professor Ron Howard of Stanford University. During an interview for a Harvard Business Review podcast, the interviewer asked Howard how to deal with the challenge of analysis “when you don’t know the probabilities.” Howard responded:

Well, see, but the whole idea of probability is to be able to describe by numbers your ignorance or equivalently your knowledge. So no matter how knowledgeable or ignorant you are, that’s going to determine what probabilities are consistent with that.

—Ron Howard, Harvard Business Review podcast, interviewed by Justin Fox, November 20, 2014

There are cases where “probability” is a computed value, but, as great minds like Howard and James Clerk Maxwell (from the earlier quote) point out, probability is also used to represent our current state of uncertainty about something, no matter how great that uncertainty is. Keep in mind, though, that while subjective, the probabilities we refer to are not arbitrary or capricious. We need subjective uncertainties to at least be mathematically coherent as well as consistent with repeated, subsequent observations. A rational person can’t simply say, for instance, that there is a 25% chance of their organization being hit by a particular type of cyberattack and a 90% chance that it won’t be (of course, these two possibilities should have a total probability of 100%). Also, if someone keeps saying they are 100% certain of their predictions and they are consistently wrong, then we can reject their subjective uncertainties on objective grounds just as we would the readings of a broken digital scale or ammeter. In Chapter 7, you will see how probabilities can be subjective and yet rational.

Finally, we need to remember that there is another edge to the “uncertainty reduction” sword. Total elimination of uncertainty is not necessary for a measurement, but there must be some uncertainty reduction. If a decision maker or analyst engages in what they believe to be measurement activities, but their estimates and decisions actually get worse or don’t at least improve, then they are not actually reducing their error and are not conducting a measurement according to the stated definition.

And so, to determine whether these ordinal scales so commonly used in cybersecurity are proper measurements, we at least need to ask whether such scales really constitute a reduction in uncertainty. (These finer points will be developed further in Chapter 5.)

The Object of Measurement

A problem well stated is a problem half solved.

—Charles Kettering (1876–1958), American inventor, holder of over 100 patents, including electrical ignition for automobiles

There is no greater impediment to the advancement of knowledge than the ambiguity of words.

—Thomas Reid (1710–1796), Scottish philosopher

Even when the more useful concept of measurement (as uncertainty-reducing observations) is adopted, some things seem immeasurable because we simply don’t know what we mean when we first pose the question. In this case, we haven’t unambiguously defined the object of measurement. If someone asks how to measure “damage to reputation” or “threat” or “business disruption,” we simply ask, “What do you mean, exactly?” It is interesting how often people further refine their use of the term in a way that almost answers the measurement question by itself.

Once managers figure out what they mean and why it matters, the issue in question starts to look a lot more measurable. This is usually the first level of analysis when one of the authors, Hubbard, conducts what he calls “clarification workshops.” It’s simply a matter of clients stating a particular, but initially ambiguous, item they want to measure. Just ask questions like “What do you mean by [fill in the blank]?” and “Why do you care?”

This applies to a wide variety of measurement problems, and cybersecurity is no exception. In 2000, when the Department of Veterans Affairs asked Hubbard to help define performance metrics for what they referred to as “IT security,” Hubbard asked: “What do you mean by ‘IT security’?” and over the course of two or three workshops, the department staff defined it for him. They eventually revealed that what they meant by IT security were things like a reduction in intrusions and virus infections. They proceeded to explain that these things impact the organization through fraud, lost productivity, or even potential legal liabilities (which they may have narrowly averted when they recovered a stolen notebook computer in 2006 that contained the Social Security numbers of 26.5 million veterans). All of the identified impacts were, in almost every case, obviously measurable. “Security” was a vague concept until they decomposed it into what they actually expected to observe.

What we call a “clarification chain” is just a short series of connections that should bring us from thinking of something as an intangible to thinking of it as a tangible. First, we recognize that if X is something that we care about, then X, by definition, must be detectable in some way. How could we care about things like “quality,” “risk,” “security,” or “public image” if these things were totally undetectable, in any way, directly or indirectly? If we have reason to care about some unknown quantity, it is because we think it corresponds to desirable or undesirable results in some way. Second, if this thing is detectable, then it must be detectable in some amount. If you can observe a thing at all, you can observe more of it or less of it. Once we accept that much, the final step is perhaps the easiest. If we can observe it in some amount, then it must be measurable.

If the clarification chain doesn’t work, we might try what scientists would call a “thought experiment.” Imagine you are an alien scientist who can clone not just sheep or even people but entire organizations. You create a pair of the same organization, calling one the “test” group and one the “control” group. Now imagine that you give the test group a little bit more “damage to reputation” while holding the amount in the control group constant. What do you imagine you would actually observe—in any way, directly or indirectly—that would change for the first organization? Does it mean sales go down in the near term or long term? Does it mean it becomes harder to recruit applicants who want to work at prestigious firms? Does it mean that you have to engage in expensive PR campaigns to offset these consequences? If you can identify even a single observation that would be different between the two cloned organizations, then you are well on the way to identifying how you would measure it.

It also helps to state why we want to measure something in order to understand what is really being measured. The purpose of the measurement is often the key to defining what the measurement is really supposed to be. Measurements should always support some kind of decision, whether that decision is a one-off or a frequent, recurring decision. In the case of measuring cybersecurity risks, we are presumably conducting measurements to better allocate resources to reduce risks. The purpose of the measurement gives us clues about what the measure really means and how to measure it. In addition, we find several other potential items that may need to be measured to support the relevant decision.

Identifying the object of measurement really is the beginning of almost any scientific inquiry, including the truly revolutionary ones. Cybersecurity experts and executives need to realize that some things seemed intangible only because they have been poorly defined. Avoidably vague terms like “threat capability” or “damage to reputation” or “customer confidence” seem immeasurable at first, perhaps, only because what they mean is not well understood. These terms may actually represent a list of distinct and observable phenomena that need to be identified in order to be understood. Later in this book (especially Chapter 6) we will offer ways of decomposing them into lists of more specific things.

We should start clarifying the object of measurement by defining some of the other terms we’ve used many times up to now. To measure cybersecurity, we would need to ask such questions as “What do we mean by ‘cybersecurity’?” and “What decisions depend on my measurement of cybersecurity?”

To most people, an increase in security should ultimately mean more than just, for example, who has attended security training or how many desktop computers have new security software installed. If security is better, then some risks should decrease. If that is the case, then we also need to know what we mean by risk. Clarifying this problem requires that we jointly clarify uncertainty and risk. Not only are they measurable; they are key to understanding measurement in general. So let’s define these terms and what it means to measure them:

  Uncertainty: The lack of complete certainty—that is, the existence of more than one possibility. The “true” outcome, state, result, or value is not known.

  Measurement of uncertainty: A set of probabilities assigned to a set of possibilities. For example, “There is a 20% chance we will have a data breach sometime in the next five years.”

  Risk: A state of uncertainty where some of the possibilities involve a loss, catastrophe, or other undesirable outcome.

  Measurement of risk: A set of possibilities, each with quantified probabilities and quantified losses. For example, “We believe there is a 10% chance a data breach will result in a legal liability exceeding $10 million.”

We will explain how we assign these probabilities (initially by using skills you will learn in Chapter 7), but at least we have defined what we mean—which is always a prerequisite to measurement. We chose these definitions because they are the most relevant to how we measure the example we are using here: security and the value of security. But, as we will see, these definitions are also the most useful when discussing any other type of measurement problem we have.

Now that we have defined “uncertainty” and “risk,” we have a better toolbox for defining terms like “security” (or “safety,” “reliability,” and “quality,” but more on that later). When we say that security has improved, we generally mean that particular risks have decreased. If we apply the definition of risk given earlier, a reduction in risk must mean that the probability and/or severity (loss) decreases for a particular list of events. That is the approach mentioned earlier to help measure some very large IT security investments—including the $100 million overhaul of IT security for the Department of Veterans Affairs.
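
To make that concrete, here is a minimal sketch with invented numbers (not figures from the VA project). Each event on a short list gets an annual probability and a loss; “improved security” shows up as a lower probability and/or a lower loss, and therefore a lower expected annual loss. A fuller treatment would use probability distributions of losses rather than single expected values, but the structure is the same.

```python
# Each event: (name, annual probability, loss if it occurs). All numbers invented.
before = [
    ("major data breach",  0.10, 5_000_000),
    ("ransomware outage",  0.20, 1_500_000),
    ("insider data theft", 0.05, 2_000_000),
]
# After a hypothetical new control: some probabilities and losses are lower.
after = [
    ("major data breach",  0.06, 5_000_000),
    ("ransomware outage",  0.10, 1_000_000),
    ("insider data theft", 0.05, 2_000_000),
]

def expected_annual_loss(events):
    """Sum of probability * loss across the listed events."""
    return sum(p * loss for _, p, loss in events)

reduction = expected_annual_loss(before) - expected_annual_loss(after)
print(f"Expected annual loss before: ${expected_annual_loss(before):,.0f}")
print(f"Expected annual loss after:  ${expected_annual_loss(after):,.0f}")
print(f"Risk reduction:              ${reduction:,.0f}")
```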

In short, figure out what you mean and you are halfway to measuring it. Chapter 6 will dive deeper into approaches for defining the observable consequences of cybersecurity, how to break down the effects of a cybersecurity event, and how to clarify the necessary decision. (There you will find that we will again refer to Ron Howard’s work in decision analysis.)

The Methods of Measurement

It’s not what you don’t know that will hurt you, it’s what you know that ain’t so.

—Mark Twain5

When thinking about measurement methods, someone may imagine a fairly direct case of measurement. If you measure the downtime of a system or the number of people who attended security training, there is no larger “unseen” population you are trying to assess. You have direct access to the entire object of measurement. If this is the limit of what one understands about measurement methods, then, no doubt, many things will seem immeasurable. Statistics and science in general would be much easier if we could directly see everything we ever measured. Most “hard” measurements, however, involve indirect deductions and inferences. This definitely applies to cybersecurity, where we often need to infer something unseen from something seen. Studying populations too large or dynamic to see all at once is what statistics is really all about.

Cybersecurity is not some exceptional area outside the domain of statistics but rather exactly the kind of problem statistics was made for. (Cybersecurity experts who are convinced otherwise should consider Mark Twain’s quote above.) They may believe they correctly recall and understand enough about statistics and probability so that they can make confident declarations about what inferences can be made from some data without attempting any math. Unfortunately, their mental math is often not at all close to correct. There are misconceptions about the methods of measurement that get in the way of assessing risk in many fields, including cybersecurity.

Statistical Significance: What’s the Significance?

You may often hear someone claim that a set of sample data is not large enough to be “statistically significant.” If you hear someone say that, you know one thing for sure: They misunderstand the concept of statistical significance. A recent survey of 171 cybersecurity professionals conducted by the authors demonstrates that these misconceptions are just as prevalent in this industry as in any other (more about the findings from this survey will be covered in Chapter 5). You may notice that the beliefs some hold about statistics contradict the following facts:

  • There is no single, universal sample size required to be “statistically significant.”
  • Statistical significance is a function not only of sample size but also of the variance within the sample and the hypothesis being tested. These are used to compute something called a “p-value,” which is then compared to a stated “significance level.” Lacking those steps, a declaration of what is or is not statistically significant cannot be trusted. (A minimal sketch of this computation follows this list.)
  • Once you know not only how to compute statistical significance but also how to understand what it means, then you will find out that it isn’t even what you wanted to know in the first place. Statistical significance does not mean you learned something and the lack of statistical significance does not mean you learned nothing.
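
To underline the point that statistical significance is a computation rather than a sample-size threshold, here is a minimal sketch of how a p-value is actually produced, using a one-sample t-test on invented downtime figures. The result depends on the sample’s mean and variance and on the hypothesized value—not on the sample size alone—and even then it does not tell you how much your uncertainty was reduced.

```python
from scipy import stats

# Hypothetical monthly downtime (hours) observed after deploying a new control.
downtime = [3.1, 2.4, 4.0, 2.8, 3.5, 2.9, 3.3, 2.6]

# Null hypothesis: mean downtime is still 4 hours per month (the old baseline).
t_stat, p_value = stats.ttest_1samp(downtime, popmean=4.0)

alpha = 0.05  # a significance level that must be stated in advance
print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
print("Below the stated significance level:", p_value < alpha)
```

Change the spread of the invented data or the hypothesized baseline and the same eight observations can flip from “significant” to “not significant”—one reason the phrase answers less than people assume.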

This issue is explored in further detail at a mathematical level in the original How to Measure Anything: Finding the Value of “Intangibles” in Business. For now, it is probably better if you drop the phrase “statistically significant” from your vocabulary. What you want to know is whether you have less uncertainty after considering some source of data and whether that reduction in uncertainty warrants some change in actions. Statisticians know that this is not the question statistical significance answers, and they find themselves constantly correcting those who believe otherwise. There is math for questions like how much uncertainty was reduced, but such questions can be answered without reference to statistical significance or whatever the cybersecurity analyst believes they recall about it.

Cybersecurity experts, like many in virtually all fields of management, need to unlearn some misconceptions about statistics as much as they need to learn new concepts about statistics. Later, we will discuss how several proven measurement methods can be used for a variety of issues to help measure something you may have at first considered immeasurable. Here are a few examples involving inferences about something unseen from something seen:

  • Measuring with very small random samples of a very large population: You can learn something from a small sample of data breaches and other events—especially when there is currently a great deal of uncertainty.
  • Measuring when many other, even unknown, variables are involved: We can estimate how much a new security control reduced risk even when there are many other factors affecting whether or not losses due to cyberattacks occurred.
  • Measuring the risk of rare events: The chance of a launch failure of a rocket that has never flown before, or the chance of another major financial crisis, can be informed in valuable ways through observation and reason. These problems are at least as difficult as the risk of the rare major breach in cybersecurity, yet measurements can and have been applied.
  • Measuring subjective preferences and values: We can measure the value of art, free time, or reducing risk to your life by assessing how much people actually pay for these things. Again, the lessons from other fields apply equally well to cybersecurity.

Most of these approaches to measurement are just variations on basic methods involving different types of sampling and experimental controls and, sometimes, choosing to focus on different types of questions that are indirect indicators of what we are trying to measure. Basic methods of observation like these are often absent from certain decision-making processes in business, perhaps because such quantitative procedures are considered to be some elaborate, overly formalized process. They are not usually thought of as something you might do, if necessary, on a moment’s notice with little cost or preparation. But we will show some methods that—to use a popular concept in systems engineering—can even be considered “agile.”

Small Samples Tell You More Than You Think

When someone in cybersecurity or any other field says something like, “We don’t have enough data to measure this,” they probably do not realize that they are making a very specific mathematical claim—one for which they have provided no actual math. Did they actually compute the uncertainty reduction from a given amount of data? Did they actually compute the economic value of that uncertainty reduction? Probably not.

Our intuition is one problem when it comes to making probabilistic inferences about data. But perhaps a bigger problem is what we think we learned (but learned incorrectly) about statistics. Statistics actually helps us make some informative inferences from surprisingly small samples.

Consider a random sample of just five of anything. It could be time spent by employees on websites, a survey of firms in some industry reporting cybersecurity budgets, and so on. What is the chance that the median of the entire population (the point at which half the population is below and half above) is between the largest and smallest of that sample of five? The answer is 93.75%. In How to Measure Anything, Hubbard refers to this as the “Rule of Five.” With a sample this small, the range might be very wide, but if it is any narrower than your previous range, then it counts as a measurement according to our previous definition. The Rule of Five is simple, it works, and it can be proven to be statistically valid for a surprisingly wide range of problems. If your intuition—or your recollection of statistics—disagrees with this, it’s not the math that is wrong.

It might seem impossible to be 93.75% certain about anything based on a random sample of just five, but it’s not. If we randomly picked five values that were all above the median or all below it, then the median would be outside our range. But what is the chance of that, really? Remember, the chance of randomly picking a value above the median is, by definition, 50%—the same as a coin flip resulting in “heads.” The chance of randomly selecting five values that happen to be all above the median is like flipping a coin and getting heads five times in a row. The chance of getting heads five times in a row in a random coin flip is 1 in 32, or 3.125%; the same is true with getting five tails in a row. The chance of not getting all heads or all tails is then 100% – (3.125% × 2), or 93.75%. Therefore, the chance of at least one out of a sample of five being above the median and at least one being below is 93.75% (round it down to 93% or even 90% if you want to be conservative). Some readers might remember a statistics class that discussed statistics for very small samples. Those methods were more complicated than the Rule of Five, but the answer is really not much better. (Both methods make some simplifying assumptions that work very well in practice.)
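
The arithmetic behind the Rule of Five is short enough to verify directly, and a quick simulation against an arbitrary, skewed population (our own sketch) confirms it.

```python
import random

# Direct calculation: the chance that five random samples are NOT all above
# or all below the population median.
print(f"Calculated: {1 - 2 * 0.5**5:.4%}")   # 93.7500%

# Simulation check against an arbitrary, skewed "population."
random.seed(1)
population = [random.lognormvariate(3, 1) for _ in range(100_000)]
median = sorted(population)[len(population) // 2]

trials = 20_000
hits = 0
for _ in range(trials):
    sample = random.sample(population, 5)
    if min(sample) <= median <= max(sample):
        hits += 1
print(f"Simulated:  {hits / trials:.2%}")    # close to 93.75%
```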

We can improve on a rule of thumb like this by getting more samples and by using simple methods to account for certain types of bias we will discuss later. Still, even with acknowledged shortcomings, the Rule of Five is something that the person who wants to develop an intuition for measurement keeps handy.

Let’s make some deliberate and productive assumptions instead of ill-considered presumptions. We propose a contrarian set of assumptions that—because they are assumptions—may not always be true in every single case but in practice turn out to be much more effective. We will cover these points in more detail later but for now we will just point them out:

  1. No matter how complex or “unique” your measurement problem seems, assume it has been measured before.
  2. If you are resourceful, you can probably find more sources of data than you first thought.
  3. You probably need less data than your intuition tells you—this is actually even more the case when you have a lot of uncertainty now.

There might be the rare case where, only for lack of the most sophisticated measurement methods, something seems immeasurable. But for those things labeled “intangible,” more advanced and sophisticated methods are almost never what is lacking. Things that are thought to be intangible tend to be so uncertain that even the most basic measurement methods are likely to reduce some uncertainty. Cybersecurity is now such a critical endeavor that even small reductions in uncertainty can be extremely valuable.

In the next chapter, we will show how these concepts can be just partially applied through a very simple yet quantitative method for evaluating cybersecurity risks, which will take barely any more time than the common risk matrix.
