
6. Hypotheses, Measures, and Conditions

Paul Rissen, Middlesex, UK

In the previous chapter, we looked at what is known as the “Planning Phase” of experiment-driven product development. In that phase, we collected together all kinds of raw material that form the basis of product development work, turned them into experiment-ready questions, and prioritized among them.

Once complete, you’ll have a backlog of questions that if answered would give you useful knowledge that can help shape your path forward. Now, we’re ready to pick a question from the top of the backlog and move into the “Design Phase.”

The Design Phase of experiment-driven product development is crucial to the success or failure of any particular experiment. In the Design Phase, we take the question we're going to try to answer and design an experiment that will yield a useful, reliable answer—one we can trust as the basis for a product decision. This will be the focus of Chapters 6 and 7.

Allowing time for experiment design is a crucial part of the XDPD approach. It's very easy to dive straight into running experiments, particularly if you're used to A/B feature testing. But until you've become comfortable with nailing down exactly what you're trying to find out, working out precisely how you can discover it, and interpreting the results of your experiment, you'll find the whole experience somewhat unsatisfying—and it won't really help improve your product or service.

In this chapter, we’ll concentrate on the first part of designing an experiment—establishing a Hypothesis for your belief-led questions, choosing appropriate Measures, and detailing the specific Conditions needed in order to answer the question. In the following chapter, we can then discuss the Scale of the experiment, and the Method you’re going to use.

In Chapter 5, we discussed two common forms of question that emerge from the Planning Phase of XDPD—“belief-led” questions and “exploratory” questions. As you can see from Figure 6-1, no matter what form of question you’re dealing with, the Design Phase has plenty to offer.
Figure 6-1. The Design Phase of experiment-driven product development

When considering belief-led questions, however, there is an important step to go through before you can get down to deciding measures, conditions, scale, and method—formulating a hypothesis.

Formulating hypotheses

A hypothesis is a statement of belief, which is why it is mandatory when trying to answer belief-led questions. After all, if you're trying to determine whether your belief is correct or not, you need to start off by being specific about what exactly it is you believe, so that the experiment can actually provide a useful answer.

In digital product and user experience circles, hypotheses are often framed around solutions. That is to say, the team or stakeholders are asked to prepare a hypothesis statement which frames a solution idea in terms of how it meets the particular needs of a group of users and how a product team will know whether or not the idea was a success. Often, the talk that follows is of how the hypothesis can be “validated” or, even more confusingly, “proved.”

In XDPD, we follow the established practice of scientific and statistical experimentation—we talk of testing a hypothesis rather than validating or proving one.

The reason for this slight change is subtle, but it can help you greatly when designing experiments. The danger in framing hypotheses as things to be "validated" is that it can lead people down the path of designing experiments that are biased in favor of collecting positive, affirmatory evidence.

Similarly, claiming that an experiment which affirmed your belief in something has "proved" it suggests that the matter is closed forever and can never be disputed. As emphasized throughout the first few chapters of this book, we need to recognize the limits of our conclusions, and stay humble.

Remember!

Instead of validating or proving a hypothesis to be "true," we seek to test a hypothesis and only accept it if the evidence meets an acceptable level of certainty.

So, what makes a good hypothesis? Let’s take a look at a few key principles of good hypothesis design and some examples of how you can use them to write good and poor hypotheses.

A hypothesis should be the presumed outcome of your research

A good place to start when designing a hypothesis is with what you believe the answer to the question to be. Having started with a premise, claim, or assumption and turned that into a question, you probably already have some inkling, a hunch, or an opinion of what the answer might be. Whether that answer is right or wrong doesn't matter at this stage. What is important now is to capture what you think the answer might be.

This is important because it forces you to be honest with yourself, acknowledging that you have opinions and biases and putting them out there in the open, before the experiment begins. In some ways, it’s the equivalent of declaring any potential conflicts of interest before you begin in earnest.1

If it turns out you were wrong, OK, you might look silly for a moment, but this is where doing the smallest, useful thing comes in. Spending a little effort to prevent you from wasting further time and resources pursuing a lost cause in the future can be a price worth paying.

A hypothesis must be falsifiable, testable, and measurable

Making claims is easy. When choosing what to believe in order to make good product decisions, however, we normally need a higher standard of proof than gut feeling. The three criteria—falsifiable, testable, and measurable—are necessary to achieve that standard.

Falsifiable

What do we mean by falsifiable? A hypothesis that is not falsifiable is essentially impossible to refute. There’s no feasible way of really knowing whether it can be trusted or not, so there’s no purpose in trying to answer the question it poses.

In contrast, a falsifiable hypothesis is one in which it is conceivably possible that we might find evidence that disproves it. We might never be able to emphatically prove a hypothesis—because there’s always a chance that in the future it may no longer be true—but we can disprove one, by collecting evidence which shows it not to be true.

Testable

Along similar lines, a hypothesis needs to be testable. If there’s no practical way in which we can reasonably assess, given the time and resources available to us, whether a hypothesis stands up to scrutiny or not, then it’s useless to us.

For example, in the world of scientific publishing, we might hypothesize that building a particular feature would lead to lesser-known research papers being cited more often than they otherwise would be. But the pace at which research papers are written, reviewed, and published means that we'd only be able to collect evidence to test that hypothesis after months, if not years. More often than not, working on digital products, we just don't have that luxury.

Measurable

Every experiment relies on evidence; thus every hypothesis needs to explicitly state what measures we will use when assessing whether to accept it or not. Most often, this takes the form of some kind of number, but that doesn’t mean that every experiment must use quantitative methods.

The criterion of measurability simply means that there must be some way of comparing the evidence gathered during the experiment back to the original hypothesis. A hypothesis should always state the specific measure against which it can be assessed.

Furthermore, building on the principle that your hypothesis should be the presumed outcome, or answer, to your research, you should state in the hypothesis what you believe will be found when examining the evidence.

The raw material from which we created questions can be quite general—we think adding a photo feature will increase engagement—but it’s difficult to accurately test whether this is true or not without defining the conditions in which someone might see the photo feature, how exactly you would measure “engagement,” or to what extent it might be increased.

A hypothesis often describes a difference or change

When considering your raw material—the premises, claims, assumptions, and knowledge gaps—it’s often the case that at their heart, each of them conceives of a potential difference or change that could exist or will result once your work is complete.

This could be a positive or negative change to a product metric, a change in behavior or opinion, or simply a difference in behavior between two groups of users. A good rule of thumb, therefore, when formulating a hypothesis, is to think about the kind of difference or change it represents.

This gets quite philosophical and deep, but think about it this way. If there were no distinctions between any pieces of knowledge, or indeed between the users of your product—if all there were was an amorphous mass, where nothing changed and there were no meaningful differences in the contexts, beliefs, actions, states, or thoughts of your users—then how would you develop anything? How would you be able to pinpoint a particular thing that made the product better? How would you be able to distinguish between different user groups? It'd be impossible.

Innovation, as Jared Spool has written,2 isn't necessarily coming up with a shiny new thing. It's making a positive difference: effecting a change, delivering something that satisfies and delights someone, because it makes their lives unexpectedly easier.

Describing precisely what that change, or difference, might be is important. It helps make the hypothesis testable, because it tells you exactly what to look out for once you’ve gathered the data from running an experiment.

Gathering evidence by running experiments, whatever form that evidence takes—user activity data, interview records, and so on—and then analyzing it to determine whether the change or difference you expected is there or not, will ultimately provide the knowledge you sought. It might not be the answer you expected, but it will be useful knowledge to inform what comes next.

Regardless of the method you choose, defining your hypothesis requires that you think carefully about what difference or change you’re trying to detect. It could be quantifiable or purely qualitative.

The hypothesis statement

Now we’ve looked at the principles which make up a well-designed hypothesis, let’s look at the general format of the hypothesis statement—the sentence we will write on our experiment card and communicate to others.

At an absolute minimum, the hypothesis statement should contain three things:
  • We believe the answer is <our belief>,

  • The evidence for which would be reflected in <our measures>,

  • Under the circumstances of <our conditions>.

Notice that the hypothesis statement:
  • Is framed as an answer to the question being posed

  • Clearly states the way in which we expect to be able to detect evidence (i.e., our measures)

  • Also states the conditions under which we expect the answer to hold true
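To make the template concrete, here's a minimal sketch of how a hypothesis statement might be captured as a structured experiment card. The HypothesisCard class and its field names are invented purely for illustration; they aren't part of any particular tool.

from dataclasses import dataclass

@dataclass
class HypothesisCard:
    # The three mandatory parts of the hypothesis statement
    belief: str      # "We believe the answer is <our belief>"
    measures: str    # "...reflected in <our measures>"
    conditions: str  # "...under the circumstances of <our conditions>"

    def statement(self) -> str:
        return (
            f"We believe the answer is {self.belief}, "
            f"the evidence for which would be reflected in {self.measures}, "
            f"under the circumstances of {self.conditions}."
        )

card = HypothesisCard(
    belief="that a red 'buy' button will increase sales in China by 3%",
    measures="completed purchases per session",
    conditions="visitors from China viewing product pages",
)
print(card.statement())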

Our work here isn’t complete, however. Although a hypothesis constructed according to the preceding template would be measurable, it could still lead us to design an experiment weighted in favor of our belief. To avoid this bias, and to ensure we stay honest in our experiment design, we should pair our hypothesis with a null hypothesis.

Null hypotheses

Hypotheses always come in pairs. When we talk of “testing” a hypothesis, we’re not just directly trying to gather evidence which suggests this hypothesis may be correct or not. Instead, we’re trying to decide whether to reject what’s known as the null hypothesis in favor of our alternative hypothesis.

The null hypothesis is essentially the mirror to your hypothesis. Null hypotheses generally tend to state that
  • Nothing will happen as a result of a change you make

  • Anything that does happen, or any change/difference you notice, is purely down to chance

  • The factor you believed was responsible for a certain phenomenon is not actually the cause

  • There is no real answer to the question

This may sound cynical, but in order to make sure we don’t mistakenly end up accepting our alternative hypothesis, and thus making poor decisions, we must presume, up until we have assessed the evidence gathered from an experiment, that the null hypothesis holds true.

What we actually do in an experiment to answer a belief-led question, therefore, is work out whether the evidence we gather is strong enough to persuade us to reject the null hypothesis in favor of our proposed alternative answer.

Examples of hypotheses

Let’s run through three examples so that we can understand the concept of our paired hypotheses—the “null” and “alternative” hypotheses in more detail.

Example 1

Question: Is it true that changing the color of a “buy” button would increase sales in China?

Belief: Red is regarded as a “lucky” color in China; therefore we should make more use of it for that audience.

Hypothesis: Changing the “buy” button on product pages in our site from blue to red will increase sales in China by 3%.

Null hypothesis: Changing the color of the button will not be responsible for sales in China increasing by 3%.

Note that the null hypothesis doesn’t say that sales won’t increase at all. What it says is that changing the color of the button from blue to red will not directly cause sales in China to increase by 3%. Sales might increase. They might decrease. They may stay exactly the same. All of this might happen by chance.

The key point here is that the null hypothesis asserts that the relationship between the change in color of the button and the precise jump in sales doesn’t stand up to scrutiny.
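As a sketch of what "rejecting the null hypothesis" can look like in practice for this example, the following compares purchase rates between the blue-button (control) and red-button (experimental) groups using a two-proportion z-test. The counts are invented for illustration, and a full analysis would also check whether any observed lift is in line with the 3% stated in the hypothesis.

from statsmodels.stats.proportion import proportions_ztest

# Invented counts: completed purchases and total sessions per condition
purchases = [310, 262]     # [red button, blue button]
sessions = [10000, 10000]

z_stat, p_value = proportions_ztest(purchases, sessions)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")

# Presume the null hypothesis holds unless the evidence is strong enough
if p_value < 0.05:
    print("Reject the null hypothesis in favor of the alternative.")
else:
    print("Not enough evidence; stick with the null hypothesis.")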

Example 2

Question: Would anyone care if we removed Feature X from our product?

Belief: Given that we can’t see any evidence of click data in the past 12 months on Feature X in our product, we should remove it.

Hypothesis: Nobody has clicked on Feature X in the last 12 months; therefore it is unloved and can be removed.

Null hypothesis: Users have noticed and appreciated Feature X over the last 12 months, but this is reflected in ways other than click data.

In this example, you’ll note that we’re very specific about the measure—click data—and the conditions—the last 12 months. Our null hypothesis helps frame our experiment so that we can get a better answer, by encouraging us to look at other possible evidence before we take a potentially rash decision.

Example 3

Question: Which user group is becoming our most engaged?

Belief: Students in Argentina are becoming more engaged in our product than any other user group.

Hypothesis: We believe students in Argentina are becoming our most engaged user group, as evidenced by the fact that they visit more pages per session than any other group.

Null hypothesis: Students in Argentina are no more engaged than any other user group—the increased number of pages per session from that group is a random event.

Again, even though this example doesn’t propose an intervention or change to the product, we are careful to state both the measure—pages per session—and the conditions—students in Argentina. The null hypothesis cautions us against reading too much into the data we’ve collected, by reminding us that chance is always a factor.

Remember!

Every belief-led question needs both a hypothesis and a null hypothesis. Your experiment is then designed to determine whether there is enough evidence to reject the null hypothesis and accept your “alternative” hypothesis.

Exploratory questions and hypotheses

At first, it might be presumed that hypotheses are mandatory for all experiments. Indeed, if you think of experiments purely as A/B tests (and/or as trials of new features or changes to an existing product), then this makes sense.

However, in XDPD, experiments are a structured way of asking and answering questions. The method, as I’ve said previously, should only be decided once you know exactly what question you’re trying to answer, and the simplest, useful way of answering it.

When considering exploratory questions, you might at first find that coming up with a hypothesis does seem possible. In this case, you’re likely to have encountered a lurking belief—or worse, the method-question-answering loop.

The lurking belief

Unknowingly, you’ve actually surfaced a premise, assumption, or claim—some form of prior belief as to the answer to the question. That is fine, but this turns your question in to a belief-led one, and thus a hypothesis is needed.

The method-question-answering loop

Your proposed hypothesis is either very general—"if I do research, I'll find an answer"—or too specific to a particular method—"I believe that if we do a survey, we'll find an answer." This is what I call the "method-question-answering" loop.

The method-question-answering loop is one where, because you've baked a method into your hypothesis, you end up designing an experiment to test whether that particular method would answer the question or not.

That means ultimately the question you’re asking isn’t the original intended question. For example, let’s take an exploratory question such as “Why do people think our product is difficult to use?”, and let’s assume that you have no prior knowledge, assumptions, or beliefs, so you propose it as an exploratory question.

When trying to come up with a hypothesis, you end up stating a belief (already a warning sign when dealing with exploratory questions) along the lines of “We believe if we run a survey, we’ll be able to answer the question.”

Remember that your hypothesis should be framed as an answer to the question. If we were to continue along the lines of the belief we just outlined, then we’d not be answering “Why do people think our product is difficult to use?” Instead, we’d be answering the question of “Would a survey tell us why people think our product is difficult to use?”

Perhaps it would—but if it didn’t, you’d end up looping back around and proposing a different method, perhaps some user interviews. Either way, you’ve not actually discovered the answer to the question you posed in the first place.

Again, this is where our null hypothesis can save us. If you find yourself falling into the trap of the method-question-answering loop, try to come up with a null hypothesis.

While yes, your method-based hypothesis is technically a hypothesis, your null hypothesis would be something like “if I don’t do research, I’ll not find anything” or “A survey won’t answer the question,” which isn’t very helpful.

To be safe, don't force a hypothesis onto an experiment that is answering an exploratory question. And if you spot yourself doing so, take a step back, think about whether there's really a lurking belief here—and avoid falling into the method-question-answering loop.

Remember!

Although hypotheses may not be mandatory when it comes to experiment-driven product development, this doesn’t mean you can skip the Design Phase altogether! Considering conditions, measures, scale, and method is still crucial to the success or otherwise of your experiments.

Having formulated our pair of hypotheses, we can return our belief-led questions to the main track of the Design Phase. I've touched upon the importance of making experiments measurable, and of specifying the conditions under which we expect to find evidence, but let's close out this chapter by digging deeper into these topics.

Measures and conditions

Every experiment starts with a question. Whether or not you have a prior belief as to the answer, by designing an experiment, you’ll want to find ways in which you can gather evidence which will help you answer the question.

One way in which we can do this is to think about how we expect evidence to manifest in the world and what the motivating factor might be in causing that evidence to appear. These correspond to our measures and conditions.

Measures

Running an experiment without deciding which measures to pay attention to can leave you stumbling around in the dark when it comes to analyzing the results and determining an answer.

We select measures so that we can ensure we capture evidence which will help us answer the question but also to help us catch any other interesting secondary evidence—things that might inform future experiments, or things that make sure we’re not damaging the long-term health of our product or service.

You need to be able to measure at least one thing, in order to be able to answer the question—to see, for instance, whether you were able to detect a change or difference, and how small or vast that change was. This will be, ultimately, what determines whether you can reject your null hypothesis in favor of an alternative.

Even in the case of exploratory questions, where you have no prior belief or hypothesis to work with, specifying the measures you’re especially looking out for, ahead of time, will ensure that your research is targeted.

This should never be to the exclusion of other signals or material that may surface. When exploring, you should always be open to new, interesting things. But by writing down the measures ahead of time, you ensure everyone involved in the experiment, and your stakeholders, know what kinds of evidence will help answer the question.

Choosing appropriate measures

When deciding which measures to choose, think about how you would expect your answer to manifest in a way that is possible to detect.

Perhaps people will click on something in particular. Perhaps they’ll spend less time on a certain step in a process. Perhaps they’ll smile more or drop a particular word in conversation. Select the thing that, if measured in some way, will be representative of the change or difference or would help you answer the question when you have no prior belief.

Measures can appear simple at first—number of clicks, number of page views. But bear in mind that you also want to make sure they are representative. For instance, if you only measure the raw number of page views, that count could include a single user refreshing the same page many, many times. In that case, it might be better to measure unique page views—the number of different pages that were viewed.

Measures can also be calculated. For instance, if you want to measure interest in something, you might want to measure the number of clicks. But the sheer number of clicks could fall prey to the same problem as earlier. In this case, the number of clicks divided by the number of unique page views might be more informative—the unique click-through rate.

Measures might also be proportional. Say you wanted to encourage users to read more articles. Measuring click-through rate would tell you that they expressed interest in articles. Measuring the number of unique page views would tell you how many different articles were read in total.

What you’re really looking for is a change (or otherwise) in the number of articles read by a single user. And, more than that, what you probably really want to know is—did they spend more time reading more articles in one visit? So perhaps what you’re really looking for here is the number of unique pages read in a single session by each user.

Finally, especially for belief-led questions, you should think about how you might detect whether you should be sticking with your null hypothesis—measures that might suggest your alternative hypothesis doesn’t hold up to scrutiny.

These could be the same measures as those which would lead you to reject the null hypothesis in favor of your alternative. But they could also be measures which would show, for instance, that it's not A that causes B, as per your belief, nor is it just chance—it's a previously unnoticed factor C.

Track all the things, right? Wrong.

With digital products, it's very tempting to start by measuring everything, regardless of whether it's tied to an experiment or not. After all, if you're measuring everything, then you'll be able to learn so much more, right? Knowledge is power, and all that?

Well, not exactly. There are several downsides to simply setting up everything to be measured, whether it’s in a digital or physical world.

Firstly, everything you decide to measure has a cost. In isolation, measuring a single thing, for instance, the event of someone clicking a link, may not result in running out of memory on a database somewhere, but if you measure everything, these all add up.

Similarly, every event tracked on a web site needs some kind of script to detect it and fire off a record of the event to some data collection service. The more events you track, the more bloated those scripts become, and the slower your pages load. Again, in small numbers, this is nothing to worry about, but it's something to bear in mind if you decide to track all the things, all the time.

Secondly, there’s a lack of focus if you track everything, all the time. You’re gathering all this data, and yes, it may prove useful at some point, but 90% of the time it’s just sitting there, unused, useless, not really providing you with any actual knowledge. Potential knowledge, perhaps, but it’s not exactly the simplest, useful thing you could do. It’s the hoarder’s instinct. In the spirit of Marie Kondo, ask yourself, does every single data point spark joy, or rather, does everything you’re tracking really answer a specific question?

Finally, there are the legal, ethical, and moral implications of collecting all the data, all the time. Not only is all this data collection potentially straining your resources, and most of it sitting untapped, but it is stockpiling potential legal trouble. Increasingly, citizens and governments are, quite rightly, becoming more cautious about the amount of personal and private data being harvested and potentially sold on for profit, by both private companies and governments.

Indeed, at the time of writing, there are now laws designed to compel organizations to protect the rights of individuals, by ensuring that data is only used to provide core functionality for a product or service—and that the individual retains the right to ask an organization to provide, and possibly remove, all the information it has about their activity.3

All of which adds up to this: the indiscriminate harvesting of data, while it undoubtedly gives great power, also carries great risks. So—if measuring everything isn't the way to go, but experiment-driven product development relies on gathering some data, where do we go from here?

Be frugal about what you measure

A rule of thumb—only measure what you need to, for the amount of time that you need to measure it. If you end up making a change to your product, or even just putting some tracking in place to measure existing behavior, only track the data for the length of the experiment. Put the measures in place, and take them out once the experiment is complete, even if the feature you developed was successful.
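One way to enforce this rule of thumb is to tie every tracked event to an experiment with an explicit end date, so that stale tracking is dropped automatically. The registry below is purely hypothetical, a sketch of the idea rather than any real tool's configuration.

from datetime import date

# Hypothetical tracking plan: every event must name its experiment
# and the date on which that experiment ends.
TRACKING_PLAN = {
    "buy_button_click": {"experiment": "red-button-china", "ends": date(2019, 9, 30)},
    "article_page_view": {"experiment": "engagement-argentina", "ends": date(2019, 8, 15)},
}

def should_track(event_name: str, today: date) -> bool:
    # Only record events whose experiment is still running
    entry = TRACKING_PLAN.get(event_name)
    return entry is not None and today <= entry["ends"]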

Remember!

Measures, and tracking in general, should always be linked back to a question you’re seeking to answer. Determine the scale needed in order to answer it, and once your experiment is complete, remove the tracking.

Health metrics, and “do no harm”

In addition to tracking measures that can help you answer the question, it’s important to make sure that, when making a change to your product or service, you’re not making it worse, in unintended ways.

Earlier, we looked at an example of an experiment designed to answer whether a feature should be removed from a product. There, you'll notice that the hypotheses suggest we shouldn't just measure the feature's use in the way we'd expect to see it, but also look for other ways of understanding whether there's enough interest in the feature to keep it around.

Health metrics work on a similar principle but are almost always independent of any particular experiment. They are the key things that you use to determine the overall health of your product or service—whether it’s working or not and whether users are finding things slower, harder, or not at all.

These are measures that you'd want to know were being negatively affected, even if a particular experiment suggested that a particular course of action might improve a specific part of the product or service. Ultimately, it's up to you to make the decision; you might make things better in the short term, but keeping an eye on your health metrics will help you spot whether there could be problems in the long term.

Baseline experiments

Often, but not always, your hypotheses will suppose that if you changed something in your service or product, you’d be able to detect a difference from what happened beforehand. Sometimes, though, you’ll be introducing something entirely new and won’t really be able to compare it to what’s gone before.

You’d still be interested to know whether there’s a change or difference as a result of introducing something new—sometimes here you’re especially interested in determining whether there is no difference—but directly comparing to a previous situation, or the nearest equivalent you can find in your product or service, may not be helpful.

These experiments are what we call “baseline” experiments. Rather than trying to optimize something, you’re interested in initial reactions. As a result, when choosing your measures for these experiments, you’re most likely to be tracking something new and thus have nothing to compare it to. In some ways, it’s less about a change and simply more about an effect. The most important thing to do in these cases is to measure your health metrics—to ensure that the completely new thing you’ve introduced does no significant harm to the overall health of your product or service.

Another thing to bear in mind when introducing something new is the novelty factor. We notice something new and are intrigued by it. This can cause observers to fall into the trap of significantly overrating the success (or otherwise) of something completely new, because the data really reflects its "newness" rather than its actual merit.

Indeed, you almost certainly won't be able to come to significant, meaningful conclusions about the true popularity of something completely new within the realistic time frame of an experiment. Instead, measure your health metrics, and by all means keep an eye on measures that suggest interest, success, or failure, but do not be tempted to draw absolute conclusions the first time around. New features will often need time to bed in, and a few tweaks here and there.

This shouldn’t be an excuse to never kill your new features, though—never get too attached. Always frame things in terms of testing your premises, and if after a few different experiments, something isn’t working as well as you were hoping, or beginning to cause sustained harm to your health metrics—kill it.

Conditions

Aside from trying to capture evidence which helps you answer the question, you’ll want to know why this particular answer has been found. Designing an experiment around certain conditions helps you establish the motivating factor behind an answer.

In Chapter 5, we talked about the importance of describing why we wanted to ask and answer a question—here, we’re interested in why the evidence that we uncover is what it is.

Conditions for belief-led questions

With belief-led questions, you have the pair of hypotheses, so you can determine the conditions by modeling each condition after your null and alternative hypotheses.

The condition that is designed to gather evidence to suggest you should abandon your null hypothesis in favor of the proposed alternative is called the “experimental” condition.

The “control” condition, on the other hand, is designed to disprove your alternative hypothesis—by acting as if the null hypothesis is correct.

Conditions will be familiar to anyone who’s run A/B tests—the A and B being the conditions themselves. The “A” in this case is usually the control, where participants are not exposed to any change in the product, with the “B” being the group that is exposed to the new version.
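In practice, participants are often assigned to conditions deterministically, by hashing their user ID, so that each user always sees the same condition for the lifetime of the experiment. A minimal sketch of the idea, with invented names:

import hashlib

def assign_condition(user_id: str, experiment: str) -> str:
    # Deterministically assign a user to "control" (A) or "experimental" (B)
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100          # stable bucket in [0, 100)
    return "experimental" if bucket < 50 else "control"

print(assign_condition("user-42", "red-button-china"))

Because the hash includes the experiment name, the same user can land in different buckets across different experiments, which helps keep experiments independent of one another.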

Conditions are still relevant, though, in experiments which don’t include direct interventions or changes to the product in question. Your conditions represent the hypothesized reasons why you believe a particular answer will be found.

For instance, you might be posing a question which seeks to discover whether or not users in certain demographics behave in a different way or have certain attitudes.

In this case, you’d want to establish a condition which groups users around the demographic that you hypothesize is a motivating factor in that difference. Equally, you’d want a condition that allows you to determine whether that difference happens in other demographics too—thus refuting your presumed reasoning.

Conditions for exploratory questions

For exploratory questions, conditions become less about “control” and “experimental” states, but focus more on the particular demographics, groupings, or categories of things you might want to study in order to answer the question.

It could be as simple as “people who use our product” or “people who don’t”—but in most cases, you’ll be interested in finding out more about specific targeted audiences. Thus, your conditions become the way in which you select what qualifies something to be an object of study for your research.

Again, look to the question and define the boundaries of what you will, and won't, study in order to determine an answer. Stating this up front helps establish the caveats and usefulness of any particular answer—always good when managing the expectations of the team and your stakeholders—but it also helps inform any future questions you may want to ask.

Summary

We now find ourselves deep into the Design Phase of experiment-driven product development. Having chosen a question to answer, this chapter has walked you through three important aspects that will help you to design a successful experiment—the hypothesis, measures, and conditions.

We use hypotheses to take our belief-led questions and come up with a presumed answer, based on our prior belief. This presumed answer is often one that describes a difference or change we expect to see.

Importantly, we seek not to prove or validate that answer but to test it. We can use the concept of the null hypothesis to capture a “mirror” to our hypothesis which reminds us to look for ways in which we might be wrong about the answer.

An experiment’s measures are the ways in which we will capture evidence that helps us answer the question.

For questions with hypotheses, they help us be more specific in the hypothesis, by stating exactly how we expect our presumed answer to manifest in a way we might detect.

For exploratory questions, measures help us focus our research on capturing any evidence that matters.

Conditions then give us an insight into why we find what we find among the evidence. They help us identify what might be the motivating factor behind an answer, be it a change, difference, or otherwise.

In the case of belief-led questions, ensuring a control and experimental condition helps test whether our presumed answer was correct or not. And in the case of exploratory questions, conditions help target the research by establishing the boundaries of what you will and won’t study. This then sets expectations around the applicability of an answer.

With our hypotheses, measures, and conditions established, we now know what we’re looking out for when running an experiment. Two other questions still remain:
  • How will we know if the answer we find is meaningful enough to trust, or base a decision upon?

  • After all this, how, then, do we propose to go about finding out the answer to our question?

The answer to each of these—the scale and method for our experiment—will be found in Chapter 7.
