Chapter 3. Realizing ROI in Analytics

“Fast Learners Win”

Eric Schmidt

In the past decade, the amount of time and money spent on incorporating analytics into business processes has risen to unprecedented levels. Data-driven decisions that used to be a C-suite luxury (costing a small fortune in consulting) are now expected from every operating executive. Gut feelings and anecdotal cases are no longer acceptable. “In God we trust, everyone else bring data” has become the slogan of a generation.

Large organizations have hundreds of processes, strategic initiatives, and channels—all waiting to be optimized—and increasing business complexity has only amplified the demand for analytic solutions. But despite the tremendous demand, supply, and investment, there is one major missing link: value. Do you really know whether your analytics are living up to their lofty expectations?

To unpack this, we first have to dive a little deeper into the three tiers of analytics: descriptive, predictive, and prescriptive. Descriptive analytics summarize historical data—these are reports and dashboards. Predictive analytics project the most likely outcome for future or unlabeled data points—these are recommendations or classifications. Finally, the highest tier is prescriptive analytics, which combine predictions with feedback—measuring the accuracy of those predictions over time.

One great example of prescriptive analytics is the Amazon product page and shopping cart. At the bottom of each product page there are recommendations of commonly bundled items. These recommendations are predictions of cross-selling opportunities. Most importantly: with the shopping cart, Amazon has a feedback mechanism to track the success (or failure) of its recommendations. Amazon is embedding prescriptive recommendations into the purchasing process.

That feedback mechanism is the cornerstone of what allows Amazon to measure the “realized ROI” of the predictions: a transaction-level data collection of the outcomes. This feedback mechanism is the most critical, yet least understood, component of successful prescriptive analytics. Without it your analytical projects are likely to be one-offs and ineffective.
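
To make the idea concrete, the sketch below shows the kind of transaction-level feedback record such a mechanism has to capture. The field names are purely illustrative (this is not Amazon's actual schema); the point is that every recommendation is tied to an observable outcome:

    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class RecommendationFeedback:
        """One transaction-level feedback record: a recommendation and its outcome."""
        recommendation_id: str   # ID assigned when the recommendation is shown
        product_id: str          # the item that was recommended
        shown_at: datetime       # when the recommendation was displayed
        added_to_cart: bool      # did the shopper act on the recommendation?
        purchased: bool          # did the item survive to checkout?
        cart_value_delta: float  # incremental revenue attributed to the recommendation

    # A recommendation that was acted upon and converted
    event = RecommendationFeedback(
        recommendation_id="rec-123",
        product_id="B000XYZ",
        shown_at=datetime(2015, 6, 1, 14, 30),
        added_to_cart=True,
        purchased=True,
        cart_value_delta=12.99,
    )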

The Lifecycle for a Feedback System

Analytics, just like software projects, have a lifecycle of development. You can split these into three categories: (1) data and requirement discovery, (2) feature engineering and modeling, and (3) online learning.

  1. Data and Requirement Discovery. When first embarking on an analytics project, there is a period when the business value is known, but finding the associated data to drive insight is unknown.

  2. Feature Engineering and Modeling. There is an art to applying the right model and transformations to a given dataset and domain. Generally, the more data you have, the less you have to rely on obscure concepts like the bias/variance trade-off.

  3. Feedback. Completing feature engineering and modeling gives you a predictive model, which can be applied to a set of problems. Understanding whether that model is working, and incrementally improving it, is called “online learning”; a minimal sketch follows this list.
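
To illustrate step 3, here is a minimal online-learning sketch in Python. It uses scikit-learn's SGDClassifier purely as an example (the text does not prescribe a particular library), and random arrays stand in for real features and feedback:

    import numpy as np
    from sklearn.linear_model import SGDClassifier

    rng = np.random.default_rng(0)

    # Steps 1 and 2 have already produced features and an initial model;
    # random arrays stand in for the real feature matrix and labels here.
    model = SGDClassifier()
    X_hist, y_hist = rng.random((1000, 5)), rng.integers(0, 2, 1000)
    model.partial_fit(X_hist, y_hist, classes=[0, 1])

    # Step 3, online learning: each new batch of feedback (features plus the
    # observed outcomes) incrementally updates the model in place, rather than
    # triggering a full retrain from scratch.
    X_feedback, y_observed = rng.random((50, 5)), rng.integers(0, 2, 50)
    model.partial_fit(X_feedback, y_observed)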

The Measurements for a Feedback System

Before designing an analytic system, the most important task is to distill success and failure into a measure of accuracy. How will your predictions affect the value the application delivers? In statistical lingo, the mathematical formula for that measure of accuracy is called your “loss function.”

For example, if you recommend a product on Amazon: what is the implication of recommending a bad product? Of failing to recommend precisely the right product? Of recommending the right product and having it not translate into a larger shopping cart? In this case, as in most cases, it depends. Is Amazon trying to help customers save money? Increase its operating margin? Reduce inventory at a fulfillment center?

Here, success or failure is defined by the business objective and its measurable outcome. The accuracy of the recommendation system should be based on this objective. There are three common accuracy metrics that combine to make your “loss function”:

  1. How many recommendations do you miss?

  2. How many recommendations are incorrect?

  3. How many recommendations are not acted upon?

    How many recommendations do you miss?

    All predictive algorithms are just that: predictive. They carry uncertainty and risk. Do you always want your algorithm to predict an event, even if it doesn’t have a high degree of confidence that the prediction will be correct? That depends on your application. The loss is the opportunity cost associated with not making a recommendation.

    How many recommendations are incorrect?

    What if your algorithm makes a prediction and gets it wrong? If someone acts on the incorrect prediction, what is the cost? How many predictions does your algorithm have to get wrong before people start ignoring it altogether? The loss is the cost associated with making an incorrect recommendation.

    How many recommendations are not acted upon?

    What if you make a prediction and it’s correct? Does that turn into dollars? While at an outcome level this has the same result as (1), it requires a very different solution. Making good technology is one thing, but changing human behavior can be a different challenge entirely. The loss is the opportunity cost associated with a correct recommendation that is never acted upon.
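
As a rough sketch of how these three metrics can be rolled into a single loss figure, consider the weighted sum below. The weights and counts are hypothetical; in practice they should fall out of the business objective discussed above:

    def recommendation_loss(missed, incorrect, unactioned,
                            cost_missed, cost_incorrect, cost_unactioned):
        """Combine the three accuracy metrics into one loss figure.

        missed     -- opportunities where no recommendation was made
        incorrect  -- recommendations that turned out to be wrong
        unactioned -- correct recommendations that nobody acted upon
        cost_*     -- the dollar (or opportunity-cost) weight of each failure mode
        """
        return (missed * cost_missed
                + incorrect * cost_incorrect
                + unactioned * cost_unactioned)

    # Hypothetical week of feedback; incorrect recommendations are weighted most
    # heavily because they erode user trust as well as revenue.
    weekly_loss = recommendation_loss(
        missed=120, incorrect=45, unactioned=200,
        cost_missed=5.0, cost_incorrect=12.0, cost_unactioned=5.0,
    )
    print(f"Estimated weekly loss: ${weekly_loss:,.2f}")  # $2,140.00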

One important note is that it’s much easier to get feedback on incorrect recommendations than missed recommendations. Users can find mistakes quickly, describe why they’re incorrect, and the algorithms can adjust accordingly. When algorithms aren’t able to determine a good recommendation at all, users have to do a lot more work. They not only have to go out and find the correct answer, but they also have to make a decision in the absence of an alternative. This means that feedback collection is naturally skewed toward (2) and (3) above, and measuring (1) is more challenging.

Unfortunately, this isn’t all just numbers. It has been shown that we have very different emotional reactions to opportunity costs than to explicit costs. Similarly, the absence of a prediction is often more justifiable than a blatantly incorrect one, but perhaps not as justifiable as having the prediction and not acting on it. That’s why it’s critical to define the trade-off between the three: it will not only provide a clearer path of development for your analytics, but also map the project to business value. However, before you can even begin such analytics, you need an infrastructure to support it.

The Database for a Feedback System

While building analytical systems, starting from a solid set of components for data movement, storage, and modeling can be the difference between a fast and a slow iteration cycle. The last 20 years of data warehousing have been dominated by the idea of a unilateral data flow—out of applications and into a central warehouse. In this model, enterprises have focused on reducing the latency between when data is captured and when it lands in the warehouse, and on supporting more users with richer queries once it is there. In the new paradigm of a feedback system, the warehouse will have to become more dynamic and bidirectional.

The cost (read: time) of adding data variety (i.e., new attributes for analysis) is critical. Adjusting algorithms based on feedback and constant validation is often constrained by the number of data scientists available. If the effort of incorporating this feedback scales linearly with your technical resources, as it does with many data warehousing solutions today, then sticking to predictive and historical analytics is the path of least resistance.

Equally critical is the bidirectional nature of information. Not only are you pulling data from systems in order to build predictive models, but you also have to feed predictions back into the workflow to capture the value. Most applications can’t be retrofitted to display predictions and don’t have the flexibility to add them to the in-app workflow, so frequently the warehouse will have to trace predictions to outcomes. Alternatively, application-specific data marts can be used to house and populate these predictions.
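
As a minimal sketch of what tracing predictions to outcomes can look like, assume a small application-specific data mart with two tables: one for predictions written out of the model and one for observed outcomes captured later. The table and column names, and the use of SQLite, are illustrative assumptions, not a prescribed design:

    import sqlite3

    conn = sqlite3.connect(":memory:")  # stand-in for an application-specific data mart
    conn.executescript("""
        CREATE TABLE predictions (
            prediction_id TEXT PRIMARY KEY,
            entity_id     TEXT,
            predicted     TEXT,
            predicted_at  TEXT
        );
        CREATE TABLE outcomes (
            prediction_id TEXT,
            observed      TEXT,
            observed_at   TEXT
        );
    """)

    # Predictions flow out of the model and into the workflow...
    conn.execute("INSERT INTO predictions VALUES ('p1', 'cust-42', 'will_churn', '2015-06-01')")
    # ...and the observed outcome is captured once it is known.
    conn.execute("INSERT INTO outcomes VALUES ('p1', 'did_not_churn', '2015-07-01')")

    # Joining predictions to outcomes is what turns a static warehouse
    # into a feedback loop.
    rows = conn.execute("""
        SELECT p.entity_id, p.predicted, o.observed
        FROM predictions p
        JOIN outcomes o USING (prediction_id)
    """).fetchall()
    print(rows)  # [('cust-42', 'will_churn', 'did_not_churn')]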

Regardless, the days of static and enterprise-wide data warehouses are coming to an end. The acceleration of analytics delivered by highly engineered feedback systems trumps the benefit from more centralized and usable data.

The ROI of a Feedback System

Building analytics-driven systems is about more than just confirming the benefits of data-driven decisions. It’s a fundamental muscle of a well-operated organization. It makes objectives more tangible and measurable, and connects process with observable behavior. That transparency and accountability empower teams to set aggressive targets, and align them around a set of tools to achieve those targets. Finally, building analytics-driven systems is about developing a habit of rigorous experimentation that speeds up learning.
