Chapter 5. Monitoring the Business

As you may recall from Chapter 2, one of the important monitoring design patterns is to monitor from the user’s perspective. Starting your monitoring efforts from the outside, rather than deep in the bowels of the infrastructure where most people start, is a far better approach: it provides you with immediate insight into the actual questions people are asking (“Is the site up?” “Are users impacted?”) and sets the stage to iteratively go deeper.

The questions asked by business owners are often vastly different from those asked by software engineers or infrastructure engineers, and I think this is an area where we as engineers can improve our skills and understanding. Once we learn to ask the questions the executives are asking, we can really begin to work on the most important and highest-leverage problems facing the business.

In this chapter, you’ll learn what those questions are and how to apply your engineering expertise to answering them while hitting the basics of business KPIs. By the end of the chapter, you’ll have an appreciation for the concerns of executives and how you can make their lives easier while also demonstrating the value that monitoring provides to the business.

Business KPIs

A key performance indicator (KPI) is a metric that measures how your company is doing along lines the company has deemed important to the health of the business as a whole. Just as a performance metric tells you how the app and infrastructure are doing, a KPI tells you how your business is doing. Also like performance metrics, some KPIs can be rather fuzzy about what they tell you and may require some degree of judgment in order to make decisions with them.

From an executive or founder’s perspective, you can sum up their concerns fairly easily:

  • Are customers able to use the app/service?

  • Are we making money?

  • Are we growing, shrinking, or stagnant?

  • How profitable are we? Is profitability increasing, decreasing, or stagnant?

  • Are our customers happy?

There are many metrics you can use to answer these questions, and they all tend to be approximations, requiring some level of judgment. After all, business is often messy—if it were easy, everyone would be doing it, right?

The following are common metrics business owners use to answer these questions:

Monthly recurring revenue

Measures the amount of monthly recurring revenue from customers. Most often used by SaaS or managed services companies.

Revenue per customer

Measures the amount of revenue per customer, generally on an annual basis. A good measurement for most types of companies.

Number of paying customers

Self-explanatory. You probably want this number to be going up.

Net promoter score

A measurement of user/customer satisfaction. Net promoter score (NPS) asks the user how likely they are to recommend the service/app to someone else, on a scale from 0 to 10, with 10 being the most likely (a rating scale similar in style to a Likert scale). Respondents answering 9 or 10 are counted as promoters and those answering 0 through 6 as detractors; the score is the percentage of promoters minus the percentage of detractors. With enough responses, you can get a sense of how happy your users are with your service/app. NPS can also be used at a more granular level, such as in follow-up emails for recently resolved help desk tickets.

Customer lifetime value (LTV)

The total value of a customer over their lifetime. If you are cross-selling to customers, this number should be going up. This measurement is closely related to revenue per customer but is measured on a lifetime basis.

Cost per customer

Measures how much it costs to service a customer. You ideally want this number to be decreasing over time, as it means you are becoming more efficient at providing a service/app and therefore more profitable. If you are running a SaaS app, determining how much infrastructure costs to run per user is a good starting place for this metric.

Customer acquisition cost (CAC)

Measures how much it costs to acquire a customer/user. This is generally a metric that your marketing team lives and dies by.

Customer churn

Measure of how many users are leaving your app/service. Some amount of churn is inevitable and simply the nature of doing business, but high churn can indicate problems with the app, whether from a product perspective (your app just isn’t very good), performance perspective (your app is too slow), or cost perspective (your app is too expensive). Churn rate is highly dependent on the nature of your business, so it’s best compared to yourself over time and not to other businesses.

Active users

A measurement of active users for your app/service. Active users can be hard to define, and this measurement is greatly dependent on the nature of your business. This metric is often tracked as multiple metrics, such as daily active users (DAU), weekly active users (WAU), and monthly active users (MAU). Ideally, you want this number increasing.

Burn rate

A measurement of how much money the company is spending as a whole, usually expressed per month. This number includes everything from salaries to office space. If your company’s revenue comfortably covers its spending (e.g., a later-stage startup or enterprise), this number isn’t generally used.

Runway

Often found in conjunction with burn rate, runway (sometimes loosely called run rate, a term that more commonly means revenue extrapolated over a year) is a measurement of how long a company has before it’s out of cash at current expenditure levels. This is usually expressed in months. If your company’s revenue comfortably covers its spending (e.g., a later-stage startup or enterprise), this number isn’t generally used.

Total addressable market (TAM)

A measurement of how large a particular market is. It’s fundamentally an estimate that is arrived at by determining the total dollar value if you were to sell to everyone in that market. This can fluctuate depending on how a company defines its market.

Gross profit margin

A measurement of profitability after cost of goods sold (COGS). If you’re a SaaS company, this number is usually greater than 80% and often in the 90% range. COGS, for SaaS, is essentially what it costs to run the app/service. If you are a physical goods company, COGS is the cost to produce those goods. COGS does not include cost of salaries or office space. You can further divide COGS by the number of users to determine how much it’s currently costing you on a per-user basis.

Each of these metrics is used to answer different questions (or the same question from a different perspective!). It’s sometimes hard to get this data, depending on the business. Given the sometimes sensitive nature of these metrics, you may not have access to the data yourself, but it’s still important for you to understand what’s being measured at the executive level and why. If you’re interested in learning more about these topics, Andreessen Horowitz has two fantastic blog posts covering them.1 Though they are aimed primarily at the startup world, they’re a great starting point for digging in deeper on business-level metrics.
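Several of these metrics come down to simple arithmetic once you have the inputs. Here is a minimal sketch in Python, assuming you can get the raw numbers from finance or your billing system. The function names and all of the figures are made up for illustration, and your company may define each metric a little differently (for example, which period churn is measured over, or whether burn is net of revenue).

```python
def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Fraction of customers who left during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start


def gross_margin(revenue: float, cogs: float) -> float:
    """Gross profit as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cogs) / revenue


def cost_per_customer(cogs: float, customers: int) -> float:
    """COGS spread evenly across customers."""
    if customers <= 0:
        raise ValueError("customers must be positive")
    return cogs / customers


def months_of_cash(cash_on_hand: float, monthly_burn: float) -> float:
    """How many months the cash lasts at the current monthly burn."""
    if monthly_burn <= 0:
        raise ValueError("monthly_burn must be positive")
    return cash_on_hand / monthly_burn


def net_promoter_score(responses: list[int]) -> float:
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not responses:
        raise ValueError("need at least one response")
    promoters = sum(1 for r in responses if r >= 9)
    detractors = sum(1 for r in responses if r <= 6)
    return (promoters - detractors) / len(responses) * 100


# Made-up numbers, purely for illustration.
print(churn_rate(customers_at_start=400, customers_lost=10))       # 0.025
print(gross_margin(revenue=100_000, cogs=15_000))                  # 0.85
print(cost_per_customer(cogs=15_000, customers=400))               # 37.5
print(months_of_cash(cash_on_hand=900_000, monthly_burn=75_000))   # 12.0
print(net_promoter_score([10, 9, 8, 7, 6, 3, 10, 9, 5, 10]))       # 20.0
```

None of this replaces the judgment these metrics require; it only shows that the calculations themselves are well within reach of an engineer with access to the data.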

Two Real-World Examples

Since you’re reading this book, you’re probably in IT or engineering and are now wondering what you can do about helping to monitor the business. The answer is: a lot, actually!

If you’re running a SaaS app, for example, there are a lot of questions that can be answered by instrumenting it. The examples that follow will show you exactly what I mean by monitoring from the outside first and why it’s so helpful. Let’s look at how a couple of well-known companies might take this approach.2

Yelp

Yelp is an online platform that connects people with local businesses. There are two types of users for their platform: people searching for a local business (and possibly giving reviews) and business owners managing their business’s page (and possibly giving responses to reviews). A business owner can “claim” their business page for free, but Yelp monetizes its platform by charging business owners for advertising.

Even from this small description, compiling a list of business KPIs is easily done:

  • Searches performed

  • Reviews placed

  • User signups

  • Business pages claimed

  • Active users

  • Active businesses

  • Ads purchased

  • Review responses placed

All of these measure core functionality of the Yelp app and, depending on the architecture, may have a strong or loose mapping to backend services. These metrics are great leading indicators that something might be going wrong somewhere for one simple reason: if the search functionality is broken or slower than normal, the number of searches performed is likely to drop. These metrics, over time, should be relatively stable. You’ll get to know them at a glance and intuitively understand what looks right and what doesn’t. Imagine if you had all these metrics on a TV in your office for all to see. Anyone walking by would be able to immediately get a sense of whether things were fine or going south. None of these metrics tell anyone what might be wrong, but they’re great at signaling the overall health of the business.

Another side effect of tracking these metrics is that you’ll be able to quickly see the impact of backend problems on users. How many times have you seen a slowdown in some backend service and wondered about the impact to users? If you’ve got these metrics, answering that question becomes as easy as popping open the dashboard.
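If you want a rough programmatic version of “what looks right and what doesn’t,” one option is to compare the current value of a KPI against its own recent history. The sketch below is only one simple way to do that: the function name `looks_low`, the 50% floor, and the search counts are all assumptions for illustration, and like the metrics themselves, a result of `True` only hints that something may be wrong.

```python
from statistics import median


def looks_low(history: list[int], current: int, floor_ratio: float = 0.5) -> bool:
    """Return True if current is below floor_ratio times the median of history."""
    if not history:
        return False  # No baseline yet, so there is nothing to compare against.
    return current < floor_ratio * median(history)


# Hypothetical searches performed during the same hour on five previous days.
searches_same_hour_previous_days = [1180, 1225, 1210, 1195, 1240]

print(looks_low(searches_same_hour_previous_days, current=1150))  # False
print(looks_low(searches_same_hour_previous_days, current=420))   # True
```

Comparing against the same hour on previous days is a deliberate choice here: many user-facing metrics follow a daily rhythm, so a flat threshold would flag every quiet night.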

Reddit

Reddit is a social networking site. Users can read Reddit without an account, but posting a thread, commenting, voting, and sending private messages all require an account. Reddit is monetized via ads and Reddit Gold, a premium-level account for users. Subreddits are unlimited and free to create, but creating one also requires an account.

Measuring the core functionality would probably look something like this:

  • Users currently on the site

  • User logins

  • Comments posted

  • Threads submitted

  • Votes cast

  • Private messages sent

  • Gold purchased

  • Ads purchased

Reddit’s metrics aren’t terribly different from Yelp’s, are they?

By now, you’re starting to get an idea of what you might be able to do for your company. On their own, these metrics aren’t perfect, but they do measure interaction and engagement of users/customers. What would it look like if we kept going deeper with this line of thinking?

Tying Business KPIs to Technical Metrics

Let’s go back to one of the examples above: Reddit. On its own, tracking user logins is good, but is there a way we could get more specific metrics about user login performance? I think there is: user login failures.

Tracking user logins as a single count lumps successful and failed logins together. That’s not bad, but if the backend service responsible for handling user logins were having issues, this metric alone wouldn’t show it: the total could look normal while a growing share of attempts fail. Tracking success and failure separately is even better, as it helps us in our ever-present quest to know whether our app is working.

Let’s look again at Reddit’s metrics, with a bit more granularity (Table 5-1):

Table 5-1. Business KPIs tied to technical metrics

Business KPI                   Technical metrics
-----------------------------  -----------------------------------------------
Users currently on the site    Users currently on the site
User logins                    User login failures, login latency
Comments submitted             Comment submission failures, submission latency
Threads submitted              Thread submission failures, submission latency
Votes cast                     Vote failures, vote latency
Private messages sent          Private message failures, submission latency
Gold purchased                 Purchase failures, purchase latency
Ads purchased                  Purchase failures, purchase latency

A couple of things to note here:

  • We’ve left current users alone. This metric is still very useful to us, as it provides clues about traffic levels to the site.

  • The new metrics are all about failure rates and latency. You can track success rates if you’d like, but failure rates are more directly applicable to our goals. Latency is great to track, as it can be a good indicator of coming problems.

These new metrics answer the question of whether the app is working at a more granular level than the previous set did. They don’t assume they know what the problem is, but only hint that there could be one—exactly where we want to be for these sorts of metrics.
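To make the difference between a single count and separate success/failure tracking concrete, here is a small sketch. The `ActionStats` class is hypothetical and keeps everything in memory; in practice you would send these values to whatever metrics system you already use. The login results are invented for illustration.

```python
import math
from collections import Counter


class ActionStats:
    """In-memory tallies for one user-facing action, such as login."""

    def __init__(self) -> None:
        self.counts = Counter()
        self.latencies_ms = []

    def record(self, succeeded: bool, latency_ms: float) -> None:
        """Record one attempt, keeping successes and failures separate."""
        self.counts["success" if succeeded else "failure"] += 1
        self.latencies_ms.append(latency_ms)

    def total(self) -> int:
        return self.counts["success"] + self.counts["failure"]

    def failure_rate(self) -> float:
        """Failures as a fraction of all attempts (0.0 if none recorded)."""
        return self.counts["failure"] / self.total() if self.total() else 0.0

    def latency_percentile(self, pct: float) -> float:
        """Nearest-rank percentile of the recorded latencies."""
        if not self.latencies_ms:
            raise ValueError("no latencies recorded")
        ordered = sorted(self.latencies_ms)
        rank = max(1, math.ceil(pct / 100 * len(ordered)))
        return ordered[rank - 1]


logins = ActionStats()
for ok, ms in [(True, 120), (True, 95), (False, 2400), (True, 110)]:
    logins.record(ok, ms)

print(logins.total())                 # 4
print(logins.failure_rate())          # 0.25
print(logins.latency_percentile(50))  # 110
```

The total of four logins looks unremarkable on its own. The failure rate and the one very slow attempt are what hint that the login path deserves a closer look.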

My App Doesn’t Have Those Metrics!

You might be thinking about now, “My app doesn’t give me that data. How do I monitor something I don’t have?” I am so glad you asked.

Monitoring, as I mentioned in Chapters 1 and 2, isn’t something that can be bolted on after the fact. To get visibility into the performance of your app and infrastructure, you have to design for it.

Can you imagine if Ford made a car with no way to measure how much fuel was in the tank? Or how fast you were going? These aren’t simply bolted onto the car after it’s been finished; they are designed into the vehicle from the very start. In fact, modern vehicles are very much like modern software: the ECU (engine control unit, aka “the computer”) is just a whole lot of software responsible for analyzing inputs from lots of sensors and adjusting outputs given to the other components of the car. The core functionality of an ECU depends entirely on sensors feeding back measurements to it, allowing the ECU to adjust its control of various critical components. Right from the beginning of the computerized era, monitoring was built into the vehicle at design.

Thankfully for us, we aren’t building cars: we can change things whenever we like with a much faster feedback loop than, say, adding a gas gauge to every car we’ve sold after they’ve been shipped out. We have the ability to modify our apps and infrastructure to add better monitoring as we please and improve on it over time. If your app doesn’t expose a measurement you need, get your hands dirty and modify the app to do so.
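Modifying the app doesn’t have to be a big project. As a sketch of how small the change can be, the Python decorator below wraps an existing function so that every call is counted as a success or failure and timed. Everything here is an assumption for illustration: the `instrumented` decorator, the in-process `COUNTS` and `LATENCIES_MS` stores, and the toy `submit_comment` function. It also assumes that a raised exception is what “failure” means for your app, which may not hold everywhere.

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-process stores; a real app would hand these values
# to its metrics system instead of keeping them in module globals.
COUNTS = defaultdict(int)
LATENCIES_MS = defaultdict(list)


def instrumented(action: str):
    """Count successes and failures of the wrapped function and time each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
            except Exception:
                # Assumption: any exception counts as a failed attempt.
                COUNTS[f"{action}.failure"] += 1
                raise
            else:
                COUNTS[f"{action}.success"] += 1
                return result
            finally:
                # Runs for both outcomes, so failed calls are timed too.
                LATENCIES_MS[action].append((time.perf_counter() - start) * 1000)
        return wrapper
    return decorator


@instrumented("comment_submit")
def submit_comment(body: str) -> int:
    """Toy stand-in for real application code."""
    if not body:
        raise ValueError("empty comment")
    return len(body)


submit_comment("first!")
try:
    submit_comment("")
except ValueError:
    pass

print(dict(COUNTS))
print(len(LATENCIES_MS["comment_submit"]))
```

Because the exception is re-raised, the wrapped function behaves exactly as before from the caller’s point of view; the only change is that you can now see what it’s doing.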

Finding Your Company’s Business KPIs

Now that you’ve got an idea of how you can tie business KPIs to technical metrics, let’s talk about how to find them for your business and app.

With the examples given, you’ve probably already got a decent idea of what your metrics are. I’d love to give you a list of metrics you should be tracking, but alas, every business is different, and sweeping generalizations can’t be made.

However, don’t fret, because I have a foolproof way of ensuring you understand how the app works and what’s important to measure: talk to people.3

Yes, I know, it’s crazy, but I swear it works!

So whom should you talk to?

The first person is a product manager. If you haven’t worked with a product manager before, their job is to essentially understand what the customers want and work with engineering to get it built. As a result, product managers tend to have the best idea of what matters at a high level. After talking with product managers, talk to the manager(s) of the software engineering team(s), followed by a few senior software engineers. By the end, you should have a great idea of what matters and how to find it.

What should you ask them? Here are my favorite questions:

  • Let’s assume I’m new to the company—how do I know the app is working? What do you check? How should it behave?

  • What are the KPIs for our app? Why are those the KPIs? What do they tell us?

Another way to figure out what you should monitor at this stage is to sketch out the app’s functionality at a very high level. Pay no attention to whether you’re using MySQL or PostgreSQL or what-have-you for your database—simply knowing that some component talks to a database is sufficient to know that you probably want to measure database latency from that component. Mapping out functionality, such as login, search, loading a map, etc. is a great way to determine where to start.
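If a whiteboard isn’t handy, the same sketch can be written down as data. The example below is a hypothetical map of three user-facing functions and the components each one talks to, plus a helper that lists the measurements that map suggests. The function names, component names, and the `candidate_metrics` helper are all invented for illustration; the output is a starting list to discuss with people, not a definitive set.

```python
# Hypothetical high-level map: user-facing function -> components it talks to.
APP_MAP = {
    "login": ["database", "session cache"],
    "search": ["search index"],
    "load map": ["tile service", "database"],
}


def candidate_metrics(app_map: dict[str, list[str]]) -> list[str]:
    """List count, failure, and latency metrics suggested by the map."""
    metrics = []
    for function, dependencies in app_map.items():
        metrics.append(f"{function}: count")
        metrics.append(f"{function}: failures")
        metrics.append(f"{function}: latency")
        for dependency in dependencies:
            # Latency as seen from the calling function, not from the dependency.
            metrics.append(f"{function} -> {dependency}: latency")
    return metrics


for name in candidate_metrics(APP_MAP):
    print(name)
```

Notice that the map says “database,” not which database: at this level of detail, the vendor doesn’t change what you want to measure.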

Since every business is different, there are no common metrics everyone should track. Your goal is to find the high-level metrics that indicate whether the app is actually working as it should.

Wrap-Up

We’ve learned the importance of facets of the company that many of us are rarely, if ever, exposed to. Yet they are absolutely crucial to the operation and growth of the business—and us keeping our jobs. To recap:

  • Business KPIs are among the most important metrics out there and make great leading indicators for the health and performance of your app and infrastructure.

  • We learned how to identify these important metrics in our companies and track them.

  • We learned how to tie business metrics to technical ones.

In the next chapter, we’ll learn about the ever-evolving world of frontend performance monitoring.

1 See http://bit.ly/2yJOWRe and http://bit.ly/2zBRMo9.

2 I’m making an educated guess here—I could be totally off base.

3 This is actually a mind hack of sorts: talking to people outside of Engineering gets you out of the engineering bubble, even if for a brief time. It’s always valuable to understand perspectives outside of Engineering.
