Debugging Issues Discovered Through Metrics and Alerts

When you eliminate the impossible, whatever remains, however improbable, must be the truth.

- Spock

So far, we have explored how to gather metrics and how to create alerts that notify us when there is an issue. We also learned how to query metrics and dig for the information we need when looking for the cause of a problem. We'll build on that and try to debug a simulated issue.

Saying that an application does not work correctly is not enough. We should be much more precise. Our goal is to pinpoint not only which application is malfunctioning, but also which part of it is the culprit. We should be able to blame a specific function, a method, a request path, and so on. The more precisely we can identify the part of an application that is causing a problem, the faster we will find the cause. As a result, it should be easier and faster to fix the issue through a new release (a hotfix), scaling, or any other means at our disposal.

Let's get going. We'll need a cluster (unless you already have one) before we simulate a problem that needs to be solved.
