Chapter 4. Troubleshooting

In this chapter, we will learn several techniques to track sources of poor performance or stop potential outages before they occur. We will cover the following recipes in this chapter:

  • Performing triage
  • Installing common statistics packages
  • Evaluating the current disk performance with iostat
  • Tracking I/O-heavy processes with iotop
  • Viewing past performance with sar
  • Correlating performance with dstat
  • Interpreting /proc/meminfo
  • Examining /proc/net/bonding/bond0
  • Checking the pg_stat_activity view
  • Checking the pg_stat_statements view
  • Debugging with strace
  • Logging checkpoints properly

Introduction

A DBA managing a highly available database server is charged with a huge responsibility. The amount of integration, speed of operations, and urgency behind resolving performance degradation can be extremely stressful. Some personalities thrive under this kind of pressure, while others will find it impossible to concentrate and will become paralyzed in fear.

We're not going to claim that every DBA in this position is a battle-weary veteran, typing furiously to save the day while disaster looms. This kind of scenario only exists in movies and often leads to compounding the original problem. In reality, a DBA's job includes many more calculated reactions even when managing a transaction-heavy database with frightfully low tolerance for downtime. The best tip we can give and the whole reason behind this book is to have an expansive bag of tricks.

For the purposes of this chapter, our bag is full of common Linux utilities useful for troubleshooting. With them, we approach system malfunctions like scientists. Given the behavior of the database or the underlying operating system, it is our job to produce a hypothesis for the cause. The tools serve as our instruments, ready to measure and sample, to either prove or disprove until we successfully isolate and address the problem.

With enough practice, we can begin to expect certain output, given PostgreSQL's behavior. Like a good mechanic who can diagnose an engine by its sound, we will hear the subtle tone of distress deep in the database cluster and have an answer. The first step towards this goal is to learn the tools.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.227.9