Tracking I/O-heavy processes with iotop

Many DBAs and system administrators are familiar with the top command, which displays the processes that use the most CPU or RAM. However, this does not help us find the processes that cause high amounts of system I/O.

Fortunately, there is a command, much like top, that is designed specifically for displaying the processes that make storage requests. The iotop utility displays a continuously updated list of the processes and any I/O they are handling. Provided that the server is dedicated to PostgreSQL, we can use this information to almost instantly identify one or more database backends that make disk requests.

Just like top, iotop only sorts processes to the head of the list for as long as their I/O continues, which limits its long-term usefulness. Let's learn more about iotop and see if we can benefit from its functionality.

Getting ready

The iotop command can only be executed by root-level users, as it uses some kernel resources available only to superusers. Be ready with the sudo command!

How to do it...

Follow these steps to obtain a sample output from the iotop command:

  1. Enter interactive mode with this command (exit by pressing q):
    sudo iotop
    
  2. Obtain batch output for 10 seconds with this command:
    sudo iotop -b -n 10
    
  3. Restrict batch output to only active processes, include a timestamp, and suppress the headers with this command:
    sudo iotop -bot -qqq
    

How it works...

While it may be somewhat inconvenient to need superuser access to invoke iotop, we're willing to make that sacrifice in this case. Our first command simply starts iotop interactively, just as we would use top. We can change the sort column with the left and right arrow keys, reverse the sort order by pressing the r key, and quit by pressing q. Of the columns presented here, we may be interested in the following:

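  • TID: This column provides the thread ID of the task that makes I/O requests. PostgreSQL backends are single-threaded, so for them this value matches the PID (the -P option makes iotop list processes instead of threads). This can be used to investigate or terminate the program.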
  • DISK READ: This column illustrates the number of bytes read per second by the listed process.
  • DISK WRITE: This column details the number of bytes written per second by the listed process.
  • IO: This column shows the percentage of time that the listed process spent waiting on I/O.
  • COMMAND: This column depicts the name of the process that handles I/O. If this is a master process, it might include command-line switches as well.
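
As a minimal sketch of the investigation step, a TID copied from iotop can be looked up in the pg_stat_activity view. The helper name backend_lookup_sql is hypothetical, and the sketch assumes PostgreSQL 9.2 or later, where the view exposes the pid, usename, state, and query columns:

```shell
# Hypothetical helper: print a query that looks up one backend by its PID.
# Assumes the PostgreSQL 9.2+ column names in pg_stat_activity.
backend_lookup_sql() {
    case "$1" in
        ''|*[!0-9]*)
            # Reject anything that is not a plain positive integer.
            echo "usage: backend_lookup_sql PID" >&2
            return 1
            ;;
    esac
    printf 'SELECT pid, usename, state, query FROM pg_stat_activity WHERE pid = %s;\n' "$1"
}

# 12345 stands in for a TID copied from the iotop display.
backend_lookup_sql 12345
```

The generated statement can then be piped to psql, for example with backend_lookup_sql 12345 | sudo -u postgres psql, assuming a local postgres operating system account with database access.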

While this kind of use is informative for live troubleshooting, it's less applicable for historical analysis. Thus, for the second command, we add the -b argument to put iotop in batch mode. This means that all the output is simply printed to standard output, which we can redirect to a file if desired. In addition, we used the -n parameter to obtain only 10 readings, one per second, for later analysis.

Readers working along by trying these examples might notice that the amount of output in batch mode is overwhelming. By default, iotop lists every process it can see, whether or not it is actually utilizing disk resources. We can stop this behavior with the -o parameter, so only active processes are included in any output. By adding the -t argument, we also gain a timestamp that we can use to correlate disk activity across data-gathering techniques.

The -q argument acts to suppress excessive iotop output. By specifying it once, iotop only includes the column labels at the top of the output. If you specify it twice, it will never include the column labels. If you specify it a third time, it will also remove the summary data that iotop normally prints after every iteration. This type of output is ideal for importing into reporting tools or even analyzing by hand by searching for interesting time periods.
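
Because this output has one record per line, it can be filtered with standard text tools. The following is only a sketch: the helper name iotop_by_user is hypothetical, and it assumes that each line produced by -bot -qqq has the layout TIME, TID, PRIO, USER, followed by the read, write, swap, and I/O figures and then the command. The sample lines are illustrative, not real measurements:

```shell
# Hypothetical helper: keep only the lines that belong to one OS user.
# Assumed layout per line (timestamp from -t plus the default columns):
#   TIME TID PRIO USER READ UNIT WRITE UNIT SWAPIN % IO % COMMAND...
iotop_by_user() {
    awk -v user="$1" '$4 == user'
}

# Illustrative sample lines in the assumed layout, not real measurements.
printf '%s\n' \
    '10:15:01  2101 be/4 postgres    0.00 B/s  812.00 K/s  0.00 %  4.10 % postgres: checkpointer' \
    '10:15:01   388 be/3 root        0.00 B/s   16.00 K/s  0.00 %  0.20 % [jbd2/sda1-8]' |
    iotop_by_user postgres
```

On a live system, the same function could sit at the end of a pipeline such as sudo iotop -bot -qqq -n 10 | iotop_by_user postgres.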

There's more...

While the iotop data is not actually part of the statistics gathered automatically by the sysstat package, we can log the data for posterity anyway. Follow these steps as a root-level user to log the iotop data:

  1. Create a file named iotop at /etc/cron.d/ and fill it with this line:
    * * * * * root iotop -boat -qqq -d 5 -n 2 >> /var/log/iotop
  2. Reload the configuration files of the cron service with this command:
    sudo service cron reload
    

By adding the -a parameter, iotop will log the cumulative total of the I/O used between the readings, instead of the I/O per second. We use the -d argument to add a 5-second delay between readings, and the -n parameter to limit iotop to two readings. Together, this means that we get a 5-second sample logged to /var/log/iotop every minute.
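
Once the log has accumulated some history, we can pull out an interesting time period. This is a sketch under stated assumptions: the helper name iotop_window is hypothetical, and it relies on each logged line starting with the zero-padded HH:MM:SS stamp added by -t. That stamp carries no date, so the window is only meaningful within a log that covers a single day. The sample lines are illustrative:

```shell
# Hypothetical helper: print log lines whose timestamp lies in [start, end].
# Assumes each line begins with the HH:MM:SS stamp added by -t; zero-padded
# times of equal length compare correctly as plain strings.
iotop_window() {
    awk -v start="$1" -v end="$2" '$1 >= start && $1 <= end'
}

# Illustrative sample lines, not real measurements.
printf '%s\n' \
    '10:14:06  2101 be/4 postgres    0.00 B   96.00 K  0.00 %  0.40 % postgres: checkpointer' \
    '10:15:06  2101 be/4 postgres    0.00 B    4.20 M  0.00 %  6.30 % postgres: checkpointer' \
    '10:21:06  2101 be/4 postgres    0.00 B   64.00 K  0.00 %  0.10 % postgres: checkpointer' |
    iotop_window 10:15:00 10:20:00
```

Against the real log, the call would look like iotop_window 10:15:00 10:20:00 < /var/log/iotop.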

See also

  • Always examine the manual for the tools that we use in these recipes. In this case, the manual for iotop is available by executing this command:
    man iotop
    