Attacking low performance

After this basic inspection, it is time to get rid of a very common problem: wrong and missing indexes.

Reviewing indexes

At this point, most of you may say, "indexing? Come on! What is the point? We know that; let's work on real stuff!" From experience, I can tell you with absolute certainty that broken indexing is the single most common cause of bad performance in the world. So, before focusing on anything else in the system, it always makes sense to carry out a safety check to ensure that indexing is definitely okay.

What is the best way to check for missing indexes? In my daily life as a consultant, I use this query:

test=# SELECT relname, seq_scan, seq_tup_read, 
      idx_scan AS idx, 
      seq_tup_read / seq_scan AS ratio 
   FROM  pg_stat_user_tables 
   WHERE seq_scan > 0 
   ORDER BY seq_tup_read DESC
   LIMIT 10;
 relname | seq_scan |   seq_tup_read | idx | ratio 
---------+----------+----------------+-----+----------
 t_user  |  4564324 | 18563909843245 |   2 |  4067176
 ...
(10 rows)

The output of the query shows the following fields: the name of the table, the number of sequential scans that have occurred on the table, the number of rows found in those sequential scans, the number of index scans used, and the average number of rows read during a sequential scan.

Given this listing, the next question is: why would anybody want to read 4 million rows 4.5 million times? There is hardly a sane use case that justifies reading such a large table millions of times. Just imagine what it means for the overall load of your system if a table containing 4 million rows is read over and over again; it is like reading the entire phone book every time you need to look up a single number.

In most cases, the missing index is already obvious when looking at the table structure. In this example, the user table might be used for authentication, but the e-mail address used by people to sign in might not be indexed.
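In a case like this, the fix is often a one-liner. Assuming the t_user table from the listing and a hypothetical email column used for signing in, an index along these lines turns those endless sequential scans into cheap index lookups (CONCURRENTLY avoids blocking writes while the index is built):

CREATE INDEX CONCURRENTLY idx_t_user_email 
    ON t_user (email);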

Note

Keep in mind that sequential scans are not always bad and they can never be avoided completely. It is just important to avoid a large number of expensive and pointless sequential scans.

Experience has shown that in most systems showing bad performance, indexes are missing. It is very likely that a simple review of your indexing strategy will solve most of your problems.

On the other hand, it is also possible that too many indexes cause performance problems. An index is not always the solution to a problem; it can also do harm. Such performance problems are not as obvious as missing indexes. In addition, too many indexes can usually be tolerated much longer than missing ones (depending on the type of application, of course).
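To spot such candidates, pg_stat_user_indexes can be consulted. The following sketch lists the least-used indexes; an index that has never been scanned on a busy system is a candidate for removal (just make sure the statistics cover a representative period, and never drop indexes backing primary keys or unique constraints):

SELECT schemaname, relname, indexrelname, idx_scan 
FROM   pg_stat_user_indexes 
ORDER BY idx_scan ASC 
LIMIT 10;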

Fixing UPDATE commands

Once indexes have been fixed, a second and very common problem can be attacked. Many people forget about a thing called FILLFACTOR. Missing FILLFACTOR settings can cause performance problems in the case of update-intensive workloads.

How can those problems be detected? Here is a query showing how it works:

SELECT schemaname, relname, n_tup_ins, 
   n_tup_upd, n_tup_hot_upd, n_tup_del 
FROM   pg_stat_user_tables 
ORDER BY n_tup_upd DESC;

This query returns the tables that face the largest number of UPDATE commands in the system. The important figure is the ratio of n_tup_upd to n_tup_hot_upd. Let's assume a row is updated. If the page containing the row is full, the new version of the row has to be copied to some other block; additional, expensive I/O is the logical consequence. If a proper FILLFACTOR setting (for example, FILLFACTOR = 70 for 70 percent) has been chosen, a so-called HOT update (Heap-Only Tuple update) can be performed instead: the new version of the row is placed in the same block as the original row, provided that no indexed column has been changed. A HOT update is way cheaper than a traditional, normal UPDATE. Therefore, tables facing many UPDATE commands need a reasonably low FILLFACTOR setting to speed things up and reduce I/O.
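To make the ratio visible at a glance, the query can be extended a little. The following sketch computes the percentage of HOT updates per table; a low percentage on an update-intensive table is a strong hint that FILLFACTOR needs attention:

SELECT schemaname, relname, n_tup_upd, n_tup_hot_upd, 
       round(100 * n_tup_hot_upd / n_tup_upd::numeric, 2) AS hot_pct 
FROM   pg_stat_user_tables 
WHERE  n_tup_upd > 0 
ORDER BY n_tup_upd DESC 
LIMIT 10;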

The queries shown in this section help identify the tables that need special attention. It makes sense to work through the list from the top down and set FILLFACTOR on indexes and tables.

Note

Keep in mind that the perfect FILLFACTOR option can never be calculated. However, you can make an educated guess and come up with a reasonably good setting. If the FILLFACTOR has to be lowered, 75 percent can be a good start.

To change the FILLFACTOR to 70 percent, the following command can be used:

ALTER TABLE foo SET (FILLFACTOR=70);

If you want to change the FILLFACTOR of an index, use this:

ALTER INDEX foo_index SET (FILLFACTOR=70);

Note

Keep in mind that changing this setting does not have an immediate effect on the physical layout of the table. Existing pages are left untouched; the new FILLFACTOR only applies to pages written from this point on, so the table will converge toward the desired state over time.
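If waiting is not an option, rewriting the table applies the new setting right away. Keep in mind that VACUUM FULL takes an exclusive lock on the table, so it is best run during a maintenance window:

-- rewrite the table so that the new FILLFACTOR is honored immediately
VACUUM FULL foo;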

Once indexes and FILLFACTORs have been defined, the majority of problems should have been fixed.

Detecting slow queries

Next, an important step can be performed: finding slow queries. Some people prefer to check for slow queries first. However, I personally prefer to start with basic checks such as the ones shown so far, because they don't require a restart. Activating pg_stat_statements does need a restart, so it makes sense to enable it after picking the low-hanging fruit.

When activating pg_stat_statements in postgresql.conf, as outlined earlier in the book, it makes a lot of sense to set track_io_timing to true. Enabling track_io_timing will provide users with detailed information about which query has performed which kind of I/O.
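As a quick recap, the relevant lines in postgresql.conf might look like this:

shared_preload_libraries = 'pg_stat_statements'
track_io_timing = on

After a restart, the extension can be enabled in the database you want to inspect:

CREATE EXTENSION pg_stat_statements;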

To figure out what is going on in the system, the following queries may be useful.

Find the most time-consuming queries, like this:

SELECT (SELECT datname 
        FROM   pg_database 
        WHERE  oid = x.dbid) AS datname, 
       query, calls, total_time 
FROM   pg_stat_statements AS x 
ORDER BY total_time DESC;

The most time-consuming queries will show up at the top of the listing and can be fixed one at a time. However, in some cases, total execution time is not your problem. What if I/O is the main limiting factor? In this case, the following fields need special attention:

temp_blks_read    | bigint
temp_blks_written | bigint
blk_read_time     | double precision
blk_write_time    | double precision

If the work_mem setting is too low, it can happen that many temporary files are written to disk, resulting in expensive disk writes that could easily be avoided. The temp_* fields are a good indicator of which queries cause this kind of disk wait.

The blk_* fields give insight into how much I/O a query needs and how much time it spends performing I/O.
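A sketch to hunt down those I/O-intensive queries could look like this (note that blk_read_time and blk_write_time are only populated if track_io_timing is enabled):

SELECT query, calls, 
       temp_blks_read, temp_blks_written, 
       blk_read_time + blk_write_time AS io_time 
FROM   pg_stat_statements 
ORDER BY blk_read_time + blk_write_time DESC 
LIMIT 10;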

Working through the most expensive queries is essential to track down the last bottlenecks remaining in your system. Provided that postgresql.conf has been configured in a sane way, going through the basic steps outlined in this chapter will remove a great portion of the most common performance problems.

The remaining problems usually need a more detailed look and maybe changes inside the application. Everything covered so far can be fixed using PostgreSQL's built-in means.
