After this basic inspection, it is time to get rid of a very common problem: wrong and missing indexes.
At this point, most of you may say, "indexing? Come on! What is the point? We know that; let's work on real stuff!" From experience, I can tell you with absolute certainty that broken indexing is the single most common cause of bad performance in the world. So, before focusing on anything else in the system, it always makes sense to carry out a safety check to ensure that indexing is definitely okay.
What is the best way to check for missing indexes? In my daily life as a consultant, I use this query:
test=# SELECT relname, seq_scan, seq_tup_read,
              idx_scan AS idx,
              seq_tup_read / seq_scan AS ratio
       FROM pg_stat_user_tables
       WHERE seq_scan > 0
       ORDER BY seq_tup_read DESC
       LIMIT 10;
 relname | seq_scan | seq_tup_read   | idx | ratio
---------+----------+----------------+-----+---------
 t_user  |  4564324 | 18563909843245 |   2 | 4067176
...
(10 rows)
The output of the query shows the following fields: the name of the table, the number of sequential scans that have occurred on the table, the number of rows found in those sequential scans, the number of index scans used, and the average number of rows read during a sequential scan.
Given the listing here, the next question is, why would anybody want to read 4 million rows 4.5 million times? There is no sane use case that justifies reading such a large table millions of times. Just imagine what it means for the overall load of your system if a table containing 4 million rows is read over and over again. It is like reading the entire phone book to find a single number.
In most cases, the missing index is already obvious when looking at the table structure. In this example, the user table might be used for authentication, but the e-mail address used by people to sign in might not be indexed.
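In a case like this, the fix is usually a single index on the column used in the lookup. As a sketch, assuming the table and column names from the example (t_user, email):

```sql
-- Hypothetical fix for the example above: index the column
-- used in the authentication lookup (column name assumed).
-- CONCURRENTLY avoids blocking writes while the index is built.
CREATE INDEX CONCURRENTLY idx_t_user_email ON t_user (email);
```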
Experience has shown that in most systems showing bad performance, indexes are missing. It is very likely that a simple review of your indexing strategy will solve most of your problems.
On the other hand, it is also possible that too many indexes cause performance problems. An index is not always a solution to a problem—it can also do harm. Such performance problems are not as obvious as missing indexes. In addition to that, the use of too many indexes can usually be tolerated much longer than missing indexes (depending on the type of application, of course).
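To spot indexes that may be doing more harm than good, a query against pg_stat_user_indexes can serve as a starting point. This is only a sketch; the threshold is arbitrary, and the counters are meaningful only after the system has been running under a realistic workload for a while:

```sql
-- Indexes that have rarely been used since the statistics were
-- last reset; large, rarely used indexes deserve a closer look.
SELECT schemaname, relname, indexrelname, idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan < 10          -- arbitrary threshold
ORDER BY pg_relation_size(indexrelid) DESC;
```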
Once indexes have been fixed, a second and very common problem can be attacked. Many people forget about a thing called FILLFACTOR. Missing FILLFACTOR settings can cause performance problems in the case of update-intensive workloads.
How can those problems be detected? Here is a query showing how it works:
SELECT schemaname, relname, n_tup_ins, n_tup_upd,
       n_tup_hot_upd, n_tup_del
FROM pg_stat_user_tables
ORDER BY n_tup_upd DESC;
This query returns those tables that are facing the most UPDATE commands in the system. The important ratio here is the ratio of n_tup_upd to n_tup_hot_upd. Let's assume a row is updated. If the page containing the row is full, the row has to be copied to some other block; additional, expensive I/O is the logical consequence. If a proper FILLFACTOR setting (for example, FILLFACTOR = 70 for 70 percent) has been chosen, a so-called HOT UPDATE can be performed. In short, the new version of the row can be put into the same block as the original row. A HOT UPDATE is way cheaper than a traditional, normal UPDATE. Therefore, tables facing many UPDATE commands need a reasonably low FILLFACTOR setting to speed things up and to reduce I/O.
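To make this ratio directly visible, the statistics query can be extended. A sketch (the NULLIF guard avoids division by zero on tables that have never been updated):

```sql
-- Percentage of updates that were HOT updates per table;
-- a low percentage on update-heavy tables suggests that
-- lowering FILLFACTOR may help.
SELECT schemaname, relname, n_tup_upd, n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 2)
           AS hot_ratio_percent
FROM pg_stat_user_tables
WHERE n_tup_upd > 0
ORDER BY n_tup_upd DESC;
```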
The query shown in this section helps identify those tables that need special attention. It makes sense to work through the list top down and set FILLFACTOR on indexes and tables.
To change the FILLFACTOR to 70 percent, the following command can be used:
ALTER TABLE foo SET (FILLFACTOR=70);
If you want to change the FILLFACTOR of an index, use this:
ALTER INDEX foo_index SET (FILLFACTOR=70);
Keep in mind that a changed FILLFACTOR applies only to newly written pages; existing data is rearranged only when the table is rewritten (for example, by VACUUM FULL). Once indexes and FILLFACTOR settings have been defined, the majority of problems should have been fixed.
Next, an important step can be performed: finding slow queries. Some people prefer to check for slow queries first. However, I personally prefer to do some basic checks such as the ones shown up to here because those checks don't require restarts. Activating pg_stat_statements needs a restart, and therefore, it makes sense to activate it after picking the low-hanging fruit.
When activating pg_stat_statements in postgresql.conf, as outlined earlier in the book, it makes a lot of sense to set track_io_timing to true. Enabling track_io_timing will provide users with detailed information about which query has performed which kind of I/O.
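As a minimal sketch, the relevant settings in postgresql.conf look like this (changing shared_preload_libraries requires a server restart, and the extension must then be created in the database):

```
# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'
track_io_timing = on
```

After the restart, run CREATE EXTENSION pg_stat_statements; in the database you want to inspect.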
To figure out what is going on in the system, the following queries may be useful.
Find the most time-consuming queries, like this:
SELECT (SELECT datname FROM pg_database
        WHERE oid = x.dbid) AS datname,
       query, calls, total_time
FROM pg_stat_statements AS x
ORDER BY total_time DESC;
The most time-consuming queries will show up at the top of the listing and can be fixed one at a time. However, in some cases, total execution time is not your problem. What if I/O is the main limiting factor? In this case, the following fields need special attention:
 temp_blks_read    | bigint           |
 temp_blks_written | bigint           |
 blk_read_time     | double precision |
 blk_write_time    | double precision |
If the work_mem settings are too low, it can happen that many temporary files are written to disk, resulting in expensive disk writes that could easily be avoided. The temp_* fields are a good indicator of which queries cause these disk waits.
The blk_* fields give insight into how much I/O is needed by a query and how much time is spent on I/O.
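Both aspects can be inspected with one query against pg_stat_statements. A sketch, keeping in mind that blk_read_time and blk_write_time are populated only when track_io_timing is enabled:

```sql
-- Queries causing the most I/O time, along with their
-- temporary-file activity (a hint at undersized work_mem).
SELECT query, calls,
       temp_blks_read, temp_blks_written,
       blk_read_time + blk_write_time AS io_time_ms
FROM pg_stat_statements
ORDER BY blk_read_time + blk_write_time DESC
LIMIT 10;
```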
Working through the most expensive queries is essential to track down the last bottlenecks remaining in your systems. If postgresql.conf is configured sanely, going through the basic steps outlined in this chapter will remove a great portion of the most common performance problems.
The problems remaining usually need a more detailed look and maybe changes inside the application. Everything covered so far can be fixed using PostgreSQL's onboard means.