Configuration – getting it right the first time

An important aspect of setting up a highly available database is starting with a stable configuration that will not require a lot of future modifications. Even settings that can be changed during database operation can drastically alter its performance profile and behavior. Other settings may require a full database restart, which can lead to a short outage, depending on how resilient the frontend application is.

We want to avoid introducing instability into our PostgreSQL database from the very beginning. To that end, we are going to explore common (and perhaps, uncommon) configuration options to use in a highly available installation.

Getting ready

The PostgreSQL documentation describes all of the settings we will be discussing. We recommend that you visit the PostgreSQL.org website and read the documentation regarding server configuration. There's probably too much to absorb before continuing with this section, but we recommend that you familiarize yourself with the settings presented here.

We will approach each setting in the order commonly encountered in a recent postgresql.conf file generated in a new database.

How to do it...

Find these settings in the postgresql.conf file for the desired PostgreSQL instance and perform the following steps:

  1. Set max_connections to three times the number of processor cores on the server. Include virtual (hyperthreading) cores. Set shared_buffers to 4GB for servers with up to 64 GB of RAM. Use 8GB for systems with more than 64 GB of RAM.
  2. Set work_mem to 8MB for servers with up to 32 GB of RAM, 16MB for servers with up 64 GB of RAM, and 32MB for systems with more than 64 GB of RAM. If max_connections is greater than 400, divide this by two.
  3. Set maintenance_work_mem to 1GB.
  4. Set wal_level to hot_standby.
  5. Set checkpoint_segments to (system memory in MB / 20 / 16).
  6. Set checkpoint_completion_target to 0.8.
  7. Set archive_mode to on.
  8. Set archive_command to /bin/true.
  9. Set max_wal_senders to 5.
  10. Set wal_keep_segments to (3 * checkpoint_segments).
  11. Set random_page_cost to 2.0 if you are using RAID or SAN; 1.0 for SSD-based storage.
  12. Set effective_cache_size to half of the available system RAM.
  13. Set log_min_duration_statement to 1000.
  14. Set log_checkpoints to on.

How it works...

The commonly accepted formula for estimating max_connections is to take the number of processor cores, multiply them by two, and add disk spindles. With the relatively recent improvement of virtual cores, contributing factors such as SSD or other high-performance storage, and so on, we have a bit more freedom than we had earlier. In addition, even if we were to follow this estimation method, allowing a few extra connections can prevent highly visible connection rejections. A slightly lower performance is a small price to pay for availability.

The advice for shared_buffers is very different from the accepted practice of simply setting it to a quarter of the available RAM. We must consider buffer flushing and the synchronization time. In the case of a forced checkpoint, an amount of RAM equal to shared_buffers could be flushed to disk. This kind of write storm can easily cripple even high-end hardware. Highly available hardware often has far more RAM that could easily be flushed to a disk in an emergency. As such, we don't recommend that you use more than 8 GB until this situation improves substantially.

The work_mem setting is the amount of memory used by several temporary operations, including data sorts. Thus, a single query can consume multiple instances of this amount simultaneously. A good estimate is to assume that each connection will use up to four instances at a time. Setting this too high can lead to over-committed memory and cause the kernel to start killing processes until RAM is available. This can lead to PostgreSQL shutdown or a server crash, depending on what processes are stopped. Systems with very high connection counts (over 400) have increased risk for such a cascade, so we reduce work_mem in these cases.

The maintenance_work_mem setting is similar to the work_mem setting in that there can be multiple instances. However, this is reserved for background workers and maintenance such as vacuum, analyze, or create index activities. Starving these kinds of Memory operations can drastically increase the disk I/O, which can detrimentally affect query performance. For the cost of a few GBs of RAM, we get a more stable server.

The only reason we set wal_level to hot_standby is because in a highly available environment, we should have at least one online streaming standby. Other recipes will detail how we set these up, but this is the starting point.

The number of checkpoint_segments is not a simple thing to set. The calculation we used assumes up to 5 percent of system memory, which could be in transit as checkpoint data, and each segment is 16 MB in size. This time, we are trying to avoid forced checkpoints, because we ran out of segments during data acquisition.

We also want to reduce disk contention when possible, so we increase checkpoint_completion_target to 0.8. We don't want to overwhelm the disk subsystem, and this setting will cause PostgreSQL to spread writes over 80 percent of the time specified by checkpoint_timeout. By default, checkpoint_timeout is set to 5 minutes, which should suffice until we start working with larger batches of data or a busy OLTP system.

Next, we enable archive_mode by setting it to on. This setting can only be changed by restarting PostgreSQL, which we want to avoid. It's very likely that we will be using WAL archival in some respect, even if we don't yet know which method to use at this point. This means we also need to set archive_command to a command that always succeeds, or PostgreSQL will fill our logs with complaints that it couldn't archive old WAL files. Using /bin/true as a placeholder, we can change it when we choose an archival method.

We increase max_wal_senders because it's needed for certain synchronization and backup methods. Five is a good starting point, and we can always decrease it later; we definitely need more than zero. Additionally, wal_keep_segments is set to a relatively high number. In this case, we keep it up to three multiples of checkpoint_segments worth, in case a streaming standby falls behind.

If this count of segments is exhausted while the standby is behind, it can never catch up until the remaining WAL segments are provided some other way or the standby is re-imaged. We'll discuss this more when it's time to talk about WAL archival. This uses more disk space, so multiply the total number of these segments by 16 MB to estimate total disk usage.

The cost of reading a random disk block, as opposed to reading it sequentially, directly affects how the query planner decides to execute a query. By decreasing random_page_cost, we tell PostgreSQL that our storage's random read performance is very fast. A highly available server should have equally capable storage, so we lower this to something more reasonable. In the case of SSD or PCIe-based storage, there is effectively no difference between a random or sequential read, so the setting should reflect this.

The last setting that modifies server behavior is effective_cache_size, which tells the query planner how much RAM is probably being used by the OS to cache data. Generally, this makes PostgreSQL prefer indexes, because it's likely that the indexed data is in memory. As most Unix systems are fairly aggressive when caching, at least half of the available RAM on a dedicated database server will be full of cached data.

Finally, we want better logging. We increase the logging of slow queries by setting log_min_duration_statement to 1000. This is in milliseconds, so any query that runs for over one second will be logged. This helps us find slow queries without flooding the logs with thousands or even millions of entries by logging everything. Similarly, we want log_checkpoints enabled, because it provides extremely beneficial information on checkpoints. We can see how long they took, how frequently they ran, and also how much disk-sync time they required. We need to know if checkpoints start taking too long or occur too frequently so that some values can be adjusted. This setting really should be enabled in all PostgreSQL servers.

There's more...

Many, if not most of these settings, show up frequently in the PostgreSQL mailing lists. As a result, we used many of the prescribed values or formulas. However, several of these settings show up very often; a tool is available to estimate them by analyzing the server hardware and by taking parameter hints. The pgtune program is a contributed utility for automatically estimating many system-dependent server settings.

We urge caution if you are relying primarily on this utility. It is extremely liberal when estimating work_mem and shared_buffers and doesn't seem to modify checkpoint_segments at all. Still, we feel that the values it produces are much better than the defaults for larger servers, so feel free to experiment.

See also

There are many more configuration settings we haven't included. We recommend that you browse the PostgreSQL documentation to learn more. In addition, we've included a link to the pgtune utility, which may be useful in optimizing your postgresql.conf file:

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.224.54.136