Making use of asynchronous replication

While point-in-time recovery is nice, it might not be enough. In many cases, people want an up-to-date standby server that can also serve as a read-only slave. A feature called streaming replication is exactly what is needed in this case.

Setting up streaming replication is easy and can be performed in a handful of steps:

  1. Adapt postgresql.conf on the master server.
  2. Change pg_hba.conf to allow remote base backups.
  3. Run pg_basebackup to get an initial copy.
  4. Fire up the slave.

To set up postgresql.conf, the following settings are needed:

wal_level = hot_standby
max_wal_senders = 5
hot_standby = on

The first two settings are mandatory on the master. The wal_level field tells the server to create enough xlog to allow streaming replication, and max_wal_senders will indicate the number of streams the master is allowed to keep open.

Tip

If you are planning to run one slave, consider setting max_wal_senders to 3 or higher. During streaming, only one WAL sender process is used, but during the initial copy, up to two connections may be required. Setting this value too low can force an unnecessary database restart later, because changing max_wal_senders requires a restart.

The third setting changed here is hot_standby. Basically, this is a slave setting, and the master will ignore it. The reason it has been set is that in the next step (base backup), the entire instance, including the configuration, will be sent to the standby servers. If this variable has already been set on the master, there is no need to change it later on the slave (which is somewhat convenient).
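Once the server has been restarted, the settings can be verified from any psql session. This is just a quick sanity check; the expected values are the ones shown in the configuration above:

```
SHOW wal_level;        -- should return: hot_standby
SHOW max_wal_senders;  -- should return: 5
```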

Once postgresql.conf has been changed, it is time to attack pg_hba.conf. The important point here is that you have to precisely tell PostgreSQL who is allowed to stream the transaction log and who is not. IPs allowed must be listed in pg_hba.conf, like this:

host   replication   all   192.168.0.34/32        trust

In this case, a slave running on 192.168.0.34 is allowed to fetch the transaction log from the master without having to send a password. For the sake of our example, this is fine.

If you want to test replication on the same box as the master, the following configuration may be useful to you:

local   replication   all                         trust
host    replication   all    127.0.0.1/32         trust
host    replication   all    ::1/128              trust

Once postgresql.conf and pg_hba.conf have been changed, the server can be restarted.
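A restart (rather than a mere reload) is needed because wal_level and max_wal_senders cannot be changed on the fly. The data directory path in the following sketch is an assumption; adjust it to your installation:

```
pg_ctl -D /data restart
```

A plain pg_hba.conf change on its own would only require a reload.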

Working with pg_basebackup

In the next step, it is possible to fetch an initial copy of the database instance. Remember that in the previous section, this very step was performed using pg_start_backup and pg_stop_backup. In this section, the second method, pg_basebackup, will be used.

To prepare the slave for replication, these steps will be necessary:

mkdir /slave
chown postgres:postgres /slave
chmod 700 /slave

This example assumes that the server will be running as postgres and that the target directory is /slave.

To create the initial backup, the following command can be executed on the slave:

pg_basebackup -D /slave \
  -h master.server.com --checkpoint=fast \
  --xlog-method=stream -R

Make sure you replace the host name with the name or IP address of your master server.

Here are the meanings of those command-line flags:

  • -D: This tells pg_basebackup where to put the initial backup.
  • -h: This tells the system where to look for the master.
  • --checkpoint=fast: This indicates that the base backup will initiate a checkpoint on the master. This makes sense; otherwise, small backups will take longer than necessary.
  • --xlog-method=stream: This will force pg_basebackup to open a second connection and stream enough xlog while the backup is running. The advantage here is that the newly created base backup is already self-sufficient. It can be started straight away. It also ensures that a fresh backup can be started without falling back to using an archive.
  • -R: This is a fairly new feature and will create a reasonable recovery.conf file for streaming replication.

The default configuration file created by -R will contain two lines:

standby_mode = 'on'
primary_conninfo = '…'

The standby_mode = on setting means that the server should keep streaming even if no xlog is around anymore (it should just wait for more).

The primary_conninfo setting points to the master and contains the connection string to the desired master server. These two settings are enough to make streaming work.
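The exact connection string generated by -R depends on the options passed to pg_basebackup. As a sketch, a recovery.conf file for the example above might look like this (host name and user are placeholders):

```
standby_mode = 'on'
primary_conninfo = 'host=master.server.com port=5432 user=postgres'
```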

Firing up replication

Once the base backup has been created, the server can be started. If, for testing purposes, your master and your slave are in the same box, you can start PostgreSQL on a different port without having to modify the configuration. It works as shown in the next listing:

$ pg_ctl -D /slave/ -o "--port=5433" start

If your slave has been configured properly, some information will show up in the log:

server starting
LOG:  database system was interrupted; last known up at
   2014-12-08 15:07:39 CET
LOG:  creating missing WAL directory
   "pg_xlog/archive_status"
LOG:  entering standby mode

To the server, the entire endeavor looks like a crash. It tells us when it feels that it stopped working (that is, at the time of the base backup).

Once PostgreSQL enters standby mode, you are on the right path. PostgreSQL will start replaying stuff, and eventually reach a consistent state:

LOG:  redo starts at 0/D000028
LOG:  consistent recovery state reached at 0/D0000F0

Always check whether a consistent state has been reached or not. If you have no consistent state, it means there is trouble ahead.

LOG:  database system is ready to accept read only 
  connections
LOG:  started streaming WAL from primary at 0/E000000 
  on timeline 1

Also, make sure that you have entered streaming mode.

Voilà! The server is up and running now. To check whether replication is working safely, you can rely on a system view called pg_stat_replication. It will contain one line per slave:

test=# \d pg_stat_replication
          View "pg_catalog.pg_stat_replication"
      Column      |           Type           | Modifiers 
------------------+--------------------------+----------
 pid              | integer                  | 
 usesysid         | oid                      | 
 usename          | name                     | 
 application_name | text                     | 
 client_addr      | inet                     | 
 client_hostname  | text                     | 
 client_port      | integer                  | 
 backend_start    | timestamp with time zone | 
 backend_xmin     | xid                      | 
 state            | text                     | 
 sent_location    | pg_lsn                   | 
 write_location   | pg_lsn                   | 
 flush_location   | pg_lsn                   | 
 replay_location  | pg_lsn                   | 
 sync_priority    | integer                  | 
 sync_state       | text                     | 

These fields are highly important, and it is worth explaining them in more detail.

The first field contains the process ID of the process serving the slave on the master. Then come the usesysid field and the name of the user connected to the system. This is the user the client is connected as. Then there is an application_name field. The purpose of the application_name field is to give a connection a real name. This is important for debugging (for normal applications) and, as you will see in the next section, for synchronous replication.

Then there are a couple of fields showing where a connection comes from (client_*). Those fields will tell you which slave the entire line is about. The backend_start field tells us when the slave started to stream.

After the backend_xmin field comes the state the connection is in. If everything is normal, it should be set to streaming. Then there are four location fields, which tell the administrator how much xlog the slave has already received, written, flushed, and replayed.

Finally there is the sync_state field. It tells us whether the server is in synchronous or asynchronous replication mode.
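For day-to-day monitoring, a small query on the master covering the most interesting columns of this view is often all that is needed. This is merely one possible selection of columns:

```
SELECT client_addr, state, sent_location,
       replay_location, sync_state
FROM   pg_stat_replication;
```

If state shows streaming and replay_location is close to sent_location, the slave is keeping up.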

For a short, step-by-step guide about streaming replication, refer to http://www.cybertec.at/wp-content/uploads/PostgreSQL_5_min_streaming_replication.pdf.

Promoting slaves to masters

Promoting a slave to a master is an easy task. The following command will do it:

pg_ctl -D /slave promote

The slave will receive the signal and turn itself into a master. The cool thing here is that even existing read-only connections will automatically be turned into read/write connections. There is not even any need to reconnect; PostgreSQL will do all of this for you.

Now the new master is totally independent of the old master. Those two boxes won't synchronize anymore.
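Whether the promotion has actually completed can be checked with a simple function call on the promoted server:

```
SELECT pg_is_in_recovery();  -- returns false once promotion is complete
```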

Note

Note that a master cannot be transformed back into a slave easily. A complete resync should be performed to turn the broken, old master into a new slave replica.

Making replication safer

So far a simple replication setup has been shown. But what happens when a user takes a base backup and does not connect the slave to the master instantly? Nothing prevents the master from recycling its replication log. If the slave comes late, the transaction log needed is not there anymore. The same can happen in the case of slow connections. A user might load data into the master at 200 MB per second but the network can only handle 100 MB per second. The slave falls behind, and again, nothing prevents the master from recycling its xlog. To solve this kind of problem, a simple mechanism is there—wal_keep_segments.

If wal_keep_segments is set to, say, 100 in postgresql.conf, it means that the master will keep 100 * 16 MB more of transaction log around than it would need to repair itself in case of a crash. This little buffer gives the slave some room to breathe, and it defines the maximum amount of data by which the slave is allowed to fall behind the master without losing sight of the master. My personal recommendation is to always set wal_keep_segments to a reasonable value to ensure that the slave can always keep up with the master in the event of excessively slow connections, reboots on the slave side, and so on.
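As a config sketch, the setting described above goes into postgresql.conf on the master; 100 is an example value, not a recommendation:

```
# 100 segments of 16 MB each keep roughly 1.6 GB
# of extra xlog around for lagging slaves
wal_keep_segments = 100
```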

One more issue that can be observed in many cases is the appearance of conflicts. In PostgreSQL, the slave can be used to serve read-only transactions. It can happen from time to time that the slave does something that conflicts with the intentions of the master. As an example, let's assume that the slave is running a long SELECT statement. In the meantime, somebody on the master performs a DELETE statement and instantly fires a VACUUM command to clean up those rows. The SELECT statement on the slave might still need this data for its purposes. However, the xlog tells the system to clean up those rows. Thus, a conflict has happened. The slave will now wait for max_standby_streaming_delay (30 seconds by default) before it simply kills the conflicting query, if the transaction in doubt does not finish earlier.

To prevent most conflicts, it is possible to set hot_standby_feedback = on. This will make the slave send its oldest transaction ID (xmin) to the master. The master will then know that there is still a query running on the slave observing certain data. Therefore, VACUUM will be delayed until the conflict cannot happen anymore.
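The two settings mentioned here both belong in postgresql.conf on the slave; the values are examples, not recommendations:

```
max_standby_streaming_delay = 30s  # how long to wait before killing
                                   # a conflicting query
hot_standby_feedback = on          # send the slave's xmin to the master
                                   # to delay conflicting VACUUMs
```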
