Life is not always just black or white; sometimes there are also some shades of gray. In some cases, streaming replication might be just perfect. In other cases, file-based replication and PITR are all you need. But there are also many cases in which you need a little bit of both. One example: if you interrupt replication for a longer period of time, you might want to resync the slave using the archive instead of performing a full base backup again. It can also be useful to keep an archive around for later investigation or replay operations.
The good news is that PostgreSQL allows you to actually mix file-based and streaming-based replication. You don't have to decide whether streaming- or file-based is better; you can have the best of both worlds at the very same time.
How can you do that? In fact, you have seen all the ingredients already; we just have to put them together in the right way.
To make this easier for you, we have compiled a complete example for you.
On the master, we can use the following configuration in postgresql.conf:
wal_level = hot_standby                 # minimal, archive, or hot_standby
                                        # (change requires restart)
archive_mode = on                       # allows archiving to be done
                                        # (change requires restart)
archive_command = 'cp %p /archive/%f'   # command to use to archive a logfile segment
                                        # placeholders: %p = path of file to archive
                                        #               %f = file name only
max_wal_senders = 5                     # we used five here to have some spare capacity
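The simple cp-based archive_command shown above will happily overwrite an existing segment in the archive. A common hardening, also suggested in the PostgreSQL documentation, is to refuse to overwrite. The following sketch spells out that logic as a small shell function; the /archive path is just the example location used throughout this section:

```shell
# Example archive location; adjust to your setup.
ARCHIVE_DIR="${ARCHIVE_DIR:-/archive}"

# Mirrors this postgresql.conf setting:
#   archive_command = 'test ! -f /archive/%f && cp %p /archive/%f'
# PostgreSQL substitutes %p (full path of the segment) and %f (file
# name only). A non-zero exit code tells PostgreSQL to retry later.
archive_segment() {
    local src="$1"      # %p: path of the segment file to archive
    local name="$2"     # %f: file name only
    test ! -f "$ARCHIVE_DIR/$name" && cp "$src" "$ARCHIVE_DIR/$name"
}
```

If the segment already exists in the archive, the function fails instead of overwriting it, so the master will keep the segment and retry, which is exactly what you want when the archive's history must stay consistent.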
In addition to that, we have to add some config lines to pg_hba.conf to allow streaming. Here is an example:
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication   hs                    trust
host    replication   hs   127.0.0.1/32     trust
host    replication   hs   ::1/128          trust
host    replication   all  192.168.0.0/16   md5
In our case, we have simply opened an entire network to allow replication (to keep the example simple).
Once we have made those changes, we can restart the master and take a base backup as shown earlier in this chapter.
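As a reminder of what that base backup step might look like (the host name, user, and target directory here are placeholders, not values prescribed by this setup):

```shell
# Sketch of a base backup using pg_basebackup; adjust host, user,
# and target directory to your environment.
pg_basebackup -D /var/lib/pgsql/slave_data \
    -h master.example.com -U hs \
    --checkpoint=fast --xlog-method=stream
```

The --xlog-method=stream option makes pg_basebackup stream the transaction log created during the backup alongside the data, so the resulting directory is self-contained.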
Once we have configured our master and taken a base backup, we can start to configure our slave system. Let us assume for the sake of simplicity that we are only using a single slave; we will not cascade replication to other systems.
We only have to change a single line in postgresql.conf on the slave:
hot_standby = on # to make the slave readable
In the next step, we can write a simple recovery.conf file and put it into the main data directory:
restore_command = 'cp /archive/%f %p'
standby_mode = on
primary_conninfo = ' host=sample.postgresql-support.de port=5432 '
trigger_file = '/tmp/start_me_up.txt'
When we fire up the slave, the following things will happen:

1. PostgreSQL will call restore_command to fetch the transaction log from the archive.
2. It will do so until no more transaction log can be found in the archive.
3. PostgreSQL will then try to connect to the master and stream XLOG directly.
4. If the stream is interrupted, it will again call restore_command to fetch the transaction log from the archive, and return to streaming once it has caught up.
to fetch the transaction log from the archive.You can keep streaming as long as necessary. If you want to turn the slave into a master, you can again use pg_ctl promote
or the trigger_file
defined in recovery.conf
.
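The trigger-file variant of the promotion is literally a single touch, matching the trigger_file setting in the recovery.conf above (pg_ctl promote, the alternative, needs a running slave and its data directory):

```shell
# Promote the slave by creating the trigger file named in recovery.conf
# (trigger_file = '/tmp/start_me_up.txt').
# Alternatively: pg_ctl -D <data_directory> promote
TRIGGER_FILE="${TRIGGER_FILE:-/tmp/start_me_up.txt}"
touch "$TRIGGER_FILE"
```

As soon as the slave notices the file, it finishes replaying whatever XLOG it still has and opens for writes.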
The most important advantage of this dual strategy is that you can create a cluster that offers a higher level of security than plain streaming-based or plain file-based replay alone. If streaming does not work for some reason, you can always fall back to files.
Let us now discuss some typical error scenarios in such a dual-strategy cluster:
If the network is dead, the master might not be able to execute archive_command successfully anymore. The history of the XLOG files must remain continuous, so the master has to queue up those XLOG files for later archiving. This can be a dangerous (yet necessary) scenario, because you might run out of space for XLOG on the master if the stream of files is interrupted permanently.
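Because a stuck archive_command silently piles up segments on the master, it is worth watching the transaction log directory. Here is a minimal monitoring sketch; the directory name (pg_xlog, as used in the PostgreSQL versions this setup targets) and the threshold are assumptions you should adapt:

```shell
# Warn when the number of queued XLOG segments exceeds a threshold,
# which usually means archiving is failing or the archive is unreachable.
check_xlog_backlog() {
    local xlog_dir="$1"      # e.g. /var/lib/pgsql/data/pg_xlog
    local max_segments="$2"  # alert threshold, e.g. 100
    local count
    count=$(ls -1 "$xlog_dir" 2>/dev/null | wc -l | tr -d ' ')
    if [ "$count" -gt "$max_segments" ]; then
        echo "WARNING: $count XLOG files in $xlog_dir - archiving may be stuck"
        return 1
    fi
    return 0
}
```

Wiring this into cron or your monitoring system gives you an early warning well before the master actually runs out of disk space.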
If the streaming connection fails, PostgreSQL will try to keep syncing itself through the file-based channel. Should the file-based channel also fail, the slave will sit there and wait for the network connection to come back. It will then try to fetch the XLOG and simply continue once this is possible again.
Rebooting the slave will not do any harm as long as the archive has the XLOG to bring the slave back up. The slave will simply start up again and try to get the XLOG from any source available. There won't be corruption or any other problem of this sort.
If the master reboots, the situation is pretty uncritical as well. The slave will notice through the streaming connection that the master is gone. It will try to fetch the XLOG through both channels, but it won't be successful until the master is back. Again, nothing bad such as corruption can happen. Operations can simply resume after the reboot on both boxes.
If the XLOG in the archive becomes corrupted, we have to distinguish between two scenarios:
Surely, there is a lot more that can go wrong, but given those likely cases, you can see clearly that the design has been made as reliable as possible.