Sometimes, in order to provide acceptable data durability, a high availability configuration must utilize synchronous commits. Beginning with PostgreSQL 9.1, database servers can now refuse to commit a transaction until the data is located on at least one alternate server. Unlike asynchronous replication where this is optional, synchronous replicas enforce this requirement to a fault.
Discussions in the PostgreSQL mailing list suggest that there is a long-standing misconception that synchronous replication is similar to RAID-1 operation. In RAID-1, the same exact data exists on two disks (or two disk sets), and if one of the pair fails, it continues to operate in degraded mode until the problem is addressed. This is absolutely not the case with PostgreSQL synchronous replication.
Unlike a RAID-1, PostgreSQL replicas can exist on different servers, on different networks, or even in different countries. PostgreSQL synchronous replication is a guarantee that data is written to at least two servers. Despite the necessary increase in latency to confirm this, the guarantee is upheld at all times.
This recipe is for databases that need this kind of extreme durability.
We will be continuing the work we performed in the Upgrading to asynchronous replication recipe, so please refer to that section to build a working asynchronous replica. We will alter the standby setup to include synchronous streaming replication.
For this scenario, the server at 192.168.56.10
is still the primary PostgreSQL server. Follow these steps to change an asynchronous PostgreSQL server into a synchronous replica:
recovery.conf
file on the recovery server to match these lines:standby_mode = on primary_conninfo = 'host=192.168.56.10 user=rep_user application_name=node2' restore_command = 'cp /db/pg_archived/%f %p 2>/dev/null'
postgres
user:pg_ctl -D /db/pgdata restart
synchronous_standby_names
setting in the postgresql.conf
file on the primary server to read the following:synchronous_standby_names = 'node2'
postgres
user:pg_ctl -D /db/pgdata reload
SELECT client_addr, state, sync_state, application_name FROM pg_stat_replication;
Promoting an asynchronous standby server to synchronous mode is actually a fairly simple procedure. We begin by modifying the primary_conninfo
setting in the standby's recovery.conf
file to include the application_name
value. PostgreSQL differentiates replicas by their stated application name, so if we change this, we can specifically target that particular replica. Any other synchronous standby nodes should be assigned different names.
Once we restart the PostgreSQL server on the streaming standby, it will reconnect to the primary server with the new application_name
that we assigned. From this point onward, we can refer to the standby server as node2
. Thus, when we alter the synchronous_standby_names
variable in the primary server's postgresql.conf
file, we use the same name there.
Any time we want to change the synchronous_standby_names
variable, we merely need to tell PostgreSQL to reload its configuration files. Thus, after we do this, node2
should now act as a synchronous standby server. Any transaction will only commit if it can write to this server as well as the primary one.
This last point is extremely important. If, for any reason, the synchronous standby becomes unavailable, the primary server will stop writing to the database as well! If you are performing maintenance on the secondary server, we suggest that you set synchronous_standby_names
to a blank value and reload the PostgreSQL server. This will break the synchronous guarantee until the standby can be reconnected.
Once we have reloaded the primary server's configuration files, we can check the pg_stat_replication
view again to observe how streaming is currently functioning. After executing the query, we should see something like this:
As we can see in this example, the primary server sees node2
as a synchronous streaming replica.
We really want to confirm if the streaming replication works as advertised. To do this, let's shut down the standby server with this command:
pg_ctl -D /db/pgdata stop -m fast
Then, try to write to the primary server. This simple SQL statement should wait indefinitely:
CREATE TABLE foo ( bar INT );
If we then restart the streaming replica using the following command, we should see the transaction complete:
pg_ctl -D /db/pgdata start
As you might imagine, this can be problematic in true high availability architectures that handle thousands of transactions per second. As such, we don't actually recommend that you use synchronous replication on OLTP servers. As these comprise the bulk of highly available PostgreSQL clusters, opportunities to take advantage of this level of data durability are somewhat slim.
However, synchronous commit is actually somewhat optional. If we want to try the experiment again, we can first issue this SQL statement before trying a basic write query:
SET synchronous_commit TO false;
This disables synchronous replication temporarily for the current session. Subsequent write queries in this connection should succeed normally as if the remote server was a standard asynchronous copy.
There are good resources within the PostgreSQL documentation and Wiki regarding streaming replication. For more information, please visit these URLs:
52.15.80.101