Upgrading to asynchronous replication

Since the release of PostgreSQL 9.0, DBAs have had access to asynchronous streaming replication. Unlike the older hot standby methods used in earlier versions, replica servers can connect to an upstream PostgreSQL server and consume data modifications directly. With low network latency and fast transactions, this means that it is fairly common for streaming replicas to lag behind the master by only a few milliseconds.

In the context of high availability, this means we can scale horizontally by copying the database to multiple servers. Of course, this means that we need to copy the entire database to each server. For small-to medium-sized database instances, this is a relatively minor requirement. This also means that we can produce up-to-date backups, perform ad hoc queries on practically live data, and aggregate information into reports without disrupting our primary database.

This recipe will explain how to set up a streaming asynchronous database replica and explore some of the hidden caveats of doing so.

Getting ready

We will be continuing the work we performed in the Setting up a hot standby recipe, so please refer to that recipe to build a working hot standby. We will alter the standby setup to include streaming replication, and better security.

How to do it...

For this scenario, the server at 192.168.56.10 is the primary PostgreSQL server, and 192.168.56.20 will be the asynchronous replica. Follow these steps to build a PostgreSQL asynchronous replica:

  1. Give the rep_user user a password with this SQL statement:
    ALTER USER rep_user WITH PASSWORD 'newpass';
  2. On the primary server, modify the pg_hba.conf line and remove any references to the rep_user user. Then, add this line:
    host   replication   rep_user   192.168.56.20/32   md5
  3. Reload the configuration files on the primary server with the following command as the postgres user:
    pg_ctl -D /db/pgdata reload
    
  4. On the replica server, create a file named .pgpass in the postgres user's home directory with the following contents:
    192.168.56.10:*:replication:rep_user:newpass
  5. Alter the .pgpass file to have the correct permissions with this command:
    Chmod 600 ~/.pgpass
    
  6. Modify the recovery.conf file on the recovery server to match these lines:
    standby_mode = on
    primary_conninfo = 'host=192.168.56.10 user=rep_user'
    restore_command = 'cp /db/pg_archived/%f %p 2>/dev/null'
  7. Reload the configuration files on the streaming replica server with the following command as the postgres user:
    pg_ctl -D /db/pgdata reload
    
  8. Confirm that the standby is connected by executing this SQL on the primary PostgreSQL server:
    SELECT client_addr, usename, state
      FROM pg_stat_replication;

How it works...

Using trust authentication is not generally a recommended practice. It is one thing to copy the database without a password once, but quite another to leave a long-term security hole for all database replicas. This means it is time to ensure that the rep_user user has a password. We also need to change pg_hba.conf to reflect the fact that we want to use regular md5 authentication instead of trust. Once we reload the configuration files on the primary server, we move on to the streaming replica.

To get into the practice of using .pgpass files, we create one on the replica server for the rep_user user. The line we created in this file will send our desired password when the sections match; in this case, if we connect to 192.168.56.10 on any port to the replication database as the rep_user user, authentication will succeed automatically If any of these are different, the PostgreSQL client libraries will not send a password, and the client will receive an error. This is a fairly easy way to automate password submissions securely. PostgreSQL will also ignore this file if the permissions are wrong, so we set the control flags with chmod so that only the postgres user can access it.

Next, we rewrite the contents of the recovery.conf file to include primary_conninfo. This line is used to specify the connection information for establishing streaming replication. Since our password is in the .pgpass file, we don't need to enter it here. We also removed pg_standby in favor of a regular cp command with the errors suppressed. Now that our primary method of WAL consumption is directly from another server, we only need WAL files from /db/pg_archived as a fail back in case the stream is disrupted.

Tip

Why do we use .pgpass instead of entering the password in the recovery.conf file? It is very common for system automation tools to distribute configuration files to dozens or even hundreds of servers. Using .pgpass, we can settle on and redistribute passwords easily. In addition, tools that build recovery.conf will not need to know the password for the replication user.

Once we reload the standby server, it should become a streaming replica instead of a regular hot standby. We can confirm this with the SQL statement that checks the pg_stat_replication view on the primary server. We should get output similar to this:

How it works...

There's more...

Though streaming replication has existed since PostgreSQL 9.0, recent changes to 9.3 include two very helpful additional features:

  • The pg_basebackup tool puts PostgreSQL in backup mode by invoking the pg_start_backup() function. As this function writes to the database, it normally can't be used on a streaming server. However, the developers made changes in 9.3 that make it possible to use pg_basebackup on standby servers. This can greatly simplify the backup process and reduce overhead on the primary server.
  • In the event that we have several streaming replicas, older versions of PostgreSQL required replica servers to connect directly to the primary server. In 9.3 and above versions, PostgreSQL allows streaming replicas to subscribe to other replicas. With this, we can further reduce strain on the primary database server by offloading replication duties to a topology of alternate servers.

See also

There are good resources within the PostgreSQL documentation and Wiki regarding streaming replication. For more information, please visit these URLs:

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
3.144.244.250