Swapping active nodes with pgpool

With pgpool installed, we have an abstraction layer above PostgreSQL, which hides the active node from the client. This allows us to change the primary node so that we can perform maintenance, and yet we never have to actually stop the database.

This kind of design will work best when pgpool is not installed on one of the PostgreSQL servers, but it has its own dedicated hardware or virtual machine. This allows us full control over each PostgreSQL server, including the ability to reboot for kernel upgrades, without potentially disrupting pgpool.

Let's discuss the elements involved in switching the primary server with a replica so that we can have high availability in addition to regular maintenance.

Getting ready

Make sure pgpool is installed and configured according to the Installing pgpool and Configuring pgpool for master/slave mode recipes. We will need two nodes so that we can promote one and demote the other.

Next, we will ready the operating system so that pgpool can invoke remote commands. If we have two PostgreSQL servers at 192.168.56.10 and 192.168.56.20, we should execute these commands as the postgres system user on each, as shown:

ssh-keygen
ssh-copy-id 192.168.56.10
ssh-copy-id 192.168.56.20

Tip

The ssh-keygen command will prompt for a key password. This can make SSH keys more secure, but it also makes them extremely difficult to use within an automated context. For this and future SSH keys, use a blank password.
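
If this step needs to be scripted rather than typed interactively, the key can be generated without any prompts. The following is a minimal sketch that assumes the default key location and that we are logged in as the postgres system user; the final command simply confirms that passwordless login works:

# Generate a key with an empty passphrase (no prompts)
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa

# Distribute the public key to both PostgreSQL servers
ssh-copy-id 192.168.56.10
ssh-copy-id 192.168.56.20

# Verify passwordless SSH; this should print the remote host's name
ssh 192.168.56.20 hostname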

We will also use scripts located in the pgpool_scripts directory of the code for this chapter. Have these scripts available before continuing.

How to do it...

Assuming our database is located at /db/pgdata, follow all of these steps to enable and configure automatic and forced pgpool primary server migration:

  1. Copy the scripts from the pgpool_scripts directory of this book to the PostgreSQL cluster data directory.
  2. Execute this command as a root-capable user to make them executable:
    chmod a+x /db/pgdata/pgpool_*
    
  3. Execute the following at the command line as a root-capable user:
    sudo sed -i "s/'DISALLOW/'ALLOW/" /etc/pgpool/pgpool.conf
    
  4. Execute these commands as a root-capable user to enable pgpool control operations, where pass is a password defined for pgpool administration:
    mv /etc/pgpool/pcp.conf.sample /etc/pgpool/pcp.conf
    echo postgres:$(pg_md5 pass) >> /etc/pgpool/pcp.conf
    
  5. Edit the /etc/pgpool/pgpool.conf file and make the following changes:
    failover_command = '%D/pgpool_failover %d %P %h %H %D %R'
    recovery_1st_stage_command = 'pgpool_recovery'
  6. Execute this command as a root-capable user to restart pgpool:
    sudo service pgpool restart
    
  7. Detach the primary node from pgpool with this command, where pass is the password we created in step four:
    pcp_detach_node 10 192.168.56.10 9898 postgres pass 0
    
  8. Perform some fake maintenance as the postgres user on the primary node with this command:
    pg_ctl -D /db/pgdata status
    
  9. Reattach the primary node as a replica with these commands, again using pass as the pgpool control password:
    pcp_recovery_node 10 192.168.56.10 9898 postgres pass 0
    pcp_attach_node 10 192.168.56.10 9898 postgres pass 0
    

How it works...

pgpool depends on external helper scripts to perform remote operations on the servers it proxies. The pgpool source includes a few examples, but they use antiquated commands and may not work on our system. The scripts included with this book should work on most major Linux distributions. We copy them into the PostgreSQL data directory and mark them as executable because they must reside there for pgpool to invoke them.
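
If we want to confirm the scripts ended up where pgpool expects them, a quick listing of the data directory helps. This sketch assumes the cluster lives at /db/pgdata as in the recipe; based on the scripts referenced here, the output should at least include pgpool_failover, pgpool_recovery, and pgpool_remote_start, all marked executable. The chown line is an extra precaution of ours rather than a step from the recipe, since pgpool runs the scripts as the postgres user:

# List the helper scripts pgpool will invoke from the data directory
ls -l /db/pgdata/pgpool_*

# Optional precaution: ensure the postgres user owns the scripts
sudo chown postgres:postgres /db/pgdata/pgpool_*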

Next, we enable failover on all nodes by changing any backend marked DISALLOW_TO_FAILOVER to ALLOW_TO_FAILOVER with a quick command-line substitution. Without this change, pgpool will not perform any migrations, no matter how many nodes have crashed or how many times we request a failover.
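
After the sed command from step 3, the backend definitions in pgpool.conf should resemble the following. This is only a sketch, assuming two backends numbered 0 and 1 at the addresses used throughout this chapter:

backend_hostname0 = '192.168.56.10'
backend_flag0 = 'ALLOW_TO_FAILOVER'

backend_hostname1 = '192.168.56.20'
backend_flag1 = 'ALLOW_TO_FAILOVER'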

Next, pgpool won't let us use its control commands until we create a user and password for them. This user is not related to any PostgreSQL or operating system user; we use postgres for simplicity, but any username will work. We hash the password with pg_md5 so that pgpool can compare login attempts against the stored hash.
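
For reference, each entry in pcp.conf is a username and an MD5 hash separated by a colon, one pair per line. We can inspect what step 4 produced like this (the hash itself will be a 32-character string specific to the chosen password):

# Show the control users defined for pgpool, ignoring comment lines
grep -v '^#' /etc/pgpool/pcp.conf

# Expected format, one entry per line:
# postgres:<32-character md5 of the chosen password>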

Then, we need to tell pgpool that we defined scripts for failover and recovery operations. We do that by setting failover_command and recovery_1st_stage_command properly in pgpool.conf. Perceptive readers may note that we didn't change any settings to include the pgpool_remote_start script. This is because pgpool specifically seeks it by name. Don't forget to install it with the others. After we restart pgpool, all of our changes are incorporated, and failover should work as expected.
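
As a reference for the failover_command line above, pgpool substitutes each placeholder before invoking the script. The following skeleton only names the arguments as they arrive; it is a sketch, not the full pgpool_failover script shipped with this chapter:

#!/bin/bash
# Invoked as: %D/pgpool_failover %d %P %h %H %D %R

failed_node_id="$1"      # %d - ID of the detached or failed node
old_primary_id="$2"      # %P - ID of the old primary node
failed_host="$3"         # %h - host name of the failed node
new_master_host="$4"     # %H - host name of the new master node
failed_cluster_dir="$5"  # %D - data directory of the failed node
new_cluster_dir="$6"     # %R - data directory of the new master

# The real script would promote the new master and stop the old primary,
# for example (sketch only):
# ssh "$new_master_host" "pg_ctl -D $new_cluster_dir promote"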

By calling the pcp_detach_node command against pgpool's control port (9898), we remove the indicated node from the active list of available servers. If the detached node is the primary, pgpool automatically promotes the replica to act as the new primary. Our version of the failover script also shuts down the old primary PostgreSQL server to prevent unpooled connections from making changes that would never reach the newly promoted server.
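
We can watch this happen from the client side as well. pgpool answers a SHOW pool_nodes command on its regular listener; this sketch assumes pgpool is reachable at the same address used for the pcp commands and on its default port of 9999. The detached node should report a down status:

# Ask pgpool (not PostgreSQL directly) for its view of the backend nodes
psql -h 192.168.56.10 -p 9999 -U postgres -c "SHOW pool_nodes;"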

At this point, we can do anything to the detached PostgreSQL server, including upgrading the PostgreSQL software to the latest bugfix release of our current version. Later, we use pcp_recovery_node to tell pgpool to refresh node zero with a copy of the node currently serving as the primary. If that command succeeds, we reattach the node to the pool by invoking pcp_attach_node.
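
Once the node is back in the pool, it should also be streaming from the new primary again. Assuming we can connect to the current primary as the postgres user, a quick look at pg_stat_replication there confirms it:

# On the current primary: every attached replica should appear here
psql -U postgres -c "SELECT client_addr, state FROM pg_stat_replication;"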

There's more...

If pgpool doesn't seem to call our scripts, we may need to install the pgpool_recovery extension. Assuming that we still have the pgpool source available, follow these steps as a root-capable user to install the pgpool PostgreSQL extension library:

cd pgpool-II-3.3.2/sql/
make
sudo make install

Then, connect to the template1 PostgreSQL database and install the pgpool_recovery extension with the following SQL query:

CREATE EXTENSION pgpool_recovery;
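
To verify that the extension registered, we can list the installed extensions in template1; pgpool_recovery should appear in the output:

psql template1 -c "SELECT extname, extversion FROM pg_extension;"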
