Managing software upgrades

Software in the server space is normally fairly stable. However, security updates and bug fixes must still be applied. Highly available servers can't be stopped often, but skipping important upgrades invites crashes or security breaches, which would be far more serious than a brief planned outage.

How, then, do we ensure that updates can be applied safely while maintaining consistent availability? Once again, this comes down to preparation. We prepare by keeping duplicate online copies of the data and by abstracting the access path. With an architecture like this in place, we can switch to a backup server while upgrading the primary; thus, the database never actually goes offline.

We'll explore this scenario here, especially as it will be a very common one.

Getting ready

For this section, we need at least one extra server with PostgreSQL installed. This server should be running a copy of our database. We can follow the Managing system migrations recipe to build a copy if we don't already have one available. We will also use ideas introduced in the Exploring the magic of virtual IPs recipe. Reviewing these recipes now might be a good idea.

How to do it...

For this scenario, assume that we have two servers with the addresses 192.168.1.10 and 192.168.1.20, where 192.168.1.10 is currently the primary server. In addition, we have a virtual IP address of 192.168.1.30 on the eth0:pgvip Ethernet device. To upgrade the PostgreSQL software on both nodes, follow these steps:

  1. Stop the database copy on 192.168.1.20 as the postgres user using this command:
    pg_ctl -D /path/to/database stop -m fast
    
  2. Perform any necessary software upgrades. For example, to upgrade a Debian or Ubuntu server to the latest PostgreSQL 9.3, use the following command as a root-capable user on 192.168.1.20:
    sudo apt-get install postgresql-9.3
    
  3. Start the database copy on 192.168.1.20 as the postgres user:
    pg_ctl -D /path/to/database start
    
  4. As a root-capable user on 192.168.1.10, stop the virtual IP address with the following command:
    sudo ifconfig eth0:pgvip down
    
  5. As a database superuser, issue a checkpoint to the database on 192.168.1.10:
    CHECKPOINT;
  6. Connect to PostgreSQL on 192.168.1.10 and issue the following query to check replication status:
    SELECT sent_location, replay_location
      FROM pg_stat_replication
     WHERE usename = 'rep_user';
  7. Periodically, repeat the preceding query until sent_location and replay_location match.
  8. As postgres, stop the PostgreSQL service on 192.168.1.10 with this command:
    pg_ctl -D /path/to/database stop -m fast
    
  9. As postgres, promote the PostgreSQL replica on 192.168.1.20 with this command:
    pg_ctl -D /path/to/database promote
    
  10. As a root-capable user on 192.168.1.20, start the virtual IP address with the following command:
    sudo ifconfig eth0:pgvip 192.168.1.30 up
    
  11. If necessary, inform the developers and support staff to restart the application's database connection pools.
  12. Repeat any necessary software upgrades on 192.168.1.10 as already performed on 192.168.1.20.
  13. Erase the existing database on 192.168.1.10 as the postgres user this way:
    rm -Rf /path/to/database
    
  14. Use pg_basebackup on 192.168.1.10 to make a copy of the upgraded database on 192.168.1.20:
    pg_basebackup -U rep_user -h 192.168.1.20 -D /path/to/database
    
  15. Create a file named recovery.conf in /path/to/database with the following contents:
    standby_mode = 'on'
    primary_conninfo = 'host=192.168.1.20 port=5432 user=rep_user'
  16. Start the newly created copy as the postgres user on 192.168.1.10 using the following command:
    pg_ctl -D /path/to/database start
    

How it works...

Though this process has many steps, it is actually quite straightforward. The first step is to upgrade the mirror copy of the database under the assumption that it is not actively used by applications or users. The role of the secondary node in this case is to act as an emergency backup for the primary database node. Because it isn't being used, we can stop the database, perform any necessary updates, and then start it again so that it can resynchronize with the primary.
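
If we want to confirm that the restarted copy has caught up before moving on, checking the replay lag on the secondary is a quick sanity test. This is only a sketch, and it assumes we can run psql as the postgres user on 192.168.1.20:

    -- Run on the replica (192.168.1.20). A small or steadily shrinking
    -- interval means the copy has resynchronized with the primary.
    SELECT now() - pg_last_xact_replay_timestamp() AS replay_lag;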

Afterwards, we isolate the primary database node by disabling the virtual IP address. This allows the streaming replica to replay the last few active transactions so that it's fully synchronized before we make it the new primary database. We accomplish this by issuing CHECKPOINT and watching the replication status until it matches on both systems. When the replication status matches, we can stop the primary PostgreSQL server; its role in the process is complete.
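
To give a concrete idea of what watching the replication status looks like, the following loop polls pg_stat_replication on the primary until the replica has replayed everything that was sent. It is only a sketch; it assumes the postgres user can connect with psql without a password and that rep_user is the replication role used in this recipe:

    #!/bin/bash
    # Poll the primary (192.168.1.10) until sent_location and
    # replay_location match for our replication connection.
    while true; do
        caught_up=$(psql -U postgres -h 192.168.1.10 -At -c \
            "SELECT sent_location = replay_location
               FROM pg_stat_replication
              WHERE usename = 'rep_user';")
        if [ "$caught_up" = "t" ]; then
            echo "Replica is fully synchronized; safe to stop the primary."
            break
        fi
        sleep 1
    done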

As software upgrades may take some time to complete or require a server restart, we need to immediately make the secondary node available as the primary database. We start by promoting the replica to become the new primary by sending the promote command to pg_ctl. Once the database is writable, we reinstate the 192.168.1.30 virtual IP address so that applications and users can reconnect safely.
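
Before telling anyone to reconnect, it can be worth confirming that the promoted node really has left recovery. A minimal check, assuming psql access as the postgres user, looks like this:

    # Run after the virtual IP is up; a result of 'f' means the node is
    # out of recovery and now accepts writes.
    psql -h 192.168.1.30 -U postgres -c "SELECT pg_is_in_recovery();"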

This process of node switching is fairly quick, provided we already have a replica ready to take over. With the replica acting as a primary, the next step is to perform any upgrades necessary, just as we did on the secondary node. After the upgrades are finished, we cannot simply restart the primary database again, as the replica has been acting as a primary database for a period of time.

This means that we need to rebuild the primary database as a new replica. This makes both nodes ready for the next upgrade and maintains the two-node relationship. We start this process by erasing the old contents of the database and then use pg_basebackup to copy the current primary database. Then, we create a new recovery.conf file and direct it to act as a new replica. Once the replica is started, we have the same configuration as we had earlier, but now, the roles are reversed; 192.168.1.20 is the primary, and 192.168.1.10 is the replica.
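
To verify that the rebuilt node really is streaming from the new primary, we can check pg_stat_replication on 192.168.1.20. This assumes superuser access as the postgres user:

    -- Run on the new primary (192.168.1.20). A row with
    -- client_addr = 192.168.1.10 and state = 'streaming' means the
    -- rebuilt replica is attached and receiving WAL.
    SELECT client_addr, state, sent_location, replay_location
      FROM pg_stat_replication
     WHERE usename = 'rep_user';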

There's more...

Astute readers may have noticed that using pg_basebackup to copy the entire database following a minor upgrade is somewhat wasteful. We agree! In later recipes, we will make use of rsync or PostgreSQL-specific software to perform these tasks instead. This recipe was already quite long, and setting up rsync properly for this operation would have lengthened it considerably. The point is to show you the switching process; feel free to substitute any better data-synchronization method you know.
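
For the curious, an rsync-based resynchronization might look something like the following sketch. It assumes SSH access between the nodes as the postgres user, that both data directories live at /path/to/database, and it relies on the exclusive pg_start_backup and pg_stop_backup functions available in PostgreSQL 9.3:

    # On the new primary (192.168.1.20), mark the start of a backup:
    psql -U postgres -c "SELECT pg_start_backup('resync', true);"

    # On the old primary (192.168.1.10), pull only the files that changed,
    # skipping the transaction logs and the stale postmaster.pid:
    rsync -av --delete --exclude=pg_xlog --exclude=postmaster.pid \
          postgres@192.168.1.20:/path/to/database/ /path/to/database/

    # Back on 192.168.1.20, end the backup:
    psql -U postgres -c "SELECT pg_stop_backup();"

After this, creating recovery.conf and starting the server proceeds exactly as in the last two steps of the recipe.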

See also

  • In addition to rsync, a newer utility named pg_rewind can make resetting replicas much easier. It is beyond the scope of this chapter, so we recommend that you read more about it at https://github.com/vmware/pg_rewind.