Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Securing the WAL stream

The primary mechanism that PostgreSQL uses to provide a data durability guarantee is through its Write Ahead Log (WAL). All transactional data is written to this location before ever being committed to database files. Once WAL files are no longer necessary for crash recovery, PostgreSQL will either delete or archive them. For the purposes of a highly available server, we recommend that you keep these important files as long as possible. There are several reasons for this; they are as follows:

Archived WAL files can be used for Point In Time Recovery (PITR)
If you are using streaming replication, interrupted streams can be re-established by applying WAL files until the replica has caught up
WAL files can be reused to service multiple server copies

In order to gain these benefits, we need to enable PostgreSQL WAL archiving and save these files until we no longer need them. This section will address our recommendations for long-term storage of WAL files.

Getting ready

In order to properly archive WAL files, we recommend that you provision a server dedicated to backups or file storage. Depending on the transaction volume, an active PostgreSQL database might produce thousands of these on a daily basis. At 16 MB apiece, this is not an idle concern. For instance, for a 1 TB database, we recommend at least 3 TB of storage space.

In addition, we will be using rsync as a daemon on this archive server. To install this on a Debian-based server, execute the following command as a root-level user:

sudo apt-get install rsync

Red-Hat-based systems will need this command instead:

sudo yum install rsync xinetd

How to do it...

Our archive server has a 3 TB mount at the /db directory and is named arc_server on our network. The PostgreSQL source server resides at 192.168.56.10. Follow these steps for long-term storage of important WAL files on an archive server:

Enable rsync to run as a daemon on the archive server.
On Debian-based systems, edit the /etc/default/rsync file and change the RSYNC_ENABLE variable to true.
On Red-Hat-based systems, edit the /etc/xinet.d/rsync file and change the disable parameter to no.
Create a directory to store archived WAL files as the postgres user with these commands:
```
sudo mkdir /db/pg_archived
sudo chown postgres:postgres /db/pg_archived
```

Create a file named /etc/rsyncd.conf and fill it with the following contents:

[wal_store]
    path = /db/pg_archived
    comment = DB WAL Archives
    uid = postgres
    gid = postgres
    read only = false
    hosts allow = 192.168.56.10
    hosts deny = *

Start the rsync daemon.
Debian-based systems should execute the following command: sudo service rsync start.
Red-Hat-based systems can start rsync with this command instead: sudo service xinetd start.
Change the archive_mode and archive_command parameters in postgresql.conf to read the following:
```
archive_mode = on
archive_command = 'rsync -aq %p arc_server::wal_store/%f'
```
Restart the PostgreSQL server with the following command:
```
pg_ctl -D $PGDATA restart
```

How it works...

The rsync utility is normally used to transfer files between two servers. However, we can take advantage of using it as a daemon to avoid connection overhead imposed by using SSH as an rsync protocol. Our first step is to ensure that the service is not disabled in some manner, which would make the rest of this guide moot.

Next, we need a place to store archived WAL files on the archive server. Assuming that we have 3 TB of space in the /db directory, we simply claim /db/pg_archived as the desired storage location. There should be enough space to use /db for backups as well, but we won't discuss that in this recipe.

Next, we create a file named /etc/rsyncd.conf, which will configure how rsync operates as a daemon. Here, we name the /db/pg_archived directory wal_store so that we can address the path by its name when sending files. We give it a human-readable name and ensure that files are owned by the postgres user, as this user also controls most of the PostgreSQL-related services.

The next, and possibly the most important step, is to block all hosts but the primary PostgreSQL server from writing to this location. We set hosts deny to *, which blocks every server. Then, we set hosts allow to the primary database server's IP address so that only it has access. If everything goes well, we can start the rsync (or xinetd on Red Hat systems) service and we can see that in the following screenshot:

Next, we enable archive_mode by setting it to on. With archive mode enabled, we can specify a command that will execute when PostgreSQL no longer needs a WAL file for crash recovery. In this case, we invoke the rsync command with the -a parameter to preserve elements such as file ownership, timestamps, and so on.

In addition, we specify the -q setting to suppress output, as PostgreSQL only checks the command exit status to determine its success. In the archive_command setting, the %p value represents the full path to the WAL file, and %f resolves to the filename. In this context, we're sending the WAL file to the archive server at the wal_store module we defined in rsyncd.conf.

Once we restart PostgreSQL, it will start storing all the old WAL files by sending them to the archive server.

Tip

In case any rsync command fails because the archive server is unreachable, PostgreSQL will keep trying to send it until it is successful. If the archive server is unreachable for too long, we suggest that you change the archive_command setting to store files elsewhere. This prevents accidentally overfilling the PostgreSQL server storage.

There's more...

As we will likely want to use the WAL files on other servers, we suggest that you make a list of all the servers that could need WAL files. Then, modify the rsyncd.conf file on the archive server and add this section:

[wal_fetch]
    path = /db/pg_archived
    comment = DB WAL Archive Retrieval
    uid = postgres
    gid = postgres
    read only = true
    hosts allow = host1, host2, host3
    hosts deny = *

Now, we can fetch WAL files from any of the hosts in hosts allow. As these are dedicated PostgreSQL replicas, recovery servers, or other defined roles, this makes the archive server a central location for all our WAL needs.

Table of Contents for
Securing the WAL stream

Securing the WAL stream

Getting ready

How to do it...

How it works...

Tip

There's more...

See also

Table of Contents for Securing the WAL stream

Create new playlist

Sign In

Sign Up

Securing the WAL stream

Getting ready

How to do it...

How it works...

Tip

There's more...

See also

Table of Contents for
Securing the WAL stream