Installing the components

The two main components of the software we use in this chapter are Corosync and Pacemaker. Each consists of, or depends on, several other elements and prerequisites. For now, we'll simply refer to the entire suite as Pacemaker, as it makes up the bulk of how we will control the failover system.

This recipe is relatively short, as we only discuss the installation of Corosync and Pacemaker, not their configuration.

Getting ready

Red-Hat-based systems such as Fedora, CentOS, and Scientific Linux will already have Pacemaker in their repositories. Debian and its derivatives such as Ubuntu also include Pacemaker as an optional install from standard repositories. Red Hat Enterprise Linux (RHEL) itself, however, only offers the software as a paid add-on, available at http://www.redhat.com/products/enterprise-linux-add-ons/high-availability/.

Whatever choice you make, it shouldn't be necessary to compile Pacemaker from source on most Linux distributions.

How to do it...

Follow these quick steps to install Pacemaker and Corosync on both servers of each PostgreSQL pair running a Debian-based distribution:

  1. Install the main packages and all dependencies with this command as a root-capable user:
    sudo apt-get install corosync pacemaker
    
  2. Disable the cluster software from starting on system boot:
    sudo update-rc.d corosync disable
    

For those running a Red-Hat-based operating system, follow these steps to install and prepare Pacemaker:

  1. Install the main packages and all dependencies with this command as a root-capable user:
    sudo yum install corosync pacemaker
    
  2. Disable the cluster software from starting on system boot:
    sudo chkconfig corosync off
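
The two procedures above differ only in the package manager and the boot-control tool. As a minimal sketch, the following helper prints (it does not run) the matching pair of commands. The function names cluster_setup_commands and detect_pkg_manager are hypothetical and exist only for this illustration; the commands they print are the ones from this recipe.

```shell
#!/bin/sh
# Hypothetical helper: print the install and boot-disable commands
# from this recipe for the named package manager.
cluster_setup_commands() {
    case "$1" in
        apt-get)
            echo "sudo apt-get install corosync pacemaker"
            echo "sudo update-rc.d corosync disable"
            ;;
        yum)
            echo "sudo yum install corosync pacemaker"
            echo "sudo chkconfig corosync off"
            ;;
        *)
            echo "unsupported package manager: $1" >&2
            return 1
            ;;
    esac
}

# Hypothetical helper: report which supported package manager
# is on the PATH, if any.
detect_pkg_manager() {
    for pm in apt-get yum; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}
```

Calling cluster_setup_commands "$(detect_pkg_manager)" shows what would be run on the current server, which can be reviewed before the commands are executed by hand.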
    

How it works...

Each of these short recipes consists of two steps:

  1. Install Corosync and Pacemaker.
  2. Disable Corosync on server boot.

While the first step makes sense, why do we need the second? When running a highly available cluster, caution is a virtue. A server may reboot for any number of reasons, and many of those reasons are crashes that require further investigation.

Were Pacemaker to start immediately following a server reboot, we could potentially lose valuable diagnostic information. More importantly, a rebooted server should be considered in an unknown or potentially damaged state until it is examined by an experienced system administrator. We don't want a misbehaving server as part of our critical infrastructure.

Corosync is the communication layer between each Pacemaker node. It also launches the Pacemaker management system. This means that we can prevent all node management simply by disabling it.
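
It can be worth confirming that Corosync really is disabled at boot. The sketch below assumes a SysV-style init layout, where both update-rc.d disable and chkconfig off leave no S (start) links for the service in the runlevel directories. The function name boot_disabled and its optional second argument are hypothetical, and systems using a different init system would need a different check.

```shell
#!/bin/sh
# Hypothetical check: return 0 if no start link for the service
# exists in runlevels 2-5, and 1 otherwise.
# $1 = service name, $2 = directory holding rcN.d (defaults to /etc).
boot_disabled() {
    svc="$1"
    etc_dir="${2:-/etc}"
    for link in "$etc_dir"/rc[2-5].d/S[0-9][0-9]"$svc"; do
        # An unmatched glob stays literal, so test that the path exists.
        if [ -e "$link" ] || [ -L "$link" ]; then
            return 1
        fi
    done
    return 0
}
```

For example, boot_disabled corosync && echo "corosync will not start at boot" reports success only when no start link remains.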

There's more...

If you believe we are being too wary, simply skip the second step in our recipe. However, it's important to remember that services are easy to start on Linux servers. This command, for instance, will start Corosync normally:

    sudo service corosync start

If the server was rebooted as the result of maintenance, the preceding command will return the system to normal operation. Otherwise, a few cursory checks through server logs may determine that the cause of the system crash does not adversely affect PostgreSQL data. If so, once again, it is easy to start Corosync and re-establish the dual-node cluster.
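
A cursory log check of this kind could be scripted. The following is only a sketch: the function name log_looks_clean is hypothetical, and the three search patterns are illustrative assumptions about what a worrying kernel message might contain, not a complete list. It does not replace examination by an administrator.

```shell
#!/bin/sh
# Hypothetical check: scan one log file for a few illustrative patterns.
# Returns 0 if none are found, 1 if any is found,
# and 2 if the file cannot be read.
log_looks_clean() {
    [ -r "$1" ] || return 2
    if grep -E -i -q 'kernel panic|out of memory|i/o error' "$1"; then
        return 1
    fi
    return 0
}
```

On a Debian-based server, where the system log is typically /var/log/syslog, this could gate the start command: log_looks_clean /var/log/syslog && sudo service corosync start. Red-Hat-based systems typically use /var/log/messages instead.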

What we have done here is a very rudimentary form of STONITH, which stands for Shoot The Other Node In The Head. Dedicated STONITH hardware may power a server off completely or remove it from the network, making it inaccessible through anything other than console emulation or direct access. Truly high-availability systems cannot afford to introduce unknown entities into a carefully crafted and manicured architecture. Doing so invites undefined behavior across the spectrum of database services, which could lead to outages or data loss.

If we want to claim that our data is important and our uptime is essential, we need to adopt a similar stance toward crashed or damaged servers. We haven't gone so far as to completely disable the server in this recipe; we only prevent it from rejoining a functioning Pacemaker pair. In a true STONITH-enabled organization, our measures would be much more drastic.

See also

  • The clusterlabs.org website is a repository of all things related to Pacemaker. It has several relevant tutorials, examples, and copious documentation. If you have trouble installing with our recipe, try an alternative listed at http://clusterlabs.org/wiki/Install.