High availability and uninterrupted service have been both marketing buzzwords and coveted goals for real-world IT professionals and network administrators as long as most of us can remember. To meet this perceived need and solve a few related problems, CARP and pfsync were added as two highly anticipated features in OpenBSD 3.5. With these tools, OpenBSD and the other operating systems that adopted them came a long way toward offering what other operating systems refer to as general purpose clustering functionality. The terminology used by OpenBSD and its sister BSDs differs from what other products use, but as you will see in this chapter, CARP, pfsync, and related tools offer high availability functionality equivalent to what a variety of proprietary systems tend to offer only as costly optional extras.
This chapter covers how to use these tools as found in your base system to manage resource availability—or, in other words, how to use them to make sure resources and services in your care stay available even in adverse conditions.
The Common Address Redundancy Protocol (CARP) was developed as a non-patent-encumbered alternative to the Virtual Router Redundancy Protocol (VRRP), which was far along the track to becoming an IETF-sanctioned standard, even though possible patent issues haven’t been resolved.[42] One of the main purposes of CARP is to ensure that the network will keep functioning as usual, even when a firewall or other service goes down due to errors or planned maintenance activities, such as upgrades. Not content to just make a clone of the patent-encumbered protocol, the OpenBSD developers decided to go one better on several fronts. CARP features authenticated redundancy—it’s address-family independent and comes with state synchronization features. Complementing CARP, the pfsync protocol is designed to handle synchronization of PF states between redundant packet-filtering nodes or gateways. Both protocols are intended to ensure redundancy for essential network features with automatic failover.
CARP is based on setting up a group of machines as one master and one or more redundant backups, all equipped to handle a common IP address. If the master goes down, one of the backups will inherit the IP address. The handover from one CARP host to another may be authenticated, essentially by setting a shared secret (in practice, much like a password).
In the case of PF firewalls, pfsync can be set up to handle the synchronization, and if the synchronization via pfsync has been properly set up, active connections will be handed over without noticeable interruption. In essence, pfsync is a type of virtual network interface specially designed to synchronize state information between PF firewalls. Its interfaces are assigned to physical interfaces with ifconfig.
Even if it’s technically possible to lump pfsync traffic together with other traffic on a regular interface, it’s strongly recommended that you set up pfsync on a separate network, or even VLAN. pfsync does no authentication on its synchronization partners, so the only way to guarantee correct synchronization is to use dedicated interfaces for your pfsync traffic.
To illustrate a useful failover setup with CARP and pfsync, we’ll examine a network with one gateway to the world. Our goals for the reconfigured network are as follows:
The network should keep functioning much the same way it did before we introduced redundancy.
We should have better availability without noticeable downtime.
The network should experience graceful failover with no interruption of active connections.
We’ll begin with the relatively simple network from Chapter 3, as shown in Figure 8-1.
We replace the single gateway with a redundant pair of gateways that share a private network for state-information updates over pfsync. The result is shown in Figure 8-2.
CARP addresses are virtual addresses, and unless you have console access to all machines in your CARP group, you should almost always assign an IP address to the physical interfaces. With a unique IP address for each physical interface, you’ll be able to communicate with the host and be sure of which machine you’re interacting with. Without IP addresses assigned to physical interfaces, you could find yourself with a setup where the backup gateways are unable to communicate (except with hosts in networks where the physical interfaces have addresses assigned) until they become the master in the redundancy group and take over the virtual IP addresses.
It’s reasonable to assume that the IP address assigned to the physical interface will belong in the same subnet as the virtual, shared IP address. It’s also important to be aware that this is, in fact, not a requirement: it’s even possible to configure CARP where the physical interface hasn’t been assigned an address. If you don’t specify a physical interface for the CARP interface, the kernel will try to assign the CARP address to a physical interface that’s already configured with an address in the same subnet as the CARP address. Even if it may not be required in simpler configurations, it’s generally useful to make the interface selection explicit via the carpdev option in the ifconfig command string that you use to set up the CARP interface.
Most CARP setup lies in cabling (according to the schematic for your network), setting sysctl values, and issuing ifconfig commands. Also, on some systems, you’ll need to make sure that your kernel is set up with the required devices compiled in.
On OpenBSD, both the CARP and pfsync devices are in the default GENERIC and GENERIC.MP kernel configurations. Unless you’re running a custom kernel without these options, no kernel reconfiguration is necessary. If you’re running FreeBSD, make sure that the kernel has the CARP and pfsync devices compiled in because the default GENERIC kernel lacks these options. (See the FreeBSD Handbook to learn how to compile and install a custom kernel with these options.)
NetBSD users should check that the kernel has pseudo-device carp compiled in because NetBSD’s default GENERIC kernel configuration doesn’t have it. (You’ll find the relevant line commented out in the GENERIC configuration file.) As of this writing, NetBSD doesn’t support pfsync due to claimed protocol-numbering issues.
On all CARP-capable systems, the basic functions are governed by a handful of sysctl variables. The main one is net.inet.carp.allow, and it’s enabled by default. On a typical OpenBSD system, you’ll see:
$ sysctl net.inet.carp.allow
net.inet.carp.allow=1
This means that your system comes equipped for CARP.
If your kernel isn’t configured with a CARP device, this command should produce something like the following on FreeBSD:
sysctl: unknown oid 'net.inet.carp.allow'
Or it could produce something like this on NetBSD:
sysctl: third level name 'carp' in 'net.inet.carp.allow' is invalid
Use this sysctl command to view all CARP-related variables:
$ sysctl net.inet.carp
net.inet.carp.allow=1
net.inet.carp.preempt=0
net.inet.carp.log=2
On FreeBSD, you’ll also encounter the read-only status variable net.inet.carp.suppress_preempt, which indicates whether preemption is possible. On systems with CARP code based on OpenBSD 4.2 or earlier, you’ll also see net.inet.carp.arpbalance, which is used to enable CARP ARP balancing to offer some limited load balancing for hosts on a local network.
To enable the graceful failover between the gateways in our setup, we need to set the net.inet.carp.preempt variable so that on hosts with more than one network interface (like our gateways), all CARP interfaces will move between master and backup status concurrently. This setting must be identical on all hosts in the CARP group, and it should be repeated on all hosts during setup.
$ sudo sysctl net.inet.carp.preempt=1
The net.inet.carp.log variable sets the debug level for CARP logging between 0 and 7. The default of 2 means only CARP state changes are logged.
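Because the preempt setting has to match on every host in the group, it can help to check it mechanically rather than by eye. The following is a minimal sketch, not something shipped with the base system: the helper names carp_sysctl_value and preempt_enabled are made up for illustration, and the parsing assumes the name=value format that sysctl prints on OpenBSD, as in the output shown earlier.

```shell
#!/bin/sh
# carp_sysctl_value NAME
# Hypothetical helper: read "name=value" lines (the format that
# "sysctl net.inet.carp" prints on OpenBSD) from standard input and
# print the value recorded for NAME. Prints nothing if NAME is absent.
carp_sysctl_value() {
    awk -F= -v name="$1" '$1 == name { print $2 }'
}

# preempt_enabled
# Succeed only if the input reports net.inet.carp.preempt=1.
preempt_enabled() {
    [ "$(carp_sysctl_value net.inet.carp.preempt)" = "1" ]
}
```

On each gateway, something like sysctl net.inet.carp | preempt_enabled || echo "preempt is off" would flag a host that was missed. Systems that print sysctl values in a different format would need the field separator adjusted.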
Notice in the network diagram shown in Figure 8-2 that the local network uses addresses in the 192.168.12.0 network, while the Internet-facing interface is in the 192.0.2.0 network. With these address ranges and the CARP interface’s default behavior in mind, the commands for setting up the virtual interfaces are actually quite straightforward.
In addition to the usual network parameters, CARP interfaces require one additional parameter: the virtual host ID (vhid), which uniquely identifies the interfaces that will share the virtual IP address.
The vhid is an 8-bit value that must be set uniquely within the network’s broadcast domain. Setting the vhid to the wrong value can lead to network problems that can be hard to debug, and there’s even anecdotal evidence that ID collisions with otherwise unrelated systems can occur and cause disruption on redundancy and load-balancing systems based on VRRP, which uses a virtual node identification scheme similar to CARP’s.
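Since a mistyped vhid can be hard to track down later, a small sanity check before running ifconfig may be worthwhile. This is only a sketch: valid_vhid is a hypothetical helper name, and it assumes the usable range is 1 through 255, that is, the nonzero values that fit in the 8-bit field. It cannot detect collisions with other devices in the broadcast domain.

```shell
#!/bin/sh
# valid_vhid VALUE
# Succeed if VALUE is a whole number from 1 to 255, the range that fits
# the 8-bit vhid field (0 is treated as invalid in this sketch).
valid_vhid() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 255 ]
}
```

A setup script could then refuse to continue with a line such as valid_vhid "$vhid" || exit 1, where $vhid is whatever variable the script uses for the ID.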
Run these commands on the machine that will be the initial master for the group:
$ sudo ifconfig carp0 192.0.2.19 vhid 1
$ sudo ifconfig carp1 192.168.12.1 vhid 2
We don’t need to explicitly set the physical interface because the carp0 and carp1 virtual interfaces will bind themselves to the physical interfaces that are already configured with addresses in the same subnets as the assigned CARP address.
On systems that offer the carpdev option to ifconfig, it’s recommended to use the carpdev option for all CARP interface setups, even if it isn’t strictly required. The carpdev option becomes indispensable in cases where the choice of physical network device for the CARP interface isn’t obvious from the existing network configuration, and adding a carpdev interface string to the ifconfig commands can make the difference between a nonfunctional setup and a working one. This can be particularly useful in some nonintuitive configurations and where the number of free IP addresses in the relevant network is severely limited. The FreeBSD port of CARP offers the carpdev option starting with FreeBSD 10.0.
Use ifconfig to make sure that each CARP interface is properly configured, and pay particular attention to the carp: line, which indicates MASTER status, as shown here:
$ ifconfig carp0
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:00:5e:00:01:01
carp: MASTER carpdev ep0 vhid 1 advbase 1 advskew 0
groups: carp
inet 192.0.2.19 netmask 0xffffff00 broadcast 192.0.2.255
inet6 fe80::200:5eff:fe00:101%carp0 prefixlen 64 scopeid 0x5
The setup is almost identical on the backup except that you add the advskew parameter, which indicates how much less preferred it is for the specified machine to take over than the current master.
$ sudo ifconfig carp0 192.0.2.19 vhid 1 advskew 100
$ sudo ifconfig carp1 192.168.12.1 vhid 2 advskew 100
The advskew parameter and its companion value, advbase, are used to calculate the interval between the current host’s announcements of its master status once it’s taken over. The default value for advbase is 1, and the default for advskew is 0. In the preceding example, the master would announce every second (1 + 0/256), while the backup would wait for 1 + 100/256 seconds.
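To make the arithmetic concrete, the interval can be computed as advbase + advskew/256 seconds. The sketch below does just that; carp_interval is a hypothetical helper name, and the three-decimal rounding is an arbitrary choice for display.

```shell
#!/bin/sh
# carp_interval ADVBASE ADVSKEW
# Print the advertisement interval in seconds, advbase + advskew/256,
# rounded to three decimal places.
carp_interval() {
    LC_ALL=C awk -v base="$1" -v skew="$2" \
        'BEGIN { printf "%.3f\n", base + skew / 256 }'
}

carp_interval 1 0      # master in the example: prints 1.000
carp_interval 1 100    # backup in the example: prints 1.391
```

With the values used here, the backup’s interval is roughly 0.39 seconds longer than the master’s, which is what makes it the less preferred candidate.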
With net.inet.carp.preempt=1 on all hosts in the failover group, when the master stops announcing or announces that it isn’t available, the backups take over, and the new master starts announcing at its configured rate. Smaller advskew values mean shorter announcement intervals and a higher likelihood that the host becomes the new master. If several hosts have the same advskew, the one that’s already master will keep its master status.
On OpenBSD 4.1 and higher, one more factor in the equation determines which host takes over CARP master duty. The demotion counter is a value each CARP host announces for its interface group as a measure of readiness for its CARP interfaces. When the demotion counter value is 0, the host is in complete readiness; higher values indicate measures of degradation. You can set the demotion counter from the command line using ifconfig -g, but the value is usually set by the system itself, with higher values typically during the boot process. All other things being equal, the host with the lowest demotion counter will win the contest to take over as the CARP master.
As of this writing, FreeBSD CARP versions earlier than FreeBSD 10 don’t support setting the demotion counter.
On the backup, use ifconfig once again to check that each CARP interface is properly configured:
$ ifconfig carp0
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:00:5e:00:01:01
carp: BACKUP carpdev ep0 vhid 1 advbase 1 advskew 100
groups: carp
inet 192.0.2.19 netmask 0xffffff00 broadcast 192.0.2.255
inet6 fe80::200:5eff:fe00:101%carp0 prefixlen 64 scopeid 0x5
The output here is only slightly different from what you’ve just seen on the master. Notice that the carp: line indicates BACKUP status along with the advbase and advskew parameters.
For actual production use, you should add a measure of security against unauthorized CARP activity by configuring the members of the CARP group with a shared, secret passphrase, such as the following:[43]
$ sudo ifconfig carp0 pass mekmitasdigoat 192.0.2.19 vhid 1
$ sudo ifconfig carp1 pass mekmitasdigoat 192.168.12.1 vhid 2
As with any other password, the passphrase will become a required ingredient in all CARP traffic in your setup. Be sure to configure all CARP interfaces in a failover group with the same passphrase (or none).
Once you’ve figured out the appropriate settings, preserve them through future system reboots by putting them in the proper files in /etc:
On OpenBSD, put the proper ifconfig parameters into hostname.carp0 and hostname.carp1.
On FreeBSD and NetBSD, put the relevant lines in your rc.conf file as contents of the ifconfig_carp0= and ifconfig_carp1= variables.
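For the OpenBSD case, the file contents are simply the parameters you would otherwise hand to ifconfig. As a rough sketch, the hypothetical helper carp_hostname_line below assembles such a line in the parameter order used in this chapter; the carpdev name ep0 is taken from the sample ifconfig output, and you would substitute your own interface, address, and passphrase.

```shell
#!/bin/sh
# carp_hostname_line PASS ADDRESS VHID CARPDEV [ADVSKEW]
# Print one line of ifconfig parameters in the order used in this chapter,
# intended as the contents of /etc/hostname.carpN on OpenBSD.
carp_hostname_line() {
    line="pass $1 $2 vhid $3 carpdev $4"
    # advskew is optional; the master can rely on the default of 0.
    if [ -n "${5:-}" ]; then
        line="$line advskew $5"
    fi
    printf '%s\n' "$line"
}

# Master: default advskew.
carp_hostname_line mekmitasdigoat 192.0.2.19 1 ep0
# Backup: same address and vhid, but less preferred.
carp_hostname_line mekmitasdigoat 192.0.2.19 1 ep0 100
```

Redirecting the output to /etc/hostname.carp0 as root would make the setting persistent. Because the file then holds the shared passphrase, it is sensible to keep it readable by root only.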
As the final piece of configuration, set up state-table synchronization between the hosts in your redundancy group to prevent traffic disruption during failover. This feat is accomplished through a set of pfsync interfaces. (As noted earlier, as of this writing, NetBSD doesn’t support pfsync.)
Configuring pfsync interfaces requires planning and a few ifconfig commands. You can set up pfsync on any configured network interface, but it’s best to set up a separate network for the synchronization. The sample configuration in Figure 8-2 shows a tiny network set aside for the purpose. A crossover cable connects the two Ethernet interfaces, but in configurations with more than two hosts in the failover group, you may want a setup with a separate switch, hub, or VLAN. The interfaces to be used for the synchronization have been assigned the IP addresses 10.0.12.16 and 10.0.12.17, respectively.
With the basic TCP/IP configuration in place, the complete pfsync setup for each synchronization partner interface is
$ sudo ifconfig pfsync0 syncdev ep2
The pfsync protocol itself offers little in the way of security features: It has no authentication mechanism and, by default, communicates via IP multicast traffic. However, in cases where a physically separate network isn’t feasible, you can tighten up your pfsync security by setting up pfsync to synchronize only with a specified syncpeer:
$ sudo ifconfig pfsync0 syncpeer 10.0.12.16 syncdev ep2
This produces a configured interface that shows up in ifconfig output like this:
pfsync0: flags=41<UP,RUNNING> mtu 1500
priority: 0
pfsync: syncdev: ep2 syncpeer: 10.0.12.16 maxupd: 128 defer: off
groups: carp pfsync
Another option is to set up an IPsec tunnel and use that to protect the sync traffic. In this case, the ifconfig command is
$ sudo ifconfig pfsync0 syncpeer 10.0.12.16 syncdev enc0
This means that the syncdev device becomes the enc0 encapsulating interface instead of the physical interface.
If possible, set up synchronization across a physically separate, dedicated network or a separate VLAN because any lost pfsync updates could lead to less than clean failover.
One very useful way to check that your PF state synchronization is running properly is to watch the state table on your synchronized hosts using systat states on each machine. The command gives you a live display of states, showing updates happening in bulk on the sync targets. Between the synchronizations, states should display identically on all hosts. (Traffic counters, such as the number of packets and bytes passed, are the exception; they display updates only on the host that handles the actual connection.)
This takes us to the end of the basic network configuration for CARP-based failover. In the next section, we’ll discuss what to keep in mind when writing rule sets for redundant configurations.
After all the contortions we’ve been through in order to configure basic networking, you may be wondering what it will take to migrate the rules you use in your current pf.conf to the new setup. Fortunately, not much. The main change we’ve introduced is essentially invisible to the rest of the world, and a well-designed rule set for a single gateway configuration will generally work well for a redundant setup, too.
That said, we’ve introduced two additional protocols (CARP and pfsync), and you’ll probably need to make some relatively minor changes to your rule set in order for the failover to work properly. Basically, you need to pass the CARP and pfsync traffic to the appropriate interfaces. The simplest way to handle the CARP traffic is to introduce a macro definition for your carpdevs that includes all physical interfaces that will handle CARP traffic. You’ll also introduce an accompanying pass rule, like the following one, in order to pass CARP traffic on the appropriate interfaces:
pass on $carpdevs proto carp
Similarly, for pfsync traffic, you can introduce a macro definition for your syncdev and an accompanying pass rule:
pass on $syncdev proto pfsync
Skipping the pfsync interfaces entirely for filtering is cheaper performance-wise than filtering and passing. To take the pfsync device out of the filtering equation altogether, use this rule:
set skip on $syncdev
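The macro definitions themselves are not spelled out above, so here is one way they might look, written as a small shell sketch that prints a pf.conf fragment. The helper name carp_pf_rules is hypothetical, and the interface names are assumptions: ep0 and ep2 appear in the sample ifconfig output, while ep1 stands in for the internal physical interface.

```shell
#!/bin/sh
# carp_pf_rules CARPDEVS SYNCDEV
# Print a pf.conf fragment: two macros plus the pass rules from the text.
# CARPDEVS is a space-separated list of physical interfaces that carry
# CARP traffic; SYNCDEV is the interface dedicated to pfsync.
carp_pf_rules() {
    printf 'carpdevs = "{ %s }"\n' "$1"
    printf 'syncdev = "%s"\n' "$2"
    # Single quotes keep the shell from expanding the PF macro references.
    printf '%s\n' 'pass on $carpdevs proto carp'
    printf '%s\n' 'pass on $syncdev proto pfsync'
}

# ep1 is an assumed name for the inside interface.
carp_pf_rules "ep0 ep1" ep2
```

The fragment uses the pass rule for pfsync; if you prefer the set skip approach, drop that rule and add the set skip line among the options instead. On a PF host, pfctl -nf followed by the file name parses a rule set without loading it, which is a cheap way to check the result.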
You should also consider the roles of the virtual CARP interface and its address versus the physical interface. As far as PF is concerned, all traffic will pass through the physical interfaces, but the traffic may have the CARP interface’s IP addresses as source or destination addresses.
You may find that you have rules in your configuration that you don’t want to bother to synchronize in case of a failover, such as connections to services that run on the gateway itself. One prime example is the typical rule to allow SSH in for the administrator:
pass in on $int_if from $ssh_allowed to self
For rules like these, you could use the state option no-sync to prevent synchronizing state changes for connections that really aren’t relevant once failover has occurred:
pass in on $int_if from $ssh_allowed to self keep state (no-sync)
With this configuration, you’ll be able to schedule operating system upgrades and formerly downtime-producing activities on members of your CARPed group of systems at times when they’re most convenient, with no noticeable downtime for the users of your services.
Redundancy by failover is nice, but sometimes it’s less attractive to have hardware sitting around in case of failure and better to create a configuration that spreads the network load over several hosts.
In addition to ARP balancing (which works by calculating hashes based on the source MAC address on incoming connections), CARP in OpenBSD 4.3 and higher supports several varieties of IP-based load balancing, with traffic allocated based on hashes calculated from the connections’ source and destination IP addresses. Because ARP balancing is based on the source MAC address, it’ll work only for hosts in the directly connected network segment. On the other hand, the IP-based methods are appropriate for load-balancing connections to and from the Internet at large.
The choice of method for your application will depend on the specifications of the rest of the network equipment you need to work with. The basic ip balancing mode uses a multicast MAC address to have the directly connected switch forward traffic to all hosts in the load-balancing cluster.
Unfortunately, the combination of a unicast IP address and a multicast MAC address isn’t supported by some systems. In those cases, you may need to configure your load balancing in ip-unicast mode, which uses a unicast MAC address, and configure your switch to forward to the appropriate hosts. Or you may need to configure your load balancing in ip-stealth mode, which doesn’t use the multicast MAC address at all. As usual, the devil is in the details, and the answers are found in man pages and other documentation, most likely with a bit of experimentation thrown in.
Traditionally, relayd has been used to do intelligent load balancing as the frontend for servers that offer services to the rest of the world. In OpenBSD 4.7, relayd acquired the ability to track available uplinks and alter the system’s routing tables based on link health, with the functionality wrapped in a bundle with the router keyword. For setups with several possible uplinks or various routing tables, you can set up relayd to choose your uplink or, with a little help from the sysctl variables net.inet.ip.multipath and net.inet6.ip6.multipath, perform load balancing across available routes and uplinks. The specifics will vary with your networking environment. The relayd.conf man page contains a complete example to get you started.
In load-balancing mode, the CARP concept is extended by letting each CARP interface be a member of multiple failover groups and as many load-balancing groups as there are physical hosts that will share the virtual address. In contrast with the failover case, where there can be only one master, each node in a load-balancing cluster must be the master of its own group so that it can receive traffic. The choice of group (and by extension, physical host) that ends up handling a given connection is determined by CARP via a hash value calculation. This calculation is based on the connection’s source MAC address in the ARP-balancing case and on the source and destination IP addresses in the IP-balancing case, as well as on actual availability. The downside to this scheme is that each group consumes one virtual host ID, so you’ll run out of these IDs quite a bit more quickly in a load-balancing configuration than when using failover only. In fact, there’s a hard upper limit of 32 virtual host IDs for CARP-based load-balancing clusters.
The advskew parameter plays a similar role in load-balancing configurations as in the failover ones, but the ifconfig (and hostname.carpN) syntax for CARP load balancing is slightly different from that of the failover case.
Changing the CARP failover group built over the previous sections to a load-balancing cluster is as simple as editing the configuration files and reloading. In the following example, we choose an IP load-balancing scheme. If you choose a different scheme, the configuration itself differs only in the keyword for mode selection.
On the first host, we change /etc/hostname.carp0 to
pass mekmitasdigoat 192.0.2.19 balancing ip carpnodes 5:100,6:0
This says that on this host, the carp0 interface is a member of the group with vhid 5 (with an advskew of 100) as well as the group with vhid 6, where it’s the prime candidate for becoming initial master (with an advskew set to 0).
Next, we change /etc/hostname.carp1 to this:
pass mekmitasdigoat 192.168.12.1 balancing ip carpnodes 3:100,4:0
For carp1, the memberships are vhids 3 and 4, with advskew values of 100 and 0, respectively.
For the other host, the advskew values are reversed, but the configuration is otherwise predictably similar. Here, /etc/hostname.carp0 reads:
pass mekmitasdigoat 192.0.2.19 balancing ip carpnodes 5:0,6:100
This means that the carp0 interface is a member of vhid 5 with advskew 0 and a member of vhid 6 with advskew 100. Complementing this is the /etc/hostname.carp1 file that reads:
pass mekmitasdigoat 192.168.12.1 balancing ip carpnodes 3:0,4:100
Again, carp1 is a member of vhid 3 and 4, with advskew 0 in the first and 100 in the other.
The ifconfig output for the carp interface group on the first host looks like this:
$ ifconfig carp
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 01:00:5e:00:01:05
priority: 0
carp: carpdev vr0 advbase 1 balancing ip
state MASTER vhid 5 advskew 0
state BACKUP vhid 6 advskew 100
groups: carp
inet 192.0.2.19 netmask 0xffffff00 broadcast 192.0.2.255
inet6 fe80::200:24ff:fecb:1c10%carp0 prefixlen 64 scopeid 0x7
carp1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 01:00:5e:00:01:03
priority: 0
carp: carpdev vr1 advbase 1 balancing ip
state MASTER vhid 3 advskew 0
state BACKUP vhid 4 advskew 100
groups: carp
inet 192.168.12.1 netmask 0xffffff00 broadcast 192.168.12.255
inet6 fe80::200:24ff:fecb:1c10%carp1 prefixlen 64 scopeid 0x8
pfsync0: flags=41<UP,RUNNING> mtu 1500
priority: 0
pfsync: syncdev: vr2 syncpeer: 10.0.12.17 maxupd: 128 defer: off
groups: carp pfsync
The other host has this ifconfig output:
$ ifconfig carp
carp0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 01:00:5e:00:01:05
priority: 0
carp: carpdev vr0 advbase 1 balancing ip
state BACKUP vhid 5 advskew 100
state MASTER vhid 6 advskew 0
groups: carp
inet 192.0.2.19 netmask 0xffffff00 broadcast 192.0.2.255
inet6 fe80::200:24ff:fecb:1c18%carp0 prefixlen 64 scopeid 0x7
carp1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
lladdr 01:00:5e:00:01:03
priority: 0
carp: carpdev vr1 advbase 1 balancing ip
state BACKUP vhid 3 advskew 100
state MASTER vhid 4 advskew 0
groups: carp
inet 192.168.12.1 netmask 0xffffff00 broadcast 192.168.12.255
inet6 fe80::200:24ff:fecb:1c18%carp1 prefixlen 64 scopeid 0x8
pfsync0: flags=41<UP,RUNNING> mtu 1500
priority: 0
pfsync: syncdev: vr2 syncpeer: 10.0.12.16 maxupd: 128 defer: off
groups: carp pfsync
If we had three nodes in our load-balancing scheme, each carp interface would need to be a member of an additional group, for a total of three groups. In short, for each physical host you introduce in the load-balancing group, each carp interface becomes the member of an additional group.
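The pattern generalizes mechanically: one group per host, with each host preferred in exactly one of them. The sketch below builds the carpnodes list for a given host under that scheme. The helper name carpnodes_for is hypothetical, and giving every non-preferred group the same advskew of 100 is a simplifying assumption; with three or more hosts you may well want distinct values to control the order in which backups take over.

```shell
#!/bin/sh
# carpnodes_for FIRST_VHID NHOSTS SELF
# Print a carpnodes list with one group per host. Host number SELF
# (counting from 1) gets advskew 0 in its own group and advskew 100
# in every other group.
carpnodes_for() {
    first=$1
    nhosts=$2
    self=$3
    list=""
    i=1
    while [ "$i" -le "$nhosts" ]; do
        vhid=$((first + i - 1))
        if [ "$i" -eq "$self" ]; then
            skew=0
        else
            skew=100
        fi
        # Append "vhid:skew", with a comma only between entries.
        list="${list:+$list,}$vhid:$skew"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

carpnodes_for 5 2 1    # prints 5:0,6:100
carpnodes_for 5 2 2    # prints 5:100,6:0
carpnodes_for 5 3 2    # prints 5:100,6:0,7:100
```

The first two calls reproduce the carpnodes values used for carp0 on the two hosts in this section; the third shows what a middle host in a three-node cluster would get under the same assumptions.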
Once you’ve set up the load-balancing cluster, check the flow of connections by running systat states on each of the hosts in your load-balancing cluster for a few minutes to make sure that the system works as expected and to see that all the effort you put in has been worth it.
[42] VRRP is described in RFC 2338 and RFC 3768; RFC 2281 describes Cisco’s related Hot Standby Router Protocol (HSRP). The patents involved are held by Cisco, IBM, and Nokia. See the RFCs for details.
[43] This particular passphrase has a very specific meaning. A Web search will reveal its significance and why it’s de rigueur for modern networking documentation. The definitive answer can be found via the openbsd-misc mailing list archives.