Setting Up Queues

Now we will look at examples of setting up the three different types of ALTQ queues: priority-based, class-based bandwidth, and HFSC.

The general syntax for ALTQ queues in PF looks like this:

altq on interface type [options ... ] main_queue { sub_q1, sub_q2, ... }
  queue sub_q1 [ options ... ]
  queue sub_q2 [ options ... ] { subA, subB, ... }
[...]
pass [ ... ] queue sub_q1
pass [ ... ] queue sub_q2

Note that cbq and hfsc queues can have several levels of subqueues. The priq queues are essentially flat, with only one queue level. We will address syntax specifics for each type in the following sections.
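To make the difference in shape concrete, here is a minimal sketch of a flat priq definition next to a cbq definition with one nested level. The interface name and bandwidth values (em0, 2Mb) are placeholders, not taken from any particular setup:

altq on em0 priq bandwidth 2Mb queue { high, low }
      queue high priority 7
      queue low priority 1 priq(default)

altq on em0 cbq bandwidth 2Mb queue { bulk, other }
      queue bulk bandwidth 80% cbq(default borrow) { mail, web }
            queue mail bandwidth 50%
            queue web bandwidth 50%
      queue other bandwidth 20% cbq

The priq definition cannot nest any deeper than this, while the cbq definition subdivides its bulk queue into mail and web children.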

Priority-Based Queues

The basic concept for priority-based queues (priq) is fairly straightforward and perhaps the easiest to understand. Within the total bandwidth allocated to the main queue, all that matters is traffic priority. You assign queues a priority value in the range 0 through 15, where a higher value means that the queue’s traffic is serviced sooner.

A Real-World Example

For a real-world example, we can look to Daniel Hartmeier. He discovered a simple yet effective way to improve the throughput for his home network by using ALTQ. Like many people, Daniel had his home network on an asymmetric connection, with total usable bandwidth low enough that he felt a strong desire to get better bandwidth utilization. In addition, when the line was running at or near capacity, some oddities started appearing. One symptom in particular seemed to indicate there was room for improvement: Incoming traffic (downloads, incoming mail, and such) slowed down disproportionately whenever outgoing traffic started—more than could be explained by measuring the raw amount of data transferred. It all came back to a basic feature of TCP.

When a TCP packet is sent, the sender expects acknowledgment (in the form of an ACK packet) from the receiving end and will wait for a specified time for it to arrive. If the ACK does not arrive within the specified time, the sender assumes that the packet has not been received and resends the packet it originally sent. The problem is that in a default setup, packets are serviced sequentially by the interface as they arrive. This inevitably means that ACK packets, with essentially no data payload, wait in line while the larger data packets are transferred.

A testable hypothesis formed: If the tiny, practically data-free ACK packets were able to slip in between the larger data packets, this would lead to a more efficient use of available bandwidth. The simplest practical way to implement and test the theory was to set up two queues with different priorities and integrate them into the rule set. The following lines show the relevant parts of the rule set.

ext_if="kue0"

altq on $ext_if priq bandwidth 100Kb queue { q_pri, q_def }
     queue q_pri priority 7
     queue q_def priority 1 priq(default)

pass out on $ext_if proto tcp from $ext_if queue (q_def, q_pri)

pass in on $ext_if proto tcp to $ext_if queue (q_def, q_pri)

Here, the priority-based queue is set up on the external interface, with two subordinate queues. The first subqueue, q_pri, has a high priority value of 7; the other subqueue (q_def) has a significantly lower priority value of 1.

This seemingly simple rule set works by exploiting how ALTQ treats queues with different priorities. Once a connection is assigned to the main queue, ALTQ inspects each packet’s type of service (ToS) field. ACK packets have the ToS delay bit set to low, which indicates that the sender wanted the speediest delivery possible. When ALTQ sees a low-delay packet, and queues of differing priorities are available, it will assign the packet to the higher priority queue. This means that the ACK packets skip ahead of the lower priority queue and are delivered more quickly, which in turn means that data packets are serviced more quickly.

The net result is that a configuration like this provides better performance than a pure FIFO configuration with the same hardware and available bandwidth.[38]

Using a match Rule for Queue Assignment

In the preceding example, the rule set was constructed the traditional way, with the queue assignment as part of the pass rules. However, this is not the only way to do queue assignment. Using match rules (available in OpenBSD 4.6 and later), it is incredibly easy to retrofit this simple priority-queueing regime onto an existing rule set.

If you have been working through the examples in Chapter 3 and Chapter 4, it is likely that your rule set already has a match rule that applies nat-to on your outgoing traffic. If you want to introduce priority-based queueing to your rule set, you can get started by adding the queue definitions and making some minor adjustments to your outgoing match rule.

Start with the queue definition from the preceding example, and adjust the total bandwidth to local conditions, as in this example:

altq on $ext_if priq bandwidth $ext_bw queue { q_pri, q_def }
      queue q_pri priority 7
      queue q_def priority 1 priq(default)

This gives the queues whatever bandwidth allocation you define with the ext_bw macro.
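The ext_bw macro is simply defined near the top of pf.conf. The value here is only an illustration, so substitute your own measured rate:

ext_bw = "800Kb"

A common trick is to set the value slightly below the link's nominal upload capacity, so that the queueing decisions are made in your own gateway rather than in equipment on the far side of the link.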

The simplest and quickest way to integrate the queues into your rule set is to edit your outgoing match rule to read something like this:

match out on $ext_if from $int_if:network nat-to ($ext_if) queue (q_def, q_pri)

Reload your rule set, and the priority-queueing regime is applied to all traffic that is initiated from your local network.

You can use the systat command to get a live view of how traffic is assigned to your queues.

$ sudo systat queues

This will give you a live display something like this:

2 users    Load 0.39 0.27 0.30                      Fri Apr  1 16:33:44 2011

QUEUE            BW SCH  PRIO   PKTS  BYTES  DROP_P  DROP_B QLEN BORROW SUSPEN  P/S   B/S
q_pri               priq    7  21705  1392K       0       0    0                 12   803
q_def               priq    1  12138  6759K       0       0    0                  9  4620

Looking at the numbers in the PKTS (packets) and BYTES columns, you see a clear indication that the queuing is working as intended.

The q_pri queue has processed a rather large number of packets in relation to the amount of data, just as we expected. The ACK packets do not take up a lot of space. On the other hand, the traffic assigned to the q_def queue has more data in each packet, and the numbers show essentially the reverse packet numbers-to-data size ratio compared to the q_pri queue.
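If you prefer a snapshot to a live display, pfctl can print much the same counters; the verbose queue listing shows packets and bytes passed per queue:

$ sudo pfctl -vsq

Adding a second -v (pfctl -vvsq) makes pfctl loop and print updated queue statistics every five seconds.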

Note

systat is a rather capable program on all BSDs, and the OpenBSD version offers several PF-related views not yet found in the systat variants on the other systems. We will be looking at systat again in the next chapter. In the meantime, read the man pages and play with the program. It’s a very useful tool for getting to know your system.

Class-Based Bandwidth Allocation for Small Networks

Maximizing network performance generally feels nice. However, you may find that your network has other needs. For example, it might be important for some traffic—such as mail and other vital services—to have a baseline amount of bandwidth available at all times, while other services—peer-to-peer file sharing comes to mind—should not be allowed to consume more than a certain amount. For addressing these kinds of requirements or concerns, the class-based queue (cbq) discipline offers a slightly larger set of options.

To illustrate how to use cbq, we move on to another example, which builds on the rule sets from previous chapters. This is a small local network, where we want to let the users on the local network connect to a predefined set of services outside their own network and also access a web server somewhere on the local network.

Queue Definition

Here, all queues are set up on the external, Internet-facing interface. This approach makes sense mainly because bandwidth is more likely to be limited on the external link than on the local network. In principle, however, allocating queues and running traffic shaping can be done on any network interface. The following example setup includes a cbq queue for a total bandwidth of 2Mb with six subqueues.

altq on $ext_if cbq bandwidth 2Mb queue { main, ftp, udp, web, ssh, icmp }
      queue main bandwidth 18% cbq(default borrow red)
      queue ftp bandwidth 10% cbq(borrow red)
      queue udp bandwidth 30% cbq(borrow red)
      queue web bandwidth 20% cbq(borrow red)
      queue ssh bandwidth 20% cbq(borrow red) { ssh_interactive, ssh_bulk }
            queue ssh_interactive priority 7 bandwidth 20%
            queue ssh_bulk priority 0 bandwidth 80%
      queue icmp bandwidth 2% cbq

The subqueue main has 18 percent of the bandwidth and is designated as the default queue. This means any traffic that matches a pass rule but is not explicitly assigned to some other queue ends up here. The borrow and red keywords mean that the queue may “borrow” bandwidth from its parent queue, while the system attempts to avoid congestion by applying the RED algorithm.

The other queues follow more or less the same pattern, up to the subqueue ssh, which itself has two subqueues with separate priorities. Here we see a variation on the ACK priority example. Bulk SSH transfers, typically SCP file transfers, are transmitted with a ToS indicating throughput, while interactive SSH traffic has the ToS flag set to low delay and skips ahead of the bulk transfers. The interactive traffic is likely to be less bandwidth-consuming and gets a smaller share of the bandwidth, but receives preferential treatment because of the higher priority value assigned to it.

This scheme also helps the speed of SCP file transfers, since the ACK packets for the SCP transfers will be assigned to the higher priority subqueue.

Finally, we have the icmp queue, which is reserved for the remaining 2 percent of the bandwidth, from the top level. This guarantees a minimum amount of bandwidth for ICMP traffic that we want to pass but that does not match the criteria for being assigned to the other queues.

Rule Set

To make it all happen, we use these pass rules, which indicate which traffic is assigned to the queues and their criteria:

set skip on { lo, $int_if }
pass log quick on $ext_if proto tcp to port ssh queue (ssh_bulk, ssh_interactive)
pass in quick on $ext_if proto tcp to port ftp queue ftp
pass in quick on $ext_if proto tcp to port www queue web
pass out on $ext_if proto udp queue udp
pass out on $ext_if proto icmp queue icmp
pass out on $ext_if proto tcp from $localnet to port $client_out

The rules for ssh, ftp, www, udp, and icmp assign traffic to their respective queues. The last catchall rule passes all other traffic from the local network, lumping it into the default main queue.
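These rules assume a few macros defined earlier in pf.conf. A plausible sketch follows; the addresses and the service list are illustrative, not from the original configuration:

ext_if = "kue0"
int_if = "fxp0"
localnet = "192.168.2.0/24"
client_out = "{ ftp-data, ftp, ssh, www, pop3, auth, https, pop3s }"

With these in place, pfctl -nf /etc/pf.conf should parse the whole configuration without loading it, which is a handy sanity check before reloading.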

A Basic HFSC Traffic Shaper

The simple schedulers we have looked at so far can make for rather efficient setups, but network admins with traffic-shaping ambitions tend to look for a little more flexibility than can be found in the pure priority-based queues or the simple class-based variety. After the gentle queuing introduction we’ve just been through, how does it sound to you if I say there is a regime with flexible bandwidth allocation, guaranteed lower and upper bounds for bandwidth available to each queue, and variable allocations over time—and one that only starts shaping when there is an actual need?

The HFSC queueing algorithm (hfsc in pf.conf terminology) offers all of these, but the added flexibility comes at a price: The setup is a tad more complex than the other types, and tuning your setup for an optimal result can be quite an interesting process.

Queue Definition

Working from the same configuration we altered slightly earlier, we insert this queue definition early in the pf.conf file:

altq on $ext_if bandwidth $ext_bw hfsc queue { main, spamd }
    queue main bandwidth 99% priority 7 qlimit 100 hfsc (realtime 20%, linkshare 99%) \
        { q_pri, q_def, q_web, q_dns }
        queue q_pri bandwidth 3% priority 7 hfsc (realtime 0, linkshare 3% red)
        queue q_def bandwidth 47% priority 1 hfsc (default realtime 30% linkshare 47% red)
        queue q_web bandwidth 47% priority 1 hfsc (realtime 30% linkshare 47% red)
        queue q_dns bandwidth 3% priority 7 qlimit 100 hfsc (realtime (30Kb 3000 12Kb), linkshare 3%)
    queue spamd bandwidth 0% priority 0 qlimit 300 hfsc (realtime 0, upperlimit 1%, linkshare 1%)

As you can see, the hfsc queue definitions take slightly different parameters than the simpler disciplines. We start off with this rather small hierarchy by splitting the top-level queue into two. At the next level, we subdivide the main queue into several subqueues, each with a defined priority. All the subqueues have a realtime value set; this is the guaranteed minimum bandwidth allocated to the queue. The optional upperlimit sets a hard upper limit on the queue’s allocation. The linkshare parameter sets the allocation the queue will have available when it is backlogged; that is, it has started to eat into its qlimit allocation.

In case of congestion, each queue by default has a pool of 50 slots, the queue limit (qlimit), to keep packets around when they cannot be transmitted immediately. In this example, the top-level queues main and spamd both have larger than default pools set by their qlimit setting: 100 for main and 300 for spamd. Cranking up queue sizes here means we are a little less likely to drop packets when the traffic approaches the set limits, but it also means that when the traffic shaping kicks in, we will see increased latency for connections that end up in these larger than default pools.

The queue hierarchy here uses two familiar tricks to make efficient use of available bandwidth:

  • It uses a variation of the high- and low-priority mix demonstrated in the earlier pure priority example.

  • We speed up almost all other traffic (and most certainly the web traffic that appears to be the main priority here) by allocating a small but guaranteed portion of bandwidth for name service lookups. For the q_dns queue, we set up the realtime value with a time limit—after 3000 milliseconds, the realtime allocation goes down to 12Kb. This can be useful to speed connections that transfer most of their payload during the early phases.
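The decaying guarantee for q_dns uses hfsc's two-piece service curve syntax: a realtime value written as (m1 d m2) grants the queue rate m1 for the first d milliseconds of a backlog, and rate m2 from then on. Read as an annotated sketch of the definition above:

queue q_dns bandwidth 3% hfsc (realtime (30Kb 3000 12Kb), linkshare 3%)
# realtime: 30Kb for the first 3000 ms of backlog, 12Kb thereafter

The same (m1 d m2) form can also be used for the linkshare and upperlimit parameters.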

Rule Set

The next step is to tie the newly created queues into the rule set. Assuming you have a filtering regime in place already, the tie-in becomes amazingly simple by adding a few match rules:

match out on $ext_if from $air_if:network nat-to ($ext_if) queue (q_def, q_pri)
match out on $ext_if from $int_if:network nat-to ($ext_if) queue (q_def, q_pri)
match out on $ext_if proto tcp to port { www https } queue (q_web, q_pri)
match out on $ext_if proto { tcp udp } to port domain queue (q_dns, q_pri)
match out on $ext_if proto icmp queue (q_dns, q_pri)

Here, the match rules once again do the ACK packet speedup trick with the high- and low-priority queue assignment, just as you saw earlier in the pure priority-based system. The only exception is when we assign traffic to our lowest priority queue, where we really do not care to have any speedup at all:

pass in log on egress proto tcp to port smtp rdr-to 127.0.0.1 port spamd queue spamd

This has the intention of slowing down the spammers a little more on their way to our spamd. With a hierarchical queue system in place, systat queues shows the queues and their traffic as a hierarchy, too.

2 users    Load 0.22 0.25 0.25                      Fri Apr  1 16:43:37 2011

QUEUE              BW SCH  PRIO   PKTS   BYTES  DROP_P  DROP_B QLEN BORROW SUSPEN   P/S    B/S
root_nfe0         20M hfsc    0      0       0       0       0    0                  0      0
 main             19M hfsc    7      0       0       0       0    0                  0      0
  q_pri          594K hfsc    7   1360   82284       0       0    0                 11    770
  q_def         9306K hfsc    1    158   15816       0       0    0                0.2     11
  q_web         9306K hfsc    1    914  709845       0       0    0                 50  61010
  q_dns          594K hfsc    7    196   17494       0       0    0                  3    277
 spamd              0 hfsc    0    431   24159       0       0    0                  2    174

The root queue is shown as attached to the physical interface—as nfe0 and root_nfe0 in this case. main and its subqueues q_pri, q_def, q_web, and q_dns are shown with their bandwidth allocations and number of bytes and packets passed. The DROP_P and DROP_B columns are where number of packets and bytes dropped, respectively, would appear if we had been forced to drop packets at this stage. The final two columns show live updates of packets per second and bytes per second.

Queueing for Servers in a DMZ

Back in Chapter 5, we set up a network with a single gateway but all externally visible services configured on a separate DMZ network. That way, all traffic to the servers from both the Internet and the internal network has to pass through the gateway.

The network schematic is shown in Figure 7-1, which is identical to Figure 5-2.

With the rule set from Chapter 5 as the starting point, we will be adding some queuing in order to optimize our network resources. The physical and logical layout of the network will not change.

The most likely bottleneck for this network is the bandwidth for the connection between the gateway’s external interface and the Internet at large. The bandwidth elsewhere in our setup is not infinite, of course, but the available bandwidth on any interface in the local network is likely to be less of a limiting factor than the bandwidth actually available for communication with the outside world. For services to be available with the best possible performance, we need to set up the queues so the bandwidth available at the site is made available to the traffic we want to allow.

In our example, it is likely that the interface bandwidth on the DMZ interface is either 100Mb or 1Gb, while the actual available bandwidth for connections from outside the local network is considerably smaller. This consideration shows up in our queue definitions, where you clearly see that the actual bandwidth available for external traffic is the main limitation in the queue setup.

Figure 7-1. Network with DMZ

total_ext = 2Mb
total_dmz = 100Mb
altq on $ext_if cbq bandwidth $total_ext queue { ext_main, ext_web, ext_udp, ext_mail }
queue ext_main bandwidth 25% cbq(default borrow red) { ext_hi, ext_lo }
      queue ext_hi priority 7 bandwidth 20%
      queue ext_lo priority 0 bandwidth 80%
queue ext_web bandwidth 25% cbq(borrow red)
queue ext_udp bandwidth 20% cbq(borrow red)
queue ext_mail bandwidth 30% cbq(borrow red)
altq on $dmz_if cbq bandwidth $total_dmz queue { ext_dmz, dmz_main, dmz_web, dmz_udp, dmz_mail }
queue ext_dmz bandwidth $total_ext cbq(borrow red) queue { ext_dmz_web, ext_dmz_udp, ext_dmz_mail }
      queue ext_dmz_web bandwidth 40% priority 5
      queue ext_dmz_udp bandwidth 10% priority 7
      queue ext_dmz_mail bandwidth 50% priority 3
queue dmz_main bandwidth 25Mb cbq(default borrow red) queue { dmz_main_hi, dmz_main_lo }
      queue dmz_main_hi priority 7 bandwidth 20%
      queue dmz_main_lo priority 0 bandwidth 80%
queue dmz_web bandwidth 25Mb cbq(borrow red)
queue dmz_udp bandwidth 20Mb cbq(borrow red)
queue dmz_mail bandwidth 20Mb cbq(borrow red)

Notice that the total_ext bandwidth limitation determines the allocation for all queues where the bandwidth for external connections is available. To make use of the new queuing infrastructure, we need to make some changes to the filtering rules, too. Keep in mind that any traffic you do not explicitly assign to a specific queue is assigned to the default queue for the interface. Thus, it is important to tune your filtering rules as well as your queue definitions to the actual traffic in your network.

The main part of the filtering rules could end up looking like this after adding the queues:

pass in on $ext_if proto { tcp, udp } to $nameservers port domain queue ext_udp
pass in on $int_if proto { tcp, udp } from $localnet to $nameservers port domain
pass out on $dmz_if proto { tcp, udp } to $nameservers port domain queue ext_dmz_udp
pass out on $dmz_if proto { tcp, udp } from $localnet to $nameservers port domain queue dmz_udp
pass in on $ext_if proto tcp to $webserver port $webports queue ext_web
pass in on $int_if proto tcp from $localnet to $webserver port $webports
pass out on $dmz_if proto tcp to $webserver port $webports queue ext_dmz_web
pass out on $dmz_if proto tcp from $localnet to $webserver port $webports queue dmz_web
pass in log on $ext_if proto tcp to $mailserver port smtp
pass in log on $ext_if proto tcp from $localnet to $mailserver port smtp
pass in log on $int_if proto tcp from $localnet to $mailserver port $email
pass out log on $dmz_if proto tcp to $mailserver port smtp queue ext_mail
pass in on $dmz_if proto tcp from $mailserver to port smtp queue dmz_mail
pass out log on $ext_if proto tcp from $mailserver to port smtp queue ext_dmz_mail

Notice that only traffic that will pass either the DMZ interface or the external interface is assigned to queues. In this configuration, with no externally accessible services on the internal network, queuing on the internal interface would not make much sense, since it is likely the part of our network with the least restrictions on available bandwidth.

Using ALTQ to Handle Unwanted Traffic

So far, we have focused on queuing as a method to make sure specific kinds of traffic are let through as efficiently as possible given the conditions that exist in and around your network. Now, we will look at two examples that present a slightly different approach to identify and handle unwanted traffic. These examples demonstrate some queuing-related tricks you can use to keep miscreants in line.

Overloading to a Tiny Queue

Think back to the Turning Away the Brutes section (page 86), where we used a combination of state-tracking options and overload rules to fill up a table of addresses for special treatment. The special treatment we demonstrated in Chapter 6 was to cut all connections, but it is equally possible to assign overload traffic to a specific queue instead.

Consider this rule from our earlier class-based bandwidth example earlier in the chapter:

pass log quick on $ext_if proto tcp to port ssh flags S/SA keep state queue (ssh_bulk, ssh_interactive)

We could add state-tracking options, like this:

pass log quick on $ext_if proto tcp to port ssh flags S/SA \
    keep state (max-src-conn 15, max-src-conn-rate 5/3, \
    overload <bruteforce> flush global) queue (ssh_bulk, ssh_interactive)

Then we could define an additional, deliberately tiny queue:

queue smallpipe bandwidth 1kb cbq

And then assign traffic from miscreants to the small-bandwidth queue with the following rule:

pass inet proto tcp from <bruteforce> to port $tcp_services queue smallpipe
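Note that a cbq queue only exists as part of an altq hierarchy on an interface, so smallpipe would also need to be added to the parent declaration's queue list. A hedged sketch, based on the earlier class-based definition:

altq on $ext_if cbq bandwidth 2Mb queue { main, ftp, udp, web, ssh, icmp, smallpipe }

with the queue smallpipe line placed among the other queue definitions.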

It might also be useful to supplement rules like these with table-entry expiry, as described in Tidying Your Tables with pfctl.
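The pfctl table commands make this cleanup easy to script. For example, a nightly cron job could remove table entries older than 24 hours (pfctl's expire argument takes seconds):

$ sudo pfctl -t bruteforce -T expire 86400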

Queue Assignments Based on Operating System Fingerprint

Chapter 6 covered several ways to use spamd to cut down on spam. If running spamd is not an option in your environment, you can use a queue and rule set based on the common knowledge that machines that send spam are likely to run a particular operating system.

PF has a fairly reliable operating system fingerprinting mechanism, which detects the operating system at the other end of a network connection based on characteristics of the initial SYN packets at connection setup. The following may be a simple substitute for spamd if you have determined that legitimate mail is highly unlikely to be delivered from systems that run that particular operating system:

pass quick proto tcp from any os "Windows" to $ext_if port smtp queue smallpipe

Here, email traffic originating from hosts that run that particular operating system gets no more than 1 kilobit per second of your bandwidth, with no borrowing.




[38] Daniel’s article about this version of his setup, at http://www.benzedrine.cx/ackpri.html, contains a more detailed analysis.
