Tactical TE Design

In the tactical model, you build TE LSPs to work around congestion. The key things to deal with in this model are as follows:

  • When you decide to build TE LSPs

  • Where you put them

  • When you take them down

  • What nifty TE features you can use

The following sections sift through all four considerations.

One thing that this section assumes is that you have already configured MPLS TE on every link in your network, as well as in your IGP. In other words, you have everything you need to use MPLS TE tunnels already in place, except for the TE tunnels themselves. When you first decide to roll out MPLS, even if it's not for TE but for VPNs or some other reason, it can't hurt to enable MPLS TE everywhere (IGP and on interfaces) so that when you do need to use it in a tactical manner, it's right there waiting for you.

When You Decide to Build TE LSPs

Consider the sample network shown in Figure 9-1—specifically, the OC-12 link between St. Louis and Denver. Two cases in which that link can be overloaded for a significant amount of time are

  • A link failure somewhere else in the network (probably either Boston→Seattle or Dallas→San Diego) pushes more traffic than you'd planned for onto the St. Louis→Denver link. The link failure can be short (a line card crashed), or it can be long (a fiber cut that will take days to fix).

  • Something on the other side of that link becomes a major traffic draw. Perhaps you have a customer who has just turned up a major streaming media service, and they're sending a lot of traffic from the East Coast to the West Coast. Maybe there's major breaking news. Maybe a big Denial of Service attack is headed toward the West Coast. All these things are traffic draws.

Suppose a large amount of traffic is going from St. Louis to Denver, as shown in Figure 9-5. It is routed across the St. Louis→Denver link even if the traffic being sent down the link exceeds the link capacity.

Figure 9-5. Excess Traffic from St. Louis to Denver


If you discover a link that's full, how long do you wait before working around it with TE LSPs? You don't want TE LSPs to be built the instant a link starts queuing and dropping packets. And if you have an outage that you know will last only a little while (a router reboot or crash, for example), it's not worth working around.

But if you have a significant traffic disruption that lasts a long time (perhaps an hour or more), you should consider using TE LSPs to see if you can temporarily clear up the problem.

There is no hard-and-fast minimum time you need to wait before using TE LSPs. It's up to you. Start by applying them to failures of an hour or more, get a feel for how long it takes you to apply them, and then factor that into a policy specific to your network. Although it takes far less than an hour to build a TE LSP (it can be done in seconds!), being too responsive to short-term problems can make you trigger-happy and twitchy, and that's no way to run a network.

Where You Put TE LSPs

By the time you get around to determining where to put TE LSPs, you've already decided there's a problem you're going to work around. So now the question becomes “Where do I place TE LSPs so that I don't create another problem?”

Think about the St. Louis→Denver link shown in Figure 9-1 again. If the load on this link increases to 900 Mbps for a significant amount of time, it's clear that you want to push some of that traffic somewhere else, because it's an OC-12 link, which has an IP capacity of only about 600 Mbps. So you're dropping 33 percent of your traffic, which is a bad thing.

The first thing you might think of doing is splitting the traffic along two paths. You obviously want to keep using the St. Louis→Denver link directly for as much traffic as you can, because when it's not congested, that's the best path from St. Louis to Denver.

So what you can do is build two TE LSPs. One goes directly across the St. Louis→Denver path, and the other goes St. Louis→Dallas→San Diego→Denver. The load is split roughly equally between the two LSPs, leading to 450 Mbps across the St. Louis→Denver path and 450 Mbps across the St. Louis→Dallas→San Diego→Denver path.
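As a configuration sketch, the two tunnels on the St. Louis router might look something like this. (Hostnames, addresses, and path names here are illustrative; 192.168.1.6 is used as Denver's router ID, matching Example 9-1. The first tunnel uses a dynamic path-option, which follows the IGP shortest path, namely the direct St. Louis→Denver link; the second pins down the southern route with an explicit path.)

```
! On StLouisWR1 (names and addresses illustrative)
interface Tunnel1
 description StL->Denver, direct path
 ip unnumbered Loopback0
 tunnel destination 192.168.1.6
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng autoroute announce
 tunnel mpls traffic-eng path-option 10 dynamic
!
interface Tunnel2
 description StL->Dallas->SanDiego->Denver
 ip unnumbered Loopback0
 tunnel destination 192.168.1.6
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng autoroute announce
 tunnel mpls traffic-eng path-option 10 explicit name via-dallas
!
! Hop addresses below are made up: Dallas, San Diego, then Denver
ip explicit-path name via-dallas enable
 next-address 192.168.3.1
 next-address 192.168.3.2
 next-address 192.168.1.6
```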

Your network then ends up looking like Figure 9-6.

Figure 9-6. Two LSPs Between St. Louis and Denver, Each Carrying 450 Mbps


Why Two LSPs?

Recall from Chapter 5, “Forwarding Traffic Down Tunnels,” that you never load-share between a TE tunnel path and an IGP path for a tunnel's tail. So if you want to use a TE LSP to get to a particular destination, you also need to build a tunnel across the IGP shortest path so that you'll use both paths.


For the sake of simplicity, assume that you don't have a significant amount of traffic on the Dallas→San Diego link, so it's safe to put 450 Mbps of traffic there for a short time. But suppose that the load increases on the Dallas→San Diego link so that you don't have 450 Mbps of spare capacity there. What do you do then? One thing to do is build a total of three LSPs from St. Louis to Denver—the first one across the St. Louis→Denver path, the second through St. Louis→Dallas→San Diego→Denver, and the third through St. Louis→Chicago→New York→Boston→Seattle→Denver. This splits the traffic across the three paths, sending something like 300 Mbps across each LSP, as shown in Figure 9-7.

Figure 9-7. Three LSPs Between St. Louis and Denver, Each Carrying 300 Mbps


Considering the two-LSP solution, using a 600-Mbps LSP and a 300-Mbps LSP isn't feasible, because putting 600 Mbps of traffic across a link with 600 Mbps of capacity is not a good idea. If even a small burst greater than 600 Mbps occurs, you'll have a lot of delay to deal with. A 450/450 split is better, because no link is full. In the three-LSP case, you can split the traffic into three 300-Mbps LSPs; either solution might be appropriate given other bandwidth demands on the network.

There are four ways you can control the ratio of traffic distributed between the two LSPs:

  1. For equal-cost forwarding, don't reserve any bandwidth or change any administrative weight. Traffic is shared evenly between both LSPs.

  2. Have each tunnel reserve bandwidth in accordance with how much traffic is actually supposed to flow across the link. So on each tunnel interface, configure tunnel mpls traffic-eng bandwidth 450000 to have the tunnel reserve 450 Mbps of bandwidth.

  3. Have each tunnel reserve bandwidth in accordance with the ratio of traffic share you want them to have. So have each tunnel reserve 1 Kbps, or 2 Kbps, or 47 Mbps, or whatever; it doesn't matter. Because all you're interested in is the ratio, reserve whatever bandwidth you want.

  4. Have each tunnel reserve no bandwidth, but set the load-share ratio on the tunnel with tunnel mpls traffic-eng load-share. In this example, you'd configure tunnel mpls traffic-eng load-share 1 on each tunnel.

With only two equal-cost tunnels, it doesn't matter which method you pick. But if you're going to be doing unequal-cost load balancing between two or more tunnels, you should use solution 2, 3, or 4. Use solution 2 (reserving actual bandwidth) if you want a quick way to look at a midpoint and see how much bandwidth each LSP should be carrying. Use solution 3 or 4 if you don't want to worry about accidentally overreserving bandwidth but just want to control the load-share ratios, whether by controlling bandwidth ratios (solution 3) or by setting the load-share value directly (solution 4).
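Assuming IOS syntax, solutions 2 and 4 look something like the following on the two tunnel interfaces (tunnel numbers are illustrative; the bandwidth argument is in kbps):

```
! Solution 2: reserve the bandwidth each LSP is expected to carry
interface Tunnel1
 tunnel mpls traffic-eng bandwidth 450000
interface Tunnel2
 tunnel mpls traffic-eng bandwidth 450000
!
! Solution 4: reserve nothing, but set the load-share weight directly
interface Tunnel1
 tunnel mpls traffic-eng load-share 1
interface Tunnel2
 tunnel mpls traffic-eng load-share 1
```

Because only the ratio between the tunnels matters, equal values produce an even split; a 2:1 split would use values of 2 and 1.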

Laying out LSPs of 300 Mbps and 450 Mbps in a network whose bottleneck is 600 Mbps links can be dangerous; you might still have congestion problems even after you've shifted data around. One way to deal with this is to add more bandwidth, but that takes time. Another way to deal with this problem is to try to optimize your tunnel placement to avoid congestion as much as possible.

Optimizing Tunnel Placement: Step 1

Whereas distributing the traffic over multiple LSPs overcomes the problem of an overloaded link, it might introduce a new, although less-severe, problem—suboptimal forwarding.

In the scenario with three LSPs, consider the East Coast traffic destined for Seattle and San Diego that enters St. Louis. Because two of the three LSPs have Seattle or San Diego as LSP midpoints, any traffic from St. Louis that goes over these LSPs and that is actually destined for Seattle or San Diego first must get to Denver (the tail of all three TE tunnels) before heading back west to Seattle and San Diego.

For example, if traffic were destined for a Web-hosting center in Seattle, and it followed the IGP path from St. Louis, it would go St. Louis→Denver→Seattle. But if you install the three LSPs mentioned in the preceding section, as a result of autoroute, anything destined for Seattle from St. Louis goes over these LSPs to Denver and then follows the IGP path to Seattle. Because of load sharing, approximately one-third of your Seattle-bound traffic will go St. Louis→Chicago→New York→Boston→Seattle→Denver over the tunnel and then Denver→Seattle via IGP-based forwarding. This won't cause any loops, because you're following an LSP to Denver and IP routing from Denver back to Seattle.

If you've got lots of extra capacity on the Seattle→Denver link, maybe this isn't something you have to worry about. You still might be concerned with delay. Traffic going Seattle→Denver→Seattle encounters about 50 ms additional delay, which can be substantial if you're talking about real-time traffic.

Either way, this path is certainly suboptimal.

What Is Optimality?

In order to discuss optimal LSP placement, it's important to first define optimality. Optimality means different things to different people. In this section, optimization is defined as “ensuring that your network traffic takes the path with the lowest possible delay while maximizing the available bandwidth along a path.”

Sometimes your delay and bandwidth constraints are mutually exclusive. If you have a choice between a low-delay, low-bandwidth path and a high-delay, high-bandwidth path, what do you do? The choice is up to you. It depends mostly on the type of traffic you're taking on. Generally, though, because most network traffic is TCP and TCP is more sensitive to packet loss than to delay, you'd choose the high-bandwidth, high-delay path. But if you're building a VoIP network, your choice might be different. Of course, your goal should be to never have to make that choice. When you do have to make it, MPLS TE can help you optimize network performance so that your resource shortage affects your users as little as possible.


Of course, you have the same problem if you have traffic destined for San Diego that will go St. Louis→Dallas→San Diego→Denver→San Diego.

How do you improve this?

Take your three TE LSPs and terminate them not on the node at the other end of the St. Louis→Denver link, but at the entry point to the Western region. So take the three LSPs already built, and change them. Instead of St. Louis→Chicago→New York→Boston→Seattle→Denver, make that LSP St. Louis→Chicago→New York→Boston→Seattle. Do the same with the St. Louis→Dallas→San Diego→Denver LSP: Have it run St. Louis→Dallas→San Diego instead. When traffic comes out of the TE LSP as it enters the West region, the router that is the tail for the TE LSP receives an IP packet and routes it appropriately.

The St. Louis→Denver LSP, of course, needs to stay the same.
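Re-pointing a tunnel's tail is a small change on the headend: update the tunnel destination and trim the last hop off the explicit path. A sketch, with made-up router IDs (192.168.1.9 standing in for the Seattle router) and tunnel numbering:

```
! On StLouisWR1: terminate the northern tunnel in Seattle, not Denver
interface Tunnel3
 tunnel destination 192.168.1.9
 tunnel mpls traffic-eng path-option 10 explicit name via-northeast
!
! Chicago -> New York -> Boston -> Seattle (addresses illustrative;
! the old final hop to Denver is simply dropped)
ip explicit-path name via-northeast enable
 next-address 192.168.2.1
 next-address 192.168.2.2
 next-address 192.168.2.3
 next-address 192.168.1.9
```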

Optimizing Tunnel Placement: Step 2

There's another problem that you can solve, too. Even after you make the adjustments described in Step 1, you still have some suboptimal packet paths.

Take a closer look at the 900-Mbps traffic going St. Louis→Denver. There are only four places that 900 Mbps could have come from:

  • Originated from St. Louis

  • Coming from Chicago

  • Coming from Raleigh

  • Coming from Dallas

Traffic from Dallas to Denver isn't likely to cross St. Louis, though, because Dallas has a more attractive path through San Diego to anything in the West region. So that leaves three possible ways this Denver-bound traffic (or, indeed, any traffic) could have come into St. Louis.

For the sake of simplicity, assume that 300 Mbps of the 900 Mbps Denver-bound traffic originates from St. Louis, 300 Mbps comes from Raleigh, and 300 Mbps comes from Chicago. Because traffic arriving in St. Louis will be pretty evenly distributed across the three LSPs, this means that 100 Mbps of the Chicago-sourced traffic will go over the St. Louis→Chicago→New York→Boston→Seattle path. Put another way, the path for that traffic is Chicago→St. Louis→Chicago→New York→Boston→Seattle. Again, there's no danger of routing loops, because you're forwarding in an LSP, but things are certainly suboptimal.

How do you get around this suboptimality? By building TE LSPs farther away from the source of the congestion.

Assume the following:

  • Chicago sends 300 Mbps of traffic to St. Louis:

    - 100 Mbps is destined for Denver.

    - 100 Mbps is destined for Seattle.

    - 100 Mbps is destined for San Diego.

  • Raleigh sends 300 Mbps of traffic to St. Louis:

    - 100 Mbps is destined for Denver.

    - 100 Mbps is destined for Seattle.

    - 100 Mbps is destined for San Diego.

  • St. Louis originates 300 Mbps of traffic:

    - 100 Mbps is destined for Denver.

    - 100 Mbps is destined for Seattle.

    - 100 Mbps is destined for San Diego.

If you build TE LSPs between the following points, you can individually control each of these 100-Mbps traffic streams and place them anywhere in the network you want:

  • Chicago to Denver

  • Chicago to Seattle

  • Chicago to San Diego

  • Raleigh to Denver

  • Raleigh to Seattle

  • Raleigh to San Diego

  • St. Louis to Denver

  • St. Louis to Seattle

  • St. Louis to San Diego

NOTE

In general, the farther you move from the point of congestion, the more exact control you have over your network traffic. The ultimate case of this is a full mesh of TE LSPs between all routers, or at least all routers with a given role in the network. For example, if you have a full mesh of TE LSPs between all the WRs in the network, you have full control over all traffic entering your WAN, at the level of granularity equal to the amount of traffic between any two WRs. See the sections “Online Strategic TE Design” and “Offline Strategic TE Design” for more information on a full mesh of TE LSPs and how to manage it.


Once you understand this methodology, it's simple to apply. In real life, as soon as you discover the source of congestion, you can make the necessary modifications in a matter of minutes!

By now, hopefully you've seen some of the power of the tactical TE model; however, you still need to address a few more issues.

When to Remove Your Tactical TE Tunnels

Now that you've built all these nifty TE LSPs to work around problems, are you done? Nope. Because your problem was caused by a temporary event, the problem you're trying to solve will eventually go away. Just like your mother periodically harangued you to clean your room, so too should you periodically clean your network. In fact, two cases when you should consider taking TE LSPs down are

  • When they're no longer needed— The problem they're solving doesn't exist anymore.

  • When they're causing problems— A traffic spike somewhere else in the network collides with traffic in a TE LSP.

Determining When TE LSPs Are No Longer Needed

How do you tell if TE LSPs are no longer needed? It's pretty simple. Take the example of the LSPs that were created to work around the congested St. Louis→Denver link. You know that that link has a nominal capacity of 600 Mbps. You know that three TE LSPs are in place to work around the congestion. If the aggregate throughput of these three LSPs falls to significantly less than 600 Mbps (low enough that high traffic rates don't induce delay), you can safely consider bringing down the TE LSPs. However, this requires doing a few things:

  • You need to constantly monitor the network to see what new TE LSPs have been installed and whether existing TE LSPs are still useful.

  • You need to remember why TE LSPs were put up in the first place!

Tracking the existence of TE LSPs is easy enough, because TE LSPs are interfaces, and you will see them with SNMP's IF-MIB or via show interfaces and other such CLI commands.

Remembering why the TE LSPs were put up in the first place is also straightforward, surprisingly enough. Because TE LSPs are interfaces, you can put a description on them, like any other interface. This description is carried in the RSVP setup information as an LSP is set up.
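For example, a description along the following lines (the ticket number, date, and removal criterion are made up) tells whoever finds the tunnel months later exactly why it exists and when it can safely come down:

```
interface Tunnel2
 description TACTICAL: StL->Denver congestion workaround, ticket 1234,
 built 01jun, remove when StL->Denver load stays under 500 Mbps
```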

Determining When TE LSPs Are Causing Problems

TE LSPs can induce artificial congestion—link congestion because of traffic that's there only because some TE LSPs are traversing a non-shortest path across that link.

Remember that the sample network has a TE LSP from St. Louis→Dallas→San Diego that carries 300 Mbps of traffic. The path this LSP follows is made up of OC-12s that are just half-filled. If 400 Mbps of additional traffic starts leaving Atlanta, destined for San Jose, that traffic crosses the Dallas→San Diego link.

If you discover a congested link, the first thing you need to do is check to see if any TE LSPs are crossing that link, and if so, how much bandwidth they're consuming.

If the St. Louis→Denver LSPs haven't been removed, but the Dallas→San Diego link utilization shoots up to 700 Mbps, the first thing to do is check on the Dallas router to see how much the St. Louis→Dallas→San Diego LSP is contributing to the congestion.

You can check this by first determining which LSPs are traversing a given link, as demonstrated in Example 9-1.

Example 9-1. Determining the LSPs Traversing a Link
STLouisWR1#show mpls traffic-eng tunnels interface out pos5/0 brief
Signalling Summary:
    LSP Tunnels Process:            running
    RSVP Process:                   running
    Forwarding:                     enabled
    Periodic reoptimization:        disabled
    Periodic auto-bw collection:    disabled
TUNNEL NAME                      DESTINATION      UP IF     DOWN IF   STATE/PROT
STLouisWR1_t6                    192.168.1.6      -         PO5/0     up/up

Next, you can check one of two things:

  • StLouisWR1 (the headend of this tunnel) to see how much traffic is going down the tunnel

  • The midpoint

Checking the traffic on the headend is simple—as you know by now, TE tunnels are interfaces. show interface tunnel1 on the headend router tells you how much traffic is going down the TE tunnel.
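For instance, filtering the interface output down to the rate counters gives you a quick read. The output here is illustrative (and note that include rate also matches the queueing-strategy line, because “strategy” contains the string “rate”):

```
STLouisWR1#show interfaces tunnel1 | include rate
  Queueing strategy: fifo
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 301000000 bits/sec, 47000 packets/sec
```

An output rate of roughly 300 Mbps would be consistent with the three-way split described earlier.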

You can check the midpoint by finding the incoming label for a particular tunnel and then keeping an eye on the traffic coming in with that label, as demonstrated in Example 9-2.

Example 9-2. Checking the Midpoint for Traffic Flow
DalWR2#show mpls traffic-eng tunnels name-regexp StLouisWR1 | include InLabel
  InLabel  : POS1/2, 17

DalWR2#show mpls forwarding-table labels 17
Local  Outgoing    Prefix            Bytes tag  Outgoing   Next Hop
tag    tag or VC   or Tunnel Id      switched   interface
17     Pop tag     192.168.1.5 8 [1] 1939835    PO1/1      point2point

Keep an eye on the Bytes tag switched value, and see if it increases over time. On distributed systems (such as the GSR), you might not see the counters change for several seconds and then make quite a jump; this is normal.

Checking at the headend is preferred, because it is easier, but either method works.

Assume that the St. Louis→Dallas→San Diego LSP is still carrying 300 Mbps of traffic. This is 300 Mbps of artificial congestion, and if you remove it, the Dallas→San Diego link will carry only 400 Mbps of traffic and will be OK. 400 Mbps on an OC-12 is still a 67 percent traffic load, but 67 percent load is better than 117 percent load!

You now have a tough choice to make. How do you solve the St. Louis→Denver problem and the brand-new Dallas→San Diego problem? You can probably attack the problem further upstream and break the traffic into small enough streams that you can balance them properly. At this point, 1.3 Gbps (St. Louis→Denver 900 Mbps + Dallas→San Diego 400 Mbps) needs to be balanced over a total of 1.8 Gbps of bandwidth (the OC-12s Boston→Seattle, St. Louis→Denver, and Dallas→San Diego), so you're running pretty close to the edge already. You might be able to further optimize this problem, though. Use TMS or your favorite traffic-matrix generation tool (see Chapter 10) to figure out what traffic is crossing those links and where it's coming from, and place the LSPs in a way that makes sense. If no single traffic flow can be put onto a single link, split the flow into multiple LSPs.

Hacky? Sure. Difficult to track? Yeah, probably. But this is a workaround to a problem that exists until you get more bandwidth. If you have an OC-192 St. Louis→Denver coming in three days from now, it might not be worth doing all this work in the interim. Or it might be. It depends on how badly your customers are affected and what this means to your business.

On the other hand, if your St. Louis→Denver OC-192 isn't due in for six months, and you can't get any additional capacity until then, clearly the duct-tape approach is far better than nothing.

Remember, one of the things MPLS TE buys you is the capability to prolong the life of your existing capacity, thus putting off the amount of time until you have to buy new circuits. Every month you save on buying capacity can equal tens or hundreds of thousands of dollars. Of course, nobody ever said it was easy.

If placing all these LSPs by hand seems too complex, skip ahead to the discussion of full-mesh TE models; they can manage a lot of that complexity.

Useful TE Features for Tactical TE

MPLS TE has lots of neat features, as covered throughout this book. There's auto bandwidth, Fast Reroute, DiffServ-aware TE, forwarding adjacency, and lots more. But not all of these features are appropriate for the tactical model. Table 9-2 lists the major TE features that might be suitable for a tactical TE deployment.

Table 9-2. TE Feature Recommendations for Tactical TE Deployment

  • Auto bandwidth— Yes. (Chapter 5)

  • Autoroute— Possibly. It depends on what your needs are. If you need to steer traffic around that's only destined for a particular BGP neighbor, you can use static routes. Autoroute is easier to scale, because you don't have to manage a bunch of static routes, but make sure that what autoroute does is what you want. (Chapter 5)

  • Forwarding adjacency— Yes, but pay attention. Enabling forwarding adjacency changes not only the traffic patterns for traffic from the TE headend router, but it also influences the path decisions other routers make, which can change your traffic flow. FA can solve some problems, but it can create others if you're not careful. (Chapter 5)

  • Fast Reroute— No. It doesn't make sense to FRR-protect every link in your TE cloud if you're not going to be running TE LSPs over most of them. You'd need to configure FRR everywhere, and the nature of tactical TE is that you don't use TE most of the time, so you'd have a large FRR infrastructure deployed that wouldn't get used. (Chapter 7)

  • DiffServ-aware TE— No. Because you might have both IP and MPLS TE traffic in the same queue, administratively reserving space from a subpool can give you a false sense of security. (Chapter 6)

  • Administrative weight— Probably not. In the tactical model, you'll most likely build explicit paths, in which case admin-weight does you no good, but if you see a use for it in your application, by all means go ahead. (Chapter 3)

  • Link attributes and LSP affinities— See administrative weight. The same concerns apply here. (Chapter 3)
