Link protection can be divided into four sections:
Prefailure configuration
Failure detection
Connectivity restoration
Post-failure signalling
The following sections examine each of these topics in detail.
You need to note one subtle point: Link protection, like all other TE applications, is unidirectional. In order to protect LSPs flowing from A to B across a link, you need protection in the A→B direction. To protect LSPs flowing from B to A across the same link, you also need to build a protection LSP in the B→A direction. Most of the configurations or concepts presented in this chapter are unidirectional unless specified otherwise.
There are two places where you have to configure things related to link protection:
At the headend on the tunnel interface you want to protect
At the PLR
You might not want to protect all your primary tunnels. Remember that link protection means that, when a link goes down, the LSPs that would have gone over that link are instead sent across some other path in the network. If you end up protecting too much traffic (protecting an OC-48 with an LSP that goes down an OC-3, for example), you might end up making things not much better than if there was no protection.
That's why TE tunnels don't request protection by default. In order to have a TE tunnel request protection, you need to explicitly configure the tunnel to ask for protection. You do this using the tunnel mpls traffic-eng fast-reroute configuration under the primary tunnel interface at the headend, as shown in Example 7-1.
7200a#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
7200a(config)#interface tunnel 1
7200a(config-if)#tunnel mpls traffic-eng fast-reroute
When you configure tunnel mpls traffic-eng fast-reroute, the headend sets the SESSION_ATTRIBUTE flag 0x01 (“Local protection desired”) in the PATH message for that tunnel.
This can be observed in the output of debug ip rsvp path detail on the 7200a router, as demonstrated in Examples 7-2 and 7-3.
Example 7-2 shows the output before tunnel mpls traffic-eng fast-reroute is configured.
7200a#debug ip rsvp path detail
*Oct 16 14:38:56.460: SESSION_ATTRIBUTE type 7 length 24:
*Oct 16 14:38:56.460: Setup Prio: 7, Holding Prio: 7
*Oct 16 14:38:56.460: Flags: SE Style
*Oct 16 14:38:56.460: Session Name: 7200a_t1
The only flag set is SE Style.
Example 7-3 shows the output after tunnel mpls traffic-eng fast-reroute is configured on the primary tunnel interface.
7200a#debug ip rsvp path detail
*Oct 16 14:40:57.124: SESSION_ATTRIBUTE type 7 length 24:
*Oct 16 14:40:57.124: Setup Prio: 7, Holding Prio: 7
*Oct 16 14:40:57.124: Flags: Local Prot desired, Label Recording, SE Style
*Oct 16 14:40:57.124: Session Name: 7200a_t1
After tunnel mpls traffic-eng fast-reroute is configured, three flags are now set, as detailed in Table 7-2.
Flag | Description |
---|---|
Local Prot desired | This is how the headend indicates to any downstream nodes that it would like local protection of some sort (either link or node) for this LSP. |
Label Recording | Not used in link protection, but as you will see later, it is used in node protection. |
SE Style | Stays the same as before tunnel mpls traffic-eng fast-reroute was configured. |
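The flag names in the debug output correspond to individual bits in the SESSION_ATTRIBUTE flags field. As a minimal sketch, assuming the bit values defined in RFC 3209 (0x01, 0x02, and 0x04), the following Python decodes a raw flags value into the names IOS prints. The function name `describe_flags` is hypothetical and exists only for this illustration.

```python
from enum import IntFlag


class SessionAttrFlags(IntFlag):
    """SESSION_ATTRIBUTE flag bits, assuming the values from RFC 3209."""
    LOCAL_PROT_DESIRED = 0x01
    LABEL_RECORDING = 0x02
    SE_STYLE = 0x04


# Display names chosen to match the IOS debug output shown above.
_DISPLAY = {
    SessionAttrFlags.LOCAL_PROT_DESIRED: "Local Prot desired",
    SessionAttrFlags.LABEL_RECORDING: "Label Recording",
    SessionAttrFlags.SE_STYLE: "SE Style",
}


def describe_flags(value: int) -> list[str]:
    """Return the display name of every flag bit set in value."""
    return [name for bit, name in _DISPLAY.items() if value & bit]
```

With these assumptions, a flags value of 0x04 decodes to SE Style only (the state before fast-reroute is configured), and 0x07 decodes to all three flags.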
Enabling FRR at the PLR involves two things:
Creating a backup tunnel to the NHop
Configuring the protected link to use the backup tunnel upon failure
In local protection, a backup tunnel needs to be built from the PLR to the MP. In link protection, the MP is the node on the other end of the protected link, as depicted in Figure 7-8. The primary LSP in Figure 7-8 goes from 7200a to 7200c. The midpoints are 12008a and 12008c. The link that is being protected in this example is the POS link that goes between 12008a and 12008c in the 12008a→12008c direction.
If this link fails, the LSPs that are traversing this link need to be protected and rerouted locally to 12008c, which is the downstream end of this link.
You need to be sure to configure the backup tunnel so that it doesn't attempt to cross the link it's protecting. This would defeat the entire concept of local protection. You can use one of two methods to build the path for this backup tunnel:
Use an explicitly routed path option to route the backup tunnel away from the link it is protecting.
Use the exclude-address knob to tell a backup LSP to use any path it can find that avoids the link it's protecting.
NOTE
It is important to understand that, for link protection to work, the backup tunnel must originate at the router on the upstream end of the protected link and terminate on the router at the downstream end of the protected link. It is not important what path the backup tunnel takes, as long as it doesn't try to go out the link it's protecting. You might want to provision the backup LSP along a path that has available bandwidth comparable to the resources it's protecting.
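To illustrate the idea of routing a backup tunnel around the link it protects, here is a rough Python sketch of a shortest-path computation that skips an excluded link. It is not how IOS implements CSPF or exclude-address; the topology, the per-link cost of 1, and the names `shortest_path` and `LINKS` are assumptions made for this example.

```python
import heapq

# Hypothetical unidirectional topology; every link cost is assumed to be 1.
LINKS = {
    ("12008a", "12008c"): 1,  # the protected link
    ("12008a", "12008b"): 1,
    ("12008b", "12008d"): 1,
    ("12008d", "12008c"): 1,
}


def shortest_path(links, src, dst, excluded=frozenset()):
    """Dijkstra over directed links, ignoring any link listed in excluded."""
    graph = {}
    for (a, b), cost in links.items():
        if (a, b) not in excluded:
            graph.setdefault(a, []).append((b, cost))
    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, link_cost in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (cost + link_cost, nxt, path + [nxt]))
    return None  # no path avoids the excluded links
```

Calling `shortest_path(LINKS, "12008a", "12008c", excluded={("12008a", "12008c")})` in this toy topology returns the three-hop path through 12008b and 12008d, whereas the unconstrained call returns the direct link.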
Example 7-4 shows how to build the path for the backup tunnel using the exclude-address syntax.
12008a#show running-config interface tunnel1
interface Tunnel1
 description Link Protection Tunnel (Backup)
 ip unnumbered Loopback0
 no ip directed-broadcast
 tunnel destination 11.11.11.11
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng path-option 5 explicit name nhop
end
12008a#show ip explicit-paths name nhop
PATH link-protection-tunnel (loose source route, path complete, generation 8)
 1: exclude-address 10.0.5.11
Example 7-4 excludes 12008c's POS1/1 interface address (10.0.5.11) to ensure that the protected link 12008a→12008c is not included in CSPF for the backup tunnel on 12008a. As you can see from Example 7-5, the backup tunnel comes up. If you examine the ERO, you can see that the protected link is not a part of it.
12008a#show mpls traffic-eng tunnel tunnel1
Name: Link Protection Tunnel (Backup) (Tunnel1) Destination: 11.11.11.11
  Status:
    Admin: up Oper: up Path: valid Signalling: connected
    path option 5, type explicit nhop (Basis for Setup, path weight 3)
  Config Parameters:
    Bandwidth: 0 kbps (Global) Priority: 7 7 Affinity: 0x0/0xFFFF
    Metric Type: TE (default)
    AutoRoute: disabled LockDown: disabled Loadshare: 0 bw-based
    auto-bw: disabled(0/60) 0 Bandwidth Requested: 0
  InLabel : -
  OutLabel : POS1/1, 38
  RSVP Signalling Info:
    Src 5.5.5.5, Dst 11.11.11.11, Tun_Id 1, Tun_Instance 93
    RSVP Path Info:
      My Address: 5.5.5.5
      Explicit Route: 10.0.11.10 10.0.9.16 10.0.7.11 11.11.11.11
Just configuring the backup tunnel and calling the explicit path “backup” does not make traffic go over this tunnel when the protected link goes down. After you build the backup tunnel, you need to tell an interface to use that tunnel for protection.
How do you tie the protection of interface POS 1/0 (the outgoing interface of the protected link on 12008a) to this backup tunnel? You use the configuration highlighted in Example 7-6 on that interface.
12008a#show running-config interface pos1/0
interface POS1/0
 ip address 10.0.5.5 255.255.255.0
 no ip directed-broadcast
 mpls traffic-eng tunnels
 mpls traffic-eng backup-path Tunnel1
 ip rsvp bandwidth 155000 155000
End
The command mpls traffic-eng backup-path Tunnel1 protects 12008a's POS1/0 interface with Tunnel1.
The backup tunnel is now signalled and ready to go. If failure is detected on the protected link, all the traffic from LSPs requesting protection is forwarded over the backup tunnel. Until then, the backup LSP does not carry any traffic. Example 7-7 shows the output of show mpls traffic-eng fast-reroute database before any failure has occurred but after the backup tunnel is ready to protect the primary LSP.
12008a#show mpls traffic-eng fast-reroute database
Tunnel head fast reroute information:
Prefix       Tunnel       In-label  Out intf/label  FRR intf/label  Status
LSP midpoint frr information:
LSP identifier      In-label  Out intf/label  FRR intf/label  Status
4.4.4.4 1 [1520]    16        PO1/0:33        Tu1:33          ready
Here is a detailed explanation of the key fields in this output:
LSP identifier— This column shows the source address of the primary LSP being protected (4.4.4.4), its tunnel ID (1), and its LSP ID (1520).
In-label— This column shows the incoming label that is being protected. This label corresponds to the tunnel listed in the LSP identifier column.
Out intf/label— This column shows the interface of the protected link (PO1/0) and the tunnel's outgoing label on that interface (33).
FRR intf/label— This column shows the backup tunnel that is being used to back up the primary tunnel. When POS1/0 fails, the packet that would have gone out POS1/0 with label 33 is instead sent down the backup tunnel, but with that same label value.
Status— This column shows ready, meaning that FRR protection is ready to back up the primary tunnel, and that a failure has not yet occurred. If the Status field shows active, the protected link is currently down, and protection is active (sometimes referred to as in place). The Status field can also show partial, meaning that all the pieces required to back up a primary LSP are not yet available. A tunnel that has requested protection is not protected, generally because the interface is not configured with mpls traffic-eng backup-path, or the backup tunnel is not up.
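If you collect this output from many PLRs, it can help to split each midpoint entry into the fields described above. The following Python is a minimal sketch that assumes the entry layout shown in Example 7-7; the regular expression and the names `FrrEntry` and `parse_frr_entry` are illustrative, not part of any Cisco tool, and other IOS releases may format the line differently.

```python
import re
from typing import NamedTuple


class FrrEntry(NamedTuple):
    source: str      # source address of the protected LSP
    tunnel_id: int
    lsp_id: int
    in_label: int
    out_intf: str    # interface of the protected link
    out_label: int
    frr_intf: str    # backup tunnel
    frr_label: int
    status: str      # ready, active, or partial


_ENTRY = re.compile(
    r"(?P<source>\d+\.\d+\.\d+\.\d+)\s+(?P<tunnel_id>\d+)\s+\[(?P<lsp_id>\d+)\]\s+"
    r"(?P<in_label>\d+)\s+(?P<out_intf>[^:\s]+):(?P<out_label>\d+)\s+"
    r"(?P<frr_intf>[^:\s]+):(?P<frr_label>\d+)\s+(?P<status>\w+)"
)


def parse_frr_entry(line: str) -> FrrEntry:
    """Parse one 'LSP midpoint frr information' entry line."""
    match = _ENTRY.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"unrecognized FRR database line: {line!r}")
    g = match.groupdict()
    return FrrEntry(
        g["source"], int(g["tunnel_id"]), int(g["lsp_id"]), int(g["in_label"]),
        g["out_intf"], int(g["out_label"]), g["frr_intf"], int(g["frr_label"]),
        g["status"],
    )
```

For the entry in Example 7-7, the parsed result has `in_label` 16, `out_intf` PO1/0, `frr_intf` Tu1, and `status` ready.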
If the backup tunnel exists and is up, the PLR (12008a) responds by setting the RRO subobject IPv4 flags to 0x01, indicating “Local protection available.” This is specified in RFC 3209. You can see this from the output of debug ip rsvp resv detail on 7200a, as demonstrated in Example 7-8.
7200a#debug ip rsvp resv detail
*Oct 17 08:38:04.216: RSVP: version:1 flags:0000 type:RESV cksum:0000 ttl:255 reserved:0 length:152
*Oct 17 08:38:04.220: SESSION type 7 length 16:
*Oct 17 08:38:04.220: Tun Dest 12.12.12.12 Tun ID 1 Ext Tun ID 4.4.4.4
*Oct 17 08:38:04.220: HOP type 1 length 12:
*Oct 17 08:38:04.220: Hop Addr: 10.0.3.5 LIH: 0x0
*Oct 17 08:38:04.220: TIME_VALUES type 1 length 8 :
*Oct 17 08:38:04.220: Refresh Period (msec): 30000
*Oct 17 08:38:04.220: STYLE type 1 length 8 :
*Oct 17 08:38:04.220: RSVP_SE_OPTION
*Oct 17 08:38:04.220: FLOWSPEC type 2 length 36:
*Oct 17 08:38:04.220: version = 0 length in words = 7
*Oct 17 08:38:04.220: service id = 5, service length = 6
*Oct 17 08:38:04.220: tspec parameter id = 127, tspec flags = 0, tspec length = 5
*Oct 17 08:38:04.220: average rate = 0 bytes/sec, burst depth = 1000 bytes
*Oct 17 08:38:04.220: peak rate = 2147483647 bytes/sec
*Oct 17 08:38:04.220: min unit = 0 bytes, max unit = 0 bytes
*Oct 17 08:38:04.220: FILTER_SPEC type 7 length 12:
*Oct 17 08:38:04.220: Tun Sender: 4.4.4.4, LSP ID 1520
*Oct 17 08:38:04.220: LABEL type 1 length 8 : 00000012
*Oct 17 08:38:04.220: RECORD_ROUTE type 1 length 44:
*Oct 17 08:38:04.220: 10.0.5.5/32, Flags:0x1 (Local Prot Avail/to NHOP)
*Oct 17 08:38:04.220: Label record: Flags 0x1, ctype 1, incoming label 16
*Oct 17 08:38:04.220: 10.0.17.11/32, Flags:0x0 (No Local Protection)
*Oct 17 08:38:04.220: Label record: Flags 0x1, ctype 1, incoming label 33
*Oct 17 08:38:04.220: 10.0.17.12/32, Flags:0x0 (No Local Protection)
*Oct 17 08:38:04.220:
*Oct 17 08:38:04.220: RSVP 4.4.4.4_1520-12.12.12.12_1: RESV message arrived from 10.0.3.5 on POS3/0
The highlighted output in Example 7-8 also shows that protection is available to the PLR's NHop.
Entering show ip rsvp sender detail on 7200a and 12008a, as shown in Example 7-9, is useful to get a quick synopsis of the FRR states of the two routers.
7200a#show ip rsvp sender detail
PATH:
  Tun Dest 12.12.12.12 Tun ID 1 Ext Tun ID 4.4.4.4
  Tun Sender: 4.4.4.4, LSP ID: 1520
  Path refreshes being sent to NHOP 10.0.3.5 on POS3/0
  Session Attr::
    Setup Prio: 7, Holding Prio: 7
    Flags: Local Prot desired, Label Recording, SE Style
    Session Name: 7200a_t1
  ERO:
    10.0.3.5 (Strict IPv4 Prefix, 8 bytes, /32)
    10.0.5.11 (Strict IPv4 Prefix, 8 bytes, /32)
    10.0.17.12 (Strict IPv4 Prefix, 8 bytes, /32)
    12.12.12.12 (Strict IPv4 Prefix, 8 bytes, /32)
  RRO:
    Empty
  Traffic params - Rate: 0G bits/sec, Max. burst: 1K bytes
  Fast-Reroute Backup info:
    Inbound FRR: Not active
    Outbound FRR: No backup tunnel selected
12008a#show ip rsvp sender detail
PATH:
  Tun Dest 12.12.12.12 Tun ID 1 Ext Tun ID 4.4.4.4
  Tun Sender: 4.4.4.4, LSP ID: 1520
  Path refreshes arriving on POS2/1 from PHOP 10.0.3.4
  Path refreshes being sent to NHOP 10.0.5.11 on POS1/0
  Session Attr::
    Setup Prio: 7, Holding Prio: 7
    Flags: Local Prot desired, Label Recording, SE Style
    Session Name: 7200a_t1
  ERO:
    10.0.5.11 (Strict IPv4 Prefix, 8 bytes, /32)
    10.0.17.12 (Strict IPv4 Prefix, 8 bytes, /32)
    12.12.12.12 (Strict IPv4 Prefix, 8 bytes, /32)
  RRO:
    10.0.3.4/32, Flags:0x0 (No Local Protection)
  Traffic params - Rate: 0G bits/sec, Max. burst: 1K bytes
  Fast-Reroute Backup info:
    Inbound FRR: Not active
    Outbound FRR: Ready -- backup tunnel selected
      Backup Tunnel: Tu1 (label 33)
      Bkup Sender Template:
        Tun Sender: 10.0.11.5, LSP ID: 506
      Bkup FilerSpec:
        Tun Sender: 10.0.11.5, LSP ID 506
One of the most important pieces of any dynamic protocol is failure detection. This is doubly important in protection mechanisms because the longer it takes to detect a failure, the longer it takes to kick in the protection mechanism designed to circumvent that failure, and the less good your protection mechanism does you.
The methods for detecting these failures, and the complexity involved, vary. It is easy to determine that a link has gone down if the PLR's own interface has been administratively shut down. It is harder when the failure occurs at or between the PLR and the other end of the circuit. Fortunately, detection of a failed link is nothing new. The mechanisms in place to aid the detection of these failures are
Failure detection mechanisms specific to a particular physical layer, such as SONET
For point-to-point links, PPP or HDLC keepalives
RSVP hello extensions
Link protection needs to be able to detect that a directly connected link to the NHop has gone down. The next two sections examine detection using Layer 2 alarms and RSVP hellos in detail.
When link protection was designed, SONET APS 50-millisecond convergence was used as the metric. An important part of achieving this goal is rapid detection of an interface's failure. If it takes 10 seconds to figure out that a link is down, that's 10 seconds of lost traffic that was forwarded out that link, and that's not good. SONET's APS (and SDH's MSP) are triggered by various physical alarms; FRR merely hooks into those alarms and so detects interface failure as quickly as APS and MSP would.
NOTE
Although FRR can be triggered by SONET alarms, it is supported only on POS interfaces, not ATM SONET interfaces.
NOTE
Although FRR works off the same alarms that trigger APS, you do not need to configure APS to get FRR to work. They're two completely independent mechanisms that just happen to use the same failure trigger.
It is a good idea to enable pos ais-shut on both sides of any POS link; this sends a Line Alarm Indication Signal (LAIS) when the interface is administratively shut down. This is independent of whether you use FRR or not; pos ais-shut simply helps a router detect when the interface on the other side of a link has been shut down. Without pos ais-shut, the router needs to rely on PPP/HDLC keepalives timing out to figure out that an interface went down, and keepalive timeout can take a while.
RFC 3209 defines RSVP hellos in Section 5. Here is part of the definition:
The RSVP Hello extension enables RSVP nodes to detect when a neighboring node is not reachable. The mechanism provides node to node failure detection. When such a failure is detected it is handled much the same as a link layer communication failure. This mechanism is intended to be used when notification of link layer failures is not available and unnumbered links are not used, or when the failure detection mechanisms provided by the link layer are not sufficient for timely node failure detection.
Physical layer alarms are not always available. For example, a Gigabit Ethernet interface that's connected to a switch doesn't always lose the link signal when a link in the same VLAN goes down.
If you want to detect failures on non-POS interfaces, you might want to use RSVP hellos. However, you need to know that RSVP hello-based failure detection is somewhat slower than Layer 2 alarm-based detection. It can take several hundred milliseconds to detect a neighbor failure using RSVP hellos, depending on how you tune them.
Even so, RSVP hello-based detection is considered sufficient for failure detection in local protection, and convergence is faster than plain IP or MPLS TE without FRR.
Figure 7-9 shows the configuration used for RSVP hellos.
You need to configure RSVP hellos at both the global and interface levels, as shown in Example 7-10.
12008a(config)#ip rsvp signalling hello
12008a(config)#interface pos 1/0
12008a(config-if)#ip rsvp signalling hello ?
  missed-acks       # missed Hello Acks which triggers neighbor down
  refresh-interval  Time between sending Hello Requests, msec.
  <cr>
12008a(config-if)#ip rsvp signalling hello missed-acks ?
  <2-10>  Hello missed
You control the refresh interval by entering the command shown in Example 7-11.
12008a(config-if)#ip rsvp signalling hello refresh-interval ?
<10-10000> Hello interval
<cr>
If no refresh interval is specified, it defaults to 100 milliseconds. Keep this default unless you have a good reason to change it. RSVP hellos need to be configured on both sides of a link.
You can examine the configuration of RSVP hellos using the show ip rsvp hello instance detail command, as demonstrated in Example 7-12.
mpls-12008a#show ip rsvp hello instance detail
Neighbor 10.0.5.11 Source 10.0.5.5
  State: LOST
  Type: ACTIVE (sending requests)
  I/F: PO1/0
  LSPs protecting: 3
  Refresh Interval (msec)
    Configured: 100
    Statistics: no stats collected
  Src_instance 0xCFF52A98, Dst_instance 0x0
  Counters:
    Communication with neighbor lost:
      Num times: 0
      Reasons:
        Missed acks: 0
        Bad Src_Inst received: 0
        Bad Dst_Inst received: 0
        I/F went down: 0
        Neighbor disabled Hello: 0
    Msgs Received: 0
         Sent: 24
         Suppressed: 0
mpls-12008a#
Using the missed-acks knob, you can control how many acknowledgments should be missed before the neighbor is considered down. This defaults to 4 if it is not configured explicitly.
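The two knobs together bound how long hello-based detection takes. As a back-of-the-envelope sketch, assuming detection takes roughly the refresh interval multiplied by the number of missed acknowledgments (real timing depends on the implementation and on when the failure falls within an interval), the following Python estimates the detection time. The function name `hello_detection_time_ms` is hypothetical; the value ranges are the ones shown in the CLI help above.

```python
def hello_detection_time_ms(refresh_interval_ms: int = 100, missed_acks: int = 4) -> int:
    """Rough estimate of RSVP hello detection time in milliseconds.

    Defaults mirror the defaults described in the text: a 100 ms refresh
    interval and 4 missed acknowledgments.
    """
    if not 10 <= refresh_interval_ms <= 10000:
        raise ValueError("refresh interval must be between 10 and 10000 msec")
    if not 2 <= missed_acks <= 10:
        raise ValueError("missed-acks must be between 2 and 10")
    return refresh_interval_ms * missed_acks
```

With the defaults, the estimate is 400 ms, which is consistent with the "several hundred milliseconds" figure mentioned earlier.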
As soon as the failure is detected, the PLR is responsible for switching traffic to the backup tunnel. The internal processing performed on the PLR involves the following:
Making sure a presignalled backup LSP is in place. This includes the new label provided by a new downstream neighbor.
Computing new adjacency information (Layer 2 encapsulation) based on the backup tunnel's outgoing physical interface.
This information is precomputed and ready to be installed in the FIB/LFIB as soon as the failure is detected so as to minimize packet loss; this is referred to as the ready state, as shown in Example 7-13. After the failure is detected, FRR kicks in and is said to be in active state (see Example 7-17). These states can only be observed on the PLR (12008a, in this example).
12008a#show mpls traffic-eng fast-reroute database
Tunnel head fast reroute information:
Prefix       Tunnel       In-label  Out intf/label  FRR intf/label  Status
LSP midpoint frr information:
LSP identifier      In-label  Out intf/label  FRR intf/label  Status
4.4.4.4 1 [1520]    16        PO1/0:33        Tu1:33          ready
The LSP identifier (4.4.4.4 1 [1520]) in Example 7-13 refers to the tunnel's source, the tunnel identifier, and the tunnel instance. This should agree with what you see in the output of show mpls traffic-eng tunnel tunnel1 on 7200a, shown in Example 7-14.
7200a#show mpls traffic-eng tunnel tunnel1
...
RSVP Signalling Info:
Src 4.4.4.4, Dst 12.12.12.12, Tun_Id 1, Tun_Instance 1520
...
The output in Examples 7-13 and 7-14 demonstrates that protection is offered to the LSP originating from 7200a but not to the one originating from 7500a.
Figure 7-10 shows a tunnel from 7500a that is not requesting protection.
You can see this in the configuration of the tunnel on 7500a in Example 7-15.
7500a#show running-config interface tunnel1
interface Tunnel1
 description Primary tunnel(7500a->12008a->12008c->7200c)
 ip unnumbered Loopback0
 no ip directed-broadcast
 ip route-cache distributed
 tunnel destination 12.12.12.12
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng autoroute announce
 tunnel mpls traffic-eng path-option 5 explicit name primary
 tunnel mpls traffic-eng path-option 6 dynamic
 tunnel mpls traffic-eng record-route
End
7500a#show ip explicit-paths name primary
PATH primary (strict source route, path complete, generation 6)
 1: next-address 10.0.2.5
 2: next-address 10.0.5.11
 3: next-address 10.0.17.12
 4: next-address 12.12.12.12
Example 7-15 shows that FRR is not configured under tunnel1 on 7500a.
As soon as a failure is detected, FRR quickly reacts to send traffic down the protection tunnel. This can be seen in the output from show mpls traffic-eng fast-reroute database (see Example 7-17) and show mpls traffic-eng fast-reroute log reroutes (see Example 7-20).
Figure 7-11 depicts the flow of traffic over the link protection tunnel. It is important to note that for local protection mechanisms, while the protection is active and the backup tunnel is forwarding traffic, the primary LSP continues to stay up. This is different from the end-to-end path protection scheme, in which the primary LSP goes down after a failure, gracefully switching over to a standby LSP to minimize packet loss. This results in the tunnel's staying up.
When local protection is active, if you just examine the state of the primary tunnel on 7200a using the show mpls traffic-eng tunnels tunnel1 command, you might not notice that protection has kicked in because the primary LSP is still up and you see only the original ERO, which does not include the backup path. As you can see from Figure 7-11, the primary tunnel traffic now goes over the backup path. Doing a traceroute to the tunnel destination when FRR is active shows you that the backup path is actually taken. Example 7-16 shows the output of the traceroute from 7200a to 7200c. 10.0.3.5 is the first hop to 12008a, 10.0.11.10 is the second hop to 12008b, 10.0.9.16 is the third hop to 12008d, 10.0.7.11 is the fourth hop to 12008c, and 10.0.17.12 is the final hop to 7200c. You can also observe the label stack in this output.
7200a#traceroute 12.12.12.12
Type escape sequence to abort.
Tracing the route to 12.12.12.12
  1 10.0.3.5 [MPLS: Label 16 Exp 0] 0 msec 0 msec 0 msec
  2 10.0.11.10 [MPLS: Labels 38/33 Exp 0] 0 msec 0 msec 4 msec
  3 10.0.9.16 [MPLS: Labels 35/33 Exp 0] 0 msec 0 msec 0 msec
  4 10.0.7.11 [MPLS: Label 33 Exp 0] 0 msec 0 msec 0 msec
  5 10.0.17.12 4 msec * 0 msec
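The label stacks in the traceroute can be reproduced with a small model of the label operations each router performs while FRR is active. The following Python is only a sketch: the label values come from the traceroute and from the backup tunnel's outgoing label (38), but the per-router operations, including the assumption that 12008d and 12008c pop the top label as penultimate hops, are inferred from that output rather than taken from any router configuration. The names `apply_op`, `walk`, and `PATH_OPS` are hypothetical.

```python
def apply_op(stack, op):
    """Apply one label operation; the top of the stack is stack[0]."""
    kind, *args = op
    if kind == "push":
        return [args[0]] + stack
    if kind == "swap":
        return [args[0]] + stack[1:]
    if kind == "pop":
        return stack[1:]
    raise ValueError(f"unknown operation: {kind}")


# Assumed operations per router while the protected link is down.
PATH_OPS = [
    ("7200a", [("push", 16)]),                 # headend imposes the primary label
    ("12008a", [("swap", 33), ("push", 38)]),  # PLR: usual swap, then push backup label
    ("12008b", [("swap", 35)]),                # backup tunnel midpoint
    ("12008d", [("pop",)]),                    # assumed penultimate hop of the backup tunnel
    ("12008c", [("pop",)]),                    # MP, assumed penultimate hop of the primary LSP
]


def walk(path_ops):
    """Return the label stack each router transmits, in path order."""
    stack, sent = [], []
    for router, ops in path_ops:
        for op in ops:
            stack = apply_op(stack, op)
        sent.append((router, list(stack)))
    return sent
```

Under these assumptions, `walk(PATH_OPS)` yields the stacks [16], [38, 33], [35, 33], [33], and [] in order, matching what each successive hop reports receiving in Example 7-16. The key point is that the MP receives label 33, the same label it would have received over the protected link.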
After failure, you can check whether FRR is active using the show mpls traffic-eng fast-reroute database command at the PLR (12008a), as demonstrated in Examples 7-17 and 7-18.
12008a#show mpls traffic-eng fast-reroute database
Tunnel head fast reroute information:
Prefix       Tunnel       In-label  Out intf/label  FRR intf/label  Status
LSP midpoint frr information:
LSP identifier      In-label  Out intf/label  FRR intf/label  Status
4.4.4.4 1 [1520]    16        PO1/0:33        Tu1:33          active
12008a#show mpls traffic-eng fast-reroute database detail
LFIB FRR Database Summary:
  Total Clusters: 1
  Total Groups: 1
  Total Items: 1
Link 4: PO1/0 (Down, 1 group)
  Group 28: PO1/0->Tu1 (Up, 1 member)
    LSP identifier 4.4.4.4 1 [1520], active
      Input label 16, Output label PO1/0:33, FRR label Tu1:33
Another useful command for checking FRR information is show ip rsvp sender detail. It shows what the RSVP neighbors have sent. Example 7-19 shows the output of this command executed on 12008a.
12008a#show ip rsvp sender detail
PATH:
  Tun Dest: 12.12.12.12 Tun ID: 1 Ext Tun ID: 4.4.4.4
  Tun Sender: 4.4.4.4 LSP ID: 1520
  Path refreshes arriving on POS2/1 from PHOP 10.0.3.4
  Path refreshes being sent to NHOP 11.11.11.11 on Tunnel1
  Session Attr::
    Setup Prio: 7, Holding Prio: 7
    Flags: Local Prot desired, Label Recording, SE Style
    Session Name:Primary tunnel 7200a->12008a->12008c->7200c
  ERO:
    11.11.11.11 (Strict IPv4 Prefix, 8 bytes, /32)
    10.0.17.12 (Strict IPv4 Prefix, 8 bytes, /32)
    12.12.12.12 (Strict IPv4 Prefix, 8 bytes, /32)
  RRO:
    10.0.3.4/32, Flags:0x0 (No Local Protection)
  Traffic params - Rate: 0G bits/sec, Max. burst: 1K bytes
  Fast-Reroute Backup info:
    Inbound FRR: Not active
    Outbound FRR: Active -- using backup tunnel
      Backup Tunnel: Tu1 (label 33)
      Bkup Sender Template:
        Tun Sender: 10.0.11.5 LSP ID: 1632
      Bkup FilerSpec:
        Tun Sender: 10.0.11.5, LSP ID 1632
      Orig Output I/F: PO1/0
      Orig Output ERO:
        10.0.5.11 (Strict IPv4 Prefix, 8 bytes, /32)
        10.0.17.12 (Strict IPv4 Prefix, 8 bytes, /32)
        12.12.12.12 (Strict IPv4 Prefix, 8 bytes, /32)
The show mpls traffic-eng fast-reroute database and show ip rsvp sender detail commands show only the current state, so they are most useful while FRR is active. If you want historical information about when FRR was active and for how long, you can use the show mpls traffic-eng fast-reroute log reroutes command, shown in Example 7-20.
12008a#show mpls traffic-eng fast-reroute log reroutes
When Interface Event Rewrites Duration CPU msecs Suspends Errors
23:58:20 Tu1 Down 0 0 msecs 0 0 0
23:58:04 PO1/0 Down 1 0 msecs 0 0 0
23:57:54 PO1/0 Up 0 0 msecs 0 0 0
01:05:47 PO1/0 Down 1 0 msecs 0 0 0
01:01:12 PO1/0 Down 1 0 msecs 0 0 0
00:34:39 PO1/0 Down 1 0 msecs 0 0 0
00:00:41 PO1/0 Down 1 0 msecs 0 0 0
Some useful commands for debugging FRR are as follows:
debug ip rsvp fast-reroute
debug mpls lfib fast-reroute database
debug mpls lfib fast-reroute events
debug mpls lfib fast-reroute reroutes
As you understand by now, much of RSVP-based MPLS TE revolves around RSVP signalling. Fast restoration is not an exception. FRR concepts and implementation depend heavily on making further extensions to what is already defined in RFC 3209. These extensions are specified in draft-ietf-mpls-rsvp-lsp-fastreroute. This section talks about RSVP signalling that happens after FRR protection has kicked in. This can be broken into the following:
Upstream signalling
IGP notification
Downstream signalling
Recall from Chapter 4, “Path Calculation and Setup,” that when a link goes down along an LSP, the node that is upstream of the failed link signals a path error to the headends of the LSPs traversing the failed link. In Figure 7-12, after the link between 12008a and 12008c fails, it is the responsibility of 12008a to send a PathErr message to 7200a, which is the headend of the primary tunnel.
According to RFC 3209, “RSVP-TE: Extensions to RSVP for LSP Tunnels,” with no local protection configured, a node that is upstream of a failed link needs to send a PathErr for each LSP that crossed this link. This PathErr contains an error code of 24 (meaning “Routing Problem”) with a value of 5 (indicating “No route available toward destination”), as shown in Figure 7-12. You can see this in the output of debug ip rsvp path detail on 7200a after shutting down the link between 12008a and 12008c, as demonstrated in Example 7-21.
7200a#debug ip rsvp path detail
*Oct 16 12:54:09.469: RSVP: version:1 flags:0000 type:PERR cksum:0000 ttl:25
*Oct 16 12:54:09.469: SESSION type 7 length 16:
*Oct 16 12:54:09.469: Tun Dest 12.12.12.12 Tun ID 1 Ext Tun ID 4.4.4.4
*Oct 16 12:54:09.469: ERROR_SPEC type 1 length 12:
*Oct 16 12:54:09.469: Error Node: 10.0.3.5
*Oct 16 12:54:09.469: Error Code: 24 (Routing Problem)
*Oct 16 12:54:09.469: Error Value: 0x5 (No route available toward destination)
When an LSP's headend gets the error shown in Example 7-21, it brings the tunnel interface down and then tries to find a new path for the LSP. The headend ignores the fact that local protection might be available around the broken link.
As a result, traffic along that LSP is blackholed until the LSP can be rerouted. This makes the backup LSP completely useless. Hence, you need a mechanism for 12008a to tell 7200a something like this: “My downstream link along the LSP is broken. I am temporarily rerouting the traffic. This path might no longer be the optimal path to the destination. Please compute an alternative path if one is available.” This is also referred to as the LSR 12008a triggering reoptimization.
To signal this nondestructive information, RFC 3209 specifies using a PathErr with an ERROR_SPEC containing error code 25, “Notify,” and an error value of 3, “Tunnel locally repaired.”
When an LSP headend receives such a message, it knows that it does not need to immediately stop using its primary LSP, just that this LSP might be following a suboptimal path until it can be rerouted. The headend is free to reroute the LSP when it gets a chance to do so. What protection buys you here is that during the time before the headend can find a suitable alternative end-to-end path, traffic is still being delivered down the backup tunnel.
A headend that receives a notification of 25/3 attempts to calculate and signal a new path for that tunnel. After receiving the reservation (RESV) message for this new path, the label for the old path is replaced with the new label. Only then is the old LSP torn down. This is called make-before-break, and it helps minimize packet loss.
If the headend cannot find a new path for an LSP that is currently being protected, the headend remains on the protected path. You might have an LSP that has an explicit path that doesn't allow it to get away from a failed link, or perhaps the necessary bandwidth for an LSP is available only along the link that has failed. As long as the protection tunnel is in place, the protected LSP remains up and passes traffic, and the headend periodically tries to find a new path for that tunnel to take.
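The difference between the two PathErr cases can be summarized as a small decision function. The following Python is a simplified sketch of the headend behavior described in this section, not the IOS implementation: the error codes and values are the ones quoted from RFC 3209 above, while the function name `headend_action` and the returned strings are made up for illustration.

```python
# Error code/value pairs quoted in the text (RFC 3209).
ROUTING_PROBLEM = 24
NO_ROUTE_TOWARD_DESTINATION = 5
NOTIFY = 25
TUNNEL_LOCALLY_REPAIRED = 3


def headend_action(error_code: int, error_value: int) -> str:
    """Describe how a headend reacts to a PathErr, per the text above."""
    if (error_code, error_value) == (NOTIFY, TUNNEL_LOCALLY_REPAIRED):
        # Nondestructive: keep the LSP up on the backup tunnel and try
        # to signal a better path make-before-break.
        return "keep-lsp-up-and-reoptimize"
    if (error_code, error_value) == (ROUTING_PROBLEM, NO_ROUTE_TOWARD_DESTINATION):
        # Destructive: the tunnel goes down until a new path is found.
        return "tear-down-and-recompute"
    return "not-covered-by-this-sketch"
```

For example, `headend_action(25, 3)` returns the keep-up case, and `headend_action(24, 5)` returns the tear-down case.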
Figure 7-13 shows the PathErr message when local protection has been enabled.
Figure 7-13 shows that 12008a (PLR) sends a PathErr to the primary tunnel headend 7200a. This is also captured in the debug output on 7200a using the debug ip rsvp path detail command, as shown in Example 7-22.
7200a#debug ip rsvp path detail
*Oct 17 08:20:45.420: RSVP: version:1 flags:0000 type:PERR cksum:0000 ttl:255 reserved:0 length:132
*Oct 17 08:20:45.420: SESSION type 7 length 16:
*Oct 17 08:20:45.420: Tun Dest 12.12.12.12 Tun ID 1 Ext Tun ID 4.4.4.4
*Oct 17 08:20:45.420: ERROR_SPEC type 1 length 12:
*Oct 17 08:20:45.420: Error Node: 10.0.3.5
*Oct 17 08:20:45.420: Error Code: 25 (Notify)
*Oct 17 08:20:45.420: Error Value: 0x3 (Tunnel locally repaired)
*Oct 17 08:20:45.420: SENDER_TEMPLATE type 7 length 12:
*Oct 17 08:20:45.420: Tun Sender: 4.4.4.4, LSP ID: 1520
*Oct 17 08:20:45.420: SENDER_TSPEC type 2 length 36:
*Oct 17 08:20:45.420: version=0, length in words=7
*Oct 17 08:20:45.420: service id=1, service length=6
*Oct 17 08:20:45.420: parameter id=127, flags=0, parameter length=5
*Oct 17 08:20:45.420: average rate=0 bytes/sec, burst depth=1000 bytes
*Oct 17 08:20:45.420: peak rate =0 bytes/sec
*Oct 17 08:20:45.420: min unit=0 bytes, max unit=0 bytes
*Oct 17 08:20:45.420: ADSPEC type 2 length 48:
*Oct 17 08:20:45.420: version=0 length in words=10
*Oct 17 08:20:45.420: General Parameters break bit=0 service length=8
*Oct 17 08:20:45.420: IS Hops:1
*Oct 17 08:20:45.420: Minimum Path Bandwidth (bytes/sec):19375000
*Oct 17 08:20:45.420: Path Latency (microseconds):0
*Oct 17 08:20:45.420: Path MTU:4470
*Oct 17 08:20:45.420: Controlled Load Service break bit=0 service length=0
*Oct 17 08:20:45.420:
*Oct 17 08:20:45.420: RSVP 4.4.4.4_1520-12.12.12.12_1: PATH ERROR message for 12.12.12.12 (POS3/0) from 10.0.3.5
As shown in the highlighted output in Example 7-22, 10.0.3.5 happens to be the interface address of 12008a facing 7200a. This is highlighted so that you know where the RSVP messages are coming from.
When a protected link fails and its LSPs are switched onto the backup tunnel, the PLR also sends Path messages for the protected LSPs down the backup tunnel. This is so that the MP doesn't time out the protected tunnel, in the unlikely event that the protected tunnel headend can't reroute the LSP. See the later section “Downstream Signalling” for more details on why this is necessary.
Additionally, some changes are made to the body of the Path message itself. As you know, RSVP is a soft-state protocol: to keep sessions alive, RSVP neighbors periodically refresh each other's state with Path and Resv messages. LSP tunnels are identified by a combination of the SESSION and SENDER_TEMPLATE objects in these Path and Resv messages (see Chapter 4). The PLR modifies the SENDER_TEMPLATE object so that the sender IPv4 address contains the PLR's IP address rather than that of the headend. Doing so allows the tail to see this Path message as coming from a new sender but still belonging to the same session.
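The effect of the sender rewrite can be pictured with a small Python sketch. This is an illustration only, not router code: the `PathMessage` type, its field names, and the `plr_rewrite` helper are hypothetical, and the PLR address 5.5.5.5 is an assumed value chosen for the example. The SESSION and SENDER_TEMPLATE values are the ones shown in Example 7-22.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PathMessage:
    """Hypothetical, heavily simplified view of an RSVP Path message."""
    # SESSION object fields
    tun_dest: str
    tun_id: int
    ext_tun_id: str
    # SENDER_TEMPLATE object fields
    sender: str
    lsp_id: int

    def session_key(self):
        # The session is identified by the SESSION object alone.
        return (self.tun_dest, self.tun_id, self.ext_tun_id)

    def lsp_key(self):
        # An LSP is identified by SESSION plus SENDER_TEMPLATE.
        return self.session_key() + (self.sender, self.lsp_id)


def plr_rewrite(msg, plr_address):
    """Return a copy whose SENDER_TEMPLATE sender is the PLR (assumed model)."""
    return replace(msg, sender=plr_address)


# Values taken from Example 7-22; 5.5.5.5 is an assumed PLR address.
original = PathMessage("12.12.12.12", 1, "4.4.4.4", "4.4.4.4", 1520)
rerouted = plr_rewrite(original, "5.5.5.5")
```

In this model `rerouted.session_key()` equals `original.session_key()`, while `rerouted.lsp_key()` differs, which mirrors the "new sender, same session" behavior described above.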
From this point on, the refresh messages can flow over the backup tunnel. The original state for this session eventually times out and is torn down on every LSR downstream of the failed link, including the tail, but the altered Path message from the PLR is enough to maintain the bandwidth reservation for as long as necessary. Example 7-23 shows the original ERO on 7200a before the protected link between 12008a and 12008c went down. Example 7-24 shows the change in ERO in the RSVP refresh message over the backup tunnel.
7200a#show mpls traffic-eng tunnel tunnel1
Name: Primary tunnel 7200a->12008a->12... (Tunnel1) Destination: 12.12.12.12
  Status:
    Admin: up Oper: up Path: valid Signalling: connected
    path option 5, type explicit primary (Basis for Setup, path weight 3)
  Config Parameters:
    Bandwidth: 100 kbps (Global) Priority: 7 7 Affinity: 0x0/0xFFFF
    Metric Type: TE (default)
    AutoRoute: enabled LockDown: disabled Loadshare: 100 bw-based
    auto-bw: disabled(0/258) 0 Bandwidth Requested: 100
  InLabel : -
  OutLabel : POS3/0, 16
  RSVP Signalling Info:
    Src 4.4.4.4, Dst 12.12.12.12, Tun_Id 1, Tun_Instance 1520
    RSVP Path Info:
      My Address: 4.4.4.4
      Explicit Route: 10.0.3.5 10.0.5.11 10.0.17.12 12.12.12.12
12008a#debug ip rsvp path detail
RSVP 4.4.4.4_1520-12.12.12.12_1: PATH message for 12.12.12.12(POS2/1) from 10.0.3.4
RSVP 4.4.4.4_1520-12.12.12.12_1: PATH: fastreroute in progress
RSVP: version:1 flags:0000 type:PATH cksum:0000 ttl:254 reserved:0 length:216
SESSION type 7 length 16:
Destination 12.12.12.12, TunnelId 3, Source 3.3.3.3
HOP type 1 length 12: 0A000203
: 00000000
TIME_VALUES type 1 length 8 : 00007530
EXPLICIT_ROUTE type 1 length 52:
(#1) Strict IPv4 Prefix, 8 bytes, 10.0.2.5/32
(#2) Strict IPv4 Prefix, 8 bytes, 10.0.11.10/32
(#4) Strict IPv4 Prefix, 8 bytes, 10.0.7.11/32
(#5) Strict IPv4 Prefix, 8 bytes, 10.0.17.12/32
(#6) Strict IPv4 Prefix, 8 bytes, 12.12.12.12/32
The refresh Path messages for the protected LSPs must themselves be forwarded down the backup tunnel by the PLR. If the PLR forwarded them using the contents of the ERO, as it normally would, forwarding would fail because the next IP address in the ERO points to the failed link. This behavior has to change to make FRR work. In addition to the ERO, the RRO and phop objects are modified for refresh messages flowing over the backup tunnel, according to the IETF draft.
Although in many cases RSVP messages reach the headend or tailend ahead of any IGP notification, this is not guaranteed. When IGP information (such as an OSPF LSA or IS-IS LSP declaring a link down) arrives before the RSVP message, what consequences does this have on TE signalling?
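The forwarding change can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the actual IOS logic: `choose_path_egress` and its lookup tables are hypothetical, and the interface, address, and tunnel names are taken from the PLR configuration shown later in this chapter (POS1/0, 10.0.5.11, Tunnel1).

```python
def choose_path_egress(next_ero_hop, hop_to_interface, interface_up, backup_tunnel):
    """Pick the interface a refresh Path message leaves on (hypothetical model).

    next_ero_hop     -- next IP address listed in the ERO
    hop_to_interface -- maps an ERO hop address to the local outgoing interface
    interface_up     -- maps an interface name to True (up) or False (down)
    backup_tunnel    -- maps a protected interface to its backup tunnel
    """
    egress = hop_to_interface[next_ero_hop]
    if interface_up.get(egress, False):
        # Normal case: follow the ERO out the usual interface.
        return egress
    # Protected link is down: send the Path message into the backup tunnel
    # configured on that interface, or give up (None) if there is none.
    return backup_tunnel.get(egress)


hop_to_interface = {"10.0.5.11": "POS1/0"}
backup_tunnel = {"POS1/0": "Tunnel1"}

before_failure = choose_path_egress(
    "10.0.5.11", hop_to_interface, {"POS1/0": True}, backup_tunnel)
after_failure = choose_path_egress(
    "10.0.5.11", hop_to_interface, {"POS1/0": False}, backup_tunnel)
```

Here `before_failure` is the protected interface itself and `after_failure` is the backup tunnel; with no backup tunnel configured the function returns `None`, which corresponds to the unprotected case.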
In the absence of FRR, if the primary tunnel headend receives a link-down LSA for a link that was part of the primary LSP, the headend tears down the primary tunnel. After that, the headend can, if configured correctly, attempt to reroute the LSP.
If the primary tunnel is configured for FRR, the link-down LSA has no effect. The headend tears down a protected LSP based only on RSVP error messages and ignores the IGP's report of a link down along the LSP. A downed link doesn't necessarily mean a failed LSP, because the LSP could be protected.
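The two cases can be summarized as a small decision function. This Python sketch is only a restatement of the rules above: the function name, the link representation, and the returned strings are hypothetical, and the node names come from the primary tunnel description 7200a->12008a->12008c->7200c.

```python
def headend_on_igp_link_down(lsp_links, failed_link, frr_requested):
    """Model the headend's reaction to an IGP link-down report (hypothetical)."""
    if failed_link not in lsp_links:
        return "ignore: link is not on this LSP"
    if frr_requested:
        # A downed link does not imply a failed LSP; wait for RSVP signalling.
        return "ignore: LSP may be locally protected"
    return "tear down LSP"


primary = [("7200a", "12008a"), ("12008a", "12008c"), ("12008c", "7200c")]
failed = ("12008a", "12008c")

without_frr = headend_on_igp_link_down(primary, failed, frr_requested=False)
with_frr = headend_on_igp_link_down(primary, failed, frr_requested=True)
```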
You just saw the repercussions of link failure upstream of the failed link, both with and without local protection. This section examines what goes on downstream of the failed link.
Consider the topology shown in Figure 7-14.
When the link between 12008a and 12008c goes down (when no local protection is in place), 12008c sends a PathTear message to 7200c, as demonstrated in Example 7-25.
12008c#debug ip rsvp path detail
*Oct 22 12:59:41.185: RSVP: version:1 flags:0000 type:PTEAR cksum:6CF7 ttl:252 reserved:0 length:132
*Oct 22 12:59:41.185: SESSION type 7 length 16:
*Oct 22 12:59:41.189: Tun Dest 12.12.12.12 Tun ID 1 Ext Tun ID 4.4.4.4
*Oct 22 12:59:41.189: HOP type 1 length 12:
*Oct 22 12:59:41.189: Hop Addr: 10.0.17.11 LIH: 0x0
*Oct 22 12:59:41.189: SENDER_TEMPLATE type 7 length 12:
*Oct 22 12:59:41.189: Tun Sender: 4.4.4.4, LSP ID: 1520
*Oct 22 12:59:41.189: SENDER_TSPEC type 2 length 36:
*Oct 22 12:59:41.189: version=0, length in words=7
*Oct 22 12:59:41.189: service id=1, service length=6
*Oct 22 12:59:41.189: parameter id=127, flags=0, parameter length=5
*Oct 22 12:59:41.189: average rate=0 bytes/sec, burst depth=1000 bytes
*Oct 22 12:59:41.189: peak rate =0 bytes/sec
*Oct 22 12:59:41.189: min unit=0 bytes, max unit=0 bytes
*Oct 22 12:59:41.189: ADSPEC type 2 length 48:
*Oct 22 12:59:41.189: version=0 length in words=10
*Oct 22 12:59:41.189: General Parameters break bit=0 service length=8
*Oct 22 12:59:41.189: IS Hops:2
*Oct 22 12:59:41.189: Minimum Path Bandwidth (bytes/sec):19375000
*Oct 22 12:59:41.189: Path Latency (microseconds):0
*Oct 22 12:59:41.189: Path MTU:4470
*Oct 22 12:59:41.189: Controlled Load Service break bit=0 service length=0
*Oct 22 12:59:41.189:
But if the LSP were protected by FRR, this kind of message would have an adverse effect: the LSP would be torn down even though it was being locally protected by the PLR. To prevent this, the PathTear message needs to be suppressed for primary LSPs that have the “Local protection desired” flag set, even though Path messages for the protected primary tunnel no longer arrive on the original incoming interface. As long as 12008c receives Path messages belonging to the original RSVP session on any interface, it neither times out the session nor sends a PathTear on its downstream interface. Recall from the “Upstream Signalling” section that the PLR sends Path messages for all protected tunnels down the backup tunnel.
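The MP behavior amounts to soft state that is keyed on the session and refreshed from any interface. The following Python sketch models only that idea: the `MergePointState` class, the interface labels, and the 120-second lifetime are all assumptions made for illustration, not values or structures taken from IOS.

```python
class MergePointState:
    """Soft state for one RSVP session at the MP (hypothetical model)."""

    def __init__(self, session, lifetime_s):
        self.session = session          # (tunnel dest, tunnel ID, ext tunnel ID)
        self.lifetime_s = lifetime_s    # assumed state lifetime in seconds
        self.last_refresh = 0.0
        self.last_interface = None

    def on_path(self, session, interface, now):
        # Match on the session only: the incoming interface is not part of
        # the key, so a Path arriving over the backup tunnel still refreshes.
        if session == self.session:
            self.last_refresh = now
            self.last_interface = interface

    def should_send_path_tear(self, now):
        # A PathTear goes downstream only when the state has expired.
        return now - self.last_refresh > self.lifetime_s


session = ("12.12.12.12", 1, "4.4.4.4")
mp = MergePointState(session, lifetime_s=120)
mp.on_path(session, "primary-in", now=0)     # before the link failure
mp.on_path(session, "backup-in", now=100)    # refresh via the backup tunnel
```

With the refresh at 100 seconds arriving on a different interface, `mp.should_send_path_tear(150)` is false; had that refresh never arrived, the state created at time 0 would have expired after 120 seconds.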
As the tail of the primary tunnel, 7200c does not know that the protected tunnel has failed unless one of the following things happens:
It receives an IGP update about the link failure.
It receives a PathTear from MP 12008c.
It does not receive an RSVP refresh message (Path) that keeps the session alive within a certain period of time.
This wait on Cisco routers by default is four keepalive periods. (See Chapter 4 for more information on RSVP timers.)
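As a rough worked example, the wait can be computed from the TIME_VALUES object, which carries the refresh period in milliseconds. The Python sketch below takes the four-period figure stated above at face value and uses the value 00007530 from the Path message in Example 7-24. The helper name is hypothetical, and real implementations may compute the state lifetime differently.

```python
def rsvp_state_timeout_s(time_values_hex, missed_periods=4):
    """Seconds without a refresh before state times out (simplified model)."""
    refresh_ms = int(time_values_hex, 16)   # 0x7530 is 30000 ms
    return missed_periods * refresh_ms / 1000.0


timeout = rsvp_state_timeout_s("00007530")   # 4 x 30 s = 120.0 seconds
```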
If the tail receives an IGP update about the link failure, it does not take any action from an MPLS TE perspective.
If the RSVP signalling state times out, the LSP is declared dead, and a ResvTear message is sent to the headend. This means that, apart from preventing the MP (12008c) from sending a PathTear, you need to make sure that the tail continues to receive RSVP refresh messages even though one of the links that made up the primary LSP is now down. This is achieved by making sure that the MP continues to receive Path messages for the primary LSP over the backup tunnel.
The earlier sections delved into the pieces that make up local protection. The configuration and show commands used for each piece were provided and the concepts explained. This section summarizes all the configuration pieces needed to configure link protection.
The sample topology shown in Figure 7-15 explains the link protection configuration. In Figure 7-15, 7200a is the head of the primary tunnel that needs to be protected against failure of the protected link 12008a→12008c. 12008a is the PLR, and 12008c is the MP.
The configuration task for link protection can be broken into the following:
Headend configuration
PLR configuration
Example 7-26 shows the primary tunnel configuration for 7200a for enabling FRR.
7200a#show running-config interface tunnel1
interface Tunnel1
 description Primary tunnel 7200a->12008a->12008c->7200c
 ip unnumbered Loopback0
 tunnel destination 12.12.12.12
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng autoroute announce
 tunnel mpls traffic-eng path-option 5 explicit name primary
 tunnel mpls traffic-eng path-option 6 dynamic
 tunnel mpls traffic-eng fast-reroute
7200a#show ip explicit-paths name primary
PATH primary (strict source route, path complete, generation 6)
 1: next-address 10.0.3.5
 2: next-address 10.0.5.11
 3: next-address 10.0.17.12
 4: next-address 12.12.12.12
The PLR configuration consists of two pieces:
Building a backup tunnel to NHop
Configuring the protected interface to use the backup tunnel
Example 7-27 shows what the configuration of the backup tunnel on 12008a looks like.
12008a#show running-config interface tunnel1
interface Tunnel1
 description Link Protection Tunnel (Backup)
 ip unnumbered Loopback0
 no ip directed-broadcast
 tunnel destination 11.11.11.11
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng path-option 5 explicit name nhop
12008a#show ip explicit-paths name nhop
PATH nhop (strict source route, path complete, generation 24)
 1: next-address 10.0.11.10
 2: next-address 10.0.9.16
 3: next-address 10.0.7.11
Example 7-28 shows the configuration of protected interface POS 1/0, required for FRR.
12008a#show running-config interface pos 1/0
Building configuration...
Current configuration : 365 bytes
!
interface POS1/0
 ip address 10.0.5.5 255.255.255.0
 no ip directed-broadcast
 mpls label protocol ldp
 mpls traffic-eng tunnels
 mpls traffic-eng backup-path Tunnel1
 tag-switching ip
 crc 32
 clock source internal
 ip rsvp bandwidth 155000 155000
end
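The two PLR pieces depend on each other: the backup-path command on the protected interface must name a tunnel that actually exists as an MPLS TE tunnel. A quick offline sanity check of a saved configuration might look like the following Python sketch. The parser and the helper names are hypothetical, the check is deliberately naive (it looks only at interface blocks), and the sample text is abridged from Examples 7-27 and 7-28.

```python
def parse_interfaces(config_text):
    """Map each interface name to its list of commands (naive parser)."""
    blocks = {}
    current = None
    for line in config_text.splitlines():
        if line.startswith("interface "):
            current = line.split(None, 1)[1].strip()
            blocks[current] = []
        elif line.startswith(" ") and current is not None:
            blocks[current].append(line.strip())
        else:
            # Any non-indented line (such as "!") ends the current block.
            current = None
    return blocks


def plr_link_protection_ok(config_text, protected_interface):
    """True if the interface names a backup tunnel that is an MPLS TE tunnel."""
    blocks = parse_interfaces(config_text)
    prefix = "mpls traffic-eng backup-path "
    for command in blocks.get(protected_interface, []):
        if command.startswith(prefix):
            tunnel = command[len(prefix):]
            return "tunnel mode mpls traffic-eng" in blocks.get(tunnel, [])
    return False


SAMPLE = """\
interface Tunnel1
 tunnel destination 11.11.11.11
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng path-option 5 explicit name nhop
!
interface POS1/0
 ip address 10.0.5.5 255.255.255.0
 mpls traffic-eng tunnels
 mpls traffic-eng backup-path Tunnel1
"""

result = plr_link_protection_ok(SAMPLE, "POS1/0")
```

The check passes for the sample and fails if the backup-path command points at a tunnel that is not defined, which is the kind of mismatch that is easy to miss when the two pieces are configured separately.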
As you can see, enabling link protection is simple. It turns out that enabling node protection is just as easy, as you will see in the next section.