Understanding Convergence

Recall that convergence involves four primary activities: update, invalid, holddown, and flush. These activities come into play any time a router experiences a change in its routing table, including when the router is powered on. This section includes a scenario that shows the process of convergence by causing RouterC to cease sending routing updates to RouterA. This section also introduces the concept of parallel paths in a routing table and discusses some of the issues they can create.

Parallel Paths

When a router has two or more routes to the same network with the same metric, these routes can be thought of as having an equal cost. The term parallel paths is just a common way to refer to occurrences of equal-cost routes in a routing table.

The Effect of Parallel Paths on Convergence

If a router has two or more equal-cost paths (routes) to a network, it may use them concurrently. If a router loses one or more of the parallel paths, it will continue to use the paths that are still available. RouterA in Figure 2-3 has two equal-cost paths to 168.71.7.0. If it loses the route via serial 0, it can continue to use the route via serial 1. Convergence in this situation is simply a matter of removing any references in the routing table to the route that has ceased to exist.

The arrows in Figure 2-3 show that RouterA has two routes (parallel paths) to subnet 168.71.7.0.

Figure 2-3. Parallel paths enable a router to continue to use whatever paths are available when some paths are down.
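The idea that losing one parallel path only means deleting its entry can be sketched in a few lines of Python. This is an illustration only, not a description of how the router stores routes internally: the dictionary layout and the function name remove_next_hop are hypothetical, and only the addresses and interface names come from RouterA's routing table.

```python
# Hypothetical model of a routing table: each destination network maps to a
# list of equal-cost (next hop, interface) pairs.
routes = {
    "168.71.7.0": [("168.71.6.2", "Serial0"), ("168.71.9.2", "Serial1")],
}

def remove_next_hop(table, network, next_hop):
    """Drop one parallel path; return True if the network is still reachable."""
    remaining = [path for path in table.get(network, []) if path[0] != next_hop]
    if remaining:
        table[network] = remaining
        return True
    table.pop(network, None)   # last path gone: the network is unreachable
    return False

# RouterA loses the path via Serial0 but keeps forwarding via Serial1.
still_reachable = remove_next_hop(routes, "168.71.7.0", "168.71.6.2")
print(still_reachable, routes["168.71.7.0"])
```

In this sketch, no replacement route has to be learned as long as at least one parallel path survives.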


Looking at Parallel Paths in a Routing Table

The following routing table from RouterA has the two parallel paths, which are shown here in bold:

RouterA#show ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
   D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
   E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
   i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, * - candidate default
Gateway of last resort is 0.0.0.0 to network 0.0.0.0
168.71.0.0 255.255.255.0 is subnetted, 5 subnets
C    168.71.9.0 is directly connected, Serial1
R    168.71.8.0 [120/1] via 168.71.9.2, 00:00:15, Serial1
R    168.71.7.0 [120/1] via 168.71.6.2, 00:00:00, Serial0
          [120/1] via 168.71.9.2, 00:00:15, Serial1
C    168.71.6.0 is directly connected, Serial0
C    168.71.5.0 is directly connected, Ethernet0
   171.68.0.0 is variably subnetted, 2 subnets, 2 masks
C    171.68.207.128 255.255.255.128 is directly connected, Ethernet0
S    171.68.0.0 255.255.0.0 [1/0] via 171.68.207.129
S* 0.0.0.0/0 is directly connected, Ethernet0
RouterA#

When parallel (equal-cost) paths are available to a network, the routing table displays the first entry with a character prefix indicating how it knows about the route and omits this prefix for all other paths to the same network. In the previous example, RouterA's routing table indicates that it has two paths to 168.71.7.0. Both are learned via RIP; however, only one of them has the R tag that indicates it is a RIP-derived route. The R is assumed for the other route. The first route has a next hop of 168.71.6.2. The second route has a next hop of 168.71.9.2. Refer to Figure 2-3 to compare the network diagram with the previous routing table.

The Codes section at the top of the routing table explains which route source each character prefix represents.
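Anyone who processes this output with a script has to account for the omitted prefix. The following Python sketch shows one way to do it, carrying the code and network forward onto continuation lines. The regular expressions and the function name parse_paths are assumptions made for this example, and they handle only lines shaped like the RIP entries above.

```python
import re

# Sample lines copied from RouterA's routing table.
SAMPLE = """\
R    168.71.8.0 [120/1] via 168.71.9.2, 00:00:15, Serial1
R    168.71.7.0 [120/1] via 168.71.6.2, 00:00:00, Serial0
          [120/1] via 168.71.9.2, 00:00:15, Serial1
"""

# First path for a network: code, network, then the path details.
FIRST = re.compile(r"^(\S+)\s+(\d+\.\d+\.\d+\.\d+)\s+\[(\d+)/(\d+)\] via (\S+),")
# Additional parallel path: no code and no network, only the path details.
EXTRA = re.compile(r"^\s+\[(\d+)/(\d+)\] via (\S+),")

def parse_paths(text):
    """Return (code, network, metric, next_hop) tuples, filling in the code
    and network that the table omits on continuation lines."""
    paths = []
    code = network = None
    for line in text.splitlines():
        match = FIRST.match(line)
        if match:
            code, network = match.group(1), match.group(2)
            paths.append((code, network, int(match.group(4)), match.group(5)))
            continue
        match = EXTRA.match(line)
        if match and network is not None:
            paths.append((code, network, int(match.group(2)), match.group(3)))
    return paths

for path in parse_paths(SAMPLE):
    print(path)
```

Both paths to 168.71.7.0 come back tagged with R, which is the assumption the routing table leaves the reader to make.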

Convergence in Action

This section presents a scenario that explains what happens when a router running RIP has to converge on a new route because an existing route has ceased to be advertised by the router it was originally learned from.

Here is a review of the default timers RIP uses. These timers are reflected in the behavior of RIP in this scenario.

RouterA#show ip protocol
Routing Protocol is "rip"
  Sending updates every 30 seconds, next due in 27 seconds
  Invalid after 180 seconds, hold down 180, flushed after 240
.
.
Output deleted
.
RouterA#
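If you collect this output from several routers, the timer values can be pulled out with a short script. The following Python sketch assumes the two timer lines are worded exactly as shown above. The function name parse_rip_timers is hypothetical.

```python
import re

# Timer lines copied from the show ip protocol output.
OUTPUT = """\
  Sending updates every 30 seconds, next due in 27 seconds
  Invalid after 180 seconds, hold down 180, flushed after 240
"""

def parse_rip_timers(text):
    """Extract the four RIP timers (in seconds) from show ip protocol text."""
    update = re.search(r"Sending updates every (\d+) seconds", text)
    others = re.search(
        r"Invalid after (\d+) seconds, hold down (\d+), flushed after (\d+)", text
    )
    return {
        "update": int(update.group(1)),
        "invalid": int(others.group(1)),
        "holddown": int(others.group(2)),
        "flush": int(others.group(3)),
    }

timers = parse_rip_timers(OUTPUT)
print(timers)
```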

The access list that was applied in the previous section to stop RouterC from advertising 168.71.8.0 has been modified to also block subnet 168.71.7.0. It has been applied to RouterC again for this section.

This is RouterA's routing table before the invalid timer for the route to 168.71.8.0 expires. Notice that the invalid timer is already at 39 seconds. This means that the access list was applied at least nine seconds ago.

RouterA#show ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
   D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
   E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
   i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, * - candidate default
   U - per-user static route

Gateway of last resort is 0.0.0.0 to network 0.0.0.0
   168.71.0.0/16 is subnetted, 5 subnets
C    168.71.9.0 is directly connected, Serial1
R    168.71.8.0 [120/1] via 168.71.9.2, 00:00:39, Serial1
R    168.71.7.0 [120/1] via 168.71.6.2, 00:00:11, Serial0
          [120/1] via 168.71.9.2, 00:00:39, Serial1
C    168.71.6.0 is directly connected, Serial0
C    168.71.5.0 is directly connected, Ethernet0
S* 0.0.0.0/0 is directly connected, Ethernet0
RouterA#

RouterA's invalid timers for the routes it has learned from RouterC will expire in 141 seconds.
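The 141-second figure is simply the invalid timer minus the age shown in the routing table. As a small Python sketch of that arithmetic (the function name age_to_seconds is hypothetical, and the timer values are the defaults shown earlier):

```python
# Default RIP timers from the show ip protocol output, in seconds.
INVALID = 180
FLUSH = 240

def age_to_seconds(age):
    """Convert an hh:mm:ss route age from show ip route into seconds."""
    hours, minutes, seconds = (int(part) for part in age.split(":"))
    return hours * 3600 + minutes * 60 + seconds

age = age_to_seconds("00:00:39")   # age of the routes via 168.71.9.2
print(INVALID - age)               # seconds until the invalid timer fires
print(FLUSH - age)                 # seconds until the flush timer fires
```

The same subtraction gives the nominal time left on the flush timer, because both timers restart when an update is received.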

Assume that 141 seconds have just passed. The following debug messages from RouterA show what happens when the invalid timers for the routes it has learned from RouterC fire.

RouterA#sh clock
20:33:30.246 UTC Fri Aug 2 1996
RouterA#debug ip routing
Aug  2 20:36:45: RT: flushed route to 168.71.8.0 via 168.71.9.2 (Serial1)
Aug  2 20:36:45: RT: no routes to 168.71.8.0, entering holddown
Aug  2 20:36:45: RT: flushed route to 168.71.7.0 via 168.71.9.2 (Serial1)
Aug  2 20:37:41: RT: garbage collecting entry for 168.71.8.0
Aug  2 20:37:50: RT: add 168.71.8.0/24 via 168.71.6.2, rip metric [120/2]
RouterA#

Notice in the debug that RouterA did not add a new route to 168.71.7.0 after removing the old routes from RouterC. This is because RouterA originally had two parallel paths to 168.71.7.0. The convergence process in this scenario involved removing any references to routes learned from RouterC and installing a route to 168.71.8.0 via RouterB.

Note

The preceding router output introduced the debug ip routing command. The debug ip routing command causes the router to create a message any time the status of the routing table changes. This change can be the addition of a new route or a change in the status of an existing route.


The Routing Table After Convergence

The following routing table from RouterA shows its new converged state. RouterA is now using RouterB to route packets to 168.71.8.0. Remember that RouterA did not lose its original route to 168.71.8.0 because of a problem with RouterC's Token Ring interface. It lost the route because the access list applied to RouterC's configuration caused RouterC to stop advertising all of its routes to RouterA. RouterC never stopped advertising routes to RouterB.

RouterA#show ip route
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
   D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
   E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
   i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, * - candidate default
   U - per-user static route
Gateway of last resort is 0.0.0.0 to network 0.0.0.0
   168.71.0.0/16 is subnetted, 5 subnets
C    168.71.9.0 is directly connected, Serial1
R    168.71.8.0 [120/2] via 168.71.6.2, 00:00:05, Serial0
R    168.71.7.0 [120/1] via 168.71.6.2, 00:00:06, Serial0
C    168.71.6.0 is directly connected, Serial0
C    168.71.5.0 is directly connected, Ethernet0
S* 0.0.0.0/0 is directly connected, Ethernet0
RouterA#

Notice the different metric in the partial routing tables shown here. When the route for 168.71.8.0 changed from RouterC to RouterB, the metric went from 1 to 2. The metric is the second number inside the square brackets. For example, [120/3] means a metric of 3. The next two bullet points illustrate the change from one metric to another.

  • The original route via 168.71.9.2 has a metric of 1, as follows:

    RouterA#show ip route 168.71.8.0 (output edited for clarity)
    R    168.71.8.0 [120/1] via 168.71.9.2, 00:00:39, Serial1

  • The metric changed to 2 after losing the original route via 168.71.9.2 and replacing it with the new route via 168.71.6.2:

    RouterA#show ip route 168.71.8.0 (output edited for clarity)
    R    168.71.8.0 [120/2] via 168.71.6.2, 00:00:05, Serial0
    								

Keeping track of the metrics and next hop addresses in a routing table will help you when you are troubleshooting an IP routing problem. You will be able to determine the different routes your traffic is taking and compare them to your network diagrams.
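When you compare many routing table lines, it can help to extract the bracketed field programmatically. The following Python sketch is one way to do that. The function name distance_and_metric is hypothetical, and the two sample lines are taken from RouterA's routing tables before and after convergence. The first number in the brackets is the administrative distance (120 for RIP), and the second is the metric.

```python
import re

def distance_and_metric(route_line):
    """Return (administrative distance, metric) from the [x/y] field."""
    match = re.search(r"\[(\d+)/(\d+)\]", route_line)
    if match is None:
        raise ValueError("no [distance/metric] field in line")
    return int(match.group(1)), int(match.group(2))

before = "R    168.71.8.0 [120/1] via 168.71.9.2, 00:00:39, Serial1"
after = "R    168.71.8.0 [120/2] via 168.71.6.2, 00:00:05, Serial0"

print(distance_and_metric(before))   # route learned from RouterC
print(distance_and_metric(after))    # route learned from RouterB
```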

Step-by-Step Review of Convergence

The following steps began at about the time RouterC stopped advertising routes to RouterA. Refer to the time stamps on the previous debug messages.

  1. Aug 2 20:33:45: Invalid timer restarted after the last routing update was received from RouterC. You know that this was the approximate time the last update was received because the invalid timer runs for 180 seconds (three minutes) before firing, and it fires at 20:36:45.

  2. Aug 2 20:33:45: Flush timer restarted after the last routing update was received from RouterC. You know that this is the approximate time because the flush timer is reset every time the invalid timer is reset. The flush timer and the invalid timer run concurrently.

  3. Aug 2 20:36:45: Invalid timer fires for networks that RouterA was using RouterC to reach.

  4. Aug 2 20:36:45: Holddown timer starts for networks that RouterA was using RouterC to reach.

  5. Aug 2 20:37:41: Flush timer fires and terminates the holddown timer for networks that RouterA was using RouterC to reach. This is allowed because the flush timer overrides the holddown timer.

    Holddown did not start until 20:36:45. The holddown timer is 180 seconds (three minutes); left alone, it would have run until 20:39:45.

    Garbage collecting is Cisco-speak for totally removing (flushing) any remaining references to the routes in question.

  6. Aug 2 20:37:50: RouterA receives RouterB's routing update and installs a route to 168.71.8.0. In this case, the update from RouterB arrived nine seconds after RouterA's flush timer fired.

Note

Because RIP sends updates only every 30 seconds, it could have taken anywhere from 0.01 second to 29.99 seconds (rounding to the nearest hundredth of a second) for RouterB's next update to arrive so that RouterA could accept the new route.

The total estimated time of convergence (from last update received to accepting the higher cost route) in this scenario was equal to the following:

   20:37:50 (finish)
 - 20:33:45 (start)
   -----------------
   00:04:05 Time to converge
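The same arithmetic can be checked with Python's datetime module. This is a sketch under the assumption that the invalid timer fired exactly 180 seconds after the last update. The variable names are made up for the example, and the two time stamps come from the debug messages.

```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"
INVALID = timedelta(seconds=180)   # default RIP invalid timer
FLUSH = timedelta(seconds=240)     # default RIP flush timer

invalid_fired = datetime.strptime("20:36:45", FMT)   # from the debug output
route_added = datetime.strptime("20:37:50", FMT)     # from the debug output

last_update = invalid_fired - INVALID      # estimated time of the last update
nominal_flush = last_update + FLUSH        # when the flush timer is nominally due
time_to_converge = route_added - last_update

print(last_update.strftime(FMT))
print(nominal_flush.strftime(FMT))
print(time_to_converge)
```

The nominal flush time works out to 20:37:45, a few seconds away from the 20:37:41 time stamp in the debug output.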


You may have noticed that the times shown in the debug do not match the exact default timer intervals for RIP. This is because the router might take some time to get around to running the task. A RIP timer might expire and raise a flag to be serviced, but it might not be serviced for a few seconds after raising the flag. This is normal behavior and should be considered when determining maximum possible convergence times. Note that the update timer will vary by a random amount, helping prevent all routers from sending their updates at exactly the same time. RFC 1058, which is discussed in Appendix A, explains RIP timers in more detail.

It is important to note that RouterB was advertising 168.71.8.0 and 168.71.7.0 to RouterA every 30 seconds. RouterA was initially ignoring RouterB's advertisements for 168.71.8.0 because the metric from RouterB was 2, whereas the metric from RouterC was 1. RouterA already had RouterB's route to 168.71.7.0 installed in its routing table. Figure 2-4 maps the processes involved in convergence against a timeline.

Figure 2-4. A timeline for convergence.


Debug Messages and Reality

Unfortunately, the debug messages in the previous example are somewhat misleading. Following are the original debug messages for reference:

RouterA#sh clock
20:33:30.246 UTC Fri Aug 2 1996
RouterA#debug ip routing
Aug  2 20:36:45: RT: flushed route to 168.71.8.0 via 168.71.9.2 (Serial1)
Aug  2 20:36:45: RT: no routes to 168.71.8.0, entering holddown
Aug  2 20:36:45: RT: flushed route to 168.71.7.0 via 168.71.9.2 (Serial1)
Aug  2 20:37:41: RT: garbage collecting entry for 168.71.8.0
Aug  2 20:37:50: RT: add 168.71.8.0/24 via 168.71.6.2, rip metric [120/2]
RouterA#

Here is what is really happening:

RouterA#sh clock
20:33:30.246 UTC Fri Aug 2 1996
RouterA#debug ip routing
Aug  2 20:36:45: RT: preparing to advertise 168.71.8.0 via 168.71.9.2 (Serial1) as unreachable
Aug  2 20:36:45: RT: invalid timer expired, no routes to 168.71.8.0, entering holddown, marking route as possibly down
Aug  2 20:36:45: RT: preparing to advertise 168.71.7.0 via 168.71.9.2 (Serial1) as unreachable
Aug  2 20:37:41: RT: flush timer expired terminating holddown for 168.71.8.0
Aug  2 20:37:50: RT: add 168.71.8.0/24 via 168.71.6.2, rip metric [120/2]
RouterA#

When Holddown Is Initiated

A router puts a route into holddown when one of the following happens:

  • The router that was advertising the route stops advertising it for a period of time. This period of time is usually referred to as the invalid period.

  • The router that advertised the original route sends a new advertisement for the same route with a metric greater than the metric stored in the routing table. This usually indicates that there is a routing loop, which causes the route to be immediately deleted and put into holddown instead of being forced to wait for the invalid timer to fire.

  • The router that was advertising the route sends a new advertisement for the route with an unreachable metric, otherwise known as poisoning the route.
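The three conditions in the preceding list can be summarized as a small decision function. The following Python sketch is only a model of the rules as stated here, not the router's actual code. The function name holddown_reason and its parameters are hypothetical, both metrics are assumed to be expressed on the same scale, and 16 is the metric that RFC 1058 defines as unreachable.

```python
from typing import Optional

RIP_INFINITY = 16     # metric RIP uses for "unreachable" (RFC 1058)
INVALID_AFTER = 180   # default invalid timer, in seconds

def holddown_reason(installed_metric: int,
                    seconds_since_update: int,
                    advertised_metric: Optional[int] = None) -> Optional[str]:
    """Return why a route should enter holddown, or None if it should not.

    advertised_metric is the metric in a new advertisement from the router
    that originally advertised the route, or None if nothing new arrived.
    """
    if advertised_metric is None:
        # Condition 1: the advertising router has gone quiet for too long.
        if seconds_since_update >= INVALID_AFTER:
            return "invalid timer expired"
        return None
    if advertised_metric >= RIP_INFINITY:
        # Condition 3: the route was poisoned.
        return "route poisoned"
    if advertised_metric > installed_metric:
        # Condition 2: same source, worse metric (possible routing loop).
        return "metric increased"
    return None

print(holddown_reason(1, 180))       # silence for the whole invalid period
print(holddown_reason(1, 5, 3))      # worse metric from the same router
print(holddown_reason(1, 5, 16))     # poisoned route
```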

Understanding Parallel Paths and Their Effect on Packet Forwarding

Besides providing redundancy in case of a circuit failure, the availability of parallel paths enables a router to load balance packets over the available paths. This can lead to more efficient use of the available bandwidth. The two ways a router can load balance over parallel paths are as follows:

  • Round robin on a packet-by-packet basis. Packet by packet means that the router sends one packet across each of the parallel links, one link at a time.

  • Round robin on a session-by-session basis. Session by session means that instead of just keeping track of the destination subnet, the router stores the entire IP address of the destination host. Each packet to the same host uses the same link.

One type of load balancing is not necessarily better than another. Each type has its own set of benefits and drawbacks. Packet-by-packet load balancing is generally considered the better choice for parallel links that are slower than 64 kbps. However, this method may deliver packets out of order when the propagation delay (the time it takes a packet to reach the other end of a link) for the two paths is not the same.
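The difference between the two methods can be sketched in Python. This is a simplified model, not the router's forwarding code: the class names are hypothetical, and the per-destination cache stands in for the route cache that stores the destination host's address.

```python
from itertools import cycle

PATHS = ["Serial0", "Serial1"]   # RouterA's two parallel paths

class PerPacketBalancer:
    """Rotate through the paths for every packet."""
    def __init__(self, paths):
        self._next = cycle(paths)

    def pick(self, destination):
        return next(self._next)

class PerDestinationBalancer:
    """Rotate only for new destination hosts; reuse the cached choice after that."""
    def __init__(self, paths):
        self._next = cycle(paths)
        self._cache = {}

    def pick(self, destination):
        if destination not in self._cache:
            self._cache[destination] = next(self._next)
        return self._cache[destination]

per_packet = PerPacketBalancer(PATHS)
per_destination = PerDestinationBalancer(PATHS)

# Five packets to the same host, as in a five-packet ping.
print([per_packet.pick("168.71.7.1") for _ in range(5)])
print([per_destination.pick("168.71.7.1") for _ in range(5)])
```

In this model, the first list alternates between the two interfaces, while the second stays on one interface for every packet to the same host.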

In Figure 2-5, the PC has two equal-cost paths (parallel paths) to subnet 168.71.7.0 that can be used to send packets to 168.71.7.1 (the serial 1 interface on RouterB). Remember that routing decisions are most commonly made based only on the network portion of a destination address.

Figure 2-5. PC has two paths to reach subnet 168.71.7.0.


Note

It is possible to manually add a route to a routing table that matches an entire destination IP address. This is called a static host route: static because it is manually configured rather than learned by a dynamic routing protocol, and a host route because it uses all the bits in the destination IP address. This concept is covered in more detail in Chapter 5 in the section on floating static routes.


RouterA could use packet-by-packet load balancing to forward ping packets from the PC to IP address 168.71.7.1, in which case one packet would go via RouterB directly and the other would go indirectly via RouterC.

The following debug messages from RouterA show RouterA using both paths to route pings to 168.71.7.1. Notice that the outbound interface alternates between Serial0 and Serial1.

RouterA#debug ip packet
RouterA#ping 168.71.7.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 168.71.7.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 32/34/40 ms
IP: s=171.68.207.222 (Ethernet0), d=168.71.7.1 (Serial0), g=168.71.6.2, len 74, forward
IP: s=171.68.207.222 (Ethernet0), d=168.71.7.1 (Serial1), g=168.71.9.2, len 74, forward
IP: s=171.68.207.222 (Ethernet0), d=168.71.7.1 (Serial0), g=168.71.6.2, len 74, forward
IP: s=171.68.207.222 (Ethernet0), d=168.71.7.1 (Serial1), g=168.71.9.2, len 74, forward
IP: s=171.68.207.222 (Ethernet0), d=168.71.7.1 (Serial0), g=168.71.6.2, len 74, forward

The following is an important concept: Just because two paths are equal cost from the perspective of one router doesn't mean that both paths have the same number of routers in them, even if the routing metric (unit of measurement) is hop-based.

RouterA has no way of knowing that 168.71.7.1 is assigned to a serial interface in RouterB. RouterA only understands that both RouterB and RouterC indicated that they are connected to subnet 168.71.7.0. RouterA listened to the routing advertisements from RouterB and RouterC indicating that each has a direct connection to 168.71.7.0. RouterA added one hop to the metric for each route because from its perspective, 168.71.7.0 is reachable over one hop (one router) via two different routers.
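RouterA's reasoning can be modeled with a few lines of Python. This sketch makes simplifying assumptions that are not in the router output: each advertisement is reduced to a hop count from the advertising neighbor, with 0 standing for "directly connected," and the function name best_paths is hypothetical. It is not a description of how RIP encodes metrics on the wire.

```python
# Hypothetical advertisements heard by RouterA:
# (neighbor address, network, hops from that neighbor to the network).
ADVERTISEMENTS = [
    ("168.71.6.2", "168.71.7.0", 0),   # RouterB, directly connected
    ("168.71.9.2", "168.71.7.0", 0),   # RouterC, directly connected
    ("168.71.6.2", "168.71.8.0", 1),   # RouterB, one hop beyond (via RouterC)
    ("168.71.9.2", "168.71.8.0", 0),   # RouterC, directly connected
]

def best_paths(advertisements):
    """Keep, per network, every next hop that ties for the lowest metric."""
    table = {}
    for neighbor, network, hops in advertisements:
        metric = hops + 1                     # add the hop to reach the neighbor
        best = table.get(network)
        if best is None or metric < best[0]:
            table[network] = (metric, [neighbor])
        elif metric == best[0]:
            best[1].append(neighbor)          # equal cost: a parallel path
    return table

for network, (metric, next_hops) in sorted(best_paths(ADVERTISEMENTS).items()):
    print(network, metric, next_hops)
```

Under these assumptions, 168.71.7.0 ends up with two next hops at metric 1, while 168.71.8.0 keeps only the path via RouterC because RouterB's advertisement works out to a metric of 2.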

RouterA could also use session-by-session load balancing to forward the same ping packets. The following debug message from RouterA shows RouterA using a single path to forward the packets to 168.71.7.1.

However, the fact that only a single path is being used is obscured because the debug shows only the first packet used to set up the route cache when a router uses session-by-session load balancing. The remaining packets are hidden from the router's debug function because they are being forwarded by the router's CPU at interrupt level using the information from the route cache. If both paths had been used, there would have been a second debug message pointing to (serial 1).

RouterA#deb ip packet
RouterA#ping 168.71.7.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 168.71.7.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 32/34/40 ms
IP: s=171.68.207.222 (Ethernet0), d=168.71.7.1 (Serial0), g=168.71.6.2, len 74, forward
