Shooting Trouble with Frame Relay

In this section I want to reinforce the general things to look for when shooting Frame Relay troubles. Also I want to cover a little more detail on shooting trouble with running routing protocols over Frame Relay, and then discuss loopback testing.

The first question to ask yourself with Frame Relay is, did it ever work? You are obviously going beyond the control you had in the LAN and into the service provider cloud. However, you must still continue your layered approach to troubleshooting. Although Frame Relay is a Layer 2 technology, it does not work across a broken physical link. If these lower layers are broken, you are wasting your time troubleshooting the upper layers.

The commands show ip interface brief, show interface s0, show controllers s 0, as well as link lights are all invaluable Physical Layer tools. Interface resets are a good indication of queued packets that have not been transmitted, hardware problems, or clocking signals. Other error counts, such as packets input and output and carrier transitions beyond your baseline, are worthwhile to analyze. Move up the stack to check the encapsulation or frame type. Are you communicating with the frame switch? Remember that the LMI type, whether it be Cisco, ANSI, or Q933A must match with the local switch port. Look at the keepalive activity between the local router and the frame switch with show frame-relay lmi. Clear the interface counters and watch the Num Status Enq Sent and the Num Status Msgs Rcvd. They should be about the same. Num Status Timeouts tracks how many times the status message was not received within the keepalive window. Perhaps there is an LMI autosensing issue. The service provider provides the DLCIs so that they can do the appropriate mapping to get you to your destination. However, they are not mistake-proof. Maybe the DLCIs are reversed; review the Inverse ARP table with show frame-relay map.
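The counter comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name, the tolerance of one outstanding enquiry, and the wording of the verdicts are assumptions, and the three arguments are the values you read from show frame-relay lmi after clearing the counters.

```python
def lmi_health(enq_sent: int, status_rcvd: int, timeouts: int, tolerance: int = 1) -> str:
    """Classify LMI keepalive health from three counters.

    enq_sent    -> Num Status Enq Sent
    status_rcvd -> Num Status Msgs Rcvd
    timeouts    -> Num Status Timeouts
    tolerance   -> hypothetical allowance for one enquiry still in flight
    """
    if enq_sent == 0:
        return "no enquiries sent: check encapsulation and interface state"
    if status_rcvd == 0:
        return "switch not answering: check LMI type and physical layer"
    if enq_sent - status_rcvd > tolerance or timeouts > 0:
        return "intermittent LMI: replies are being missed"
    return "healthy: enquiries and replies are tracking"


print(lmi_health(20, 20, 0))
print(lmi_health(9, 0, 9))
```

A sent count that climbs while the received count stays at zero usually points at an LMI type mismatch or a Layer 1 problem rather than a DLCI problem.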

NOTE

All unassigned DLCIs reported by the frame switch via LMI are assigned to the main physical interface as multipoint. Because providers make mistakes, and Inverse ARP and CDP are enabled by default, this may cause a security concern. Check show frame-relay map frequently for the appearance of unknown DLCIs.


Use show frame-relay lmi and debug frame-relay lmi like you did back in Examples 8-13 through 8-15 to see whether the router and switch are talking. This debug command does not have as much of an impact on router operations as most debug commands because the output is minimal. It does a great job of showing the LMI exchange for router-to-switch inquiries and switch-to-router reply status messages. The (out)StEnq is the LMI status inquiry sent by the router, and (in)Status is the reply from the frame switch. A full LMI message contains PVC data including DLCI, status, and CIR.

Use the following commands with a little more caution:

  • debug frame-relay events to show counts of packets received on interface

  • debug frame-relay packet to see the packets sent out a Frame Relay interface

Possible packet types include 0x308, which is a signaling message for DLCI 0, and 0x309, which is an LMI message valid with a DLCI of 1023.
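The debug output identifies each frame by its DLCI, which is carried in the Frame Relay address field along with the congestion bits. The following Python sketch decodes that field, assuming the standard two-octet address format (6 + 4 DLCI bits, C/R, FECN, BECN, DE, and the EA bits). The function name and the returned dictionary keys are made up for illustration.

```python
def decode_fr_address(octet1: int, octet2: int) -> dict:
    """Decode a two-octet Frame Relay address field.

    octet1: upper 6 DLCI bits, C/R bit, EA bit (0)
    octet2: lower 4 DLCI bits, FECN, BECN, DE, EA bit (1)
    """
    dlci = (((octet1 >> 2) & 0x3F) << 4) | ((octet2 >> 4) & 0x0F)
    return {
        "dlci": dlci,
        "fecn": bool(octet2 & 0x08),  # forward congestion notification
        "becn": bool(octet2 & 0x04),  # backward congestion notification
        "de": bool(octet2 & 0x02),    # discard eligible
    }


print(decode_fr_address(0x18, 0x41))  # DLCI 100, no congestion bits set
print(decode_fr_address(0xFC, 0xF1))  # DLCI 1023, used by Cisco LMI
```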

Issue a debug serial interface command early on to see the keepalive activity. Change the encapsulation to HDLC to see the keepalive traffic, because if LMI is down for Frame Relay the frame interface will not be able to generate keepalives. It only takes three missed keepalives in a row to take the line down. You'll look at HDLC a little closer in the next chapter, but the point here is that only Cisco HDLC encapsulation supports detection of a looped Layer 1 and still keeps the line protocol up so that you may send test traffic.

Keepalives in the WAN world are truly between you and the service provider, not just your own interface. You look at these more in the next WAN chapter. For example:

  • Myseq is the keepalive sent by the local side.

  • Yourseen is the keepalive sent by the remote side.

  • Mineseen is the local keepalive seen by the remote side.
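To make the relationship between these counters concrete, here is a small Python sketch of the "three missed keepalives" rule mentioned earlier. It is a simplification under stated assumptions: the function and parameter names are hypothetical, sequence-number wraparound is ignored, and the arguments are simply the local sequence number sent and the last local sequence number the remote side reports having seen.

```python
MAX_MISSED_KEEPALIVES = 3  # from the text: three missed keepalives take the line down


def missed_keepalives(local_sent_seq: int, local_seq_seen_by_remote: int) -> int:
    """How many of our keepalives the far side has not yet acknowledged."""
    return local_sent_seq - local_seq_seen_by_remote


def line_protocol_up(local_sent_seq: int, local_seq_seen_by_remote: int) -> bool:
    """True while fewer than three keepalives in a row are outstanding."""
    return missed_keepalives(local_sent_seq, local_seq_seen_by_remote) < MAX_MISSED_KEEPALIVES


print(line_protocol_up(100, 100))  # counters tracking each other
print(line_protocol_up(100, 97))   # three keepalives unanswered
```

When the two numbers stay in step, the far side is hearing you; when the gap keeps growing, traffic is being lost in one direction.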

NOTE

Always remember to turn off all debug processing when finished testing. Remember, u all is short for undebug all or no debug all.


Perhaps the issue is not with configuration at all but with performance. Take a look at the output of show frame-relay traffic in Example 8-32.

Example 8-32. show frame-relay traffic
r1#show frame-relay traffic
Frame Relay statistics:
        ARP requests sent 0, ARP replies sent 0
        ARP request recvd 0, ARP replies recvd

Any way you look at it, if the frame switch runs out of buffers it looks at the DE packets to see what it can discard. In general you can help with performance issues out of the router with priority queuing. Frame Relay traffic shaping assists with switch congestion. Relate this back to the Chapter 3, “Shooting Trouble with IP,” subnetting analogy with the congestion of the cars crossing the Chesapeake Bay Bridge. The Mass Transit Authority (MTA) borrows lanes as appropriate to facilitate roadwork and east-bound and west-bound access. However, the improved EZPass system dedicates one or more lanes to local commuters.

Numerous issues relate to routing protocols, mostly broadcast or multicast in nature. Yet Frame Relay is NBMA. This creates some interesting results and is actually another book in itself. Throughout this book, I have you experiment with some of the more common issues with routing protocols. The next section speaks to running those routing protocols over the Frame Relay data link.

NOTE

Other routing reference material from Cisco Press you can read includes Henry Benjamin's CCNP Practical Studies: Routing; Troubleshooting IP Routing Protocols (Shamim, Aziz, Liu, Martey); and Jeff Doyle's Routing TCP/IP, Volumes I and II (Cisco Press). Another excellent book is Advanced IP Routing in Cisco Networks (McGraw-Hill Osborne) by Terry Slattery and Bill Burton.


Frame Relay and Routing Protocols

Routing protocols such as Open Shortest Path First (OSPF), Enhanced Interior Gateway Routing Protocol (EIGRP), Intermediate System-to-Intermediate System (IS-IS), and Border Gateway Protocol (BGP) all run over Frame Relay. Cisco's implementation of Frame Relay supports various Layer 3 routed protocols including IP, DECnet, AppleTalk, Xerox Network Systems (XNS), Internetwork Packet Exchange (IPX), Connectionless Network Service (CLNS), and so on. Whether Frame Relay or another WAN transport, if there are traffic issues or memory issues due to large routing tables, first make sure you have properly summarized according to the routing protocol rules. Unfortunately, the summarization commands are slightly different for each routing protocol. As I have alluded to in this chapter, reachability issues exist with Frame Relay when using multiple PVCs over a single interface. Depending on the topology, split horizon may be doing its job of reducing routing loops but causing other problems because of the NBMA nature of Frame Relay.

For example, IP split horizon is disabled by default on Frame Relay interfaces. However, this creates a problem with protocols such as IPX and AppleTalk because they rely on split horizon to work properly. To make a long story short, regardless of protocol the workaround is subinterfaces. Subinterfaces resolve many upper-layer routing issues. Multipoint and point-to-point subinterfaces were discussed back in the “Frame Relay at the Physical Layer” section.

Now I'll review EIGRP, then OSPF, then IS-IS, and finally BGP because they are all very common in the real world today. My goal is just to quickly review some of the common commands and help you recognize some of the issues of running these routing protocols over Frame Relay, to prepare you for the Trouble Tickets and practical application. Refer back to the general discussion of IP routing protocols in Chapter 3.

EIGRP over Frame Relay

EIGRP, encapsulated in the IP header as protocol number 88, works well in the LAN and the WAN. However, the topology type has an impact on neighbor adjacencies across the WAN. EIGRP operates over multicast address 224.0.0.10, but Frame Relay is an NBMA technology by default. Nonbroadcast means no multicast either.

The big issues to review with EIGRP over Frame Relay include how EIGRP uses the bandwidth. It is crucial that you configure your bandwidth statements, because by default EIGRP can use up to half of the bandwidth. If you don't configure the bandwidth and you allow EIGRP to use 50 percent of the default 1.544 Mbps for a serial link when you really only have a 56 kbps or 64 kbps link to begin with, and you have a big topology table, and routes start flapping, you probably won't be too happy with EIGRP. You are already familiar with the bandwidth statement, but you can configure the percentage of bandwidth that EIGRP is allowed to use using the ip bandwidth-percent eigrp as-number percent command. For example, ip bandwidth-percent eigrp 100 200 allows EIGRP autonomous system 100 to utilize 200 percent of the configured bandwidth. So if the bandwidth is configured to 25 kbps, EIGRP would be allowed to use up to 50 kbps. Obviously you need to make sure the line is provisioned appropriately. On the other hand, you may want to lessen the percent number so that the routing updates are not consuming all of your bandwidth.
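The arithmetic behind the bandwidth percentage is simple enough to sketch in Python. The function name is hypothetical; the 50 percent default and the sample figures come from the paragraph above.

```python
def eigrp_usable_kbps(configured_bandwidth_kbps: float, percent: float = 50.0) -> float:
    """Bandwidth EIGRP may consume: configured bandwidth times the allowed percentage."""
    return configured_bandwidth_kbps * percent / 100.0


# Default serial bandwidth with the default 50 percent.
print(eigrp_usable_kbps(1544))
# The example from the text: bandwidth 25 with ip bandwidth-percent eigrp 100 200.
print(eigrp_usable_kbps(25, 200))
# What 50 percent should look like on a real 56-kbps link.
print(eigrp_usable_kbps(56))
```

Comparing the first and last results shows why an unconfigured bandwidth statement hurts: EIGRP believes it may use far more than the entire 56-kbps circuit can carry.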

Speaking of provisioning bandwidth for the WAN, the best practice is to configure the bandwidth to be the CIR of the PVC (unless, of course, you have a 0 CIR, but I guess you wouldn't have anything to complain about if that were the case). That method works just fine for point-to-point PVCs, but for multipoint, EIGRP uses the bandwidth on the main interface divided by the number of neighbors to get the neighbor bandwidth. In effect there is a single entry point with multiple exit points so that the bandwidth is equally shared. If there are varying CIRs, it is a better practice to convert to point-to-point subinterfaces. As a workaround, you can manually configure the bandwidth by taking the lowest CIR and multiplying by the number of PVCs. Be careful not to oversubscribe yourself. Adjust the EIGRP bandwidth percent so that you have about a 1:1 ratio for the amount of bandwidth that EIGRP can use.
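Here is a short Python sketch of the multipoint calculations just described. The function names and the CIR values are made up for illustration; only the two rules (interface bandwidth divided by neighbors, and lowest CIR times the number of PVCs) come from the text.

```python
def per_neighbor_kbps(interface_bandwidth_kbps: float, neighbors: int) -> float:
    """EIGRP's view of the bandwidth toward each neighbor on a multipoint interface."""
    return interface_bandwidth_kbps / neighbors


def multipoint_bandwidth_workaround(cirs_kbps: list) -> float:
    """Bandwidth to configure when CIRs vary: lowest CIR times the number of PVCs."""
    return min(cirs_kbps) * len(cirs_kbps)


cirs = [56, 128, 256]  # hypothetical CIRs for three PVCs on one multipoint interface
configured = multipoint_bandwidth_workaround(cirs)
print(configured)
print(per_neighbor_kbps(configured, len(cirs)))
```

With the workaround applied, the per-neighbor share works out to the lowest CIR, so the slowest PVC is not overrun, at the cost of understating the faster ones.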

Another big issue with EIGRP on the WAN in general is making sure you limit the need to know through summarization, outbound route filters, and distribute lists so as not to end up with Stuck in Active (SIA) routes. If a router cannot answer a query because it is too busy or has memory problems, that is one problem, but if the WAN circuit is down or only works in one direction, some packets may be lost. Although not required, a hierarchical design model increases EIGRP's scalability on the WAN.

NOTE

Just a word of caution, EIGRP can form one-way neighbor relationships, but OSPF can't.


You will configure EIGRP in the Trouble Tickets. For now, however, the discussion turns to OSPF over Frame Relay.

OSPF over Frame Relay

OSPF works over nearly every data link out there, including Frame Relay. Like EIGRP, the topology type has a big impact on how adjacencies are created. OSPF is encapsulated in the IP header as protocol number 89. Keep in mind that OSPF works over multicast addresses 224.0.0.5 and 224.0.0.6, but by default Frame Relay as well as ATM and X.25 are NBMA data-link technologies. In OSPF, if you don't have any neighbors you obviously don't have link-state advertisements (LSAs) in the link-state database or any OSPF-learned routes in the routing table.

OSPF considers Frame Relay NBMA to be like any other broadcast media for its data-link transport. The default hello interval is 30 seconds, and the default dead interval is 120 seconds. As you can review in Table 8-3, there are two RFC-compliant modes and three additional modes from Cisco to control how OSPF operates over NBMA. This is not just another table to memorize. These modes really determine how the hello protocol and flooding work. Remember that OSPF uses multicast. The big issue with OSPF over an NBMA topology is that the designated router (DR) and backup designated router (BDR) need a list of all other routers to establish adjacencies.

Table 8-3. OSPF over NBMA Modes
Mode                              Topology                      Addressing        Adjacency
RFC
NBMA                              Full mesh                     One subnet        Manual configuration [*], DR/BDR
Point-to-multipoint               Partial mesh, hub-and-spoke   One subnet        Automatic configuration, no DR/BDR
Cisco
Broadcast                         Full mesh                     One subnet        Automatic configuration, DR/BDR
Point-to-multipoint nonbroadcast  Partial mesh, hub-and-spoke   One subnet        Manual configuration [*], no DR/BDR
Point-to-point                    Partial mesh, hub-and-spoke   Multiple subnets  Automatic configuration, no DR/BDR
                                  (using subinterfaces)

[*] It is good practice to configure neighbor statements on both ends although it may work with one. You can further control OSPF on a neighbor-by-neighbor basis using the [priority], [poll-interval], and [cost] options.

Rather than the default NBMA multipoint connectivity, Frame Relay more commonly operates in a hub-and-spoke topology. Other topologies include partial and full mesh.

NOTE

For the modes in Table 8-3 that do elect DR/BDR, it is important for the routers elected to have a direct connection (PVC) to each of the other routers.


Configure the OSPF network modes using the ip ospf network interface configuration command. Physical interfaces and multipoint subinterfaces default to NBMA. Other interfaces can be set to the RFC-compliant NBMA mode using the ip ospf network non-broadcast command. Nothing defaults to the RFC point-to-multipoint mode, but the command to set it is ip ospf network point-to-multipoint [non-broadcast]. The [non-broadcast] option is for the Cisco-defined mode. The other Cisco modes are set using ip ospf network broadcast and ip ospf network point-to-point. Broadcast mode acts like Ethernet, Token Ring, or FDDI, and point-to-point is the default for point-to-point subinterfaces. So to summarize, either use frame-relay map with the broadcast parameter, subinterfaces as point-to-point links, or OSPF neighbor statements.
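One way to keep the modes straight while troubleshooting is a small lookup, sketched here in Python. The commands and the DR/BDR and neighbor-statement columns follow Table 8-3 and the paragraph above; the hello and dead timers are the usual Cisco defaults for each mode and are my addition, so verify them with show ip ospf interface on your own routers. The names OspfMode, MODES, and default_timers_match are hypothetical.

```python
from typing import NamedTuple


class OspfMode(NamedTuple):
    command: str
    elects_dr: bool
    needs_neighbor_statements: bool
    hello: int  # default hello interval in seconds (assumed Cisco default)
    dead: int   # default dead interval in seconds (assumed Cisco default)


MODES = {
    "nbma": OspfMode("ip ospf network non-broadcast", True, True, 30, 120),
    "point-to-multipoint": OspfMode("ip ospf network point-to-multipoint", False, False, 30, 120),
    "broadcast": OspfMode("ip ospf network broadcast", True, False, 10, 40),
    "point-to-multipoint nonbroadcast": OspfMode(
        "ip ospf network point-to-multipoint non-broadcast", False, True, 30, 120),
    "point-to-point": OspfMode("ip ospf network point-to-point", False, False, 10, 40),
}


def default_timers_match(mode_a: str, mode_b: str) -> bool:
    """Check only the hello/dead defaults; matching timers are necessary, not sufficient."""
    a, b = MODES[mode_a], MODES[mode_b]
    return (a.hello, a.dead) == (b.hello, b.dead)


# A hub left at point-to-multipoint and a spoke on a point-to-point subinterface
# will not agree on timers unless one side is adjusted.
print(default_timers_match("point-to-multipoint", "point-to-point"))
```

A timer mismatch of this kind is a common reason two routers with working PVCs never become OSPF neighbors.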

Refer back to these commands later as you work through the rest of this book. For now move on to IS-IS over Frame Relay.

IS-IS over Frame Relay

Integrated Intermediate System-to-Intermediate System (IS-IS) is more often used in the service provider world, as is BGP. However, IS-IS is an IGP and BGP is an Exterior Gateway Protocol (EGP). IS-IS was developed by ISO to support OSI protocols (especially CLNS and CLNP) and later extended to support IP. IS-IS is not carried in an IP packet but rather encapsulated directly into Layer 2. However, it is more like OSPF than other routing protocols.

Similarities include the following:

  • Both are link-state routing protocols that use the SPF/Dijkstra algorithm.

  • Both use hello packets to form neighbor adjacencies.

  • Areas form a built-in two-level hierarchy.

  • Both are classless routing protocols (support variable-length subnet masking [VLSM]).

  • Both support authentication.

  • Both use the concept of a DR. (IS-IS calls this the DIS.)

Cisco routers can operate as Level 1 (L1), Level 2 (L2), or L1/L2 routers. L1s are similar to OSPF internal routers and hold a copy of the link-state database for the local area. L2s are similar to OSPF backbone routers. They interconnect areas and store interarea information. L1/L2 routers are similar to OSPF Area Border Routers (ABRs); they hold both local-area information and information about remote areas. There are separate adjacencies for L1 and L2. However, adjacencies occur with all routers, not just with the DR like in OSPF.

Although OSPF and IS-IS are quite similar, a couple of things set IS-IS ahead for very large networks. For example, there is not as much confusion over the network types; IS-IS networks are either broadcast or point-to-point. With the IS-IS L1, L2, L1/L2 design, there are fewer link-state packets to process, so it is less processor intensive, too. In OSPF, the age of an LSA starts at 0 and counts upward to a fixed value, MaxAge, which is one hour. In practice, each router refreshes its own LSAs every 30 minutes and floods them across the entire area so that they never reach MaxAge. Obviously, this causes excessive traffic in the core. If in fact you have only one huge OSPF area, every single LSA will need to be refreshed at least once every 30 minutes. Worse yet, if an LSA ages out without being refreshed, there is no longer a route. MaxAge is hard coded into the protocol for OSPF. However, IS-IS counts its equivalent of MaxAge in reverse. It starts at a number that the user defines and counts down to 0. By increasing this refresh interval, you eliminate a lot of the overhead of the protocol. Many service provider networks set the refresh interval to the maximum and run IS-IS with thousands of routers in a single level with no ill effects.

IS-IS is a viable OSPF alternative. A network service access point (NSAP) is the location where OSI network services are provided to the Transport Layer. All routers in the same area must use the same area address. Rather than the router ID that OSPF uses, IS-IS uses the OSI NSAP address. The NSAP structure includes the area identifier; the system ID/MAC; and the selector (00). The area identifier loosely equates to the network. The system ID/MAC identifies an individual device. You can think of the Selector byte kind of like an IP port. L1 and L2 routing are based on a unique system ID. Typically the system ID is the MAC address in the CLNS world and the IP address in the IP world.
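The NSAP structure is easier to see when you split an address into its parts. The following Python sketch does that for a router's NSAP address (the NET), assuming the usual layout of a variable-length area identifier, a 6-byte system ID, and a 1-byte selector. The function name and the sample address are hypothetical; the sample embeds the IP address 192.168.1.1 as the system ID, which is one common convention.

```python
def parse_net(net: str) -> dict:
    """Split a dotted-hex NSAP/NET into area, system ID, and selector."""
    digits = net.replace(".", "")
    is_hex = all(c in "0123456789abcdefABCDEF" for c in digits)
    # Minimum length: 1-byte area + 6-byte system ID + 1-byte selector = 16 hex digits.
    if not is_hex or len(digits) < 16 or len(digits) % 2:
        raise ValueError(f"not a valid NET: {net!r}")
    return {
        "area": digits[:-14],         # everything before the system ID
        "system_id": digits[-14:-2],  # 6 bytes identifying the device
        "selector": digits[-2:],      # 00 for the router itself
    }


print(parse_net("49.0001.1921.6800.1001.00"))
```

Two routers whose parsed area values differ can only form an L2 adjacency, which is a quick thing to check when an expected L1 neighbor never appears.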

When troubleshooting IS-IS over Frame Relay in particular, remember that it does not have parameters like the ip ospf network command. Commands such as show isis topology, show clns route, show isis route, which route, show clns neighbor, show isis database, clear isis *, show frame-relay map, and debug isis adj packet are quite helpful in supporting IS-IS.

As far as Frame Relay is concerned, do not configure ip router isis on the main interface because IS-IS will treat it like a broadcast network and adjacency will not occur. You must have full-mesh PVCs to implement IS-IS in a point-to-multipoint environment. Just as with OSPF over hub-and-spoke Frame Relay where the DR needs to be the hub router, this is true with IS-IS, too. The DR in IS-IS is called the DIS.

BGP, like IS-IS, really doesn't have as many Frame Relay-specific issues but is something you may need to support. If you are interested in more detail in the BGP area, look at Internet Routing Architectures by Sam Halabi (Cisco Press) and Routing TCP/IP, Volume II, by Jeff Doyle and Jennifer DeHaven Carroll (Cisco Press).

BGP over Frame Relay

BGP is a loop-free Application Layer connection-oriented reliable EGP that runs over TCP port 179. Instead of a single metric, there are a series of attributes. BGP runs as EBGP between autonomous systems and as IBGP within an autonomous system.

BGP runs over various data links including Frame Relay. Unlike the other routing protocols, it is encapsulated within TCP. Some of the specific issues with BGP over Frame Relay include the use of the ebgp-multihop command when External BGP (EBGP) neighbors are not directly connected. Also, when using a loopback in the neighbor statement, use neighbor ip-address update-source loopback loopback#. Unlike in an IGP, network statements don't initialize anything; they are what you advertise.

Although not just related to Frame Relay, next-hop-self and synchronization are two commonly misunderstood topics when deploying IBGP. IP carries traffic, but BGP carries routes—and there is no way you want BGP to advertise a bad route. BGP bad routes induce autonomous system inconsistencies and black holes into your network.

The synchronization rule says not to use (or advertise to an external neighbor) a route learned via Internal BGP (IBGP) until a matching route has been learned from an Interior Gateway Protocol (IGP). Hence, BGP must wait until the IGP propagates routing information across the autonomous system, which causes BGP to be synchronized with the IGP. Only then are routes added to the IP routing table. It is practical to turn off synchronization when all routers within an autonomous system are running full-mesh IBGP, which is designed to propagate routes across an autonomous system to another autonomous system when an IGP is not carrying them.
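Stated as a predicate, the synchronization rule looks like the following Python sketch. It is deliberately simplified: the function name is hypothetical, "matching route" is treated as an exact prefix match, and other checks BGP performs (such as next-hop reachability) are left out.

```python
def ibgp_route_usable(prefix: str, igp_routes: set, synchronization: bool = True) -> bool:
    """Apply the synchronization rule to a route learned via IBGP.

    With synchronization on, the route may be used or advertised to an
    external neighbor only if the IGP has also learned a matching route.
    With synchronization off, the IGP check is skipped.
    """
    if not synchronization:
        return True
    return prefix in igp_routes


igp = {"10.1.0.0/16"}  # hypothetical routes currently known through the IGP
print(ibgp_route_usable("172.16.0.0/16", igp))                         # not yet in the IGP
print(ibgp_route_usable("172.16.0.0/16", igp, synchronization=False))  # rule disabled
```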

You can relate synchronization on (the default) to being an apprentice at something. For example, I am always learning or teaching new topics. When I teach a class for the first time, it is helpful to have someone confirm what I am talking about or check my work. When I have some experience teaching a topic, however, I no longer need someone to confirm what I already know; this is the stage similar to when you would turn synchronization off in BGP.

In Figure 8-11 both r1 and r2 should have no synchronization in their router configurations.

Figure 8-11. BGP Next-Hop-Self and Synchronization


Figure 8-11 also illustrates the next-hop-self concept. r1 has both EBGP and IBGP neighbors. When r1 passes along an externally learned route to its internal neighbor r2, it offers r2 the convenience of going through r1 to get to the external system. The command is always configured on the router with an interface in EBGP and IBGP, and in Figure 8-11 that is r1. This command (neighbor r2ipaddress next-hop-self) does not replace the neighbor statement; it is an additional statement.

Remember BGP is an EGP and every change you make affects not only you but also your peers. With any BGP changes, you normally need to reset the neighbor. Although fine in a lab environment, clear ip bgp * can be detrimental to you and whomever you are peering with in practical application. So always replace the * with the neighbor's IP address; otherwise, you may find out very quickly more than you ever wanted to know about network instability and service provider route dampening. In a nutshell, dampening is where the service provider can suppress your routes according to the criteria within the bgp dampening ? command. In 12.0 code and above, soft resets were introduced if your neighbor supports them. Refer to the IOS release notes for “BGP Soft Reset Enhancement.”

Just as you experienced in the chapter scenarios, you really do need to think about how things work in order to support them. You will experiment with running the various routing protocols over Frame Relay in the Trouble Tickets. Obviously Layer 3 and above routing protocols still depend on Layer 2 and Layer 1.

You must take the divide-and-conquer approach on the WAN. Half the battle is determining whether it is your issue or a service provider issue; then you can work your way up the layers. If you get LMI but not your other DLCIs, for instance, you don't have much choice but to contact the service provider so that they can perform some remote loopback testing.

Frame Relay Loopback Testing

Loopback tests can help you define the extent of WAN problems in general. Your provider will certainly be happy and probably more willing to assist if you have already verified your side of things. Figure 8-12 shows four different loopback tests. Depending on your exact equipment, you can familiarize yourself with the appropriate loopback commands and menus.

Figure 8-12. Loopback Testing


The show interfaces s0 command shows the looped status and how the keepalives continue to increment. Use an extended ping to verify the data pattern and size to test connectivity up to the CSU/DSU. If the pings are not at least 80-percent successful, the problem is physical in nature between the local router and CSU/DSU as in Figure 8-12(A).

Figure 8-12(B) shows a local loopback test to test connectivity through the local CSU/DSU. Figure 8-12(C) is a remote loopback through the local CSU/DSU and up to the remote CSU/DSU. Problems here indicate issues in the cloud. Figure 8-12(D) is an external loopback, which could indicate a problem with the remote CSU/DSU.

Keep in mind that timing is important in troubleshooting as well. While you are testing things out, the service provider may have already caught and fixed the problem. So don't be too stumped when you go through all of this and then the data link is up as it should have been in the first place. What I am saying is that it doesn't hurt to repeat commands that you started with in the first place.

Regardless of the loopback testing type, the best command you can run while testing is an extended ping while you monitor your serial interface with a command such as show interfaces s0. To get the extended ping commands, just type ping from privileged mode and make your selections. For example, you could set the repeat count to 100, the datagram size to 1500, and vary the data pattern. The default data pattern is 0xabcd. Try the following data patterns to help detect CSU/DSU or cable issues: 0x0000, 0x1111, 0xaaaa, and 0xffff. Extended ping was introduced back in Chapter 2, “What's in Your Tool Bag.”
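If you record the summary line from each extended ping, a few lines of Python can apply the 80-percent rule of thumb mentioned earlier to every data pattern. This is a sketch under stated assumptions: the function names are hypothetical, the summary lines are illustrative values rather than captured output, and the regular expression assumes the familiar "Success rate is N percent (received/sent)" wording.

```python
import re

SUMMARY = re.compile(r"Success rate is (\d+) percent \((\d+)/(\d+)\)")
THRESHOLD_PERCENT = 80  # from the text: at least 80 percent should succeed


def success_percent(summary_line: str) -> int:
    """Extract the success percentage from a ping summary line."""
    match = SUMMARY.search(summary_line)
    if match is None:
        raise ValueError("no ping summary found")
    return int(match.group(1))


def loopback_passes(summary_line: str) -> bool:
    """True when the ping run meets the 80-percent threshold."""
    return success_percent(summary_line) >= THRESHOLD_PERCENT


# Illustrative results, one per data pattern suggested above.
results = {
    "0x0000": "Success rate is 100 percent (100/100)",
    "0x1111": "Success rate is 62 percent (62/100)",
    "0xaaaa": "Success rate is 100 percent (100/100)",
    "0xffff": "Success rate is 80 percent (80/100)",
}
for pattern, line in results.items():
    print(pattern, "pass" if loopback_passes(line) else "FAIL")
```

A single pattern that fails while the others pass is worth reporting to the provider, because it tends to point at line coding or CSU/DSU settings rather than at a dead circuit.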

NOTE

Remember to restore your router back to its original setting after the loopback tests. It is also a good idea to carry a “hard” loopback plug in your tool bag to round out the possible tests.


The chapter scenarios exposed you to supporting Frame Relay by taking a layered, practical, methodical approach just as the other chapters did. Make sure you save all configurations and repeat any steps in which you need more practice.

Once again it is time for the chapter Trouble Tickets. The plan here is to give you several things to do, let you make mistakes and fix some things on your own, and to introduce other problems that you should have some experience with as a support person.
