Shooting Trouble with Switches

Shooting trouble with switches requires that you understand Physical and Data Link Layer targets as well as normal switch operations. A physical and logical map is not just nice to have; it is a necessity in real-world operations. It is not easy to create one if you don't understand how things work, in particular STP for Layer 2 devices. You must continue to follow a consistent methodology, such as those suggested in the first part of the book, to help you isolate fault domains.

It is probably not a bad idea to go back and review the Ethernet and switch beginning checklists and the ending sections on shooting trouble. All of them allude to the fact that interfaces (ports) are the main Data Link Layer target. It is up to you to use known-good switches, modules, ports, cables, connectors, and transceivers for connectivity and performance purposes. Hardware issues could be a bad or loose cable or a faulty module or port, and they may be intermittent, in which case electrostatic discharge (ESD) may have originally caused the problem. Always reseat connections and modules before you call for help. If the Supervisor module is not in slot 1, for example, the system doesn't boot up. In general, disconnect and reconnect; try a different port; try a different known-good cable.

Link lights (LEDs) are good but not always a 100-percent test. A switch load of 80 percent to 100 percent may indicate a broadcast storm. On the other line card modules, the LEDs should flash orange (amber) or green during startup and turn green to indicate successful initialization. Red indicates failure (reseat the module), and flashing orange could be a problem on some modules, although it indicates redundancy on others. As for port link integrity, LED issues can stem from the port, the cable, the network interface card (NIC), or the negotiation for speed/duplex. Utilize your tools. Test with a reliable cable as well as a time domain reflectometer/optical time domain reflectometer (TDR/OTDR) to find cable length and impedance issues. Use protocol analyzers for protocol information, cable testers for cable issues, and network monitors to continuously monitor network traffic. Even with a link light, there could still be a cable problem with lots of packet loss. On the other hand, things may work fine and the LED may just be burned out.

Other connection issues involve fiber, where negotiation is not an issue but connectivity is. A common mistake here is to plug Tx to Tx and Rx to Rx; if you want things to work, you need to connect Tx to Rx.

Pay attention not only to your LEDs but also to your logs for configuration issues. If you see a solid orange light, for instance, this may just indicate a shutdown port. A user or internal process could have shut it down but might not have automatically brought it back up. Perhaps there are speed/duplex issues. The best practice is to hard-code speed and duplex on fixed devices so that there is no negotiation. Perhaps STP has the port in a blocking state because forwarding on it would cause a loop.
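The LED states described above can be collected into a small lookup for a first-pass check. This is only a sketch: the meanings are paraphrased from the text, while the names MODULE_LED, PORT_LED, and first_check are hypothetical, and the "off" entry is an assumption added for illustration. LED behavior varies by platform, so treat the hardware documentation as the authority.

```python
# Hypothetical lookup tables; meanings are paraphrased from the text above.
MODULE_LED = {
    "green": "initialized successfully",
    "red": "failure: reseat the module",
    "flashing orange": "possible problem on some modules, redundancy on others",
}

PORT_LED = {
    "solid orange": "port may be shut down: check the configuration and logs",
    # Assumption for illustration: an unlit LED points at the link-integrity suspects.
    "off": "check port, cable, NIC, and speed/duplex; the LED may also be burned out",
}


def first_check(kind: str, state: str) -> str:
    """Return a first troubleshooting hint for a module or port LED state."""
    table = MODULE_LED if kind == "module" else PORT_LED
    return table.get(state, "not covered here: consult the hardware documentation")


if __name__ == "__main__":
    print(first_check("module", "red"))
    print(first_check("port", "solid orange"))
```

A table like this is only a starting point; as noted earlier, a healthy-looking LED does not prove the link is clean.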

When things are working normally, you want to make them optimal. Performance commands include set port host, which is a macro that combines set spantree portfast, set port channel mode off, and set trunk off. Experiment with timing with and without portfast. Change the logging level for the session with set logging level spantree 7, and observe the time-stamped log messages to see how long the port stays in each state. You can accomplish the same thing on an IOS box with the spanning-tree portfast interface command; the global commands service timestamps debug datetime localtime msec and service timestamps log datetime localtime msec; and the privileged exec command debug spantree events. You can shut down a port and bring it back up to see a topology change and the associated activity. Don't forget that turning on portfast for a port doesn't really change the topology; instead, it keeps the switch from sending a topology change notification (TCN) when the port becomes active.
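As a rough guide to what the time stamps should show, the following sketch estimates how long a newly active port waits before forwarding. It assumes the IEEE 802.1D default forward delay of 15 seconds, which applies once to the listening state and once to the learning state; the function name is hypothetical, and real timers may have been tuned on your bridge.

```python
FORWARD_DELAY_S = 15  # IEEE 802.1D default; assumption: timers left at defaults


def seconds_until_forwarding(portfast: bool,
                             forward_delay: int = FORWARD_DELAY_S) -> int:
    """Estimate the STP wait before a newly active port forwards frames."""
    if portfast:
        # Portfast moves the port straight to forwarding.
        return 0
    # Without portfast: listening (forward delay) + learning (forward delay).
    return 2 * forward_delay


if __name__ == "__main__":
    print("without portfast:", seconds_until_forwarding(False), "seconds")
    print("with portfast:", seconds_until_forwarding(True), "seconds")
```

This covers STP only. Channel and trunk negotiation can add further delay on a port that comes up, which is consistent with the set port host macro turning those features off as well.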

Traffic issues may lead to segmentation of some sort or to upgrading the devices themselves. Use show port, show mac, and network management programs to monitor the average and peak utilization carefully.
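If you collect counters by hand rather than through a network management program, utilization is simple arithmetic. The sketch below assumes you have two octet-counter readings taken a known number of seconds apart; the function and parameter names are hypothetical, and counter wrap is ignored for brevity.

```python
def utilization_percent(octets_start: int, octets_end: int,
                        interval_s: float, link_bps: float) -> float:
    """Percent of link capacity used between two octet-counter readings."""
    bits = (octets_end - octets_start) * 8  # octets to bits; ignores counter wrap
    return 100.0 * bits / (interval_s * link_bps)


def average_and_peak(samples: list[float]) -> tuple[float, float]:
    """Return (average, peak) utilization from a list of percent samples."""
    return sum(samples) / len(samples), max(samples)


if __name__ == "__main__":
    # 37,500,000 octets in 30 seconds on a 100-Mbps port.
    print(utilization_percent(0, 37_500_000, 30, 100_000_000))
    print(average_and_peak([10.0, 50.0, 90.0]))
```

Tracking both figures matters: a port can look lightly loaded on average while its peaks are what users actually notice.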

NOTE

If reset system, the reload command, or a reboot seems to clear the issue but the problem keeps coming back, the reboot is clearly a short-term fix rather than a permanent solution.


Obviously, you may have a software or hardware bottleneck. Know the limitations of your transport and your devices. For example, you still have collisions if Gigabit pipes are feeding 10-Mbps shared users. Use Cisco.com to assist with corrupted IOS issues; reload the operating system; and upgrade to the appropriate feature set. Again, all of this systematic troubleshooting relates back to the OSI model. Do you have power? Are the power supply and fans running? Are devices turned on? Do they have link lights? Work your way up the layers. (Refer to Table 1-2 in Chapter 1, “Shooting Trouble,” for a review of the OSI layers.)

Once again it is time for the chapter Trouble Tickets. The plan here is to give you several things to do, let you make mistakes and fix some things on your own, and introduce other problems that you should have some experience with as a support person. Routing and switching issues are assumed knowledge for the Cisco support person today.
