Practical Troubleshooting

One day you will get a call that says the network is down. Be prepared to divide and conquer to get to the real problem, and work through the affected layers. Remember that shooting trouble is often about asking questions. Do you ask the equipment or the user? Who is waiting for the results? What has happened? When did it occur? Why? Where did it happen?

Check the obvious first. Plug it in; turn it on. Make sure you have lights and power. Did it ever work? What has changed since it last worked?

Who is complaining? If it is an individual person or machine, it is likely an end-system issue, so check the application and configuration. If it is a group of people or machines, check connectivity and performance. Run through the OSI layers; remember ping and trace; check the routing tables. Is it a local segment issue or does it extend through routers? Is a bad NIC, cable, or device causing performance degradation? Ping yourself, ping someone local, ping the default gateway, and then ping a remote network; or start by pinging the remote network, because a successful reply tests all of these at once. Trace the problem. What is slow: cabling, link, devices? Do you have a baseline for comparison? Use ping, trace, a protocol analyzer, and other tools on an ongoing basis. Did someone else try to fix the problem? Never be too proud to ask for help.
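The ping sequence described above (yourself, a local host, the default gateway, a remote network) can be sketched as a short script. This is a minimal sketch, not a definitive tool: the addresses in PING_LADDER are hypothetical placeholders, and the helper names ping and first_failure are made up for illustration. It assumes the operating system's own ping command is on the path.

```python
import platform
import subprocess

# Ordered from nearest to farthest, following the sequence in the text.
# These addresses are hypothetical placeholders; substitute your own.
PING_LADDER = [
    ("yourself (loopback)", "127.0.0.1"),
    ("local host", "192.168.1.20"),
    ("default gateway", "192.168.1.1"),
    ("remote network", "10.10.10.1"),
]


def ping(address, timeout_s=2):
    """Send one echo request; return True if the ping command reports success.

    The exit code is only a rough signal, so treat a False result as a
    prompt to look closer rather than as proof of a fault.
    """
    # Windows uses -n for the packet count; most other systems use -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", address],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def first_failure(ladder, probe=ping):
    """Walk the ladder in order; return the label of the first target
    that does not answer, or None if every target answers."""
    for label, address in ladder:
        if not probe(address):
            return label
    return None
```

Calling first_failure(PING_LADDER) returns the label of the nearest target that fails, which tells you roughly where to start looking. The probe parameter exists so the logic can be exercised with a stand-in function instead of real pings.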

It is quite helpful to have people with different backgrounds on your team, whether in a test lab or a practical environment. You must be able to prioritize problem areas, and people for that matter. Normally if the CEO has a problem, you take care of it immediately; if everyone else in the company is down, however, they obviously take precedence (one of those layer 8, 9, and 10 things: finance, politics, and religion). Modern-day prioritization says let the CEO wait, so that when you ask for more people or resources the CEO recognizes the need.

Models and Methods

I would like to credit my REDI model source, but it is something I learned about in college while at Johns Hopkins. I think it came from a systems design or database textbook. In any case, the REDI model gives me a systematic mindset for whatever I am doing. It is quite effective yet easy to remember. The basic tenets of the REDI model are as follows:

  • Define Your Requirements

  • Evaluate the Alternatives

  • Design and Develop

  • Implement (and then do it all over again)

If the design and development work is done, you are probably troubleshooting or starting the life cycle all over again. Whether you are taking a certification test, starting a new consulting gig, or applying for a job, taking a structured approach and documenting appropriately are of utmost importance.

Baselining and Documentation

Baselining and documentation are crucial to your long-term success with internetwork troubleshooting. This is not just theory; if you don't know what is normal, how do you know where to begin troubleshooting? What if you get the call saying the network is slow? Slow compared to what? Did you collect any data when the network was installed and running properly, do you audit it from time to time, or have you just taken the put-out-the-fire approach to network management? You should know what information to collect, how to store it, and who is affected by what. Utilization (CPU and bandwidth); memory; error statistics; protocol distribution; traffic statistics; changes in hardware, software, and configuration; and past troubleshooting documentation are all important inputs for troubleshooting. Track patterns and trends. When you find out who or what is affected, and at what time of day, day of week, and month of year, you can compare this to your baseline.
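To make "slow compared to what?" concrete, here is a minimal sketch of comparing current readings against a stored baseline. The metric names, the 25 percent tolerance, and the function name compare_to_baseline are all assumptions chosen for illustration; how you collect the numbers is up to your own tools.

```python
def compare_to_baseline(baseline, current, tolerance=0.25):
    """Return the metrics whose current value differs from the baseline
    by more than `tolerance`, expressed as a fraction of the baseline.

    The result maps each flagged metric to a (baseline, current) pair.
    """
    deviations = {}
    for metric, normal in baseline.items():
        if metric not in current:
            continue  # nothing collected for this metric today
        observed = current[metric]
        if normal == 0:
            # A fractional change is undefined against a zero baseline,
            # so flag any nonzero reading (for example, new errors).
            if observed != 0:
                deviations[metric] = (normal, observed)
            continue
        change = (observed - normal) / normal
        if abs(change) > tolerance:
            deviations[metric] = (normal, observed)
    return deviations


# Hypothetical readings: the baseline was taken when the network ran well.
baseline = {"cpu_percent": 20.0, "bandwidth_percent": 35.0, "input_errors": 0}
current = {"cpu_percent": 22.0, "bandwidth_percent": 80.0, "input_errors": 14}
flagged = compare_to_baseline(baseline, current)
```

With these sample numbers, CPU stays within tolerance while bandwidth utilization and input errors are flagged, which is the kind of pointer a baseline is meant to give you.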

In the form of pictures, charts, maps, tables, and databases, your baseline should include items such as the following:

Model number              Serial number
RAM/Flash memory          IOS version
Config-register settings  Interface statistics
Bandwidth/speed           Clocking
Encapsulation             Duplex
Descriptions              Addresses
Passwords                 Spanning-tree portfast
VLANs                     Routed protocols
Routing protocols         Bridged protocols

In practice, it is also valuable to document the detailed location of equipment (down to the country, state, city, building, wiring closet, rack, and position). Store this information in a log book, on your network, or on your personal digital assistant (PDA) for that matter.

From a practical viewpoint, pictures are wonderful resources. Physical layouts, logical maps, and lists of protocols (routed, bridged, and routing, including redistribution and filtering) can aid you in the process. Include your Internet connections, addressing plans, DHCP, NAT, security plans, and application implementations in your diagrams. What is normal for you may not be what is normal for the next person, so documentation and diagrams are invaluable. Change is truly the only constant in this industry: software, hardware, and configuration. Doctors keep records on your children from the time they are born and throughout their lives, documenting such things as shots, diseases, symptoms, cures, operations, and so on. Do the same for your network. The answer to your problem will be easier to find if it happened before and you documented it in a database of some kind.
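As one way to picture "a database of some kind," the following is a minimal sketch of a troubleshooting log built on Python's standard sqlite3 module. The table layout, column names, and helper functions are assumptions for illustration only; any store you can search by symptom serves the same purpose.

```python
import sqlite3


def open_log(path=":memory:"):
    """Open (or create) the trouble log. The default keeps it in memory;
    pass a file path to keep the log between sessions."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tickets ("
        " id INTEGER PRIMARY KEY,"
        " device TEXT, symptom TEXT, cause TEXT, fix TEXT)"
    )
    return conn


def record_ticket(conn, device, symptom, cause, fix):
    """Document a resolved problem for future reference."""
    conn.execute(
        "INSERT INTO tickets (device, symptom, cause, fix) VALUES (?, ?, ?, ?)",
        (device, symptom, cause, fix),
    )
    conn.commit()


def find_similar(conn, keyword):
    """Return past tickets whose symptom text contains the keyword."""
    rows = conn.execute(
        "SELECT device, symptom, cause, fix FROM tickets"
        " WHERE symptom LIKE ? ORDER BY id",
        (f"%{keyword}%",),
    )
    return rows.fetchall()


# Hypothetical entries, then a lookup the next time a "slow" call comes in.
log = open_log()
record_ticket(log, "r1", "slow file transfers", "duplex mismatch",
              "set both ends to full duplex")
record_ticket(log, "sw2", "no link light", "bad cable", "replaced cable")
matches = find_similar(log, "slow")
```

Searching on the symptom rather than the cause mirrors how problems arrive: the caller tells you what they see, not why it is happening.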

Practical troubleshooting is all about taking the previous methods and models and applying them to the real world. Regardless of the model/method you follow, if you take a systematic approach you will be able to narrow the problem down. Amateurs and pros alike should be able to analyze new and complex problems with an effective strategy. It is not necessary to be a know-it-all to be an effective troubleshooter. A successful troubleshooter is a logical thinker with common sense and people skills. Divide and conquer as you did with the access list Trouble Ticket; narrow possibilities down by the layers. Analyze and resolve. If you can't, escalate the issue to the team that can.

An unsystematic approach is time-consuming and costly. This concept is stressed on the CCNP Troubleshooting exam, CCIE exams, and CCSI exams. Troubleshooting models and methods help reduce a large set of causes to a smaller set of causes or, better yet, a single cause. Then you can solve the problem and document it for future reference to help mitigate the pressures of supporting critical, complex internetworks. Remember, however, that vendor interoperability is far less smooth in practice than theoretical models suggest.
