Chapter Eleven
Troubleshooting Procedures and Best Practices

Introduction

Even the best-designed and best-maintained networks will fail at some point. Such a failure might be as dramatic as a failed server taking down the entire network or as routine as a single computer system being unable to print. Regardless of the problem you face, as a network administrator, you will spend a sizable portion of your time troubleshooting problems with the network, the devices connected to it, and the people who use it. In each case, the approach to the problem is as important as the troubleshooting process itself. Although some steps are common to the troubleshooting process, few problems you face will be alike because so many variables are involved.

As you will see in this chapter, troubleshooting is about more than just fixing a problem: It includes isolating the problem and taking the appropriate actions to prevent it from happening again. The ability to effectively troubleshoot network-related problems goes beyond technical knowledge and includes the ability to think creatively to get to the root of a problem. In addition, strong communication skills can turn a difficult and seemingly impossible troubleshooting task into an easy one. Although the role of the network administrator can be a solitary one, you will be surprised at just how much interaction you’ll have with users and how important this element of your role will be.

This chapter provides a comprehensive look into the many facets that make up an effective troubleshooting strategy. In addition, it examines specific skills and techniques you can use to quickly isolate a network-related problem. It also examines scenarios in which these troubleshooting skills come into play.

Note

Who says?  Ask 10 network administrators about troubleshooting best practices, and you will no doubt get 15 different answers. There is no universally accepted definition of troubleshooting best practices and no universally accepted procedure for applying them. With this in mind, the information provided in this chapter specifies troubleshooting best practices identified by CompTIA. Whether these are the best practices in real-world application is a matter of debate. However, there is no debate that these are the best practices that will be on the exam.

Troubleshooting Basics

There is no magic or innate ability that makes a good network troubleshooter. You will hear tales of people who have a gift for troubleshooting, but those who can troubleshoot well aren’t necessarily gifted. Instead, good troubleshooters have a special combination of skills. The ability to competently and confidently troubleshoot networks comes from experience, a defined methodology, and sometimes just plain luck.

One of the factors that makes troubleshooting such a difficult task is the large number of variables that can come into play. Although it is difficult to preemptively list all the factors you have to consider while troubleshooting networks, this chapter lists a few to help you start thinking in the right direction. When you are troubleshooting, thinking in the right direction is half the battle. Considering that most network administrators spend the majority of their troubleshooting time working on the devices connected to the network rather than on the network infrastructure itself, it is worth looking at some of the factors that can affect troubleshooting of devices connected to the network. First, let’s look at the difference between troubleshooting a server and troubleshooting a workstation system.

Troubleshooting Servers and Workstations

One often overlooked but important distinction in troubleshooting networks is the difference between troubleshooting a server computer and troubleshooting a workstation system. Although the fundamental troubleshooting principles of isolation and problem determination are often the same in different networks, the steps taken for problem resolution are often different from one network to another. Make no mistake: When you find yourself troubleshooting a server system, the stakes are much higher than with workstation troubleshooting, and therefore it’s considerably more stressful. Let’s take a look at a few of the most important distinctions between workstation and server troubleshooting:

• Pressure—It is difficult to capture in words the pressure you feel when troubleshooting a downed server. Troubleshooting a single workstation with one anxious user is stressful enough, and when tens, hundreds, or even thousands of users are waiting for you to solve the problem, the pressure can be enough to unhinge even the most seasoned administrator.

• Planning—Troubleshooting a single workstation often requires very little planning. If work needs to be done on a workstation, it can often be done during a lunch break, after work, or even during the day. If work needs to be done on a server, particularly one that is heavily accessed, you might need to wait days, weeks, or even months before you have a good time to take down the server so that you can work on it and fix the problem.

• Time—For many organizations, every minute a server is unavailable is measured as much in dollars as it is in time. Servers are often relied on to provide 24-hour network service, and anything less is often considered unacceptable. Although it might be necessary to take a server down at some point for troubleshooting, you will be expected to account for every minute that it is down.

• Problem determination—Many people who have had to troubleshoot workstation systems know that finding the problem often involves a little trial and error. (Swap out the RAM; if that doesn’t work, replace the power supply, and so on.) Effective server troubleshooting involves very little trial and error—if any at all. Before the server is powered down, the administrator is expected to have a good idea of the problem.

• Expertise—Today, many people feel comfortable taking the case off their personal computers to add memory, replace a fan, or just have a quick peek. Although it is based on the same technologies as PC hardware, server hardware is more complex, and those who manage and maintain servers are expected to have an advanced level of hardware and software knowledge, often reinforced by training and certifications.

These are just a few of the differences in the troubleshooting practices and considerations between servers and workstations. As this chapter discusses troubleshooting, the focus is mainly on the server side of troubleshooting. This helps explain why some of the troubleshooting procedures might seem rigid and unnecessary on a workstation level.

Exam Alert

Workstations and servers  The Network+ exam does not require you to identify any specific differences between workstation and server troubleshooting, but it does require background knowledge of general troubleshooting procedures and the factors that influence how a network problem is approached.

General Troubleshooting Considerations

Knowing the differences between procedures and approaches for troubleshooting servers and for troubleshooting workstations is valuable, but a seemingly endless number of other considerations exist. Each of these other factors can significantly affect the way you approach a problem on the network. The following list contains some of the obvious and perhaps not so obvious factors that come into play when troubleshooting a network:

• Time—The time of day can play a huge role in the troubleshooting process. For instance, you are likely to respond differently to a network problem at 10 a.m., during high network use, than at 8 p.m., when the network is not being utilized as much. The response to network troubleshooting during high-use periods is often geared toward a Band-Aid solution, just getting things up and running as soon as possible. Finding the exact cause of the problem and developing a permanent fix generally occurs when there is more time.

• Network size—The strategies and processes used to troubleshoot small networks of 10 to 100 computer systems can be different from those used to troubleshoot networks consisting of thousands of computers.

• Support—Some network administrators find themselves working alone, as a single IT professional working for a company. In such cases, the only available sources might include telephone, Internet, or manufacturer support. Other network administrators are part of a large IT department. In that type of environment, the troubleshooting process generally includes a hierarchical consultation process.

• Knowledge of the network—It would be advantageous if uniformity existed in the installation of all networks, but that isn’t the case. You could be working on a network with ring or star topology. Before you start troubleshooting a network, you need to familiarize yourself with its layout and design. The troubleshooting strategies you employ will be affected by your knowledge of the network.

• Technologies used—Imagine being called in to troubleshoot a wide area network (WAN) that includes multiple Linux servers, a handful of NetWare servers, an old Windows NT 3.51 server, and multiple Macintosh workstations. Your knowledge of these technologies will dictate how, if at all, you are going to troubleshoot the network. There is no shame in walking away from a problem you are unfamiliar with. Good network administrators always recognize their knowledge boundaries.

These are just a few of the factors that will affect your ability to troubleshoot a network. There are countless others.

The Art of Troubleshooting

At some point in your networking career, you will be called on to troubleshoot network-related problems. Correctly and swiftly identifying these problems is not done by accident; rather, effective troubleshooting requires attention to some specific steps and procedures. Although some organizations have documented troubleshooting procedures for their IT staff members, many do not. Whether you find yourself using these exact steps in your job is debatable, but the general principles remain the same. The CompTIA objectives list the troubleshooting steps as follows:

Step 1. Information gathering—identify symptoms and problems.

Step 2. Identify the affected areas of the network.

Step 3. Determine if anything has changed.

Step 4. Establish the most probable cause.

Step 5. Determine if escalation is necessary.

Step 6. Create an action plan and solution identifying potential effects.

Step 7. Implement and test the solution.

Step 8. Identify the results and effects of the solution.

Step 9. Document the solution and the entire process.

The following sections examine each area of the troubleshooting process.

Step 1: Information Gathering—Identify Symptoms and Problems

Troubleshooting a network can be difficult at the best of times, but trying to do it with limited information makes it that much harder. Trying to troubleshoot a network without all the information can, and often will, cause you to troubleshoot the wrong problem. Without the correct information, you could literally find yourself replacing a toner cartridge when someone just used the wrong password.

With this in mind, the first step in the troubleshooting process is to establish exactly what the symptoms of the problem are. This stage of the troubleshooting process is all about information gathering. To get this information, you need knowledge of the operating system used, good communication skills, and a little patience. It is important to get as much information as possible about the problem before you charge out the door with that toner cartridge under your arm. You can glean information from three key sources: the computer (in the form of logs and error messages), the computer user experiencing the problem, and your own observation. These sources are examined in the following sections.

Information from the Computer

If you know where to look and what to look for, a computer can help reveal where a problem lies. Many operating systems provide error messages when a problem is encountered. A Linux system, for example, might present a Segmentation Fault error message, which often indicates a memory-related error. Windows, on the other hand, might display an Illegal Operation error message to indicate a possible memory or application failure. Both of these system error messages can be cross-referenced with the operating system’s website information to identify the root of the problem. The information provided in these error messages can at times be cryptic, so finding the solution might be tricky.

In addition to the system-generated error messages, network operating systems can be configured to generate log files after a hardware or software failure. An administrator can then view these log files to see when the failure occurred and what was being done when the crash occurred. Windows 2000/2003/2008/XP/Vista displays error messages in the Event Viewer, Linux stores many of its system log files in the /var/log directory, and NetWare creates a file called abend.log, which contains detailed information about the state of the system at the time of the crash. When you start the troubleshooting process, make sure that you are familiar enough with the operating system being used to be able to determine whether it is trying to give you a message.
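
To make log review more concrete, the following Python sketch scans log text for lines that look like errors. It is only an illustration: the keyword list, the function names, and the sample lines are assumptions made for this example, and real log formats vary by operating system and service.

```python
import re
from pathlib import Path

# Assumed keywords; adjust them to the logs you actually work with.
ERROR_PATTERN = re.compile(
    r"\b(error|fail(ed|ure)?|segfault|panic)\b", re.IGNORECASE
)


def find_error_lines(lines):
    """Return (line_number, text) pairs for lines that look like errors."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log_file(path):
    """Scan one log file, replacing undecodable bytes instead of failing."""
    with Path(path).open(encoding="utf-8", errors="replace") as handle:
        return find_error_lines(handle)


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real system.
    sample = [
        "Jan 10 09:14:02 srv1 kernel: eth0: link up",
        "Jan 10 09:15:40 srv1 app[212]: segfault at 0 ip 00000000",
        "Jan 10 09:16:01 srv1 cron[300]: job finished",
    ]
    for number, text in find_error_lines(sample):
        print(number, text)
```

On a Linux system, scan_log_file could be pointed at a file under /var/log, assuming you have permission to read it. A keyword scan like this only narrows the search; you still need to read the surrounding entries to understand what the system was doing at the time.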

Exam Alert

Error message storage  For the Network+ exam, you may need to know that the troubleshooting process requires you to read system-generated log errors.

Information from the User

Your communication skills will be most needed when you are gathering information from end users. Getting accurate information from a computer user or anyone with limited technical knowledge can be difficult. Having a limited understanding of computers and technical terminology can make it difficult for a nontechnical person to relay the true symptoms of a problem. However, users can convey what they are trying to do and what is not working. When you interview an end user, you will likely want the following information:

• Error frequency—If it is a repeating problem, ask for the frequency of the problem. Does the problem occur at regular intervals or sporadically? Does it happen daily, weekly, or monthly?

• Applications in use—You will definitely want to know what applications were in use at the time of the failure. Only the end user will know this information.

• Past problems—Ask whether this error has been a problem in the past. If it has and it was addressed, you might already have your fix.

• User modifications—A new screensaver, a game, or other such programs have ways of ending up on users’ systems. Although many of these applications can be installed successfully, sometimes they create problems. When you are trying to isolate the problem, ask the user whether any new software additions have been made to the system.

• Error messages—Network administrators cannot be at all the computers on a network all the time. Therefore, they are likely to miss an error message when it is displayed onscreen. The end user might be able to tell you what error message appeared.

Note

Installation policies  Many organizations have strict policies about what can and cannot be installed on computer systems. These policies are not in place to exercise the administrator’s control but rather to prevent as many crashes and failures as possible. Today many harmless-looking freeware and trial programs have Trojan horses or spyware attached. When executed, they can cause considerable problems on a system.

Observation Techniques

Finding a problem often involves nothing more than using your eyes, ears, and nose to locate the problem. For instance, if you are troubleshooting a workstation system and you see a smoke cloud wafting from the back of the system, looking for error messages might not be necessary. If you walk into a server room and hear the CPU fan grinding, you are unlikely to need to review the server logs to find the problem.

Observation techniques often come into play when you’re troubleshooting connectivity errors. For instance, looking for an unplugged cable and confirming that the light-emitting diode (LED) on the network interface card (NIC) is lit requires observation on your part. Keeping an eye as well as a nose out for potential problems is part of the network administrator’s role and can help in identifying a situation before it becomes a problem.

Exam Alert

Observation techniques  For the Network+ exam, remember that observation techniques play a large role in the preemptive troubleshooting process, which can result in finding a small problem before it becomes a large one.

Effective Questioning Techniques

Regardless of the method you are using to gather information about a problem, you will need answers to some important questions. When approaching a problem, consider the following questions:

• Is only one computer affected, or has the entire network gone down?

• Is the problem happening all the time, or is it intermittent?

• Does the problem happen at specific times, or can it occur at any time?

• Has this problem occurred in the past?

• Has any network equipment been moved recently?

• Have any new applications been installed on the network?

• Has anyone else tried to correct the problem; if so, what has that person tried?

• Is there any documentation that relates to the problem or to the applications or devices associated with the problem?

By answering these questions, as well as others, you will gain a better idea of exactly what the problem is.

Step 2: Identify the Affected Areas of the Network

Some computer problems are isolated to a single user in a single location; others affect several thousand users spanning multiple locations. Establishing the affected area is an important part of the troubleshooting process, and it often dictates the strategies you use in resolving the problem.

Exam Alert

Be thorough  On the Network+ exam, you might be provided with either a description of a scenario or a description augmented by a network diagram. In either case, you should read the description of the problem carefully, step by step. In most cases, the correct answer is fairly logical, and the wrong answers can be identified easily.

Problems that affect many users are often connectivity issues that disable access for many users. Such problems can often be isolated to wiring closets, network devices, and server rooms. The troubleshooting process for problems isolated to a single user often begins and ends at that user’s workstation. The trail might indeed lead you to the wiring closet or server, but it is not likely that the troubleshooting process would begin there. Understanding who is affected by a problem can provide the first clues about where the problem exists.

As a practical example, assume that you are troubleshooting a client connectivity problem in which a Windows client is unable to access the network. You can try to ping the server from that system and, if the ping fails, ping the same server from one or two more client systems. If all tested client systems are unable to ping the server, the troubleshooting procedure should focus not on the clients but on something common to all of them, such as the DHCP server or network hub.
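
The reasoning in this example can be written down as a small decision function. The Python sketch below assumes you have already run the ping tests by hand and recorded, for each client, whether the server answered. The function name and the result labels are hypothetical.

```python
def classify_scope(results):
    """Classify a connectivity problem from per-client test results.

    results maps a client name to True (server reachable) or False.
    """
    if not results:
        return "no data"
    failures = [name for name, reachable in results.items() if not reachable]
    if not failures:
        return "no fault seen"
    if len(failures) == len(results):
        # Every tested client failed, so suspect something they share,
        # such as the server, the DHCP server, or a hub or switch.
        return "shared component"
    # Only some clients failed, so start at those workstations.
    return "client-specific"


if __name__ == "__main__":
    # Hypothetical results from three clients.
    print(classify_scope({"pc1": False, "pc2": False, "pc3": False}))
```

Testing only two or three clients is a small sample, so treat the result as a pointer toward where to look first rather than as proof.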

Challenge

You are a network administrator managing a network that has four separate network segments: sales, administration, payroll, and advertising. Late on Tuesday evening, you get a call from several members of the sales staff, complaining that they are unable to access a network printer. How would you troubleshoot this scenario?

Because the reported problem has a common thread, the sales department, it is likely that there is a connectivity issue with the network segment the sales group is on. The problem could be a downed router, switch, hub, or authentication server. Whatever the cause, you can more easily isolate the problem if you know the location.

Consider how you would handle this troubleshooting scenario differently if the error reports came simultaneously from the sales, payroll, and advertising groups.
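
One way to look for the common thread in a batch of problem reports is to count them by network segment. The following Python sketch does this for the four segments in the challenge. The report format (a user name paired with a segment name) and the function names are assumptions made for illustration.

```python
from collections import Counter


def affected_segments(reports):
    """Count problem reports per segment; reports is a list of (user, segment)."""
    return Counter(segment for _user, segment in reports)


def likely_fault_location(reports, all_segments):
    """Suggest where to start looking, based on which segments reported trouble."""
    counts = affected_segments(reports)
    hit = [segment for segment in all_segments if counts[segment] > 0]
    if not hit:
        return "no reports"
    if len(hit) == 1:
        return f"segment: {hit[0]}"
    if len(hit) == len(all_segments):
        return "network-wide"
    # Several segments but not all: look for a device they have in common.
    return "shared by: " + ", ".join(hit)


if __name__ == "__main__":
    segments = ["sales", "administration", "payroll", "advertising"]
    calls = [("ann", "sales"), ("bob", "sales"), ("cy", "sales")]
    print(likely_fault_location(calls, segments))
```

If the reports came from sales, payroll, and advertising at the same time, the function would report all three segments, which suggests looking at a device or service those segments share rather than at the sales segment alone.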

Step 3: Determine if Anything Has Changed

Whether a problem exists with a workstation’s access to a database or with an entire network, keep in mind that the system was working at some point. Although many users claim that the “computer just stopped working,” this is unlikely. Far more likely is that changes to the system or the network caused the problem. As much as users try to convince you that computers do otherwise, computer systems do not reconfigure themselves. Therefore, establishing what was done to a system will lead you in the right direction to isolate and troubleshoot a problem.

Changes can occur on the network, server, or workstation. Each of these is discussed in the following sections.

Exam Alert

Obvious solutions  In the Network+ exam, avoid discounting a possible answer because it seems too easy. Many of the troubleshooting questions are based on possible real-world scenarios, many of which do have easy or obvious solutions.

Changes to the Network

Most of today’s networks are dynamic and continually growing to accommodate new users and new applications. Unfortunately, these network changes, although intended to increase network functionality, may inadvertently cause additional problems. For instance, a new computer system added to a network might be installed with a duplicate computer name or IP address, which would prevent another computer that has the same name or address from accessing the network. Other changes that can create problems on the network include adding or removing a hub or switch, changing the network’s routing information, or adding or removing a server. In fact, almost every change that the network administrator makes to the network can potentially have an undesirable impact elsewhere on the network. For this reason, all changes made to the network should be fully documented and fully thought out.
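
If you keep an inventory of the systems on the network, checking it for duplicate names or addresses is straightforward. The following Python sketch assumes a hypothetical inventory in which each system is a dictionary with "name" and "ip" keys. Your own documentation may well be structured differently.

```python
from collections import defaultdict


def find_duplicates(inventory, field):
    """Return {value: [hosts]} for every value of field used more than once.

    Values are compared case-insensitively, which suits computer names.
    """
    seen = defaultdict(list)
    for host in inventory:
        seen[host[field].lower()].append(host)
    return {value: hosts for value, hosts in seen.items() if len(hosts) > 1}


if __name__ == "__main__":
    # Hypothetical inventory; the third entry clashes with the first two.
    inventory = [
        {"name": "PC-01", "ip": "192.168.1.10"},
        {"name": "PC-02", "ip": "192.168.1.11"},
        {"name": "pc-01", "ip": "192.168.1.11"},
    ]
    print(sorted(find_duplicates(inventory, "ip")))
    print(sorted(find_duplicates(inventory, "name")))
```

A check like this is only as good as the inventory behind it, which is one more reason to document every change made to the network.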

Note

Faulty hardware  Although recent changes to systems or networks account for many network problems, some problems do happen out of the blue. Faulty hardware is a good example.

Changes to the Server

Part of a network administrator’s job involves some tinkering with the server. Although this might be unavoidable, it can sometimes lead to several unintentional problems. Even the most mundane of all server tasks can have a negative impact on the network. The following are some common server-related tasks that can cause problems:

• Changes to user accounts—For the most part, changes to accounts do not cause any problems, but sometimes they do. If after making changes to user accounts, a user or several users are unable to log on to the network or access a database, the problem is likely related to the changes made to the accounts.

• Changes to permissions—Data is protected by permissions that dictate who can and cannot access the data on the drives. Permissions are an important part of system security, but changes to permissions can inadvertently prevent users from being able to access specific files.

• Patches and updates—Part of the work involved in administering networks is to monitor new patches and updates for the network operating system and install them as needed. It is not uncommon for an upgrade or a fix to an operating system to cause problems on the network.

• New applications—From time to time, new applications and programs, such as productivity software, firewall software, or even antivirus software, have to be installed on the server. When any kind of new software is added to the server, it might cause problems on the network. Knowing what has recently been installed can help you isolate a problem.

• Hardware changes—Either because of failure or expansion, hardware on the server might have to be changed. Changes to the hardware configuration on the server can cause connectivity problems.

Changes to the Workstation

The changes made to the systems on the network are not always under the control of the network administrator. Often, the end user performs configuration changes and some software installations. Such changes can be particularly frustrating to troubleshoot, and many users are unaware that the changes they make can cause problems. When looking for changes to a workstation system, consider the following:

• Network settings—One of the configuration hotspots for workstation computer systems is the network settings. If a workstation is unable to access the network, it is a good idea to confirm that the network settings have not been changed.

• Printer settings—Many printing problems can be isolated to changes in the printer configuration. Some client systems, such as Linux, are better than others at restricting access to administrative configuration screens; Windows, for example, leaves such screens open to anyone who wants to change the configuration. When printing problems are isolated to a single system, changes in the configuration could be the cause.

• New software—Many users love to download and install nifty screensavers or perhaps the latest 3D adventure games on their work computers. The addition of extra software can cause the system to fail. Confirm with the end user that new software has not been added to the system recently.
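
Confirming whether settings have changed is much easier when you have a record of how the workstation was configured while it worked. The following Python sketch compares two such snapshots. The snapshots are plain dictionaries with hypothetical setting names and values; how you collect them depends on the operating system and is not shown here.

```python
def diff_settings(baseline, current):
    """Compare a known-good settings snapshot with the current one."""
    added = {key: current[key] for key in current.keys() - baseline.keys()}
    removed = {key: baseline[key] for key in baseline.keys() - current.keys()}
    changed = {
        key: (baseline[key], current[key])
        for key in baseline.keys() & current.keys()
        if baseline[key] != current[key]
    }
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    # Hypothetical snapshots of one workstation's network settings.
    known_good = {"ip": "192.168.1.20", "gateway": "192.168.1.1", "dns": "192.168.1.2"}
    today = {"ip": "192.168.1.20", "gateway": "192.168.1.254", "proxy": "10.0.0.5"}
    for kind, items in diff_settings(known_good, today).items():
        print(kind, items)
```

Each entry in the result is a question to put to the end user: who changed this setting, and when?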

Note

Consider a system that could previously log on to the network but now receives an error message, saying that it cannot log on because of a duplicate IP address. A duplicate IP address means that two systems on the network are attempting to connect to the network using the same IP address. As you know, there can be only one. This often happens when a new system has been added to a network where Dynamic Host Configuration Protocol (DHCP) is not being used.

Step 4: Establish the Most Probable Cause

There can be many different causes for a single problem on a network, but with appropriate information gathering, it is possible to eliminate many of them. When looking for a probable cause, it is often best to look at the easiest solution first and then work from there. Even in the most complex of network designs, the easiest solution is often the right one. For example, if a single user cannot log on to a network, it is best to confirm network settings before replacing the NIC. Remember, though, that at this point you are only trying to determine the most probable cause, and your first guess might in fact be incorrect. It might take a few tries to determine the correct cause of the problem.
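
The advice to look at the easiest solution first can be expressed as a simple ordering of checks. In the Python sketch below, each candidate cause has a description, a rough effort score, and a function that returns True when that item is healthy. The effort scores and check functions are placeholders that you would supply yourself.

```python
def first_failing_check(checks):
    """Run checks from least to most effort and return the first that fails.

    checks is a list of (description, effort, check_function) tuples.
    Returns None if every check passes.
    """
    for description, _effort, check in sorted(checks, key=lambda item: item[1]):
        if not check():
            return description
    return None


if __name__ == "__main__":
    # Placeholder checks for a user who cannot log on; results are hard-coded.
    checks = [
        ("network interface card", 3, lambda: True),
        ("network settings", 1, lambda: False),
        ("cable and link light", 1, lambda: True),
    ]
    print(first_failing_check(checks))
```

Because the function stops at the first failure, the expensive step of replacing the NIC is never reached if a cheaper check already explains the symptom. If the suspected cause turns out to be wrong, you can remove it from the list and run the remaining checks.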

Challenge

A user calls you to inform you that she is unable to access email. After asking a few questions, you determine that the user has only recently started with the company and has been unable to get email since her start date. What, then, is the likely source of the problem?

In this scenario, there can be several causes of the problem: perhaps network connectivity, perhaps a bad NIC, or perhaps email has never been configured on her workstation. Check to see whether email has been configured. If it has not, configure it. If it has been configured and it is working correctly, consider the next most likely cause of the problem.

Step 5: Determine if Escalation Is Necessary

Sometimes the problems we encounter fall outside the scope of our knowledge. Very few organizations expect their administrators to know everything, but they do expect administrators to be able to fix any problem. To do this, additional help is often needed.

Note

Finding solutions  System administration is often as much about knowing whom and what to refer to in order to get information about a problem as it is about actually fixing the problem.

Technical escalation procedures do not follow a specific set of rules; rather, the procedures to follow vary from organization to organization and situation to situation. Your organization might have an informal arrangement or a formal one requiring documented steps and procedures to be carried out. Whatever the approach, there are general practices that you should follow for appropriate escalation.

Unless otherwise specified by the organization, the general rule is to start with the closest help first and work out from there. If you work in an organization that has an IT team, talk with others in your team; every IT professional has had different experiences, and someone else may know the issue at hand. If you are still struggling with the problem, it is common practice to notify a supervisor or head administrator, especially if the problem is a threat to the server’s data or can bring down the server.

Suppose that you are the server administrator who notices a problem with a hard disk in a RAID 1 array on a Linux server. You know how to replace drives in a failed RAID 1 configuration, but you have no experience working with software RAID on a Linux server. This situation would most certainly require an escalation of the problem. The job of server administrator in this situation is to notice the failed RAID 1 drive and to recruit the appropriate help to repair the RAID failure within Linux.

Note

Passing the buck  When you’re confronted with a problem, it is yours until it has been solved or until it has been passed to someone else. Of course, the passing on of an issue requires that both parties be aware that it has been passed on.

Challenge

You have noticed that a network card in a NetWare server appears to have failed. The server supports 300 users. Although you have considerable experience in replacing failed NICs, you have no experience in configuring network cards within NetWare. How do you proceed?

Because you have noticed the problem, it is yours until it is resolved or passed to someone who can resolve it. In this situation, an escalation procedure is required because, with your limited NetWare experience, you’ll most likely need some help. Three hundred users need the server up and running as soon as possible, and there is no room for trial and error.

Step 6: Create an Action Plan and Solution Identifying Potential Effects

After identifying a cause, but before implementing a solution, develop a plan for the solution. This is particularly a concern for server systems in which taking the server offline is a difficult and undesirable prospect. After identifying the cause of a problem on the server, it is absolutely necessary to plan for the solution. The plan must include details about when the server or network should be taken offline and for how long, what support services are in place, and who will be involved in correcting the problem.
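
Whether the plan is formal or informal, it helps to record the same items each time. The following Python sketch shows one possible way to hold them. The class name, field names, and sample values are all hypothetical, and an organization with documented procedures will have its own format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class ActionPlan:
    """The items a server action plan should cover before work begins."""

    cause: str
    offline_start: datetime
    expected_downtime: timedelta
    support_services: list = field(default_factory=list)
    people: list = field(default_factory=list)

    @property
    def back_online_by(self):
        """Time by which the server is expected to be available again."""
        return self.offline_start + self.expected_downtime

    def missing_items(self):
        """List the parts of the plan that have not been filled in."""
        missing = []
        if not self.support_services:
            missing.append("support services")
        if not self.people:
            missing.append("people involved")
        return missing


if __name__ == "__main__":
    # Hypothetical plan for replacing a failed disk on a Saturday night.
    plan = ActionPlan(
        cause="failed disk",
        offline_start=datetime(2024, 1, 6, 22, 0),
        expected_downtime=timedelta(hours=2),
        people=["server administrator"],
    )
    print(plan.back_online_by, plan.missing_items())
```

Calling missing_items before the maintenance window is a quick way to confirm that nothing has been left out of the plan.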

Planning is an important part of the whole troubleshooting process and may involve formal or informal written procedures. Those who do not have experience troubleshooting servers may be wondering about all the formality, but this attention to detail ensures the least amount of network or server downtime and the maximum data availability.

As far as workstation troubleshooting is concerned, rarely is a formal planning procedure required, and this makes the process easier. Planning for workstation troubleshooting typically involves arranging a convenient time with end users to implement a solution.

Step 7: Implement and Test the Solution

With the plan in place, you should be ready to implement a solution—that is, apply the patch, replace the hardware, plug in a cable, or implement some other solution. Ideally, your first solution would fix the problem, although unfortunately this is not always the case. If your first solution does not fix the problem, you will need to retrace your steps and start again.

It is important that you attempt only one solution at a time. Trying several solutions at once can make it unclear which one actually corrected the problem.

Tip

Rollback plans  A mandatory step when working on servers and some mission-critical workstations is to develop a rollback plan. The purpose of a rollback plan is to provide a method to get back to where you were before attempting the fix. Troubleshooting should not make the problem worse. Have an escape plan!
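
The two rules just described, one solution at a time and always have a way back, can be combined in a short loop. The following Python sketch is a simplified model: each fix is a name with an apply function and a rollback function, all of which are placeholders for whatever real steps the fix involves.

```python
def try_fixes(fixes, problem_solved):
    """Apply one fix at a time and undo any fix that does not help.

    fixes is a list of (name, apply_fix, rollback) tuples.
    problem_solved is a function that returns True once the fault is gone.
    Returns the name of the fix that worked, or None.
    """
    for name, apply_fix, rollback in fixes:
        apply_fix()
        if problem_solved():
            return name
        # The fix did not help, so return to the previous state
        # before trying anything else.
        rollback()
    return None


if __name__ == "__main__":
    # Toy system state standing in for a real server or workstation.
    state = {"patch": False, "cable": False}
    fixes = [
        ("apply patch", lambda: state.update(patch=True), lambda: state.update(patch=False)),
        ("plug in cable", lambda: state.update(cable=True), lambda: state.update(cable=False)),
    ]
    print(try_fixes(fixes, lambda: state["cable"]), state)
```

Because each unsuccessful fix is rolled back before the next one is tried, you know exactly which change corrected the problem, and the system carries no leftover changes from failed attempts.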

After the corrective change has been made to the server, network, or workstation, it is necessary to test the results. Never assume. This is where you find out whether you were right and the remedy you applied actually worked. Don’t forget that first impressions can be deceiving, and a fix that seems to work on first inspection may not have corrected the problem.

The testing process is not always as easy as it sounds. If you are testing a connectivity problem, it is not difficult to ascertain whether your solution was successful. However, changes made to an application or to databases are typically much more difficult to test. It might be necessary to have people who are familiar with the database or application run the tests with you in attendance. For example, suppose that you are troubleshooting an accounting program installed in a client/server configuration. Network clients access the accounting program and the associated data from the server. Recently, all accountants on the network have been receiving only outdated data when using the application. You, being a network administrator and not an accountant, may have never used the program and therefore cannot distinguish outdated data from current data. Perhaps you don’t even know how to load the data in the application. How can you possibly determine whether you have corrected the problem? Even from this simple example, we can see that the process of testing results may require the involvement of others, including end users, managers, other members of the IT team, support professionals associated with third-party applications, and so on.

Note

Avoiding false starts  When you have completed a fix, test it as thoroughly as you can before informing users of the fix. Users would generally rather wait for a real fix than have two or three false starts.

In an ideal world, you want to be able to fully test a solution to see whether it indeed corrects the problem. However, you might not know whether you were successful until all users have logged back on, the application has been used, or the database has been queried. As a network administrator, you will be expected to take the testing process as far as you realistically can, even though you might not be able to simulate certain system conditions or loads. The true test will be in a real-world application.
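For the simpler connectivity case, a repeatable check can be scripted so that the same test is run before and after the fix. The sketch below only confirms that a TCP connection can be opened; it says nothing about whether the application behind the port returns correct data. The function name `tcp_port_reachable` is illustrative.

```python
import socket


def tcp_port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Running the check from more than one client helps separate a problem on the server from a problem on a single workstation.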

Tip

Virus activity  Keep in mind when troubleshooting a network or systems on a network that the problem might be virus related. Viruses can cause a variety of problems that often disguise themselves as other problems. Part of your troubleshooting toolkit should include a bootable virus disk with the latest virus definitions. Indicators that you might have a virus include increased error messages and missing and corrupt files.

Step 8: Identify the Results and Effects of the Solution

Sometimes, you will apply a fix that corrects one problem but creates another. Such side effects are often difficult to predict, but not always. For example, you might add a new network application, but the application requires more bandwidth than your current network infrastructure can support. The result would be that overall network performance is compromised.

Everything done to one part of the network can negatively affect another area of the network. Actions such as adding clients, replacing hubs, and adding applications can all have unforeseen results. It is difficult to always know how the changes you make to a network are going to affect the network’s functioning. The safest thing to do is assume that the changes you make are going to affect the network in some way and realize that you just have to figure out how. This is where you might need to think outside the box and try to predict possible outcomes.

Understanding Potential Impacts of Solutions You Choose

It is important to remember that the effects of a potential solution may be far reaching. As a real example, a few years ago, a mid-sized network hired an IT consultant to address a problem of lost data stored on local client hard disks. His solution was to install a new client/server application that would store data and graphics on a centralized file server. With all data stored centrally, data, including backups, could be easily managed and controlled. The solution was implemented and tested on some client systems, and the application worked.

At first only a few users used the application, but within months most users were transferring large files back and forth from the file server. Network monitoring tools revealed that the network could not handle the load of the new application, and network performance was far below an acceptable level, leaving network users frustrated with wait times.

It turned out that the IT consultant failed to identify an infrastructure problem. Although the network used switches and 10/100Mbps NICs, Cat3 cable was used throughout most of the network. Cat3 UTP cable provides 10Mbps network speeds, not enough bandwidth for the number of users accessing the application.

This situation provides an example of how the troubleshooting process can easily go wrong. The first problem, decentralized storage on client systems, may have been addressed, but the effects of that solution created a much bigger problem. CompTIA’s troubleshooting process is systematic: it takes into account the current error and does not stop until all considerations are met.

Step 9: Document the Solution and the Entire Process

Although it is often neglected in the troubleshooting process, documentation is as important as any of the other troubleshooting procedures. Documenting a solution involves keeping a record of all the steps taken during the fix—not necessarily just the solution.

For the documentation to be of use to other network administrators in the future, it must include several key pieces of information. When documenting a procedure, include the following information:

• Date—When was the solution implemented? It is important to know the date because if problems occur after your changes, knowing the date of your fix makes it easier to determine whether your changes caused the problems.

• Why—Although the reason for a fix is obvious while the work is being done, a few weeks later it might be less clear why that solution was needed. Documenting why the fix was made is important because if the same problem appears on another system, you can use this information to reduce the time spent finding the solution.

• What—The successful fix should be detailed, along with information about any changes to the configuration of the system or network that were made to achieve the fix. Additional information should include version numbers for software patches or firmware, as appropriate.

• Results—Many administrators choose to include information on both successes and failures. The documentation of failures may prevent you from going down the same road twice, and the documentation of successful solutions can reduce the time it takes to get a system or network up and running.

• Who—It might be that information is left out of the documentation, or someone simply wants to ask a few questions about a solution. In both cases, if the name of the person who made a fix is in the documentation, the person can easily be tracked down. This is more of a concern in environments where there are a number of IT staff, or if system repairs are performed by contractors instead of company employees.
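The five items above map naturally onto a structured log entry. The following sketch shows one possible shape for such a record; the class name `FixRecord` and its field names are assumptions made for illustration, not a format prescribed by CompTIA.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class FixRecord:
    """One entry in a troubleshooting log (all names here are illustrative)."""
    when: date      # Date: when the solution was implemented
    why: str        # Why: the problem that made the fix necessary
    what: str       # What: the fix, configuration changes, patch or firmware versions
    results: str    # Results: outcome of the fix
    who: str        # Who: the person to contact with questions
    failed_attempts: list = field(default_factory=list)  # failures worth remembering

    def summary(self):
        """Return a one-line summary suitable for a log book index."""
        return f"{self.when.isoformat()} | {self.who} | {self.what}"
```

Keeping failed attempts alongside the successful fix reflects the point made under Results: a record of what did not work can save as much time as a record of what did.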

Tip

Log books  Many organizations require that a log book be kept in the server room. This log book should maintain a record of everything that has been done on the network. In addition, many organizations require that administrators keep a log book of all repairs and upgrades made to networks and workstations.

Challenge

You have been away on a sunny vacation for three weeks, and when you return, there are several error messages on your company’s server. What do you do?

Part of the role of a network administrator is to review the network documentation. To troubleshoot this scenario, look for any documented changes made to the system in your absence. Specifically, look for network configuration changes and added software applications or operating system patches. It is likely that one of these modifications will be at the root of the problem.

Troubleshooting the Network

You will no doubt find yourself troubleshooting wiring and infrastructure problems much less frequently than you’ll troubleshoot client connectivity problems—and thankfully so. Wiring- and infrastructure-related problems can be difficult to trace, and sometimes a costly solution is needed to remedy the situation. When troubleshooting these problems, a methodical approach is likely to pay off.

Wiring problems are related to the cable used in a network. For the purposes of the Network+ exam, infrastructure problems are classified as those related to network devices such as hubs, switches, and routers.

Troubleshooting Wiring

Troubleshooting wiring involves knowing what wiring your network uses and where it is being used. As mentioned in Chapter 2, “Media and Connectors,” the cable used has certain limitations in terms of both speed and distance. It might be that the network problems are the result of trying to use a cable in an environment or a way for which it was not designed. For example, you might find that a network is connecting two workstations that are 130 meters apart with Category 5 UTP cabling. Category 5 UTP is specified for distances up to 100 meters, so exceeding the maximum cable length could be a potential cause of the problem.

Tip

Cable distances  Look at cable distances carefully. When you are running cables along walls, across ceilings, and along baseboards, the distances can add up quickly. For this reason, carefully consider the placement of the wiring closet and ensure that you are able to reach all extents of your network while staying within the specified maximum cable distances.
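Because a run is usually made up of several stretches, it helps to add them up before comparing against the limit. This sketch assumes the 100-meter figure for Category 5 UTP given above; the table `MAX_LENGTH_M` and the function `check_run` are hypothetical names.

```python
# Maximum run lengths in meters. The 100 m value for Category 5 UTP comes from
# the text; any other entry added to this table is the reader's own assumption.
MAX_LENGTH_M = {"cat5_utp": 100}


def check_run(cable_type, segments_m):
    """Sum the stretches of one cable run and report whether it exceeds the maximum.

    Returns a (total_length, too_long) tuple.
    """
    total = sum(segments_m)
    return total, total > MAX_LENGTH_M[cable_type]
```

For example, stretches of 40, 55, and 35 meters along a wall, ceiling, and baseboard total 130 meters, which matches the over-length scenario described earlier.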

Determining the type of cable used by a network is often as easy as reading the cable. The cable should be stamped with its type—whether it is, for example, UTP Category 5, RG-58, or something else. As you work with the various cable types used to create networks, you’ll get to the point where you can easily identify them. However, be careful when identifying cable types because some cable types are almost indistinguishable. After you have determined the cable being used, you can compare the characteristics and limitations of that cable against how it is being used on the network.

Tip

Cable types  The type of cable used in a network is an important fact and one that should be included in the network documentation.

Where the Cable Is Used

Imagine that you have been called in to track down a problem with a network. After some time, you discover that clients are connected to the network via standard UTP cable run down an elevator shaft. Recall from Chapter 2 that UTP has poor resistance to electromagnetic interference (EMI), and therefore UTP and the electrical equipment associated with elevators react to each other like oil and water. The same can be said of cables that are run close to fluorescent light fittings. Such problems might seem far-fetched, but you would be surprised at how many environments you will work in that have random or erratic problems that users have lived with for a long time without anything being done about them.

Note

Risers  In many buildings, risers are used for running cables between floors. A riser is a column that runs from the bottom of the building to the top. Risers are used for running all kinds of cables, including electrical and network cables.

Part of troubleshooting wiring problems is to identify where the cable is run to isolate whether the problem is a result of cross talk or EMI. Be aware of problems associated with interference and the distance limitations of the cable being used.

Tip

Test cable  Never assume that the cable you are using is good until you test it and confirm that it is good. Sometimes cables break, and bad media can cause network problems.

If you find a problem with a network’s cable, you can do various things to correct it. For cables that exceed the maximum distance, you can use a repeater to regenerate the signal, try to reroute the cable over a shorter route, or even replace the cable with a type that has greater resistance to attenuation. The method you choose often depends on the network’s design and your budget.

For cable affected by EMI or other interference, consider replacing the cable with one that is more resistant to such interference or rerouting the cable away from the source of the interference. If you do reroute cable, pay attention to the maximum distance, and make sure that as you’re curing one problem you don’t create another.

Wiring Issues

Depending on where the cable is used and the type of cable, you may encounter some specific cable-related problems. This section describes some problems you may encounter and their solutions.

Cross Talk

Whether it is coaxial cable or UTP, copper-based cabling is susceptible to cross talk. Cross talk happens when the signal from one cable gets mixed up with the signal in another cable. This can happen when cables are run too closely together. Some cables use shielding to help reduce the impact of cross talk. If shielded cable is not used, cables should not be run directly near each other.

Near-End Cross Talk (NEXT)

NEXT refers to interference between adjacent wire pairs within the twisted pair cable at the near-end of the link (the end closest to the origin of the data signal). NEXT occurs when an outgoing data transmission leaks over to an incoming transmission. In effect, the incoming transmission overhears the signal sent by a transmitting station at the near end of the link. The result is that a portion of the outgoing signal is coupled back into the received signal.

Far-End Cross Talk (FEXT)

FEXT occurs when a receiving station overhears a data signal being sent by a transmitting station at the other end of a transmission line. FEXT identifies the interference of a signal through a wire pair to an adjacent pair at the farthest end from the interfering source (the end where the signal is received).

EMI

Electromagnetic interference (EMI) can reduce signal strength or corrupt it altogether. EMI occurs when cables are run too close to everyday office fixtures such as computer CRT monitors, fluorescent lighting fixtures, elevators, microwaves, and anything else that creates an electromagnetic field. Again, the solution is to carefully run cables away from such devices. If they have to be run through EMI areas, shielded cabling or fiber cabling needs to be used.

Attenuation

All media have recommended maximum lengths for cable runs. This is because data signals weaken as they travel farther from the point of origin. If the signal travels far enough, it can weaken so much that it becomes unusable. The weakening of data signals as they traverse the media is referred to as attenuation. All copper-based cable is particularly susceptible to attenuation. When cable has to be run farther than the recommended length, signal regenerators can be used to boost the signal as it travels. If you are working on a network with intermittent problems and notice that cable runs are too long, attenuation may be the problem. To see cable lengths, refer to Chapter 6, “Ethernet Networking Standards.”

Open Impedance Mismatch (Echo)

Any network segment may consist of a single continuous section of cable or be constructed from multiple cable sections that are attached through switches and other hardware. If multiple cable sections are used, it can result in impedance mismatches that are caused by slight differences in the impedance of each cable section. Impedance refers to the total opposition a circuit or device offers to the flow of a signal, measured in ohms. All media, such as twisted-pair cable, has characteristic impedance. Impedance characteristics for twisted-pair cable include 100, 120, and 150 ohms. UTP typically has an impedance of 100 ohms, and STP has an impedance of 150 ohms. Mixing these two wires in the same cable link can result in an impedance mismatch, which can cause the link to fail. To help prevent impedance mismatch, use cable rated with the same impedance rating.
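To get a feel for why a mismatch produces an echo, the standard transmission-line reflection coefficient can be computed for the two impedances. This formula comes from general transmission-line theory rather than from the Network+ objectives, so treat the sketch as background; the function name is illustrative.

```python
def reflection_coefficient(z_from_ohms, z_to_ohms):
    """Fraction of signal voltage reflected at a junction between two impedances.

    Uses the standard relation (Z2 - Z1) / (Z2 + Z1); 0 means no reflection.
    """
    return (z_to_ohms - z_from_ohms) / (z_to_ohms + z_from_ohms)
```

Joining 100-ohm UTP to 150-ohm STP gives (150 - 100) / (150 + 100) = 0.2, meaning roughly a fifth of the signal voltage is reflected back toward the sender, whereas two sections with the same rating reflect nothing.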

Shorts

Electrical shorts can occur in any type of cable that has electrical current flowing through it. Shorts occur when the electrical current travels along a different path than what is intended. This can often happen if a network cable is not made correctly and wires are touching each other, improperly grounded, or touching metal. This is another reason to be careful when attaching your own RJ-45 connectors to twisted-pair cable. Sometimes, network cables can become damaged, bent, or mishandled, and shorts can occur. Several networking tools are used to test for shorts, as discussed in Chapter 13, “Network Management Tools and Documentation Procedures.” Copper-based media that carries electrical current is susceptible to shorts; wireless and fiber optic cable are not.

Managing Collisions

Collisions occur on a network when two or more networked devices transmit data at the same time. The result is that the data collides, becomes corrupted, and needs to be re-sent. If these collisions keep occurring, the network slows down and may eventually impact network users. Media access control (MAC) techniques can help prevent collisions from occurring. Two commonly used MAC methods are Carrier Sense Multiple Access/Collision Detection, or CSMA/CD, used with wired ethernet networks, and Carrier Sense Multiple Access/Collision Avoidance, or CSMA/CA, used with 802.11 wireless networks.

The more devices that are connected to an ethernet network, the more likely it is that collisions will occur on the network. In other words, the more devices you add to an ethernet network, the slower the network will become. This decrease in performance has driven improvements in how ethernet networks are structured. Improvements include the replacement of older hubs with new, high-performance ethernet switches and the reduction of broadcast-intensive applications.

Collisions can mostly be avoided by using switches instead of hubs. Switches allow for the segmentation of ethernet networks into smaller collision domains. Whereas the use of a hub creates a large single collision domain, each port on a switch represents a separate collision domain. The switch can provide full-duplex communication to the node or nodes connected to that port. In a switched network, systems do not need to use collision detection and can just transmit without hesitation. How a switch functions is covered in Chapter 3, “Networking Components and Devices.”
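The difference between the two devices can be stated as a simple counting rule. The sketch below encodes only what the paragraph above says, one domain for a hub and one domain per port for a switch; the function name and the `"hub"` and `"switch"` labels are illustrative.

```python
def collision_domains(device_type, ports_in_use):
    """Count the collision domains created by a single hub or switch."""
    if device_type == "hub":
        # Every port on a hub shares the same collision domain.
        return 1 if ports_in_use > 0 else 0
    if device_type == "switch":
        # Each switch port is its own collision domain.
        return ports_in_use
    raise ValueError(f"unknown device type: {device_type}")
```

Eight clients on a hub contend with one another in a single domain, while the same eight clients on a switch are spread across eight domains.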

Exam Alert

Switched network  For the Network+ exam, remember that a switch reduces the need for a contention-based network environment because the switch ports break down the network into smaller collision domains. The smaller the collision domain, the fewer collisions that occur.

Troubleshooting Infrastructure Hardware

If you are looking for a challenge, troubleshooting hardware infrastructure problems is for you. It is often not an easy task and usually involves many processes, including baselining and performance monitoring. Both baselines and monitoring are covered in detail in Chapter 13. One of the keys to identifying the failure of a hardware network device is to know what devices are used on a particular network and what each device is designed to do. Some of the common hardware components used in a network infrastructure are shown in Table 11.1.

Table 11.1 Common Network Hardware Components, Their Function and Troubleshooting Strategies

For more information on network hardware devices and their function, refer to Chapter 3.

Challenge

Users on your network have been complaining that network performance has been slow, and many of their everyday tasks are taking longer than they used to. What should you do?

Conduct a network performance test and compare it with information from your baseline history. Interpret the information to see whether there is actually a problem on the network. If you determine that a problem exists, you need to find out whether there have been any changes to the network that might account for the slow network performance, such as changes to the hardware or software configurations.
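Comparing a fresh measurement with the baseline can be reduced to a small calculation. The sketch below assumes metrics where a higher number is worse (for example, latency or utilization) and an arbitrary 20 percent tolerance; the metric names and the threshold are assumptions, not values from the chapter.

```python
def compare_to_baseline(baseline, current, tolerance=0.20):
    """Return the metrics that are worse than baseline by more than tolerance.

    Both arguments map metric names to numbers where higher means worse.
    The result maps each flagged metric to its fractional increase.
    """
    flagged = {}
    for name, base_value in baseline.items():
        if name not in current or base_value == 0:
            continue  # nothing to compare against
        change = (current[name] - base_value) / base_value
        if change > tolerance:
            flagged[name] = round(change, 2)
    return flagged
```

An empty result suggests that the network is performing as it always has, and the complaint may have another cause; a flagged metric points to where to look for recent hardware or software changes.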

Configuring and Troubleshooting Client Connectivity

Connecting clients to an existing network is a common task for network administrators. Connecting a client system requires establishing the physical connection, defining network protocols, assigning permissions, and accessing server services and resources. This section explores the requirements to connect a client PC to a network.

Verifying Client TCP/IP Configurations

Configuring a client for TCP/IP can be relatively complex, or it can be simple. Any complexity involved is related to the possible need to configure TCP/IP manually. The simplicity is related to the fact that TCP/IP configuration can occur automatically via DHCP or through APIPA. This section looks at some of the basic information required to make a system function on a network, using TCP/IP. At the least, a system needs an IP address and a subnet mask. The default gateway, DNS server, and WINS server are all optional, but network functionality is limited without them. The following list briefly explains the IP-related settings used to connect to a TCP/IP network:

• IP address—Each system must be assigned a unique IP address so that it can communicate on the network. Clients on a LAN will have a private IP address and matching subnet mask. Table 11.2 shows the private IP ranges. If a system has the wrong IP address or subnet mask, that client system will not be able to communicate on the network. If the client system has an IP address in the 169.254.0.0 range, the system has not reached a DHCP server and is not getting on the network. Refer to Chapter 5, “TCP/IP Addressing and Routing,” for information on APIPA and automatic IPv4 assignments.

Table 11.2 Private Address Ranges

• Subnet mask—The subnet mask allows the system to determine what portion of the IP address represents the network address and what portion represents the node address. Refer to Table 11.2 to see the right subnet mask associated with each private IP range. To be part of the network, each client system needs the correct subnet mask, and that mask must match the one used by the rest of the network. Figure 11.1 shows a correct IP configuration and an incorrect IP configuration on a Windows Vista system.

Figure 11.1 A correct and an incorrect IP client configuration.

• Default gateway—The default gateway allows internal systems to communicate with systems on a remote network. In home use, the gateway would likely be the DSL or cable modem that acts as a router. In a business environment, the gateway is the device that routes traffic from the workstation to the outside network. This network device will have an IP address assigned to it, and the client configuration must use this address as the default gateway. Otherwise, traffic from the system cannot be routed outside the local network.

• DNS server addresses—DNS servers allow dynamic hostname resolution to be performed. It is common practice to have two DNS server addresses defined so that if one server becomes unavailable, the other can be used. The client system must be configured with the IP address of the local DNS server. If a client system has the wrong DNS address listed, hostname resolution will not be possible. Figure 11.2 shows the IP configuration for connection to a private network.

Figure 11.2 The Internet Protocol (TCP/IPv4) Properties dialog box on a Windows Vista system.

Note

TCP/IP connection requirements  At the very minimum, an IP address and a subnet mask are required to connect to a TCP/IP network. With just this minimum configuration, connectivity is limited to the local segment, and DNS resolution is not possible.

When manually configuring a system to use TCP/IP, all information needs to be entered into the respective dialog boxes carefully. Entering a duplicate IP address may prevent the client system from being able to log on to the network, the wrong gateway will prevent the system from accessing remote networks, and so on. Several utilities can be used to view the IP settings of a client system, including the ipconfig command for Windows systems and the ifconfig command for Linux and UNIX systems.

When troubleshooting a system, ensure that the IP address, default gateway, subnet mask, and DNS settings are correct. When this information is assigned by DHCP, there should not be any errors; however, in networks where DHCP is not used and settings are entered manually, these settings should be verified.
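Several of these checks can be automated with Python's standard `ipaddress` module. The following sketch flags an APIPA address, a non-private address, a missing gateway, and a gateway that is not on the client's subnet. The function name and the wording of the messages are illustrative, and the checks are a starting point rather than a complete validation.

```python
import ipaddress


def check_client_config(ip, mask, gateway=None):
    """Return a list of problems found in a client's IPv4 settings."""
    problems = []
    iface = ipaddress.ip_interface(f"{ip}/{mask}")
    if iface.ip.is_link_local:
        # 169.254.0.0/16 is the range used by APIPA.
        problems.append("APIPA address (169.254.x.x): no DHCP lease obtained")
    elif not iface.ip.is_private:
        problems.append("address is not in a private range")
    if gateway is None:
        problems.append("no default gateway: limited to the local segment")
    elif ipaddress.ip_address(gateway) not in iface.network:
        problems.append("default gateway is not on the client's subnet")
    return problems
```

A client at 192.168.1.10 with mask 255.255.255.0 and gateway 192.168.1.1 returns an empty list, while the same client pointed at gateway 192.168.2.1 is reported as having a gateway outside its subnet.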

Setting Port Speeds and Duplex

When configuring a client for the network, there are two more settings to be aware of: port speed and duplex. In Windows, these two settings are adjusted in the Network Properties area of the operating system. Figure 11.3 shows the port speed and duplex settings of a Windows Vista system.

Figure 11.3 The Advanced tab on the properties of a NIC found in Windows Device Manager.

Figure 11.3 shows several options for the port speed and duplex setting. The setting can be left on autoconfiguration to detect the values used by the network, or it can be set to one of the other options to match the network configuration, for example, 100Mbps and half duplex. If you are working with a client system that is unable to log on to a network, it may be necessary to ensure that the duplex setting and port speed are correctly set for the network. You can find more information on duplex settings in Chapter 2.

Troubleshooting Incorrect VLANs

As mentioned in Chapter 1, VLANs provide a method of segmenting and organizing the network. Computer systems can be located anywhere on the network but communicate as if they are on the same segment. As an example, VLANs can be segmented according to an organization’s departments, such as sales, finance, and secretaries. VLANs can also be segmented according to usage, according to security permissions, and more.

The ability to segment the network provides clear advantages, such as increased security, because devices can communicate only with other systems in the same VLAN and users can see only the systems in their VLAN segment. VLANs can also help control broadcast traffic and make moving end systems around the network easier.

Problems can arise when users are moved or otherwise connected to the wrong VLAN. Administrators have to ensure that the user system is plugged into the correct VLAN port. For example, suppose a network is using port-based VLANs, assigning ports 1 through 8 to marketing, ports 9 through 18 to sales, and so on. Plugging a sales client into port 6 would make that sales client part of the marketing network. It sounds simple, but if documentation is not up to date and you are walking into a new network, this can be tricky to identify.
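The port-based example can be written down as a lookup table, which is also a compact way to document the arrangement. The sketch below uses the port ranges from the example; the names `PORT_VLANS`, `vlan_for_port`, and `misplaced` are hypothetical.

```python
# Port ranges taken from the example in the text. Python ranges exclude the
# end value, so range(1, 9) covers ports 1 through 8.
PORT_VLANS = {range(1, 9): "marketing", range(9, 19): "sales"}


def vlan_for_port(port):
    """Return the VLAN assigned to a switch port, or None if the port is unassigned."""
    for ports, vlan in PORT_VLANS.items():
        if port in ports:
            return vlan
    return None


def misplaced(port, expected_vlan):
    """Return True if a client that belongs in expected_vlan is on the wrong port."""
    return vlan_for_port(port) != expected_vlan
```

With this table, a sales client plugged into port 6 is reported as misplaced because port 6 belongs to marketing.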

One of the keys to preventing VLAN assignment errors is to clearly document the VLAN arrangement. Should systems be moved, it is important to know how to reconnect them and forward them to the correct VLAN port.

Another consideration to keep in mind is that membership in a VLAN can be assigned both statically and dynamically. In static VLAN assignment, the switch ports are assigned to a specific VLAN, and new systems added will be assigned to the VLAN associated with that particular port. For example, plug a new system into port 8 and the user becomes part of the administrator’s network. Make sure you have the right port assigned to users.

Dynamic VLAN assignment requires specific software to control VLAN distribution. Using a VLAN server, administrators can dynamically assign VLAN membership based on such criteria as MAC address or a username and password combination. As a system tries to access the network, it queries the VLAN server database to ask for VLAN membership information. The server responds and logs the system onto the appropriate VLAN network. When configured correctly, dynamic assignment reduces human error associated with static VLAN assignment.
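The MAC-based lookup at the heart of dynamic assignment can be illustrated with a toy, in-memory stand-in for the VLAN server database. This is only a sketch of the idea; it does not model any vendor's product or protocol, and the class name `VlanServer`, the `default_vlan` fallback, and the sample addresses are all assumptions.

```python
def normalize_mac(mac):
    """Reduce a MAC address to lowercase hex digits so different notations compare equal."""
    return mac.lower().replace(":", "").replace("-", "").replace(".", "")


class VlanServer:
    """Toy lookup table that maps MAC addresses to VLAN names."""

    def __init__(self, assignments, default_vlan=None):
        self._table = {normalize_mac(m): v for m, v in assignments.items()}
        self._default = default_vlan  # VLAN for unknown devices (an assumption)

    def lookup(self, mac):
        """Return the VLAN for a MAC address, or the default if it is not listed."""
        return self._table.get(normalize_mac(mac), self._default)
```

Because membership follows the device rather than the port, a system keeps its VLAN no matter where it is plugged in, which is why this approach removes the wrong-port errors described above.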

Identifying Issues That May Need Escalation

Earlier in this chapter we discussed the procedures that must be followed when issue escalation is required. Although any number of issues may need escalation, the CompTIA Network+ objectives list specific scenarios where escalation might be necessary. Not every one of these issues will require escalation; in fact, an administrator with an Internet connection and a little determination can track these down. Nevertheless, we will quickly identify and discuss each of the issues listed in the CompTIA objectives:

• Switching loop—On an ethernet network, only a single active path can exist between devices on a network. When multiple active paths are available, switching loops can occur. Switching loops are simply the result of having more than one path between two switches in a network. The spanning-tree protocol (STP) is designed to prevent these loops from occurring. If the packet in the loop is a broadcast message, the loop can create a full broadcast storm. Broadcast storms are discussed later in this section. Switching loops occur at the data link layer (Layer 2) of the OSI model.

• Routing loop—As the name suggests, a routing loop occurs when data packets continue to be routed in an endless circle. In proper operation, a router will forward packets according to the information presented in the routing table. If the routing table is correct, the packet takes the optimal path from the source to the destination. It is not common, but if the information in the routing table is incorrect, whether through manual misconfiguration or faulty route detection by a router, routing loops can form. A routing loop is a path through the internetwork for a network ID that loops back onto itself. Routing loops are detectable because they can quickly bog down a network, and some packets are not received by the destination system.

• Route problems—Route problems typically occur when routing tables contain information that does not reflect the correct topology of the internetwork. Out-of-date or incorrect routing tables mean that packets cannot be correctly routed through the network, and route problems occur. Verify the routing table to ensure that it is correct. Sometimes static routes are entered and cause problems when the network topology is changed.

• Proxy ARP—ARP is used to resolve IP addresses to MAC addresses. This is important because on a network, devices find each other using the IP address, but communication between devices requires the MAC address. In a proxy ARP configuration, one system or network device answers ARP requests for another system. It is a proxy ARP because one network system is proxying for another’s ARP communications.

• Broadcast storms—A broadcast address is an IP address that you can use to target all systems on a subnet or network instead of single hosts. In other words, a broadcast message goes to everyone on the network. A broadcast storm occurs when a network is overwhelmed with constant broadcast or multicast traffic. Broadcast storms can eventually lead to a complete loss of network connectivity as the network is bogged down with the broadcast storm. As with other network problems, you may suspect a broadcast storm when network response times are poor and people are complaining of a slow network. These broadcast storms can be caused by faulty hardware, such as a NIC that continually sends out data, switching loops, or even faulty applications running on the network. Baselines work well for identifying broadcast storms.
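Of the issues above, a routing loop is the easiest to demonstrate in code. The sketch below walks a simplified next-hop table for one destination and reports the routers that form a loop. The table format (router name mapped to the next router) and the function name are assumptions for illustration; real routing tables are keyed by network ID and hold far more information.

```python
def find_routing_loop(next_hop, start, destination):
    """Follow next hops from start toward destination.

    next_hop maps each router to the router it forwards to for this destination.
    Returns the list of routers in the loop, or None if the packet arrives or
    is dropped because a router has no route.
    """
    path = []
    seen = set()
    router = start
    while router != destination:
        if router in seen:
            # Revisited a router: everything from its first visit onward loops.
            return path[path.index(router):]
        seen.add(router)
        path.append(router)
        router = next_hop.get(router)
        if router is None:
            return None  # no route: the packet is dropped, not looping
    return None
```

In a table where router A forwards to B, B forwards to C, and C forwards back to B, a packet entering at A never reaches its destination, and the function reports B and C as the loop.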

Troubleshooting Wireless Issues

Because wireless signals travel through the atmosphere, they are subject to all sorts of elements that can block or weaken them. These include storms, the number of walls between the sending and receiving devices, ceilings, mirrors, and so on. Just how weakened the signal becomes depends on the building material used, RF interference, the power of the wireless signal, and how far the signal must travel. Every element that a wireless signal must pass through or around weakens the signal, reducing the distance it can travel.

Environmental factors are not the only things to consider when working with wireless networks. This section reviews two key areas to focus on when troubleshooting wireless networks: wireless signals and wireless configurations.

Note

Signal strength  Wireless signals degrade depending on the construction material used. Signals passing through concrete and steel are particularly weakened.

Troubleshooting Wireless Signals

If you are troubleshooting a wireless connection that has a particularly weak signal, or one that won’t reach its destination, you can troubleshoot the signal by checking the following:

• Antenna type—As mentioned in Chapter 7, “Wireless Networking,” a wireless antenna can be either omnidirectional or directional. Omnidirectional antennas are great in an environment where there is a clear line of sight between the senders and receivers. With omnidirectional antennas, the wireless signal is dispersed in a 360-degree pattern to all points.

If environmental obstacles exist, a directional wireless antenna may be a better choice. The directional antenna concentrates the signal power in a specific direction and allows you to use less power for a greater distance than an omnidirectional antenna. Omnidirectional antennas are well suited inside office buildings to accommodate numerous users.

• Antenna placement—Many home-use APs have a built-in antenna that is adequate to reach all areas of a home. Network APs may use an external wireless antenna, and placing it correctly is an important consideration. In general, the AP and the antenna should be located as near to each other as possible. The farther the signal has to travel over cabling from the antenna to the AP, the more signal degradation (RF attenuation) there is. Directional antennas connecting locations in a point-to-point configuration should be placed in a clear line of sight between each other. Outdoor antennas are often placed high to prevent the signal from being blocked by physical objects. Indoor antennas should be kept away from large metal objects, such as filing cabinets, and from devices that can cause RF interference.

• Boost signal—If all else fails, it is possible to purchase devices, such as wireless repeaters, that can amplify the wireless signal. The device takes the signal and amplifies it so that it has greater strength and can travel greater distances. Amplifiers increase the distance at which a client system can be placed from the AP.

• Bleed—Because wireless signals travel through the atmosphere, they are not bound by the same physical limitations as wired media. The dispersed nature of wireless communication can lead to problems. For example, although everyone in an office may be within range of a wireless signal, the signal is not restricted to that office, and someone outside may also be able to use the signal. Wireless signals that travel beyond where administrators want them are known as bleed. Some APs and antennas allow administrators to restrict the range over which a wireless signal is transmitted by reducing the strength of the wireless signal output. Bleed makes wireless security measures essential. To prevent unauthorized people from using a signal, encryption and other methods are used. So, a user may be able to see the wireless signal but not be able to use the wireless network without the proper security credentials.

• Distance—Wireless signals degrade as they travel from their point of origin. While troubleshooting wireless signals, you may need to relocate the AP closer to client systems or add wireless routers to increase the wireless transmission range. Administrators often use wireless signal testers to ensure transmission ranges will be adequate before implementing the wireless network.

Exam Alert

Relocation  When troubleshooting wireless signals, it is often necessary to relocate the AP to a more favorable location. This is important to know both for the Network+ exam and for real-world application.

To successfully manage the wireless signals, you need to know the wireless standard that you are using. The standards used today specify range distances, RF ranges, and speeds. It may be that the wireless standard is not capable of doing what you need. More information on all wireless standards can be found in Chapter 7.
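To get a feel for how distance and obstacles weaken a signal, you can estimate the loss with the free-space path loss formula. The following Python sketch is an idealized model only: real indoor environments usually lose more signal than free space predicts, and the transmit power and wall-loss figures used here are hypothetical values chosen for illustration.

```python
import math

def free_space_path_loss_db(distance_m, freq_mhz):
    """Free-space path loss in dB for a distance in meters and a frequency in MHz."""
    if distance_m <= 0 or freq_mhz <= 0:
        raise ValueError("distance and frequency must be positive")
    return 20 * math.log10(distance_m) + 20 * math.log10(freq_mhz) - 27.55

def received_power_dbm(tx_power_dbm, distance_m, freq_mhz, obstacle_loss_db=0.0):
    """Estimate received power, subtracting path loss and any extra obstacle loss."""
    return tx_power_dbm - free_space_path_loss_db(distance_m, freq_mhz) - obstacle_loss_db

# Assumed values: a 20 dBm transmitter on 2437 MHz (channel 6 in the 2.4 GHz band)
TX_POWER_DBM = 20
FREQ_MHZ = 2437

for distance in (5, 10, 20, 40):
    open_air = received_power_dbm(TX_POWER_DBM, distance, FREQ_MHZ)
    # 10 dB is a hypothetical allowance for one wall between the AP and the client
    one_wall = received_power_dbm(TX_POWER_DBM, distance, FREQ_MHZ, obstacle_loss_db=10)
    print(f"{distance:>3} m: open air {open_air:6.1f} dBm, one wall {one_wall:6.1f} dBm")
```

In this model, every doubling of the distance costs about 6 dB of signal, which is one reason relocating the AP closer to the client systems is often the most effective fix.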

Troubleshooting Wireless Configurations

You can use a number of settings and configurations when working with wireless clients and APs. Some of the more common areas to check when troubleshooting wireless configurations include the following:

• Incorrect encryption—The wireless network security features are set on the wireless router or AP. This includes the wireless encryption methods that will be used—for instance, WEP or WPA. When encryption is enabled on the AP, the client must be configured to use the encryption and know the encryption key to be authenticated to the AP. When troubleshooting a connectivity problem between an AP and a wireless client, a common problem is that the encryption security settings do not match.

• SSID/ESSID mismatch—Whether your wireless network is using infrastructure mode or ad-hoc mode, an SSID/ESSID is required. The SSID/ESSID is a configurable client identification that allows clients to communicate to a particular base station. Only client systems configured with the same SSID as the AP can communicate with it. SSIDs provide a simple password arrangement between base stations and clients. The ESSID/SSID may be broadcast from the AP and visible to all receiving devices in the area, or it may be configured not to broadcast. Not broadcasting the SSID name adds another level of security because people are unable to see the SSID name when browsing for wireless networks in the area. The ESSID/SSID would have to be obtained from the network administrator.

• Overlapping channels—When troubleshooting a wireless network, be aware that overlapping channels can disrupt the wireless communications. For example, in many environments, APs are inadvertently placed close together—perhaps two access points in separate offices located next door to each other or between floors. Signal disruption will result if a channel overlap occurs between the access points. You would typically change the channel of a wireless device only if there is a channel overlap with another device. If a channel must be changed, it must be changed to another nonoverlapping channel.

• Standard mismatch—The 802.11 standards commonly used today include 802.11a, b, and g, with n on the horizon. When configuring client systems, be sure they are configured to use the same wireless standard as the AP or a compatible one. 802.11a is not compatible with b or g; however, b and g are compatible with each other. 802.11n is designed to be backward compatible with a, b, and g, depending on the frequency band in use.
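The channel overlap described above can be checked with a little arithmetic. The following Python sketch assumes the 2.4 GHz band used by 802.11b/g, where channels 1 through 13 are spaced 5 MHz apart and each transmission is about 22 MHz wide. The function names are made up for this example, and the 5 GHz channels used by 802.11a follow a different plan that is not modeled here.

```python
CHANNEL_WIDTH_MHZ = 22  # approximate width of an 802.11b/g transmission

def center_freq_mhz(channel):
    """Center frequency of a 2.4 GHz channel (channels 1 through 13)."""
    if not 1 <= channel <= 13:
        raise ValueError("expected a 2.4 GHz channel from 1 to 13")
    return 2407 + 5 * channel

def channels_overlap(a, b):
    """Two channels overlap if their centers are closer than one channel width."""
    return abs(center_freq_mhz(a) - center_freq_mhz(b)) < CHANNEL_WIDTH_MHZ

def suggest_channel(neighbor_channels, candidates=(1, 6, 11)):
    """Return the first candidate that overlaps no neighboring AP, or None."""
    for channel in candidates:
        if not any(channels_overlap(channel, n) for n in neighbor_channels):
            return channel
    return None

print("Channels 1 and 3 overlap:", channels_overlap(1, 3))
print("Channels 1 and 6 overlap:", channels_overlap(1, 6))
print("Suggested channel near APs on 1 and 4:", suggest_channel([1, 4]))
```

Channels 1, 6, and 11 are the usual choices because they are far enough apart that none of them overlaps the others.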

Note

More wireless troubleshooting  When preparing for the Network+ exam, be sure to cross-reference the wireless information in this chapter with Chapter 7, “Wireless Networking.”

Summary

Troubleshooting networks is an activity with which network administrators become very familiar. Successful troubleshooting does not happen by accident; rather, the troubleshooting process follows some defined procedures. These procedures include the following:

Step 1. Information gathering—identify symptoms and problems

Step 2. Identify the affected areas of the network

Step 3. Determine if anything has changed

Step 4. Establish the most probable cause

Step 5. Determine if escalation is necessary

Step 6. Create an action plan and solution identifying potential effects

Step 7. Implement and test the solution

Step 8. Identify the results and effects of the solution

Step 9. Document the solution and the entire process

At times, you might find yourself troubleshooting wiring and infrastructure problems. Although they are less common than other troubleshooting areas, wiring and network devices should be considered possible causes of a problem. Tracking down infrastructure problems often requires using documentation and network maps or taking baselines to compare network performance.

Consider several areas when troubleshooting a wireless network. Many problems are related to poor signal strength, low transmission rates, and limited distances. When troubleshooting wireless connectivity, it is important to verify both the signal strength and the AP and wireless client configuration.

Key Terms

Cross talk

Attenuation

Collisions

Open impedance mismatch

Interference

Port speed

Port duplex mismatch

VLAN

Gateway

DNS

Subnet mask

Switching loop

Routing loop

Route problems

Proxy ARP

Broadcast storms

Encryption

Wireless channel

SSID

ESSID mismatch

802.11 a/b/g/n

Apply Your Knowledge

Exercise

11.1 Using the Microsoft support website to track error codes

As a network administrator, you have been given the task of installing and configuring a new Windows Vista computer system. However, each time you try to install the new operating system, the process is halted with the following error message:

Stop: 0x000000A5

To install the Windows Vista system, you need to find the solution to the problem.

Estimated time: 10 minutes

1. Go to http://support.microsoft.com.

2. From the Search Site bar located on the top-right corner, type Stop: 0x000000A5.

3. Select the Search Site button to continue.

4. One result will be displayed; click the result link.

5. You will be taken to a Microsoft support page titled “Troubleshooting Stop Error Messages That May Occur When You Try to Install Windows Vista.”

6. Scroll down the page until you see Stop: 0x000000A5. In this case, you will notice that the reason the installation failed is that the computer BIOS is incompatible with the Advanced Configuration and Power Interface (ACPI) standard that is supported in Windows Vista.

Exam Questions

1. Considering the following figure, which of the following statements is true?

[Figure]

A. The system cannot access the local network.

B. The system cannot access remote networks.

C. The system cannot have hostname resolution.

D. The system has the wrong subnet mask.

2. Using the following configuration screen, which of the following is true?

[Figure: configuration screen]

A. The system cannot access the local network.

B. The system cannot access remote networks.

C. The system cannot have hostname resolution.

D. The system has the wrong subnet mask.

3. Which of the following best describes the function of the default gateway?

A. Converts hostnames to IP addresses.

B. Converts IP addresses to hostnames.

C. Allows systems to communicate with systems on a remote network.

D. Allows systems to communicate with routers.

4. Which of the following bits of IP information are mandatory to join the network? (Select two answers.)

A. Subnet mask

B. IP address

C. DNS address

D. Default gateway

5. You are wiring a new network. Because of space limitations, you need to run several cables close to each other. After the setup, you find that the signals from each cable are overlapping. Which of the following terms describes what is happening?

A. Attenuation

B. Cross talk

C. Near cross talk

D. EMI

6. Which of the following should you consider when troubleshooting wiring problems? (Choose all best answers.)

A. The distance between devices

B. Interference

C. Atmospheric conditions

D. Connectors

7. You get numerous calls from users who are unable to access an application. Upon investigation, you find that the application has crashed. You restart the application, and it appears to run okay. What is the next step in the troubleshooting process?

A. Email the users and let them know that they can use the application again.

B. Test the application to ensure that it is operating correctly.

C. Document the problem and the solution.

D. Reload the application executables from the CD and restart it.

8. A user calls to inform you that she is having a problem accessing her email. What is the next step in the troubleshooting process?

A. Document the problem.

B. Make sure that the user’s email address is valid.

C. Discuss the problem with the user.

D. Visit the user’s desk to reload the email client software.

9. You have successfully fixed a problem with a server and have tested the application and let the users back on the system. What is the next step in the troubleshooting process?

A. Document the problem.

B. Restart the server.

C. Document the problem and the solution.

D. Clear the error logs of any reference to the problem.

10. You are called in to troubleshoot a problem with the NIC on a server that has been running well for some time. The server reports a resource conflict. What would be the next step in the troubleshooting process?

A. Change the NIC.

B. Consult the documentation to determine whether there have been any changes to the server configuration.

C. Download and install the latest drivers for the NIC.

D. Reload the protocol drivers for the NIC and set them to use a different set of resources.

11. Which of the following can cause switching loops?

A. Sporadic sending of broadcast messages

B. Continual sending of broadcast messages

C. An ethernet network where multiple active paths are available for data to travel.

D. An ethernet network where only a single path is available for data to travel.

12. You are troubleshooting an infrastructure problem and suspect the problem may be the network media. Which of the following must be considered when troubleshooting network media? (Choose two answers.)

A. Where the media is used

B. Media frequency output/input ratio

C. Media type

D. Media voltage

13. You have been called into a network to troubleshoot a cabling error. You have traced the problem to lengths of cable that have been run too far. Which of the following describes the weakening of data signals as they travel down a given media?

A. Near cross talk

B. EMI

C. Attenuation

D. Cross talk

14. You are troubleshooting an intermittent connectivity issue. You suspect the problem may be a form of cross talk known as NEXT. Which of the following is a symptom of NEXT?

A. Packets are not able to be decrypted.

B. Packets are not able to be encrypted.

C. Interference between wire pairs at the near end of the link.

D. Interference between wire pairs at the far end of the link.

15. You are working with several homemade network cables. Which of the following is caused by poorly made cables?

A. Near End cross talk

B. Cross talk

C. EMI

D. Attenuation

16. A client on your network has had no problem accessing the wireless network, but recently the client moved to a new office in the same building. Since the move, she has been experiencing intermittent connectivity problems. Which of the following are most likely the causes of the problem? (Select the two best answers.)

A. The SSID on the client and the AP are different.

B. The client WEP settings have to be set to auto detect.

C. The signal is being partially blocked by physical objects.

D. The client system has moved too far away from the access point.

17. You have been called in to troubleshoot a problem with a specific application on a server system. The client is unable to provide any information about the problem except that the application is not accessible. Which of the following troubleshooting steps should you perform first?

A. Consult the documentation for the server.

B. Consult the application error log on the server.

C. Reboot the server.

D. Reload the application from the original CD.

18. Which of the following is not a concern when troubleshooting connectivity between an AP and a wireless client?

A. Ensuring both use the same encryption

B. Ensuring the AP and client are configured not to combine 802.11b or g

C. Ensuring that the same SSID is used

D. Ensuring that the client system is within range of the AP

19. You have been called in to troubleshoot an intermittent network problem. You suspect that cabling is a problem. You review the documentation and find out that a segment of Category 5e cable is run through the ceiling. Which of the following would you guess would be the problem?

A. Cross talk

B. Near cross talk

C. Attenuation

D. EMI

20. A user is having problems logging on to the server. Each time she tries, she receives a “server not found” message. After asking a few questions, you deduce that the problem is isolated to this single system. Which of the following are possible explanations to the problem? (Choose the two best answers.)

A. The protocol configuration on the workstation is incorrect.

B. A hub may have failed.

C. The cable has become disconnected from the user’s workstation.

D. The server is down.

Answers to Exam Questions

1. D. Internal networks are assigned one of the private address ranges. Each of these ranges has a corresponding subnet mask. In this example, the wrong subnet mask has been entered. For more information, see the section “Configuring and Troubleshooting Client Connectivity” in this chapter.

2. B. Notice from the dialog screen that the default gateway address is incorrectly entered as the same address as the system’s IP address. Because of this, the system would likely not be able to connect to remote networks. The DNS, IP, and subnet mask settings are correct. For more information, see the section “Configuring and Troubleshooting Client Connectivity” in this chapter.

3. C. The default gateway allows the system to communicate with systems on a remote network, without the need for explicit routes to be defined. The default gateway can be assigned automatically using a DHCP server or manually inputted. For more information, see the section “Configuring and Troubleshooting Client Connectivity” in this chapter.

4. A, B. Configuring a client requires at the least the IP address and a subnet mask. The default gateway, DNS server, and WINS server are all optional, but network functionality is limited without them. For more information, see the section “Configuring and Troubleshooting Client Connectivity” in this chapter.

5. B. Cross talk can occur when the signal from one cable overlaps with the signal from another. This can sometimes happen when cables are run too close together. The remedy is to run the cables farther apart or use quality shielded cable. For more information, see the section “Wiring Issues” in this chapter.

6. A, B, D. When you’re troubleshooting a wiring problem, consider the distance between devices, interference such as cross talk and EMI, and the connection points. Answer C is not correct because bound media (that is, cables) are not affected by atmospheric conditions. For more information, see the section “Troubleshooting Wiring” in this chapter.

7. B. After you have fixed a problem, you should test it fully to ensure that the network is operating correctly before allowing users to log back on. The steps described in Answers A and C are valid, but only after the application has been tested. Answer D is not correct; you would reload the executable only as part of a systematic troubleshooting process, and because the application loads, it is unlikely that the executable has become corrupt. For more information, see the section “The Art of Troubleshooting” in this chapter.

8. C. Not enough information is provided to make any real decision about what the problem might be. In this case, the next troubleshooting step would be to talk to the user and gather more information about exactly what the problem might be. All the other answers are valid troubleshooting steps, but only after the information gathering has been completed. For more information, see the section “The Art of Troubleshooting” in this chapter.

9. C. After you have fixed a problem, tested the fix, and let users back on to the system, you should create detailed documentation that describes the problem and the solution. Answer A is incorrect because you must document both the problem and the solution. It is not necessary to restart the server, so Answer B is incorrect, and Answer D would be performed only after the documentation for the system has been created. For more information, see the section “The Art of Troubleshooting” in this chapter.

10. B. In a server that has been operating correctly, a resource conflict could indicate that a device has failed and is causing the conflict. More likely, a change has been made to the server, and that change has created a conflict. Although all the other answers represent valid troubleshooting steps, it is most likely that there has been a change to the configuration. For more information, see the section “The Art of Troubleshooting” in this chapter.

11. C. On an ethernet network, only a single active path can exist between devices on a network. When multiple active paths are available, switching loops can occur. Switching loops are the result of having more than one path between two switches in a network. The Spanning Tree Protocol (STP) is designed to prevent these loops from occurring. For more information, see the section “Identifying Issues That May Need Escalation” in this chapter.

12. A, C. When troubleshooting media, you will need to know the type of media being used. This enables you to know the characteristics of the media and if it is being used correctly on the network. You will also want to know where the media is being used. If it is being used in an area that causes interference, another media type or another location may be required. See the section “Troubleshooting Wiring” in this chapter.

13. C. Data signals weaken as they travel farther from the point of origin. If the signal travels far enough, it can weaken so much that it becomes unusable. The weakening of data signals as they traverse the media is referred to as attenuation. For more information, see the section “Wiring Issues” in this chapter.

14. C. NEXT refers to interference between adjacent wire pairs within the twisted-pair cable at the near end of the link (the end closest to the origin of the data signal). NEXT occurs when an outgoing data transmission leaks over to an incoming transmission. Answer D refers to FEXT, which is interference at the far end of the link. Answers A and B are invalid. For more information, see the section “Wiring Issues” in this chapter.

15. A. Near End cross talk, or NEXT, occurs when connectors are not properly attached to UTP cable. Specifically, the cross talk can occur if the wires pushed into the RJ-45 connector are crossed or crushed. When this occurs, the signal will experience intermittent problems. For more information, see the section “Wiring Issues” in this chapter.

16. C, D. An AP has a limited distance that it can send data transmissions. When a client system moves out of range, it won’t be able to access the AP. Many strategies exist to increase transmission distances, including RF repeaters, amplifiers, and buying more powerful antennas. Also, client systems may be moved and the signal can be weakened by a physical issue such as a concrete wall, mirror, or other obstacles. This too can explain intermittent connectivity problems. The problem is not likely related to the SSID or WEP settings as the client had access to the network before and no settings were changed. For more information, see the section “Troubleshooting Wireless Issues” in this chapter.

17. A. When you are working on an unfamiliar system, the first step should be to consult the documentation to gain as much information as you can about the server and the applications that run on it. All the other troubleshooting steps are valid, but they would be performed only after the information-gathering process is complete. For more information, see the section “The Art of Troubleshooting” in this chapter.

18. B. Wireless standards 802.11b and g are compatible, so either one could be used in a configuration. Encryption, SSID, and distance all have to be verified for a client to authenticate to an AP. For more information, see the section “Troubleshooting Wireless Configurations” in this chapter.

19. D. A Category 5e cable run through the ceiling is a likely candidate for EMI. Recall from Chapter 2 that UTP has poor resistance to electromagnetic interference (EMI), and therefore UTP and electrical equipment do not mix. Cables that are run close to fluorescent light fittings can experience intermittent problems because of EMI. For more information, see the section “Troubleshooting Wiring” in this chapter.

20. A, C. The information provided indicates that this user is the only one experiencing a problem. After determining the scope of the problem, we can assume that the issue must lie with something directly connected with that system. In this case, it is likely that the configuration of the workstation or the physical connectivity is to blame. For more information, see the section “The Art of Troubleshooting” in this chapter.
