Chapter 11 Wireless Security Troubleshooting

IN THIS CHAPTER, YOU WILL LEARN ABOUT THE FOLLOWING:

Five Tenets of WLAN Troubleshooting

PSK troubleshooting

802.1X/EAP troubleshooting

Roaming troubleshooting

VPN troubleshooting

Throughout this book, you have learned about the building blocks of WLAN security. We’ve focused on protecting the Layer 3–7 MSDU payload of 802.11 data frames as well as on protecting the WLAN portal. You have learned about the Layer 2 dynamic encryption that is used to provide data privacy and the secure authentication methods used prior to authorizing WLAN access for users and devices. However, as with any type of communications network, problems with WLAN networks arise that might require attention from an administrator. Client connectivity issues often arise that might be the result of improper implementation of WLAN security. In this chapter, you will learn how to troubleshoot PSK and 802.1X/EAP authentication that might be the root cause of connectivity and roaming problems. You will also learn other WLAN troubleshooting strategies from a security perspective.

Five Tenets of WLAN Troubleshooting

Before we discuss specific WLAN security troubleshooting strategies, you should understand five basic tenets for troubleshooting any type of WLAN problem:

  • Implement troubleshooting best practices.

  • Troubleshoot the OSI model.

  • Most problems are client side.

  • Proper WLAN design/planning is important.

  • The WLAN will always get the blame.

We will now review these WLAN troubleshooting doctrines in greater detail.

Troubleshooting Best Practices

The fundamentals of troubleshooting best practices are to ask questions and collect information. When troubleshooting any type of computer network, you must ask the correct questions to collect information that is relevant to the problem. It is easy to get sidetracked when troubleshooting, so asking the proper questions will help an IT administrator focus on the pertinent data with a goal of isolating the root cause of the problem. WLAN security problems often result in WLAN client connectivity issues; asking the appropriate questions will point you in the right direction toward solving the problem. Some of the basic questions that need to be asked include the following:

  • When is the problem happening?

    At what time did the problem occur? Did this problem happen during a very specific time period? This information can be easily determined by looking at the log files of APs, WLAN controllers, and applicable servers such as RADIUS. Best practices mandate that all Network Time Protocol (NTP) and time zone settings be correctly configured on all network hardware.

  • Where is the problem happening?

    Is the problem widespread or does it only exist in one physical area? Is the problem occurring on a single floor or in the entire building? Does the problem affect just one access point or a group of access points? Determining the location of the problem will help you gather better information toward solving the problem.

  • Does the problem affect one client or numerous clients?

    If the problem is only affecting a single client, you may have a simple driver issue or an incorrectly configured supplicant. If the issue is affecting numerous clients, then the problem is obviously of greater concern. Most connectivity problems are client side whether they are detrimental to a single client or multiple clients.

  • Does the problem reoccur or did it just happen once?

    Troubleshooting a problem that only happens one time or only a few times can be difficult. Collecting data is much easier with recurring problems. You may have to enable debug commands on APs or WLAN controllers to hopefully capture the problem again in a log file.

  • Did you make any changes recently?

    This is a question that the support personnel of WLAN vendors always ask their customers. And the answer is almost always no despite the fact that changes to the network indeed take place. Best practices dictate that any network configuration changes be planned and scheduled. WLAN infrastructure security audit logs will always leave a paper trail of which administrator made which changes at any specific time.
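The "when" question is easier to answer once timestamps from different devices are compared on a common clock. The following is a minimal sketch, assuming each log entry has already been parsed into a source name, a local timestamp, a UTC offset in hours, and a message. The entry format and the function names `to_utc` and `events_near` are hypothetical and not taken from any vendor tool.

```python
from datetime import datetime, timedelta, timezone

def to_utc(timestamp: str, utc_offset_hours: float) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS' local timestamp and convert it to UTC."""
    local = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return local.replace(tzinfo=local_zone).astimezone(timezone.utc)

def events_near(entries, reported_utc, window_minutes=5):
    """Return (source, utc_time, message) tuples close to the reported time.

    entries: iterable of (source, timestamp, utc_offset_hours, message).
    """
    window = timedelta(minutes=window_minutes)
    hits = []
    for source, timestamp, offset, message in entries:
        when = to_utc(timestamp, offset)
        if abs(when - reported_utc) <= window:
            hits.append((source, when, message))
    # Sort by time so AP, controller, and RADIUS events interleave correctly.
    return sorted(hits, key=lambda hit: hit[1])
```

If the devices do not share correct NTP and time zone settings, the offsets passed in here will be wrong and related events will not line up, which is why those settings matter.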

Once you have asked numerous questions, you can begin the process of solving the problem. Troubleshooting best practices include the following:

  1. Identify the issue.

    Because the WLAN always seems to get the blame, it is even more important to correctly identify the problem. Determine that a problem actually exists. Asking questions and collecting information will help you identify the true issue.

  2. Re-create the problem.

    Having the ability to duplicate the problem either onsite or in a remote lab gives you the ability to collect more information to diagnose the problem. If you cannot re-create a problem, you may need to ask more questions.

  3. Locate and isolate the cause.

    The whole point of asking the pointed questions and gathering data is so that you can isolate the root cause of the problem. Troubleshooting up the OSI model will also help you identify the culprit.

  4. Solve the problem.

    Formulate and implement a plan to solve the problem. This may require network changes, firmware updates, and so forth.

  5. Test to verify the problem is solved.

    Always be sure to test in different areas during different times and with multiple devices. Extensive testing will ensure that the problem is indeed resolved.

  6. Document the problem and the solution.

    Troubleshooting best practices dictate that you document all problems, diagnostics, and resolutions. A reference help desk database will assist you in solving problems in a timely fashion should any problem reoccur.

  7. Provide feedback.

    As a professional courtesy, always be sure to follow up with the individual(s) who first alerted you to the problem.

WLAN security problems usually result in WLAN client connection failures. Many WLAN vendors offer Layer 2 diagnostic tools to troubleshoot client device authentication and association. These diagnostic tools may be accessible directly from an AP, a WLAN controller, or a cloud-based network management system (NMS). Better diagnostic tools may even offer suggested remediation for detected problems. Security and AAA log files from the WLAN hardware and the RADIUS server are also a great place to start when troubleshooting either PSK or 802.1X/EAP authentication problems. Log files may also be gathered from individual WLAN supplicants.

Third-party tools are also available for diagnostics. One example is the handheld AirCheck G2 wireless tester tool from NetScout, shown in Figure 11.1. Another example is a protocol analyzer, which can be used to capture 802.11 frames relevant to RSN security associations.

FIGURE 11.1 Handheld diagnostic tool


Troubleshoot the OSI Model

The diagnostic approach that is used to troubleshoot wired 802.3 networks should also be applied when troubleshooting a wireless local area network (WLAN). A bottom-up approach to analyzing the OSI reference model layers also applies to wireless networking. Remember that 802.11 technology is similar to 802.3 in that it operates at the first two layers of the OSI model. For that reason, a WLAN administrator should always try to first determine whether problems exist at Layer 1 and Layer 2. If the first two layers of the OSI model have been eliminated as the cause of the problem, the problem is not a Wi-Fi problem and the higher layers of the OSI model should be investigated.

As with most networking technologies, most problems usually exist at the Physical layer. Simple Layer 1 problems, such as nonpowered access points or client radio driver problems, are often the root cause of connectivity or performance issues. Disruption of RF signal propagation and RF interference will affect both the performance and coverage of your WLAN. But what about Physical layer problems that are actually security related? The most likely culprit is improperly configured supplicant security settings. Later in this chapter, we will discuss troubleshooting the problems that can occur at the Physical layer due to misconfigured supplicants.

After eliminating Layer 1 as the source of the problem, a WLAN administrator should try to determine whether the problem exists at the Data-Link layer. As shown in Figure 11.2, WLAN security mechanisms operate at Layer 2. You have already learned that modern-day 802.11 radios use CCMP encryption that provides data privacy for Layers 3–7. The chosen encryption method must match on both the AP and client radios. For example, if an AP has disabled backward compatibility for TKIP encryption, a legacy client that only supports TKIP will not be able to connect. Remember that only CCMP encryption can be used for 802.11n (HT) and 802.11ac (VHT) data rates. An access point might be configured to transmit an SSID that supports both TKIP and CCMP encryption. In this situation, a common support call may be that the legacy TKIP clients seem slow because of the lack of support for higher data rates. The simple solution is to replace the legacy clients with modern-day clients that support CCMP.

FIGURE 11.2 OSI model


Also remember that there is a symbiotic relationship between the creation of dynamic encryption keys and authentication. A pairwise master key (PMK) is used to seed the 4-Way Handshake that generates the unique dynamic encryption keys employed by any two 802.11 radios. The PMK is generated as a byproduct of either PSK or 802.1X/EAP authentication. Therefore, if authentication fails, no encryption keys are generated. We will discuss troubleshooting both 802.11 authentication methods later in this chapter.

As stated earlier, if the first two layers of the OSI model have been eliminated, the problem is not a Wi-Fi problem and therefore the problem exists within Layers 3–7. It is likely the problem is either a TCP/IP networking issue or an application issue. As shown in Figure 11.2, TCP/IP problems should be investigated at Layers 3–4, whereas most application issues exist between Layers 5 and 7.
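The bottom-up approach can be sketched as a simple walk through connection stages, stopping at the first one that fails. This is only an illustration: the stage names and the `first_failure` and `is_wifi_problem` helpers are hypothetical, and the pass/fail results would come from whatever diagnostic tools or logs you have available.

```python
from enum import Enum
from typing import Dict, Optional

class Stage(Enum):
    # Declared in bottom-up order; Enum iteration follows declaration order.
    RADIO = "Layer 1: radio, power, or driver"
    ASSOCIATION = "Layer 2: 802.11 association"
    AUTHENTICATION = "Layer 2: PSK or 802.1X/EAP authentication"
    HANDSHAKE = "Layer 2: 4-Way Handshake"
    IP_ADDRESS = "Layers 3-4: DHCP and TCP/IP"
    APPLICATION = "Layers 5-7: application"

WIFI_STAGES = {Stage.RADIO, Stage.ASSOCIATION, Stage.AUTHENTICATION, Stage.HANDSHAKE}

def first_failure(results: Dict[Stage, bool]) -> Optional[Stage]:
    """Return the lowest stage that did not succeed, or None if all passed."""
    for stage in Stage:
        if not results.get(stage, False):
            return stage
    return None

def is_wifi_problem(stage: Optional[Stage]) -> bool:
    """Only Layer 1 and Layer 2 failures are treated as Wi-Fi problems."""
    return stage in WIFI_STAGES
```

A client that completes the 4-Way Handshake but never receives an IP address would be reported at `Stage.IP_ADDRESS`, which this sketch does not count as a Wi-Fi problem.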

Most Wi-Fi Problems Are Client Issues

As previously mentioned, whenever you troubleshoot a WLAN, you should start at the Physical layer, and 70 percent of the time the problem will reside on the WLAN client. If there are any client connectivity problems, WLAN Troubleshooting 101 dictates that you disable and re-enable the WLAN network adapter. The driver for the WLAN network interface card (NIC) is the interface between the 802.11 radio and the operating system (OS) of the client device. For whatever reason, the WLAN driver and the OS of the device may not be communicating properly. A simple disable/re-enable of the WLAN NIC will reset the driver. Always eliminate this potential problem before investigating anything else. Additionally, first-generation radio drivers and firmware are notorious for possible bugs. Always make sure the WLAN client population has the latest available drivers installed.

Another change that is quick and easy to make is to reconfigure the client configuration profile. Most client supplicants allow the user to define a WLAN configuration profile or connection parameters. Sometimes troubleshooting a problem is as easy as deleting the old profile and configuring a new profile.

As mentioned earlier, client-side security issues usually revolve around improperly configured supplicant settings. This could be something as simple as a mistyped WPA2-Personal passphrase or as complex as 802.1X/EAP digital certificate problems. Many roaming problems are also a direct result of lack of support for fast secure roaming (FSR) mechanisms on the client. Most businesses and corporations can eliminate many of the client connectivity and performance problems by simply upgrading company-owned client devices before updating the WLAN infrastructure. Sadly, the opposite is often more common, with companies spending many thousands of dollars on new access point technology upgrades while still deploying legacy clients.

Proper WLAN Design Reduces Problems

Poor WLAN performance is often a problem that must be addressed, and often the performance issues are a result of improper WLAN design. A huge percentage of WLAN support phone calls are a symptom of a lack of WLAN design. Proper capacity and coverage planning, spectrum analysis, and a validation site-survey will eliminate the majority of WLAN support tickets in regard to performance. Additionally, many WLAN security holes can be eliminated in advance with proper WLAN security planning. If 802.1X/EAP is deployed, one of the biggest challenges is how to provision the root CA certificates for mobile devices such as smart phones and tablets. A well-thought-out security strategy for employee WLAN devices, BYOD devices, and guest WLAN access is essential. Proper WLAN security planning and design in advance will reduce time spent troubleshooting WLAN security problems at a later juncture.

WLAN Always Gets the Blame

Despite all your best WLAN troubleshooting practices and best efforts, you should resign yourself to the fact that the WLAN will always get the blame. Experienced WLAN administrators know the WLAN will be blamed for problems that have nothing to do with the WLAN. This is another reason that troubleshooting up the OSI stack is important. If the problem is not a Layer 1 or Layer 2 problem, then Wi-Fi is not the culprit. However, put yourself in the shoes of the end user who is connected to the WLAN. 802.11 technology exists at the access layer. The whole point of an AP is to provide a wireless portal to a preexisting network infrastructure. Your employees and guests who connect to the WLAN expect seamless wireless mobility, and they have no concept of problems that exist at Layers 3–7. A WLAN end user is not aware that the DHCP server is out of leases. A WLAN end user is not aware that the Internet service provider (ISP) is experiencing difficulty and the WAN link is down. The WLAN end user just knows that they cannot access www.facebook.com through the WLAN and therefore they point the finger at the Wi-Fi network.

PSK Troubleshooting

Troubleshooting PSK authentication is relatively easy. WLAN vendor diagnostic tools, log files, or a protocol analyzer can all be used to observe the 4-Way Handshake process between a WLAN client and an access point. Let’s first take a look at a successful PSK authentication. In Figure 11.4, you can see the client associate with the AP and then PSK authentication begins. Because the PSK credentials matched on both the access point and the client, a pairwise master key (PMK) is created to seed the 4-Way Handshake. The 4-Way Handshake process is used to create the dynamically generated unicast encryption key that is unique to the AP radio and the client radio.

FIGURE 11.4 Successful PSK authentication


Figure 11.4 shows that the 4-Way Handshake process was successful and that the pairwise transient key (PTK) is installed on the AP and the client. The Layer 2 negotiations are now complete, and it is time for the client to move on to higher layers. So of course the next step is that the client obtains an IP address via DHCP. If the client does not get an IP address, there is a networking issue and therefore the problem is not a Wi-Fi issue.

Perhaps a Wi-Fi administrator receives a phone call from an end user who cannot get connected using WPA2-Personal. The majority of problems are at the Physical layer; therefore, Wi-Fi Troubleshooting 101 dictates that the end user first disable and re-enable the Wi-Fi network card. This should ensure the Wi-Fi NIC drivers are communicating properly with the operating system. If the connectivity problem persists, the problem exists at Layer 2. You can then use diagnostic tools, log files, or a protocol analyzer to observe the failed PSK authentication of the WLAN client.

In Figure 11.5, you can see the client associate and then start PSK authentication. However, the 4-Way Handshake process fails. Notice that only two frames of the 4-Way Handshake complete.

FIGURE 11.5 Unsuccessful PSK authentication


The problem is almost always a mismatch of the PSK credentials. If the PSK credentials do not match, a pairwise master key (PMK) seed is not properly created and therefore the 4-Way Handshake fails entirely. The final pairwise transient key (PTK) is never created. Remember a symbiotic relationship exists between authentication and the creation of dynamic encryption keys. If PSK authentication fails, so does the 4-Way Handshake that is used to create the dynamic encryption keys. There is no attempt by the client to get an IP address because the Layer 2 process did not complete.
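One way to see why only two handshake frames complete is to look at the message integrity check (MIC) carried in the second frame. The sketch below is a simplified model, not a full implementation: it derives the PMK from the passphrase and SSID with PBKDF2-HMAC-SHA1, expands it into a key confirmation key (KCK) with the HMAC-SHA1 based PRF used for CCMP, and compares the MIC each side would compute. The MAC addresses and the frame body are made-up placeholders, and the helper names are hypothetical.

```python
import hashlib
import hmac
import os

def derive_pmk(passphrase: str, ssid: str) -> bytes:
    """WPA2-Personal: 256-bit PMK from PBKDF2-HMAC-SHA1, SSID as salt, 4096 rounds."""
    return hashlib.pbkdf2_hmac("sha1", passphrase.encode(), ssid.encode(), 4096, 32)

def prf(key: bytes, label: bytes, data: bytes, length: int) -> bytes:
    """802.11 PRF: concatenated HMAC-SHA1 blocks with a one-byte counter."""
    output = b""
    counter = 0
    while len(output) < length:
        message = label + b"\x00" + data + bytes([counter])
        output += hmac.new(key, message, hashlib.sha1).digest()
        counter += 1
    return output[:length]

def derive_kck(pmk, ap_mac, client_mac, anonce, snonce) -> bytes:
    """Expand the PMK into a 384-bit PTK and return its first 128 bits (the KCK)."""
    data = (min(ap_mac, client_mac) + max(ap_mac, client_mac)
            + min(anonce, snonce) + max(anonce, snonce))
    ptk = prf(pmk, b"Pairwise key expansion", data, 48)
    return ptk[:16]

def message2_accepted(ap_passphrase: str, client_passphrase: str, ssid: str) -> bool:
    """Return True if the AP would accept the MIC in handshake message 2."""
    ap_mac = bytes.fromhex("020000000001")      # placeholder addresses
    client_mac = bytes.fromhex("020000000002")
    anonce, snonce = os.urandom(32), os.urandom(32)
    frame = b"placeholder EAPOL-Key message 2 body"  # not a real frame layout

    client_kck = derive_kck(derive_pmk(client_passphrase, ssid),
                            ap_mac, client_mac, anonce, snonce)
    client_mic = hmac.new(client_kck, frame, hashlib.sha1).digest()[:16]

    ap_kck = derive_kck(derive_pmk(ap_passphrase, ssid),
                        ap_mac, client_mac, anonce, snonce)
    expected_mic = hmac.new(ap_kck, frame, hashlib.sha1).digest()[:16]
    return hmac.compare_digest(client_mic, expected_mic)
```

With matching passphrases the MICs agree and the exchange can continue to the third and fourth frames. With any difference in the passphrase, including letter case, the AP's check of the second frame fails and the handshake stops there.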

Remember that an 8–63 character case-sensitive passphrase is entered by the user or administrator. This passphrase is then used to create the PSK. The passphrase could possibly be improperly configured on the access point; however, the majority of the time, the problem is simple: the end user is incorrectly typing in the passphrase. The administrator should make a polite request to the end user to retype the passphrase slowly and carefully, which is a well-known cure for what is known as fat-fingering.
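Before asking the end user to retype the passphrase, it can help to rule out entries that could never be valid. The following is a small sketch with hypothetical helper names. It checks the 8–63 character length, flags stray whitespace, and assumes the passphrase is limited to printable ASCII characters. The `likely_typo` helper is only useful in a lab or help desk setting where both strings are available for comparison.

```python
from typing import List, Optional

def passphrase_problems(passphrase: str) -> List[str]:
    """List the reasons a passphrase cannot be a valid WPA2-Personal passphrase."""
    problems = []
    if not 8 <= len(passphrase) <= 63:
        problems.append(f"length {len(passphrase)} is outside 8-63 characters")
    if passphrase != passphrase.strip():
        problems.append("leading or trailing whitespace")
    if any(not 32 <= ord(ch) <= 126 for ch in passphrase):
        problems.append("contains characters outside printable ASCII")
    return problems

def likely_typo(entered: str, configured: str) -> Optional[str]:
    """Guess the kind of mistake when the entered passphrase does not match."""
    if entered == configured:
        return None
    if entered.lower() == configured.lower():
        return "case mismatch"
    if entered.strip() == configured:
        return "extra whitespace"
    return "different passphrase"
```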

Another possible cause of the failure of PSK authentication could be a mismatch of the chosen encryption methods. An access point might be configured to support only WPA2 (CCMP-AES), which a legacy WPA (TKIP) client does not support. A similar failure of the 4-Way Handshake would occur.
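The cipher mismatch can be pictured as a search for a common entry in two lists. This is a simplified sketch with hypothetical function names. Real radios advertise and select ciphers through the RSN information element rather than through Python sets.

```python
from typing import Optional, Set

def negotiate_cipher(ap_ciphers: Set[str], client_ciphers: Set[str]) -> Optional[str]:
    """Pick the strongest cipher both sides support, preferring CCMP over TKIP."""
    for cipher in ("CCMP", "TKIP"):
        if cipher in ap_ciphers and cipher in client_ciphers:
            return cipher
    return None

def describe_connection(ap_ciphers: Set[str], client_ciphers: Set[str]) -> str:
    """Summarize what the end user is likely to experience."""
    cipher = negotiate_cipher(ap_ciphers, client_ciphers)
    if cipher is None:
        return "no common cipher: the client cannot connect"
    if cipher == "TKIP":
        return "connects with TKIP: no 802.11n/ac data rates"
    return "connects with CCMP"
```

An AP that offers only CCMP and a legacy client that supports only TKIP have nothing in common, so the connection fails. An AP that offers both lets the legacy client connect, but at legacy data rates.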

802.1X/EAP Troubleshooting

PSK authentication (also known as WPA2-Personal) is simple to troubleshoot because the authentication method was designed to be uncomplicated. However, troubleshooting the more complex 802.1X/EAP authentication (also known as WPA2-Enterprise) is a bigger challenge because multiple points of failure exist.

As you learned in Chapter 4, “802.1X/EAP Authentication,” 802.1X is a port-based access control standard that defines the mechanisms necessary to authenticate and authorize devices to network resources. The 802.1X authorization framework consists of three main components, each with a specific role. These three 802.1X components work together to make sure only properly validated users and devices are authorized to access network resources. The three 802.1X components are known as the supplicant, authenticator, and authentication server. The supplicant is the user or device that is requesting access to network resources. The authentication server’s job is to validate the supplicant’s credentials. The authenticator is a gateway device that sits in the middle between the supplicant and authentication server, controlling or regulating the supplicant’s access to the network.

802.1X/EAP Troubleshooting Zones

In the example shown in Figure 11.6, the supplicant is a Wi-Fi client, an AP is the authenticator, and an external RADIUS server functions as the authentication server. The RADIUS server can maintain an internal user database or query an external database, such as an LDAP database. Extensible Authentication Protocol (EAP) is used within the 802.1X framework to validate users at Layer 2. The supplicant will use an EAP protocol to communicate with the authentication server at Layer 2. The Wi-Fi client will not be allowed to communicate at the upper layers of 3–7 until the RADIUS server has validated the supplicant’s identity at Layer 2.

FIGURE 11.6 802.1X/EAP


The AP blocks all of the supplicant’s higher-layer communications until the supplicant is validated. When the supplicant is validated, higher-layer communications are allowed through a virtual “controlled port” on the AP (the authenticator). Layer 2 EAP authentication traffic is encapsulated in RADIUS packets between the authenticator and the authentication server. The authenticator and the authentication server also validate each other with a “shared secret.”

Better versions of EAP such as EAP-PEAP and EAP-TTLS use “tunneled authentication” to protect the supplicant credentials from offline dictionary attacks. Certificates are used within the EAP process to create an encrypted SSL/TLS tunnel and ensure a secure authentication exchange. As illustrated in Figure 11.6, a server certificate resides on the RADIUS server and the root CA public certificate must be installed on the supplicant. As mentioned earlier, there are many points of failure in an 802.1X/EAP process. However, as depicted in Figure 11.7, there are effectively two troubleshooting zones within the 802.1X/EAP framework where failures will occur. Troubleshooting zone 1 consists of the backend communications between the authenticator, the authentication server, and the LDAP database. Troubleshooting zone 2 resides solely on the supplicant device that is requesting access.

FIGURE 11.7 802.1X/EAP troubleshooting zones


Zone 1: Backend Communication Problems

Zone 1 should always be investigated first. If an AP and a RADIUS server cannot communicate with each other, the entire authentication process will fail. If the RADIUS server and the LDAP database cannot communicate, the entire authentication process will also fail.

Figure 11.8 shows a capture of a supplicant (Wi-Fi client) trying to contact a RADIUS server. The authenticator forwards the request to the RADIUS server, but the RADIUS server never responds. The AP (authenticator) then sends a deauthentication frame to the Wi-Fi client because the process failed. This is an indication that there is a backend communication problem in the first troubleshooting zone.

FIGURE 11.8 The RADIUS server does not respond.


As shown in Figure 11.9, if the RADIUS server never responds to the supplicant, there are four possible points of failure in the first troubleshooting zone:

  • Shared secret mismatch

  • Incorrect IP settings on the AP or the RADIUS server

  • Authentication port mismatch

  • LDAP query failure

FIGURE 11.9 Points of failure – 802.1X/EAP troubleshooting zone 1


The first three possible points of failure are between the authenticator and the RADIUS server. The authenticator and the authentication server validate each other with a shared secret. The most common failure in RADIUS communications is that the shared secret has been typed in wrong on either the RADIUS server or the AP functioning as the authenticator.

The second most common failure in RADIUS communications is simply misconfigured IP networking settings. The AP must know the correct IP address of the RADIUS server. Likewise the RADIUS server must be configured with the IP addresses of any APs or WLAN controllers functioning as authenticators. Incorrect IP settings will result in miscommunications.

The third point of failure between an authenticator and an authentication server is a mismatch of RADIUS authentication ports. UDP ports 1812 and 1813 are defined as the industry standard ports used for RADIUS authentication and accounting, respectively. However, some older RADIUS servers may be using UDP ports 1645 and 1646. UDP ports 1645 and 1646 are rarely used anymore but do occasionally show up on older RADIUS servers. Although not a common point of failure, if the authentication ports do not match between a RADIUS server and the AP, the authentication process will fail.

The final point of failure on the backside is a failure of the LDAP query between a RADIUS server and the LDAP database. A standard domain account can be used for LDAP queries; however, if the account has expired or if there is a networking issue between the RADIUS server and the LDAP server, the entire 802.1X/EAP authentication process will fail.
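The first three Zone 1 checks amount to comparing two configurations side by side. The sketch below assumes you can export the relevant settings from the AP (or WLAN controller) and from the RADIUS server. The `AuthenticatorConfig` and `RadiusServerConfig` classes and their field names are hypothetical and do not correspond to any vendor's configuration format. The LDAP query is not covered here because it depends on the directory in use.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class AuthenticatorConfig:
    own_ip: str              # IP address the AP uses to reach the RADIUS server
    radius_server_ip: str    # where the AP sends RADIUS requests
    auth_port: int
    shared_secret: str

@dataclass
class RadiusServerConfig:
    listen_ip: str
    auth_port: int
    # Maps each permitted authenticator IP address to its shared secret.
    clients: Dict[str, str] = field(default_factory=dict)

def zone1_findings(ap: AuthenticatorConfig, server: RadiusServerConfig) -> List[str]:
    """Compare AP and RADIUS server settings and list any mismatches."""
    findings = []
    if ap.radius_server_ip != server.listen_ip:
        findings.append("AP points at the wrong RADIUS server IP address")
    if ap.own_ip not in server.clients:
        findings.append("RADIUS server has no entry for this authenticator")
    elif server.clients[ap.own_ip] != ap.shared_secret:
        findings.append("shared secret mismatch")
    if ap.auth_port != server.auth_port:
        findings.append(
            f"authentication port mismatch (AP {ap.auth_port}, server {server.auth_port})"
        )
    return findings
```

An empty list means these three settings agree. It does not prove that packets actually reach the server, so a packet capture or the RADIUS server log is still the final word.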

Zone 2: Supplicant Certificate Problems

If all backend communications between the authenticator and the RADIUS server are functioning properly, then the 802.1X/EAP troubleshooting focus should now be redirected to Zone 2. In simpler words, the culprit is the supplicant. Problems with the supplicant usually either revolve around certificate issues or client credential issues. Let’s take a look at Figure 11.11. Note that the RADIUS server is responding and therefore verifying that the backend communications are good. Also notice an SSL tunnel negotiation starts and finishes successfully. This 802.1X/EAP diagnostic log confirms that the certificate exchange was successful and that an SSL/TLS tunnel was successfully created to protect the supplicant credentials.

FIGURE 11.11 Successful SSL/TLS tunnel creation


Figure 11.12 displays an 802.1X/EAP diagnostic log where you can see the SSL negotiation begin and the server certificate sent from the RADIUS server to the supplicant. However, the SSL/TLS tunnel is never created, and EAP authentication fails. If the SSL/TLS tunnel cannot be established, this is an indication that there is some sort of certificate problem.

FIGURE 11.12 Unsuccessful SSL/TLS tunnel creation


You can usually verify that there is a certificate problem by editing the supplicant client software settings and temporarily disabling the validation of the server certificate, as shown in Figure 11.13. If EAP authentication is successful after you temporarily disable the validation of the server certificate, then you have confirmed there is a problem with the implementation of the certificates within the 802.1X framework. Please note that this is not a fix but an easy way to verify that some sort of certificate issue exists.

FIGURE 11.13 Server certificate validation


A whole range of certificate problems could be causing the SSL/TLS tunnel not to be successfully created. The most common certificate issues are

  • The root CA certificate is installed in the incorrect certificate store.

  • The incorrect root certificate is chosen.

  • The server certificate has expired.

  • The root CA certificate has expired.

  • The supplicant clock settings are incorrect.

The root CA certificate needs to be installed in the Trusted Root Certification Authorities store of the supplicant device. A common mistake is to install the root CA certificate in the default location, which is typically the personal store of a Windows machine. Another common mistake is to select the incorrect root CA certificate with the supplicant configuration. The SSL/TLS tunnel will fail because the incorrect root CA certificate will not be able to validate the server certificate. Digital certificates are also time-based, and a common problem is that the server certificate has expired. Although not as common, the root CA certificate can also have expired. The clock settings on the supplicant may be incorrect and might possibly predate the creation of either certificate.
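The time-based checks can be separated from the rest of certificate validation. The sketch below assumes you have already read the validity dates from each certificate and that you have a trusted reference time to compare against the supplicant clock. The function name, the five-minute skew tolerance, and the tuple layout are all hypothetical choices for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

Certificate = Tuple[str, datetime, datetime]  # (name, not_before, not_after)

def certificate_time_findings(
    certs: List[Certificate],
    supplicant_clock: datetime,
    reference_clock: datetime,
    max_skew: timedelta = timedelta(minutes=5),
) -> List[str]:
    """Report clock skew and certificates that are invalid on the supplicant clock."""
    findings = []
    if abs(supplicant_clock - reference_clock) > max_skew:
        findings.append("supplicant clock differs from the reference time")
    for name, not_before, not_after in certs:
        if supplicant_clock < not_before:
            findings.append(f"{name} is not yet valid on the supplicant clock")
        elif supplicant_clock > not_after:
            findings.append(f"{name} has expired on the supplicant clock")
    return findings
```

Checking the clock first matters because a supplicant whose clock has reset to an old date will treat every certificate as not yet valid, even when the certificates themselves are fine.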

Because of all the possible points of failure involving certificates, troubleshooting 802.1X/EAP certificate problems in Zone 2 can be difficult. Additionally, there are more potential problems with certificates. The server certificate configuration may be incorrect on the RADIUS server. In other words, the certificate problem exists back in troubleshooting Zone 1. What if EAP-TLS is the deployed authentication protocol? EAP-TLS requires the provisioning of client-side certificates in addition to server certificates. Client certificates add an additional layer of possible certificate troubleshooting on the supplicant as well as within the private PKI infrastructure that has been deployed.

There is one final complication that might result in the failure of tunneled authentication. The chosen Layer 2 EAP protocols must match on both the supplicant and the authentication server. For example, the authentication will fail if PEAPv0 (EAP-MSCHAPv2) is selected on the supplicant while PEAPv1 (EAP-GTC) is configured on the RADIUS server. Although the SSL/TLS tunnel might still be created, the inner tunnel authentication protocol does not match and authentication will fail. Although it is possible for multiple flavors of EAP to operate simultaneously over the same 802.1X framework, the EAP protocols must match on both the supplicant and the authentication server.
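The matching rule can be modeled as a pair of values: the outer tunnel type and the inner authentication method. This is a rough sketch with a hypothetical function name. Real supplicants and RADIUS servers negotiate EAP types on the wire rather than by comparing strings.

```python
from typing import Optional, Set, Tuple

EapMethod = Tuple[str, str]  # (outer tunnel type, inner method)

def eap_mismatch(supplicant: EapMethod, server_allowed: Set[EapMethod]) -> Optional[str]:
    """Explain why the supplicant's EAP selection will not work, if it will not."""
    outer, inner = supplicant
    if not any(allowed_outer == outer for allowed_outer, _ in server_allowed):
        return "outer EAP type is not offered by the server"
    if (outer, inner) not in server_allowed:
        return "tunnel may form, but the inner method does not match"
    return None
```

For example, a supplicant set to `("PEAP", "EAP-MSCHAPv2")` against a server that only allows `("PEAP", "EAP-GTC")` falls into the second case.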

Zone 2: Supplicant Credential Problems

If you can verify that you do not have any certificate issues and the SSL/TLS tunnel is indeed established, the supplicant problems are credential failures. Figure 11.14 displays an 802.1X/EAP diagnostic log where the RADIUS server is rejecting the supplicant credentials. Possible supplicant credential problems include

  • Expired password or user account

  • Wrong password

  • User account does not exist in LDAP

  • Machine account has not been joined to the Windows domain

FIGURE 11.14 RADIUS server rejects supplicant credentials


If the user credentials do not exist in the LDAP database or the credentials have expired, authentication will fail. Unless single sign-on capabilities have been implemented on the supplicant, there is always the possibility that the domain user password can be incorrectly typed by the end user.

Another common error is that the Wi-Fi supplicant has been improperly configured for machine authentication and the RADIUS server has only been configured for user authentication. In Figure 11.15 we see a diagnostic log that clearly shows the machine credentials being sent to the RADIUS server and not the user credentials. The RADIUS server was expecting a user account and therefore rejected the machine credentials because no machine accounts had been set up for validation. In the case of Windows, the machine credentials are based on a security identifier (SID) value that is stored on a Windows domain computer after being joined to a Windows domain with Active Directory.
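When reading a diagnostic log, the identity string is often enough to tell which kind of credentials were sent. The sketch below relies on an assumption: that machine identities appear with a `host/` prefix, as Windows supplicants commonly present them. Other platforms may format identities differently, and the function names here are hypothetical.

```python
from typing import Optional, Set

def identity_kind(identity: str) -> str:
    """Classify an EAP identity as 'machine' or 'user' (assumes the host/ prefix)."""
    return "machine" if identity.lower().startswith("host/") else "user"

def explain_reject(identity: str, server_accepts: Set[str]) -> Optional[str]:
    """Return an explanation if the server is not set up for this identity type."""
    kind = identity_kind(identity)
    if kind not in server_accepts:
        accepted = " and ".join(sorted(server_accepts))
        return f"supplicant sent {kind} credentials; server only validates {accepted} accounts"
    return None
```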

FIGURE 11.15 Machine authentication failure


Of course, a WLAN administrator can always verify that all is well with an 802.1X/EAP client session. Always remember that a byproduct of the EAP process is the generation of the pairwise master key (PMK) that seeds the 4-Way Handshake exchange. Figure 11.16 shows the EAP process completing; the pairwise master key (PMK) is sent to the AP from the RADIUS server. The 4-Way Handshake process then begins to dynamically generate the pairwise transient key (PTK) that is unique between the radios of the AP and the client device. When the 4-Way Handshake completes, the encryption keys are installed and the Layer 2 connection is completed. The virtual controlled port on the authenticator opens up for this Wi-Fi client. The supplicant can now proceed to higher layers and get an IP address. If the client does not get an IP address, there is a networking issue and therefore the problem is not a Wi-Fi issue.

FIGURE 11.16 4-Way Handshake


One final consideration when troubleshooting 802.1X/EAP is RADIUS attributes. RADIUS attributes can be leveraged during 802.1X/EAP authentication for role-based access control, providing custom settings for different groups of users or devices. For example, different groups of users may be assigned to different VLANs even though they are connected to the same 802.1X/EAP SSID. If the RADIUS attribute configuration does not match on the authenticator and the RADIUS server, users might be assigned to a default role or VLAN. In worst-case scenarios, a RADIUS attribute mismatch might result in authentication failure.
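The fallback to a default VLAN can be sketched as follows. This example assumes the standard tunnel attributes commonly used for dynamic VLAN assignment (Tunnel-Type 13 for VLAN, Tunnel-Medium-Type 6 for IEEE 802, and Tunnel-Private-Group-ID carrying the VLAN ID). Your WLAN vendor may rely on vendor-specific attributes instead, and the `assigned_vlan` function and dictionary layout are hypothetical.

```python
from typing import Any, Dict

TUNNEL_TYPE_VLAN = 13      # Tunnel-Type value for VLAN
TUNNEL_MEDIUM_802 = 6      # Tunnel-Medium-Type value for IEEE 802

def assigned_vlan(attributes: Dict[str, Any], default_vlan: int) -> int:
    """Return the VLAN from the RADIUS reply, or the default if it is unusable."""
    if (attributes.get("Tunnel-Type") == TUNNEL_TYPE_VLAN
            and attributes.get("Tunnel-Medium-Type") == TUNNEL_MEDIUM_802):
        group = attributes.get("Tunnel-Private-Group-ID")
        if group is not None and str(group).isdigit():
            return int(group)
    # Missing or mismatched attributes: the user lands in the default VLAN.
    return default_vlan
```

A user who unexpectedly ends up in the default VLAN is a hint to compare the attributes the RADIUS server returns with the attributes the authenticator is configured to honor.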

Roaming Troubleshooting

Mobility is the whole point behind wireless network access. 802.11 clients need the ability to seamlessly roam between access points without any interruption of service or degradation of performance. As shown in Figure 11.17, seamless roaming has become even more important in recent years because of the proliferation of handheld personal Wi-Fi devices such as smart phones and tablets.

FIGURE 11.17 Seamless roaming


The most common roaming problems are a result of either bad client drivers or bad WLAN design. The very common sticky client problem is when client stations stay connected to their original AP and do not roam to a new AP of closer vicinity and stronger signal. The sticky client problem and other roaming performance issues can usually be avoided with proper WLAN design and site surveys. Good roaming design entails defining primary coverage and secondary coverage zones.

Roaming performance also has a direct relationship to WLAN security. Every time a client station roams, new encryption keys must be generated between the AP and the client station radios via the 4-Way Handshake. When using 802.1X/EAP security, roaming can be especially troublesome for VoWiFi and other time-sensitive applications. Due to the multiple frame exchanges between the authentication server and the supplicant, an 802.1X/EAP authentication can take 700 milliseconds (ms) or longer for the client to authenticate. VoWiFi requires a handoff of 150 ms or less to avoid a degradation of the quality of the call, or even worse, a loss of connection. Therefore, faster, secure roaming handoffs are required.

In Chapter 7, “802.11 Fast Secure Roaming,” you learned about opportunistic key caching (OKC) and fast BSS transition (FT), both of which produce roaming handoffs of closer to 50 ms even when 802.1X/EAP is the chosen security solution. Both OKC and FT use key distribution mechanisms so that roaming clients do not have to reauthenticate every time they roam. OKC is now considered a legacy method of fast secure roaming. The FT roaming mechanisms defined in both 802.11r and Voice Enterprise are considered the standard. Many WLAN enterprise vendor APs are now certified for Voice Enterprise by the Wi-Fi Alliance. However, client-side support for Voice Enterprise is not widespread. Any client devices that were manufactured before 2012 simply will not support 802.11r/k/v operations. Therefore, the bulk of client devices do not support Voice Enterprise capabilities. However, client-side support is growing.
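
The timing figures above can be compared directly. The following sketch flags any measured handoff that exceeds the 150 ms VoWiFi budget. The sample durations are the round numbers quoted in the text, not measurements, and the function name is hypothetical.

```python
VOWIFI_BUDGET_MS = 150  # handoff limit for voice, as quoted in the text


def over_budget(handoffs_ms: dict, budget_ms: float = VOWIFI_BUDGET_MS) -> dict:
    """Return {label: overrun_ms} for every handoff that exceeds the budget."""
    return {
        label: ms - budget_ms
        for label, ms in handoffs_ms.items()
        if ms > budget_ms
    }


samples = {
    "full 802.1X/EAP reauthentication": 700,
    "OKC or FT roam": 50,
}
print(over_budget(samples))
```

With these sample values, only the full reauthentication is reported, at 550 ms over the budget. In practice the durations would come from timestamps in a packet capture taken while the client roams.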

Most security-related roaming problems are based on the fact that many clients simply do not support either OKC or fast BSS transition (FT). Client-side support for any device that will be using voice applications and 802.1X/EAP is critical. Proper planning and verification of client-side and AP support for OKC or FT will be necessary. Figure 11.18 shows the results of a diagnostic command that displays the roaming cache of an access point. This type of diagnostic command can verify whether PMKs are being forwarded between access points. In this situation FT is enabled on the AP and supported on the client radio. You can verify the MAC address of the supplicant and the authenticator as well as the PMK-R0 and the PMK-R0 holder. Always remember that the supplicant must also support FT; otherwise, the supplicant will reauthenticate every time the client roams.

FIGURE 11.18 Roaming cache

image

You also learned in Chapter 7 that enabling Voice Enterprise mechanisms on an access point may actually create connectivity problems for legacy clients. When FT is configured on an access point, the AP will broadcast management frames with new information elements. For example, the mobility domain information element (MDIE) will be in all beacon and probe response frames. Unfortunately, the drivers of some older legacy client radios may not be able to process the new information in these management frames. The result is that legacy clients may have connectivity problems when an AP is configured for FT. Always test the legacy client population when configuring APs for fast BSS transition. If connectivity problems arise, consider using a separate SSID solely for fast BSS transition devices. However, please remember that every SSID consumes airtime due to the Layer 2 management overhead. Additionally, as more devices begin to support FT, upgrade your client devices.
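
When checking whether an AP is advertising FT, it helps to know what the mobility domain element looks like in a capture. The sketch below walks the tagged parameters of a beacon or probe response body and decodes the mobility domain element (element ID 54). It assumes the caller has already stripped the fixed fields that precede the tagged parameters. The function names and the sample bytes are illustrative.

```python
MOBILITY_DOMAIN_ID = 54  # element ID of the mobility domain element


def parse_elements(body: bytes):
    """Yield (element_id, payload) pairs from a run of tagged parameters."""
    offset = 0
    while offset + 2 <= len(body):
        element_id, length = body[offset], body[offset + 1]
        payload = body[offset + 2: offset + 2 + length]
        if len(payload) < length:
            break  # truncated element, stop parsing
        yield element_id, payload
        offset += 2 + length


def find_mdie(body: bytes):
    """Return the decoded mobility domain element, or None if it is absent."""
    for element_id, payload in parse_elements(body):
        if element_id == MOBILITY_DOMAIN_ID and len(payload) == 3:
            return {
                "mdid": payload[:2].hex(),            # mobility domain ID, raw bytes
                "ft_over_ds": bool(payload[2] & 0x01),
                "resource_request": bool(payload[2] & 0x02),
            }
    return None


# Hypothetical tagged parameters: an SSID element followed by an MDIE
sample = bytes([0, 4]) + b"corp" + bytes([54, 3, 0x34, 0x12, 0x01])
print(find_mdie(sample))
```

A legacy driver that handles unknown elements badly may fail at exactly this parsing step, which is why the presence of the element alone can break connectivity for older clients.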

Because 802.11 wireless networks are usually integrated into preexisting wired topologies, crossing Layer 3 boundaries is often a necessity, especially in large deployments. The only way to maintain upper-layer communications when crossing Layer 3 subnets is to provide a Layer 3 roaming solution. When clients roam to a new subnet, a GRE tunnel must be created to the original subnet so that the WLAN client can maintain its original IP address. As shown in Figure 11.19, the major WLAN vendors offer diagnostic tools and commands to verify that Layer 3 roaming tunnels are being successfully created.

FIGURE 11.19 Layer 3 roaming

image
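
The condition that triggers Layer 3 roaming is simple to express: the client's existing IP address does not belong to the subnet served by the new AP. The following sketch uses Python's standard ipaddress module. The function name and the addresses are hypothetical.

```python
import ipaddress


def needs_l3_tunnel(client_ip: str, new_ap_subnet: str) -> bool:
    """True when the client's current IP is outside the new AP's subnet,
    meaning a tunnel back to the original subnet is needed to keep the IP."""
    address = ipaddress.ip_address(client_ip)
    subnet = ipaddress.ip_network(new_ap_subnet)
    return address not in subnet


print(needs_l3_tunnel("10.1.10.25", "10.1.10.0/24"))  # same subnet
print(needs_l3_tunnel("10.1.10.25", "10.1.20.0/24"))  # crossed a boundary
```

If a client loses its upper-layer sessions after a roam and this check is true for the APs involved, the vendor's tunnel diagnostics are the next place to look.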

VPN Troubleshooting

VPNs are rarely used anymore as the primary method of security for WLANs. Occasionally, a VPN may be used to provide data privacy across a point-to-point 802.11 wireless bridge link. IPsec VPNs are still commonly used to connect remote branch offices with corporate offices across WAN links. Although a site-to-site VPN link is not necessarily a WLAN security solution, the wireless user traffic that originated at the remote location may be required to traverse through a VPN tunnel. Most WLAN vendors also offer VPN capabilities within their solution portfolio.

The creation of an IPsec tunnel involves two phases, called Internet Key Exchange (IKE) phases:

  • IKE Phase 1

    The two VPN endpoints authenticate one another and negotiate keying material. The result is an encrypted tunnel used by Phase 2 for negotiating the Encapsulating Security Payload (ESP) security associations.

  • IKE Phase 2

    The two VPN endpoints use the secure tunnel created in Phase 1 to negotiate ESP security associations (SAs). The ESP SAs are used to encrypt user traffic that traverses between the endpoints.

The good news is that any quality VPN solution offers diagnostic tools and commands to troubleshoot both IKE phases. Some of the common problems that can occur if IKE Phase 1 fails are

  • Certificate problems

  • Incorrect networking settings

  • Incorrect NAT settings on the external firewall

In Figure 11.20 you see the results of an IKE Phase 1 diagnostic command executed on a VPN server. IPsec commonly uses digital certificates to authenticate the VPN endpoints during Phase 1. If IKE Phase 1 fails due to a certificate problem, ensure that you have the correct certificates installed properly on the VPN endpoints. Also remember that certificates are time based. Very often, a certificate problem during IKE Phase 1 is simply an incorrect clock setting on either VPN endpoint.

FIGURE 11.20 IPsec Phase 1 – certificate failure

image
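
The clock dependency can be shown with a small check of an endpoint's current time against a certificate's validity window. This is a minimal sketch. The function name and the dates are hypothetical, and a real endpoint performs this check inside its certificate validation library.

```python
from datetime import datetime, timezone


def cert_time_status(now: datetime, not_before: datetime,
                     not_after: datetime) -> str:
    """Classify the endpoint clock against a certificate validity window."""
    if now < not_before:
        return "not yet valid"  # the endpoint clock is likely set in the past
    if now > not_after:
        return "expired"        # certificate expired, or clock set in the future
    return "valid"


not_before = datetime(2024, 1, 1, tzinfo=timezone.utc)
not_after = datetime(2025, 1, 1, tzinfo=timezone.utc)

# A device that rebooted with a factory-default clock
print(cert_time_status(datetime(2020, 1, 1, tzinfo=timezone.utc),
                       not_before, not_after))
```

A "not yet valid" result for a certificate that works elsewhere points at the endpoint clock rather than at the certificate itself.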

In Figure 11.21 you see the results of an IKE Phase 1 diagnostic command executed on a VPN server that indicates a possible networking error due to incorrect configuration. IPsec uses private IP addresses for tunnel communications and also uses external IP addresses, which are normally the public IP address of a firewall. If an IKE Phase 1 failure occurs as shown in Figure 11.21, check the internal and external IP settings on the VPN devices. If an external firewall is being used, also check the Network Address Translation (NAT) settings. Another common networking problem that causes VPNs to fail is that needed firewall ports are blocked. Ensure that the following ports are open on any firewall that the VPN tunnel may traverse:

  • UDP 500 (IKE)

  • UDP 4500 (NAT Traversal)

FIGURE 11.21 IPsec Phase 1 – networking failure

image
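
A firewall rule review for the two ports listed above can be sketched as a set difference. The rule representation, a list of (protocol, port) pairs, is an assumption for this example. Real firewalls export rules in vendor-specific formats that would need to be converted first.

```python
# The two UDP ports named in the text
REQUIRED_VPN_PORTS = {("udp", 500), ("udp", 4500)}


def missing_vpn_ports(allowed_rules) -> list:
    """Return the required (protocol, port) pairs that no rule allows."""
    allowed = {(proto.lower(), int(port)) for proto, port in allowed_rules}
    return sorted(REQUIRED_VPN_PORTS - allowed)


# Hypothetical rule set that forgot NAT traversal
rules = [("UDP", 500), ("tcp", 443)]
print(missing_vpn_ports(rules))
```

An empty list means both ports are permitted by the rule set that was checked. The check has to be repeated for every firewall along the tunnel path.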

If you can confirm that IKE Phase 1 is successful yet the VPN is still failing, then IKE Phase 2 is the likely culprit. Some of the common problems if IKE Phase 2 fails are

  • Mismatched transform sets between the client and server (encryption algorithm, hash algorithm, etc.)

  • Mixing different vendor solutions

In Figure 11.22 you see the successful results of an IKE Phase 2 diagnostic command executed on a VPN server. If this command had indicated a failure, be sure to check both encryption and hash settings on the VPN endpoints. Check other IPsec settings such as tunnel mode. You will need to verify that all settings match on both ends. IKE Phase 2 problems often occur when different VPN vendors are used on opposite sides of the intended VPN tunnel. Although IPsec is a standards-based suite of protocols, mixing different VPN vendor solutions often results in more troubleshooting.

FIGURE 11.22 IPsec Phase 2 – Success

image
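
Comparing the Phase 2 settings of both endpoints side by side is the quickest way to find a transform set mismatch. The sketch below reports every setting that differs. The setting names and values are hypothetical, since each vendor labels these parameters differently.

```python
def transform_mismatches(local: dict, remote: dict) -> dict:
    """Return {setting: (local_value, remote_value)} for every difference."""
    settings = sorted(set(local) | set(remote))
    return {
        name: (local.get(name), remote.get(name))
        for name in settings
        if local.get(name) != remote.get(name)
    }


vpn_server = {"encryption": "aes-256", "hash": "sha1", "mode": "tunnel"}
vpn_client = {"encryption": "aes-192", "hash": "sha1", "mode": "tunnel"}
print(transform_mismatches(vpn_server, vpn_client))
```

A setting that is present on only one side is reported with None for the other side, which helps when two vendors expose different sets of options.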

Summary

Troubleshooting WLANs can be very challenging. Much of WLAN troubleshooting revolves around performance issues that are a result of improper WLAN design. However, WLAN troubleshooting can also revolve around the 802.11 security that is implemented. If you have a deep understanding of PSK authentication, 802.1X/EAP authentication, and the 4-Way Handshake mechanisms, you will be better prepared to troubleshoot potential WLAN security problems. Always remember to also use troubleshooting best practices, analyze the problems at the different layers of the OSI model, and utilize all diagnostic tools that might be available.

Exam Essentials

Understand troubleshooting basics. Recognize the importance of asking the correct questions and gathering the proper information to determine the root cause of the problem.

Explain where in the OSI model various WLAN problems occur. Remember that troubleshooting up the OSI model is a recommended strategy. WLAN security issues almost always reside at Layers 1 and 2. Remember that most WLAN connectivity problems also exist on the client devices as opposed to the WLAN infrastructure.

Explain how to troubleshoot PSK authentication. Understand that the usual causes of failed PSK authentication are client driver issues and mismatched passphrase credentials. The 4-Way Handshake will fail if PSK authentication fails.

Define the multiple points of failure of 802.1X/EAP authentication. Explain all the potential backend communications points of failure and possible supplicant failures. Understand how to analyze the 802.1X/EAP process to pinpoint the exact point of failure.

Explain potential WLAN security problems with roaming. Understand that both the WLAN infrastructure and the WLAN clients must support fast secure roaming mechanisms such as OKC or Voice Enterprise.

Define troubleshooting strategies for an IPsec VPN. Recognize that IPsec establishes a VPN tunnel through two IKE phases. Explain how to troubleshoot each independent IKE phase and how to rectify the problem.

Review Questions

1. What can cause PSK authentication to fail? (Choose all that apply.)

A. Passphrase mismatch

B. Expired root CA certificate

C. WLAN client driver problem

D. Expired LDAP user account

E. Encryption mismatch

2. When the Wi-Fi network is the actual source of either a connectivity, security, or performance problem, which WLAN device is usually where the problem resides?

A. WLAN controller

B. Access point

C. WLAN client

D. Wireless network management server

3. When you are troubleshooting client connectivity problems with a client using 802.1X/EAP security, what is the first action you should take to investigate a potential Layer 1 problem?

A. Reboot the WLAN client.

B. Verify the root CA certificate.

C. Verify the EAP protocol.

D. Disable and re-enable the client radio network interface.

E. Verify the server certificate.

4. Proper implementation of 802.1X/EAP security requires the exact same EAP protocol on which of these two devices?

A. Supplicant and authenticator

B. Supplicant and authentication server

C. Authenticator and authentication server

D. Authentication server and LDAP server

E. Supplicant and LDAP server

5. Bob the WLAN administrator is troubleshooting an IPsec VPN problem that has been deployed as the security solution over a point-to-point 802.11 wireless bridge link between two buildings. Bob cannot get the VPN tunnel to establish and notices that there is a certificate error during the IKE Phase 1 exchange. What are the possible causes of this problem? (Choose all that apply.)

A. The VPN server behind the root bridge is using AES-256 encryption, and the VPN endpoint device behind the nonroot bridge is using AES-192 encryption.

B. The VPN server behind the root bridge is using SHA-1 hash for data integrity, and the VPN endpoint device behind the nonroot bridge is using MD5 for data integrity.

C. The root CA certificate installed on the VPN device behind the nonroot bridge was not used to sign the server certificate on the VPN server behind the root bridge.

D. The clock settings of the VPN server that is deployed behind the root bridge predate the creation of the server certificate.

E. The public/private IP address settings are misconfigured on the VPN device behind the nonroot bridge.

6. Andrew Garcia, the WLAN administrator, is trying to explain to his boss that the WLAN is not the reason that Andrew’s boss cannot post on Facebook. Andrew has determined that the problem does not exist at Layer 1 or Layer 2 of the OSI model. What should Andrew say to his boss? (Choose the best answer.)

A. Wi-Fi only operates at Layer 1 and Layer 2 of the OSI model. The WLAN is not the problem.

B. The problem is most likely a networking problem or an application problem.

C. Don’t worry, boss; I will fix it.

D. Why are you looking at Facebook during business hours?

7. You have been tasked with troubleshooting a client connectivity problem at your company’s headquarters. All the APs and employee iPads are configured for PSK authentication. An employee notices that he cannot connect his iPad to the AP in the reception area of the main building but can connect to other APs. View the following graphic and describe the cause of the problem.

image

A. The WLAN client driver is not communicating properly with the device’s OS.

B. The APs are configured for CCMP encryption only. The client only supports TKIP.

C. The client has been configured with the wrong WPA2 Personal passphrase.

D. The AP in the reception area has been configured with the wrong WPA2 Personal passphrase.

8. You have been tasked with configuring a secure WLAN for 300 APs at the corporate offices. All the APs and employee Windows laptops have been configured for 802.1X using PEAPv0 (EAP-MSCHAPv2). The domain user accounts are failing authentication with every attempt. After viewing the graphic shown here, determine the possible causes of the problem. (Choose all that apply.)

image

A. Windows OS laptops have the root certificate installed in the incorrect store.

B. Windows OS laptops’ supplicant has been configured for machine authentication.

C. The shared secret does not match between the AP and the RADIUS server.

D. The RADIUS cannot query LDAP.

E. The Windows OS laptops have been configured for PEAPv1 (EAP-GTC).

F. The server certificate has expired.

9. You have been tasked with configuring a secure WLAN for 500 APs at the corporate offices. All the APs and employee Windows laptops have been configured for 802.1X using PEAPv1 (EAP-GTC). The domain user accounts are failing authentication with every attempt. After viewing the graphic shown here, determine the possible causes of the problem. (Choose all that apply.)

image

A. The Windows OS laptops have the root certificate installed in the incorrect store.

B. The Windows OS laptops’ supplicant has been configured for machine authentication.

C. The shared secret does not match between the AP and the RADIUS server.

D. The RADIUS cannot query LDAP.

E. The Windows OS laptops have been configured for PEAPv0 (EAP-MSCHAPv2).

F. The server certificate has expired.

10. The corporate IT administrators, Hunter, Rion, and Liam, are huddled together to try to solve an issue with the newly deployed VoWiFi phones. The chosen security solution is PEAPv0 (EAP-MSCHAPv2) for the voice SSID that also has Voice Enterprise enabled on the access points. The VoWiFi phones are authenticating flawlessly and voice calls are stable when the employees use the devices from their desk. However, there seem to be gaps in the audio and sometimes disconnects when the employees are talking on the VoWiFi phones and move to other areas of the building. What are the possible causes of the interruption of service for the voice calls while the employees are mobile? (Choose all that apply.)

A. VoWiFi phones should only be configured for PSK authentication when roaming is a requirement.

B. VoWiFi phones are reauthenticating every time they roam to a new AP.

C. VoWiFi phones do not use opportunistic key caching.

D. VoWiFi phones do not support fast BSS transition.

11. You have been tasked with troubleshooting a client connectivity problem at your company’s headquarters. All the APs and employee iPads are configured for PSK authentication. An employee notices that he cannot connect to any of the APs with his iPad; however, all the other corporate iPads are connecting. After viewing the graphic shown here, determine the cause of the problem.

image

A. The WLAN client driver is not communicating properly with the device OS.

B. The APs are configured for CCMP encryption only. The client only supports TKIP.

C. The APs have been configured with the WPA2 Personal passphrase.

D. The APs have been configured for WPA2 Enterprise.

12. You have been tasked with configuring a secure WLAN for 400 APs at the corporate offices. All the APs and employee Windows laptops have been configured for 802.1X using EAP-MSCHAPv2. The domain user accounts are failing authentication with every attempt. After viewing the graphic shown here, determine the possible causes of the problem. (Choose all that apply.)

image

A. The networking settings on the AP are incorrect.

B. The Windows OS laptops’ supplicant has been configured for machine authentication.

C. The supplicant clock settings are incorrect.

D. An authentication port mismatch exists between the AP and the RADIUS server.

E. The networking settings on the RADIUS server are incorrect.

F. The incorrect root certificate is selected in the supplicant.

13. You have been tasked with configuring a secure WLAN for 600 APs at the corporate offices. All the APs and employee Windows laptops have been configured for 802.1X/EAP. The domain user accounts are failing authentication with every attempt. After looking at some packet captures of the authentication failures, you have determined that an SSL/TLS tunnel is never created. After viewing the graphic shown here, determine the possible causes of the problem. (Choose all that apply.)

image

A. The Windows laptops are missing a client certificate.

B. The incorrect root certificate is selected in the supplicant.

C. The server certificate has expired.

D. PACs have not been provisioned properly.

E. The root certificate has expired.

14. The network administrator of the WonderPuppy Coffee Company calls up the support hotline for his WLAN vendor and informs the support personnel that the WLAN is broken. The support personnel ask the customer a series of questions so that they can isolate and identify the cause of a potential problem. What are some common Troubleshooting 101 questions? (Choose all that apply.)

A. When is the problem happening?

B. What is your favorite color?

C. What is your quest?

D. Does the problem reoccur or did it just happen once?

E. Did you make any changes recently?

15. You have been tasked with configuring a secure WLAN for 900 APs at the corporate offices. All the APs and employee Windows laptops have been configured for EAP-MSCHAPv2. You are required to provide both machine and user authentication as part of the security solution. You have verified that the backend communications between the RADIUS server and the AP are working. After viewing the graphic shown here, determine the possible causes of the problem. (Choose all that apply.)

image

A. The domain account has expired.

B. The machine accounts were not joined to the domain.

C. The server certificate has expired.

D. The supplicant has only been configured for user authentication.

E. The root certificate has expired.

F. The incorrect root certificate is selected in the supplicant.

16. The network administrator of the Holy Grail Corporation calls up the support hotline for his WLAN vendor and informs the support personnel that the WLAN bridge link is no longer working. The support personnel ask the customer a series of questions so that they can isolate and identify the cause of a potential problem. What are some common Troubleshooting 101 questions? (Choose all that apply.)

A. When is the problem happening?

B. Where is the problem happening?

C. Does the problem affect one client or numerous clients?

D. What is the airspeed velocity of an unladen swallow?

17. WLAN administrator Marko Tisler is troubleshooting an IPsec VPN problem that has been deployed as the security solution over a point-to-point 802.11 wireless bridge link between two buildings. Marko cannot get the VPN tunnel to establish and notices that the IKE Phase 1 exchange is successful; however, IKE Phase 2 is failing. What are the possible causes of this problem? (Choose all that apply.)

A. The VPN server behind the root bridge is using AES-256 encryption and the VPN endpoint device behind the nonroot bridge is using AES-192 encryption.

B. The VPN server behind the root bridge is using SHA-1 hash for data integrity and the VPN endpoint device behind the nonroot bridge is using MD5 for data integrity.

C. The root CA certificate installed on the VPN device behind the nonroot bridge was not used to sign the server certificate on the VPN server behind the root bridge.

D. The clock settings of the VPN server that sits behind the root bridge predate the creation of the server certificate.

E. The public/private IP address settings are misconfigured on the VPN device behind the nonroot bridge.

18. You have been tasked with configuring a secure WLAN for 900 APs at the corporate offices. All the APs and employee Windows laptops have been configured for EAP-MSCHAPv2. The WLAN clients are never able to connect to the WLAN. After viewing the graphic shown here, determine the possible causes of the problem. (Choose all that apply.)

image

A. The VLAN on the access layer switch is incorrectly configured.

B. The machine accounts were not joined to the domain.

C. The server certificate has expired.

D. The supplicant has only been configured for user authentication.

E. The root certificate has expired.

F. The DHCP server has run out of leases.

19. You have been tasked with configuring a secure WLAN for 600 APs at the corporate offices. All the APs and employee Windows laptops have been configured for EAP-MSCHAPv2. User authentication is failing for one of the employee laptops. After viewing the graphic shown here, determine the possible causes of the problem. (Choose all that apply.)

image

A. There is an incorrect shared secret on the RADIUS server.

B. The machine accounts were not joined to the domain.

C. The server certificate has expired.

D. The user account does not exist.

E. The user password has expired.

F. The DHCP server has run out of leases.

20. At what layer of the OSI model do most networking problems occur?

A. Physical

B. Data Link

C. Network

D. Transport

E. Session

F. Presentation

G. Application
