Analysing DNS problems

In the previous recipe, we saw what normal DNS operation looks like. In this recipe, we will learn how to spot problematic DNS behavior and how to trace it to its source.

Getting ready

A DNS problem can cause poor performance when browsing the Internet, a slow network inside the organization, or other performance issues. We will see how to isolate these problems and how to determine whether DNS is the cause.

How to do it...

There are two major types of problems in DNS:

  • DNS cannot resolve a name
  • Slow operation of DNS

In both cases, when you suspect an Internet connectivity problem, connect Wireshark to the network in the following order:

  1. First, port mirror the PC of the customer complaining about the problem. In this step, you will see specific problems on the PC.
  2. Then, port mirror your DNS server. In this step, you will be able to find the general problems that are common to the entire organization (or at least to the part of it that has a problem).

DNS cannot resolve a name

How will you know that this is the problem?

  1. You try but cannot browse the Internet, send e-mails, or perform any other operations on the Internet.

    Assuming your connectivity to the network is working properly, ping the website you are trying to browse (for example, issue the command: ping www.packtpub.com) and see if you get any response.

  2. If you get a response, name resolution is working.
  3. If you don't get any response, it can be because of the following reasons:
    • The website you are trying to ping blocks the ICMP requests
    • The DNS server you are trying to get the data from is not functioning
  4. To make sure that this is a DNS problem, start Wireshark and apply the dns display filter. In case of a problem, you will see one of the following:
    • The website does not exist
    • The DNS server cannot be reached

      Tip

      You can also use the nslookup command in the command line. This command looks up the IP address of the name you enter.

  5. When the website does not exist, you will see (example in the following screenshot):
    • The DNS query and response both carry the transaction ID 0x971e (the same ID in the query and the response indicates that this is the response to that query)
    • A 346 ms delay between the DNS query and response, which suggests that the response came from an overseas server (for example, browsing from Europe when the DNS server is in Taiwan)
    • The request was sent to, and answered by, a.dns.tw (that is, the DNS server is in Taiwan), which means that the DNS system works properly and your PC queried one of the authoritative DNS servers for .tw
    • The response is No such name, which means that the requested domain name does not exist
  6. When the DNS server does not respond, you will see one of the following:
    • The DNS refused message: In this case, your DNS server refuses the request. This is illustrated in the following screenshot (you will learn why in the How it works... section):
    • The DNS consecutive queries: In this case, the DNS server simply does not answer. This is illustrated in the following screenshot:

    When you right-click on one of the packets in the preceding screenshot and choose Follow UDP Stream, you will see that the DNS resolver on your PC sends several queries (with increasing time intervals between them), and then stops. This is shown in the next screenshot:

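The matching of a response to its query, and the reading of the response code, can be sketched in a few lines of Python. This is only an illustration of the DNS header layout, not what Wireshark does internally. The helper names (build_query, parse_header, matches) and the name www.example.tw are hypothetical, and the response bytes are hand-crafted rather than taken from the capture.

```python
import struct


def build_query(name: str, txid: int, qtype: int = 1) -> bytes:
    """Build a minimal DNS query with one question and recursion desired."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length; a zero byte ends the name
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A record) and QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_header(message: bytes) -> dict:
    """Extract the transaction ID, the QR bit, and the RCODE."""
    txid, flags = struct.unpack("!HH", message[:4])
    return {
        "id": txid,
        "is_response": bool(flags & 0x8000),  # QR bit: 1 means response
        "rcode": flags & 0x000F,              # low 4 bits: response code
    }


def matches(query: bytes, response: bytes) -> bool:
    """True when `response` answers `query` (same transaction ID)."""
    q, r = parse_header(query), parse_header(response)
    return (not q["is_response"]) and r["is_response"] and q["id"] == r["id"]


# Hypothetical example: a query and a hand-crafted "No such name" reply
# (flags 0x8183 = QR + RD + RA + RCODE 3) that echoes the question section.
query = build_query("www.example.tw", 0x971E)
response = bytes.fromhex("971e81830001000000000000") + query[12:]
print(matches(query, response), parse_header(response)["rcode"])
```

An RCODE of 3 here corresponds to the No such name response, and 5 to the refused message described above.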

DNS slow responses

How will you know that this is the problem?

  1. When you are browsing the Internet and getting very slow responses, perform the following steps:
    1. Port mirror the connection to the Internet, and check if you have any bottleneck on the way to the Internet. You can use the IO graphs for this purpose, as described in Chapter 5, Using Advanced Statistics Tools.
    2. Verify that you don't have a significant number of retransmissions or duplicate ACKs indicating a connection problem.
    3. Verify that you don't have any window-related problem, such as zero window or window full.
  2. If none of the preceding checks reveals a problem, it might be a DNS problem. You can have DNS problems in two cases:
    • When working in your organization
    • When connecting to the Internet
  3. To isolate the issue, choose the capture point according to the case:
    • When facing problems in your organization, port mirror the switch port that is connected to the DNS server
    • When facing problems with the Internet, port mirror the switch port that connects your organization to the Internet
  4. Watch the DNS response time that you get. There are several ways to locate the problem, and they are given as follows:
    • The simplest way is to right-click on a packet from a DNS query stream, choose Follow UDP Stream, and then check the time between the query and response.
    • Another way is to use IO graphs for this purpose. In the IO Graphs window, choose Advanced in the Y Axis configuration and configure the filter dns.time with AVG(*) in the Graph lines. Refer to the following screenshot:

    You will get a graph of the DNS response times throughout the capture time.

    In this graph, you will see that most of the response times fall below 100 ms, which is quite reasonable. We have two peaks that indicate a probable problem, one at the beginning of the capture with 300 ms, and one at the end of the capture with 450 ms.

    Tip

    Reasonable times inside the organization (in a local site) should be no more than tens of milliseconds. When browsing the Internet, a good response time should be less than 100 ms, while up to 200 ms is still tolerable.
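The response time check described above can also be sketched outside Wireshark. The following Python example assumes you have already exported the DNS packets as (timestamp in seconds, transaction ID, is response) tuples. That input format and the function names are assumptions made for illustration, and pairing by transaction ID alone ignores the client address and port, which a real tool would also compare. The thresholds come from the preceding tip.

```python
# Thresholds in milliseconds, taken from the tip above
GOOD_MS = 100
TOLERABLE_MS = 200


def response_times(packets):
    """Pair DNS queries with responses by transaction ID.

    `packets` is an iterable of (timestamp_seconds, transaction_id,
    is_response) tuples in capture order. Returns a list of
    (transaction_id, delay_ms) in the order the responses arrived.
    """
    pending = {}  # transaction ID -> timestamp of the first query seen
    delays = []
    for ts, txid, is_response in packets:
        if not is_response:
            # Keep the first query so that resolver retries add to the delay
            pending.setdefault(txid, ts)
        elif txid in pending:
            delays.append((txid, (ts - pending.pop(txid)) * 1000.0))
    return delays


def classify(delay_ms):
    """Label a delay using the Internet browsing thresholds."""
    if delay_ms < GOOD_MS:
        return "good"
    if delay_ms <= TOLERABLE_MS:
        return "tolerable"
    return "slow"


# Hypothetical capture: one fast answer, one slow answer, one unanswered query
capture = [
    (0.000, 0x971E, False),
    (0.050, 0x0001, False),
    (0.080, 0x0001, True),
    (0.346, 0x971E, True),
    (1.000, 0x0002, False),
]
for txid, delay in response_times(capture):
    print(hex(txid), round(delay), classify(delay))
```

Queries that never receive a response simply stay in the pending dictionary, which corresponds to the case of consecutive unanswered queries.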

How it works...

There are six basic types of DNS response codes defined in RFC 1035. Additional error codes (up to 21) were defined in later standards (RFC 2136, RFC 2671, RFC 2845, and RFC 2930). Error codes can be found at http://tools.ietf.org/html/rfc2929#section-2.3.

The most common codes are shown in the following table:

Error code | Name | What is it (RFC 1035) | Why it happens | What to do
--- | --- | --- | --- | ---
0 | No error condition | No error, everything works fine. | This signifies that everything is working. | Be happy.
1 | Format error | The DNS server couldn't interpret the query. | This error code is usually shown when the DNS server does not support DNS extensions, for example, EDNS0 (RFC 2671). | In most cases, there is nothing to do. The DNS request will be sent again without the extension. If the problem still exists, change the DNS server.
2 | Server failure | The DNS server was not able to process the query due to a problem with the name server. | This error code signifies that there is a problem in the DNS server. | Configure another DNS server and check again.
3 | Name error | This is meaningful only for responses that are coming from authoritative name servers. | This error code signifies that the domain name requested in the query does not exist. | Check the domain name.
4 | Not Implemented | The DNS server does not support the requested type of query. | |
5 | Refused | The DNS server refuses to perform the specified operation due to policy reasons. A name server may not wish to provide the information to the particular requester. A name server may not wish to perform a particular operation. | This occurs due to connectivity problems, if the forward DNS is not configured, or if there is a problem in one of the DNS servers on the way. |
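When you go through many responses, it can help to translate the numeric response code into its name and the suggested action. The following Python sketch encodes the codes from the table above as a lookup. The RCODES dictionary and the describe_rcode function are hypothetical names, and the actions are only the ones listed in the table (codes 4 and 5 have none). In Wireshark, the same value is available as the dns.flags.rcode field.

```python
# Response codes 0-5 as listed in the table above: (name, suggested action)
RCODES = {
    0: ("No error", "Nothing to do."),
    1: ("Format error",
        "Usually retried without the extension; otherwise change the DNS server."),
    2: ("Server failure", "Configure another DNS server and check again."),
    3: ("Name error", "Check the domain name."),
    4: ("Not implemented", None),
    5: ("Refused", None),
}


def describe_rcode(rcode: int) -> str:
    """Return a one-line description of a DNS response code."""
    name, action = RCODES.get(rcode, ("Not one of the six basic codes", None))
    suffix = action if action else "no action listed in the table"
    return f"RCODE {rcode} ({name}): {suffix}"


for code in (0, 3, 5):
    print(describe_rcode(code))
```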

There's more...

What DNS server should I configure? I have been asked this question many times. My answer is simple: a server that is physically close to you (that is, not an overseas server), and one that you know is efficient. An efficient server that is overseas will give slow responses because of the communication lines, and a nearby inefficient server will also give you slow response times.


In the preceding graph, we see a measurement taken with the open source Google Namebench tool. It shows the following details:

  • Average DNS response time of 80 ms to our local DNS server (you can see it is local from the private address 10.0.0.138)
  • Average response time of 100 ms to the DNS server of my ISP
  • Response times of 120 ms and above to the servers located overseas

To summarize, response times of around 100 ms are OK, and in most cases 150-200 ms will also be good enough. Don't worry if there are momentary peaks; it can be that your resolver is querying authoritative servers on the other side of the globe.
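If you want a rough feel for these numbers without a dedicated tool, you can time lookups from Python. This is a minimal sketch and not a replacement for Namebench: it measures the system resolver as an application sees it (including any local cache), it cannot target a specific DNS server, and the function names are hypothetical. The resolve parameter exists only so that a different lookup function can be plugged in.

```python
import socket
import time


def time_lookup(name, resolve=socket.gethostbyname):
    """Return (elapsed_ms, result) for one lookup; result is None on failure."""
    start = time.perf_counter()
    try:
        result = resolve(name)
    except OSError:
        # socket.gaierror (resolution failure) is a subclass of OSError
        result = None
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, result


def average_ms(names, resolve=socket.gethostbyname):
    """Average lookup time in milliseconds over a list of names."""
    times = [time_lookup(name, resolve)[0] for name in names]
    return sum(times) / len(times) if times else 0.0
```

For example, average_ms(["www.packtpub.com", "www.cisco.com"]) returns the average time for the two names. Run it twice and expect the second run to be faster if the answers are cached.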

When you open a web page that holds a lot of content, your browser can send tens of DNS queries. In the following screenshot, you see what happens when I open the browser to www.cisco.com.


It starts with a DNS query for the A record of www.cisco.com (marked as 1 in the preceding screenshot), followed by queries for ap.ff.avast.com (marked as 2), which is the web shield server of Avast antivirus, www.static-cisco.com (marked as 3), ciscosystems.tt.omtrdc.net (marked as 4), and the news (marked as 5), products (marked as 6), and newsroom (marked as 7) sites.

When we look at the response time graph (shown in the next screenshot), we see that the DNS response times are up to 600 ms. This explains why it took a few seconds to open the entire web page of Cisco.
