TCP latency issues

So far we have been troubleshooting connection-related issues. In this section, we turn to latency, which can originate on the network or in application processing on the client or the server.

Cause of latency

Identifying the source of latency also plays an important role in TCP troubleshooting. Let's see what the common causes of latency are:

  • Slow network (wire latency); measure it with the ping utility
  • Too many running processes consuming memory; use the free and top commands to check CPU and memory use
  • Application not started with sufficient memory, or unable to serve more requests
  • Bad TCP tuning; verify the /etc/sysctl.conf file
  • Network jitter; verify your network and check with the network administrator
  • Poor coding; benchmark your code by performing a load test over the network
  • Gateway wrongly set; check the gateway and verify the routing table
  • Higher hop counts; run traceroute and check the number of hops (the more hops, the higher the latency)
  • Slow NIC, or an interface that goes down; check the NIC and verify its speed
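
The TCP tuning check in the list above can be partly scripted. The following is a minimal sketch, assuming sysctl.conf-style "key = value" text; the parameter names are the ones this section lists later for the same file, while the function name and the sample values are hypothetical and are not a tuning recommendation.

```python
# Parameter names taken from the tuning step later in this section.
TCP_TUNING_KEYS = (
    "net.core.rmem_max",
    "net.core.wmem_max",
    "net.ipv4.tcp_rmem",
    "net.ipv4.tcp_wmem",
    "net.ipv4.tcp_mem",
)


def parse_sysctl(text):
    """Return {key: [int, ...]} for the TCP tuning keys found in the text."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines ('#' or ';').
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key = key.strip()
        if key in TCP_TUNING_KEYS:
            settings[key] = [int(v) for v in value.split()]
    return settings


# Hypothetical file contents, for illustration only.
SAMPLE = """\
# example values, not a recommendation
net.core.rmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
"""

settings = parse_sysctl(SAMPLE)
missing = [key for key in TCP_TUNING_KEYS if key not in settings]
print(settings)
print("not set in this file:", missing)
```

To check a real host, the same function could be given the contents of /etc/sysctl.conf instead of SAMPLE; keys reported as missing are simply left at the kernel defaults.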

Identifying latency

Various network utilities are available to measure latency between hosts, for example traceroute, tcpping, and ping.

  • ping: This utility can be used to measure the round trip time (RTT):
    bash$ ping -c4 google.com
    PING google.com (216.58.196.110): 56 data bytes
    64 bytes from 216.58.196.110: icmp_seq=0 ttl=55 time=226.034 ms
    64 bytes from 216.58.196.110: icmp_seq=1 ttl=55 time=207.748 ms
    64 bytes from 216.58.196.110: icmp_seq=2 ttl=55 time=222.995 ms
    64 bytes from 216.58.196.110: icmp_seq=3 ttl=55 time=162.507 ms
    
    --- google.com ping statistics ---
    4 packets transmitted, 4 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 162.507/204.821/226.034/25.394 ms
    
  • traceroute: This is used to identify the number of hops taken to reach the destination; the fewer the hops, the lower the latency
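
When ping is run repeatedly, it can help to pull the RTT figures out of its summary line rather than read them by eye. The following is a minimal sketch that parses the summary line from the ping output above; the function name is hypothetical, and the pattern assumes either the "round-trip ... stddev" form shown here or the "rtt ... mdev" form printed by Linux ping.

```python
import re

# Matches the summary line of BSD/macOS ping ("round-trip ... stddev")
# and of Linux ping ("rtt ... mdev").
SUMMARY_RE = re.compile(
    r"(?:round-trip|rtt) min/avg/max/(?:stddev|mdev) = "
    r"([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_summary(output):
    """Return the min/avg/max/deviation RTT values in milliseconds."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no RTT summary line found")
    names = ("min", "avg", "max", "dev")
    return dict(zip(names, map(float, match.groups())))


# Summary line copied from the ping run above.
sample = "round-trip min/avg/max/stddev = 162.507/204.821/226.034/25.394 ms"
rtt = parse_ping_summary(sample)
print(rtt)
print(f"spread between fastest and slowest reply: {rtt['max'] - rtt['min']:.3f} ms")
```

A wide spread between the minimum and maximum RTT is one rough hint of jitter on the path.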

Server latency example

Wireshark can be used effectively to identify whether the network is slow or the application is slow. Open the slow_download.pcap file in Wireshark, and investigate the root cause of why the download is slow.

In this example, 5 MB of data is requested from the HTTP server, and it takes approximately 4.99 minutes to download, as the capture shows.


The steps to diagnose this issue are as follows:

  1. Go to Edit | Preferences | Protocols | HTTP and then enable all HTTP reassembly options.
  2. Apply the filter http.response.code==200.
  3. In the HTTP section of the response, note that http.time == 299.816907000 seconds, which is approximately 4.99 minutes.
  4. Check the size of the file: http.content_length_header == "5242880" is the size of the content in bytes.
  5. Check how many TCP segments have been sent (tcp.segment.count == 2584), and ask yourself whether that many are needed or whether the number can be reduced.
  6. Verify window_size for the client and server to check what was advertised by the client and what got used.
  7. Add tcp.window_size_value as a column in Wireshark and sort in ascending order. Note that the entire packet flow from the server (10.0.0.16) to the client (122.167.205.152) has a window size of 100.
  8. Verify the sysctl.conf file in UNIX-flavored systems and check the TCP tuning parameters such as net.core.rmem_max, net.core.wmem_max, net.ipv4.tcp_rmem, net.ipv4.tcp_wmem, and net.ipv4.tcp_mem.
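
The figures read off in the steps above can be turned into rates, which makes it easier to judge how slow the transfer really is. The following is a minimal sketch using only the three values quoted in the steps; the function name is hypothetical, and the result is approximate because the content length excludes HTTP and TCP header bytes.

```python
CONTENT_LENGTH = 5_242_880   # bytes, from http.content_length_header
HTTP_TIME = 299.816907       # seconds, from http.time
SEGMENT_COUNT = 2584         # from tcp.segment.count


def transfer_stats(content_length, seconds, segments):
    """Derive average rates from the values read off the capture."""
    return {
        "bytes_per_second": content_length / seconds,
        "bytes_per_segment": content_length / segments,
        "segments_per_second": segments / seconds,
    }


stats = transfer_stats(CONTENT_LENGTH, HTTP_TIME, SEGMENT_COUNT)
for name, value in stats.items():
    print(f"{name}: {value:,.1f}")
```

With these inputs the average rate works out to roughly 17 KB per second, far below what a healthy link would normally deliver for a 5 MB file.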

Tip

Make sure tcp.window_size stays large enough to avoid slowing down the sender. The window size shows whether a system is too slow at processing incoming data; a small or shrinking tcp.window_size indicates that the receiving system is slow, not the network.
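
The reason a small window slows the sender can be put in numbers: a sender can have at most one window of unacknowledged data in flight per round trip, so throughput is bounded by the window size divided by the RTT. The following is a minimal sketch of that bound; the function name is hypothetical, the 5 ms RTT is an assumed value (the RTT of the capture is not stated in the text), and both window values are treated as plain byte counts with no window scaling.

```python
def max_throughput(window_bytes, rtt_seconds):
    """Upper bound in bytes/second with one window in flight per round trip."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return window_bytes / rtt_seconds


ASSUMED_RTT = 0.005  # seconds; hypothetical value for illustration

small = max_throughput(100, ASSUMED_RTT)      # window size seen in the capture
large = max_throughput(65_535, ASSUMED_RTT)   # largest window without scaling

print(f"window 100 bytes:   at most {small:,.0f} bytes/s")
print(f"window 65535 bytes: at most {large:,.0f} bytes/s")
```

Whatever the real RTT is, the ratio between the two bounds stays the same, which is why raising the window has such a large effect on download time.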

In this scenario, the TCP window size was deliberately reduced through the sysctl.conf file to demonstrate the slow_download behavior and to give an insight into troubleshooting. After the window size was fixed, the same download dropped from 299.816907 seconds to 2.84 seconds. Open the fast_download.pcap file, as shown in the following screenshot, to confirm the reduced download time:


Wire latency

In this example, the TCP handshake process will be used to identify wire latency. Open the slow_client_ack.pcap file as shown in the following screenshot:


As you can see in the preceding screenshot:

  • The first two handshake messages (SYN and SYN-ACK), sent by the client and the server respectively, are exchanged quickly
  • The last handshake message, the ACK sent by the client, arrives at frame.time_relative == 15.798777000 seconds, a jump in Time Since Reference. This is far higher than for the first two handshake messages, which confirms wire latency on this packet
  • Once the handshake is completed, the operation resumes normally; the Time Since Reference for all subsequent packets shows consistent timing
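
The comparison made in the points above amounts to taking the gap between consecutive handshake packets and looking for the one that stands out. The following is a minimal sketch of that comparison; the function name is hypothetical, the ACK time is the frame.time_relative value quoted above, and the SYN and SYN-ACK times are assumed values because the text does not give them.

```python
def handshake_gaps(syn, syn_ack, ack):
    """Return the time gap in seconds for each leg of the three-way handshake."""
    return {
        "syn_to_syn_ack": syn_ack - syn,
        "syn_ack_to_ack": ack - syn_ack,
    }


SYN_TIME = 0.0          # assumed: first packet of the capture
SYN_ACK_TIME = 0.0003   # assumed: hypothetical value for illustration
ACK_TIME = 15.798777    # frame.time_relative of the client ACK, from the text

gaps = handshake_gaps(SYN_TIME, SYN_ACK_TIME, ACK_TIME)
slowest = max(gaps, key=gaps.get)

for leg, seconds in gaps.items():
    print(f"{leg}: {seconds:.6f} s")
print("slowest leg:", slowest)
```

The same calculation applies to any capture once the three relative timestamps have been read from the packet list.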