Reducing latency

Latency can be a major factor in product adoption, as users expect fast applications. No matter where your users are located, you need to provide a reliable service for your product to grow. You may not be able to achieve zero latency, but the goal should be to keep response time within users' tolerance limits.

Latency is the time delay between the user sending a request and receiving the desired response. As shown in the following diagram, it takes 600 ms for a client to send a request to the server and 900 ms for the server to respond, which introduces a total latency of 1.5 seconds (1500 ms):

Request response latency in a client-server model
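To make this concrete, here is a minimal sketch (in Python, with a hypothetical URL) that measures the request-response latency of a single HTTP call from the client's point of view:

```python
import time
import urllib.request

# Hypothetical endpoint, used purely for illustration.
URL = "https://example.com/api/health"

start = time.perf_counter()          # clock just before the request leaves
with urllib.request.urlopen(URL, timeout=5) as response:
    response.read()                  # wait for the full response body
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"Request-response latency: {elapsed_ms:.0f} ms")
```

This captures the total delay the user experiences: network time in both directions plus the server's processing time.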

Most applications need to be accessible over the internet to serve a diverse set of global users. These users expect consistent performance regardless of their geographical location, which can be challenging, as it takes time to move data over the network from one part of the world to another.

Network latency can be caused by various factors, such as the transmission medium, router hops, and network propagation delay. A request sent over the internet often hops across multiple routers, and each hop adds latency. Enterprises commonly use dedicated fiber-optic lines to connect their corporate network to the cloud, which helps avoid this inconsistency.
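One way to observe the network's contribution in isolation is to time just the TCP handshake, since no application processing is involved. The following sketch approximates network round-trip latency that way (the hostname is a placeholder):

```python
import socket
import time

def tcp_connect_latency_ms(host: str, port: int = 443) -> float:
    """Approximate network round-trip latency as the time to
    complete a TCP three-way handshake with the given host."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=5):
        pass  # the connection succeeding is all we need to time
    return (time.perf_counter() - start) * 1000

# example.com is a placeholder; substitute a host relevant to you.
print(f"TCP connect latency: {tcp_connect_latency_ms('example.com'):.0f} ms")
```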

In addition to the network, latency can be introduced by various components of the architecture. At the infrastructure level, your compute server can have latency issues due to memory and processor bottlenecks, where data transfer between the CPU and RAM is slow. A disk can add latency due to slow reads and writes. Latency in a hard disk drive (HDD) depends on the time it takes for the selected disk sector to rotate into position under the read/write head.

The disk sector is the physical location of data on the disk platter. During write operations, data is distributed across sectors as the disk continuously rotates, so data can end up written in random locations. During a read operation, the head has to wait for the rotation to bring the required sector underneath it.
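As a rough illustration of this rotational latency: on average, the head waits half a revolution for the target sector to come around, so the average wait is 60,000 / RPM / 2 milliseconds. A short sketch:

```python
# Average rotational latency: on average the head waits half a
# revolution for the target sector to come around.
def avg_rotational_latency_ms(rpm: int) -> float:
    ms_per_revolution = 60_000 / rpm   # one full revolution, in ms
    return ms_per_revolution / 2       # average wait is half of that

for rpm in (5400, 7200, 15000):
    print(f"{rpm:>5} RPM drive: ~{avg_rotational_latency_ms(rpm):.2f} ms")
# 5400 RPM -> ~5.56 ms, 7200 RPM -> ~4.17 ms, 15000 RPM -> ~2.00 ms
```

This is why solid-state drives, which have no moving parts, offer much lower and more predictable read latency than HDDs.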

At the database level, latency can be caused by slow reads and writes due to hardware bottlenecks or slow query processing. Taking load off the database by distributing data with partitioning and sharding can help reduce latency. At the application level, latency can come from transaction-processing issues in code, which may need to be addressed with garbage-collection tuning and multithreading. Achieving low latency means higher throughput, as latency and throughput are directly related, so let's learn more about throughput.
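To illustrate how sharding spreads that load, here is a hedged sketch of hash-based shard routing (the shard names are hypothetical; in practice they would be connection strings or database handles). Each key deterministically maps to one shard, so no single database instance handles all the traffic:

```python
import hashlib

# Hypothetical shard list, standing in for four database instances.
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(key: str) -> str:
    """Route a key to a shard by hashing it, so reads and writes
    for different keys spread evenly across the databases."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for("user-42"))   # the same key always maps to the same shard
print(shard_for("user-43"))   # different keys spread across shards
```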
