Load balancing and scaling 

Load balancing is an important ingredient in ensuring the extended availability of software systems. Horizontal application scalability through infrastructure elasticity is accomplished by leveraging a load balancer (LB), whether software or hardware. We need traffic information to predict the load and prescribe the correct countermeasures in time, and deeper real-time analysis of traffic data helps us understand and estimate the load on our applications. The insights from load data help cloud administrators and operations teams collectively formulate viable policies and rules for application scalability, so that additional infrastructure modules can be readied in time to take up the extra load.

API gateways can scale horizontally as well as vertically, which guarantees the high availability of the API gateway solution; otherwise, the gateway can become a single point of failure. To set up a clustered API gateway, we place an LB in front of the gateway. That is, multiple instances of the API gateway solution are leveraged to ensure continuity, and all of those instances run the same configuration; this uniformity allows them to virtualize the same APIs and execute the same policies. If there are multiple API gateway groups, load balancing can be elegantly extended across the groups.
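To make the clustered setup concrete, here is a minimal sketch (in Python, not a production load balancer) of a round-robin balancer fronting several identical API gateway instances. The instance addresses and the /health endpoint are hypothetical placeholders.

import itertools
import urllib.request

# Hypothetical addresses of identical API gateway instances,
# all running the same configuration (same APIs, same policies).
GATEWAY_INSTANCES = [
    "http://gateway-1:8080",
    "http://gateway-2:8080",
    "http://gateway-3:8080",
]

_rotation = itertools.cycle(GATEWAY_INSTANCES)

def is_healthy(instance: str) -> bool:
    """Probe a (hypothetical) health endpoint; an unreachable or
    non-200 instance is skipped by the balancer."""
    try:
        with urllib.request.urlopen(instance + "/health", timeout=1) as resp:
            return resp.status == 200
    except OSError:
        return False

def next_instance() -> str:
    """Round-robin over the gateway instances, skipping unhealthy ones."""
    for _ in range(len(GATEWAY_INSTANCES)):
        candidate = next(_rotation)
        if is_healthy(candidate):
            return candidate
    raise RuntimeError("no healthy API gateway instance available")

Because every instance runs the same configuration, the balancer can hand any request to any healthy instance without further inspection.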

The API gateway does not impose any additional requirements on the LB. That is, user and data loads are balanced based on widely recommended characteristics, including the response time and the system load at that point in time. API gateways are kept stateless so that they are not weighed down by state information; this also enables service messages to take any route to reach their assigned services. The few stateful components, such as caches and counters, are typically held in a distributed cache and are meticulously updated for every unique message. This setup ultimately helps the API gateway complete its obligations without any problem in both sticky and non-sticky modes.
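As an illustration of keeping such state out of the gateway process itself, the sketch below stores a per-client request counter in a shared Redis cache, so any gateway instance can serve any message. The key scheme and the cache host are assumptions, and redis-py is a third-party dependency; any distributed cache with atomic increments would serve the same purpose.

import time
import redis  # third-party client; assumes a reachable Redis server

# Shared distributed cache: every gateway instance points at the same store,
# so counters stay consistent no matter which instance handles a message.
cache = redis.Redis(host="cache.internal", port=6379)  # hypothetical host

def count_request(api_key: str, window_seconds: int = 60) -> int:
    """Atomically bump the per-client counter for the current time window.

    Because the counter lives in the shared cache rather than in gateway
    memory, requests may be routed to any instance, sticky or non-sticky.
    """
    window = int(time.time()) // window_seconds
    key = f"count:{api_key}:{window}"
    count = cache.incr(key)            # atomic increment across all instances
    cache.expire(key, window_seconds)  # drop the counter once the window ends
    return count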

The distributed nature of API gateways imposes one restriction on active/active and active/passive clustering: to avoid losing any counter or cache state, the system has to be designed so that at least one API gateway is active at all times. Precisely speaking, to ensure high availability and reliability, as previously indicated, multiple API gateway instances have to run in a connected and clustered manner. The API gateway maintains zero downtime by deploying configuration changes in a steady, rolling fashion. Generally, an API gateway instance in the cluster takes a few seconds to update its configuration, and while it is being updated, that particular instance does not accept any new requests; all of its existing in-flight requests, however, are fully honored. Meanwhile, the other API gateway instances in the cluster can ceaselessly receive and process new requests and deliver the results. The key role of the load balancer here is to ensure that all incoming requests are pushed only to the API gateway instances that are currently accepting and processing fresh requests. Thus, API gateway clustering is essential for continuously receiving and responding to service messages, and the load balancer plays a vital role in fulfilling this, as illustrated in the following diagram:

[Diagram: a load balancer routing incoming requests across clustered API gateway instances]
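The rolling-update behavior described above can be modeled with a short sketch; the instance states and method names here are invented for illustration and are not part of any specific gateway product. An instance being reconfigured stops accepting new requests and drains its in-flight work, while the balancer routes fresh traffic to the remaining active instances.

from dataclasses import dataclass

@dataclass
class GatewayInstance:
    """Simplified model of one clustered API gateway instance."""
    name: str
    accepting: bool = True  # the LB routes fresh requests here only when True
    in_flight: int = 0      # requests already being processed

    def begin_config_update(self) -> None:
        # Stop taking new requests; existing in-flight work is still honored.
        self.accepting = False

    def finish_config_update(self) -> None:
        # Updating typically takes a few seconds, after which the instance
        # rejoins the pool of candidates for fresh requests.
        self.accepting = True

def route(cluster: list[GatewayInstance]) -> GatewayInstance:
    """The load balancer's job: push incoming requests only to instances
    that are currently accepting fresh requests."""
    candidates = [g for g in cluster if g.accepting]
    if not candidates:
        raise RuntimeError("no active gateway; the cluster must keep one up")
    # Pick the least-loaded active instance (one widely used criterion).
    return min(candidates, key=lambda g: g.in_flight)

# Rolling update: instances are reconfigured one at a time, so at least
# one gateway stays active and zero downtime is maintained.
cluster = [GatewayInstance("gw-1"), GatewayInstance("gw-2"), GatewayInstance("gw-3")]
for gw in cluster:
    gw.begin_config_update()
    target = route(cluster)   # new requests flow to the other instances
    assert target is not gw
    gw.finish_config_update()

Updating one instance at a time is what preserves the invariant stated earlier: at no point does the cluster lose all of its active gateways, so counter and cache state survives every configuration deployment.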