Load balancing our application

Sometimes, clustering our app is not enough and we need to scale our application horizontally.

There are a number of ways to horizontally scale an app. Nowadays, nearly every cloud provider, Amazon included, offers its own load balancing solution with a range of features.

One of my preferred ways of implementing load balancing is with NGINX.

NGINX is a web server with a strong focus on concurrency and low memory usage. It is also a perfect fit for Node.js applications, because serving static resources from within a Node.js application is highly discouraged: it puts the application under stress with a task that another piece of software, such as NGINX, does better (which is another example of specialization).
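To illustrate the point, here is a minimal sketch of an NGINX server block that serves static assets directly from disk and proxies everything else to the Node.js application; the paths and port are assumptions for this example, not values used elsewhere in this chapter:

  server {
    listen 80;

    # Hypothetical location of the static assets on disk
    location /static/ {
      root /var/www/app;
    }

    # Everything else is proxied to the Node.js application
    location / {
      proxy_pass http://127.0.0.1:3000;
    }
  }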

However, let's focus on load balancing. The following figure shows how NGINX works as a load balancer:

Figure: Two PM2 clusters load balanced by an instance of NGINX

As you can see in the preceding diagram, we have two PM2 clusters load balanced by an instance of NGINX.

The first thing we need to do is understand how NGINX manages its configuration.

On Linux, NGINX can be installed via yum, apt-get, or any other package manager. It can also be built from source, but unless you have very specific requirements, the recommended method is to use a package manager.

By default, the main configuration file is /etc/nginx/nginx.conf, as follows:

user  nginx;
worker_processes  1;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
  worker_connections  1024;
}

http {
  include                     /etc/nginx/mime.types;
  default_type                application/octet-stream;

  log_format          main '$remote_addr - $remote_user [$time_local] "$request" '
     '$status $body_bytes_sent "$http_referer" '
     '"$http_user_agent" "$http_x_forwarded_for" '
     '$request_time';

  access_log                  /var/log/nginx/access.log  main;
  server_tokens               off;
  sendfile                    on;
  #tcp_nopush                 on;
  keepalive_timeout           65s;
  send_timeout                15s;
  client_header_timeout       15s;
  client_body_timeout         15s;
  client_max_body_size        5m;
  ignore_invalid_headers      on;
  fastcgi_buffers             16 4k;
  #gzip                       on;
  include                     /etc/nginx/sites-enabled/*.conf;
}

This file is pretty straightforward: it specifies the number of workers (remember, processes that serve requests), the location of the error log, the number of connections a worker can keep active at a time, and finally, the HTTP configuration.

The last line is the most interesting one: we are instructing NGINX to include the files matching /etc/nginx/sites-enabled/*.conf as additional configuration.

With this configuration, every file ending in .conf under the specified folder is going to be part of the NGINX configuration.
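As a side note, a common convention on Debian-like systems (an assumption here, not a requirement of this chapter) is to keep the actual files in /etc/nginx/sites-available/ and enable a site by symlinking it into the included folder:

sudo ln -s /etc/nginx/sites-available/app.conf /etc/nginx/sites-enabled/app.conf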

As you can see, a default file already exists there. Modify it to look as follows (note that this file is included from inside the http block of nginx.conf, so it must not declare an http block of its own):

upstream app {
  server 10.0.0.1:3000;
  server 10.0.0.2:3000;
}

server {
  listen 80;

  location / {
    proxy_pass http://app;
  }
}

This is all the configuration we need to build a load balancer. Let's walk through it:

  • The upstream app directive creates a group of servers called app. Inside this directive, we specify the two servers we saw in the previous diagram.
  • The server directive tells NGINX to listen for all requests on port 80 and pass them to the upstream group called app (a slightly extended sketch follows this list).
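In practice, when proxying to a Node.js application, it is also common to forward the original client details, since the app would otherwise only see the load balancer's address. This is an optional addition on my part, not part of the chapter's configuration:

  server {
    listen 80;

    location / {
      proxy_pass http://app;
      # Forward the original host and client address to the Node.js app;
      # note that the log format in the main nginx.conf already references
      # $http_x_forwarded_for
      proxy_set_header Host $host;
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
  }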

Now, how does NGINX decide which server to send a given request to?

At this point, we can specify the strategy used to spread the load. By default, when no balancing method is specifically configured, NGINX uses Round Robin.

Round Robin is the most elementary way of distributing load from a queue of work to a number of workers: it rotates through them so that every node gets roughly the same number of requests.

One thing to bear in mind is that if we use Round Robin, our application should be stateless: we won't always hit the same machine, so if we save state in one server, it might not be there on the following call.

There are other mechanisms to spread the load, as follows:

  upstream app {
    least_conn;
    server 10.0.0.1:3000;
    server 10.0.0.2:3000;
  }

Least connected (least_conn), as its name indicates, sends each request to the node with the fewest active connections, which evens out the load when some requests take longer than others:

  upstream app {
    ip_hash;
    server 10.0.0.1:3000;
    server 10.0.0.2:3000;
  }

IP hashing (ip_hash) is an interesting way of distributing the load. If you have ever worked on a web application, you will know that sessions are present in almost every app. In order to remember who the user is, the browser sends a cookie to the server, which has stored in memory who the user is and what data that user needs or is allowed to access. The problem with the other types of load balancing is that we are not guaranteed to always hit the same server.

For example, if we are using Least connected as the balancing policy, we could hit server one on the first request and a different server on subsequent requests, which would result in the user not being shown the right information, as the second server won't know who the user is.

With IP hashing, the load balancer calculates a hash for a given IP that maps to a number from 1 to N, where N is the number of servers; the user will then always be directed to the same machine for as long as they keep the same IP.

We can also apply a weight to the load balancing, as follows:

  upstream app {
    server 10.0.0.1:3000 weight=5;
    server 10.0.0.2:3000;
  }

This will distribute the load in such a way that, for every six requests, five are directed to the first machine and one to the second.

Once we have chosen our preferred load balancing method, we can restart NGINX for the changes to take effect, but first we want to validate the configuration, as shown in the following figure:

Figure: Validating the NGINX configuration with the configtest command
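From the command line, the validation shown in the figure can be run as follows (assuming the init script supports the configtest action referenced below; nginx -t is the equivalent check using the binary directly):

# Validate the configuration before applying it
sudo /etc/init.d/nginx configtest

# Equivalent check using the nginx binary directly
sudo nginx -t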

As you can see, the configuration test can be really helpful in order to avoid configuration disasters.

Once NGINX has passed configtest, it is guaranteed to be able to start, restart, or reload without any syntax problems, as follows:

sudo /etc/init.d/nginx reload

Reload will gracefully wait until the old worker processes have finished their work, and then reload the configuration and route new requests with the new configuration.

If you are interested in learning more about NGINX, I found the official documentation quite helpful:

http://nginx.org/en/docs/

Health check on NGINX

Health checking is one of the most important activities on a load balancer: what happens if one of the nodes suffers a critical hardware failure and is unable to serve any more requests?

To deal with this, NGINX comes with two types of health checks: passive and active.

Passive health check

Here, NGINX is configured as a reverse proxy (as we did in the preceding section) and reacts to certain types of responses from the upstream servers.

If an error comes back, NGINX marks the node as faulty and removes it from the load balancing rotation for a certain period of time before reintroducing it. With this strategy, the number of failures seen by clients is drastically reduced, as NGINX keeps faulty nodes out of the rotation.

There are a few configurable parameters: max_fails sets the number of failed attempts required to mark a node as unavailable, and fail_timeout sets both the window in which those failures are counted and how long the node stays marked as unavailable.
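These parameters are set per server inside the upstream block. A minimal sketch (the thresholds here are arbitrary example values, not recommendations):

  upstream app {
    # Mark a node as unavailable after 3 failed attempts within 30 seconds
    # and keep it out of the rotation for those 30 seconds
    server 10.0.0.1:3000 max_fails=3 fail_timeout=30s;
    server 10.0.0.2:3000 max_fails=3 fail_timeout=30s;
  }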

Active health check

Active health checks, unlike passive health checks, actively issue connections to the upstream servers to check whether they are responding correctly or experiencing problems.

The most simplistic configuration for active health checks in NGINX is the following one (note that the health_check directive is only available in the commercial NGINX Plus distribution):

upstream app {
  zone app 64k;
  server 10.0.0.1:3000;
  server 10.0.0.2:3000;
}

server {
  listen 80;

  location / {
    proxy_pass http://app;
    health_check;
  }
}

There are two new lines in this config file, as follows:

  • health_check: This enables the active health check. By default, it issues a request every five seconds to the host and port specified in the upstream group.
  • zone app 64k: This declares a shared memory zone of 64 KB that keeps the runtime state of the upstream group across worker processes; NGINX requires it when enabling health checks.

There is a wide range of options for configuring more specific health checks, all described in the NGINX documentation, and they can be combined to satisfy the needs of different users.
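As a sketch of such a combination (the /health endpoint is a hypothetical URI that your application would have to expose, the numbers are arbitrary, and, as noted above, this requires NGINX Plus):

  location / {
    proxy_pass http://app;
    # Probe the hypothetical /health endpoint every 10 seconds; mark a node
    # as down after 3 failed checks and as up again after 2 successful ones
    health_check uri=/health interval=10 fails=3 passes=2;
  }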
