Scaling web applications

As the number of users grows over time, the application needs to scale to keep up with the load. There are two main ways to scale an application:

  • Vertical scaling: This is the simplest approach. It involves upgrading the server with more resources, such as CPU, RAM, storage, and I/O capacity.
  • Horizontal scaling: This is more complicated, but also better in the long run. It involves distributing the load across multiple servers instead of just one.

    Figure 1: Example of vertical versus horizontal scaling

Scaling up vertically – one server

The simplest way to deploy an application is to put everything on a single server; that is, all of the services, such as the database and the web server, run on one machine.


Figure 2: Single server deployment setup

All the processes, such as the database, web server, and application, compete for I/O, CPU, and RAM. As the number of users grows significantly, we can scale the single server by adding more and faster resources to it, for example, an SSD, more CPU cores, and more RAM. This is called vertical scaling. However, there is a limit to how cost-effective it can be compared to scaling horizontally.

Scaling out horizontally – multiple servers

Splitting the application across multiple servers has proven to be a more cost-effective way of scaling. As a matter of fact, companies like Google, Facebook, Amazon, and many others use clusters of servers to serve millions of concurrent users.

The following diagram shows a way to scale in a multi-server deployment:


Figure 3: Multi-server deployment setup

There are many ways of splitting applications, but the following are the main components:

  • Database server(s): The application and database no longer compete for CPU, RAM, and I/O. Now, each tier can have its own servers. The bottleneck could be network bandwidth and latency, so the network should be optimized for the required transfer rates.
  • Load balancer server(s): Now that we have multiple servers, we can use a load balancer to distribute the application load across multiple instances. It can also help mitigate DDoS attacks. However, the load balancer must be configured properly and have enough resources, or it can itself become a performance bottleneck.
  • Caching/reverse proxy server(s): Static files, images, and some HTTP responses can be cached to serve them more quickly and to reduce CPU usage on the application servers. Nginx, for example, is good at this.
  • Database replication server(s): Very often, a web application performs many more reads (showing/searching products) than writes (creating new products). We can add multiple read-only databases (slaves) while keeping a single read-write database (master).
  • Additional services server(s): These are not essential, but they can be useful for monitoring, logging, backups, and so on. Monitoring helps detect when a server is down or reaching maximum capacity. Centralized logging aids debugging. Finally, backups are very useful for restoring the site after a failure.
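The distribution strategy a load balancer applies can be as simple as round-robin. The following is a minimal sketch in Python; the backend addresses are hypothetical and a real deployment would use a dedicated load balancer such as Nginx or HAProxy rather than application code:

```python
from itertools import cycle

# Hypothetical addresses of the application servers behind the balancer.
BACKENDS = ["app1:8000", "app2:8000", "app3:8000"]

def make_round_robin(backends):
    """Return a function that yields the next backend in rotation."""
    pool = cycle(backends)
    return lambda: next(pool)

next_backend = make_round_robin(BACKENDS)
print([next_backend() for _ in range(4)])
# ['app1:8000', 'app2:8000', 'app3:8000', 'app1:8000']
```

Each incoming request is handed to the next server in the rotation, so the load spreads evenly as long as requests are roughly uniform in cost.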
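The effect of a caching layer can be illustrated with a small in-memory sketch: repeated requests for the same path hit the expensive backend only once, which is roughly what a reverse proxy such as Nginx does for static content. The `render` function here is a stand-in for a real request to the application server:

```python
from functools import lru_cache

CALLS = {"count": 0}

def render(path: str) -> str:
    """Stand-in for an expensive call to the application server."""
    CALLS["count"] += 1
    return f"<html>page for {path}</html>"

@lru_cache(maxsize=1024)
def cached_response(path: str) -> str:
    """Serve from cache when possible, falling back to the backend."""
    return render(path)

cached_response("/products")
cached_response("/products")   # served from the cache; render is not called again
print(CALLS["count"])
# 1
```

Real caches also need an expiry policy, since a response cached forever will eventually go stale.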
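Read/write splitting with replication can be sketched as a small router that sends writes to the master and spreads reads across the slaves. The connection strings below are illustrative only; in practice this routing is usually handled by the database driver or a proxy:

```python
import random

# Hypothetical connection endpoints for the replication setup.
MASTER = "db-master:5432"
SLAVES = ["db-slave1:5432", "db-slave2:5432"]

def route(query: str) -> str:
    """Send writes to the master and pick a random slave for reads."""
    verb = query.lstrip().split()[0].upper()
    if verb in ("INSERT", "UPDATE", "DELETE"):
        return MASTER
    return random.choice(SLAVES)

print(route("SELECT * FROM products"))    # one of the slaves
print(route("INSERT INTO products ..."))  # db-master:5432
```

Note that replication is asynchronous in many setups, so a read issued immediately after a write may not yet see it on a slave.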