Applying autoscaling

You learned about autoscaling, including predictive and reactive autoscaling, in the Design for scale section of Chapter 4, Principles of Solution Architecture Design. The concept of autoscaling became popular with the agility provided by cloud computing platforms. Cloud infrastructure allows you to easily scale your server fleet up or down based on user or resource demand.

With a public cloud platform such as AWS, you can apply autoscaling at every layer of your architecture. In the presentation layer, you can scale the web server fleet based on the number of incoming requests, and at the application layer, you can scale based on the servers' memory and CPU utilization. You can also perform scheduled scaling if you know the traffic pattern and can predict when the server load is going to increase. At the database level, autoscaling is available for relational databases such as Amazon Aurora Serverless and Microsoft Azure SQL Database. A NoSQL database such as Amazon DynamoDB can be autoscaled based on throughput capacity.

When configuring autoscaling, you need to define the desired number of server instances. You also need to define the maximum and minimum server capacity as per your application's scaling needs. The following screenshot illustrates the autoscaling configuration from AWS Cloud:

Autoscaling configuration

In the preceding autoscaling configuration, if three web server instances are running, the fleet can scale up to 5 instances if the CPU utilization of the servers goes above 50% and scale down to 2 instances if the CPU utilization goes below 20%. If an instance becomes unhealthy, the count of healthy instances drops below the desired capacity. In that case, the load balancer, which monitors instance health, triggers autoscaling to provision new instances as required.
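The scaling behavior described above can be sketched in a few lines of Python. This is a minimal illustration only, not how AWS implements autoscaling: the names `ScalingPolicy`, `next_capacity`, and `instances_to_launch` are hypothetical, and the one-instance-per-step rule is an assumption, since the text only gives the thresholds and the capacity limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy mirroring the numbers quoted in the text."""
    min_size: int = 2            # never scale in below this
    max_size: int = 5            # never scale out above this
    desired: int = 3             # capacity to hold when load is normal
    scale_out_cpu: float = 50.0  # add capacity above this average CPU %
    scale_in_cpu: float = 20.0   # remove capacity below this average CPU %


def next_capacity(policy: ScalingPolicy, current: int, avg_cpu: float) -> int:
    """Return the target instance count after one scaling evaluation.

    Assumption: capacity changes by one instance per evaluation.
    """
    if avg_cpu > policy.scale_out_cpu:
        target = current + 1
    elif avg_cpu < policy.scale_in_cpu:
        target = current - 1
    else:
        target = current
    # Clamp the target to the configured minimum and maximum.
    return max(policy.min_size, min(policy.max_size, target))


def instances_to_launch(target: int, healthy: int) -> int:
    """Return how many replacements are needed to reach the target count."""
    return max(0, target - healthy)


policy = ScalingPolicy()
scaled_out = next_capacity(policy, current=3, avg_cpu=65.0)
capped = next_capacity(policy, current=5, avg_cpu=90.0)
replacements = instances_to_launch(policy.desired, healthy=2)
```

Here `scaled_out` moves from three instances to four, `capped` stays at the maximum of five even under heavy load, and `replacements` is one because a single unhealthy instance has dropped the fleet below the desired capacity.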

Autoscaling is a good feature to have, but make sure you set the maximum capacity and scaling thresholds so that a change in CPU usage cannot drive up cost without limit. In the case of unforeseen traffic due to events such as a distributed denial of service (DDoS) attack, autoscaling can increase cost significantly. You should plan to protect your system against such events. You will learn more about this in Chapter 8, Security Considerations.
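To see why the maximum capacity acts as a cost ceiling, consider the following rough estimate. It is a sketch under stated assumptions: the function name `worst_case_cost` is hypothetical, the hourly price of 0.10 is an invented figure for illustration rather than a real AWS price, and only instance-hours are counted (data transfer and other charges are ignored).

```python
def worst_case_cost(max_instances: int, hourly_price: float, hours: float) -> float:
    """Upper bound on instance cost if the fleet stays at maximum capacity."""
    return max_instances * hourly_price * hours


# Hypothetical price per instance-hour, used only for illustration.
ASSUMED_HOURLY_PRICE = 0.10

# A fleet capped at 5 instances versus one capped at 50, pinned at the
# maximum for a full day (for example, during a sustained traffic flood).
capped_at_5 = worst_case_cost(5, ASSUMED_HOURLY_PRICE, hours=24)
capped_at_50 = worst_case_cost(50, ASSUMED_HOURLY_PRICE, hours=24)
```

Under these assumptions, raising the maximum from 5 to 50 instances raises the worst-case daily bill by a factor of ten, which is why the maximum should reflect what you are willing to pay and not only what peak traffic might demand.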

At the instance level, you need high-performance computing (HPC) to perform tasks such as manufacturing simulation or genome analysis. HPC performs well when you place all instances in the same network, close to each other, for low-latency data transfer between the nodes of a cluster. Between your data centers or clouds, you can choose to use a private network, which can provide an additional performance benefit. For example, to connect your data center to the AWS cloud, you can use AWS Direct Connect. Direct Connect can provide 10 Gbps private fiber-optic lines, where network latency is much lower than when sending data over the internet.

In this section, you have learned about various networking components that can help to improve application performance. You can optimize your application's network traffic according to your user location and application demand. Performance monitoring is an essential part of your application, and you should monitor proactively to improve customer experience. Let's learn more about performance monitoring.
