Right-sizing infrastructure

The point of optimizing your infrastructure is to protect your company's revenue while minimizing the cost of operating that infrastructure. Your goal should be to ensure that users don't encounter high latency (otherwise known as bad performance) or, worse, unfulfilled or dropped requests, all while ensuring that your venture remains a sustainable endeavor.

The three pillars of web application performance are as follows:

  1. CPU utilization
  2. Memory usage
  3. Network bandwidth

I have intentionally left disk access out of the key consideration metrics, since only particular workloads executed on an application server or data store are affected by it. Disk access would rarely impact the performance of serving a web application as long as application assets are delivered by a Content Delivery Network (CDN). That said, still keep an eye out for any unexpected runaway disk access, such as high-frequency creation of temp and log files. Docker, for example, can spit out logs that can easily fill up a drive.

In an ideal scenario, CPU, memory, and network bandwidth should be utilized evenly, at around 60-80% of available capacity. If you encounter performance issues due to other factors, such as disk I/O, a slow third-party service, or inefficient code, most likely one of these metrics will peak at or near maximum capacity while the other two idle or remain severely underutilized. This is an opportunity to use more CPU, memory, or bandwidth to compensate for the performance issue and to utilize available resources more evenly.

The reason behind targeting 60-80% utilization is to allow for some time for a new instance (server or container) to be provisioned and ready to serve users. After your predefined threshold has been crossed, while a new instance is provisioned, you can continue serving the increasing number of users, thus minimizing unfulfilled requests.
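The scale-up decision described above can be sketched in a few lines. This is a minimal illustration, not a production autoscaler: the 80% threshold matches the range discussed in the text, while the sample window and the metric source are assumptions for the example.

```python
# Sketch: decide to provision a new instance before capacity is exhausted,
# leaving headroom for the time a new server or container needs to start.

SCALE_UP_THRESHOLD = 0.80  # upper end of the 60-80% target range


def should_scale_up(samples: list[float], threshold: float = SCALE_UP_THRESHOLD) -> bool:
    """Scale when *sustained* utilization crosses the threshold.

    Averaging over a short window of recent samples avoids reacting
    to a momentary spike with an expensive provisioning action.
    """
    return sum(samples) / len(samples) >= threshold


# Recent per-minute CPU utilization samples, as fractions of capacity:
print(should_scale_up([0.78, 0.82, 0.85]))  # sustained load above 80% -> True
print(should_scale_up([0.50, 0.60, 0.55]))  # comfortable headroom -> False
```

In practice, the samples would come from your monitoring system, and the positive decision would trigger your provisioning hook rather than a print.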

Throughout this book, I have discouraged over-engineering and perfect solutions. In today's complicated IT landscape, it is nearly impossible to predict where you will encounter performance bottlenecks. Your engineering team may, very easily, spend $100,000+ worth of engineering hours when the solution to your problem may be a few hundred dollars of new hardware, whether a network switch, a solid-state drive, a faster CPU, or more memory.

If your CPU is too busy, you may want to introduce more bookkeeping logic to your code, via indexes, hash tables, or dictionaries, that you can cache in memory to speed up subsequent or intermediary steps of your logic. For example, if you are constantly running array lookup operations to locate particular properties of a record, saving the ID and/or the property of the record in a hash table that you keep in memory will reduce your runtime cost from O(n) down to O(1).
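The trade-off above can be shown directly: a linear scan versus a one-time index. The record shape and field names here are invented for illustration.

```python
# Records as they might arrive from a data store.
records = [
    {"id": 101, "email": "a@example.com"},
    {"id": 102, "email": "b@example.com"},
    {"id": 103, "email": "c@example.com"},
]


# O(n): scan the whole list on every lookup, burning CPU each time.
def find_by_id_scan(records, record_id):
    for rec in records:
        if rec["id"] == record_id:
            return rec
    return None


# O(1): build the index once, spending memory to amortize the cost
# across every later lookup.
by_id = {rec["id"]: rec for rec in records}

print(find_by_id_scan(records, 102)["email"])  # b@example.com
print(by_id[102]["email"])                     # b@example.com, no scan
```

With a handful of records the difference is invisible; with millions of records and lookups per request, the index is the difference between an idle CPU and a pegged one.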

Following the preceding example, you may end up using too much memory for hash tables. In this case, you may want to offload or transfer caches more aggressively, using your spare network bandwidth, to slower but more plentiful data stores, such as a Redis instance.
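One way to structure such an offload is a two-tier cache: a small, bounded in-memory tier, with overflow evicted to the slower store. In this sketch a plain dictionary stands in for the remote store so the example runs without a server; in production the `remote` tier would be backed by something like Redis, and each access to it would cost a network round trip.

```python
from collections import OrderedDict


class TieredCache:
    """Keep the hottest entries in memory; evict the rest to a slower store.

    `remote` is a stand-in for a network-attached store such as Redis;
    a plain dict is used here so the sketch runs self-contained.
    """

    def __init__(self, max_local: int, remote=None):
        self.local = OrderedDict()   # insertion order doubles as LRU order
        self.remote = remote if remote is not None else {}
        self.max_local = max_local

    def put(self, key, value):
        self.local[key] = value
        self.local.move_to_end(key)  # mark as most recently used
        while len(self.local) > self.max_local:
            old_key, old_value = self.local.popitem(last=False)
            self.remote[old_key] = old_value  # spend bandwidth, save memory

    def get(self, key):
        if key in self.local:
            self.local.move_to_end(key)
            return self.local[key]
        return self.remote.get(key)  # slower round trip in production


cache = TieredCache(max_local=2)
cache.put("a", 1)
cache.put("b", 2)
cache.put("c", 3)          # "a" is the least recently used, so it is evicted
print(cache.get("a"))      # 1, served from the slower remote tier
```

The design choice is the one the text describes: trade a little latency and network bandwidth on cold keys for a bounded memory footprint on the application server.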

If your network utilization is too high, you may want to investigate using CDNs with expiring links, client-side caching, throttling requests, limiting API access for customers abusing their quotas, or optimizing your instances to have disproportionately more network capacity relative to their CPU or memory capacity.
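Of the mitigations listed, request throttling is the easiest to sketch. A common technique (one option among several, not prescribed by the text) is a token bucket: each client gets a bucket that refills at a steady rate, and requests are rejected once it is empty. The capacity and refill rate below are illustrative, not recommendations.

```python
class TokenBucket:
    """Token-bucket rate limiter for throttling clients that exceed their quota."""

    def __init__(self, capacity: int, refill_per_second: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_second = refill_per_second
        self.last_refill = 0.0  # timestamp of the last refill, in seconds

    def allow(self, now: float) -> bool:
        # Refill proportionally to the time elapsed, capped at capacity.
        elapsed = now - self.last_refill
        self.last_refill = now
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(capacity=3, refill_per_second=1.0)
# Three rapid requests pass; a fourth in the same instant is rejected.
print([bucket.allow(now=0.0) for _ in range(4)])  # [True, True, True, False]
print(bucket.allow(now=1.0))  # one second later, one token has refilled -> True
```

In a real deployment, `now` would come from a monotonic clock and there would be one bucket per API key, typically kept in a shared store so all instances enforce the same limit.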
