Chapter 23. Cloud Resource Allocation

To use cloud resources effectively and efficiently, you need to understand how they are allocated, consumed, and charged. Cloud resources can reasonably be divided into two categories:

  • Allocated-capacity resources

  • Usage-based resources

Allocated-Capacity Resource Allocation

Allocated-capacity resources are cloud resources that are allocated in discrete units. You specify how much of a specific type of resource you need, and you are given that amount. That amount is allocated for your use, independent of what your actual needs are at any given moment.

Allocated-capacity cloud resources can be recognized by the following characteristics:

  • They are allocated in discrete units.

  • You specify how many units you want, and they are allocated for your use.

  • If your application uses less of the resource, the allocated resources remain idle and unused.

  • If your application needs more of the resource, the application becomes resource starved.

  • Proper capacity planning is important to avoid both overallocation and underallocation.

The classic example of allocated-capacity cloud resources is servers, such as Amazon EC2 instances. You specify how many instances you want as well as the size of the servers, and the cloud allocates them for your use. Additionally, managed infrastructure components such as cloud databases1 use an allocated-capacity model. In all of these cases, you specify the number of units and their size, and the cloud provider allocates the units for your use.
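To make the allocation step concrete, here is a minimal sketch using boto3, the AWS SDK for Python. The AMI ID and instance type are placeholder values, not recommendations; the point is that you ask for a whole number of discretely sized units.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Request exactly three m5.large instances. The cloud allocates these
    # discrete units to you, and you pay for them whether busy or idle.
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # placeholder AMI ID
        InstanceType="m5.large",          # the "size" of each unit
        MinCount=3,                       # the number of units you want
        MaxCount=3,
    )

    for instance in response["Instances"]:
        print(instance["InstanceId"], instance["State"]["Name"])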

But there are other allocated-capacity cloud resources that operate a bit differently. Amazon DynamoDB is one example. With DynamoDB, you specify how much capacity you want available for your tables. In this case, capacity is measured not in units of servers, but in units of throughput capacity. You allocate how much capacity you want to provide to a table, and that much capacity is available to that table. If you don't use that much capacity, the capacity goes unused. If your application uses more than the capacity you have allocated, your application will be resource starved until you allocate more capacity. As such, these capacity units are allocated and consumed in a manner very similar to servers, even though on the surface they look very different.
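The following sketch, again using boto3, changes the provisioned throughput on a hypothetical table (the table name and capacity numbers are placeholders). Note that this is still an allocation step, just like requesting servers:

    import boto3

    dynamodb = boto3.client("dynamodb", region_name="us-east-1")

    # Allocate throughput capacity for a table. The capacity is available
    # (and charged) whether or not the application consumes it; requests
    # beyond it are throttled until the allocation is raised.
    dynamodb.update_table(
        TableName="orders",  # placeholder table name
        ProvisionedThroughput={
            "ReadCapacityUnits": 500,
            "WriteCapacityUnits": 200,
        },
    )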

Changing Allocations

Typically, capacity is allocated in discrete steps (a server costs a certain amount per hour; DynamoDB capacity units cost a certain amount per hour). You can change the number of servers allocated to your application or the number of capacity units allocated to your DynamoDB table, but only in discrete steps (the size of your server or the size of a capacity unit). Although there can be steps of various sizes available (such as different server sizes), you must allocate a whole number of units at a time.

It is your responsibility to ensure that you have enough capacity at hand. This might involve performing capacity planning exercises similar to those that you perform for traditional data center–based servers. You may very well allocate capacity based on expected demand and leave the number alone until you perform a review and determine that your capacity requirements have changed. This is typical of non-cloud-based server allocation.

However, cloud allocation changes are easier to perform than traditional capacity changes in a data center, so other allocation strategies become practical. For instance, because allocation changes can typically be performed (almost) immediately, you can wait until you have consumed (almost) all of your capacity before deciding to increase your allocation.

Another option available is to change your allocation on a fixed schedule that matches your use patterns. For instance, increase the number of servers available during heavily used daylight hours and decrease the number of servers during lesser-used nighttime hours.
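One way to implement schedule-based changes, sketched below with boto3 and an EC2 Auto Scaling group, is to register recurring scheduled actions. The group name, times, and capacities are assumptions for illustration:

    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")

    # Scale up for the heavily used daytime hours...
    autoscaling.put_scheduled_update_group_action(
        AutoScalingGroupName="web-tier",  # placeholder group name
        ScheduledActionName="daytime-scale-up",
        Recurrence="0 8 * * *",           # cron syntax, UTC: daily at 08:00
        DesiredCapacity=500,
    )

    # ...and back down for the lightly used nighttime hours.
    autoscaling.put_scheduled_update_group_action(
        AutoScalingGroupName="web-tier",
        ScheduledActionName="nighttime-scale-down",
        Recurrence="0 22 * * *",          # daily at 22:00 UTC
        DesiredCapacity=200,
    )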

Yet another option is to change your allocation dynamically and automatically, based on current usage patterns. You might, for instance, monitor CPU usage on your servers and, as soon as usage reaches a certain threshold, automatically allocate additional servers.2 You might build these automated mechanisms into your application or your service infrastructure yourself, or you might take advantage of cloud services such as AWS Auto Scaling to change your allocation automatically based on usage criteria that you specify.
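As one illustration of this dynamic approach, the sketch below attaches a target-tracking scaling policy to a hypothetical Auto Scaling group. The group name and target value are assumptions; the policy adds servers when average CPU rises above the target and removes them when it falls below:

    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")

    # Keep average CPU utilization near 60% by automatically adding and
    # removing instances as load changes.
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-tier",  # placeholder group name
        PolicyName="cpu-target-tracking",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": 60.0,  # assumed threshold for this example
        },
    )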

Whatever mechanism you choose to determine and change capacity, the capacity you currently have allocated is all that is available to you. You could still end up with capacity allocated (and charged) to you that is not being used. Even worse, you could find yourself resource starved because you do not have enough capacity. Even if you use an automated allocation scheme such as Auto Scaling to give your application additional capacity when it is needed, the algorithm it uses to change your capacity may not notice the need fast enough to keep your application from becoming resource starved during, for instance, a sudden usage spike.

Whether you change allocations manually or via an automated method, your usage is defined and constrained by your allocation. You pay for the entire allocation, whether you use it or not. If your application requires a higher allocation than is currently allocated, your application will be resource starved. Proper capacity planning, either manual or automated, is essential for managing these resources.

Reserved Capacity

You typically can change your allocated capacity as often as you want,4 increasing and decreasing it as your needs require.

This is one of the advantages of the cloud. If you need 500 servers one hour and only 200 the next, you are charged for 500 servers for the first hour and 200 for the second. It's clean and simple.

However, you pay a premium for this essentially unlimited flexibility in the amount of capacity you can allocate.

But what if your needs are more stable? What if you will always need at least 200 servers allocated? Why pay for the ability to be flexible in how many servers you need on an hour-by-hour basis when your needs are much more stable and fixed?

This is where reserved capacity comes into play. Reserved capacity is a commitment you make up front to your cloud provider that you will consume a certain quantity of resources over a period of time (such as one to three years). In exchange, you receive a favorable rate for those resources.

Example 23-1. Reserved Capacity Example

Reserved capacity does not limit your flexibility in allocating resources; it only guarantees to your cloud provider that you will consume a certain quantity of resources.

Suppose that you have an application that requires 200 servers continuously, but sometimes your traffic spikes so that you need to have up to 500 servers allocated at times. You can use AutoScale to automatically adjust the number of servers dynamically. Your usage in servers, therefore, varies from a minimum of 200 servers to a maximum of 500 servers.

Because you will always be using at least 200 servers, you can purchase 200 servers' worth of reserved capacity. Let's say you purchase 200 servers for one full year. You will pay a lower rate for those 200 servers, but you will be paying for them all the time. That's fine, because you are using them all the time.

For the additional 300 servers (500 – 200), you can pay the normal (higher) hourly rate, and only pay for the time you are using those servers.
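A quick back-of-the-envelope comparison makes the savings concrete. The hourly rates and peak hours below are made-up illustrative numbers, not actual AWS prices:

    HOURS_PER_YEAR = 24 * 365

    on_demand_rate = 0.10  # $/server-hour, hypothetical
    reserved_rate = 0.06   # $/server-hour, hypothetical discounted rate

    baseline_servers = 200      # always running, covered by reservations
    peak_extra_servers = 300    # 500 peak minus 200 baseline
    peak_hours_per_year = 1000  # hypothetical time spent at full peak

    reserved_cost = baseline_servers * reserved_rate * HOURS_PER_YEAR
    burst_cost = peak_extra_servers * on_demand_rate * peak_hours_per_year
    all_on_demand = baseline_servers * on_demand_rate * HOURS_PER_YEAR + burst_cost

    print(f"Reserved baseline + on-demand burst: ${reserved_cost + burst_cost:,.0f}")
    print(f"Everything on-demand:                ${all_on_demand:,.0f}")

Under these assumed rates, the reserved-plus-on-demand split costs roughly $135,000 per year versus roughly $205,000 with everything on-demand.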

Reserved capacity provides a way for you to receive capacity at a lower cost in exchange for committed allocation of those resources.5

Usage-Based Resource Allocation

Usage-based resources are cloud resources that are not allocated in advance but are consumed at whatever rate your application requires. You are charged only for the amount of the resource you consume. No allocation step is required.

You can recognize usage-based cloud resources by the following characteristics:

  • There is no allocation step involved, and hence no capacity planning required.

  • If your application needs fewer resources, you use fewer resources and your cost is lower.

  • If your application needs more resources, you use more resources and your cost is higher.

  • Within reason, you can scale from a very tiny amount consumed to a huge amount consumed without taking any steps to scale your application or the cloud resource it is consuming.

  • The phrase “within reason” is defined entirely by the cloud provider and their abilities.

You typically have no visibility into how the resources are allocated or scaled. It is all invisible to you.

A classic example of usage-based cloud resources is Amazon S3. With S3, you are charged for the amount of data you are storing and the amount of data you transfer. You do not need to determine ahead of time how much data storage you require or how much transfer capacity you require. Whatever amount you require (within system limits) is available to you whenever you require it, and you pay only for the amount you use.
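The absence of an allocation step is visible in the API itself. In this boto3 sketch the bucket name and key are placeholders; nothing reserves storage or transfer capacity in advance, and billing follows what is actually stored and transferred:

    import boto3

    s3 = boto3.client("s3", region_name="us-east-1")

    # Store one object or a billion objects: the call is the same, and
    # there was no capacity-planning step beforehand.
    s3.put_object(
        Bucket="my-example-bucket",    # placeholder bucket name
        Key="reports/2023/usage.csv",  # placeholder key
        Body=b"date,bytes_stored\n",
    )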

The “Magic” of Usage-Based Resource Allocation

These services are easy to manage and scale because no capacity planning is required. This seemingly “magic” materialization of the resources necessary for your application using a usage-based resource is one of the true benefits of the cloud. It is made possible by the multitenant nature of the cloud service.

Behind a service like Amazon S3 is a huge amount of disk storage and a huge number of servers, which are allocated as needed to individual requests from individual users. If your application has a spike in the amount of resources it requires, the necessary resources can be allocated from a shared pool of available resources.

This availability pool is shared by all customers, and so it is a potentially huge pool of resources. As your application’s resource spike ebbs, another user’s application might begin to spike, and those resources are then allocated to that user’s application. This is done completely transparently.

As long as the pool of available capacity is large enough to handle all the requests and all the resource usage spikes occurring across all users, there is no starvation by any consumer. The larger the scale of the service (the more users that are using the service), the greater the ability of the cloud provider to average out the usage spikes and plan enough capacity for all the users’ needs.
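This averaging effect is easy to demonstrate with a toy simulation (illustrative only, not AWS data). Each simulated user idles most of the time and occasionally spikes to ten times its baseline; as the pool grows, the aggregate peak shrinks relative to the aggregate mean, so the provider needs proportionally less spare headroom:

    import random
    import statistics

    random.seed(42)

    def spiky_user_demand(hours=500):
        # Each hour a user consumes 100 units, with a 5% chance of a
        # 10x spike to 1,000 units.
        return [100 + (900 if random.random() < 0.05 else 0)
                for _ in range(hours)]

    for num_users in (1, 100, 10000):
        # Sum the per-hour demand across every user in the pool.
        users = [spiky_user_demand() for _ in range(num_users)]
        pooled = [sum(hour) for hour in zip(*users)]
        peak_to_mean = max(pooled) / statistics.mean(pooled)
        # Peak-to-mean ratio: how much headroom the provider must plan for.
        print(f"{num_users:>6} users: peak/mean = {peak_to_mean:.2f}")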

This model works as long as no single user represents a significant portion of the total resources the cloud provider has available. If a single customer is large enough to represent a significant portion of the resources available for the service, that customer can experience resource starvation during peak usage and potentially affect the capacity available to other customers as well.

But for services like Amazon S3, the scale of the service is so massive6 that no single customer represents a significant portion of usage, and the resource allocation of S3 remains magical.

Note

However, even Amazon S3 has its limits. If you run an application that stores or transfers significant quantities of data, you can run into limits that S3 imposes to keep heavy consumers from causing other users to experience resource starvation. A large consumer of S3 resources can therefore reach these artificial limits and experience resource starvation itself. This typically happens only when data storage and transfer reach the petabyte range.

Even if you do consume S3 resources at these huge levels, there are ways you can move your usage around to reduce the impact of the limits. Additionally, you can contact Amazon and request that these limits be increased. They will increase those limits in specific areas as you require, and these limit increases are then fed into Amazon’s capacity planning process so they can ensure that there are sufficient resources available to meet your needs and everyone else’s.

The Pros and Cons of Resource Allocation Techniques

As outlined in Table 23-1, each of the techniques we've been discussing has advantages and disadvantages.

Table 23-1. Cloud resource allocation comparison

                                             Allocated-capacity                  Usage-based
  Service examples (Amazon AWS)              EC2, ELB,3 DynamoDB                 S3, Lambda, SES, SQS, SNS
  Requires capacity planning                 Yes                                 No
  Charges based on                           Capacity allocated                  Capacity consumed
  Underutilization                           Capacity is idle                    N/A
  Overutilization                            Application starved                 N/A
  Can capacity be reserved to save money?    Yes                                 No
  How can capacity be scaled?                Manual or scripted allocation       Automatic and immediate
                                             change; can be delayed
  How are usage spikes handled?              Potential resource starvation       Handled transparently
                                             during spike or capacity ramp-up
  Excess capacity?                           Allocated and saved for your use    Global pool available for any
                                                                                 customer to use

1 Such as Amazon RDS, Amazon Aurora, and ElastiCache.

2 Or, in reverse, remove servers when CPU usage drops below a threshold.

3 For more information, see “Best Practices in Evaluating Elastic Load Balancing.”

4 There are sometimes restrictions, such as on DynamoDB, for which there are limitations to how often you can change capacity.

5 Using reserved capacity also guarantees that the specific type of instance will be available in your specific desired availability zone, when you want it. Without having reserved capacity, it is possible that you could request a specific type of instance in a specific availability zone, and AWS would not be able to honor the request.

6 According to the most recent published Amazon data I could find, in 2013 S3 stored two trillion objects. That’s five objects for every star in the Milky Way. See “Amazon S3 – Two Trillion Objects, 1.1 Million Requests / Second,” AWS Official Blog, April 18, 2013.
