Chapter 23. Cloud Resource Allocation

To use cloud resources effectively and efficiently, you need to understand how they are allocated, consumed, and charged. Cloud resources can reasonably be divided into two categories:

  • Allocated-capacity resources

  • Usage-based resources

Allocated-Capacity Resource Allocation

Allocated-capacity resources are cloud resources that are allocated in discrete units. You specify how much of a specific type of resource you need, and you are given that amount. That amount is allocated for your use, independent of what your actual needs are at any given moment.

Allocated-capacity cloud resources can be recognized by the following characteristics:

  • They are allocated in discrete units.

  • You specify how many units you want, and they are allocated for your use.

  • If your application uses less of the resource, the allocated resources remain idle and unused.

  • If your application needs more of the resource, the application becomes resource starved.

  • Proper capacity planning is important to avoid both overallocation and underallocation.

The classic example of allocated-capacity cloud resources is servers, such as Amazon EC2 instances. You specify how many instances you want as well as the size of the servers, and the cloud allocates them for your use. Additionally, managed infrastructure components such as cloud databases1 use an allocated-capacity model. In all of these cases, you specify the number of units and their size, and the cloud provider allocates the units for your use.
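To make the allocation step concrete, here is a minimal sketch using boto3, the AWS SDK for Python. The AMI ID and instance type are placeholder values, not recommendations; the point is that you ask for a whole number of discretely sized units.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Request exactly three m5.large instances. The cloud allocates these
    # discrete units to you, and you pay for them whether busy or idle.
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",  # placeholder AMI ID
        InstanceType="m5.large",          # the "size" of each unit
        MinCount=3,                       # the number of units you want
        MaxCount=3,
    )

    for instance in response["Instances"]:
        print(instance["InstanceId"], instance["State"]["Name"])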

But there are other allocated-capacity cloud resources that operate a bit differently. Amazon DynamoDB is one example. With DynamoDB, you specify how much capacity you want available for your tables. In this case, capacity is measured not in units of servers, but in units of throughput capacity. You allocate how much capacity you want to provide to a table, and that much capacity is available to that table. If you don't use that much capacity, the capacity goes unused. If your application uses more than the capacity you have allocated, your application will be resource starved until you allocate more capacity. As such, these capacity units are allocated and consumed in a manner very similar to servers, even though on the surface they look very different.
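The following sketch, again using boto3, changes the provisioned throughput on a hypothetical table (the table name and capacity numbers are placeholders). Note that this is still an allocation step, just like requesting servers:

    import boto3

    dynamodb = boto3.client("dynamodb", region_name="us-east-1")

    # Allocate throughput capacity for a table. The capacity is available
    # (and charged) whether or not the application consumes it; requests
    # beyond it are throttled until the allocation is raised.
    dynamodb.update_table(
        TableName="orders",  # placeholder table name
        ProvisionedThroughput={
            "ReadCapacityUnits": 500,
            "WriteCapacityUnits": 200,
        },
    )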

Changing Allocations

Typically, capacity is allocated in discrete steps (a server costs a certain amount per hour; DynamoDB capacity units cost a certain amount per hour). You can change the number of servers allocated to your application or the number of capacity units allocated to your DynamoDB table, but only in discrete steps (the size of your server or the size of a capacity unit). Although there can be steps of various sizes available (such as different server sizes), you must allocate a whole number of units at a time.

It is your responsibility to ensure that you have enough capacity at hand. This might involve performing capacity planning exercises similar to those that you perform for traditional data center–based servers. You may very well allocate capacity based on expected demand and leave the number alone until you perform a review and determine that your capacity requirements have changed. This is typical of non-cloud-based server allocation.

However, cloud allocation changes are easier to perform than traditional capacity changes in a data center, so other allocation strategies become practical. For instance, because allocation changes can typically be performed (almost) immediately, you can wait until you have consumed (almost) all of your capacity before deciding to increase your allocation.

Another option available is to change your allocation on a fixed schedule that matches your use patterns. For instance, increase the number of servers available during heavily used daylight hours and decrease the number of servers during lesser-used nighttime hours.
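One way to implement schedule-based changes, sketched below with boto3 and an EC2 Auto Scaling group, is to register recurring scheduled actions. The group name, times, and capacities are assumptions for illustration:

    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")

    # Scale up for the heavily used daytime hours...
    autoscaling.put_scheduled_update_group_action(
        AutoScalingGroupName="web-tier",  # placeholder group name
        ScheduledActionName="daytime-scale-up",
        Recurrence="0 8 * * *",           # cron syntax, UTC: daily at 08:00
        DesiredCapacity=500,
    )

    # ...and back down for the lightly used nighttime hours.
    autoscaling.put_scheduled_update_group_action(
        AutoScalingGroupName="web-tier",
        ScheduledActionName="nighttime-scale-down",
        Recurrence="0 22 * * *",          # daily at 22:00 UTC
        DesiredCapacity=200,
    )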

Yet another option is to change your allocation dynamically and automatically, based on current usage patterns. You might, for instance, monitor CPU usage on your servers and, as soon as usage reaches a certain threshold, automatically allocate additional servers.2 You might build these automated mechanisms into your application or your service infrastructure yourself, or you might take advantage of cloud services such as AWS Auto Scaling to change your allocation automatically based on usage criteria that you specify.
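As one illustration of this dynamic approach, the sketch below attaches a target-tracking scaling policy to a hypothetical Auto Scaling group. The group name and target value are assumptions; the policy adds servers when average CPU rises above the target and removes them when it falls below:

    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")

    # Keep average CPU utilization near 60% by automatically adding and
    # removing instances as load changes.
    autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-tier",  # placeholder group name
        PolicyName="cpu-target-tracking",
        PolicyType="TargetTrackingScaling",
        TargetTrackingConfiguration={
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": 60.0,  # assumed threshold for this example
        },
    )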

Whatever mechanism you choose to determine and change capacity, the capacity you currently have allocated is all that is available to you. You could still end up with capacity allocated (and charged) to you that is not being used. Even worse, you could find yourself resource starved because you do not have enough capacity. Even if you use an automated allocation scheme such as Auto Scaling to give your application additional capacity when it is needed, the algorithm it uses to change your capacity may not notice the need fast enough to keep your application from becoming resource starved during, for instance, a sudden usage spike.

Whether you change allocations manually or via an automated method, your usage is defined and constrained by your allocation. You pay for the entire allocation, whether you use it or not. If your application requires a higher allocation than is currently allocated, your application will be resource starved. Proper capacity planning, either manual or automated, is essential for managing these resources.

Reserved Capacity

You typically can change your allocated capacity as often as you want,4 increasing and decreasing it as your needs require.

This is one of the advantages of the cloud. If you need 500 servers one hour and only 200 the next, you are charged for 500 servers for the first hour and 200 for the second. It's clean and simple.

However, you pay a premium for this essentially unlimited flexibility in the amount of capacity you can allocate.

But what if your needs are more stable? What if you will always need at least 200 servers allocated? Why pay for the ability to be flexible in how many servers you need on an hour-by-hour basis when your needs are much more stable and fixed?

This is where reserved capacity comes into play. Reserved capacity is a commitment you make up front to your cloud provider that you will consume a certain quantity of resources over a period of time (such as one to three years). In exchange, you receive a favorable rate for those resources.

Example 23-1. Reserved Capacity Example

Reserved capacity does not limit your flexibility in allocating resources; it only guarantees to your cloud provider that you will consume a certain quantity of resources.

Suppose that you have an application that requires 200 servers continuously, but sometimes your traffic spikes so that you need to have up to 500 servers allocated at times. You can use AutoScale to automatically adjust the number of servers dynamically. Your usage in servers, therefore, varies from a minimum of 200 servers to a maximum of 500 servers.

Because you will always be using at least 200 servers, you can purchase 200 servers' worth of reserved capacity. Let's say you purchase 200 servers for one full year. You will pay a lower rate for those 200 servers, but you will be paying for them all the time. That's fine, because you are using them all the time.

For the additional 300 servers (500 – 200), you can pay the normal (higher) hourly rate, and only pay for the time you are using those servers.
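A quick back-of-the-envelope comparison makes the savings concrete. The hourly rates and peak hours below are made-up illustrative numbers, not actual AWS prices:

    HOURS_PER_YEAR = 24 * 365

    on_demand_rate = 0.10  # $/server-hour, hypothetical
    reserved_rate = 0.06   # $/server-hour, hypothetical discounted rate

    baseline_servers = 200      # always running, covered by reservations
    peak_extra_servers = 300    # 500 peak minus 200 baseline
    peak_hours_per_year = 1000  # hypothetical time spent at full peak

    reserved_cost = baseline_servers * reserved_rate * HOURS_PER_YEAR
    burst_cost = peak_extra_servers * on_demand_rate * peak_hours_per_year
    all_on_demand = baseline_servers * on_demand_rate * HOURS_PER_YEAR + burst_cost

    print(f"Reserved baseline + on-demand burst: ${reserved_cost + burst_cost:,.0f}")
    print(f"Everything on-demand:                ${all_on_demand:,.0f}")

Under these assumed rates, the reserved-plus-on-demand split costs roughly $135,000 per year versus roughly $205,000 with everything on-demand.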

Reserved capacity provides a way for you to receive capacity at a lower cost in exchange for committed allocation of those resources.5

Usage-Based Resource Allocation

Usage-based resources are cloud resources that are not allocated in advance but are consumed at whatever rate your application requires. You are charged only for the amount of the resource you consume. No allocation step is required.

You can recognize usage-based cloud resources by the following characteristics:

  • There is no allocation step involved, and hence no capacity planning required.

  • If your application needs fewer resources, you use fewer resources and your cost is lower.

  • If your application needs more resources, you use more resources and your cost is higher.

  • Within reason, you can scale from a very tiny amount consumed to a huge amount consumed without taking any steps to scale your application or the cloud resource it is consuming.

  • The phrase “within reason” is defined entirely by the cloud provider and their abilities.

You typically have no visibility into how the resources are allocated or scaled. It is all invisible to you.

A classic example of usage-based cloud resources is Amazon S3. With S3, you are charged for the amount of data you are storing and the amount of data you transfer. You do not need to determine ahead of time how much data storage you require or how much transfer capacity you require. Whatever amount you require (within system limits) is available to you whenever you require it, and you pay only for the amount you use.
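The absence of an allocation step is visible in the API itself. In this boto3 sketch the bucket name and key are placeholders; nothing reserves storage or transfer capacity in advance, and billing follows what is actually stored and transferred:

    import boto3

    s3 = boto3.client("s3", region_name="us-east-1")

    # Store one object or a billion objects: the call is the same, and
    # there was no capacity-planning step beforehand.
    s3.put_object(
        Bucket="my-example-bucket",    # placeholder bucket name
        Key="reports/2023/usage.csv",  # placeholder key
        Body=b"date,bytes_stored\n",
    )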

The “Magic” of Usage-Based Resource Allocation

These services are easy to manage and scale because no capacity planning is required. This seemingly “magic” materialization of the resources necessary for your application using a usage-based resource is one of the true benefits of the cloud. It is made possible by the multitenant nature of the cloud service.

Behind a service like Amazon S3 is a huge amount of disk storage and a huge number of servers, which are allocated as needed to individual requests from individual users. If your application has a spike in the amount of resources it requires, the necessary resources can be allocated from a shared pool of available resources.

This availability pool is shared by all customers, and so it is a potentially huge pool of resources. As your application’s resource spike ebbs, another user’s application might begin to spike, and those resources are then allocated to that user’s application. This is done completely transparently.

As long as the pool of available capacity is large enough to handle all the requests and all the resource usage spikes occurring across all users, there is no starvation by any consumer. The larger the scale of the service (the more users that are using the service), the greater the ability of the cloud provider to average out the usage spikes and plan enough capacity for all the users’ needs.
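This averaging effect is easy to demonstrate with a toy simulation (illustrative only, not AWS data). Each simulated user idles most of the time and occasionally spikes to ten times its baseline; as the pool grows, the aggregate peak shrinks relative to the aggregate mean, so the provider needs proportionally less spare headroom:

    import random
    import statistics

    random.seed(42)

    def spiky_user_demand(hours=500):
        # Each hour a user consumes 100 units, with a 5% chance of a
        # 10x spike to 1,000 units.
        return [100 + (900 if random.random() < 0.05 else 0)
                for _ in range(hours)]

    for num_users in (1, 100, 10000):
        # Sum the per-hour demand across every user in the pool.
        users = [spiky_user_demand() for _ in range(num_users)]
        pooled = [sum(hour) for hour in zip(*users)]
        peak_to_mean = max(pooled) / statistics.mean(pooled)
        # Peak-to-mean ratio: how much headroom the provider must plan for.
        print(f"{num_users:>6} users: peak/mean = {peak_to_mean:.2f}")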

This model works as long as no single user represents a significant portion of the total resources the cloud provider has available. If a single customer is large enough to represent a significant portion of the resources available for the service, that customer can experience resource starvation during peak usage and potentially affect the capacity available to other customers as well.

But for services like Amazon S3, the scale of the service is so massive6 that no single customer represents a significant portion of usage, and the resource allocation of S3 remains magical.

Note

However, even Amazon S3 has its limits. If you run an application that stores or transfers significant quantities of data, you can run into limits that S3 imposes to keep heavy consumers from causing other users to experience resource starvation. A large consumer of S3 resources can therefore reach these artificial limits and experience resource starvation itself. This typically happens only when data storage and transfer reach the petabyte range.

Even if you do consume S3 resources at these huge levels, there are ways you can move your usage around to reduce the impact of the limits. Additionally, you can contact Amazon and request that these limits be increased. They will increase those limits in specific areas as you require, and these limit increases are then fed into Amazon’s capacity planning process so they can ensure that there are sufficient resources available to meet your needs and everyone else’s.

The Pros and Cons of Resource Allocation Techniques

As outlined in Table 23-1, each of the techniques we've been discussing has advantages and disadvantages.

Table 23-1. Cloud resource allocation comparison

                                             Allocated-capacity                  Usage-based
  Service examples (Amazon AWS)              EC2, ELB,3 DynamoDB                 S3, Lambda, SES, SQS, SNS
  Requires capacity planning                 Yes                                 No
  Charges based on                           Capacity allocated                  Capacity consumed
  Underutilization                           Capacity is idle                    N/A
  Overutilization                            Application starved                 N/A
  Can capacity be reserved to save money?    Yes                                 No
  How can capacity be scaled?                Manual or scripted allocation       Automatic and immediate
                                             change; can be delayed
  How are usage spikes handled?              Potential resource starvation       Handled transparently
                                             during spike or capacity ramp-up
  Excess capacity?                           Allocated and saved for your use    Global pool available for any
                                                                                 customer to use

1 Such as Amazon RDS, Amazon Aurora, and ElastiCache.

2 Or, in reverse, remove servers when CPU usage drops below a threshold.

3 For more information, see “Best Practices in Evaluating Elastic Load Balancing.”

4 There are sometimes restrictions, such as on DynamoDB, for which there are limitations to how often you can change capacity.

5 Using reserved capacity also guarantees that the specific type of instance will be available in your specific desired availability zone, when you want it. Without having reserved capacity, it is possible that you could request a specific type of instance in a specific availability zone, and AWS would not be able to honor the request.

6 According to the most recent published Amazon data I could find, in 2013 S3 stored two trillion objects. That’s five objects for every star in the Milky Way. See “Amazon S3 – Two Trillion Objects, 1.1 Million Requests / Second,” AWS Official Blog, April 18, 2013.
