Chapter 29. Soaring in the Clouds

This is called, using the conquered foe to augment one’s own strength.

—Sun Tzu

Cloud computing is probably the most important advancement in technology since the Internet. Although most people think of it as a recent innovation, the reality is that the cloud has taken more than a decade to become a reality. In this chapter, we cover the history that led up to the launch of cloud computing, discuss the common characteristics of clouds, and finish the chapter by examining the pros and cons of cloud computing.

Cloud computing is important to scalability because it offers the promise of cheap, on-demand storage and compute capacity. This has many advantages, and a few disadvantages, for physical hardware scaling. To be well versed in scaling applications, you must understand these concepts and appreciate how they could be implemented to scale an application or service.

Although the development of the technology and concepts for cloud computing have been in process for many years, the discussion and utilization of these advancements in mainstream technology organizations remain relatively new. Because of this, even definitions of the subject are not always completely agreed upon. We have been fortunate to be around the cloud environment through our clients for quite some time and have seen many players become involved in this field. As with the technology-agnostic designs in Chapter 20, Designing for Any Technology, we believe that it is the architecture—and not the technology—that is responsible for a product’s capability to scale. As such, we consider clouds as an architectural component and not as a technology. With this perspective, the particular vendor or type of service is the equivalent of a type of technology chosen to implement the architecture. We will cover some of the technology components so that you have examples and can become familiar with them, but we will focus primarily on their use as architectural components. We also refer to IaaS solutions in Chapter 32, Planning Data Centers, where we discuss how they play into the decision of whether to own data centers, lease colocation space, or rent time in the cloud.

History and Definitions

The term cloud has been around for decades. No one is exactly sure when it was first used in relation to technology, but it dates at least as far back as the era when network diagrams came into vogue. A network diagram is a graphic representation of the physical or logical layout of a network, such as a telecommunications, routing, or neural network. The cloud on network diagrams was used to represent unspecified networks.

In the early 1990s, cloud evolved into a term for Asynchronous Transfer Mode (ATM) networks. ATM is a switching protocol that breaks data into fixed-size cells and provides OSI layer 2, the data link layer. It was the core protocol used on the public switched telephone network. When the World Wide Web began in 1991 as a CERN project built on top of the Internet, the cloud began to be used as a term and a symbol for the underlying infrastructure.

Cloud computing can also trace its lineage through the application service providers (ASPs) of the 1990s, which were the embodiment of the concept of outsourcing computer services. This concept later became known as Software as a Service (SaaS). The ASP model is an indirect descendant of the service bureaus of the 1960s and 1970s, which were an attempt at fulfilling the vision established by John McCarthy in his 1961 speech at Massachusetts Institute of Technology.1 John McCarthy is the inventor of the programming language Lisp and the recipient of the 1971 Turing Award; he is also credited with coining the term artificial intelligence.2

1. According to Wikipedia: http://en.wikipedia.org/wiki/Application_service_provider.

2. John McCarthy’s home page at Stanford University, http://www-formal.stanford.edu/jmc/.

The modern cloud concept was extended in October 2001 by IBM in its Autonomic Computing Manifesto.3 The essence of this paper was that the information technology infrastructure was becoming too complex and that it could collapse under its own weight if the management was not automated. Around this time, the concept of Software as a Service started to grow.

3. The original manifesto can be found at http://www.research.ibm.com/autonomic/manifesto/.

Another contributing event occurred around this time, at the beginning of the 21st century: the bursting of the dot-com bubble. As many tech startups were burning through capital and shutting down, those that would ultimately survive and thrive were tightening their belts on capital and operational expenditures. Amazon.com was one such company; it began modernizing its data centers using early concepts of virtualization over massive amounts of commodity hardware. Although it needed lots of capacity to deal with peak usage, Amazon decided to sell its unused capacity (which it had in spades during non-peak times) as a service.4

4. BusinessWeek. November 13, 2006. http://www.businessweek.com/magazine/content/06_46/b4009001.htm.

Out of the offering of spare capacity as a service came the concept and label of Infrastructure as a Service (IaaS). This term, which first appeared around 2006, typically refers to offerings of computer infrastructure such as servers, storage, networks, and bandwidth as a service instead of by subscription or contract. This pay-as-you-use model covered items that previously required capital expenditure to purchase outright, long-term leases, or month-to-month subscriptions for partial tenancy of physical hardware.

From Infrastructure as a Service, we have seen an explosion of Blah as a Service offerings (where Blah means “fill in the blank with almost any word you can imagine”). We even have Everything as a Service (EaaS) now. All of these terms actually do share some common characteristics such as a purchasing model of “pay as you go” or “pay as you use it,” on-demand scalability of the amount that you use, and the concept that many people or multiple tenants can use the service simultaneously.

Public Versus Private Clouds

Some of the biggest names in technology are providing or have plans to provide cloud computing services. These companies include Amazon.com, Google, Hewlett-Packard, and Microsoft. Their services are publicly available clouds, of which anyone from individuals to other corporations can take advantage. However, if you are interested in running your application in a cloud environment but have concerns about a public cloud, there is the possibility of running a private cloud. By private cloud, we mean implementing a cloud on your own hardware in your own secure environment. With more open source cloud solutions becoming available, such as Eucalyptus and OpenStack, this is becoming a realistic solution.

There are obviously both pros and cons when running your application in a cloud environment. Some of these benefits and drawbacks are present regardless of whether you use a private or public cloud. Certain drawbacks, however, are directly related to the fact that a public cloud is used. For instance, there may be a perception that the data is not as strongly protected as it would be inside of your network, similar to grid computing, even if the public cloud is very secure. One of the pros of a cloud is that you can allocate just the right amount of memory, CPU, and disk space to a particular application, thereby improving utilization of the hardware. Thus, if you want to improve your hardware utilization and not deal with the perceived security concerns of a public cloud, you may want to consider running your own private cloud.

Characteristics and Architecture of Clouds

At this point in the evolution of the cloud computing concept, all cloud implementations share some basic characteristics. These characteristics have been mentioned briefly to this point, but it is now time to understand them in more detail. Almost all public cloud implementations have four specific characteristics, some of which do not apply to private clouds—namely, pay by usage, scale on demand, multiple tenants, and virtualization. Obviously, scaling on demand is an important characteristic when viewing the use of clouds from a scalability perspective, but don’t dismiss the other characteristics as unimportant. For cash-strapped startup companies, paying as you go instead of purchasing hardware up front or signing multiyear contracts could mean the difference between surviving long enough to be successful or failing ignominiously.

Pay by Usage

The idea of pay as you go or pay according to your usage is commonplace in the Software as a Service world and has been adopted by the cloud computing services. Before cloud computing was available, to grow your application and have enough capacity to scale, you had limited options. If your organization was large enough, it probably owned or leased servers that were hosted in a data center or colocation facility. This model requires lots of upfront capital expenditure as well as a healthy monthly expense to continue paying bandwidth, power, space, and cooling costs. An alternative was to contract with a hosting service that provided the hardware, with the client then paying either a long-term lease or high monthly cost for the use of the hardware. Both models are reasonable and have benefits as well as drawbacks. Indeed, many companies still use one or both of these models and will likely do so for many years to come.

The cloud offers another alternative. Instead of long-term leases or high upfront capital outlays, this model allows you to avoid the upfront costs of purchasing hardware and instead pay based on your utilization of CPU, bandwidth, or storage, or possibly all three.

Scale on Demand

Another characteristic of cloud computing is the ability to scale on demand. As a subscriber or client of a cloud, you have the theoretical ability to scale as much as you need. Whether you need terabytes of storage or additional gigahertz of compute, that capacity will be available to you. There are, of course, practical limits to this scalability, including how much actual capacity the cloud provider has to offer, but with the larger public clouds, it is reasonable to think that you could scale to the equivalent of several hundred or several thousand servers with no issues. Some of our clients, however, have been large enough that a cloud provider did not have enough capacity for them to fail over from one data center to another. In a private cloud, this constraint becomes your organization’s limitation on physical hardware. Scaling in a cloud occurs in near real time, compared with the standard method of provisioning hardware in a data center.

Let’s look at the typical process first, as if you were hosting your site at a colocation facility, and then consider the case in which you run your operation in a cloud environment. Adding hundreds of servers in a colocation facility or data center can take days, weeks, or months, depending on the organization’s processes. For those readers who have never worked in an organization that hosted its systems at a colocation facility, this is a typical scenario that you might encounter.

Most organizations have budgeting and request processes that must be navigated no matter where or how the site is hosted. After the budget or the purchase order is approved, however, the processes of provisioning a new server in a cloud and provisioning one in a colocation facility are almost completely different. For a colocation facility, you need to ensure that you have the space and power available to accommodate the new servers. This can entail going to a new cage in a colocation provider if more space or power is not available in your current cage. If a new cage is required, contracts must be negotiated and signed for the lease of the new space and cross-connects are generally required to connect the cage’s networks. After the necessary space and power are secured, purchase orders for the servers can be placed. Of course, some companies will stockpile servers in anticipation of capacity demand. Others will wait until the operations team alerts them that capacity is at a point where expanding the server pools is required.

Ordering and receiving the hardware can take weeks. After the hardware arrives at the colocation facility, it needs to be placed in the racks and powered up. After this is accomplished, the operations team can get started ghosting, jumpstarting, or kick-starting the server, depending on the operating system. Only then can the latest version of the software be loaded and the server added into the production pool. The total time for this process is at least days for the most efficient operations teams who already have hardware and space available. In most organizations, it takes weeks or months.

Now, let’s consider how this process might look if you were hosting your site in a cloud environment and decided that you needed 20 more servers for a particular pool. The process would start off similarly, with the budget or purchase order request to add to the monthly expense of the cloud services. After this is approved, the operations or engineering team would use the control panel of the cloud provider to simply request the number of virtual servers, specifying the desired size and speed. Within a few minutes, the systems would be available to load the machine image of choice and the latest application code could be installed. The servers could likely be placed into production within a few hours. This ability to scale on demand is a common characteristic of cloud computing.
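The cloud-provisioning flow described above can be sketched in code. The `CloudProvider` class below is a hypothetical stand-in for a real provider SDK (the method names `run_instances` and `terminate_instances` are illustrative, not any vendor's actual API); it simulates requesting 20 virtual servers and releasing them later.

```python
import uuid

# Hypothetical provider client; a real deployment would use a vendor SDK.
class CloudProvider:
    def __init__(self):
        self._instances = {}

    def run_instances(self, image_id, instance_type, count):
        """Request `count` virtual servers built from a machine image."""
        new_ids = []
        for _ in range(count):
            instance_id = "i-" + uuid.uuid4().hex[:8]
            self._instances[instance_id] = {
                "image": image_id,
                "type": instance_type,
                "state": "running",
            }
            new_ids.append(instance_id)
        return new_ids

    def terminate_instances(self, instance_ids):
        """Release servers back to the provider; billing stops here."""
        for instance_id in instance_ids:
            self._instances[instance_id]["state"] = "terminated"

# Expand the web pool by 20 servers, then release them after the peak.
cloud = CloudProvider()
pool = cloud.run_instances(image_id="web-app-v42",
                           instance_type="m.large", count=20)
print(len(pool))  # 20 servers, available in minutes rather than weeks
cloud.terminate_instances(pool)
```

The image ID and instance type shown are made-up placeholders; the point is that capacity changes become an API call rather than a procurement cycle.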

Multiple Tenants

Although the ability to scale on demand is enticing, all that capacity is not just waiting for you. Public clouds have many users running a variety of applications on the same physical infrastructure—a concept known as multitenanting or having multiple tenants existing on the same cloud.

If all works as designed, these users never interact or impact one another. Data is not shared, access is not shared, and accounts are not shared. Each client has its own virtual environment that is walled off from other virtual environments. What you do share with other tenants in a cloud is the physical servers, network, and storage. You might have a virtual dual-processor server with 32GB of RAM, but it is likely running on an eight-processor server with 128GB of RAM that is being shared with several other tenants. Your traffic between servers and from the servers to the storage area goes across common networking gear. There are no routers, switches, or firewalls dedicated to individual tenants. The same goes for the storage. Tenants share storage on virtual network-attached storage (NAS) or storage area network (SAN) devices, which make it appear as if they are the only ones using the storage resource. The reality, of course, is that multiple tenants are using that same physical storage device.

The downside of this multitenanting scheme is that you don’t know whose data and processes reside on the servers, storage devices, or network segment that you are on. If a neighbor gets DDOS’d (subjected to a distributed denial-of-service attack), that event can impact your network traffic. Similarly, your operations might be affected if a neighbor experiences a significant increase in activity that overloads the storage solutions or floods common network paths. We’ve had clients whose businesses were affected by this exact scenario on a public cloud. With the “noisy neighbor” problem, another service residing on the shared infrastructure affects your performance through excessive consumption of shared resources. Because of this variability, especially in input/output (I/O) capacity, many cloud providers now offer dedicated hardware or provisioning of input/output operations per second (IOPS). This obviously is a move away from the multitenant model and necessarily increases the price charged by the cloud provider.

Virtualization

All cloud computing offerings implement some form of a hypervisor on the servers that provides virtualization. This concept of virtualization is really the core architectural principle behind clouds. A hypervisor is either a hardware platform or a software service that allows multiple operating systems to run on a single host server, essentially “dicing” the server into multiple virtual servers. It is also known as a virtual machine monitor (VMM). Many vendors offer hardware and software solutions, such as VMware, Parallels, and Oracle VM. As mentioned in the discussion of multitenancy, such virtualization allows multiple users to exist on common hardware without knowing about or (we hope) impacting each other. Other virtualization, separation, and limitation techniques are used to restrict access of cloud clients to only those amounts of bandwidth and storage that they have purchased. The overall purpose of these techniques is to control access and provide, to the greatest extent possible, an environment that appears to be completely the client’s own. The better this is done, the less likely it is that clients will notice one another’s presence.

Another virtualization technique that is gaining popularity is containers. With this virtualization method, the kernel of an operating system allows for multiple isolated user space instances. Like virtual machines, these instances look and act like a real server from the point of view of the application. On UNIX-based operating systems, this technology represents an advanced implementation of the standard chroot and provides resource management features to limit the impact between containers. Docker.io is an open source project that implements Linux containers (LXC) to automate the deployment of applications inside software containers. It uses resource isolation features of the Linux kernel such as cgroups and kernel namespaces to allow independent “containers” to run within a single Linux instance.

Some of these characteristics may or may not be present in private clouds. The concept of virtualization is at the core of all clouds, regardless of whether they are public or private. The idea of using farms of physical servers in different virtual forms to achieve greater utilization, multitenancy, or any other benefit is the basic premise behind the architecture of a cloud. The ability to scale as necessary is likely to be a common characteristic regardless of whether the resource is a private or public cloud. A physical restriction is placed on the amount of scale that can occur in any cloud, based on how large the cloud is and how much extra capacity is built into it.

Having multiple tenants, however, is not required in a private cloud. Tenants in a private cloud can mean different departments or different applications, not necessarily and probably not different companies. Pay as you go is also not required but possible in private clouds. Depending on how cost centers or departments are charged in an organization, each department might have to pay based on its usage of the private cloud. If a centralized operations team is responsible for building and running its own profit and loss center, it may very well charge departments or divisions for the services provided. This payment scheme may be based on the computational, bandwidth, and storage usage in a private cloud.

Differences Between Clouds and Grids

Now that we’ve covered some of the history and basic characteristics of clouds and have done the same with grids in Chapter 28, Grid Computing, it’s time to compare the two concepts. The terms cloud and grid are often confused and misused. Here, we’ll cover a few of the differences and some of the similarities between the two to ensure that we are all clear on when each should be considered for use in our systems.

Clouds and grids serve different purposes. Clouds offer virtual environments for hosting user applications on one or many virtual servers. This makes clouds particularly compelling for applications that have unpredictable usage demands. When you aren’t sure if you need 5 or 50 servers over the next three months, a cloud can be an ideal solution. Clouds allow users to share the infrastructure. Many different users can be present on the same physical hardware consuming and sharing computational, network, and storage resources.

Grids, in contrast, are infrastructures for dividing programs into small parts to be executed in parallel across two or more hosts. These environments are ideal for computationally intensive workloads. Grids are not necessarily great infrastructures to share with multiple tenants. You are likely running on a grid to parallelize and significantly increase the computational bandwidth for your application; sharing the infrastructure with other users simultaneously defeats that purpose. Sharing or multitenancy can occur serially, one after the other, in a grid environment where each application runs in isolation; when one job is completed, the next job runs. This challenge of enabling multitenancy on grids is one of the core jobs of a grid operations team. Grids are also ideal only for applications that can be divided into elements that can be simultaneously executed. The throughput of a monolithic application cannot be helped by running on a grid. The same monolithic application can likely be replicated onto many individual servers in a cloud, however, and the throughput can be scaled by the number of servers added. Stated as simply as we can, clouds allow you to expand and contract your architecture; grids decompose work into parallelizable units.

While clouds and grids serve different purposes, there are many crossovers and similarities between them. The first major overlap is that some clouds run on top of a grid infrastructure. A good example is AppLogic from 3Tera, which is a grid operating system that is offered as software but also used to power a cloud that is offered as a service. Other similarities between clouds and grids include on-demand pricing models and scalable usage. If you need 50 extra servers in a cloud, you can get them allocated quickly and you pay only for the time that you are using them. The same is true in a grid environment. If you need 50 more nodes for improving the processing time of the application, you can have this capacity allocated rather quickly and you pay for only the nodes that you use.

At this point, you should understand that clouds and grids are fundamentally different concepts and serve different purposes, but have similarities and share common characteristics, and are sometimes intertwined in implementations.

Pros and Cons of Cloud Computing

Almost everything in life has both benefits and drawbacks; rarely is something completely beneficial or completely problematic. In most cases, the pros and cons can be debated, which increases the difficulty of making decisions for your business. Making matters even more complex is the reality that the pros and cons do not affect all businesses equally. Each company must weigh each of the identified benefits and drawbacks for its own situation. We will focus on how to use the pros and cons to make decisions later in the chapter. First, however, we cover what we consider the basic and most important of the benefits and drawbacks to cloud computing. Later, we will help put relative weightings to these factors as we discuss various implementations for different hypothetical businesses.

Pros of Cloud Computing

There are three major benefits to running your infrastructure on a cloud: cost, speed, and flexibility. Each one of these will have a varying degree of importance in your particular situation. In turn, you should weight each one in terms of how applicable the benefit is to you.

Cost

The cost model of consumption-based economics, or paying just for what you need as you need it, is a compelling one. This model works especially well if your organization is a cash-strapped startup. If your business model is one that actually pays for itself as your company grows and your expense model follows the same pattern, you have effectively eliminated a great deal of risk for your company. There are certainly other models that feature limited initial cash outlays, such as managed hosted environments, but they require that you purchase or lease equipment on a per-server basis and rule out the prospect of returning the equipment when you are not using it. For a startup, being able to last long enough to become successful is the first step toward scaling. At any company, being able to manage costs so that they stay in line with the volume of business is critical to ensure the ability to scale.

Figure 29.1 depicts a normal cost progression. As demand increases, you must stay ahead of that demand and purchase or lease the next server or storage unit or whatever piece of hardware to ensure you are capable of meeting demand. Most organizations are not great at capacity planning. This lack of skill may cause the gap between cost and demand to be larger than necessary or, even worse, allow demand to exceed capacity. In such a situation, a scramble will inevitably ensue to purchase more equipment while your customers are experiencing poor performance. The key when purchasing or leasing equipment in this scenario is to get the cost and demand lines as close as possible without letting them cross. Of course, with a cloud cost model, in which services are paid for only when used, these lines can be much tighter, almost touching in most cases.


Figure 29.1 Stepwise Cost Function
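The stepwise cost function in Figure 29.1 can be sketched with a few lines of arithmetic. All prices and capacities below are illustrative assumptions, not real vendor rates: buying capacity in whole-server increments yields a staircase that sits above demand, while pay-per-use cost tracks the demand line.

```python
import math

SERVER_CAPACITY = 100   # requests/sec one owned server can handle (assumed)
SERVER_COST = 500       # monthly cost of an owned/leased server (assumed)
CLOUD_UNIT_COST = 6     # monthly cloud cost per unit of demand (assumed)

def owned_cost(demand):
    """Staircase: you must buy whole servers to stay ahead of demand."""
    return math.ceil(demand / SERVER_CAPACITY) * SERVER_COST

def cloud_cost(demand):
    """Pay per use: cost hugs the demand line."""
    return demand * CLOUD_UNIT_COST

for demand in (10, 101, 250):
    print(demand, owned_cost(demand), cloud_cost(demand))
```

Note how crossing a capacity boundary (demand of 101 versus 100) forces the purchase of an entire additional server, which is exactly the gap between the cost and demand lines that the pay-per-use model narrows.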

Staffing flexibility is another benefit of cloud computing in terms of cost. Some cloud proponents maintain that you can skip staffing an operations team if you leverage cloud computing properly. While the authors’ experiences suggest that this is a bit of an exaggeration, it is true that the cloud simplifies the complexity of operations, so that a reduction in staff is possible. Small companies can almost certainly get away with building and operating an application in the cloud without the need for an operations team.

Speed

The next benefit that we see from the cloud environment is speed—specifically speed as it relates to time to market for our products. Here we are talking about time for procurement, provisioning, and deployment. Of all the colocation, data center, managed hosting, and other infrastructure models, nothing is faster when it comes to adding another server than operating in a cloud environment (unless, of course, you’ve built similar tooling and run your own private cloud). Because of the virtual nature of cloud computing, deployment and provisioning proceed very quickly. If you are running a site that expects a spike of traffic over the weekend because of some sporting event, you can throw a few more virtual hosts into the pool on Friday afternoon and release them back Monday morning. With such a plan, you have these resources to use over the weekend to add capacity, but you don’t pay for them the week after the spike in traffic ends. The ability to increase an application’s usage of virtual hosts very quickly can serve as an effective method of scaling through peak traffic periods.

Today, many clouds augment capacity automatically through auto-scaling. If the application is designed to take advantage of this functionality, it can realize hands-off, near-real-time provisioning of compute and storage capacity. Even if you don’t take advantage of auto-scaling, provisioning is still faster than it has ever been in the past with other models. Note that we don’t mean to imply that it is wise to scale only on the x-axis with additional hardware instances. If your application has this capability and you’ve determined that this is a wise strategic architectural decision, then the greater speed adds a lot to your ability to deploy more hosts quickly. Nevertheless, you may not be able to utilize such speed if your application maintains state and your system lacks a mechanism for keeping users assigned to one host or centralizing the stateful session data. Also, if your application uses a database and cannot handle an x-axis split for read/write operations or a y-axis split of schemas, being able to quickly add more hardware will not help you scale. The bottom line is that this benefit of speedy deployments must work with your application if you are to take advantage of it.
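The auto-scaling idea above reduces to a simple control rule: size the pool to hit a target utilization, clamped between a floor and a ceiling. The sketch below is a minimal illustration under assumed numbers; real providers layer on cooldowns, step policies, and health checks.

```python
import math

def desired_instances(current_load, capacity_per_instance,
                      target_utilization=0.6,
                      min_instances=2, max_instances=50):
    """Return the pool size that keeps each instance near the
    target utilization, clamped to a floor and ceiling."""
    needed = current_load / (capacity_per_instance * target_utilization)
    return max(min_instances, min(max_instances, math.ceil(needed)))

# capacity_per_instance = 50 requests/sec per host (assumed figure)
print(desired_instances(100, 50))    # light traffic: pool stays small
print(desired_instances(3000, 50))   # weekend spike: pool grows to the cap
```

Note that the `max_instances` ceiling plays the same role as the provider's capacity limit discussed earlier: on-demand scale is elastic, not unbounded.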

Flexibility

The third major benefit of cloud computing environments is flexibility. What you give up in control, you gain in the ability to implement multiple configurations for different needs. For example, if today you need five quality assurance test instances, you can set them up in the morning, test your code on them, and remove them tonight. Tomorrow, you can set up a full staging environment to allow your customers to perform user acceptance testing before you roll the code to production. After your customers are satisfied, you can remove the environment and stop paying for it. If you need a load testing environment that requires a bunch of individual hosts to provide multiple connections, ramping up a dozen virtual hosts for an hour of load testing is easily done in most cloud environments.

This flexibility to add, remove, or change your environments almost at whim is something that previous infrastructures haven’t offered. After a team gets used to this ability to change and reconfigure, they won’t want to be constrained by physical devices.

Cons of Cloud Computing

There are five major categories of concerns or drawbacks for public cloud computing. These cons do not all apply to private clouds, but because the greatest utility and greatest public interest is in the use of public clouds, we will stick to using public clouds for our analysis. These five categories—security, portability, control, performance, and cost—are obviously very broad areas, so we will have to delve into each one in more detail to fully understand them.

Security

Not a month goes by that the public is not bombarded with media reports about leaked personal information or a security breach. This causes us to ask the question, “How do cloud providers store and safeguard our information?” The same question can be asked of many of our SaaS vendors. The slight difference is that with a SaaS implementation, the vendor often knows whether it is collecting and storing sensitive information such as personally identifiable information (name, address, Social Security number, phone number, and so on), so it takes extra precautions and publishes its steps for safeguarding this information. In contrast, cloud providers have no idea what is being stored on their systems—that is, whether their customers are storing credit card numbers or blogs—and, therefore, do not take any extra precautions to restrict or block access to those data by their internal employees. Of course, there are ways around this, such as not storing any sensitive information on the cloud system, but those workarounds add more complexity to your system and potentially expose you to more risks. As stated earlier, this factor may or may not be a very important consideration for your particular company or application that you are considering hosting on a cloud.

A counter-argument is that in most cases the cloud provider handles security better than a small company could. Because most small startup companies lack the skill and dedicated staff to focus on infrastructure security issues, there’s some benefit to hosting in a cloud, where ideally the vendor has already done some thinking about security.

Portability

We long for the day when we can port an application from one cloud to another without code or configuration changes—but this day has not yet arrived, nor do we think it will do so in the near future because it is not beneficial for cloud vendors to make this process easy. Of course, it’s not impossible to migrate from one cloud to another or from a cloud to a physical server hosting environment, but this effort can be a nontrivial endeavor depending on the cloud and particular services being utilized. For instance, if you are using Amazon’s Simple Storage Service (S3) and you want to move to another cloud or to a set of physical servers, you will likely have to rework your application to implement storage in a simple database. Although not the most challenging engineering project, this rework does take time and resources that could be used to work on product features.

One of the principles discussed in Chapter 12, Establishing Architectural Principles, was “use commodity hardware”; this vendor-agnostic approach to hardware is important to scale in a cost-efficient manner. Not being able to port across clouds easily goes against this principle and, therefore, is a drawback that should be considered. Nevertheless, this situation is improving. Cloud providers such as Amazon Web Services are now making it easier to get in and out of their clouds with services such as import/export functionality that enables you to import virtual machine images from an existing environment to Amazon EC2 instances and export them back to on-premises environments.

Control

Whenever you rely solely on a single vendor for any part of your system, you are putting your company’s future in the hands of another organization. We like to control our own destiny as much as possible, so relinquishing a significant amount of control to a third party is a difficult step for us to take. This approach is probably acceptable when it comes to operating systems and relational database management systems, because ideally you will be using a vendor or product line that has been around for years and you aren’t likely to build or manage anything better with your engineering team—unless, of course, you are in the business of operating systems or relational database management systems.

When it comes to hosting environments, many companies move away from managed environments because they get to a point where they have the technical talent on staff to handle the operational tasks required for hosting their own hardware and they get fed up with vendors making painful mistakes. Cloud environments are no different. They are staffed by people who are not your employees and who do not have a personal stake in your business. This is not to say that cloud or hosting providers have inferior employees. Quite the opposite: Their personnel are usually incredibly talented, but they do not know or understand your business. The provider has hundreds or thousands of servers to keep up and running. It doesn’t know that this one is any more important than that one; they are all the same to the cloud or hosting provider. Giving up control of your infrastructure to this type of third party, therefore, adds risk to your business.

Many cloud vendors have not even reached the point of being able to offer guaranteed availability or uptime. When vendors do not stand behind their products with remuneration clauses specific for failures, it would be wise to consider their service as being on a “best effort” basis, which means you need to have an alternative method of receiving that service. As we mentioned in the discussion of portability, running on or switching between multiple clouds is not a simple task.

Performance

Another major category of concerns that we have in regard to clouds is performance related. From our experiences with our clients on cloud computing infrastructures, the expected performance from equivalent pieces of physical hardware and virtual hardware is not the same. This issue is obviously very important to the scalability of your application, especially if you have singletons—that is, single instances of batch jobs or parts of your application running on only a single server. Obviously, running a single instance of anything is not an effective way to scale, but it is common for a team to start on a single server and not test the job or program on multiple servers until they are needed. Migrating to a cloud and realizing that the processing of the job is falling behind on the new virtual server might put you in panic mode, causing you to hurriedly test and validate that the job can run correctly on multiple hosts.
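One way to take the panic out of that migration is to guard the singleton job with a shared lock so it can be deployed on several hosts while only one executes each run. A minimal sketch, using an in-memory lock object as a stand-in for a shared mechanism such as a database row or a distributed lock service (the names here are our own illustration):

```python
import threading


class RunLock:
    """Stand-in for a shared lock (e.g., a database row or a
    distributed lock service) that every host can see."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._holder = None

    def try_acquire(self, host):
        # First host to claim this run's lock wins; everyone else skips.
        with self._mutex:
            if self._holder is None:
                self._holder = host
                return True
            return False


lock = RunLock()
# Every host schedules the job; only the lock winner does the work.
winners = [h for h in ("host-a", "host-b", "host-c") if lock.try_acquire(h)]
assert winners == ["host-a"]
```

Because every host competes for the lock, adding or removing hosts requires no configuration changes, and the job keeps running if any single host dies.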

Virtual hardware underperforms its physical counterparts in some aspects by orders of magnitude. The standard performance metrics in such cases include memory speed, CPU, disk access, and so on. There is no standard degradation or equivalence among virtual hosts; in fact, it often varies within cloud environments and certainly varies from one vendor to another. Most companies and applications either don’t notice this degradation or don’t care about it, but when performing a cost–benefit analysis for switching to a cloud computing vendor, you need to test this yourself with your application. Do not take a vendor’s word and assume that the cloud offers a truly equivalent virtual host. Each application has its own sensitivity and bottlenecks with regard to host performance. Some applications are bottlenecked on memory, such that slowing down memory by even 5% can cause the entire application to scale much more poorly on certain hosts. This performance matters when you are paying thousands of dollars in computing costs per month. What might have been a 12-month break-even point now becomes 18 or 24 months in some cases.
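To see how degraded virtual performance stretches the payback period, it helps to put the break-even arithmetic in code. A back-of-the-envelope sketch with made-up prices—every number below is illustrative, not a quote from any vendor:

```python
def cloud_break_even_months(migration_cost, owned_monthly,
                            instance_monthly, instances_needed,
                            perf_ratio):
    """Months to recover a one-time migration cost, assuming a virtual
    host delivers perf_ratio (0..1] of an equivalent physical host, so
    the same workload needs instances_needed / perf_ratio instances."""
    cloud_monthly = instance_monthly * instances_needed / perf_ratio
    monthly_savings = owned_monthly - cloud_monthly
    return migration_cost / monthly_savings


# $60,000 migration; $10,000/month owned; 10 cloud instances at $500 each.
at_parity = cloud_break_even_months(60_000, 10_000, 500, 10, 1.0)   # 12 months
degraded = cloud_break_even_months(60_000, 10_000, 500, 10, 0.75)   # 18 months
assert abs(at_parity - 12.0) < 1e-6
assert abs(degraded - 18.0) < 1e-6
```

A virtual host that delivers only 75% of physical performance inflates the effective cloud bill by a third and pushes the 12-month break-even to 18 months, which is exactly the kind of shift your cost–benefit analysis needs to capture.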

Cost

While we listed cost as a benefit of clouds, it can also be a drawback. Large, rapidly growing companies can usually realize greater margins by using wholly owned equipment than they can by operating in the cloud. This difference arises because IaaS operators, while being able to purchase and manage their equipment cost-effectively, are still looking to make a profit on their services. As such, many larger companies can negotiate purchase prices that allow them to operate on a lower cost basis after they calculate the overhead of teams and the amortization of equipment.

This is not to say that IaaS solutions don’t have a place for large, fast-growing companies. They do! Systems that are not utilized around the clock and sit idle for large portions of the day are a waste of capital and can likely run more cost-effectively in the cloud. We will discuss this issue more in Chapter 32, Planning Data Centers, when we discuss how companies should think about their data center strategies from a broad perspective.
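The utilization argument above reduces to simple arithmetic: owned equipment costs the same whether busy or idle, while the cloud charges only for the hours you actually use. A toy comparison with made-up prices (both figures are illustrative):

```python
def cheaper_option(owned_monthly, cloud_hourly, hours_used_per_day):
    """Owned gear is a fixed monthly cost regardless of utilization;
    the cloud bill scales with hours actually used (30-day month)."""
    cloud_monthly = cloud_hourly * hours_used_per_day * 30
    return "cloud" if cloud_monthly < owned_monthly else "owned"


# Illustrative prices: $720/month owned versus $2/hour rented.
assert cheaper_option(720, 2.0, 8) == "cloud"    # busy 8 h/day: $480 < $720
assert cheaper_option(720, 2.0, 24) == "owned"   # always busy: $1,440 > $720
```

The crossover point depends entirely on your utilization profile, which is why systems that sit idle most of the day tend to belong in the cloud even at companies that own everything else.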

So far, we have covered what we see as the top drawbacks and benefits of cloud computing as they exist today. As we have mentioned throughout this section, how these factors affect your decision to implement a cloud computing infrastructure will vary depending on your business and your application. In the next section, we’ll highlight some of the different ways in which you might consider utilizing a cloud environment as well as how you might assess the importance of some of the factors discussed here based on your business and systems.

Where Clouds Fit in Different Companies

In this section, we describe a few of the various implementations of clouds that we have either seen or recommended to our clients. Of course, you can host your application’s production environment on a cloud, but many other environments are also available in today’s software development organizations. Likewise, there are many ways to utilize different environments together, such as combining a managed hosting environment with a colocation facility. Obviously, hosting your production environment in a cloud offers “scale on demand” ability from a virtual hardware perspective. At the same time, you cannot be sure that your application’s architecture can make use of this virtual hardware scaling; you must ensure that such compatibility exists ahead of time. This section also covers some other ways that clouds can help your organization scale. For instance, if your engineering or quality assurance teams are waiting for environments to become available for their use, the entire product development cycle is slowed down, which means scalability initiatives such as splitting databases, removing synchronous calls, and so on get delayed and affect your application’s ability to scale.

Environments

For your production environment, you can host everything in one type of infrastructure, such as managed hosting, colocation, your own data center, a cloud computing environment, or some other scheme. At the same time, there are creative ways to utilize several of these options together to take advantage of their benefits but minimize their drawbacks. To see how this works, let’s look at an example of an ad serving application.

The ad serving application consists of a pool of Web servers that accept the ad request, a pool of application servers that choose the right advertisement based on the information conveyed in the original request, an administrative tool that allows publishers and advertisers to administer their accounts, and a database for persistent storage of information. The ad servers in our application do not need to access the database for each ad request. Instead, they make a request to the database once every 15 minutes to receive the latest advertisements. In this situation, we could obviously purchase a bunch of servers to rack in a colocation space for each of the Web server, ad server, administrative server, and database server pools. We could also just lease the use of these servers from a managed hosting provider and let the third-party vendor worry about the physical server. Alternatively, we could host all of this in a cloud environment on virtual hosts.

We think there is another alternative, as depicted in Figure 29.2. Perhaps we have the capital needed to purchase the pools of servers and we have the skill set in our team members required to handle setting up and running our own physical environment, so we decide to rent space at a colocation facility and purchase our own servers. But we also like the speed and flexibility gained from a cloud environment. Recognizing that the Web and app servers don’t talk to the database very often, we decide to host one pool of each in a colocation facility and another pool of each on a cloud. The database will stay at the colocation facility, but snapshots will be sent to the cloud for disaster recovery. The Web and application server pools in the cloud can be expanded as traffic demands grow, helping us cover unforeseen spikes.
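The surge-capacity half of this design boils down to a routing rule: fill the owned colocation pool first and spill overflow to the on-demand cloud pool. A minimal sketch of that decision (the 85% threshold is an arbitrary illustration, not a recommendation):

```python
SURGE_THRESHOLD = 0.85  # spill to the cloud above 85% colo utilization


def route_request(colo_active, colo_capacity):
    """Prefer the owned colocation pool; use the cloud pool only
    when the colo pool nears saturation."""
    utilization = colo_active / colo_capacity
    return "colo" if utilization < SURGE_THRESHOLD else "cloud"


assert route_request(50, 100) == "colo"   # plenty of owned headroom
assert route_request(90, 100) == "cloud"  # surge: overflow to the cloud
```

In practice this decision usually lives in a load balancer’s weighting rules rather than application code, but the economics are the same: the capital-efficient pool absorbs the baseline, and the pay-per-use pool absorbs the spikes.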


Figure 29.2 Combined Colocation and Cloud Production Environment

Another use of cloud computing is in all the other environments that are required for modern software development organizations. These environments include, but are not limited to, production, staging, quality assurance, load and performance, development, build, and repositories. Many of these should be considered candidates for implementing in a cloud environment because of the possibly reduced cost, as well as the flexibility and speed in setting up these resources when needed and tearing them down when they are no longer needed. Even enterprise-class SaaS companies and Fortune 500 corporations that might never consider hosting production instances of their applications on a cloud could benefit from utilizing the cloud for other types of environments.

Skill Sets

What other factors should you consider when deciding whether to utilize a cloud and, if you do, which environments are appropriate for it? One consideration is the skill set and number of personnel that you have available to manage your operations infrastructure. If you do not have both networking and system administration skill sets among your operations staff, you need to consider this factor when determining whether you can implement and support a colocation environment. The most likely answer in that case is that you cannot. Without the necessary skill set, moving to a more sophisticated environment will actually cause more problems than it will solve. The cloud has similar issues; if someone isn’t responsible for deploying and shutting down instances and this is left to each individual developer or engineer, it is very possible that the bill at the end of the month will be much higher than you expected. Instances that are left running are wasting money unless someone has made a purposeful decision that the instance is necessary.
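The runaway-bill problem is usually tamed with a reaping policy: flag any instance that has no named owner or has outlived its time-to-live, then review or terminate it. A sketch of such a policy over plain dictionaries—the field names and the 24-hour TTL are our own illustration, not any provider’s API:

```python
def instances_to_reap(instances, max_age_hours=24):
    """Return the ids of instances with no owner tag or running
    past their time-to-live; both policy knobs are illustrative."""
    flagged = []
    for inst in instances:
        if not inst.get("owner") or inst.get("age_hours", 0) > max_age_hours:
            flagged.append(inst["id"])
    return flagged


fleet = [
    {"id": "i-1", "owner": "qa",  "age_hours": 3},    # fine
    {"id": "i-2", "owner": None,  "age_hours": 1},    # untagged
    {"id": "i-3", "owner": "dev", "age_hours": 60},   # past TTL
]
assert instances_to_reap(fleet) == ["i-2", "i-3"]
```

A nightly job applying even a crude rule like this turns an open-ended cloud bill back into a purposeful decision about which instances deserve to keep running.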

Another type of skill set that may influence your decision is capacity planning. Whether your business has very unpredictable traffic or you do not have the necessary skill set on staff to accurately predict the traffic, this factor may heavily influence your decision to use a cloud. Certainly one of the key benefits of the cloud is the ability to handle spiky demand by quickly deploying more virtual hosts.

All in all, we believe that cloud computing likely fits in almost any company in some role. This fit might not be for hosting your production environment, but rather might entail hosting your testing environments. If your business’s growth is unpredictable, if speed is of utmost urgency, and if cutting costs is imperative to your organization’s survival, the cloud might be a great solution. If you can’t afford to allocate headcount for operations management or predict which kind of capacity you may need down the line, cloud computing could be what you need. How you pull all of this information together to make the decision is the subject of the next section.

Decision Process

Now that we’ve looked at the pros and cons of cloud computing and discussed some of the ways in which cloud environments can be integrated into a company’s infrastructure, the last step is to provide a process for making the final decision on whether to pursue and implement cloud computing. The overall process that we recommend has five steps: first, determine the goals or purpose of wanting to investigate cloud computing; second, create alternative implementation options that achieve those goals; third, weight the pros and cons based on your particular situation; fourth, rank each alternative against those pros and cons; and fifth, select an alternative based on the final tally.

Let’s walk through an example to see how this decision-making process works. The first step is to determine which goals we hope to achieve by utilizing a cloud environment. Perhaps the goals are to lower the operations cost of infrastructure, decrease the time to procure and provision hardware, and maintain 99.99% availability for an application hosted on this environment. Based on these three goals, you might decide on three alternatives. The first is to do nothing, remain in a colocation facility, and forget about all this cloud computing talk. The second alternative is to use the cloud for only surge capacity but remain in the colocation facility for most of the application services. The third alternative is to move completely onto the cloud and out of the colocation space. This has accomplished steps 1 and 2 of the decision process.

Step 3 is to apply weights to all of the pros and cons that we can come up with for our alternative environments. Here, we will use the four cons and three pros that we outlined earlier. Note that “cost” can be either a pro or a con. For our example, we’ll use it as a pro. We will use a 1, 3, or 9 scale to weight these factors and thereby differentiate the ones that we care about. The first con is security; we care about it to some extent, but we don’t store personally identifiable information or credit card data, so we weight it a 3. We continue with portability and determine that we don’t really feel the need to be able to move quickly between infrastructures, so we weight it a 1. Next is control, which we really care about, so we weight it a 9. Finally, the last of the cons is performance. Because our application is not very memory or disk intensive, we don’t feel that this is a major deal for us, so we weight it a 1. For the pros, we really care about cost, so we weight it a 9. The same with speed: It is one of the primary goals, so we care a lot about it and weight it a 9. Last is flexibility, which we don’t expect to use much, so we weight it a 1.

The fourth step is to rank each alternative on a scale from 0 to 5 based on how well it demonstrates each of the pros and cons. For example, with the “use the cloud for only surge capacity” alternative, the portability drawback should be ranked very low because it is not likely that we will need to exercise that option. Likewise, with the “move completely to the cloud” alternative, the control drawback weighs heavily because there is no other environment, so it gets ranked a 5.

The completed decision matrix can be seen in Table 29.1. After all of the alternatives are scored against the pros and cons, the numbers can be multiplied and summed. The weight of each pro or con is multiplied by the rank or score of each alternative; these products are summed for each alternative. For example, alternative 2, Cloud for Surge, has been ranked a 2 for security, which is weighted as –3. All cons are weighted with negative scores so the math is simpler. The product of the rank and the weight is –6, which is then summed with all the other products for alternative 2, for a total score: (2 × –3) + (1 × –1) + (3 × –9) + (3 × –1) + (3 × 9) + (3 × 9) + (1 × 1) = 18.
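The whole matrix calculation fits in a couple of lines of code. Here is the alternative 2 example from above, reproduced in Python:

```python
def score_alternative(weights, ranks):
    """Multiply each factor's weight (cons carry negative weights)
    by the alternative's 0-5 rank, then sum the products."""
    return sum(w * r for w, r in zip(weights, ranks))


# Factors: security, portability, control, performance, cost, speed, flexibility
weights = [-3, -1, -9, -1, 9, 9, 1]
cloud_for_surge_ranks = [2, 1, 3, 3, 3, 3, 1]
assert score_alternative(weights, cloud_for_surge_ranks) == 18
```

Scoring every alternative is then one call per row of the matrix, which also makes it trivial to re-run the tally when several people score independently.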


Table 29.1 Decision Matrix

The final step is to compare the total scores for each alternative and apply a level of common sense to the result. Here, we have alternatives with scores of 0, 9, and –6; thus alternative 2 is clearly the better choice for us. Before automatically assuming that this is our final decision, we should verify that it is a sound decision based on common sense and any factors that might not have been included. If something appears to be off, or if you want to add other factors such as operations skill sets, redo the matrix or have several people score it independently and compare the results.

The most likely question with regard to introducing cloud computing into your infrastructure is not whether to do it but rather when and how is the right way to do it. Cloud computing is not going away; in fact, it is likely to be the preferred but not only infrastructure model of the future. All of us need to keep an eye on how cloud computing evolves over the coming months and years. This technology has the potential to change the fundamental cost and organizational structures of most SaaS companies.

Conclusion

The history of cloud computing dates back several decades, and the concept of the modern-day cloud can be credited to IBM’s Autonomic Computing Manifesto. However, the evolution of cloud computing into its current form has been made possible by many different people and companies, including Amazon Web Services, whose EC2 was one of the first public cloud services.

Cloud computing has three major pros: cost, speed, and flexibility. The “pay per use” model is extremely attractive to companies and makes great sense. Working in a virtual environment also offers unequaled speed of procurement and provisioning. An example of flexibility is how you can utilize a set of virtual servers today as a quality assurance environment: shut them down at night and bring them back up the next day as a load and performance testing environment. This is a very attractive feature of virtual hosts in cloud computing.

Cons of cloud computing include concerns about security, portability, control, performance, and cost. The security factor reflects concerns about how data is handled after it is in the cloud. The provider has no idea which type of data is stored there, and clients have no idea who has access to the data. This discrepancy between the two causes some concern. The portability factor addresses the fact that porting between clouds or between clouds and physical hardware is not necessarily easy depending on the application. The control issues arise from the integration of another third-party vendor into your infrastructure that has influence over not just one part of your system’s availability, but probably the entirety of your service’s availability. Performance is a con because it can vary dramatically between cloud vendors as well as compared to physical hardware. In terms of cost, the “pay per use” model makes getting started very attractive financially, but many large companies find that they can lower their compute costs by owning hardware equipment if it remains mostly in use. Your company and the applications that you are considering hosting on the cloud environment should dictate the degree to which you care about any of these cons.

Cloud computing can fit into different companies’ infrastructures in different ways. Clouds can serve as part or all of the production environment, and they can also host other environments such as quality assurance or development. As part of the production environment, cloud computing could be used for surge capacity or disaster recovery or, of course, to host all of production. The examples presented in this chapter were designed to show you how you might make use of the pros or benefits of cloud computing to aid your scaling efforts, whether directly for your production environment or more indirectly by aiding your product development cycle. This could take the form of capitalizing on the speed of provisioning virtual hardware or the flexibility in using the environments in different ways each day.

Each organization must make the decision of whether to use cloud computing in its operations. A five-step process for reaching this decision includes (1) establishing goals, (2) describing alternatives, (3) weighting pros and cons, (4) scoring the alternatives, and (5) tallying the scores and weightings to determine the highest-scoring alternative. The bottom line: Even if a cloud environment is not right for your organization today, you should continue looking at clouds because they will continue to improve, and it is very likely that cloud computing will be a good fit at some time.

Key Points

• The term cloud has been around for decades and was used primarily in network diagrams.

• The idea of the modern cloud concept was put forth by IBM in its Autonomic Computing Manifesto.

• Developing alongside the idea of cloud computing were the concepts of Software as a Service, Infrastructure as a Service, and many more “as a Service” offerings.

• Software as a Service refers to almost any form of software that is offered via a “pay as you use” model.

• Infrastructure as a Service is the idea of offering infrastructure such as storage, servers, network, and bandwidth in a “pay as you use” model.

• Platform as a Service provides all the required components for developing and deploying Web applications and services.

• Everything as a Service is the idea of being able to have small components that can be pieced together to provide a new service.

• Pros of cloud computing include cost, speed, and flexibility.

• Cons of cloud computing include security, control, portability, performance, and cost.

• There are many ways to utilize cloud environments.

• Clouds can be used in conjunction with other infrastructure models by using them for surge capacity or disaster recovery.

• You can use cloud computing for development, quality assurance, load and performance testing, or just about any other environment, including production.

• A five-step process is recommended when deciding where and how to use cloud computing in your environment.

• All technologists should be aware of cloud computing; almost all organizations can take advantage of cloud computing in some way.
