Chapter 21. Distributing the Cloud

We all know the value of distributing an application across multiple data centers. The same philosophy applies to the cloud. As we put portions of our applications, or complete applications, into the cloud, we need to watch where in the cloud they are located. How distributed our applications are is just as important in the cloud as it is in traditional data centers, particularly as applications scale.

However, the cloud makes knowing whether your application is distributed more difficult. The cloud also makes it more difficult to proactively make your application more distributed. Some cloud providers don’t even expose enough information to let you know where, geographically, your application is running.

Luckily, larger providers like AWS, although they won’t tell you specifically where your application is running, will give you enough information to make decisions about where your application is running. Interpreting and understanding this information and using it to your advantage requires an understanding of how AWS is architected.

AWS Architecture

First, let’s discuss some terms used within the AWS ecosystem.

AWS Region

An AWS region is a large collection of cloud resources that represents a specific geographic area. In general, a region represents a portion of an individual continent or country (such as Western Europe, Northeastern Asia-Pacific, or United States East). Regions describe and document the geographic diversity of cloud resources. They are usually composed of multiple availability zones (AZs); however, it is possible for a region to have only a single availability zone.

An AWS region is identified by a string representing its geographical location. Table 21-1 gives the current list of AWS regions, their names, and where they serve.

Table 21-1. AWS regions
Region name [a]    Geographic area covered
us-east-1          US East Coast (N. Virginia)
us-west-1          US West Coast (N. California)
us-west-2          US West Coast (Oregon)
eu-west-1          EU (Ireland)
eu-central-1       EU (Frankfurt)
ap-northeast-1     Asia Pacific (Tokyo)
ap-northeast-2     Asia Pacific (Seoul)
ap-southeast-1     Asia Pacific (Singapore)
ap-southeast-2     Asia Pacific (Sydney)
sa-east-1          South America (Sao Paulo)

[a] AWS regions and availability zones as of February 2016.

AWS Availability Zone

An AWS availability zone is a subset of an AWS region. It represents the cloud resources within a specific portion of that region, and the availability zones of a region are isolated from one another in terms of network topology. AWS availability zones describe and document the network topological diversity of cloud resources. If two cloud resources are in different availability zones, they can be assumed to be in distinct data centers, even if they are in the same AWS region. If two cloud resources are in the same availability zone, they can potentially both be in the same data center, floor, rack, or even physical server.

An AWS availability zone is identified by a string beginning with the name of the region the AZ is in, followed by a letter (a–z). Table 21-2 shows some example availability zones and the regions they are in.

Table 21-2. AWS availability zone names
Region name    AZ names
us-east-1      us-east-1a, us-east-1b, us-east-1c, us-east-1d, us-east-1e
us-west-1      us-west-1a, us-west-1b
us-west-2      us-west-2a, us-west-2b
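
The naming rule can be sketched in a few lines of Python. This is only an illustration: the `parse_az` helper and the `AvailabilityZone` type are hypothetical names, and the pattern assumes region names shaped like those in Table 21-1 (area, direction, number).

```python
import re
from typing import NamedTuple


class AvailabilityZone(NamedTuple):
    region: str  # e.g. "us-east-1"
    letter: str  # e.g. "a"


# Assumed shape: a region name such as "us-east-1" followed by one lowercase letter.
_AZ_PATTERN = re.compile(r"^([a-z]+-[a-z]+-\d+)([a-z])$")


def parse_az(name: str) -> AvailabilityZone:
    """Split an AZ name into the region it belongs to and its letter."""
    match = _AZ_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not an availability zone name: {name!r}")
    return AvailabilityZone(region=match.group(1), letter=match.group(2))


print(parse_az("us-west-2b"))
```

A region name on its own, such as "us-east-1", is rejected because it lacks the trailing letter.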

Data Center

This is not a term used within AWS vocabulary, but we will use it as we map typical noncloud terminology into AWS terminology.

A data center is a specific floor, building, or group of buildings that constitute a single location of system resources, such as servers.

Architecture Overview

Figure 21-1 shows, at a high level, what the AWS cloud architecture looks like. AWS is composed of several AWS regions, which are geographically distributed around the globe in order to provide high-quality access to most locations in the world. Each AWS region has its own connections to the Internet. The regions are also connected to one another, but over long-distance network connections similar to those of the rest of the Internet.

Figure 21-1. AWS data center architecture

A single AWS region is composed of one or more AWS availability zones. The AZs within a single region are connected via an extremely high-speed hub network link, as shown in Figure 21-2. The goal is for access between any two servers within a region to have similar performance characteristics, regardless of the AZ in which each server is located.

A given AZ is composed of one or more data centers, depending on the size of the AZ.

Figure 21-2. AWS region and availability zone network performance

As you can see, the network topology is designed to make it easy to build an application within a single region but distributed across availability zones. This distribution is designed to give redundant systems failover opportunities in the event of problems with individual data centers, while maintaining the ability for the independent components to communicate with one another at high speeds transparently, without regard to the availability zone they are in.

However, regions are designed so that an entire application can be contained within a single region and does not require high-speed communication with components in other regions. If an application needs to be in multiple regions, multiple copies of the application are typically run independently, one copy in each region. This gives each geographic region local access to an instance of the application without paying the cost of long-distance communication links. This is shown in Figure 21-3. The AWS network traffic pricing model supports this approach: traffic between AZs within a single region is typically free or very cheap, while traffic between regions, or out from a region to the Internet, is charged at higher rates.

Figure 21-3. Customer architecture

This architecture is important not only from a cost standpoint, but also from a latency standpoint (region-to-region network latency is higher than AZ-to-AZ). Additionally, this structure gives your application the ability to support various governmental regulations, such as EU Safe Harbor.1
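
The distinction that drives both cost and latency is whether two components share an AZ, share only a region, or share neither. A minimal sketch of that classification follows; `TrafficScope`, `region_of`, and `traffic_scope` are illustrative names, and the sketch assumes both AZ names come from the same AWS account and follow the region-plus-letter naming described earlier.

```python
from enum import Enum


class TrafficScope(Enum):
    SAME_AZ = "same availability zone"
    CROSS_AZ = "different AZs, same region"
    CROSS_REGION = "different regions"


def region_of(az_name: str) -> str:
    """Drop the trailing AZ letter: "us-east-1a" -> "us-east-1"."""
    return az_name[:-1]


def traffic_scope(src_az: str, dst_az: str) -> TrafficScope:
    """Classify the path between two AZs seen from a single account."""
    if region_of(src_az) != region_of(dst_az):
        return TrafficScope.CROSS_REGION  # long-distance link, higher latency
    if src_az != dst_az:
        return TrafficScope.CROSS_AZ      # high-speed link inside the region
    return TrafficScope.SAME_AZ


print(traffic_scope("us-east-1a", "eu-west-1a").value)
```

Components that talk to each other constantly are best kept within the first two scopes; the third is the one to design around.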

Availability Zones Are Not Data Centers

Within a given account, an EC2 instance in one AZ (such as us-east-1a) and an EC2 instance in another AZ (such as us-east-1b) may safely be assumed to be in distinct data centers.

However, this is not necessarily true when you are using more than one AWS account. When you create an EC2 instance in account 1 that is in AZ us-east-1a, and an EC2 instance in account 2 that is in AZ us-east-1c, these two instances might, in fact, be on the same physical server within the same data center.

Why is this the case? It is because the AZ names do not statically map directly to specific data centers. Instead, the data center(s) used for “us-east-1a” in one account might be different than the data center(s) used for “us-east-1a” in another account.

When you create an AWS account, AWS “randomly” creates a mapping of availability zone names to specific data centers.2 This means that one account’s view of “us-east-1a” may be physically located somewhere very different from another account’s view of “us-east-1a”. This is demonstrated in Table 21-3. Here we show an arbitrary number of data centers (arbitrarily numbered 1 through 8) within a single region. Then, we show a possible mapping between AZ names and those data centers for four sample accounts.
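
To make the idea of a per-account mapping concrete, here is a purely hypothetical sketch. It is not how AWS assigns AZ names; the hashing scheme, the `az_mapping` function, and the round-robin assignment are all assumptions chosen only to show that a deterministic algorithm can still give each account a different view.

```python
import hashlib
import random
from typing import Dict, List

AZ_LETTERS = ["a", "b", "c", "d", "e"]
DATA_CENTERS = [1, 2, 3, 4, 5, 6, 7, 8]  # arbitrary numbering, as in Table 21-3


def az_mapping(account_id: str, region: str = "us-east-1") -> Dict[str, List[int]]:
    """Hypothetical: derive a stable AZ-name-to-data-center map for one account."""
    # Seed a private random generator from the account ID, so the same
    # account always gets the same mapping.
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))

    shuffled = DATA_CENTERS[:]
    rng.shuffle(shuffled)

    mapping: Dict[str, List[int]] = {f"{region}{letter}": [] for letter in AZ_LETTERS}
    names = list(mapping)
    # Deal the shuffled data centers out round-robin, so every AZ name gets
    # at least one data center and some get more than one.
    for i, dc in enumerate(shuffled):
        mapping[names[i % len(names)]].append(dc)
    return mapping


for account in ("111111111111", "222222222222"):  # made-up account IDs
    print(account, az_mapping(account))
```

Unlike the sample accounts in Table 21-3, this simplified sketch uses every data center in every account; the point is only that the name "us-east-1a" carries no fixed physical meaning.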

Table 21-3. Unexpected availability zone mappings
Data Center AWS Account 1 AWS Account 2 AWS Account 3 AWS Account 4

DC #1

us-east-1a

us-east-1d

us-east-1e

DC #2

us-east-1a

us-east-1c

us-east-1a

us-east-1a

DC #3

us-east-1b

us-east-1a

us-east-1d

us-east-1d

DC #4

us-east-1c

us-east-1a

us-east-1b

DC #5

us-east-1d

us-east-1b

us-east-1c

us-east-1c

DC #6

us-east-1e

us-east-1b

DC #7

us-east-1e

DC #8

us-east-1e

From this, you’ll notice a few things. First, a single AZ for an account can, in fact, be contained in multiple distinct data centers. This means that two EC2 instances you create within a single account and a single AZ may be on the same physical server, or they could be in completely different data centers. Second, two EC2 instances created in different accounts may or may not be in the same data center, even if the AZs are different.

For example, in Table 21-3, if account #1 creates an instance in us-east-1b, and account #3 creates an instance in us-east-1d, those two instances will both be created in data center #3.
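
The same lookup can be written as a small sketch. It encodes only the rows of Table 21-3 that list an AZ for all four accounts (data centers #2, #3, and #5), which is enough to reproduce the example; the `shared_data_centers` function and the data layout are illustrative choices.

```python
from typing import Dict, Set

# Data center number -> {account number: AZ name}, taken from the rows of
# Table 21-3 that have an entry for every account.
TABLE_21_3: Dict[int, Dict[int, str]] = {
    2: {1: "us-east-1a", 2: "us-east-1c", 3: "us-east-1a", 4: "us-east-1a"},
    3: {1: "us-east-1b", 2: "us-east-1a", 3: "us-east-1d", 4: "us-east-1d"},
    5: {1: "us-east-1d", 2: "us-east-1b", 3: "us-east-1c", 4: "us-east-1c"},
}


def shared_data_centers(account_a: int, az_a: str,
                        account_b: int, az_b: str) -> Set[int]:
    """Data centers where both (account, AZ) pairs could place an instance."""
    return {
        dc
        for dc, row in TABLE_21_3.items()
        if row.get(account_a) == az_a and row.get(account_b) == az_b
    }


# Different AZ names in different accounts, same data center.
print(shared_data_centers(1, "us-east-1b", 3, "us-east-1d"))  # {3}
```

Within these rows the reverse also holds: account #1 and account #2 both have a "us-east-1b", yet the two share no data center.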

This is important to keep in mind for one simple reason: just because you have two EC2 instances in two accounts in two different AZs does not mean they can be assumed to be independent for availability purposes.

As discussed in Parts I and II of this book, maintaining independence of replicated components is essential for availability and risk management purposes. However, when using multiple AWS accounts, the AWS AZ model does not enforce this. The AZ model can be used to enforce this only when dealing within a single AWS account.

Why would you ever want to use more than one AWS account? Actually, this is fairly common. Many companies create multiple AWS accounts used by different groups within the company. They might do this for billing purposes, permissions management, or other reasons.

Maintaining Location Diversity for Availability Reasons

How do you ensure that the AWS resources you launch have redundant components that are guaranteed to be located in different data centers, and are therefore tolerant of outages?

There are a couple things you can do. First, make sure that you maintain redundant components in distinct AZs within a single account. If you have redundant components that are in multiple accounts, make sure you maintain redundancy in multiple AZs within each account individually. Don’t compare AZs across accounts.
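
This rule lends itself to an automated check. The sketch below is one possible shape for it, assuming you can list each redundant copy of a component as an (account ID, AZ name) pair; the function name and the account IDs are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Placement = Tuple[str, str]  # (account ID, AZ name) for one redundant copy


def accounts_lacking_az_redundancy(placements: Iterable[Placement]) -> List[str]:
    """Return accounts whose copies of a component all sit in a single AZ.

    AZs are only ever compared within one account, never across accounts.
    """
    azs_by_account: Dict[str, Set[str]] = defaultdict(set)
    for account, az in placements:
        azs_by_account[account].add(az)
    return sorted(acct for acct, azs in azs_by_account.items() if len(azs) < 2)


placements = [
    ("acct-1", "us-east-1a"),
    ("acct-1", "us-east-1b"),  # acct-1 is redundant on its own
    ("acct-2", "us-east-1c"),  # acct-2 has a single AZ, so it is flagged
]
print(accounts_lacking_az_redundancy(placements))  # ['acct-2']
```

Note that two copies in different accounts and different AZ names do not satisfy the check, which matches the warning about cross-account comparisons.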

1 EU Safe Harbor is a set of EU privacy principles that govern the transmission of data about EU citizens to locations outside of the EU. It often can matter where data is stored in order to comply with local laws, and AWS regions make it possible for applications to be built to support these laws and principles.

2 Of course, it’s not random, but done algorithmically. And actually the mapping is not done until a specific account makes use of a specific availability zone or region.
