THE AWS CERTIFIED SOLUTIONS ARCHITECT ASSOCIATE EXAM OBJECTIVES COVERED IN THIS CHAPTER MAY INCLUDE, BUT ARE NOT LIMITED TO, THE FOLLOWING:
The ultimate focus of a traditional data center/server room is its precious servers. But, to make those servers useful, you'll need to add racks, power supplies, cabling, switches, firewalls, and cooling.
AWS's Elastic Compute Cloud (EC2) is designed to replicate the data center/server room experience as closely as possible. At the center of it all is the EC2 virtual server, known as an instance. But, like the local server room I just described, EC2 provides a range of tools meant to support and enhance your instance's operations.
This chapter will explore the tools and practices used to fully leverage the power of the EC2 ecosystem, including the following:
An EC2 instance may only be a virtualized and abstracted subset of a physical server, but it behaves just like the real thing. It will have access to storage, memory, and a network interface, and its primary storage drive will come with a fresh and clean operating system running.
It's up to you to decide what kind of hardware resources you want your instance to have, what operating system and software stack you'd like it to run, and, ultimately, how much you'll pay for it. Let's see how all that works.
You configure your instance's operating system and software stack, hardware specs (the CPU power, memory, primary storage, and network performance), and environment before launching it. The OS is defined by the Amazon Machine Image (AMI) you choose, and the hardware follows the instance type.
An AMI is really just a template document that contains information telling EC2 what OS and application software to include on the root data volume of the instance it's about to launch. There are four kinds of AMIs:
A particular AMI will be available in only a single region—although there will often be images with identical functionality in all regions. Keep this in mind as you plan your deployments: invoking the ID of an AMI in one region while working from within a different region will fail.
AWS allocates hardware resources to your instances according to the instance type—or hardware profile—you select. The particular workload you're planning for your instance will determine the type you choose. The idea is to balance cost against your need for compute power, memory, and storage space. Ideally, you'll find a type that offers exactly the amount of each to satisfy both your application and budget.
Should your needs change over time, you can easily move to a different instance type by stopping your instance, editing its instance type, and starting it back up again.
As listed in Table 2.1, there are currently more than 75 instance types organized into five instance families, although AWS frequently updates their selection. You can view the most recent collection at aws.amazon.com/ec2/instance-types
.
TABLE 2.1 EC2 instance type family and their top‐level designations
Instance type family | Types |
---|---|
General purpose | A1, T3, T3a, T2, M6g, M5, M5a, M5n, M4 |
Compute optimized | C5, C5n, C4 |
Memory optimized | R5, R5a, R5n, X1e, X1, High Memory, z1d |
Accelerated computing | P3, P2, Inf1, G4, G3, F1 |
Storage optimized | I3, I3en, D2, H1 |
M5 and M4 instances are recommended for many small and midsized data‐centric operations. Unlike T2, which requires EBS virtual volumes for storage, some M* instances come with their own instance storage drives that are actually physically attached to the host server. M5 types range from m5.large (2 vCPUs and 8 GB of memory) to the monstrous m5d.metal (96 vCPUs and 384 GB of memory).
The specification details and instance type names will frequently change as AWS continues to leverage new technologies to support its customers' growing computing demands. But it's important to be at least familiar with the instance type families and the naming conventions AWS uses to identify them.
Deciding where your EC2 instance will live is as important as choosing a performance configuration. Here, there are three primary details to get right: geographic region, virtual private cloud (VPC), and tenancy model.
As you learned earlier, AWS servers are housed in data centers around the world and organized by geographical region. You'll generally want to launch an EC2 instance in the region that's physically closest to the majority of your customers or, if you're working with data that's subject to legal restrictions, within a jurisdiction that meets your compliance needs.
EC2 resources can be managed only when you're “located within” their region. You set the active region in the console through the drop‐down menu at the top of the page and through default configuration values in the AWS CLI or your SDK. You can update your CLI configuration by running aws configure
.
Bear in mind that the costs and even functionality of services and features might vary between regions. It's always a good idea to consult the most up‐to‐date official documentation.
Virtual private clouds (VPCs) are easy‐to‐use AWS network organizers and great tools for organizing your infrastructure. Because it's so easy to isolate the instances in one VPC from whatever else you have running, you might want to create a new VPC for each one of your projects or project stages. For example, you might have one VPC for early application development, another for beta testing, and a third for production (see Figure 2.1).
Adding a simple VPC that doesn't incorporate a network address translation (NAT) gateway (docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-nat-gateway.html
) or VPN access (docs.aws.amazon.com/vpn/latest/s2svpn/VPC_VPN.html
) won't cost you anything. You'll learn much more about all this in Chapter 4, “Amazon Virtual Private Cloud.”
When launching an EC2 instance, you'll have the opportunity to choose a tenancy model. The default setting is shared tenancy, where your instance will run as a virtual machine on a physical server that's concurrently hosting other instances. Those other instances might well be owned and operated by other AWS customers, although the possibility of any kind of insecure interaction between instances is remote.
To meet special regulatory requirements, your organization's instances might need an extra level of isolation. The Dedicated Instance option ensures that your instance will run on a dedicated physical server. This means that it won't be sharing the server with resources owned by a different customer account. The Dedicated Host option allows you to actually identify and control the physical server you've been assigned to meet more restrictive licensing or regulatory requirements.
Naturally, dedicated instances and dedicated hosts will cost you more than instances using shared tenancy.
Exercise 2.1 will guide you through the launch of a simple EC2 Linux instance.
In Exercise 2.2, you'll see how changing an instance's type works.
You can optionally tell EC2 to execute commands on your instance as it boots by pointing to user data in your instance configuration (this is sometimes known as bootstrapping). Whether you specify the data during the console configuration process or by using the ‐‐user‐data
value with the AWS CLI, you can have script files bring your instance to any desired state.
User data can consist of a few simple commands to install a web server and populate its web root, or it can be a sophisticated script setting up the instance as a working node within a Puppet Enterprise–driven platform.
By default AWS will attempt to spread your instances across their infrastructure to create a profile that will be optimal for most use cases. But the specific demands of your operation might require a different setup. EC2 placement groups give you the power to define nonstandard profiles to better meet your needs. There are, at this time, three kinds of placement groups:
You can purchase the use of EC2 instances through one of three models.
For always‐on deployments that you expect to run for less than 12 months, you'll normally pay for each hour your instance is running through the on‐demand model. On‐demand is the most flexible way to consume EC2 resources since you're able to closely control how much you pay by stopping and starting your instances according to your need. But, per hour, it's also the most expensive.
If you're planning to keep the lights burning 24/7 for more than a year, then you'll enjoy a significant discount by purchasing a reserve instance—generally over a term commitment of between one and three years. You can pay up front for the entire term of a reserve instance or, for incrementally higher rates, either partially up front and the rest in monthly charges or entirely through monthly charges. Table 2.2 gives you a sense of how costs can change between models. These estimates assume a Linux platform, all up‐front payments, and default tenancy. Actual costs may vary over time and between regions.
TABLE 2.2 Pricing estimates comparing on‐demand with reserve costs
Instance type | Pricing model | Cost/hour | Cost/year |
---|---|---|---|
t2.micro | On‐demand | $0.0116 | $102.00 |
t2.micro | Reserve (three‐year term) | $38.33 | |
g3.4xlarge | On‐demand | $1.14 | $9986.40 |
g3.4xlarge | Reserve (three‐year term) | $4429.66 |
For workloads that can withstand unexpected disruption (like computation‐intensive genome research applications), purchasing instances on Amazon's spot market can save you a lot of money. The idea is that you enter a maximum dollar‐value bid for an instance type running in a particular region. The next time an instance in that region becomes available at a per‐hour rate that's equal to or below your bid, it'll be launched using the AMI and launch template you specified. Once up, the instance will keep running either until you stop it—when your workload completes, for example—or until the instance's per‐hour rate rises above your maximum bid. You'll learn more about the spot market and reserve instances in Chapter 13, “The Cost Optimization Pillar.”
It will often make sense to combine multiple models within a single application infrastructure. An online store might, for instance, purchase one or two reserve instances to cover its normal customer demand but also allow autoscaling to automatically launch on‐demand instances during periods of unusually high demand.
Use Exercise 2.3 to dive deeper into EC2 pricing.
The state of a running EC2 instance can be managed in a number of ways. Terminating the instance will shut it down and cause its resources to be reallocated to the general AWS pool.
If your instance won't be needed for some time but you don't want to terminate it, you can save money by simply stopping it and then restarting it when it's needed again. The data on an EBS volume will in this case not be lost, although that would not be true for an instance volume.
Later in this chapter, you'll learn about both EBS and instance store volumes and the ways they work with EC2 instances.
You should be aware that a stopped instance that had been using a nonpersistent public IP address will most likely be assigned a different address when it's restarted. If you need a predictable IP address that can survive restarts, allocate an elastic IP address and associate it with your instance.
You can edit or change an instance's security group (which we'll discuss a bit later in this chapter) to update access policies at any time—even while an instance is running. You can also change its instance type to increase or decrease its compute, memory, and storage capacity (just try doing that on a physical server). You will need to stop the instance, change the type, and then restart it.
The more resources you deploy on your AWS account, the harder it can be to properly keep track of things. Having constantly changing numbers of EC2 instances—along with accompanying storage volumes, security groups, and elastic IP addresses—all spread across two or three VPCs can get complicated.
The best way to keep a lid on the chaos is to find a way to quickly identify each resource you've got running by its purpose and its relationships to other resources. The best way to do that is by establishing a consistent naming convention and applying it to tags.
AWS resource tags can be used to label everything you'll ever touch across your AWS account—they're certainly not restricted to just EC2. Tags have a key and, optionally, an associated value. So, for example, you could assign a tag with the key production‐server
to each element of a production deployment. Server instances could, in addition, have a value of server1
, server2
, and so on. A related security group could have the same production‐server
key but security‐group1
for its value. Table 2.3 illustrates how that convention might play out over a larger deployment group.
TABLE 2.3 A sample key/value tagging convention
Key | Value |
---|---|
production‐server |
server1 |
production‐server |
server2 |
production‐server |
security‐group1 |
staging‐server |
server1 |
staging‐server |
server2 |
staging‐server |
security‐group1 |
test‐server |
server1 |
test‐server |
security‐group1 |
Applied properly, tags can improve the visibility of your resources, making it much easier to manage them effectively, audit and control costs and billing trends, and avoid costly errors.
By default, each AWS account has limits to the number of instances of a particular service you're able to launch. Sometimes those limits apply to a single region within an account, and others are global. As examples, you're allowed only five VPCs per region and 5,000 Secure Shell (SSH) key pairs across your account. If necessary, you can ask AWS to raise your ceiling for a particular service.
You can find up‐to‐date details regarding the limits of all AWS services at docs.aws.amazon.com/general/latest/gr/aws_service:limits.html
.
Storage drives (or volumes as they're described in AWS documentation) are for the most part virtualized spaces carved out of larger physical drives. To the OS running on your instance, though, all AWS volumes will present themselves exactly as though they were normal physical drives. But there's actually more than one kind of AWS volume, and it's important to understand how each type works.
You can attach as many Elastic Block Store (EBS) volumes to your instance as you like (although one volume can be attached to no more than a single instance at a time) and use them just as you would hard drives, flash drives, or USB drives with your physical server. And as with physical drives, the type of EBS volume you choose will have an impact on both performance and cost.
The AWS SLA guarantees the reliability of the data you store on its EBS volumes (promising at least 99.99 percent availability), so you don't have to worry about failure. When an EBS drive does fail, its data has already been duplicated and will probably be brought back online before anyone notices a problem. So, practically, the only thing that should concern you is how quickly and efficiently you can access your data.
There are currently four EBS volume types, two using SSD technologies and two using the older spinning hard drives. The performance of each volume type is measured in maximum IOPS/volume (where IOPS means input/output operations per second).
If your applications will require intense rates of I/O operations, then you should consider provisioned IOPS, which provides a maximum IOPS/volume of 64,000 and a maximum throughput/volume of 1,000 MB/s. Provisioned IOPS—which in some contexts is referred to as EBS Optimized—can cost $0.125/GB/month in addition to $0.065/provisioned IOPS.
For most regular server workloads that, ideally, deliver low‐latency performance, general‐purpose SSDs will work well. You'll get a maximum of 16,000 IOPS/volume, and it will cost you $0.10/GB/month. For reference, a general‐purpose SSD used as a typical 8 GB boot drive for a Linux instance would, at current rates, cost you $9.60/year.
Throughput‐optimized HDD volumes can provide reduced costs with acceptable performance where you're looking for throughput‐intensive workloads, including log processing and big data operations. These volumes can deliver only 500 IOPS/volume but with a 500 MB/s maximum throughput/volume, and they'll cost you only $0.045/GB/month.
When you're working with larger volumes of data that require only infrequent access, a 250 IOPS/volume type might meet your needs for only $0.025/GB/month.
Table 2.4 lets you compare the basic specifications and estimated costs of those types.
TABLE 2.4 Sample costs for each of the four EBS storage volume types
EBS‐provisioned IOPS SSD | EBS general‐purpose SSD | Throughput‐optimized HDD | Cold HDD | |
---|---|---|---|---|
Volume size | 4 GB–16 TB | 1 GB–16 TB | 500 GB–16 TB | 500 GB–16 TB |
Max IOPS/volume | 64,000 | 16,000 | 500 | 250 |
Max throughput/volume (MB/s) | 1,000 | 250 | 500 | 250 |
Price (/month) | $0.125/GB + $0.065/prov IOPS | $0.10/GB | $0.045/GB | $0.025/GB |
All EBS volumes can be copied by creating a snapshot. Existing snapshots can be used to generate other volumes that can be shared and/or attached to other instances or converted to images from which AMIs can be made. You can also generate an AMI image directly from a running instance‐attached EBS volume—although, to be sure no data is lost, you should shut down the instance first.
EBS volumes can be encrypted to protect their data while at rest or as it's sent back and forth to the EC2 host instance. EBS can manage the encryption keys automatically behind the scenes or use keys that you provide through the AWS Key Management Service (KMS).
Exercise 2.4 will walk you through launching a new instance based on an existing snapshot image.
Unlike EBS volumes, instance store volumes are ephemeral. This means that when the instances they're attached to are shut down, their data is permanently lost. So, why would you want to keep your data on an instance store volume more than on EBS?
Whether one or more instance store volumes are available for your instance will depend on the instance type you choose. This is an important consideration to take into account when planning your deployment.
Even with all the benefits of EBS and instance storage, it's worth noting that there will be cases where you're much better off keeping large data sets outside of EC2 altogether. For many use cases, Amazon's S3 service can be a dramatically less expensive way to store files or even databases that are nevertheless instantly available for compute operations.
You'll learn more about this in Chapter 3, “AWS Storage.”
The bottom line is that EBS volumes are likely to be the right choice for instances whose data needs to persist beyond a reboot and for working with custom or off‐the‐shelf AMIs. Instance store volumes are, where available, useful for operations requiring low‐latency access to large amounts of data that needn't survive a system failure or reboot. And non‐EC2 storage can work well when you don't need fantastic read/write speeds, but you wish to enjoy the flexibility and cost savings of S3.
Like all networked devices, EC2 instances are identified by unique IP addresses. All instances are assigned at least one private IPv4 address that, by default, will fall within one of the blocks shown in Table 2.5.
TABLE 2.5 The three IP address ranges used by private networks
From | To |
---|---|
10.0.0.0 | 10.255.255.255 |
172.16.0.0 | 172.31.255.255 |
192.168.0.0 | 192.168.255.255 |
Out of the box, you'll only be able to connect to your instance from within its subnet, and the instance will have no direct connection to the Internet.
If your instance configuration calls for multiple network interfaces (to connect to otherwise unreachable resources), you can create and then attach one or more virtual elastic network interfaces to your instance. Each of these interfaces must be connected to an existing subnet and security group. You can optionally assign a static IP address within the subnet range.
Of course, an instance can also be assigned a public IP through which full Internet access is possible. As noted in the instance lifecycle discussion, the default public IP assigned to your instance is ephemeral and probably won't survive a reboot. Therefore, you'll usually want to allocate a permanent elastic IP for long‐term deployments. As long as it's attached to a running instance, there's no charge for elastic IPs.
I'll talk about accessing an instance as an administrator a bit later within the context of security. But there's a lot you can learn about a running EC2 instance—including the IP addresses you'll need to connect—through the instance metadata system. Running the following curl
command from the command line while logged into the instance will return a list of the kinds of data that are available:
$ curl http://169.254.169.254/latest/meta‐data/
ami-id
ami-launch-index
ami-manifest-path
block-device-mapping/
hostname
instance-action
instance-id
instance-type
local-hostname
local-ipv4
mac
metrics/
network/
placement/
profile
public-hostname
public-ipv4
public-keys/
reservation-id
security-groups
Entries ending with a trailing slash (/
) contain further sublevels of information that can also be displayed by curl
. Adding a data type to that curl
command will then return the information you're after. This example displays the name of the security groups used by the instance:
$ curl http://169.254.169.254/latest/meta-data/security-groups launch-wizard-1
You are responsible for configuring appropriate and effective access controls to protect your EC2 instances from unauthorized use. Broadly speaking, AWS provides four tools to help you with this task: security groups, Identity and Access Management (IAM) roles, network address translation (NAT) instances, and key pairs.
An EC2 security group plays the role of a firewall. By default, a security group will deny all incoming traffic while permitting all outgoing traffic. You define group behavior by setting policy rules that will either block or allow specified traffic types. From that point on, any data packet coming into or leaving the perimeter will be measured against those rules and processed accordingly.
Traffic is assessed by examining its source and destination, the network port it's targeting, and the protocol it's set to use. A TCP packet sent to the SSH port 22 could, for example, only be allowed access to a particular instance if its source IP address matches the local public IP used by computers in your office. This lets you open up SSH access on your instance without having to worry about anyone from outside your company getting in.
Using security groups, you can easily create sophisticated rule sets to finely manage access to your services. You could, say, open up a website to the whole world while blocking access to your backend servers for everyone besides members of your team.
If necessary, you can update your security group rules and/or apply them to multiple instances.
You can also control access to AWS resources—including EC2 instances—through the use of IAM roles. You define an IAM role by giving it permissions to perform actions on specified services or resources within your AWS account. When a particular role is assigned to a user or resource, they'll gain access to whichever resources were included in the role policies.
Using roles, you can give a limited number of entities (other resources or users) exclusive access to resources like your EC2 instances. But you can also assign an IAM role to an EC2 instance so that processes running within it can access the external tools—like an RDS database instance—it needs to do its work.
You'll learn more about IAM in Chapter 6, “Authentication and Authorization—AWS Identity and Access Management.”
Sometimes you'll need to configure an EC2 instance without a public IP address to limit its exposure to the network. Naturally, that means it won't have any Internet connectivity. But that can present a problem because you'll probably still need to give it Internet access so that it can receive security patches and software updates.
One solution is to use network address translation (NAT) to give your private instance access to the Internet without allowing access to it from the Internet. AWS gives you two ways to do that: a NAT instance and a NAT gateway (see Figure 2.2). They'll both do the job, but since a NAT gateway is a managed service, it doesn't require that you manually launch and maintain an instance. Both approaches will incur monthly charges.
NAT will be discussed at greater length in Chapter 4.
As any professional administrator will know, remote login sessions on your running instances should never be initiated over unencrypted plain‐text connections. To ensure properly secured sessions, you'll need to generate a key pair, save the public key to your EC2 server, and save its private half to your local machine. If you're working with a Windows AMI, you'll use the private key file to retrieve the password you'll need to authenticate into your instance. For a Linux AMI, the private key will allow you to open an SSH session.
Each key pair that AWS generates for you will remain installed within its original region and available for use with newly launched instances until you delete it. You should delete the AWS copy in the event your public key is lost or exposed. Just be careful before you mess with your keys—your access to an instance might depend on it.
The EC2 Auto Scaling service offers a way to both avoid application failure and recover from it when it happens. Auto Scaling works by provisioning and starting on your behalf a specified number of EC2 instances. It can dynamically add more instances to keep up with increased demand. And when an instance fails or gets terminated, Auto Scaling will automatically replace it.
EC2 Auto Scaling uses either a launch configuration or a launch template to automatically configure the instances that it launches. Both perform the same basic function of defining the basic configuration parameters of the instance as well as what scripts (if any) run on it at launch time. Launch configurations have been around longer and are more familiar to you if you've been using AWS for a while. You're also more likely to encounter them if you're going into an existing AWS environment. Launch templates are newer and are what AWS now recommends. You'll learn about both, but which you use is up to you.
When you create an instance manually, you have to specify many configuration parameters, including an AMI, instance type, SSH key pair, security group, instance profile, block device mapping, whether it's EBS optimized, placement tenancy, and user data, such as custom scripts to install and configure your application. A launch configuration is essentially a named document that contains the same information you'd provide when manually provisioning an instance.
You can create a launch configuration from an existing EC2 instance. Auto Scaling will copy the settings from the instance for you, but you can customize them as needed. You can also create a launch configuration from scratch.
Launch configurations are for use only with EC2 Auto Scaling, meaning you can't manually launch an instance using a launch configuration. Also, once you create a launch configuration, you can't modify it. If you want to change any of the settings, you have to create an entirely new launch configuration.
Launch templates are similar to launch configurations in that you can specify the same settings. But the uses for launch templates are more versatile. You can use a launch template with Auto Scaling, of course, but you can also use it for spinning up one‐off EC2 instances or even creating a spot fleet.
Launch templates are also versioned, allowing you to change them after creation. Any time you need to make changes to a launch template, you create a new version of it. AWS keeps all versions, and you can then flip back and forth between versions as needed. This makes it easier to track your launch template changes over time. Complete Exercise 2.5 to create your own launch template.
An Auto Scaling group is a group of EC2 instances that Auto Scaling manages. When creating an Auto Scaling group, you must first specify either the launch configuration or launch template you created. When you create an Auto Scaling group, you must specify how many running instances you want Auto Scaling to provision and maintain using the launch configuration or template you created. You must specify the minimum and maximum size of the Auto Scaling group. You may also optionally set the desired number of instances you want Auto Scaling to provision and maintain.
If you want to use an application load balancer (ALB) to distribute traffic to instances in your Auto Scaling group, just plug in the name of the ALB target group when creating the Auto Scaling group. Whenever Auto Scaling creates a new instance, it will automatically add it to the ALB target group.
When you create an Auto Scaling group, Auto Scaling will strive to maintain the minimum number of instances, or the desired number if you've specified it. If an instance becomes unhealthy, Auto Scaling will terminate and replace it.
By default, Auto Scaling determines an instance's health based on EC2 health checks. Chapter 7, “CloudTrail, CloudWatch, and AWS Config,” covers how EC2 automatically performs system and instance status checks. These checks monitor for instance problems such as memory exhaustion, filesystem corruption, or an incorrect network or startup configuration, as well as for system problems that require AWS involvement to repair. Although these checks can catch a variety of instance and host‐related problems, they won't necessarily catch application‐specific problems.
If you're using an application load balancer to route traffic to your instances, you can configure health checks for the load balancer's target group. Target group health checks can check for HTTP response codes from 200 to 499. You can then configure your Auto Scaling group to use the results of these health checks to determine whether an instance is healthy.
If an instance fails the ALB health check, it will route traffic away from the failed instance, ensuring that users don't reach it. At the same time, Auto Scaling will remove the instance, create a replacement, and add the new instance to the load balancer's target group. The load balancer will then route traffic to the new instance.
Once you create an Auto Scaling group, you can leave it be and it will continue to maintain the minimum or desired number of instances indefinitely. However, maintaining the current number of instances is just one option. Auto Scaling provides several other options to scale out the number of instances to meet demand.
If you change the minimum, desired, or maximum values at any time after creating the group, Auto Scaling will immediately adjust. For example, if you have the desired capacity value set to 2 and change it to 4, Auto Scaling will launch two more instances. If you have four instances and set the desired capacity value to 2, Auto Scaling will terminate two instances. Think of the desired capacity as a thermostat.
Most AWS‐managed resources are elastic—that is, they automatically scale to accommodate increased load. Some examples include S3, load balancers, Internet gateways, and NAT gateways. Regardless of how much traffic you throw at them, AWS is responsible for ensuring that they remain available while continuing to perform well. But when it comes to your EC2 instances, you're responsible for ensuring that they're powerful and plentiful enough to meet demand.
Running out of instance resources—be it CPU utilization, memory, or disk space—will almost always result in the failure of whatever you're running on it. To ensure that your instances never become overburdened, dynamic scaling policies automatically provision more instances before they hit that point. Auto Scaling generates the following aggregate metrics for all instances within the group:
You're not limited to using just these native metrics. You can also use metric filters to extract metrics from CloudWatch logs and use those. As an example, your application may generate logs that indicate how long it takes to complete a process. If the process takes too long, you could have Auto Scaling spin up new instances.
Dynamic scaling policies work by monitoring a CloudWatch alarm and scaling out—by increasing the desired capacity—when the alarm is breaching. You can choose from three dynamic scaling policies: simple, step, and target tracking.
With a simple scaling policy, whenever the metric rises above the threshold, Auto Scaling simply increases the desired capacity. How much it increases the desired capacity, however, depends on which of the following adjustment types you choose:
For example, suppose you have four instances and create a simple scaling policy that specifies a PercentChangeInCapacity adjustment of 50 percent. When the monitored alarm triggers, Auto Scaling will increment the desired capacity by 2, which will in turn add two instances to the Auto Scaling group, for a total of six.
After Auto Scaling completes the adjustment, it waits a cooldown period before executing the policy again, even if the alarm is still breaching. The default cooldown period is 300 seconds, but you can set it as high as you want or as low as 0—effectively disabling it. Note that if an instance is unhealthy, Auto Scaling will not wait for the cooldown period before replacing the unhealthy instance.
Referring to the preceding example, suppose that after the scaling adjustment completes and the cooldown period expires, the monitored alarm drops below the threshold. At this point, the desired capacity value is 6. If the alarm triggers again, the simple scaling action will execute again and add three more instances. Keep in mind that Auto Scaling will never increase the desired capacity beyond the group's maximum setting.
If the demand on your application is rapidly increasing, a simple scaling policy may not add enough instances quickly enough. Using a step scaling policy, you can instead add instances based on how much the aggregate metric exceeds the threshold.
To illustrate, suppose your group starts out with four instances. You want to add more instances to the group as the average CPU utilization of the group increases. When the utilization hits 50 percent, you want to add two more instances. When it goes above 60 percent, you want to add four more instances.
You'd first create a CloudWatch Alarm to monitor the average CPU utilization and set the alarm threshold to 50 percent, since this is the utilization level at which you want to start increasing the desired capacity.
You must then specify at least one step adjustment. Each step adjustment consists of the following:
The upper and lower bounds define a range that the metric has to fall within for the step adjustment to execute. Suppose that for the first step you set a lower bound of 50 and an upper bound of 60, with a ChangeInCapacity adjustment of 2. When the alarm triggers, Auto Scaling will consider the metric value of the group's average CPU utilization. Suppose it's 55 percent. Because 55 is between 50 and 60, Auto Scaling will execute the action specified in this step, which is to add two instances to the desired capacity.
Suppose now that you create another step with a lower bound of 60 and an upper bound of infinity. You also set a ChangeInCapacity adjustment of 4. If the average CPU utilization increases to 62 percent, Auto Scaling will note that 60 <= 62 < infinity and will execute the action for this step, adding four instances to the desired capacity.
You might be wondering what would happen if the utilization were 60 percent. Step ranges can't overlap. A metric of 60 percent would fall within the lower bound of the second step.
With a step scaling policy, you can optionally specify a warm‐up time, which is how long Auto Scaling will wait until considering the metrics of newly added instances. The default warm‐up time is 300 seconds. Note that there are no cooldown periods in step scaling policies.
If step scaling policies are too involved for your taste, you can instead use target tracking policies. All you do is select a metric and target value, and Auto Scaling will create a CloudWatch Alarm and a scaling policy to adjust the number of instances to keep the metric near that target.
The metric you choose must change proportionally to the instance load. Metrics like this include average CPU utilization for the group and request count per target. Aggregate metrics like the total request count for the ALB don't change proportionally to the load on an individual instance and aren't appropriate for use in a target tracking policy.
In addition to scaling out, target tracking will scale in by deleting instances to maintain the target metric value. If you don't want this behavior, you can disable scaling in. Also, just as with a step scaling policy, you can optionally specify a warm‐up time.
Scheduled actions are useful if you have a predictable load pattern and want to adjust your capacity proactively, ensuring you have enough instances before demand hits.
When you create a scheduled action, you must specify the following:
You may optionally set the policy to recur at regular intervals, which is useful if you have a repeating load pattern. You can also set an end time, after which the scheduled policy gets deleted.
To illustrate how you might use a scheduled action, suppose you normally run only two instances in your Auto Scaling group during the week. But on Friday, things get busy, and you know you'll need four instances to keep up. You'd start by creating a scheduled action that sets the desired capacity to 2 and recurs every Saturday, as shown in Figure 2.3.
The start date is January 5, 2019, which is a Saturday. To handle the expected Friday spike, you'd create another weekly recurring policy to set the desired capacity to 4, as shown in Figure 2.4.
This action will run every Friday, setting the desired capacity to 4, prior to the anticipated increased load.
Note that you can combine scheduled actions with dynamic scaling policies. For example, if you're running an e‐commerce site, you may use a scheduled action to increase the maximum group size during busy shopping seasons and then rely on dynamic scaling policies to increase the desired capacity as needed.
AWS Systems Manager, formerly known as EC2 Systems Manager and Simple Systems Manager (SSM), lets you automatically or manually perform actions against your AWS resources and on‐premises servers.
From an operational perspective, Systems Manager can handle many of the maintenance tasks that often require manual intervention or writing scripts. For on‐premises and EC2 instances, these tasks include upgrading installed packages, taking an inventory of installed software, and installing a new application. For your other AWS resources, such tasks may include creating an AMI golden image from an EBS snapshot, attaching IAM instance profiles, or disabling public read access to S3 buckets.
Systems Manager provides the following two capabilities:
Actions let you automatically or manually perform actions against your AWS resources, either individually or in bulk. These actions must be defined in documents, which are divided into three types:
Automation enables you to perform actions against your AWS resources in bulk. For example, you can restart multiple EC2 instances, update CloudFormation stacks, and patch AMIs.
Automation provides granular control over how it carries out its individual actions. It can perform the entire automation task in one fell swoop, or it can perform one step at a time, enabling you to control precisely what happens and when. Automation also offers rate control so that you can specify as a number or a percentage how many resources to target at once.
While automation lets you automate tasks against your AWS resources, Run commands let you execute tasks on your managed instances that would otherwise require logging in or using a third‐party tool to execute a custom script.
Systems Manager accomplishes this via an agent installed on your EC2 and on‐premises managed instances. The Systems Manager agent is installed by default on more recent Windows Server, Amazon Linux, and Ubuntu Server AMIs. You can manually install the agent on other AMIs and on‐premises servers.
By default, Systems Manager doesn't have permissions to do anything on your instances. You first need to apply an instance profile role that contains the permissions in the AmazonEC2RoleforSSM policy.
AWS offers a variety of preconfigured command documents for Linux and Windows instances; for example, the AWS‐InstallApplication document installs software on Windows, and the AWS‐RunShellScript document allows you to execute arbitrary shell scripts against Linux instances. Other documents include tasks such as restarting a Windows service or installing the CodeDeploy agent.
You can target instances by tag or select them individually. As with automation, you may use rate limiting to control how many instances you target at once.
Session Manager lets you achieve interactive Bash and PowerShell access to your Linux and Windows instances, respectively, without having to open up inbound ports on a security group or network ACL or even having your instances in a public subnet. You don't need to set up a protective bastion host or worry about SSH keys. All Linux versions and Windows Server 2008 R2 through 2016 are supported.
You open a session using the web console or AWS CLI. You must first install the Session Manager plug‐in on your local machine to use the AWS CLI to start a session. The Session Manager SDK has libraries for developers to create custom applications that connect to instances. This is useful if you want to integrate an existing configuration management system with your instances without opening ports in a security group or NACL.
Connections made via Session Manager are secured using TLS 1.2. Session Manager can keep a log of all logins in CloudTrail and store a record of commands run within a session in an S3 bucket.
Patch Manager helps you automate the patching of your Linux and Windows instances. It will work for supporting versions of the following operating systems:
You can individually choose instances to patch, patch according to tags, or create a patch group. A patch group is a collection of instances with the tag key Patch Group. For example, if you wanted to include some instances in the Webservers patch group, you'd assign tags to each instance with the tag key of Patch Group and the tag value of Webservers. Keep in mind that the tag key is case‐sensitive.
Patch Manager uses patch baselines to define which available patches to install, as well as whether the patches will be installed automatically or require approval.
AWS offers default baselines that differ according to operating system but include patches that are classified as security related, critical, important, or required. The patch baselines for all operating systems except Ubuntu automatically approve these patches after seven days. This is called an auto‐approval delay.
For more control over which patches get installed, you can create your own custom baselines. Each custom baseline contains one or more approval rules that define the operating system, the classification and severity level of patches to install, and an auto‐approval delay.
You can also specify approved patches in a custom baseline configuration. For Windows baselines, you can specify knowledgebase and security bulletin IDs. For Linux baselines, you can specify Common Vulnerabilities and Exposures (CVE) IDs or full package names. If a patch is approved, it will be installed during a maintenance window that you specify. Alternatively, you can forego a maintenance window and patch your instances immediately. Patch Manager executes the AWS‐RunPatchBaseline document to perform patching.
While Patch Manager can help ensure your instances are all at the same patch level, State Manager is a configuration management tool that ensures your instances have the software you want them to have and are configured in the way you define. More generally, State Manager can automatically run command and policy documents against your instances, either one time only or on a schedule. For example, you may want to install antivirus software on your instances and then take a software inventory.
To use State Manager, you must create an association that defines the command document to run, any parameters you want to pass to it, the target instances, and the schedule. Once you create an association, State Manager will immediately execute it against the target instances that are online. Thereafter, it will follow the schedule.
There is currently only one policy document you can use with State Manager: AWS‐GatherSoftwareInventory. This document defines what specific metadata to collect from your instances. Despite the name, in addition to collecting software inventory, you can have it collect network configurations, file information, CPU information, and for Windows, registry values.
Insights aggregate health, compliance, and operational details about your AWS resources into a single area of AWS Systems Manager. Some insights are categorized according to AWS resource groups, which are collections of resources in an AWS region. You define a resource group based on one or more tag keys and optionally tag values. For example, you can apply the same tag key to all resources related to a particular application—EC2 instances, S3 buckets, EBS volumes, security groups, and so on. Insight categories are covered next.
Built‐in insights are monitoring views that Systems Manager makes available to you by default. Built‐in insights include the following:
Business and Enterprise support customers get access to all Trusted Advisor checks. All AWS customers get the following security checks for free:
The Inventory Manager collects data from your instances, including operating system and application versions. Inventory Manager can collect data for the following:
You choose which instances to collect data from by creating a regionwide inventory association by executing the AWS‐GatherSoftwareInventory policy document. You can choose all instances in your account or select instances manually or by tag. When you choose all instances in your account, it's called a global inventory association, and new instances you create in the region are automatically added to it. Inventory collection occurs at least every 30 minutes.
When you configure the Systems Manager agent on an on‐premises server, you specify a region for inventory purposes. To aggregate metadata for instances from different regions and accounts, you may configure Resource Data Sync in each region to store all inventory data in a single S3 bucket.
Compliance insights show how the patch and association status of your instances stacks up against the rules you've configured. Patch compliance shows the number of instances that have the patches in their configured baseline, as well as details of the specific patches installed. Association compliance shows the number of instances that have had an association successfully executed against them.
The following example code shows how you can use an AWS CLI command to deploy an EC2 instance that includes many of the features you learned about in this chapter. Naturally, the image‐id
, security‐group‐ids
, and subnet‐id
values are not real. Those you would replace with actual IDs that fit your account and region.
aws ec2 run-instances --image-id ami-xxxxxxxx --count 1
--instance-type t2.micro --key-name MyKeyPair
--security-group-ids sg-xxxxxxxx --subnet-id subnet-xxxxxxxx
--user-data file://my_script.sh
--tag-specifications
'ResourceType=instance,Tags=[{Key=webserver,Value=production}]'
'ResourceType=volume,Tags=[{Key=cost-center,Value=cc123}]'
This example launches a single (‐‐count 1
) instance that's based on the specified AMI. The desired instance type, key name, security group, and subnet are all identified. A script file (that must exist locally so it can be read) is added using the user‐data
argument, and two tags are associated with the instance (webserver:production
and cost‐center:cc123
).
If you need to install the AWS CLI, perform Exercise 2.6.
Never leave any resources running after you've finished using them. Exercise 2.7 can help.
The base software stack that runs on an EC2 instance is defined by your choice of Amazon Machine Image and any scripts or user data you add at launch time, and the hardware profile is the product of an instance type. A tenancy setting determines whether your instance will share a physical host with other instances.
As with all your AWS resources, it's important to give your EC2 instances easily identifiable tags that conform to a systemwide naming convention. There are limits to the number of resources you'll be allowed to launch within a single region and account wide. Should you hit your limit, you can always request access to additional resources.
If you plan to run an instance for a year or longer, you can save a significant amount of money compared to paying for on‐demand by purchasing a reserve instance. If your workload can withstand unexpected shutdowns, then a spot instance could also make sense.
There are four kinds of Elastic Block Store volumes: two high IOPS and low‐latency SSD types and two traditional hard drives. Your workload and budget will inform your choice. In addition, some EC2 instance types come with ephemeral instance store volumes that offer fast data access but whose data is lost when the instance is shut down.
All EC2 instances are given at least one private IP address, and should they require Internet access, they can also be given a nonpermanent public IP. If you require a permanent public IP, you can assign an elastic IP to the instance.
You secure access to your EC2 instances using software firewalls known as security groups and can open up secure and limited access through IAM roles, NAT instances or NAT gateways, and key pairs.
EC2 Auto Scaling can help you avoid application failures caused by overloaded instances. By implementing dynamic scaling policies, you can ensure that you always have enough instances to handle increased demand. In the event of some failure, a well‐designed Auto Scaling group will ensure that you always have a minimum number of healthy instances. When an instance becomes unhealthy, Auto Scaling will terminate and replace it.
InvalidAMIID.NotFound
. What of the following is the most likely cause?
~/.aws/config
file.servers:server1
project1:server1
EC2:project1:server1
server1:project1
18.189.180.76