© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
M. Zadka, DevOps in Python, https://doi.org/10.1007/978-1-4842-7996-0_13

13. Amazon Web Services

Moshe Zadka
(1) Belmont, CA, USA

Amazon Web Services (AWS) is a popular cloud platform. It provides on-demand access to computation and storage resources in Amazon's data centers, along with many other services.

Amazon provides a free tier for some services. In general, services are pay-by-usage.

Note

When experimenting with AWS, note that you are charged for any services you consume. Monitor your billed usage carefully, and turn off any services you do not use to avoid paying more than you intend.

This chapter is not, and cannot be, a comprehensive introduction to all AWS services. It explains some specific ways to interact with some of those services using Python, along with general techniques for working with the AWS APIs.

One of the central principles of AWS is that all interactions with it should be possible via an API. The web console, where computing resources can be manipulated, is just another front-end to the API. This allows automating the infrastructure configuration—infrastructure as code, where the computing infrastructure is reserved and manipulated programmatically.

The Amazon Web Services team supports a package on PyPI, boto3, to automate AWS operations. This is one of the best ways to interact with AWS.

Note

There are different ways to configure boto3. Setting the AWS_DEFAULT_REGION environment variable to an AWS region, such as us-west-2, means that passing an explicit region name to boto3 setup functions is optional. Some of the examples in this chapter take advantage of that to focus on other aspects of the library or AWS.
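As a minimal sketch of that convention, the following assumes AWS_DEFAULT_REGION is set before the client is created (exporting it in the shell works equally well); with it in place, no region_name argument is needed.
import os
import boto3
# With AWS_DEFAULT_REGION set, boto3 picks the region up automatically.
os.environ.setdefault("AWS_DEFAULT_REGION", "us-west-2")
ec2 = boto3.client("ec2")  # connects to us-west-2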

While AWS does support a console UI, it is usually best to treat it as a read-only window into AWS services. When changes are made through the console UI, there is no repeatable record of them. While it is possible to log actions, logs do not make the actions easy to reproduce.

Combining boto3 with Jupyter, as discussed in an earlier chapter, makes for a powerful AWS operations console. Actions taken through Jupyter, using the boto3 API, can be repeated, automated, and parameterized as needed.

When making ad hoc changes to the AWS setup to solve a problem, it is possible to attach the notebook to the ticket tracking that problem, giving a clear record of what was done. This helps both in understanding what was done if the change causes unforeseen issues and in repeating the intervention if the same solution is needed again.

As always, notebooks are not an auditing solution; for one, when allowing access via boto3, actions do not have to be performed via a notebook. AWS has internal ways to generate audit logs. The notebooks are there to document intent and allow repeatability.

13.1 Security

For automated operations, AWS requires access keys. Access keys can be configured for the root account, but this is not a good idea: no restrictions can be placed on the root account, so its access keys can do everything.

The AWS platform for roles and permissions is called Identity and Access Management (IAM). The IAM service is responsible for users, roles, and policies.

In general, it is better to have a separate IAM user for each human user and each automated task that needs to be performed. Even if they all share an access policy, having distinct users makes key management easier and produces accurate audit logs of who (or what) did what.

13.1.1 Configuring Access Keys

With the right security policy, users can be in charge of their own access keys. A single access key comprises the access key ID and the access key secret. The ID does not need to be kept secret and remains accessible via the IAM user interface after generation. This allows, for example, disabling or deleting an access key by ID.

A user can configure up to two access keys. Having two keys allows for zero-downtime key rotations. The first step is to generate a new key. Then replace the old key everywhere. Afterward, disable the old key. Disabling the old key makes anything that tries to use it fail. If such a failure is detected, it is easy to re-enable the old key until the task using it can be upgraded to the new key.

After a certain amount of time, when no failures have been observed, it should be safe to delete the old key.
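The following is a minimal sketch of that rotation flow using the IAM API; the user name and the old access key ID are placeholders.
import boto3
iam = boto3.client("iam")
USER = "automation-user"  # hypothetical IAM user
OLD_KEY_ID = "AKIA..."    # placeholder for the old access key ID
# Step 1: generate a new key; the secret is only returned here.
new_key = iam.create_access_key(UserName=USER)["AccessKey"]
print(new_key["AccessKeyId"], new_key["SecretAccessKey"])
# Step 2: once the new key is deployed everywhere, disable the old one.
iam.update_access_key(UserName=USER, AccessKeyId=OLD_KEY_ID, Status="Inactive")
# Step 3: after a quiet period with no failures, delete the old key.
iam.delete_access_key(UserName=USER, AccessKeyId=OLD_KEY_ID)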

Local security policies generally determine how often keys should be rotated, but this should usually be at least a yearly ritual. This should generally follow practices for other API secrets used in the organization.

Note that in AWS, different computation tasks can have their own IAM credentials.

For example, an EC2 machine can be assigned an IAM role. Other higher-level computation tasks can also be assigned a role. For example, an Elastic Container Service (ECS) task, which runs one or more containers, can be assigned an IAM role. Serverless Lambda functions, which run on infrastructure allocated on an as-needed basis, can also be assigned an IAM role.

The boto3 client automatically uses these credentials if running from such a task. This removes the need to explicitly manage credentials and is often a safer alternative.

13.1.2 Creating Short-Term Tokens

AWS's Security Token Service (STS) issues short-term tokens, which can be used for several things. They can convert alternative authentication methods into tokens usable by any boto3-based program, for example, by putting them in an environment variable.

For example, the following code takes the configured default credentials and uses them to get a short-term token for a role.
import boto3
client = boto3.client('sts')
response = client.assume_role(
    RoleArn='arn:aws:iam::123456789012:role/demo',
    RoleSessionName='demo-session',
    DurationSeconds=900,  # 900 seconds is the minimum allowed duration
)
credentials = response['Credentials']
session = boto3.Session(
    aws_access_key_id=credentials['AccessKeyId'],
    aws_secret_access_key=credentials['SecretAccessKey'],
    aws_session_token=credentials['SessionToken'],
)
print(session.client('ec2').describe_instances())

This can be used to improve auditing or to reduce the security impact of lost or stolen credentials.
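For example, the short-term credentials obtained above can be handed to any boto3-based program through the standard environment variables; this sketch reuses the credentials dictionary from the previous example.
import os
# These variable names are the ones boto3 (and the AWS CLI) read.
os.environ["AWS_ACCESS_KEY_ID"] = credentials["AccessKeyId"]
os.environ["AWS_SECRET_ACCESS_KEY"] = credentials["SecretAccessKey"]
os.environ["AWS_SESSION_TOKEN"] = credentials["SessionToken"]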

A more sophisticated example is a web portal that uses assume_role_with_saml.
import boto3
import base64
def credentials_from_saml_assertion(assertion):
    assertion = base64.b64encode(assertion.encode("ascii")).decode("ascii")
    response = boto3.client('sts').assume_role_with_saml(
        RoleArn='arn:aws:iam::123456789012:role/demo',
        # The ARN of the SAML identity provider registered in IAM.
        PrincipalArn='arn:aws:iam::123456789012:saml-provider/demo-idp',
        SAMLAssertion=assertion,
        DurationSeconds=900,  # the minimum allowed duration
    )
    return response['Credentials']

With this logic, the web portal has no special access to the AWS account. After the user logs in, it uses the assertion to create a short-term token for operations.

On an account that has been configured with cross-account access, assume_role can return credentials for a role in the granting account.

Even when using a single account, it is sometimes useful to create a short-term token. For example, this can be used to limit permissions: it is possible to create an STS token with a limited security policy. Using such limited tokens in code that is more exposed to vulnerabilities, such as code handling direct user interactions, limits the attack surface.
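The following sketch shows one way to do this: the inline session policy passed to assume_role restricts the resulting token to read-only S3 operations, and the effective permissions are the intersection of this policy and the role's own policy. The role ARN is the same placeholder used earlier.
import json
import boto3
read_only_s3 = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": "*",
    }],
}
response = boto3.client("sts").assume_role(
    RoleArn="arn:aws:iam::123456789012:role/demo",
    RoleSessionName="limited-session",
    DurationSeconds=900,
    Policy=json.dumps(read_only_s3),
)
credentials = response["Credentials"]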

13.2 Elastic Compute Cloud (EC2)

The Elastic Compute Cloud (EC2) is the most basic way to access AWS compute (CPU and memory) resources. EC2 runs machines of various types. Most are virtual machines (VMs) that run alongside other VMs on physical hosts. The AWS infrastructure takes care of dividing resources fairly between the VMs.

The EC2 service also handles the resources that machines need to work properly: operating system images, attached storage, and networking configuration, among others.

13.2.1 Regions

EC2 machines run in regions. Regions usually have a human-friendly name (such as Oregon) and an identifier used for programs (such as us-west-2).

There are several regions in the United States, including North Virginia (us-east-1), Ohio (us-east-2), North California (us-west-1), and Oregon (us-west-2). There are also several regions in Europe, Asia Pacific, and more.

When you connect to AWS, you connect to the region you need to manipulate; boto3.client("ec2", region_name="us-west-2") returns a client that connects to the Oregon AWS data center.

It is possible to specify default regions in environment variables and configuration files, but it is often best to be explicit in code (or retrieve it from higher-level application configuration data).

EC2 machines also run in an availability zone. While regions are objective (every customer sees the same regions), availability zones are not: one customer’s us-west-2a might be another’s us-west-2c.

Amazon puts all EC2 machines into a virtual private cloud (VPC) network. For simple cases, an account has one VPC per region, and all EC2 machines belonging to that account are in that VPC.

A subnet is how a VPC intersects with an availability zone. All machines in a subnet belong to the same zone. A VPC can have one or more security groups. Security groups can set up various firewall rules about what network connections are allowed.
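The following sketch lists these objects for one region; the region name is an arbitrary choice.
import boto3
ec2 = boto3.client("ec2", region_name="us-west-2")
# Availability zone names are account-specific aliases.
for zone in ec2.describe_availability_zones()["AvailabilityZones"]:
    print(zone["ZoneName"], zone["State"])
# Each subnet belongs to one VPC and one availability zone.
for subnet in ec2.describe_subnets()["Subnets"]:
    print(subnet["SubnetId"], subnet["VpcId"], subnet["AvailabilityZone"])
# Security groups hold the firewall rules.
for group in ec2.describe_security_groups()["SecurityGroups"]:
    print(group["GroupId"], group["GroupName"])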

13.2.2 Amazon Machine Images

To start an EC2 machine, you need an operating system image. While it is possible to build custom Amazon Machine Images (AMIs), you can often use a ready-made one.

There are AMIs for all major Linux distributions. The AMI ID for the right distribution depends on the AWS region in which you want to run the machine. Once you have decided on the region and the distribution version, you need to find the AMI ID.

The ID can sometimes be non-trivial to find. If you have the product code, for example, aw0evgkw8e5c1q413zgy5pjce, you can use describe_images.
client = boto3.client('ec2', region_name='us-west-2')
description = client.describe_images(Filters=[{
    'Name': 'product-code',
    'Values': ['aw0evgkw8e5c1q413zgy5pjce']
}])
print(description)

The CentOS wiki contains product codes for all relevant CentOS versions.

AMI IDs for Debian images can be found on the Debian wiki. The Ubuntu website has a tool to find the AMI IDs for various Ubuntu images based on region and version. Unfortunately, there is no centralized, automated registry. It is possible to search for AMIs with the UI, but this is risky. The best way to guarantee the authenticity of the AMI is to look at the creator's website.
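When the publisher's account ID is known, describe_images can narrow the search programmatically. The following sketch assumes Canonical's publisher account ID (099720109477) and an Ubuntu 20.04 image name pattern; both would need to be adjusted for other publishers or releases.
import boto3
client = boto3.client("ec2", region_name="us-west-2")
description = client.describe_images(
    Owners=["099720109477"],  # assumed: Canonical's publisher account
    Filters=[{
        "Name": "name",
        "Values": ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"],
    }],
)
# Sort by creation date and take the most recently published match.
images = sorted(description["Images"], key=lambda image: image["CreationDate"])
print(images[-1]["ImageId"])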

13.2.3 SSH Keys

For ad hoc administration and troubleshooting, it is useful to be able to SSH into the EC2 machine. This might be for manual SSH, using Paramiko, Ansible, or bootstrapping Salt.

Best practices for building AMIs, which all major distributions follow for their default images, use cloud-init to initialize the machine. One of the things cloud-init does is allow a preconfigured user to log in via the SSH public key retrieved from the instance metadata.

Public SSH keys are stored per region and account. There are two ways to add an SSH key: letting AWS generate a key pair and retrieving the private key, or generating a key pair ourselves and pushing the public key to AWS (a sketch of the second appears after the notes below).

The first way is done as follows.
import os
import boto3

key = boto3.client("ec2").create_key_pair(KeyName="high-security")
fname = os.path.expanduser("~/.ssh/high-security")
with open(fname, "w") as fpout:
    os.chmod(fname, 0o600)
    fpout.write(key["KeyMaterial"])
There are a few things to note about this example.
  • Keys are ASCII-encoded. Using string (rather than byte) functions is safe.

  • It is a good idea to change the file permissions before putting in sensitive data.

  • The ~/.ssh directory tends to have conservative permissions.

  • This code only works once since key names are unique per region. To retry it, delete the keys in the console, or add some unique identifier to the key.
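The second way, generating the key pair locally and pushing only the public part to AWS, is sketched below; the key file path and key name are arbitrary.
import os
import boto3
# Assumes a key pair already generated locally, for example with ssh-keygen.
with open(os.path.expanduser("~/.ssh/id_rsa.pub"), "rb") as fpin:
    public_key = fpin.read()
boto3.client("ec2").import_key_pair(
    KeyName="high-security-imported",
    PublicKeyMaterial=public_key,
)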

13.2.4 Bringing up Machines

The run_instances method on the EC2 client can start new instances. Choose IMAGE_ID based on the image you want to run, found either through the console or with describe_images, as shown earlier.

If you have already created a machine through the console-based wizard, you have a reasonable security group to use for SECURITY_GROUP; creating a good initial security group from scratch is non-trivial. If you are new to AWS, and not just to boto3, creating and immediately destroying an EC2 instance through the console is an easy way to get one.
client = boto3.client("ec2")
client.run_instances(
    ImageId=IMAGE_ID,
    MinCount=1,
    MaxCount=1,
    InstanceType='t2.micro',
    KeyName=ssh_key_name,
    SecurityGroupIds=[SECURITY_GROUP]
)

The API is a little counter-intuitive. In almost all cases, both MinCount and MaxCount need to be 1. For running several identical machines, it is much better to use an Auto Scaling group (ASG), which is beyond the scope of the current chapter. It is worth remembering that, as AWS's first service, EC2 has the oldest API, with the fewest lessons learned about designing good cloud automation APIs.

While the API generally allows running more than one instance, it is rarely used that way. The SecurityGroupIds parameter implies which VPC the machine is in. When running a machine from the AWS console, a fairly liberal security group is created automatically. Using that security group is a useful shortcut for debugging purposes, although it is better to create custom security groups.

The AMI chosen here is a CentOS AMI. While KeyName is not mandatory, it is highly recommended to create a key pair, or import one, and pass its name.

The InstanceType indicates the amount of computation resources allocated to the instance. t2.micro is, as the name implies, a fairly minimal machine. It is useful mainly for prototyping and usually cannot support anything but the most minimal production workloads.

13.2.5 Securely Logging In

When logging in via SSH, it is good to know beforehand what public key to expect; otherwise, an intermediary can hijack the connection. The Trust On First Use (TOFU) approach is especially problematic in cloud environments: every newly created machine is a first use, and since VMs are best treated as disposable, the TOFU principle is of little help.

The main technique for retrieving the key relies on the fact that the key is written to the console as the instance boots up. AWS provides a way to retrieve the console output.
import sys
import boto3

client = boto3.client('ec2')
output = client.get_console_output(InstanceId=sys.argv[1])
result = output['Output']
Unfortunately, boot-time diagnostic messages are not well structured, so the parsing must be somewhat ad hoc.
rsa = next(line
           for line in result.splitlines()
           if line.startswith('ssh-rsa'))

Look for the first line that starts with ssh-rsa. Now that you have the public key, there are several things you can do with it. If you just want to run the SSH command line, and the machine is not accessible only through a VPN, you want to record the host's public name and IP, together with the key, in known_hosts.

This avoids a TOFU situation. boto3 uses certificate authorities to connect securely to AWS, so the SSH key’s integrity is guaranteed. Especially for cloud platforms, TOFU is a poor security model. Since it is so easy to create and destroy machines, the lifetime of machines is sometimes measured in weeks or even days.
resource = boto3.resource('ec2')
instance = resource.Instance(sys.argv[1])
known_hosts = (f'{instance.public_dns_name},'
               f'{instance.public_ip_address} {rsa}')
with open(os.path.expanduser('~/.ssh/known_hosts'), 'a') as fp:
    fp.write(known_hosts)

13.2.6 Building Images

Building your own images can be useful. One reason to do it is to accelerate start-up. Instead of booting up a vanilla Linux distribution and then installing needed packages, setting configuration, and so on, it is possible to do it once, store the AMI, and then launch instances from this AMI.

Another reason to do it is to have known upgrade times; running apt-get update && apt-get upgrade means getting the latest packages at the time of upgrade. Instead, doing this in an AMI build allows knowing all machines are running from the same AMI. Upgrades can be done by first replacing some machines with the new AMI, checking the status, and replacing the rest. This technique (used by Netflix, among others) is called an immutable image. While there are other approaches to immutability, this is one of the first successfully deployed in production.

One way to prepare machines is to use a configuration management system. Both Ansible and Salt have a local mode that runs commands locally instead of via a server/client connection.

The steps are as follows.
  1. Launch an EC2 machine with the right base image (for example, vanilla CentOS).
  2. Retrieve the host key for securely connecting.
  3. Copy over the Salt code.
  4. Copy over the Salt configuration.
  5. Via SSH, run Salt on the EC2 machine.
  6. At the end, call client("ec2").create_image to save the current disk contents as an AMI.
$ pex -o salt-call -c salt-call salt-ssh
$ scp -r salt-call salt-files $USER@$IP:/
$ ssh $USER@$IP /salt-call --local --file-root /salt-files
(botovenv)$ python
...
>>> client.create_image(....)

This approach means a simple script running on a local machine or in a CI environment can generate an AMI from source code.
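A minimal sketch of that last call follows; INSTANCE_ID stands for the machine that was just configured over SSH, and the image name is an arbitrary example that encodes the build date.
import boto3
client = boto3.client("ec2")
image = client.create_image(
    InstanceId=INSTANCE_ID,
    Name="web-base-2022-03-01",
)
print(image["ImageId"])  # the new AMI's ID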

13.3 Simple Storage Service (S3)

The Simple Storage Service (S3) is an object storage service. Objects, which are byte streams, can be stored and retrieved. It can store backups, compressed log files, video files, and similar things.

S3 stores objects in buckets by a key (a string). Objects can be stored, retrieved, or deleted. However, objects cannot be modified in place.

S3 bucket names must be globally unique, not just per account. This uniqueness is often accomplished by incorporating the account holder’s domain name; for example, large-videos.production.example.com.

Buckets can be set to be publicly available, in which case objects can be retrieved by accessing a URL composed of the bucket’s name and the object’s name. This allows S3 buckets, properly configured, to be static websites.
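A minimal sketch of that configuration follows; the bucket name is hypothetical, and the bucket must already allow public reads for the site to be reachable.
import boto3
boto3.client("s3").put_bucket_website(
    Bucket="static-site.example.com",
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)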

13.3.1 Managing Buckets

In general, bucket creation is a fairly rare operation. New buckets correspond to new code flows, not code runs. This is partially because buckets need to have unique names. However, it is sometimes useful to create buckets automatically, perhaps for many parallel test environments.
response = client("s3").create_bucket(
    ACL='private',
    Bucket='my.unique.name.example.com',
)

There are other options, but those are usually not needed. Some of them have to do with granting permissions on the bucket. In general, a better way to manage bucket permissions is the same way all permissions are managed: by attaching policies to roles or IAM users.
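The following sketch grants an IAM user access to one bucket through an inline identity policy; the user name, policy name, and bucket are placeholders.
import json
import boto3
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject"],
        "Resource": "arn:aws:s3:::my.unique.name.example.com/*",
    }],
}
boto3.client("iam").put_user_policy(
    UserName="video-uploader",      # hypothetical IAM user
    PolicyName="allow-video-bucket",
    PolicyDocument=json.dumps(policy),
)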

In order to list possible keys, you can use the following.
response = client("s3").list_objects(
    Bucket=bucket,
    MaxKeys=10,
    Marker=marker,
    Prefix=prefix,
)

The first two arguments are important. It is necessary to specify the bucket, and it is a good idea to make sure that responses are of known maximum size.

The Prefix parameter is useful, especially when using the S3 bucket to simulate a file system. For example, this is what S3 buckets that serve as websites usually look like. When exporting CloudWatch logs to S3, it is possible to specify a prefix exactly to simulate a file system. While internally, the bucket is still flat, you can use something like Prefix="2018/12/04/" to get only the logs from December 4, 2018.

When more objects qualify than MaxKeys, the response is truncated: the IsTruncated field in the response is True, and the NextMarker field is set (when NextMarker is absent, the key of the last returned object serves the same purpose). Sending another list_objects call with Marker set to that value retrieves the next MaxKeys objects. This allows paginating through consistent responses even in the face of mutating buckets, in the limited sense that you get at least all the objects that were not mutated while paginating.
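A sketch of a full pagination loop follows; the bucket name and prefix are placeholders, and the fallback to the last returned key covers responses where NextMarker is absent.
import boto3
client = boto3.client("s3")
bucket = "my.unique.name.example.com"  # placeholder
prefix = "2018/12/04/"                 # placeholder
marker = ""
while True:
    response = client.list_objects(
        Bucket=bucket,
        MaxKeys=10,
        Marker=marker,
        Prefix=prefix,
    )
    for entry in response.get("Contents", []):
        print(entry["Key"])
    if not response.get("IsTruncated"):
        break
    marker = response.get("NextMarker", response["Contents"][-1]["Key"])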

To retrieve a single object, you use get_object.
response = boto3.client("s3").get_object(
    Bucket=BUCKET,
    Key=OBJECT_KEY,
)
value = response["Body"].read()

The value is a bytes object.

Especially for small to medium-sized objects, say up to several megabytes, this allows simple retrieval of all data.

To push such objects into the bucket, you can use the following.
response = boto3.client("s3").put_object(
    Bucket=BUCKET,
    Key=some_key,
    Body=b'some content',
)

Again, this works well for the case where the body all fits in memory.

As alluded to earlier, when uploading or downloading larger files (for example, videos or database dumps), you would like to be able to upload incrementally without keeping the whole file in memory at once.

The boto3 library exposes a high-level interface to such functionality using the *_fileobj methods. For example, you can transfer a large video file using the following.
client = boto3.client('s3')
with open("meeting-recording.mp4", "rb") as fpin:
    client.upload_fileobj(
        fpin,
        my_bucket,
        "meeting-recording.mp4"
    )
You can also use similar functionality to download a large video file.
client = boto3.client('s3')
with open("meeting-recording.mp4", "wb") as fpout:
    client.download_fileobj(
        my_bucket,
        "meeting-recording.mp4",
        fpout
    )

Finally, it is often the case that you would like objects to be transferred directly out of S3 or into S3 without the data going through the custom code, but you do not want to allow unauthenticated access.

For example, a continuous integration job might upload its artifacts to S3. You would like to be able to download them through the CI web interface, but having the data pass through the CI server is unpleasant: the server would then need to handle potentially large files whose transfer speed users care about.

S3 allows you to generate pre-signed URLs. These URLs can be given as links from another web application or sent via email or other methods and allow time-limited access to the S3 resource.
s3 = boto3.client("s3")
url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': my_bucket,
        'Key': 'meeting-recording.avi'
    }
)

This URL can now be sent via email to people who need to view the recording, and they can download the video and watch it. In this case, you saved yourself any need to run a web server.

An even more interesting use case is allowing pre-signed uploads. This is especially interesting because uploading files sometimes requires subtle interplays between the web server and the web application server to allow large requests to be sent in.

Instead, uploading directly from the client to S3 allows you to remove all the intermediaries. For example, this is useful for some document-sharing applications.
post = boto3.client("s3").generate_presigned_post(
    Bucket=my_bucket,
    Key='meeting-recording.avi',
)
post_url = post["url"]
post_fields = post["fields"]
You can use this URL from code with something like the following.
with open("meeting-recording.avi", "rb"):
    requests.post(post_url,
                  post_fields,
                  files=dict(file=file_contents))

This lets you upload the meeting recording locally, even if the meeting recording device does not have S3 access credentials. It is also possible to limit the maximum size of the files via generate_presigned_post to prevent potential harm from an unknown device uploading these files.

Note that pre-signed URLs can be used multiple times within their validity period. Making a pre-signed URL valid only for a limited time mitigates the risk of the object being mutated after uploading. For example, if the duration is one second, you can simply wait until that second has passed before checking the uploaded object.
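A sketch combining both precautions follows: the POST expires after ten minutes and rejects uploads over roughly 100 MB. The bucket and key are the same placeholders used above.
import boto3
post = boto3.client("s3").generate_presigned_post(
    Bucket=my_bucket,
    Key='meeting-recording.avi',
    Conditions=[["content-length-range", 1, 100 * 1024 * 1024]],
    ExpiresIn=600,  # seconds until the signed POST stops working
)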

13.4 Summary

AWS is a popular infrastructure as a service platform, which is generally used on a pay-as-you-go basis. It is suitable for automation of infrastructure management tasks, and boto3, maintained by AWS, is a powerful way to approach this automation.
