Chapter 9
Amazon S3

WHAT'S IN THIS CHAPTER

  • Introduction to the basic concepts of Amazon S3
  • Learn to create Amazon S3 buckets
  • Learn to upload objects into Amazon S3 buckets
  • Learn to download objects from Amazon S3 buckets
  • Learn to interact with Amazon S3 using the AWS CLI tools

NOTE   Amazon occasionally updates the user interface of the Amazon S3 management console. As a result, some of the screenshots in this chapter may not match what you see in your web browser. However, the general concepts that you will learn in this chapter will still be applicable.

Amazon Simple Storage Service (S3) is a highly reliable web service that allows you to securely store and retrieve object data in the AWS cloud. After Amazon EC2, Amazon S3 is one of the most commonly used services. Data on Amazon S3 is spread across multiple devices and availability zones within a region automatically.

Amazon S3 is an object-based storage service (not block-based). It is ideal for storing files but cannot be used to install an operating system; thus, it cannot provide the storage for an EC2 instance.

Data within Amazon S3 is stored using a key-value system, with each key being unique within its bucket. There is no limit to how much data can be stored on Amazon S3; however, a single object cannot exceed 5 TB in size.

Key Concepts

In this section, you learn some of the key concepts you will encounter when working with Amazon S3.

Bucket

A bucket is a container on Amazon S3 (conceptually similar to a top-level folder) where you can store your files. Bucket names are globally unique; therefore, no two users can own a bucket with the same name. Amazon S3 does not internally implement a hierarchical file system similar to what you encounter on your computer's operating system. The objects in a bucket are stored in a flat namespace. However, object keys can contain the forward slash character (/), whereas bucket names cannot. Therefore, you can name your objects in such a way as to create the appearance of a nested folder structure; for example, the keys photos/2017/sunset.jpg and photos/2017/beach.jpg appear to share a folder.
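
The folder-like appearance of a flat namespace can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name list_level and the sample names are hypothetical, and the code groups names locally rather than calling Amazon S3.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Split flat names into 'files' and 'folders' directly under a prefix."""
    files, folders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue  # the name lives outside the level being listed
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter remains, so the name sits inside a deeper "folder"
            folders.add(prefix + head + delimiter)
        else:
            files.append(key)
    return sorted(files), sorted(folders)


# Hypothetical names stored in one flat namespace
keys = [
    "photos/2017/sunset.jpg",
    "photos/2017/beach.jpg",
    "photos/readme.txt",
    "welcome_letter.txt",
]
print(list_level(keys))
print(list_level(keys, prefix="photos/"))
```

Nothing in the stored names changes between the two calls; only the prefix used to filter and group them does, which is why the hierarchy is an appearance rather than a real directory tree.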

For each bucket you create, you can set up permissions that control who can access the bucket and what they can do with the bucket. Each object you store in an Amazon S3 bucket has an object key and metadata associated with it.

Object Key

An object key is a sequence of UTF-8 characters that identifies an object in an Amazon S3 bucket. The key is assigned to the object when it is first uploaded into an Amazon S3 bucket and can be up to 1024 bytes long.
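
Because the limit is expressed in bytes rather than characters, a key made of non-ASCII characters reaches it sooner. The following minimal sketch checks a candidate key locally; the function name key_length_ok is hypothetical and the check covers only the length rule described above.

```python
MAX_KEY_BYTES = 1024  # limit stated in this section


def key_length_ok(key: str) -> bool:
    """Return True if the key is non-empty and fits in 1024 bytes of UTF-8."""
    encoded = key.encode("utf-8")
    return 0 < len(encoded) <= MAX_KEY_BYTES


print(key_length_ok("sunset.jpg"))
print(key_length_ok("\u00e9" * 513))  # 513 characters, but 1026 bytes
```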

The key name is basically the name of the file you have uploaded to the bucket. Amazon S3 keeps its index of keys in lexicographic order, which means objects with similar names tend to be handled by the same index partition. This can be an important factor to consider if the files you are planning on storing in Amazon S3 are going to have some kind of sequential naming scheme, or share a common prefix with each other. If this is the case, you could encounter performance bottlenecks when reading the data out of Amazon S3; you may want to consider naming the files differently or adding a short random or hash-derived string to the start of the filename. (A plain timestamp at the start of the name is itself sequential, so it does not spread the keys out.) AWS has since announced higher per-prefix request rates, so check the current Amazon S3 performance guidance before restructuring your key names.
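
One way to apply this renaming idea is to derive a short prefix from a hash of the original name. The sketch below is an illustration only: the function name add_hash_prefix, the choice of SHA-256, and the four-character prefix length are assumptions, not an AWS recommendation.

```python
import hashlib


def add_hash_prefix(name: str, length: int = 4) -> str:
    """Prepend a short hex prefix derived from the name itself."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return f"{digest[:length]}-{name}"


# Sequentially named files (hypothetical names)
for name in ["log-0001.txt", "log-0002.txt", "log-0003.txt"]:
    print(add_hash_prefix(name))
```

Unlike a purely random string, a hash-derived prefix can be recomputed from the original filename, so you do not need to keep a separate lookup table to find the object again.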

Object Value

The object value is the data that you are storing. It is a sequence of bytes and can be up to 5 TB in length.

Version ID

The version ID is a string value that identifies the version of the object. Amazon S3 assigns a version ID when you upload an object to a bucket. If object versioning is subsequently enabled, every update to the object creates a new version ID. Together, the bucket, the object key, and the version ID uniquely identify an object.

Storage Class

Each object in Amazon S3 has a storage class associated with it. The storage class determines how Amazon S3 stores the data for the object and if you will be charged additional costs to read the data. Amazon S3 offers the following storage classes:

  • Standard: This is the default storage class for objects when they are uploaded to Amazon S3. It is ideal if your use case requires high reliability, durability, and quick access times. This storage class has been designed for 99.99% availability and 99.999999999% durability. Data is stored redundantly across devices and facilities and can withstand the loss of two facilities simultaneously.
  • Standard - IA: IA stands for Infrequent Access. This storage class is designed for long-lived objects that are accessed less frequently, costs less to use than the Standard storage class, and is designed to provide the same durability as the Standard storage class with slightly lower availability (99.9%). You can access your objects in real time, but each retrieval has an additional charge associated with it.
  • Reduced Redundancy Storage (RRS): This storage class is designed for noncritical objects that can easily be reproduced. The objects cost less to store than the Standard storage class but are stored at lower levels of redundancy. This storage class is designed for 99.99% availability and 99.99% durability.
  • Glacier: Amazon Glacier is an independent, low-cost cloud-based archival solution. This storage class uses Amazon Glacier to store your objects and is suitable for data-archiving tasks. Storage costs are extremely low, but it can take up to 5 hours to read the data.
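
To get a feel for what the durability figures in this list mean, you can convert them into an expected number of objects lost per year. The sketch below is a back-of-the-envelope calculation that assumes the percentages are annual figures and that losses are independent; the function name expected_annual_loss and the object count are hypothetical.

```python
from decimal import Decimal


def expected_annual_loss(object_count: int, durability_percent: str) -> Decimal:
    """Expected number of objects lost per year for a given durability."""
    loss_fraction = (Decimal(100) - Decimal(durability_percent)) / Decimal(100)
    return loss_fraction * object_count


objects = 10_000_000  # hypothetical number of stored objects
print("Standard:", expected_annual_loss(objects, "99.999999999"))
print("RRS:     ", expected_annual_loss(objects, "99.99"))
```

With ten million objects, the Standard figure works out to 0.0001 objects per year (roughly one object every 10,000 years), while the RRS figure works out to about 1,000 objects per year, which is why RRS is only suitable for data you can easily reproduce.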

Costs

Amazon charges you for the following aspects when you use Amazon S3. The specific costs differ between regions.

  • Storage: You are charged for the objects you store in your Amazon S3 buckets.
  • Requests: You are charged for the number of requests being made for objects in your Amazon S3 buckets.
  • Storage Management Pricing: In November 2016, Amazon announced a new feature called Amazon S3 Object Tagging. Amazon S3 allows you to create object-based tags, and these tags can be created, updated, and deleted at any time during the life of the object. These tags can be used to get information on which objects are being accessed more than others. Amazon charges you a small fee per tag. For more information on Amazon S3 Object Tagging, visit https://aws.amazon.com/about-aws/whats-new/2016/11/revolutionizing-s3-storage-management-with-4-new-features/.
  • Data Transfer Pricing: Additional costs are involved if you want to replicate your Amazon S3 buckets across different regions.
  • Transfer Acceleration: Amazon S3 Transfer Acceleration is a feature that allows you to leverage Amazon CloudFront's CDN endpoints to offer your users access to the content of your Amazon S3 buckets. For instance, if your bucket were located in Tokyo, without Transfer Acceleration your users from around the world would have to make requests to Tokyo. With Transfer Acceleration enabled, they would only have to make requests to the nearest CloudFront CDN endpoint, which in many cases would be located much closer to them than the Amazon S3 bucket.

    You can visit the following site to get an idea of the difference in access times with and without Amazon S3 Transfer Acceleration:

http://s3-accelerate-speedtest.s3-accelerate.amazonaws.com/en/accelerate-speed-comparsion.html

To get an updated list of charges, visit https://aws.amazon.com/s3/pricing/.
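
The storage and request charges can be combined into a rough monthly estimate. The sketch below is only an illustration: the function name estimate_monthly_cost is hypothetical, and the rates passed in are placeholders that do not reflect real AWS prices, so substitute the figures for your region from the pricing page.

```python
from decimal import Decimal


def estimate_monthly_cost(storage_gb, requests, price_per_gb, price_per_1000_requests):
    """Rough monthly estimate covering only storage and request charges."""
    storage_cost = Decimal(storage_gb) * Decimal(price_per_gb)
    request_cost = Decimal(requests) / Decimal(1000) * Decimal(price_per_1000_requests)
    return storage_cost + request_cost


# PLACEHOLDER rates, not real prices: 0.02 per GB and 0.005 per 1,000 requests
print(estimate_monthly_cost(100, 50_000, "0.02", "0.005"))
```

Tagging, cross-region replication, and Transfer Acceleration charges are left out of this sketch and would need to be added separately.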

Subresources

Every bucket and object in Amazon S3 has a set of subordinate objects associated with it. These subordinate objects are called subresources of the object. Subresources cannot exist on their own; they are always associated with a bucket or an object. When this chapter was written, two subresources were associated with Amazon S3 objects:

  • acl: This is an access control list that defines the list of people who have access to the resource as well as what they can do with the resource.
  • torrent: You can use this to retrieve a .torrent file associated with the specific resource.
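
In the Amazon S3 REST API, a subresource is addressed by appending its name as a query string to the URL of the resource it belongs to. The following sketch only builds such a URL; the function name subresource_url is hypothetical, the example URL uses the bucket and file names from later in this chapter, and a real request would also need to be authenticated.

```python
SUBRESOURCES = {"acl", "torrent"}  # the two subresources listed above


def subresource_url(resource_url: str, subresource: str) -> str:
    """Append a subresource name to a resource URL as a query string."""
    if subresource not in SUBRESOURCES:
        raise ValueError(f"unknown subresource: {subresource}")
    return f"{resource_url}?{subresource}"


url = "https://s3.eu-west-2.amazonaws.com/com.asmtechnology.awsbook.testbucket1/sunset.jpg"
print(subresource_url(url, "acl"))
```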

Object Metadata

Two kinds of metadata are associated with each object in Amazon S3: system-defined and user-defined.

SYSTEM-DEFINED METADATA

As the name suggests, system-defined metadata is automatically maintained by Amazon S3 and includes information such as object creation date, object size, and more. Users cannot edit all system-defined metadata fields. Table 9.1 lists the system-defined metadata fields associated with an object.

TABLE 9.1: Amazon S3 System-Defined Metadata

NAME | DESCRIPTION | USER EDITABLE
Date | Date when the object was created. | No
Content-Length | Size of the object in bytes. | No
Last-Modified | Date when the object was last modified (or created if the object has never been modified). | No
Content-MD5 | MD5 hash of the object. | No
x-amz-server-side-encryption | Indicates whether server-side encryption is enabled for the object and which service is providing the encryption. | No
x-amz-version-id | The version number of the object, only applicable to objects that have versioning enabled. | No
x-amz-delete-marker | Only applicable to objects that have versioning enabled; for such objects this field indicates whether the object is a delete marker. | No
x-amz-storage-class | Storage class used for storing the object. | Yes
x-amz-website-redirect-location | If configured, allows you to redirect requests for the object to another object or external URL. | Yes
x-amz-server-side-encryption-aws-kms-key-id | Applicable only if server-side encryption is enabled on the object. Contains the ID of the encryption key that encrypted the object. | Yes
x-amz-server-side-encryption-customer-algorithm | Indicates if server-side encryption is enabled on the object using customer-provided keys. | Yes

USER-DEFINED METADATA

User-defined metadata is any additional key-value metadata provided by the user when the object was created.
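
When an object is uploaded through the Amazon S3 REST API, each user-defined metadata key is sent as a request header whose name begins with x-amz-meta-. The sketch below shows that mapping only; the function name metadata_headers and the sample keys are hypothetical, and no request is actually sent.

```python
def metadata_headers(user_metadata: dict) -> dict:
    """Map user-defined metadata to x-amz-meta-* request headers."""
    return {
        f"x-amz-meta-{key.lower()}": str(value)
        for key, value in user_metadata.items()
    }


# Hypothetical metadata for an uploaded photograph
print(metadata_headers({"Photographer": "A. Mishra", "rating": 5}))
```

The prefix keeps user-defined keys from colliding with the system-defined fields listed in Table 9.1.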

Common Tasks

In this section, you learn to use the AWS management console to create Amazon S3 buckets and manage the content in these buckets. Log in to the AWS management console using your dedicated IAM user-specific sign-in link and navigate to the Amazon S3 service home page (Figure 9.1).

Screenshot of accessing the Amazon S3 management console.

FIGURE 9.1 Accessing the Amazon S3 management console

Creating a Bucket

To create a new Amazon S3 bucket, follow these steps.

  1. Click the Create Bucket button on the Amazon S3 home page. The Amazon S3 service is available in all regions, so you do not need to select a region in the management console before you begin. If you have never created an Amazon S3 bucket, the home page you see is the S3 management console welcome page (see Figure 9.2).
    Screenshot of Amazon S3 management console welcome page.

    FIGURE 9.2 Amazon S3 management console welcome page

    If you have existing buckets in your Amazon S3 account, you will be presented with a page that lists them (Figure 9.3).

    Screenshot of a list of Amazon S3 buckets.

    FIGURE 9.3 List of Amazon S3 buckets

  2. Although the Amazon S3 service is available in all regions, each bucket is region-specific. Provide a unique name for your bucket and select the region in which you want to create it (Figure 9.4). Once you have specified the bucket name and region, click Next to proceed to the next screen.
    Screenshot of specifying the bucket name and region.

    FIGURE 9.4 Specifying the bucket name and region

    In this section, the bucket being created is named com.asmtechnology.samplebucket and is located in the EU (Ireland) region. The name you choose for your bucket must be globally unique, and prefixing a reverse domain name is a common practice to ensure unique naming.

  3. You are presented with a screen that will let you configure bucket versioning, logging, and cost allocation tags (Figure 9.5). You do not need to set up these options at this stage; click Next.
    Screenshot of configuring versioning, logging, and cost allocation tags.
    FIGURE 9.5 Configuring versioning, logging, and cost allocation tags
  4. You are presented with a screen that will let you configure permissions for the new bucket (Figure 9.6). By default, a new bucket is not accessible publicly, and can only be accessed by the user who created it via the AWS CLI or the AWS management console.
    Screenshot of configuring bucket permissions.

    FIGURE 9.6 Configuring bucket permissions

    Access to Amazon S3 resources is controlled using resource-based IAM policies. A resource-based IAM policy is a JSON document that describes which IAM users have access to a resource, and what they can do with the resource. Amazon S3 buckets and objects within the buckets have independent resource-based policies, and objects do not inherit permissions from a bucket.

    Each bucket also has an XML document associated with it, called an access control list (ACL). The ACL is used to control access to the bucket from other AWS accounts, and the general public.

    It is highly recommended to leave the default options unchanged on this screen, and change them (if needed) at a later point in time. Click Next to proceed.

  5. You are presented with a screen that summarizes the options and settings for the bucket that will be created (Figure 9.7). Click the Create Bucket button to create the bucket.
    Screenshot of bucket summary page.
    FIGURE 9.7 Bucket summary page
  6. The bucket will be created, and you will be taken to a page that lists all your Amazon S3 buckets. When you click the icon beside the name of a bucket from the list, a pop-up window will appear with shortcuts to options that allow you to configure bucket-specific settings (Figure 9.8).
    Screenshot of list of Amazon S3 buckets in your account.
    FIGURE 9.8 List of Amazon S3 buckets in your account

If you have one or more buckets, this screen becomes the home screen presented to you whenever you visit the Amazon S3 console.
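
Before you type a bucket name into the console, you can check locally that it is well formed. The sketch below encodes the commonly documented naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit; no consecutive dots; not formatted like an IP address). The function name is_valid_bucket_name is hypothetical, and a name that passes this check may still be rejected because another account already owns it.

```python
import ipaddress
import re

# Lowercase letters, digits, dots, and hyphens; 3 to 63 characters in total
_NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Check the format of a bucket name (not its global availability)."""
    if not _NAME_PATTERN.fullmatch(name):
        return False
    if ".." in name:
        return False  # consecutive dots are not allowed
    try:
        ipaddress.IPv4Address(name)
    except ValueError:
        return True  # not an IP address, so the format is acceptable
    return False  # names formatted like 192.168.1.1 are rejected


print(is_valid_bucket_name("com.asmtechnology.samplebucket"))
print(is_valid_bucket_name("My_Bucket"))
```

The reverse-domain name used in this chapter passes the check, whereas a name containing uppercase letters or underscores does not.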

Uploading an Object

Complete these steps to upload an object to an existing bucket.

  1. Click the name of the bucket in the list of buckets to access its contents (Figure 9.9).
    Screenshot of contents of Amazon S3 bucket.
    FIGURE 9.9 Contents of an Amazon S3 bucket
  2. Click the Upload button to bring up the File Upload dialog box (Figure 9.10). Use the options in the File Upload dialog box to select one or more files from your computer; then click the Next button.
    Screenshot of selecting files in the File Upload dialog box.
    FIGURE 9.10 Selecting files in the File Upload dialog box
  3. You are presented with a screen that will let you configure permissions for the new object (Figure 9.11). By default, a new object can only be accessed by the user who created it via the AWS CLI or the AWS management console. If you want the object to be accessible to other AWS accounts, you can add the accounts to the ACL for the object. If you would like the object to be accessible to users on the Internet via a URL, change the value in the Manage Public Permissions drop-down menu from Do Not Grant Public Read Access To This Object(s) to Grant Public Read Access To This Object(s).
    Screenshot of configuring object permissions.
    FIGURE 9.11 Configuring object permissions
  4. You are presented with a screen that will let you select a storage class and an encryption setting, and specify any user-defined metadata for the new file (Figure 9.12). By default, a new file uses the Standard storage class and no encryption. User-defined metadata is a set of key-value pairs that can only be specified at the point when an object is created. Accept the default options and click Next.
    Screenshot of configuring file storage class and encryption.
    FIGURE 9.12 Configuring file storage class and encryption
  5. You are presented with a screen that summarizes the options and settings for the file that will be uploaded (Figure 9.13). Click the Upload button to upload the file to the bucket.
    Screenshot of file summary page.

    FIGURE 9.13 File summary page

    Once the file has finished uploading, it appears in your bucket (Figure 9.14).

    Screenshot of Amazon S3 bucket showing a file.

    FIGURE 9.14 Amazon S3 bucket showing a file

Accessing an Object

To download an object from your Amazon S3 bucket onto your computer, follow these steps:

  1. Navigate to the bucket using the Amazon S3 management console, select the icon beside the name of the object, and click the Download button in the pop-up dialog that appears on the screen (Figure 9.15).
    Screenshot of downloading a file from a bucket.

    FIGURE 9.15 Downloading a file from a bucket

    If you do not want to use the management console, you can also access any object in Amazon S3 using a URL.

  2. To find the URL for an object in an Amazon S3 bucket, navigate to the bucket in the management console and select the object within the bucket. Copy the value of the Object URL field in the pop-up dialog (Figure 9.16).
    Screenshot of locating the Amazon S3 Object URL.

    FIGURE 9.16 Locating the Amazon S3 Object URL

    The value within the Object URL field is a URL that follows this naming convention:

    https://s3.<region name>.amazonaws.com/<bucket name>/<file name>

    For example, a file called sunset.jpg, in a bucket called com.asmtechnology.awsbook.testbucket1, in the eu-west-2 region can be accessed using the following URL:

    https://s3.eu-west-2.amazonaws.com/com.asmtechnology.awsbook.testbucket1/sunset.jpg

    If the bucket's settings block public access, or the file itself has not been made public, you will receive an access denied error when you try the URL in a web browser (Figure 9.17).

    Screenshot of non-public buckets and files that are not accessible using a URL.

    FIGURE 9.17 Non-public buckets and files are not accessible using a URL.

  3. Before you can make the object publicly accessible, you will need to change the public access settings for the bucket. Select the bucket and switch to the Permissions tab (Figure 9.18).
    Screenshot of accessing Amazon S3 bucket permissions.
    FIGURE 9.18 Accessing Amazon S3 bucket permissions
  4. Click the Edit button and uncheck all four options (Figure 9.19).
    Screenshot of configuring Amazon S3 bucket permissions.
    FIGURE 9.19 Configuring Amazon S3 bucket permissions
  5. Click the Save button.
  6. Switch to the Overview tab and select the object in the Amazon S3 bucket. Click the Make Public menu item under the Actions drop-down menu (Figure 9.20).
    Screenshot of accessing the Make Public option.
    FIGURE 9.20 Accessing the Make Public option
  7. Click the Make Public button in the pop-up dialog that appears on the screen (Figure 9.21).
    Screenshot of making a file publicly accessible.
    FIGURE 9.21 Making a file publicly accessible
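
The URL naming convention described in step 2 can be sketched as a small helper. The function name object_url is hypothetical, and percent-encoding the file name with urllib.parse.quote is an assumption added here so that names containing spaces still produce a usable URL.

```python
from urllib.parse import quote


def object_url(region: str, bucket: str, key: str) -> str:
    """Build an object URL following the convention shown in step 2."""
    # quote() leaves "/" untouched, so folder-like keys keep their shape
    return f"https://s3.{region}.amazonaws.com/{bucket}/{quote(key)}"


print(object_url("eu-west-2", "com.asmtechnology.awsbook.testbucket1", "sunset.jpg"))
```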

If you retry the URL in a web browser, you will be able to access the file. You can either set permissions at an individual object level, or you can set up permissions for the entire bucket.
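
Permissions for an entire bucket are typically expressed as a bucket policy, which is a JSON document of the kind described earlier in this chapter. The sketch below generates one such policy that would allow anyone to read every object in a bucket. The function name public_read_policy and the statement ID are hypothetical; treat this as an illustration of the document's shape, and apply a policy like this only to buckets whose entire contents are meant to be public.

```python
import json


def public_read_policy(bucket: str) -> str:
    """Return a bucket policy (as JSON text) granting public read access."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadGetObject",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": "*",  # anyone, including anonymous users
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",  # every object in the bucket
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(public_read_policy("com.asmtechnology.awsbook.testbucket1"))
```

The printed JSON is the text you would paste into the bucket policy editor on the bucket's Permissions tab.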

Changing the Storage Class of an Object

The default storage class of objects on Amazon S3 is Standard. To change the storage class of an object:

  1. Navigate to the bucket using the Amazon S3 management console and select the object from the contents of the bucket.
  2. Select the Change Storage Class menu item under the Actions drop-down menu (Figure 9.22).
    Screenshot of changing the storage class of an object.
    FIGURE 9.22 Changing the storage class of an object
  3. Select an option for the storage class and click the Save button.

Deleting an Object

To delete an object from an Amazon S3 bucket:

  1. Navigate to the bucket using the Amazon S3 management console and select the object from the contents of the bucket.
  2. Select the Delete menu item under the Actions drop-down menu (Figure 9.23).
Screenshot of deleting an object from an Amazon S3 bucket.

FIGURE 9.23 Deleting an object from an Amazon S3 bucket

Once you delete an object, it is permanently removed from Amazon S3. The only exception to this rule occurs when versioning has been enabled on a bucket, in which case an object that has been deleted from a bucket can be restored.

Amazon S3 Bucket Versioning

Versioning is a bucket-level concept that, when enabled, stores all versions of an object. You can download an older version of an object, and you can even recover an object after it has been deleted. Once versioning is enabled on a bucket, you cannot remove it. You can, however, temporarily suspend versioning.

To enable versioning on a bucket:

  1. Navigate to the bucket in the management console and click the Properties tab.
  2. Expand the Versioning section, select the Enable Versioning option, and click the Save button (Figure 9.24).
Screenshot of enabling bucket versioning.

FIGURE 9.24 Enabling bucket versioning

To understand how versioning works:

  1. Create a new text document on your computer called welcome_letter.txt and in that document type the following line:
    Welcome to the world of Amazon Web Services.
  2. Save the document on your computer and upload it to a bucket that has versioning enabled. To make the document accessible to the public you can select Grant Public Read Access To This Object(s) in the Manage Public Permissions drop-down (Figure 9.25). This will ensure you can access the document from a web browser.
    Screenshot of making an object publicly accessible while uploading it.
    FIGURE 9.25 Making an object publicly accessible while uploading it
  3. Obtain the URL for the document and open the document in a web browser. Your web browser should render the contents of the text document.
  4. Open the welcome_letter.txt file that you had previously saved on your computer, and edit its contents to resemble the following:
    Welcome to the world of Amazon Web Services.
    Amazon S3 versioning allows you to access older versions of documents.
  5. Save the file and upload it again to the same bucket.
  6. Once the document has finished uploading, click the row titled welcome_letter.txt to reveal a pop-up dialog with options. Expand the versions drop-down menu in the pop-up dialog to reveal links to the different versions of the document (Figure 9.26).
    Screenshot of accessing document versions.

    FIGURE 9.26 Accessing document versions

    The newest version of the document is always listed at the top. It is important to remember that you are charged for the combined space occupied by all versions of a document.

  7. If you want to delete a version of the document that you do not need, click the trash can icon beside a document version in the versions drop-down menu. When you delete a version of a document (and not the entire document itself), the version you are deleting is permanently lost.
  8. If instead you want to delete the document, select the document, and use the Delete menu item under the Actions menu.

When versioning is enabled on a bucket, you will see an additional selector that allows you to view all versions of the objects in your bucket (Figure 9.27).

Screenshot of version selector switch.

FIGURE 9.27 Version selector switch

When the selector switch is set to show versioned objects, you can see not only object versions, but also delete markers, which are special entries used to indicate that an object has been deleted. Restoring a deleted object is simply a matter of deleting the delete marker.
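
The interplay between versions and delete markers can be sketched with a toy in-memory model. This is not how Amazon S3 is implemented: the class name VersionedBucket, its method names, and the simple v1, v2 version IDs are all hypothetical (real version IDs are opaque strings assigned by Amazon S3).

```python
import itertools

DELETE_MARKER = object()  # stands in for an Amazon S3 delete marker


class VersionedBucket:
    """Toy model of a bucket with versioning enabled."""

    def __init__(self):
        self._versions = {}  # key -> list of (version_id, value), newest last
        self._counter = itertools.count(1)

    def _append(self, key, value):
        version_id = f"v{next(self._counter)}"
        self._versions.setdefault(key, []).append((version_id, value))
        return version_id

    def put(self, key, data):
        """Uploading again adds a new version instead of overwriting."""
        return self._append(key, data)

    def delete(self, key):
        """A plain delete only adds a delete marker on top of the versions."""
        return self._append(key, DELETE_MARKER)

    def delete_version(self, key, version_id):
        """Deleting a specific version (or marker) removes it permanently."""
        self._versions[key] = [
            entry for entry in self._versions.get(key, []) if entry[0] != version_id
        ]

    def get(self, key):
        """Return the newest version, unless it is hidden by a delete marker."""
        versions = self._versions.get(key, [])
        if not versions or versions[-1][1] is DELETE_MARKER:
            raise KeyError(key)
        return versions[-1][1]


bucket = VersionedBucket()
bucket.put("welcome_letter.txt", "first draft")
bucket.put("welcome_letter.txt", "second draft")
marker = bucket.delete("welcome_letter.txt")          # the object now appears deleted
bucket.delete_version("welcome_letter.txt", marker)   # removing the marker restores it
print(bucket.get("welcome_letter.txt"))
```

In this model, as in the console, every version remains stored until it is deleted individually, which is also why you are charged for the combined space of all versions.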

Accessing Amazon S3 Using the AWS CLI

You can use the AWS CLI to interact with Amazon S3 over the command line. Setup and configuration of the CLI client for Mac OS X and Windows is covered in Appendix C.

The general syntax of the aws command follows:

$ aws <service identifier> <service instructions>

The service identifier is a string that identifies an AWS service you want to interact with. The service identifier for Amazon S3 is s3 (in lowercase). Each service supports a different list of instructions. For a complete list of s3 instructions that are available within the CLI, visit http://docs.aws.amazon.com/cli/latest/userguide/using-s3-commands.html.

As an example, the ls instruction retrieves a list of all buckets in the AWS account whose credentials have been configured in the CLI. If you type the following instruction at the command prompt:

$ aws s3 ls

you receive a list of buckets:

Abhisheks-MacBook:~ abhishekmishra$ aws s3 ls
2017-01-15 16:52:59 com.asmtechnology.awsbook.testbucket1
Abhisheks-MacBook:~ abhishekmishra$

In addition to the high-level operations that can be performed using the s3 service identifier, Amazon also provides access to lower-level operations using the s3api service identifier. For more information on lower-level operations that can be performed on Amazon S3 buckets using the s3api service identifier, visit http://docs.aws.amazon.com/cli/latest/userguide/using-s3api-commands.html.
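
Beyond ls, the s3 service identifier also provides a cp instruction for copying files to and from a bucket. The sketch below assembles a few such command lines in Python so that they can be inspected before being run. The helper name s3_command and the local file names are hypothetical; the bucket name is the one used earlier in this chapter.

```python
import shlex


def s3_command(instruction, *args):
    """Build the argument list for a high-level 'aws s3' command."""
    return ["aws", "s3", instruction, *args]


bucket = "com.asmtechnology.awsbook.testbucket1"
commands = [
    s3_command("ls", f"s3://{bucket}"),                               # list the bucket's contents
    s3_command("cp", "sunset.jpg", f"s3://{bucket}/sunset.jpg"),      # upload a local file
    s3_command("cp", f"s3://{bucket}/sunset.jpg", "sunset-copy.jpg"), # download it again
]
for argv in commands:
    print(shlex.join(argv))
```

Each printed line can be typed at the command prompt as shown, or the argument list can be passed to subprocess.run on a computer where the AWS CLI has been installed and configured.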

Summary

  • Amazon S3 is a key-value, object-based storage service.
  • Amazon S3 organizes objects into buckets; bucket names must be globally unique.
  • Objects can be uploaded to Amazon S3 buckets using the AWS management console or the AWS CLI.
  • You can control access to both buckets and the objects in buckets.
  • Each object in Amazon S3 has a storage class associated with it. The storage class determines how Amazon S3 stores the data for the object and if you will be charged additional costs to read the data.
  • Amazon S3 versioning allows you to save multiple versions of an object. You are charged for the combined space occupied by all versions of a document.