Chapter 8
The Core Storage Services

THE AWS CERTIFIED CLOUD PRACTITIONER EXAM OBJECTIVES COVERED IN THIS CHAPTER MAY INCLUDE, BUT ARE NOT LIMITED TO, THE FOLLOWING:

  • Domain 2: Security
  • 2.2 Define AWS Cloud security and compliance concepts
  • 2.3 Identify AWS access management capabilities
  • Domain 3: Technology
  • 3.1 Define methods of deploying and operating in the AWS Cloud
  • 3.3 Identify the core AWS services


Introduction

For many organizations, their most valuable asset is their data. But organizations often have their data split between the cloud and on-premises locations such as offices and data centers. It’s therefore critical to understand the different options AWS provides for storing data in the cloud, transferring data between the cloud and on-premises, and doing both in a cost-effective way without sacrificing performance.

In this chapter, you’ll learn about the following AWS services that answer these challenges:

  • Simple Storage Service (S3)—Amazon’s flagship cloud storage service that lets you store and retrieve unlimited amounts of data. Because it’s part of the AWS ecosystem, data in S3 is available to other AWS services, including EC2, making it easy to keep storage and compute together for optimal performance.
  • S3 Glacier—Offers long-term archiving of infrequently accessed data, such as backups that must be retained for many years.
  • AWS Storage Gateway—A virtual appliance that moves data back and forth between your on-premises servers and AWS S3. Because it uses industry-standard storage protocols, it integrates with your existing infrastructure without special software.
  • AWS Snowball—A hardware storage appliance designed to physically move massive amounts of data to or from S3, particularly when transferring the data over a network would take days or weeks.

Simple Storage Service

Amazon Simple Storage Service (S3) lets you store and retrieve unlimited amounts of data from anywhere in the world at any time. You can use S3 to store any kind of file. Although AWS calls it “storage for the internet,” you can implement access controls and encryption to restrict access to your files to specific individuals and IP addresses. S3 opens up a variety of uses, both inside and outside of AWS. Many AWS services use S3 to store logs or retrieve data for processing, such as with analytics. You can even use S3 to host static websites!

Objects and Buckets

S3 differs from Elastic Block Store (EBS), which you learned about in Chapter 7, “The Core Compute Services.” Rather than storing blocks of raw data, S3 stores files, or, as AWS calls them, objects on disks in AWS data centers. You can store any kind of file, including text, images, videos, database files, and so on. Each object can be up to 5 TB in size.

The filename of an object is called its key and can be up to 1,024 bytes long. An object key can consist of alphanumeric characters and some special characters, including the following:

! - _ . * ( )

S3 stores objects in a container called a bucket that’s essentially a flat file system. A bucket can store an unlimited number of objects. When you create a bucket, you must assign it a name between 3 and 63 characters long, and the name must be globally unique across AWS. Within a bucket, each object key must be unique. One reason for this is to make it easier to access objects using an S3 endpoint URL. For example, the object text.txt stored in a bucket named benpiper would have a URL of https://benpiper.s3.amazonaws.com/text.txt.

 To ensure a globally unique name when creating a bucket, try using a friendly name with a random string appended to the end.

Although bucket names must be globally unique, each bucket—and by extension any objects in that bucket—can exist in only one region. This helps with latency, security, cost, and compliance requirements, as your data is stored only in the region in which you create the bucket. You can use cross-region replication to copy the contents of an object from a bucket in one region to a bucket in another, but the objects are still uniquely separate objects. AWS never moves objects between regions.

Even though S3 functions as a flat file system, you can organize objects into a folder structure by including a forward slash (/) delimiter in the name. For example, you could create an object with the name production/database.sql and another object named development/database.sql in the same bucket. Because the delimiter is part of the name, the objects are unique.
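
If you want to see what this looks like in practice, the following is a minimal sketch using the AWS SDK for Python (boto3). The bucket name and object contents are placeholders, not real resources, and the client assumes default credentials in the US East region.

import boto3

s3 = boto3.client('s3')

# Bucket names must be globally unique; this one is just a placeholder.
s3.create_bucket(Bucket='benpiper-demo-x7q2')

# The forward slash is part of each key, so these are two distinct objects.
s3.put_object(Bucket='benpiper-demo-x7q2', Key='production/database.sql', Body=b'-- production dump')
s3.put_object(Bucket='benpiper-demo-x7q2', Key='development/database.sql', Body=b'-- development dump')

# Listing by prefix gives the illusion of browsing a folder.
response = s3.list_objects_v2(Bucket='benpiper-demo-x7q2', Prefix='production/')
for obj in response.get('Contents', []):
    print(obj['Key'])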

As you might expect, S3 comes with a cost. You’re not charged for uploading data to S3, but you may be charged when you download data. For example, in the US East region, you can download up to 1 GB of data per month at no charge. Beyond that, the rate is $0.09 or less per gigabyte. S3 also charges you a monthly fee based on how much data you store and the storage class you use.

S3 Storage Classes

All data is not created equal. Depending on its importance, certain files may require more or less availability or protection against loss than others. Some data you store in S3 is irreplaceable, and losing it would be catastrophic. Examples of this might include digital photos, encryption keys, and medical records.

Durability and Availability

Objects that need to remain intact and free from inadvertent deletion or corruption are said to need high durability, which is the percent likelihood that an object will not be lost over the course of a year. The greater the durability of the storage medium, the less likely you are to lose an object.

To understand how durability works, consider the following two examples. First, suppose you store 100,000,000,000 objects on storage with 99.999999999 percent durability. That means you could expect to lose only one (0.000000001 percent) of those objects over the course of a year! Now consider a different example. Suppose you store the same number of objects on storage with only 99.99 percent durability. You’d stand to lose 10,000,000 objects per year!

Different data also has differing availability requirements. Availability is the percent of time an object will be available for retrieval. For instance, a patient’s medical records may need to be available around the clock, 365 days a year. Such records would need a high degree of both durability and availability.

The level of durability and availability of an object depends on its storage class. S3 offers six different storage classes at different price points. S3 charges you a monthly storage cost based on the amount of data you store and the storage class you use. Table 8.1 gives a complete listing of storage classes.

TABLE 8.1 S3 Storage Classes

Storage class               Durability      Availability  Availability Zones  Storage pricing in the US East region (per GB/month)
STANDARD                    99.999999999%   99.99%        >2                  $0.023
STANDARD_IA                 99.999999999%   99.9%         >2                  $0.0125
INTELLIGENT_TIERING         99.999999999%   99.9%         >2                  Frequent access tier: $0.023; infrequent access tier: $0.0125
ONEZONE_IA                  99.999999999%   99.5%         1                   $0.01
GLACIER                     99.999999999%   Varies        >2                  $0.004
REDUCED_REDUNDANCY (RRS)    99.99%          99.99%        >2                  $0.024

With one exception that we’ll cover in the following section, you’ll want the highest degree of durability available. Your choice of storage class then hinges upon your desired level of availability, which depends on how frequently you’ll need to access your files. You may find it helpful to categorize your files in terms of the following three access patterns:

  • Frequently accessed objects
  • Infrequently accessed objects
  • A mixture of frequently and infrequently accessed objects

Storage Classes for Frequently Accessed Objects

If you need to access objects frequently and with minimal latency, the following two storage classes fit the bill:

STANDARD This is the default storage class. It offers the highest levels of durability and availability, and your objects are always replicated across at least three Availability Zones in a region.

REDUCED_REDUNDANCY The REDUCED_REDUNDANCY (RRS) storage class is meant for data that can be easily replaced, if it needs to be replaced at all. It has the lowest durability of all the classes, but it has the same availability as STANDARD. AWS recommends against using this storage class but keeps it available for people who have processes that still depend on it. If you see anyone using it, do them a favor and tell them to move to a storage class with higher durability!

Storage Classes for Infrequently Accessed Objects

Two of the storage classes designed for infrequently accessed objects are suffixed with the “IA” initialism for “infrequent access.” These IA classes offer millisecond-latency access and high durability but the lowest availability of all the classes. They’re designed for objects that are at least 128 KB in size. You can store smaller objects, but each will be billed as if it were 128 KB:

STANDARD_IA This class is designed for important data that can’t be re-created. Objects are stored in multiple Availability Zones and have an availability of 99.9 percent.

ONEZONE_IA Objects stored using this storage class are kept in only one Availability Zone and consequently have the lowest availability of all the classes: only 99.5 percent. An outage of one Availability Zone could affect availability of objects stored in that zone. Although unlikely, the destruction of an Availability Zone could result in the loss of objects stored in that zone. Use this class only for data that you can re-create or have replicated elsewhere.

GLACIER The GLACIER class is designed for long-term archiving of objects that rarely need to be retrieved. Objects in this storage class are stored using the S3 Glacier service, which you’ll read about later in this chapter. Unlike the other storage classes, you can’t retrieve an object in real time. Instead, you must initiate a restore request for the object and wait until the restore is complete. The time it takes to complete a restore depends on the retrieval option you choose and can range from 1 minute to 12 hours. Consequently, the availability of data stored in Glacier varies. Refer to the “S3 Glacier” section later in this chapter for information on retrieval options.

Storage Class for Both Frequently and Infrequently Accessed Objects

S3 currently offers only one storage class designed for both frequently and infrequently accessed objects:

INTELLIGENT_TIERING This storage class automatically moves objects to the most cost-effective storage tier based on past access patterns. An object that hasn’t been accessed for 30 consecutive days is moved to the lower-cost infrequent access tier. Once the object is accessed, it’s moved back to the frequent access tier. Note that objects less than 128 KB are always charged at the higher-cost frequent access tier rate. In addition to storage pricing, you’re charged a monthly monitoring and automation fee.
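
You select a storage class on a per-object basis when you upload. As a minimal sketch using boto3 (the bucket name and files are placeholders), specifying a class other than the default STANDARD might look like this:

import boto3

s3 = boto3.client('s3')

# Store a log file directly in the infrequent access class.
with open('2023-01-01.log', 'rb') as f:
    s3.put_object(
        Bucket='benpiper-demo-x7q2',   # placeholder bucket name
        Key='logs/2023-01-01.log',
        Body=f,
        StorageClass='STANDARD_IA'     # STANDARD is used if this is omitted
    )

# Or let S3 decide the tier based on access patterns.
with open('quarterly.pdf', 'rb') as f:
    s3.put_object(
        Bucket='benpiper-demo-x7q2',
        Key='reports/quarterly.pdf',
        Body=f,
        StorageClass='INTELLIGENT_TIERING'
    )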

Access Permissions

S3 is storage for the internet, but that doesn’t mean everyone on the internet can read your data. By default, objects you put in S3 are inaccessible to anyone outside of your AWS account.

S3 offers the following three methods of controlling who may read, write, or delete objects stored in your S3 buckets:

  • Bucket policies
  • User policies
  • Bucket and object access control lists

Bucket Policies

A bucket policy is a resource-based policy that you apply to a bucket. You can use bucket policies to grant access to all objects within a bucket or just to specific objects, and you can control which principals and accounts may read, write, or delete those objects. You can even grant anonymous read access to make an object, such as a webpage or image, available to everyone on the internet.
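
As an illustration, here's a minimal sketch of attaching a bucket policy with boto3 that grants anonymous read access to every object in a placeholder bucket. Depending on the account's Block Public Access settings, you may also need to allow public bucket policies before this takes effect.

import boto3, json

s3 = boto3.client('s3')

# Allow anyone on the internet to read (but not write or delete) objects.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::benpiper-demo-x7q2/*"   # placeholder bucket
    }]
}

s3.put_bucket_policy(Bucket='benpiper-demo-x7q2', Policy=json.dumps(policy))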

User Policies

In Chapter 5, “Securing Your AWS Resources,” you learned about Identity and Access Management (IAM) user policies. You can use these policies to grant IAM principals access to S3 objects. Unlike bucket policies that you apply to a bucket, you can apply user policies only to an IAM principal. Keep in mind that you can’t use user policies to grant public (anonymous) access to an object.

Bucket and Object Access Control Lists

Bucket and object access control lists (ACLs) are legacy access control methods that have mostly been superseded by bucket and user policies. Nevertheless, you can still use bucket and object ACLs to grant other AWS accounts and anonymous users access to your S3 resources. You can’t use ACLs to grant access to specific IAM principals. Due in part to this limitation, AWS recommends using bucket and user policies instead of ACLs whenever possible.

 You can use any combination of bucket policies, user policies, and access control lists. They’re not mutually exclusive.

Encryption

S3 doesn’t change the contents of an object when you upload it. That means if you upload a document containing personal information, that document is stored unencrypted. Using appropriate access permissions can protect your data from unauthorized access, but to add an additional layer of security, you have the option of encrypting objects before storing them in S3. This is called encryption at rest. S3 gives you the following two options for encrypting objects at rest:

  • Server-side encryption—When you create an object, S3 encrypts the object and saves only the encrypted content. When you retrieve the object, S3 decrypts it and delivers the unencrypted object. Server-side encryption is the easiest to implement and doesn’t require you to keep track of encryption keys. Amazon manages the keys and therefore has access to your objects.
  • Client-side encryption—You encrypt the data prior to uploading it to S3. You must decrypt the object when you retrieve it from S3. This option is more complicated, as you’re responsible for encryption and decryption. If you lose the key used to encrypt an object, you won’t be able to decrypt it. Organizations with strict security requirements may choose this option to ensure Amazon doesn’t have the ability to read their encrypted objects.
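
To make that concrete, here's a minimal sketch of server-side encryption using boto3; the bucket name and key are placeholders. With the ServerSideEncryption parameter set, S3 encrypts the object at rest using keys that Amazon manages and transparently decrypts it on retrieval:

import boto3

s3 = boto3.client('s3')

# Ask S3 to encrypt the object at rest with Amazon-managed keys (SSE-S3).
s3.put_object(
    Bucket='benpiper-demo-x7q2',            # placeholder bucket name
    Key='records/patient-1234.txt',
    Body=b'sensitive data',
    ServerSideEncryption='AES256'
)

# Retrieval is transparent: S3 decrypts the object before returning it.
obj = s3.get_object(Bucket='benpiper-demo-x7q2', Key='records/patient-1234.txt')
print(obj['Body'].read())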

Versioning

To further protect against accidentally deleting or overwriting the contents of your important files, you can use versioning. To understand how versioning works, consider this example. Without versioning, if you upload an object with the same name as an existing object in the same bucket, the contents of the original object will get overwritten. But if you enable versioning on the bucket and then upload an object with the same name as an existing object, S3 will simply create a new version of that object. The original version will remain intact and available.

If you delete an object in a bucket on which versioning is enabled, the contents of the object aren't deleted. Instead, S3 adds a delete marker to the object and hides it from the S3 service console view.

Versioning is disabled by default when you create a bucket. You must explicitly enable versioning on a bucket to use it, and it applies to all objects in the bucket. There’s no limit to the number of versions of an object you can store. You can delete versions manually or automatically using object life cycle configurations.
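
Enabling versioning is a single API call against the bucket. Here's a minimal sketch with boto3, using placeholder names, that enables versioning and then uploads the same key twice to produce two versions:

import boto3

s3 = boto3.client('s3')

# Versioning is off by default and must be enabled per bucket.
s3.put_bucket_versioning(
    Bucket='benpiper-demo-x7q2',                     # placeholder bucket name
    VersioningConfiguration={'Status': 'Enabled'}
)

# Uploading to the same key now creates a new version instead of overwriting.
s3.put_object(Bucket='benpiper-demo-x7q2', Key='report.txt', Body=b'first draft')
s3.put_object(Bucket='benpiper-demo-x7q2', Key='report.txt', Body=b'final draft')

versions = s3.list_object_versions(Bucket='benpiper-demo-x7q2', Prefix='report.txt')
for v in versions.get('Versions', []):
    print(v['VersionId'], v['IsLatest'])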

 Deleting a bucket will also delete all objects contained in it. Be careful!

Object Life Cycle Configurations

Because S3 can store practically unlimited amounts of data, it’s possible to run up quite a bill if you continually upload objects but never delete any. This could happen if you have an application that frequently uploads log files to a bucket. You may also spend more than necessary by keeping objects in a more expensive storage class when a cheaper one would meet your needs. Object life cycle configurations can help you control costs by automatically moving objects to different storage classes or deleting them after a time.

Object life cycle configuration rules are applied to a bucket and consist of one or both of the following types of actions:

Transition actions Transition actions move objects to a different storage class once they’ve reached a certain age. For example, you can create a rule to move objects from the STANDARD storage class to the STANDARD_IA storage class 90 days after creation.

Expiration actions These can automatically delete objects after they reach a certain age. For example, you can create a rule to delete an object older than 365 days. If you have versioning enabled on a bucket, you can create expiration actions to delete object versions of a certain age. This allows you to take advantage of versioning without having to store endless versions of an object.

It’s common to use both types of actions together. For example, if you’re storing web server log files in a bucket, you may want to initially keep each log file in STANDARD storage. Once the file reaches 90 days, you can have S3 transition it to STANDARD_IA storage. Then, after 365 days in STANDARD_IA storage, the file is deleted.
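
Expressed as a life cycle rule, that example might look like the following boto3 sketch (the bucket name and prefix are placeholders). Note that the expiration age is measured from object creation, so deleting a file 365 days after it transitions means expiring it at 455 days:

import boto3

s3 = boto3.client('s3')

s3.put_bucket_lifecycle_configuration(
    Bucket='benpiper-demo-x7q2',                 # placeholder bucket name
    LifecycleConfiguration={
        'Rules': [{
            'ID': 'rotate-web-logs',
            'Filter': {'Prefix': 'logs/'},       # apply only to log objects
            'Status': 'Enabled',
            'Transitions': [{'Days': 90, 'StorageClass': 'STANDARD_IA'}],
            'Expiration': {'Days': 455}          # 90 days + 365 days
        }]
    }
)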

Follow the steps in Exercise 8.1 to create your own S3 bucket.

S3 Glacier

S3 Glacier offers long-term archiving of infrequently accessed data at an incredibly low cost. You upload large amounts of data for long-term, durable storage in the hopes that you’ll never need it, but with the expectation that it will be there in the unlikely event that you do. Glacier guarantees 99.999999999 percent durability over a given year. Not coincidentally, this is the same durability level as the GLACIER storage class in S3.

Archives and Vaults

With Glacier, you store one or more files in an archive, which is a block of information. Although you can store a single file in an archive, the more common approach is to combine multiple files into a .zip or .tar file and upload that as an archive. An archive can be anywhere from 1 byte to 40 TB. Glacier stores archives in a vault, which is a region-specific container (much like an S3 bucket) that stores archives. Vault names must be unique within a region but don’t have to be globally unique.

You can create and delete vaults using the Glacier service console. But to upload, download, or delete archives, you must use the AWS command line interface (CLI) or write your own code to interact with Glacier using an AWS software development kit (SDK). There are also third-party programs that let you interact with Glacier, such as CloudBerry Backup (https://www.cloudberrylab.com/backup.aspx), FastGlacier (https://fastglacier.com), and Arq Backup (https://www.arqbackup.com).
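
If you use an SDK, uploading an archive might look like the following minimal sketch in Python with boto3; the vault name and file are placeholders:

import boto3

glacier = boto3.client('glacier')

# Vault names only need to be unique within a region.
glacier.create_vault(vaultName='yearly-backups')

# Upload a compressed bundle of files as a single archive.
with open('backup-2023.tar.gz', 'rb') as archive:
    response = glacier.upload_archive(
        vaultName='yearly-backups',
        archiveDescription='2023 year-end backup',
        body=archive
    )

# Save the archive ID; you'll need it later to retrieve or delete the archive.
print(response['archiveId'])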

Retrieval Options

Because Glacier is designed for long-term archiving, it doesn’t provide real-time access to archives. Downloading an archive from Glacier is a two-step process that requires initiating a retrieval job and then downloading your data once the job is complete. The length of time it takes to complete a job depends on the retrieval option you choose. There are three retrieval options:

Expedited Except during times of unusually high demand, expedited retrievals usually complete within 1 to 5 minutes, although archives larger than 250 MB may take longer. In the US East region, the cost is $0.03 per gigabyte. You can optionally purchase provisioned capacity to ensure expedited retrievals complete in a timely fashion.

Standard Standard retrievals typically complete within 3 to 5 hours. This is the default option. The cost of this option is $0.01 per gigabyte in the US East region.

Bulk Bulk retrievals are the lowest-cost option, at $0.0025 per gigabyte in the US East region, and they typically complete within 5 to 12 hours.
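
A retrieval using boto3 might look like the following minimal sketch, where the vault name and archive ID are placeholders. The Tier parameter selects the Expedited, Standard, or Bulk option:

import boto3

glacier = boto3.client('glacier')

# Step 1: initiate a retrieval job for a specific archive.
job = glacier.initiate_job(
    vaultName='yearly-backups',                 # placeholder vault name
    jobParameters={
        'Type': 'archive-retrieval',
        'ArchiveId': 'EXAMPLE-ARCHIVE-ID',      # ID returned at upload time
        'Tier': 'Standard'                      # or 'Expedited' or 'Bulk'
    }
)

# Step 2: once the job completes (hours later for Standard), download the data.
status = glacier.describe_job(vaultName='yearly-backups', jobId=job['jobId'])
if status['Completed']:
    output = glacier.get_job_output(vaultName='yearly-backups', jobId=job['jobId'])
    with open('restored-backup.tar.gz', 'wb') as f:
        f.write(output['body'].read())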

AWS Storage Gateway

AWS Storage Gateway makes it easy to connect your existing on-premises servers to storage in the AWS cloud. Because it uses industry-standard storage protocols, there’s no need to install special software on your existing servers. Instead, you just provision an AWS Storage Gateway virtual machine on-premises and connect your servers to it. Storage Gateway handles the data transfer between your servers and the AWS storage infrastructure. The virtual machine can run on a VMware ESXi or Microsoft Hyper-V hypervisor.

AWS Storage Gateway offers the following three virtual machine types for different use cases:

  • File gateways
  • Volume gateways
  • Tape gateways

File Gateways

A file gateway lets you use the Network File System (NFS) and Server Message Block (SMB) protocols to store data in Amazon S3. Although data is stored on S3, it’s cached locally, allowing for low-latency access. A file gateway can function as a normal on-premises file server. Because data is stored in S3, you can take advantage of all S3 features including versioning, bucket policies, life cycle management, encryption, and cross-region replication.

Volume Gateways

Volume gateways offer S3-backed storage volumes that your on-premises servers can use via the Internet Small Computer System Interface (iSCSI) protocol. Volume gateways support the following two configurations:

Stored volumes With a stored volume, Storage Gateway stores all data locally and asynchronously backs it up to S3 as Elastic Block Store (EBS) snapshots. Stored volumes are a good option if you need uninterrupted access to your data. A stored volume can be from 1 GB to 16 TB in size.

Cached volumes The volume gateway stores all your data on S3, and only a frequently used subset of that data is cached locally. A cached volume can range from 1 GB to 32 TB in size, making this a good option if you have a limited amount of local storage. Because only a subset of data is cached locally, any interruption in connectivity to AWS could make some data inaccessible. If this isn't acceptable, you should use stored volumes instead.

Both configurations allow you to take manual or scheduled EBS snapshots of your volumes. EBS snapshots are always stored in S3. You can restore an EBS snapshot to an on-premises gateway storage volume or an EBS volume that you can attach to an EC2 instance.

Tape Gateways

A tape gateway mimics traditional tape backup infrastructure. It works with common backup applications such as Veritas Backup Exec and NetBackup, Commvault, and Microsoft System Center Data Protection Manager. You simply configure your backup application to connect to the tape gateway via iSCSI. On the tape gateway, you create virtual tapes that can store between 100 GB and 2.5 TB each.

A tape gateway stores virtual tapes in a virtual tape library (VTL) backed by S3. Here’s how it works: when your backup software writes data to a virtual tape, the tape gateway asynchronously uploads that data to S3. Because backups can be quite large and take a long time to upload, the tape gateway keeps the data in cache storage until the upload is complete. If you need to recover data from a virtual tape, the tape gateway will download the data from the S3-backed VTL and store it in its cache.

For cost-effective, long-term storage, you can archive virtual tapes by moving them out of a VTL and into a virtual tape shelf backed by Glacier. To restore an archived virtual tape, you must initiate a retrieve request, which can take 3 to 5 hours to complete. Once the retrieval finishes, the virtual tape appears in your S3-backed VTL, ready to transfer back to the tape gateway.

AWS Snowball

AWS Snowball is a hardware appliance designed to move massive amounts of data between your site and the AWS cloud in a short time. Some common use cases for Snowball include the following:

  • Migrating data from an office or data center to the AWS cloud
  • Quickly transferring a large amount of data to or from S3 for backup or recovery purposes
  • Distributing large volumes of content to customers and partners

The idea behind Snowball is that it's quicker to physically ship a large amount of data than it is to transfer it over a network. For instance, suppose you want to migrate a 40 TB database to AWS. Such a transfer, even over a blazing-fast 1 Gbps connection, would still take more than 4 days!

But instead, for a nominal fee, AWS will send you a Snowball device. You simply transfer your files to it and ship it back. When AWS receives it, AWS transfers the files from Snowball to one or more S3 buckets. You’re not charged any transfer fees for importing files into S3, and once there, they’re immediately available for use by other AWS services.

Hardware Specifications

The largest 80 TB Snowball device costs $250 to use and can store up to 72 TB. If you don’t need to transfer that much, you can opt for the slightly smaller 50 TB Snowball, which costs $200 and stores up to 42 TB. Snowball’s RJ45 and SFP+ network interfaces support speeds up to 10 Gbps, making it possible to transfer 72 TB of data to the device in about 2.5 days! (Although a 10 Gbps connection can transfer this amount of data in less than a day, the write speeds of the solid-state drives [SSDs] in Snowball limit the effective transfer rate to around 3 Gbps.)

Once you receive your Snowball, you can keep it for 10 days without incurring any additional costs. If you hold onto it longer than that, you’ll be charged an extra $15 per day. You’re allowed to keep Snowball for up to 90 days, which is more than enough time to fill it up.

 You can use Snowball to export data from S3, but you’ll be charged outbound S3 transfer rates, which range from $0.03 to $0.05 per gigabyte, depending on the region.

Security

Snowball is contained in a rugged, tamper-resistant enclosure. It includes a trusted platform module (TPM) chip that detects unauthorized modifications to the hardware, software, or firmware. After each use, AWS verifies that the TPM did not detect any tampering. If any tampering is detected by the TPM chip or if the device appears damaged, AWS does not transfer any data from it.

Snowball uses two layers of encryption. First, when you transfer data to or from Snowball, the data is encrypted in transit using SSL. Second, the data you put on a Snowball is always encrypted at rest. Snowball enforces data encryption by requiring you to transfer data to it using either the Snowball Client or the more advanced S3 SDK Adapter for Snowball. The former doesn't require any coding knowledge. Both run on Linux, macOS, and Windows operating systems.

Data is encrypted using AES 256-bit encryption that's enforced by the Snowball Client or S3 SDK Adapter for Snowball, ensuring that the device never stores your data unencrypted. As an added security measure, AWS erases your data from Snowball before sending it to another customer, following the media sanitization standards published by the National Institute of Standards and Technology (NIST) in Special Publication 800-88.

Snowball Edge

Snowball Edge is like Snowball but offers a wider variety of features. Snowball Edge offers the same network connectivity options as Snowball but adds a QSFP+ port, allowing you to achieve faster network speeds than Snowball. Also, Snowball is designed to transfer large amounts of data only between your local environment and S3. Snowball Edge offers the same functionality plus the following:

  • Local storage for S3 buckets
  • Compute power for EC2 instances and Lambda functions locally
  • File server functionality using the Network File System (NFS) version 3 and 4 protocols

There are three different device options to choose from, each optimized for a different application:

Storage Optimized This option provides up to 80 TB of usable storage, 24 vCPUs, and 32 GB of memory for compute applications. The QSFP+ network interface supports up to 40 Gbps.

Compute Optimized This offers the most compute power, giving you 52 vCPUs and 208 GB of memory. It has 39.5 TB of usable storage, plus 7.68 TB dedicated to compute instances.

Compute Optimized with GPU This is identical to the Compute Optimized option, except it includes an NVIDIA V100 Tensor Core graphical processing unit (GPU), making it ideal for machine learning and high-performance computing applications. Both Compute Optimized device options feature a QSFP+ network interface capable of speeds up to 100 Gbps.

You can cluster 5 to 10 Snowball Edge devices together to build a local, highly available compute or storage cluster. This is useful if you have a large amount of data that you need to process locally.

 Snowball Edge doesn't support virtual private clouds (VPCs). You can't place an EC2 instance running on Snowball Edge in a VPC. Instead, you assign the instance an IP address on your local network.

Table 8.2 highlights some key similarities and differences between Snowball and Snowball Edge.

TABLE 8.2 Comparison of Snowball and Snowball Edge

Feature                               Snowball  Snowball Edge
Transfer data to and from S3          Yes       Yes
Local EC2 instances                   No        Yes
Local compute with Lambda             No        Yes
File server functionality using NFS   No        Yes
Local S3 buckets                      No        Yes

Summary

S3 is the primary storage service in AWS. Although S3 integrates with all other AWS services, it enjoys an especially close relationship with S3 Glacier and the AWS compute services: EC2 and Lambda.

For durable, highly available cloud storage, use S3. You can use bucket policies to make your files as private or as public as you want. If you need long-term storage of infrequently accessed data at a lower cost, S3 Glacier is your best bet.

When it comes to local storage, AWS Storage Gateway lets you access your data by going through a virtual machine that automatically synchronizes your data with S3.

For getting your files to or from S3, most often you’ll just transfer your files over the internet, a virtual private network (VPN), or Direct Connect link. But for large amounts of data this is impractical, and a hardware solution makes more sense. Snowball is a rugged, tamper-resistant device that AWS ships to you. You drop your files onto it, send it back, and AWS transfers the files to S3.

Another hardware option is Snowball Edge. It has the same functionality as Snowball but can also function as a durable local file server using the NFSv3 and NFSv4 protocols. Additionally, you can use it to run EC2 instances or Lambda functions locally.

Exam Essentials

Understand the difference between durability and availability in S3. Durability is the likelihood that an object won’t be lost over the course of a year. Availability is the percentage of time an object will be accessible during the year.

Be able to select the best S3 storage class given cost, compliance, and availability requirements. S3 offers six storage classes. STANDARD has the highest availability at 99.99 percent, replicates objects across at least three zones, and is the most expensive in terms of monthly storage cost per gigabyte. ONEZONE_IA has the lowest availability at 99.5 percent and stores objects in only one zone, and its monthly per-gigabyte storage cost is less than half that of the STANDARD storage class.

Know the different options for getting data into and out of S3. You can upload or download an object by using the S3 service console, by using the AWS CLI, or by directly accessing the object’s URL. AWS Storage Gateway lets your on-premises servers use industry-standard storage protocols such as iSCSI, NFS, and SMB to transfer data to and from S3. AWS Snowball and Snowball Edge allow secure physical transport of data to and from S3.

Understand when to use bucket policies, user policies, and access control lists in S3. Use bucket policies or ACLs to grant anonymous access to objects, such as webpages or images you want made public. Use user policies to grant specific IAM principals in your account access to objects.

Be able to explain the differences between S3 and Glacier. S3 offers highly available, real-time retrieval of objects. Retrieving data from Glacier is a two-step process that requires first requesting an archive using the Expedited, Standard, or Bulk retrieval option and then downloading the archive once the retrieval is complete.

Know how to use encryption, versioning, and object life cycle configurations in S3. S3 offers server-side and client-side encryption to protect objects at rest from unauthorized access. Versioning helps protect against object overwrites and deletions. Object life cycle configurations let you delete objects or move them to different storage classes after they reach a certain age.

Understand the three virtual machine types offered by AWS Storage Gateway. File gateways offer access to S3 via the NFS and SMB storage protocols. Volume gateways and tape gateways offer access via the iSCSI block storage protocol, but tape gateways are specifically designed to work with common backup applications.

Review Questions

  1. When trying to create an S3 bucket named documents, AWS informs you that the bucket name is already in use. What should you do in order to create a bucket?

    1. Use a different region.
    2. Use a globally unique bucket name.
    3. Use a different storage class.
    4. Use a longer name.
    5. Use a shorter name.
  2. Which S3 storage classes are most cost-effective for infrequently accessed data that can’t be easily replaced? (Select TWO.)

    1. STANDARD_IA
    2. ONEZONE_IA
    3. GLACIER
    4. STANDARD
    5. INTELLIGENT_TIERING
  3. What are the major differences between Simple Storage Service (S3) and Elastic Block Store (EBS)? (Select TWO.)

    1. EBS stores volumes.
    2. EBS stores snapshots.
    3. S3 stores volumes.
    4. S3 stores objects.
    5. EBS stores objects.
  4. Which tasks can S3 object life cycle configurations perform automatically? (Select THREE.)

    1. Deleting old object versions
    2. Moving objects to Glacier
    3. Deleting old buckets
    4. Deleting old objects
    5. Moving objects to an EBS volume
  5. What methods can be used to grant anonymous access to an object in S3? (Select TWO.)

    1. Bucket policies
    2. Access control lists
    3. User policies
    4. Security groups
  6. Your budget-conscious organization has a 5 TB database file it needs to retain off-site for at least 5 years. In the event the organization needs to access the database, it must be accessible within 8 hours. Which cloud storage option should you recommend, and why? (Select TWO.)

    1. S3 has the most durable storage.
    2. S3.
    3. S3 Glacier.
    4. Glacier is the most cost effective.
    5. S3 has the fastest retrieval times.
    6. S3 doesn’t support object sizes greater than 4 TB.
  7. Which of the following actions can you perform from the S3 Glacier service console?

    1. Delete an archive
    2. Create a vault
    3. Create an archive
    4. Delete a bucket
    5. Retrieve an archive
  8. Which Glacier retrieval option generally takes 3 to 5 hours to complete?

    1. Provisioned
    2. Expedited
    3. Bulk
    4. Standard
  9. What’s the minimum size for a Glacier archive?

    1. 1 byte
    2. 40 TB
    3. 5 TB
    4. 0 bytes
  10. Which types of AWS Storage Gateway let you connect your servers to block storage using the iSCSI protocol? (Select TWO.)

    1. Cached gateway
    2. Tape gateway
    3. File gateway
    4. Volume gateway
  11. Where does AWS Storage Gateway primarily store data?

    1. Glacier vaults
    2. S3 buckets
    3. EBS volumes
    4. EBS snapshots
  12. You need an easy way to transfer files from a server in your data center to S3 without having to install any third-party software. Which of the following services and storage protocols could you use? (Select FOUR.)

    1. AWS Storage Gateway—file gateway
    2. iSCSI
    3. AWS Snowball
    4. SMB
    5. AWS Storage Gateway—volume gateway
    6. The AWS CLI
  13. Which of the following are true regarding the AWS Storage Gateway—volume gateway configuration? (Select THREE.)

    1. Stored volumes asynchronously back up data to S3 as EBS snapshots.
    2. Stored volumes can be up to 32 TB in size.
    3. Cached volumes locally store only a frequently used subset of data.
    4. Cached volumes asynchronously back up data to S3 as EBS snapshots.
    5. Cached volumes can be up to 32 TB in size.
  14. What’s the most data you can store on a single Snowball device?

    1. 42 TB
    2. 50 TB
    3. 72 TB
    4. 80 TB
  15. Which of the following are security features of AWS Snowball? (Select TWO.)

    1. It enforces encryption at rest.
    2. It uses a Trusted Platform Module (TPM) chip.
    3. It enforces NFS encryption.
    4. It has tamper-resistant network ports.
  16. Which of the following might AWS do after receiving a damaged Snowball device from a customer?

    1. Copy the customer’s data to Glacier
    2. Replace the Trusted Platform Module (TPM) chip
    3. Securely erase the customer’s data from the device
    4. Copy the customer’s data to S3
  17. Which of the following can you use to transfer data to AWS Snowball from a Windows machine without writing any code?

    1. NFS
    2. The Snowball Client
    3. iSCSI
    4. SMB
    5. The S3 SDK Adapter for Snowball
  18. How do the AWS Snowball and Snowball Edge devices differ? (Select TWO.)

    1. Snowball Edge supports copying files using NFS.
    2. Snowball devices can be clustered together for storage.
    3. Snowball’s QSFP+ network interface supports speeds up to 40 Gbps.
    4. Snowball Edge can run EC2 instances.
  19. Which of the following Snowball Edge device options is the best for running machine learning applications?

    1. Compute Optimized
    2. Compute Optimized with GPU
    3. Storage Optimized
    4. Network Optimized
  20. Which of the following hardware devices offers a network interface speed that supports up to 100 Gbps?

    1. Snowball Edge with the Storage Optimized configuration
    2. Snowball Edge with the Compute Optimized configuration
    3. Storage Gateway
    4. 80 TB Snowball