Chapter 7
Amazon CloudFront

THE AWS CERTIFIED ADVANCED NETWORKING – SPECIALTY EXAM OBJECTIVES COVERED IN THIS CHAPTER MAY INCLUDE, BUT ARE NOT LIMITED TO, THE FOLLOWING:

  • Domain 2.0: Design and Implement AWS Networks
  • 2.4. Determine network requirements for a specialized workload
  • 2.5. Derive an appropriate architecture based on customer and application requirements
  • Domain 4.0: Configure Network Integration with Application Services
  • 4.5. Determine a content distribution strategy to optimize for performance

Introduction to Amazon CloudFront

Amazon CloudFront is a global Content Delivery Network service that speeds up the distribution of your static and dynamic web content. Amazon CloudFront delivers your content through a worldwide network of edge locations. Amazon CloudFront integrates with other AWS products to give developers and organizations an easy way to distribute content to end users with low latency, high data transfer speeds, and no minimum usage commitments. This chapter reviews the components that make up Amazon CloudFront and then examines its advanced features. The chapter concludes with key exercises and questions related to Amazon CloudFront and the AWS Certified Advanced Networking – Specialty Exam.

Content Delivery Network Overview

A Content Delivery Network (CDN) is a globally-distributed network of caching servers that accelerate the downloading of web pages, images, videos, and other content. CDNs use Domain Name System (DNS) geolocation to determine the geographic location of each request for a web page or other content. They then serve that content from caching servers closest to that location—whether “closest” is measured in distance or time (latency)—instead of the original web server. A CDN allows you to increase the scalability and decrease the latency of a website or mobile application easily in response to traffic spikes. In most cases, using a CDN is completely transparent—end users simply experience better website performance, while the load on your original website is reduced.

CDNs were primarily invented to circumvent a constant that has yet to be overcome in the networking world: the speed of light. In a vacuum, the speed of light is roughly 300,000 kilometers per second; in fiber-optic cables, it can be up to 30 percent slower. When such fiber-optic cables and their associated optical repeaters traverse the vast expanse of the Pacific Ocean, for example, responses from web servers back to clients can take upwards of hundreds of milliseconds. In the networking world, this results in reduced throughput and poor performance for customers. By using a CDN, you can overcome the limitations of serving content over large distances by caching or pre-positioning data at predefined locations. You can also isolate the load on your centralized web servers by having each edge location where your content is cached serve the content for you, therefore increasing your scale immensely on an edge location basis.
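To make the speed-of-light constraint concrete, the following minimal sketch estimates the best-case round-trip time over fiber using the figures quoted above. The 10,500-kilometer cable length is an assumed value chosen for illustration, and the calculation ignores repeaters, routing, and server processing time.

```python
# Back-of-the-envelope propagation delay, using the figures from the text.
SPEED_OF_LIGHT_KM_S = 300_000   # approximate speed of light in a vacuum
FIBER_SLOWDOWN = 0.30           # "up to 30 percent slower" in fiber


def round_trip_ms(distance_km: float, slowdown: float = FIBER_SLOWDOWN) -> float:
    """Return the best-case round-trip time in milliseconds over fiber."""
    speed_in_fiber = SPEED_OF_LIGHT_KM_S * (1 - slowdown)
    one_way_seconds = distance_km / speed_in_fiber
    return 2 * one_way_seconds * 1000


# Hypothetical 10,500 km transpacific cable run (an assumed distance).
print(f"{round_trip_ms(10_500):.0f} ms")  # prints "100 ms"
```

Real-world round trips are longer than this lower bound, because every TCP or TLS handshake and every request adds further round trips on top of it.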

The AWS CDN: Amazon CloudFront

Amazon CloudFront is the AWS CDN. It can be used to deliver your web content using Amazon’s global network of edge locations. When a user requests content that is served with Amazon CloudFront, the user is routed to the edge location that provides the lowest latency (time delay), so content is delivered with the best possible performance. If the content is already in the edge location with the lowest latency, Amazon CloudFront delivers it immediately. If the content is not currently in that edge location, Amazon CloudFront retrieves it from the origin server, such as an Amazon Simple Storage Service (Amazon S3) bucket or a web server. The origin server stores the original, definitive versions of your content.

Amazon CloudFront is optimized to work with other AWS Cloud services that serve as the origin server, including Amazon S3 buckets, Amazon S3 static websites, Amazon Elastic Compute Cloud (Amazon EC2) instances, and Elastic Load Balancing load balancers. Amazon CloudFront also works seamlessly with non-AWS origin servers, such as an existing on-premises web server. Amazon CloudFront also integrates with Amazon Route 53.

Amazon CloudFront supports all content that can be served over HTTP or HTTPS. This includes any popular static files that are a part of your web application, such as HTML files, images, JavaScript, and CSS files, and also audio, video, media files, or software downloads. Amazon CloudFront also supports serving dynamically generated web pages, so it can be used to deliver your entire website. Lastly, Amazon CloudFront supports media streaming, using both HTTP and Real-Time Messaging Protocol (RTMP).

Amazon CloudFront Basics

There are three core concepts that you need to understand in order to start using Amazon CloudFront: distributions, origins, and cache control. With these concepts, you can use Amazon CloudFront to speed up delivery of content from your websites.

Distributions

To use Amazon CloudFront, you start by creating a distribution, which is identified by a DNS domain name such as d111111abcdef8.cloudfront.net. To serve files from Amazon CloudFront, you simply use the distribution domain name in place of your website’s domain name; the rest of the file paths stay unchanged. You can use the Amazon CloudFront distribution domain name as-is, or more typically you create a user-friendly DNS name in your own domain by creating a Canonical Name Record (CNAME) in Amazon Route 53 or another DNS service that refers to the distribution’s domain name. Clients that use the CNAME are automatically directed to your Amazon CloudFront distribution domain name when DNS resolves the name. If you use Amazon Route 53 as your DNS service, you can also use a feature called alias records to point a zone root address such as “example.com” (which cannot be a CNAME) to your Amazon CloudFront distribution.
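As a small illustration of using the distribution domain name in place of your website's domain name, the sketch below swaps only the host part of a URL and leaves the path untouched. The helper name to_distribution_url is hypothetical, and the input URL is an example value.

```python
from urllib.parse import urlsplit, urlunsplit


def to_distribution_url(url: str, distribution_domain: str) -> str:
    """Swap the host of `url` for the distribution domain, keeping the rest."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=distribution_domain))


print(to_distribution_url(
    "http://www.example.com/images/logo.jpg",
    "d111111abcdef8.cloudfront.net",
))
# prints http://d111111abcdef8.cloudfront.net/images/logo.jpg
```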

Origins

When you create a distribution, you must specify the DNS domain name of the origin (the Amazon S3 bucket or HTTP server) from which you want Amazon CloudFront to retrieve the definitive version of your objects (web files).

Cache Control

Once requested and served from an edge location, objects stay in the cache until they expire or are evicted to make room for more frequently requested content. By default, objects expire from the cache after 24 hours. After an object expires, the next request results in Amazon CloudFront forwarding the request to the origin to verify that the object is unchanged or to fetch a new version if it has changed.

Optionally, you can control how long objects stay in an Amazon CloudFront cache before expiring. To do this, you can choose to use Cache-Control headers set by your origin server, or you can set the minimum, maximum, and default Time to Live (TTL) for objects in your Amazon CloudFront distribution.
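The interaction between an origin's Cache-Control header and the distribution's TTL settings can be sketched roughly as follows. This is a simplified model, not the service's actual implementation: it looks only at the max-age directive, ignores directives such as no-cache, no-store, and s-maxage as well as the Expires header, and the one-year maximum TTL used as a default here is an assumption.

```python
import re
from typing import Optional

DEFAULT_TTL = 86_400  # 24 hours, the default expiration mentioned in the text


def effective_ttl(cache_control: Optional[str], min_ttl: int = 0,
                  default_ttl: int = DEFAULT_TTL,
                  max_ttl: int = 31_536_000) -> int:
    """Return the number of seconds an object stays cached (simplified)."""
    if cache_control:
        match = re.search(r"max-age=(\d+)", cache_control)
        if match:
            # The origin's max-age is honored, clamped to [min_ttl, max_ttl].
            return max(min_ttl, min(int(match.group(1)), max_ttl))
    # No usable header from the origin: fall back to the default TTL.
    return max(min_ttl, min(default_ttl, max_ttl))


print(effective_ttl("public, max-age=300"))       # prints 300
print(effective_ttl(None))                        # prints 86400
print(effective_ttl("max-age=10", min_ttl=60))    # prints 60
```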

You can also remove copies of an object from all Amazon CloudFront edge locations at any time by calling the invalidation Application Programming Interface (API) or through the Amazon CloudFront console. This feature removes the object from every Amazon CloudFront edge location regardless of the expiration period you set for that object on your origin server. The invalidation feature is designed to be used in unexpected circumstances, such as to correct an error or to make an unanticipated update to a website—not as part of your everyday workflow.

Instead of invalidating objects manually or programmatically, it is a best practice to use a version identifier as part of the object (file) path name. For example, note the following:

  • Old file: assets/v1/css/narrow.css
  • New file: assets/v2/css/narrow.css

When using versioning, users will see the latest content through Amazon CloudFront when you update your site without using invalidation. Old versions will expire from the cache automatically. That said, depending on other settings, you may need to invalidate the base page that includes references to the versioned objects.
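One possible way to generate versioned paths like the ones shown above is a small helper in your build or deployment tooling. The function name and the assumption that all assets live under a single root folder are illustrative only.

```python
def versioned_path(path: str, version: int, root: str = "assets") -> str:
    """Insert a version folder directly under the asset root."""
    prefix = root + "/"
    if not path.startswith(prefix):
        raise ValueError(f"{path!r} is not under {root!r}")
    return f"{root}/v{version}/{path[len(prefix):]}"


print(versioned_path("assets/css/narrow.css", 1))  # prints assets/v1/css/narrow.css
print(versioned_path("assets/css/narrow.css", 2))  # prints assets/v2/css/narrow.css
```

Because the new version has a different path, edge locations treat it as a new object and fetch it from the origin on first request, while the old version simply ages out of the cache.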

How Amazon CloudFront Delivers Content

After some initial setup, Amazon CloudFront works transparently to speed up delivery of your content. This overview provides you with the steps required to set up Amazon CloudFront to serve your content, as well as the process that happens behind the scenes when serving content to your users.

Configuring Amazon CloudFront

The following steps walk you through the process required to configure Amazon CloudFront:

  1. Configure Your Origin Servers. CloudFront uses your origin server to retrieve your files for distribution from Amazon CloudFront edge locations.

    An origin server stores the original, definitive version of your objects. If you are serving content over HTTP, your origin server is either an Amazon S3 bucket or an HTTP server, such as a web server. Your HTTP server can run on an Amazon EC2 instance or on a server that you manage; these servers are also known as custom origins.

    If you distribute media files on demand using the Adobe RTMP protocol, your origin server is always an Amazon S3 bucket.

  2. Place Your Content on Your Origin Servers. Your files, known as objects, typically include web pages, images, and media files, but they can be anything that is served over HTTP or a supported version of Adobe RTMP, the protocol used by Adobe Flash Media Server. Dynamically generated content, such as HTML generated from a database in response to an HTTP GET operation, is also fully supported.
  3. Create Your Amazon CloudFront Distribution. The Amazon CloudFront distribution will tell CloudFront which origin servers to get your content from when users request the content through your website or application. You can also specify details such as whether you want Amazon CloudFront to log all requests, and whether you want the distribution to be enabled as soon as it is created.
  4. Amazon CloudFront Assigns a Domain Name. After creating your distribution, Amazon CloudFront will automatically assign a domain name that will be used to reference your distribution.
  5. Amazon CloudFront Configures Its Edge Locations. After assigning a domain name, Amazon CloudFront will automatically send your distribution’s configuration (but not your content) to all of its edge locations.

As you build your website or application, you can use the domain name that Amazon CloudFront provides for your URLs when referencing objects. For example, if Amazon CloudFront returns the domain d111111abcdef8.cloudfront.net for your distribution, the URL for logo.jpg in your Amazon S3 bucket or the root directory of your web server would be as follows: http://d111111abcdef8.cloudfront.net/logo.jpg. A more typical practice, however, is to use relative paths that do not specify the host part of the URL at all, unless another host name is actually required. This provides more flexibility in terms of site construction and the use of CNAMEs, load balancers, and CloudFront distributions. For example, an image file would be referenced as “/images/website-logo.png”, or to take the previous example, “logo.jpg”. This allows the reference to work properly whether the web page is accessed directly from the server by its DNS name or IP address, via the CloudFront distribution’s DNS name, or via a CNAME such as www.example.com that you provide that points to the CloudFront distribution’s DNS name.
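The benefit of relative references can be seen by resolving the same reference against pages served from different hosts. The sketch below uses the standard library's URL resolution; the page URLs and the IP address are example values, and the helper name resolve is hypothetical.

```python
from urllib.parse import urljoin


def resolve(page_url: str, reference: str) -> str:
    """Resolve a reference found in a page the way a browser would."""
    return urljoin(page_url, reference)


pages = [
    "http://203.0.113.25/index.html",                 # direct access by IP address
    "http://d111111abcdef8.cloudfront.net/index.html",  # distribution domain name
    "http://www.example.com/index.html",              # CNAME for the distribution
]
for page in pages:
    # The same reference works unchanged on every host.
    print(resolve(page, "/images/website-logo.png"))
```

Each printed URL keeps the host of the page that contained the reference, so the markup does not need to change when you place a distribution or a CNAME in front of the server.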

Optionally, you can configure your origin server to add headers to the files, with a header indicating how long you want the files to stay in the cache in the Amazon CloudFront edge location. By default, each object stays in an edge location for 24 hours before it expires. The minimum expiration time is 0 seconds, with no maximum expiration time limit.

Figure 7.1 shows an overview of the steps required to configure your Amazon CloudFront distribution.

Diagram shows developer linked to Amazon S3 bucket or HTTP server, and Amazon CloudFront by activities like configuring origin server, uploading objects, creating distribution, receiving domain, and configuring edge locations.

FIGURE 7.1 Configuring your Amazon CloudFront distribution

How CloudFront Operates

The following steps outline what happens when users request objects after you’ve configured Amazon CloudFront to deliver your content.

  1. Users access your website or application and request one or more objects, such as an image file and an HTML file.
  2. DNS routes the request to the Amazon CloudFront edge location that can best serve the user’s request, typically the nearest Amazon CloudFront edge location in terms of network latency.
  3. In the edge location, Amazon CloudFront will check its cache for the requested files, returning them to the user if they are found in the cache. If the files are not found in the cache, Amazon CloudFront will perform the following actions:
    1. Amazon CloudFront will compare the request with your distribution configuration and forward the request for the files to the applicable origin server for the corresponding file type (for example, to your Amazon S3 bucket for image files and to your HTTP server for the HTML files).
    2. The origin servers send the files back to the Amazon CloudFront edge location.
    3. As soon as the first byte arrives from the origin, Amazon CloudFront begins to forward the files to the user. Amazon CloudFront also adds the files to the cache in the edge location for the next time someone requests those files.
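The cache check in step 3 can be modeled with a toy class such as the one below. It is a deliberately simplified sketch and not a description of CloudFront's internals: it represents a single edge location, takes the current time as a parameter so that the behavior is easy to follow, and refetches expired objects outright rather than revalidating them with the origin.

```python
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of a single edge location (not CloudFront's real internals)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float = 86_400) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._objects: Dict[str, Tuple[bytes, float]] = {}  # path -> (body, expiry)

    def get(self, path: str, now: float) -> Tuple[bytes, str]:
        cached = self._objects.get(path)
        if cached is not None and now < cached[1]:
            return cached[0], "Hit"                      # served from the edge
        body = self._fetch(path)                         # forward to the origin
        self._objects[path] = (body, now + self._ttl)    # keep for next time
        return body, "Miss"


origin_requests = []


def origin(path: str) -> bytes:
    """Stand-in origin server that records every request it receives."""
    origin_requests.append(path)
    return f"contents of {path}".encode()


edge = EdgeCache(origin)
print(edge.get("/logo.jpg", now=0)[1])       # prints Miss
print(edge.get("/logo.jpg", now=60)[1])      # prints Hit
print(edge.get("/logo.jpg", now=90_000)[1])  # prints Miss (the 24-hour TTL has passed)
```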

The process for CloudFront content delivery is shown in Figure 7.2.


FIGURE 7.2 Amazon CloudFront content delivery

Amazon CloudFront Edge Locations

Amazon CloudFront edge locations are the regional points of presence that are used to cache objects and store these closer to your application or website’s end users. As of the time of this writing, Amazon CloudFront has a global network of 100 edge locations in 50 cities across 23 countries. These edge locations include 89 Points of Presence and 11 Regional Edge Caches.

Amazon CloudFront Regional Edge Caches

Regional Edge Caches are CloudFront locations that are deployed globally in AWS regions, at closer proximity to your users. These locations sit between your origin server and the global edge locations that serve traffic directly to your users. As the popularity of your objects declines, individual edge locations may evict those objects to make room for more popular content. Regional Edge Caches have a much larger cache size than their global edge location counterparts, which allows objects to remain in cache longer.

When a user makes a request to your website or application, DNS routes the request to the Amazon CloudFront edge location that can best serve the user’s request. This location is typically the nearest Amazon CloudFront edge location in terms of latency. In the edge location, Amazon CloudFront checks its cache for the requested files. If the files are in the cache, Amazon CloudFront returns them to the user. If the files are not in the cache, the edge servers go to the nearest Regional Edge Cache to fetch the object. In the Regional Edge Cache location, Amazon CloudFront again checks its cache for the requested files. If the files are in the cache, Amazon CloudFront forwards the files to the requested edge location.

As soon as the first byte arrives from a Regional Edge Cache location, Amazon CloudFront will begin to forward the files to the user. Amazon CloudFront also adds the files to the cache in the requested edge location for the next time someone requests those files.
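The lookup order described above (edge location first, then Regional Edge Cache, then origin) can be sketched with plain dictionaries standing in for each tier. This is an illustrative model under that assumption only; it ignores expiration and eviction.

```python
from typing import Dict, Tuple


def fetch(path: str, edge: Dict[str, bytes], regional: Dict[str, bytes],
          origin: Dict[str, bytes]) -> Tuple[bytes, str]:
    """Return the object and the tier that supplied it."""
    if path in edge:
        return edge[path], "edge"
    if path in regional:
        body, source = regional[path], "regional"
    else:
        body, source = origin[path], "origin"
        regional[path] = body    # the Regional Edge Cache keeps a copy
    edge[path] = body            # the edge location keeps a copy as well
    return body, source


origin = {"/video.mp4": b"..."}
regional: Dict[str, bytes] = {}
edge_a: Dict[str, bytes] = {}
edge_b: Dict[str, bytes] = {}

print(fetch("/video.mp4", edge_a, regional, origin)[1])  # prints origin
print(fetch("/video.mp4", edge_b, regional, origin)[1])  # prints regional
print(fetch("/video.mp4", edge_a, regional, origin)[1])  # prints edge
```

The second request shows the benefit: a different edge location that shares the same Regional Edge Cache avoids a trip to the origin.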

Amazon CloudFront Regional Edge Cache locations are suited for content that might not be popular enough to remain consistently within Amazon CloudFront edge locations but still might benefit from being located closer to the requestor of the content.

Some important points to consider for Amazon CloudFront Regional Edge Caches:

  • You do not need to make any changes to your Amazon CloudFront distributions that use Regional Edge Caches; they are enabled by default for all Amazon CloudFront distributions.
  • There is no additional cost for using Amazon CloudFront Regional Edge Caches.
  • Regional Edge Caches have feature parity with edge locations. For example, a cache invalidation request removes an object from both edge locations and Regional Edge Caches before it expires. The next time a viewer requests the object, Amazon CloudFront returns to the origin to fetch the latest version of the object.
  • Regional Edge Caches work with custom origins. Amazon S3 origins are, however, accessed directly from the edge locations.
  • Proxy methods PUT/POST/PATCH/OPTIONS/DELETE flow directly to the origin from the edge locations and do not proxy through the Regional Edge Caches.
  • Dynamic content, as determined at request time (cache behavior configured to forward all headers), does not flow through the Regional Edge Caches but goes directly to the origin.
  • You can measure the performance improvements from this feature by using cache-hit ratio metrics available from the Amazon CloudFront console.
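The routing rules in the list above can be condensed into a small decision function. This sketch encodes only the rules as stated in this chapter; the function name and the "s3" and "custom" labels are illustrative.

```python
DIRECT_METHODS = {"PUT", "POST", "PATCH", "OPTIONS", "DELETE"}


def uses_regional_edge_cache(method: str, origin_type: str,
                             forwards_all_headers: bool) -> bool:
    """Return True if a cache miss would go through a Regional Edge Cache."""
    if method.upper() in DIRECT_METHODS:
        return False   # proxy methods flow directly to the origin
    if origin_type == "s3":
        return False   # Amazon S3 origins are accessed directly from the edge
    if forwards_all_headers:
        return False   # dynamic content goes directly to the origin
    return True


print(uses_regional_edge_cache("GET", "custom", False))   # prints True
print(uses_regional_edge_cache("POST", "custom", False))  # prints False
print(uses_regional_edge_cache("GET", "s3", False))       # prints False
```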

Web Distributions

When you want to use Amazon CloudFront to distribute your content, you create a distribution and specify configuration settings such as your origin and whether you want your files to be available to everyone or have restricted access.

You can also configure Amazon CloudFront to require users to use HTTPS to access your content, forward cookies and/or query strings to your origin, prevent users from particular countries from accessing your content, and create access logs.

You can use web distributions to serve the following content over HTTP or HTTPS:

  • Static and dynamic content. For example, HTML, CSS, JS, and image files using HTTP or HTTPS.
  • Multimedia content on demand using progressive download and Apple HTTP Live Streaming (HLS). You cannot serve Adobe Flash multimedia content over HTTP or HTTPS, but you can serve it using an Amazon CloudFront RTMP distribution.
  • A live event, such as a meeting, conference, or concert, in real time. For live streaming, you can create the distribution automatically by using an AWS CloudFormation stack.

Dynamic Content and Advanced Features

Amazon CloudFront can do much more than simply serve static web files. To start using the service’s advanced features, you will need to understand how to use cache behaviors and how to restrict access to sensitive content.

Dynamic Content, Multiple Origins, and Cache Behaviors

Serving static assets, as described previously, is a common way to use a CDN. An Amazon CloudFront distribution, however, can easily be set up to also serve dynamic content and to use more than one origin server. You can control which requests are served by which origin and how requests are cached using a feature called cache behaviors.

A cache behavior lets you configure a variety of Amazon CloudFront functionalities for a given URL path pattern for files on your website, as shown in Figure 7.3. One cache behavior applies to all PHP files in a web server (dynamic content) using the path pattern *.php, while another behavior applies to all JPEG images in another origin server (static content) using the path pattern *.jpg.

Diagram shows configuring Amazon CloudFront functionalities for URL path example.com for files on a website. Dynamic content of web server includes elastic load balancing and Amazon EC2 and static content includes Amazon S3.

FIGURE 7.3 Amazon CloudFront content delivery

The functionality that you can configure for each cache behavior includes the following:

  • The path pattern
  • Which origin to forward your requests to
  • Whether to forward query strings to your origin
  • Whether accessing the specified files requires signed URLs
  • Whether to require HTTPS access
  • The amount of time that those files stay in the Amazon CloudFront cache (regardless of the value of any cache control headers that your origin adds to the files)

Cache behaviors are applied in order; if a request does not match the first path pattern, it drops down to the next path pattern. Normally, the last path pattern specified is * to match all files.
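First-match evaluation of path patterns can be sketched with the standard library's wildcard matcher. The origin labels are made up for illustration, and fnmatchcase is only an approximation of CloudFront's pattern syntax (for example, it also treats square brackets as special characters).

```python
from fnmatch import fnmatchcase
from typing import List, Tuple

# Ordered list of (path pattern, origin); the catch-all pattern comes last.
BEHAVIORS: List[Tuple[str, str]] = [
    ("*.php", "web-server"),   # dynamic content
    ("*.jpg", "s3-bucket"),    # static content
    ("*", "web-server"),       # default behavior
]


def select_origin(path: str, behaviors: List[Tuple[str, str]] = BEHAVIORS) -> str:
    """Return the origin of the first behavior whose pattern matches."""
    for pattern, origin in behaviors:
        if fnmatchcase(path, pattern):
            return origin
    raise LookupError(f"no cache behavior matches {path!r}")


print(select_origin("/cart/checkout.php"))  # prints web-server
print(select_origin("/images/cat.jpg"))     # prints s3-bucket
print(select_origin("/about.html"))         # prints web-server
```

Order matters: if the * pattern were listed first, it would match every request and the more specific behaviors would never apply.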

A Note on Performance: Dynamic Content and HTTP/2

It is very useful that Amazon CloudFront can seamlessly deal with all content, including dynamically-generated content that is not cacheable alongside a wide array of content that can be cached (see previous and following sections). But you may assume that there is no performance benefit in that case. After all, if the Amazon CloudFront edge location needs to reach back to the origin each time it receives a request for a particular URL representing dynamic content, how can it speed up content delivery? You may then assume that if an item is not in the Amazon CloudFront cache, the use of Amazon CloudFront won’t speed up access to that content the first time it is requested.

As it turns out, even dynamic or initially uncached content will often be delivered with lower latency to end users. The reason has to do with the time it takes to set up the TCP or TLS connections that underlie the content caching and delivery mechanisms. Each such connection takes a finite amount of time to establish, and if a connection from CloudFront to the origin can be reused, significant latency gains are possible.

For example, let’s assume that the round-trip latency between an end user and the Amazon CloudFront edge location is 30 milliseconds, and the round-trip latency between the edge location and the origin is 100 milliseconds. (For context, as of this writing, even over the high-performance AWS backbone, the round-trip latency from the Singapore region to the Northern Virginia region was about 240 milliseconds.) In all cases, before any content can be delivered for the very first time, TCP connection establishment among the three hosts (which requires one full round trip for the SYN/ACK packets from client to edge, and another from edge to origin) will take at least 130 milliseconds (ignoring local overhead, which is much higher for TLS connections).

Now, let’s assume a new client connects to the edge location and begins requesting content from the same origin, whether dynamic content, or content not yet in the edge cache. The Amazon CloudFront edge server will often be able to re-use an existing connection to the origin server, and avoid the connection setup overhead. This can reduce the first-byte delivery time by 100 milliseconds or more. That may not seem like a lot, but even 1/10 of a second per TCP connection can add up quickly. Avoiding the overhead of establishing an encrypted TLS session each time will decrease latency even more. So using Amazon CloudFront is a performance win even in cases where content caching is not playing a role. Your users will be happy to receive the best possible performance in all of these scenarios.
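The arithmetic behind this example can be written out as a short calculation. The model is an assumption made for illustration: each TCP handshake costs one round trip on its hop, the request and response cost one more round trip on each hop, and TLS negotiation and server processing time are ignored.

```python
def first_byte_ms(user_edge_rtt: float, edge_origin_rtt: float,
                  origin_connection_reused: bool) -> float:
    """Estimate time to first byte for content that must come from the origin."""
    # Connection setup: the client always connects to the edge; the edge only
    # needs a new connection to the origin if it has none to reuse.
    handshakes = user_edge_rtt
    if not origin_connection_reused:
        handshakes += edge_origin_rtt
    # The request travels client -> edge -> origin and the response returns.
    request_response = user_edge_rtt + edge_origin_rtt
    return handshakes + request_response


cold = first_byte_ms(30, 100, origin_connection_reused=False)
warm = first_byte_ms(30, 100, origin_connection_reused=True)
print(cold, warm, cold - warm)  # prints 260 160 100
```

Under these assumptions, reusing the connection to the origin saves exactly one edge-to-origin round trip, which matches the 100-millisecond figure in the text.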

Amazon CloudFront also supports connections from clients via the HTTP/2 protocol. That new protocol, already supported by most modern browsers, provides a significant number of enhancements that improve performance by connection re-use, multiplexing, server push, etc. Even if your origin server does not support HTTP/2 yet, those enhanced features in use between your end-users and the Amazon CloudFront edge servers can significantly improve performance even when Amazon CloudFront is accessing your origin server using HTTP/1.x. Not only can you use Amazon CloudFront to optimize origin access via connection re-use, but content in the edge cache will be delivered faster than it could be from your origin servers, even ignoring latency differences between the edge and the origin.

Whole Website

Using cache behaviors and multiple origins, you can easily use Amazon CloudFront to serve your whole website and to support different behaviors for different client devices.

Private Content

In many cases, you may want to restrict access to content in Amazon CloudFront only to selected requestors, such as paid subscribers or to applications or users in your company network. Amazon CloudFront provides several mechanisms to allow you to serve private content:

Signed URLs: Use URLs that are valid only between certain times and optionally from certain IP addresses.

Signed cookies: Require authentication via public and private key pairs.

Origin Access Identities (OAI): Restrict access to an Amazon S3 bucket only to a special Amazon CloudFront user associated with your distribution. This is the easiest way to ensure that content in a bucket is accessed only by Amazon CloudFront.
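To give a sense of what a signed URL restricts, the sketch below builds a policy document that limits access by expiration time and, optionally, by source IP range, and then encodes it using the URL-safe character substitutions that CloudFront signed URLs use. It deliberately stops short of the signing step, which requires an RSA private key and a cryptography library; the function names, the expiration timestamp, and the IP range are illustrative assumptions.

```python
import base64
import json
from typing import Optional


def build_policy(resource_url: str, expires_epoch: int,
                 source_ip_cidr: Optional[str] = None) -> str:
    """Build a policy statement as compact JSON (no whitespace)."""
    condition = {"DateLessThan": {"AWS:EpochTime": expires_epoch}}
    if source_ip_cidr is not None:
        condition["IpAddress"] = {"AWS:SourceIp": source_ip_cidr}
    policy = {"Statement": [{"Resource": resource_url, "Condition": condition}]}
    return json.dumps(policy, separators=(",", ":"))


def cloudfront_base64(data: bytes) -> str:
    """Base64-encode and replace characters that are not safe in a URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return encoded.translate(str.maketrans("+=/", "-_~"))


policy = build_policy(
    "http://d111111abcdef8.cloudfront.net/images/image.jpg",
    expires_epoch=1767225600,        # assumed expiration time (Unix seconds)
    source_ip_cidr="192.0.2.0/24",   # assumed client address range
)
print(policy)
print(cloudfront_base64(policy.encode("utf-8")))
```

In a complete implementation, the policy would also be signed with your private key, and the signature and key identifier would be appended to the URL as query string parameters.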

RTMP Distributions

RTMP distributions stream media files using Adobe Media Server and the Adobe RTMP. When using an RTMP distribution for Amazon CloudFront, you need to provide both your media files and a media player to your end users. Media player examples include JW Player, Flowplayer, and Adobe Flash.

End users will view your media files using the media player that you provide for them. They do not use the media player (if any) that is already installed on their computer or device. This is due in part to the fact that when the end user streams your media file, the media player begins to play the content of the file while the file is still being downloaded from Amazon CloudFront. The media file is not stored locally on the end user’s system.

To use Amazon CloudFront to serve media in this way, you need two types of distributions: a web distribution to serve the media player and an RTMP distribution for the media files. The web distribution will serve files over HTTP, while the RTMP distribution will stream media files over RTMP or a variant of RTMP.

Figure 7.4 shows that the media files and your media player are stored in different buckets in Amazon S3. You could also make the media player available to users in other ways, such as using Amazon CloudFront and a custom origin; however, the media files must use an Amazon S3 bucket at the origin.


FIGURE 7.4 Streaming distributions, web, and RTMP

Figure 7.4 also shows two separate buckets being used: one for your media files and the other for your media player. You can also store media files and your media player in the same Amazon S3 bucket (not shown in the figure).

In Figure 7.4, there are two distributions used for Amazon CloudFront streaming:

  1. Your media player bucket holds the media player, and it is the origin server for a regular HTTP distribution. In this example, the domain name for the distribution is d1234.cloudfront.net. The d in d1234.cloudfront.net indicates that this is a web distribution.
  2. Your streaming media bucket holds your media files, and it is the origin server for an RTMP distribution. In this example, the domain name for the distribution is s5678.cloudfront.net. The s in s5678.cloudfront.net indicates that this is an RTMP distribution.

There are other streaming options available with Amazon CloudFront.

Wowza Streaming Engine 4.2: You can use the Wowza Streaming Engine 4.2 to create live streaming sessions for global delivery using Amazon CloudFront. Wowza Streaming Engine 4.2 supports the following HTTP-based streaming protocols:

  • HLS
  • HTTP Dynamic Streaming (HDS)
  • Smooth Streaming
  • MPEG Dynamic Adaptive Streaming over HTTP (DASH)

For these protocols, Amazon CloudFront will break video into smaller chunks that are cached in the Amazon CloudFront network for improved performance and scalability.

Live HTTP streaming using Amazon CloudFront and any HTTP origin: Amazon CloudFront supports any live encoder, such as Elemental Live. The encoder must output HTTP-based streams to stream live performances, webinars, and other events.

On-demand video streaming using Amazon CloudFront and other media players: When streaming media files using Amazon CloudFront, you provide both the media files and media player that you want end users to utilize to play the media file.

Alternate Domain Names

In Amazon CloudFront, an alternate domain name lets you use your own domain name (for example, www.example.com) for links to your objects instead of using the domain name that CloudFront assigns to your distribution. Both web and RTMP distributions support alternate domain names.

When you create a distribution, Amazon CloudFront returns a domain name for the distribution, for example: d111111abcdef8.cloudfront.net.

When you use the Amazon CloudFront domain name for your objects, the URL for an object called /images/image.jpg would be: http://d111111abcdef8.cloudfront.net/images/image.jpg.

If you want to use your own domain name, such as www.example.com, instead of the cloudfront.net domain name that Amazon CloudFront assigned to your distribution, you can add an alternate domain name to your distribution for www.example.com. You can then use the following URL for /images/image.jpg: http://www.example.com/images/image.jpg.

When you add alternate domain names, you can use the wildcard * at the beginning of a domain name instead of specifying subdomains individually. For example, with an alternative domain name of *.example.com, you can use any domain name that ends with example.com in your object URLs, such as www.example.com, product-name.example.com, and marketing.product-name.example.com.
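The wildcard rule described above can be expressed as a short matching function. This is a sketch of the rule as stated in the text, with a hypothetical function name; it treats the comparison as case-insensitive, which is how DNS names behave.

```python
def matches_alternate_name(host: str, alternate_name: str) -> bool:
    """Check a request host against an alternate domain name."""
    host, alternate_name = host.lower(), alternate_name.lower()
    if alternate_name.startswith("*."):
        suffix = alternate_name[1:]   # for example ".example.com"
        # The host must end with the suffix and have something before it.
        return host.endswith(suffix) and len(host) > len(suffix)
    return host == alternate_name


for name in ("www.example.com",
             "marketing.product-name.example.com",
             "example.com"):
    print(name, matches_alternate_name(name, "*.example.com"))
# prints True, True, and False for the three names
```

The last case illustrates that the wildcard covers subdomains only; the bare domain would need its own alternate domain name entry.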

HTTPS

For web distributions, you can configure Amazon CloudFront to require that viewers use HTTPS to request your objects, and even automatically redirect users from an HTTP endpoint to the HTTPS endpoint for your distribution. This results in connections between users and Amazon CloudFront being encrypted. You also can configure Amazon CloudFront to use HTTPS to retrieve objects from your origin so that connections are encrypted when Amazon CloudFront communicates with your origin from edge locations and Regional Edge Caches.
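The viewer-side options can be summarized as a small decision function. The policy names below follow the values used by the CloudFront API for the viewer protocol policy setting; the function itself is a simplified illustration in which a status of 200 merely stands for "the request is allowed to proceed."

```python
from typing import Optional, Tuple


def viewer_response(scheme: str, host: str, path: str,
                    policy: str) -> Tuple[int, Optional[str]]:
    """Return (status code, redirect location) for an incoming viewer request."""
    if policy not in ("allow-all", "redirect-to-https", "https-only"):
        raise ValueError(f"unknown viewer protocol policy: {policy!r}")
    if scheme == "https" or policy == "allow-all":
        return 200, None                        # request proceeds normally
    if policy == "redirect-to-https":
        return 301, f"https://{host}{path}"     # moved permanently to HTTPS
    return 403, None                            # https-only: HTTP is refused


print(viewer_response("http", "www.example.com", "/images/image.jpg",
                      "redirect-to-https"))
# prints (301, 'https://www.example.com/images/image.jpg')
```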

Here is the process that is followed when Amazon CloudFront receives a request for an object, and you require HTTPS to communicate with both your users and your origin:

  1. A web client submits an HTTPS request to Amazon CloudFront. There is a Secure Sockets Layer (SSL)/Transport Layer Security (TLS) negotiation here between the user and Amazon CloudFront. The client submits the request in an encrypted format.
  2. If the object is in the Amazon CloudFront cache, Amazon CloudFront encrypts the response and returns it to the client. The client then decrypts it.
  3. If the object is not in the Amazon CloudFront cache, Amazon CloudFront performs SSL/TLS negotiation with your origin and, when the negotiation is complete, forwards the request to your origin in an encrypted format.
  4. The origin decrypts the request, encrypts the requested object, and returns the object to Amazon CloudFront.
  5. Amazon CloudFront decrypts the response, re-encrypts it, and forwards the object to the client. Amazon CloudFront also saves the object in its cache so that the object is available the next time it is requested.
  6. The client decrypts the response.

Amazon CloudFront and AWS Certificate Manager (ACM)

AWS Certificate Manager (ACM) is designed to simplify and automate many of the tasks that are traditionally associated with management of SSL/TLS certificates. ACM takes care of the complexity surrounding the provisioning, deployment, and renewal of digital certificates, with certificates being provided by Amazon’s certificate authority (CA), Amazon Trust Services.

You can provision SSL/TLS certificates and associate them with Amazon CloudFront distributions. First, you provision a certificate using ACM and then deploy it to your Amazon CloudFront distribution. ACM also has the ability to manage certificate renewals for you. ACM allows you to provision, deploy, and manage the certificate with no additional charges. There are, however, additional charges when using Amazon CloudFront and HTTPS.

To use an ACM Certificate with Amazon CloudFront, you must request or import the certificate in the US East (N. Virginia) Region. ACM certificates in this region that are associated with an Amazon CloudFront distribution are disseminated to all the geographic locations configured for that distribution.

Invalidating Objects (Web Distributions Only)

If you need to remove objects from Amazon CloudFront edge caches before they expire, you can invalidate them. There is no charge for the first 1,000 invalidation paths that you submit each month; you pay for each path over 1,000 in a month.

To invalidate objects, you can specify either the path for individual objects or a path that ends with the * wildcard, which might apply to one object or many objects. The following are examples of specific object and wildcard invalidations:

  • /images/image1.jpg
  • /images/image*
  • /images/*
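
The following sketch shows how the example paths above behave and how they might be packaged into a request. The helper names are hypothetical. The dictionary returned by build_invalidation_batch is intended to mirror the InvalidationBatch structure accepted by the CloudFront CreateInvalidation API, but treat the exact shape as an assumption to verify against the API reference before use.

```python
from typing import Dict, List


def matches_invalidation_path(pattern: str, object_path: str) -> bool:
    """Model of the matching rule: a trailing * matches any suffix,
    otherwise the path must match exactly."""
    if pattern.endswith("*"):
        return object_path.startswith(pattern[:-1])
    return object_path == pattern


def build_invalidation_batch(paths: List[str], caller_reference: str) -> Dict:
    """Package invalidation paths; the wildcard may only end a path."""
    for path in paths:
        if "*" in path[:-1]:
            raise ValueError(f"wildcard must be the last character: {path!r}")
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # A unique string per request, chosen by the caller.
        "CallerReference": caller_reference,
    }


batch = build_invalidation_batch(
    ["/images/image1.jpg", "/images/image*", "/images/*"],
    caller_reference="example-2024-01-01",
)
exact = matches_invalidation_path("/images/image1.jpg", "/images/image2.jpg")
prefix = matches_invalidation_path("/images/image*", "/images/image2.jpg")
folder = matches_invalidation_path("/images/*", "/images/logo.png")
```

Here the exact path does not match image2.jpg, while both wildcard paths match the objects they prefix.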

An alternative to invalidating objects is to use object versioning to serve a different version of the object that has a different fully-qualified name (name including path).
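
One way to apply object versioning is to embed a version number in the file name, so that each release of an object has a new fully-qualified name. The sketch below assumes a hypothetical `_vN` naming scheme; any scheme that changes the name or path would serve the same purpose.

```python
import posixpath


def versioned_path(path: str, version: int) -> str:
    """Insert a version marker before the file extension,
    for example /images/image1.jpg -> /images/image1_v2.jpg."""
    directory, filename = posixpath.split(path)
    stem, extension = posixpath.splitext(filename)
    return posixpath.join(directory, f"{stem}_v{version}{extension}")


new_image = versioned_path("/images/image1.jpg", 2)
new_logo = versioned_path("/logo.png", 3)
```

Because the new name has never been cached, viewers receive the new object as soon as your pages reference it, and the old version simply ages out of the cache.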

Access Logs

Amazon CloudFront can create log files that contain detailed information about every user request that Amazon CloudFront receives. Access logs are available for both web and RTMP distributions. When you enable logging for your distribution, you specify the Amazon S3 bucket in which you want Amazon CloudFront to store log files.

You can store the log files for multiple distributions in the same bucket. When you enable logging, you can specify an optional prefix for the file names so that you can keep track of which log files are associated with which distributions.
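
As a rough sketch of working with these files, the parser below assumes a log file that has already been downloaded and decompressed, begins with `#Version` and `#Fields` header lines, and stores one tab-separated request per line. The sample uses a shortened, made-up field list and a documentation-range IP address; real log files contain more fields.

```python
from typing import Dict, Iterable, Iterator, List


def parse_access_log(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per request, keyed by the names in the #Fields header."""
    fields: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("#Fields:"):
            # Field names in the header are separated by spaces.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines and other header lines
        # Field values in each record are separated by tabs.
        yield dict(zip(fields, line.split("\t")))


sample_log = [
    "#Version: 1.0\n",
    "#Fields: date time x-edge-location sc-bytes c-ip cs-method cs-uri-stem sc-status\n",
    "2019-12-04\t21:02:31\tLAX1\t392\t192.0.2.100\tGET\t/index.html\t200\n",
    "2019-12-04\t21:02:32\tLAX1\t512\t192.0.2.100\tGET\t/missing.html\t404\n",
]
records = list(parse_access_log(sample_log))
not_found = [r["cs-uri-stem"] for r in records if r["sc-status"] == "404"]
```

Reading the field names from the header, rather than hard-coding column positions, keeps the parser working if the set of logged fields differs from the sample.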

Amazon CloudFront and AWS Lambda@Edge

AWS Lambda@Edge is an extension of AWS Lambda, a compute service that lets you execute functions that customize the content that is delivered through Amazon CloudFront. You can author functions in one region and execute them in AWS Regions and edge locations globally, without provisioning or managing servers. Just as with AWS Lambda, Lambda@Edge scales automatically, from a few requests per day to thousands per second. Lambda@Edge processes requests at edge locations instead of an origin server, which can significantly reduce latency and improve the user experience.

When you associate an Amazon CloudFront distribution with a Lambda@Edge function, Amazon CloudFront intercepts requests and responses at edge locations. Lambda@Edge functions execute in response to Amazon CloudFront events in the region or edge location that is closest to your customer.

You can execute AWS Lambda functions when the following Amazon CloudFront events occur:

  • When Amazon CloudFront receives a request from a viewer (viewer request)
  • Before Amazon CloudFront forwards a request to the origin (origin request)
  • When Amazon CloudFront receives a response from the origin (origin response)
  • Before Amazon CloudFront returns the response to the viewer (viewer response)
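
The four trigger points above can be sketched as a single handler that branches on the event type. This sketch assumes a Python runtime and the event layout in which the event type is found under Records[0].cf.config.eventType and headers are lists of key/value dictionaries keyed by lowercase header name. The X-Handled-By header is a made-up header used only to make the effect visible.

```python
def lambda_handler(event, context):
    """Pass requests through unchanged; tag responses with the event type."""
    cf = event["Records"][0]["cf"]
    event_type = cf["config"]["eventType"]

    if event_type in ("viewer-request", "origin-request"):
        # Request events: returning the request lets processing continue.
        return cf["request"]

    # Response events (origin-response, viewer-response): modify and return.
    response = cf["response"]
    response["headers"]["x-handled-by"] = [
        {"key": "X-Handled-By", "value": event_type}
    ]
    return response


# Hand-built events for local experimentation.
request_event = {"Records": [{"cf": {
    "config": {"eventType": "viewer-request"},
    "request": {"uri": "/", "headers": {}},
}}]}
response_event = {"Records": [{"cf": {
    "config": {"eventType": "origin-response"},
    "request": {"uri": "/", "headers": {}},
    "response": {"status": "200", "headers": {}},
}}]}
passed_request = lambda_handler(request_event, None)
tagged_response = lambda_handler(response_event, None)
```

In practice, a function is usually associated with one specific event on a cache behavior, so the branching shown here is mainly a way to see all four cases side by side.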

The following are some example use cases for Lambda@Edge:

  • You can write AWS Lambda functions that inspect cookies and rewrite URLs so that users see different versions of a site for A/B testing.
  • You can use an AWS Lambda function to generate HTTP responses when Amazon CloudFront viewer request events or origin request events occur.
  • An AWS Lambda function can inspect headers or authorization tokens and insert the applicable header to control access to your content before Amazon CloudFront forwards a request to the origin.
  • An AWS Lambda function can add, drop, and modify headers and can rewrite URL paths so that Amazon CloudFront returns different objects.
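
The A/B testing use case might look like the following viewer request handler, again assuming a Python runtime and the same event layout. The cookie name `experiment`, its value `b`, and the `/b/` path prefix are hypothetical choices made for this sketch.

```python
def ab_test_handler(event, context):
    """Send viewers whose cookie selects group b to an alternate page."""
    request = event["Records"][0]["cf"]["request"]

    group = "a"
    # Each cookie header value may hold several "name=value" pairs.
    for header in request["headers"].get("cookie", []):
        for pair in header["value"].split(";"):
            name, _, value = pair.strip().partition("=")
            if name == "experiment" and value == "b":
                group = "b"

    if group == "b" and request["uri"] == "/index.html":
        # Rewrite the URI so that a different object is returned.
        request["uri"] = "/b/index.html"
    return request


def make_event(cookie_value=None):
    """Build a minimal viewer request event for local testing."""
    headers = {}
    if cookie_value is not None:
        headers["cookie"] = [{"key": "Cookie", "value": cookie_value}]
    return {"Records": [{"cf": {
        "config": {"eventType": "viewer-request"},
        "request": {"uri": "/index.html", "headers": headers},
    }}]}


group_a_uri = ab_test_handler(make_event(), None)["uri"]
group_b_uri = ab_test_handler(make_event("session=1; experiment=b"), None)["uri"]
```

The viewer still sees /index.html in the browser; only the object that Amazon CloudFront looks up changes.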

Amazon CloudFront Field-Level Encryption

With Amazon CloudFront Field-Level Encryption, you can encrypt sensitive pieces of content at the edge before requests are forwarded to your origin servers. The data is encrypted using a public key that you supply. That data can then be decrypted inside your application using the associated private key. In an era of agile DevOps teams building large applications from a range of APIs and loosely coupled microservices, isolating sensitive data when it first enters the application, and decrypting it at only one or a few key points in its lifecycle, can significantly improve application security while enabling greater agility in secure application development.

You configure Amazon CloudFront field-level encryption through a series of steps: uploading the public key, creating encryption profiles, setting up a configuration that uses those profiles, and then linking that configuration to a cache behavior. You can specify up to 10 fields in an HTTP POST request to be encrypted, and you can apply different profiles to each request based on a query string within the request URL.

When everything is properly configured, sensitive data fields coming from end users are encrypted automatically at the edge, and the request body, including both encrypted and unencrypted data, can then flow to and throughout your application. Only at the point where the application (most likely a particular microservice carefully designed and managed to deal with the sensitive data) needs to read the original data is the data decrypted and used. Meanwhile, if configured correctly, all other parts of the application, as well as general logging, monitoring, and performance tracing facilities, never inadvertently examine, record, or expose the sensitive data elements that arrived from the user.

Summary

In this chapter, you learned about Amazon CloudFront, a global CDN service that integrates with other AWS products to give developers and organizations an easy way to distribute content to end users with low latency, high data transfer speeds, and no minimum usage commitments.

You learned about the different capabilities and features of Amazon CloudFront, including edge locations, Regional Edge Caches, web and RTMP distributions, origin servers, dynamic content delivery, access logs, Lambda@Edge, and field-level encryption.

CDNs are one of the main ways to provide consistent performance to users who are geographically dispersed across the globe. They can also reduce load on your origin server and provide increased web application scalability, performance, and security.

Exam Essentials

Know the basic use cases for Amazon CloudFront. Know when to use Amazon CloudFront, such as for popular static and dynamic content with geographically-distributed users.

Know how Amazon CloudFront works. Amazon CloudFront optimizes downloads by using geolocation to identify the geographical location of users and then serving and caching content at the edge location closest to each user.

Know how to create an Amazon CloudFront distribution and what types of origins are supported. To create a distribution, you specify an origin and the type of distribution, and then Amazon CloudFront creates a new domain name for the distribution. Origins supported include Amazon S3 buckets or static Amazon S3 websites and HTTP servers located on Amazon EC2 or in your own data center.

Know how to use Amazon CloudFront for dynamic content and multiple origins. Understand how to specify multiple origins for different types of content and how to use cache behaviors and path strings to control what content is served by which origin.

Know what mechanisms are available to serve private content through Amazon CloudFront. Amazon CloudFront can serve private content using Amazon S3 OAIs, signed URLs, and signed cookies.

Know how access logs for Amazon CloudFront work. Amazon CloudFront can create log files that contain detailed information about every user request that Amazon CloudFront receives.

Know how and why you would invalidate objects from Amazon CloudFront. If you need to remove objects from an Amazon CloudFront edge location cache before the object expires, you can invalidate the object from the Amazon CloudFront edge location caches.

Know how Lambda@Edge works and the use cases where it would be useful. Lambda@Edge is an extension of AWS Lambda, a compute service that lets you execute functions that customize the content that is delivered through Amazon CloudFront. You can execute AWS Lambda functions when Amazon CloudFront events occur.

Know why and how you would use ACM. ACM is designed to simplify and automate many of the tasks that are traditionally associated with management of SSL/TLS certificates. To use an ACM certificate with Amazon CloudFront, you must request or import the certificate in the US East (N. Virginia) Region.

Know why and how you would use HTTPS with Amazon CloudFront. For web distributions, you can configure Amazon CloudFront to require that viewers use HTTPS to request your objects.

Resources to Review

https://www.youtube.com/watch?v=wRaPw1tx6LA

Exercises

The best way to become familiar with Amazon CloudFront is to build your own Amazon CloudFront distribution, which is what you will be doing in this section.

For assistance completing these exercises, refer to the Amazon CloudFront user guide located at: https://aws.amazon.com/documentation/cloudfront/.

Review Questions

  1. What is a Content Delivery Network (CDN)?

    1. A managed Domain Name System (DNS) service
    2. A type of load balancer
    3. A distributed network of caches
    4. A protocol for the distribution of traffic over the web
  2. You are using Amazon CloudFront for your website. A user requests content, and the request is routed to a local edge location. What happens if the requested content is not yet available at that edge location?

    1. Amazon CloudFront will respond with an HTTP 404 error.
    2. Amazon CloudFront will not send users to edge locations that do not contain the requested data.
    3. Amazon CloudFront always pre-positions content in edge locations so that users never experience a cache miss.
    4. The edge location sends a request to the origin server, serves the user the content, and then stores the content.
  3. Amazon CloudFront can work with which of the following origin servers? (Choose three.)

    1. Amazon Simple Storage Service (Amazon S3)
    2. Elastic Load Balancing
    3. On-premises servers
    4. An Amazon Elastic Compute Cloud (Amazon EC2) Auto Scaling group
    5. A Virtual Private Cloud (VPC) route table
  4. What is the default expiry time for an Amazon CloudFront cache?

    1. 300 seconds
    2. 24 hours
    3. 12 months
    4. Objects never expire by default.
  5. What does the Amazon CloudFront invalidation feature do?

    1. Blocks users from flooding edge locations with requests.
    2. Removes duplicate objects from the origin server.
    3. Allows the override of origin server encryption.
    4. Removes objects from the CloudFront cache.
  6. What does an Amazon CloudFront cache behavior do?

    1. Controls how requests are cached.
    2. Applies rules to control selection of origins.
    3. Enforces HTTPS encryption for all users.
    4. Allows dynamic content caching.
  7. What does Amazon CloudFront do when it uses HTTP Live Streaming (HLS), HTTP Dynamic Streaming (HDS), Smooth Streaming, and MPEG DASH formats for streaming video?

    1. Uses the native Amazon CloudFront media player for improved performance.
    2. Uses multiple edge locations for improved performance.
    3. Sends parallel streams for improved performance.
    4. Encapsulates video into pull (rather than push) formats that allow clients to adapt to changing conditions for improved performance.
  8. When adding an alternate domain to your Amazon CloudFront distribution, the wildcard * can be used to do what?

    1. Replace part of a subdomain name (for example, subdomain.*.example.com).
    2. Replace part of a subdomain name (for example, *domain.example.com).
    3. Act in the place of specifying subdomains individually.
    4. Reference multiple files on your origin server.
  9. When using AWS Certificate Manager (ACM) and Amazon CloudFront, you configured your certificate within ACM. When you try to enable it in Amazon CloudFront, however, you do not see the certificate available for use. What could be the problem?

    1. ACM does not support Amazon CloudFront.
    2. You need to purchase a certificate from a third-party Certificate Authority (CA) and upload it to ACM.
    3. You need to configure the preshared key for ACM.
    4. You might not have created the ACM certificate in the right region.
  10. How can you use the wildcard * when invalidating objects with Amazon CloudFront?

    1. In place of specifying subdomains individually.
    2. As a form of object versioning.
    3. To allow access to your origin server.
    4. To specify a path that applies to many objects.
  11. What do Amazon CloudFront access logs do?

    1. They are a way to monitor performance of your Amazon Simple Storage Service (Amazon S3) bucket.
    2. They contain detailed information about every user request that Amazon CloudFront receives.
    3. They enable you to capture information about the IP traffic going to and from network interfaces.
    4. They enable governance, compliance, operational auditing, and risk auditing of your AWS account.