Serverless architectures are versatile. You can use them to build an entire back end or glue a few services together to solve a specific task. Building a proper back end requires the development of an application programming interface (API) that sits between the client and back-end services. In AWS, API Gateway is the key service that allows developers to create such a RESTful API.
This chapter takes a look at the API Gateway. We examine the fundamental activities that go into building an API and discuss features such as staging and versioning, as well as caching, logging, and throttling of requests. You’ll also continue to add new functionality to 24-Hour Video such as the ability to list videos facilitated by the API Gateway. Note that API Gateway is a service with many features that can’t all be addressed in a single chapter of a book. We recommend reading this chapter and then building a sample API, playing with different features, and reading through the official documentation. Like most of AWS, API Gateway is a rapidly developing service, so don’t be surprised if you see one or two new features not discussed here.
Think of the API Gateway as an interface (figure 7.1) between back-end services (including Lambda) and client applications (web, mobile, or desktop).
We’ve mentioned before that your front-end application can communicate with services directly. But there are many cases where this isn’t possible or desirable in terms of security or privacy. Some actions should be performed only from a back-end service. For example, sending an email to all users should be done via a Lambda function. You shouldn’t do it from the front end because it would involve loading every user’s email address into another user’s browser. That’s a serious security and privacy issue and a quick way to lose your customers. So don’t trust the user’s browser and don’t perform any sensitive operations in it. The browser is also a bad environment for operations that may leave your system in an inconsistent state. Have you seen those websites that say “Do not close this window until the operation has finished”? Avoid building systems like that. They’re too brittle. Instead, run operations from a back-end Lambda function, and update the UI when the operation completes.
An API Gateway is an example of technology that makes serverless applications easier to build and maintain than their traditional server-based counterparts. In a more traditional system you might need to provision EC2 instances, configure load balancing using Elastic Load Balancer (ELB), and maintain software on each server. The API Gateway removes the need to do all that. You can use it to define an API and connect it to services in minutes. In us-east-1, the API Gateway cost is $3.50 per million API calls received, which makes it affordable for many applications. Let’s look now at a few important features of the API Gateway in more detail.
Back in chapter 5, you connected an API Gateway to a user-profile Lambda function. You did this so that your website could request information about the user from a Lambda function. Those with keen eyesight would have noticed that Lambda was one of four options. The other three were HTTP Proxy, AWS Service Proxy, and Mock Integration, which are briefly described here.
The HTTP Proxy can forward requests to other HTTP endpoints. Standard HTTP methods (HEAD, POST, PUT, GET, PATCH, DELETE, and OPTIONS) are supported. The HTTP Proxy is particularly useful if you have to build an interface in front of a legacy API or transform/modify the request before it reaches the desired endpoint.
The AWS Service Proxy can call through to AWS services directly rather than through a Lambda function. Each method (for example, GET) is mapped to a specific action in a desired AWS service, such as adding an item to a DynamoDB table directly. It’s much quicker to proxy straight to DynamoDB than to create a Lambda function that can write to a table. Service Proxy is a great option for basic use cases (such as list, add, and remove) and it works across a wide range of AWS services. But in more advanced use cases (especially those that need logic), you’ll still have to write a function.
The Mock Integration option is used to generate a response from the API Gateway without having to integrate with another service. It’s used in cases such as when a preflight cross-origin resource sharing (CORS) request is issued and the response is predefined in the API Gateway.
It wouldn’t be a useful service if the API Gateway didn’t have facilities for caching, throttling, encryption, and logging. Section 7.3 deals with these concerns in more detail. Caching can help to reduce latency and alleviate the load on the back end by returning results computed earlier. But caching isn’t easy, so you must take care to get it right.
Throttling reduces the number of calls to the API using a token bucket algorithm. You can use it to restrict the number of invocations per second to prevent your back end from being hammered with requests. Finally, logging allows CloudWatch to capture what’s happening to the API. It can capture the full incoming request and outgoing response and track information such as cache hits and misses.
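API Gateway’s internal implementation isn’t public, but the token bucket idea itself is simple. The following is an illustrative sketch (the class and method names are our own, not AWS code): the bucket refills at the steady-state rate up to a burst capacity, and each request either consumes a token or is rejected.

```javascript
// Illustrative token bucket. The bucket holds at most `burst` tokens and
// refills at `rate` tokens per second; a request that finds the bucket
// empty is throttled (API Gateway responds with HTTP 429 in that case).
class TokenBucket {
  constructor(rate, burst) {
    this.rate = rate;       // tokens added per second (steady-state rate)
    this.burst = burst;     // bucket capacity (maximum burst)
    this.tokens = burst;    // start full
    this.last = Date.now(); // timestamp of the last refill, in ms
  }

  tryRemoveToken(now = Date.now()) {
    // Refill based on elapsed time, capped at the burst capacity
    const elapsedSeconds = (now - this.last) / 1000;
    this.tokens = Math.min(this.burst, this.tokens + elapsedSeconds * this.rate);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;  // request allowed
    }
    return false;   // request throttled
  }
}
```

With a rate of 1,000 and a burst of 2,000 (the default account-level limits discussed in section 7.3), an idle API can absorb an initial spike of 2,000 requests, after which requests are admitted at roughly 1,000 per second.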
Staging and versioning are features that you’ve already used. A stage is an environment for your API. You can have up to 10 stages per API (and 60 APIs per account), and it’s entirely up to you how to set them up. We prefer to create stages for development, user acceptance testing, and production environments. Sometimes we create stages for individual developers. Each stage can be configured separately and use stage variables to invoke different endpoints (that is, you can configure different stages to invoke different Lambda functions or HTTP endpoints).
Each time an API is deployed it creates a version. You can go back to previous versions if you make a mistake, making rollbacks rather easy. Different stages can reference different versions of the API, making it flexible enough to support different versions of your application.
Configuring the API Gateway manually (using the AWS console) is fine while you’re learning how to use it. But it isn’t a sustainable or robust way to work in the long term. Luckily, you can script an entire API using Swagger, which is a popular format for defining APIs (http://swagger.io). Your existing API can be exported to Swagger, and Swagger definitions can be imported as new APIs.
In chapter 5, you provisioned a new API for 24-Hour Video. You might recall that an API is made out of resources (which are entities like user) that can be accessed through a resource path (for example, /api/user). Each resource can have one or more operations defined against it, represented by HTTP verbs such as GET, DELETE, or POST (figure 7.2).
In this section, you’re going to add a new resource and method to the API Gateway, connect it to a Lambda function, and learn how to use Lambda proxy integration. Figure 7.3 shows which component of the 24-Hour Video system you’ll be working on in this chapter.
If you didn’t create an API in chapter 5, you’ll have to do it now. To create an API, choose the API Gateway in the AWS console and then click the Create API button. Leave the New API radio button selected, and enter an API name (such as 24-hour-video) and an optional description. Click Create API to finalize your choice (figure 7.4).
In chapter 6, you created a Lambda function to return a list of videos in your S3 bucket. It would be good to show these videos on your website. You want the user to open the site and see a list of videos they can play—just as they would on YouTube. For this to work, your website needs to issue a request to the get-video-list Lambda function via the API Gateway.
You’re going to create a resource called Videos and add a GET method to it, which you’ll use to request and receive the list of videos. When you finish working through this chapter, your implementation of 24-Hour Video will look similar to figure 7.5. To make things a little more interesting, you’ll add an optional URL query parameter called encoding. You’ll use this parameter to return videos of a specific encoding (for example, 720p or 1080p).
Be forewarned: having a Lambda function and an API Gateway to return a list of videos is what you have to do for now because you don’t have a database. In chapter 9, we’ll show you an alternative way of retrieving video URLs straight from the database without having to use the API Gateway or Lambda.
In the API Gateway, choose the 24-hour-video API you created earlier. Then follow these steps to create a resource called Videos (figure 7.6):
When you create a resource in the API Gateway, there’s an option called Configure as Proxy Resource (figure 7.6). Enabling this option creates a proxy resource with a “greedy” path variable that looks like {proxy+}. A greedy path variable represents any child resource under a parent resource. For example, if you had a path /video/{proxy+}, you could issue requests to /video/abc, /video/xyz, or any other endpoint starting with /video/. All of those requests would be routed to the {proxy+} resource automatically (it’s essentially a wildcard for paths). The + symbol tells the API Gateway to capture all requests on the matched resource (https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-set-up-simple-proxy.html#api-gateway-proxy-resource).
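To make the greedy matching concrete, here’s a tiny illustrative helper (our own sketch, not API Gateway code) that mimics how any path under /video/ maps to the {proxy+} capture, with the remainder of the path exposed as the proxy parameter:

```javascript
// Sketch of greedy path matching: any request path under the prefix
// matches, and the remainder is captured (mirroring the 'proxy' path
// parameter that API Gateway passes through to the back end).
function matchGreedy(resourcePrefix, requestPath) {
  if (!requestPath.startsWith(resourcePrefix)) {
    return null; // no match; some other resource handles this path
  }
  return { proxy: requestPath.slice(resourcePrefix.length) };
}
```

So /video/abc and /video/xyz both match the /video/{proxy+} resource, while /user/abc does not.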
Enabling the Configure as Proxy Resource option also creates an ANY method under the resource. The ANY method allows the client to use any HTTP method (GET, POST, and so on) to access the resource. But you don’t have to use the ANY resource if you don’t want to. You can still create individual methods like GET and POST.
Finally, you can choose a Lambda function proxy or an HTTP proxy as the integration type. We’ll discuss Lambda proxy and HTTP proxy integration shortly. When should you enable the Configure as Proxy Resource option? The answer is only if you have a specific reason or a use case for it. Our suggestion is to try to build mature RESTful APIs whenever possible (https://martinfowler.com/articles/richardsonMaturityModel.html) and to resort to Configure as Proxy Resource only when you have to. One final note: the greedy path variable, the ANY method, and proxy integrations are separate features and can be used independently from each other. We’ll show how to use Lambda proxy integration in this chapter but leave greedy path variables and the ANY method for you to experiment with.
Yet another option you can turn on when creating a resource is Enable API Gateway CORS. It’s safe to enable it during the creation of the resource. It creates an OPTIONS method that’s needed for CORS. If you use Lambda or HTTP proxy integration (as you will in this chapter), then everything is set up for you automatically. The option generates the OPTIONS method, and any additional CORS headers can be set in the Lambda function (as you’ll shortly see).
But if you end up mapping requests/responses individually, then you need to run Enable CORS from the Actions drop-down every time you create a new method. Running Enable CORS adds the necessary CORS header to the Method Response, which you need in this case. We’ve also noticed that using Enable API Gateway CORS during resource creation generates a slightly more permissive OPTIONS method (but you can tweak it after the fact). Feel free to enable that check box, but don’t forget about the Enable CORS option in the Actions drop-down menu.
Having created the resource, you can create a method for it:
You should now see integration setup for the GET method. You’re going to configure it to invoke your Lambda function:
Lambda Proxy Integration is an option that will make life easier when using API Gateway and Lambda together. If you enable it, API Gateway will map every request to JSON and pass it to Lambda as the event object. In the Lambda function you’ll be able to retrieve query string parameters, headers, stage variables, path parameters, request context, and the body from it.
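For example, a handler might pull what it needs off the proxy event like this. The field names (queryStringParameters, headers, body) are part of the proxy integration event format; the helper function itself is our own sketch, with defaults guarding against sections that are absent from a given request:

```javascript
// Sketch: extracting useful pieces from a Lambda proxy integration event.
// queryStringParameters and headers are null when the request has none,
// so we default them to empty objects before reading.
function extractRequest(event) {
  const query = event.queryStringParameters || {};
  const headers = event.headers || {};
  return {
    encoding: query.encoding || null,          // e.g. '720p'
    userAgent: headers['User-Agent'] || null,
    body: event.body ? JSON.parse(event.body) : null // body arrives as a string
  };
}
```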
Without enabling Lambda Proxy Integration, you’ll have to create a mapping template in the Integration Request section of API Gateway and decide how to map the HTTP request to JSON yourself. And you’d likely have to create an Integration Response mapping if you were to pass information back to the client. Before Lambda Proxy Integration was added, users were forced to map requests and responses manually, which was a source of consternation, especially with more complex mappings.
Lambda Proxy Integration makes things simpler, and in most cases you’ll find it’s the preferred option. There are cases, however, where you might want to create a specific mapping template (as you did in chapter 5). A mapping can help to produce a succinct and targeted integration payload as needed by the function (as opposed to passing the full request with proxy integration).
If you choose the HTTP integration type instead, you’ll get an option similar to Lambda Proxy Integration called Use HTTP Proxy Integration. If you enable this option, your request will be proxied in its entirety to the specified HTTP endpoint. If you don’t enable the option, you’ll be able to specify a mapping and create a new request payload yourself.
You now have a resource and a GET method. You need to enable CORS to allow your clients to access the API. For the moment, you’re going to allow any client from any origin to issue GET requests against /videos. As you move to create staging and production versions of the site, you’re going to lock down the origin, so that only your website can access the API. To enable CORS, do the following:
Security is important to get right. Nothing will take the shine off your newly designed serverless system faster than someone compromising its security. Remember that in a real-world setting you’ll need to restrict CORS to your domain rather than leave it wide open. If you use Lambda proxy integration, then the CORS headers must be set in the response created by the Lambda function. If you manually map requests and responses, then you can set the CORS headers in the integration response of the method.
If you select the GET method of your videos resource, you’ll see a page similar to figure 7.8. This page has the following configuration sections that can be accessed:
Click Method Request to access its configuration. You can do a number of things here, but they aren’t relevant right now because you’re using Lambda proxy integration. If you wish to find out what some of these settings do, refer to appendix E. The only option that’s applicable at this stage is the custom authorizer. You must set it to authenticate requests to this GET method. To do this, click the pen icon next to Authorization and select your custom authorizer from the list (remember, you created this authorizer in chapter 5). Figure 7.9 shows what this screen looks like.
Here’s where things get interesting. In this chapter we’ll show you how to create an API using Lambda proxy integration. But what if you want to do the same thing without using Lambda proxy integration? What if you want to write a mapping and have granular control over what’s available to the Lambda function via the event object? Understanding mappings and models is useful, so we’ve added appendix E, which explains how to implement an API without Lambda proxy integration. In the appendix we show how to configure integration request and response. We also introduce you to the Velocity Template Language (VTL) and show how to use regular expressions to create HTTP status codes in the API Gateway. We think you’ll end up using Lambda proxy integration most of the time, but appendix E serves as a great guide if you wish to understand mappings or want more precise control over the payloads produced by the API Gateway.
Anytime you make a change to the API Gateway, it must be redeployed. If you forget to do this, you won’t see any of your modifications. To perform a deployment, do the following:
The API Gateway side of things is mostly done. The Gateway proxies the HTTP request to the Lambda function, and the request becomes available to the function via the event object. The function can extract useful information such as the body of the request, the headers, and query string parameters. At the end, the function must create a specially formed response that the API Gateway can pass back to the client. If this response doesn’t follow the prescribed format, API Gateway will return a 502 Bad Gateway error to the client.
The following listing shows what an event parameter looks like when API Gateway invokes a Lambda function. The example given here is a basic GET request with a query string parameter called encoding. Note that some parts of this listing have been condensed or slightly modified for brevity.
A Lambda function must return a response, via the callback function, that matches the JSON format given in the next listing. If the format isn’t followed, the API Gateway will return a 502 Bad Gateway response (https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-set-up-simple-proxy.html).
{
    "statusCode": httpStatusCode,
    "headers": { "headerName": "headerValue", ... },
    "body": "..."
}
You implemented the get-video-list Lambda function in chapter 6. Now you need to update this function to work with the API Gateway. The next listing shows an updated implementation that accounts for the API Gateway and Lambda proxy integration. Replace the implementation of the existing function with the code given in the listing and deploy the function to AWS.
You’ll notice in listing 7.3 that you always invoke callback(null, response), even when you want to return an error to the client. The first parameter to the callback is null although, from the user’s perspective, you’re dealing with an error state. Why is that? Because from Lambda’s perspective everything is correct: the function itself didn’t fail. The second parameter is the response, and it’s the response that informs the client of any issue. Luckily, you can also set the HTTP status code that the API Gateway will send with the response. If you need to send back a 400 or 500 HTTP status code, that’s easy to do: tweak the payload and change the statusCode property to whatever you want. If you forget to put null as the first parameter in the callback, your client will get a 502 response.
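A condensed sketch of this pattern looks like the following. The helper names (createResponse, respondWithFiles) are illustrative, not the chapter’s exact listing, but the shape of the response and the callback(null, ...) convention match what listing 7.3 does:

```javascript
// Builds a response in the format API Gateway expects from a
// proxy-integrated function. In production, restrict the
// Access-Control-Allow-Origin header to your own domain rather than '*'.
function createResponse(statusCode, body) {
  return {
    statusCode: statusCode,
    headers: { 'Access-Control-Allow-Origin': '*' },
    body: JSON.stringify(body) // body must be a string
  };
}

function respondWithFiles(files, callback) {
  if (!files || files.length === 0) {
    // An error from the client's perspective, but a success from Lambda's:
    // the first argument stays null; the statusCode carries the 404.
    callback(null, createResponse(404, { message: 'No files were found' }));
    return;
  }
  callback(null, createResponse(200, { urls: files }));
}
```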
You can do a quick test in the API Gateway to check that everything is right. In the Method Execution window for the GET method (figure 7.8), click Test in the client rectangle. In the Query Strings text box enter encoding=720p and click Test. If you have any transcoded files ending with 720p.mp4, you should see them listed under Response Body (figure 7.10). If you don’t have any 720p files, you should see a response body that states that “No files were found” with an HTTP status code of 404. If you leave the Query Strings text box empty, then the response body will contain a list of all mp4 files in your transcoded videos bucket.
You’ve done all this work with the API Gateway and Lambda, but there’s one last thing to do. You need to update your 24-Hour Video website, which you began in chapter 5, to show videos. You’re going to change the front page to show videos that users have uploaded when the page loads. Also, you’re going to use the HTML5 video tag to play videos. All the latest versions of major browsers support it.
To update the site, open index.html in your favorite text editor and remove the entire section of the code (near the bottom) that begins with <div class="container"> and ends with </div>. In the place of that div, copy the contents of the following listing.
Next, you need to implement code that will issue a GET request against your API Gateway and populate the front page with the videos. To do this, create a file called video-controller.js in the js directory of the website and copy the contents of the next listing to it.
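The full video-controller.js listing is given in the book; as an illustrative sketch of its core request logic (the API base URL here is a placeholder, and jQuery’s $ is assumed, as elsewhere on the 24-Hour Video site), it boils down to something like this:

```javascript
// Builds the URL for the /videos GET endpoint, appending the optional
// encoding query parameter when one is given.
function buildVideoListUrl(baseUrl, encoding) {
  return encoding
    ? baseUrl + '/videos?encoding=' + encodeURIComponent(encoding)
    : baseUrl + '/videos';
}

// Issues the GET request via jQuery; API Gateway proxies it to the
// get-video-list Lambda function, which responds with { urls: [...] }.
function getVideoList(baseUrl, encoding, onSuccess) {
  $.get(buildVideoListUrl(baseUrl, encoding), function (data) {
    onSuccess(data.urls || []);
  });
}
```

A caller would pass the invoke URL of the deployed stage (something like https://xxxx.execute-api.us-east-1.amazonaws.com/dev) as baseUrl and render a video element for each returned URL.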
Another file that needs to be changed is main.js in the js directory. Replace the contents of this file with the following code.
Finally, add <script src="js/video-controller.js"></script> above <script src="js/user-controller.js"></script> in index.html and save the file. You’re finally in a position to see what the website looks like. From the command line, run npm run start to launch the website. If you’ve uploaded any videos, they should appear after a short wait. If you haven’t uploaded any videos, you can do it now and then refresh. If you’re not seeing any videos after a short wait, look at your browser’s console to investigate what’s going on.
In section 7.1, we briefly listed some of the features of the API Gateway, including caching and throttling. Let’s look at these in more detail because they’ll come in handy as you build your serverless architecture. You can find all of the options we mention in this section in the Settings tab of the Stage Editor. To get to the Settings tab, take these steps:
Let’s talk about throttling first, and specifically about the rate and burst limits. The rate is the average number of requests per second that the API Gateway will allow for a method. The burst limit is the maximum number of requests the gateway will tolerate in a short spike (in token bucket terms, the capacity of the bucket). By default, API Gateway sets the “steady-state request rate to 1000 requests per second (rps) and allows bursts of up to 2000 rps across all APIs, stages, and methods within an AWS account” (https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-request-throttling.html).
These defaults can be increased if you ask Amazon. The throttling feature prevents denial-of-service attacks by disallowing additional HTTP requests above the set threshold. You can see how it works by lowering the request rate and putting together a quick Lambda function that will fire off a few hundred requests in rapid succession.
To see how it works for yourself, do the following:
Having set a limit, create a new Lambda function and paste the contents of the next listing into it. This function is based on the https-request blueprint you can select when creating a new function.
If you’ve enabled a custom authorizer for the /videos GET method, you should temporarily disable it to run this test. In Resources, click GET under /videos, select Method Execution, and then set the Authorization drop-down to NONE. Deploy the API for the change to take effect. Don’t forget to set your custom authorizer back once you’ve finished the throttle test.
To run the function, click Test and paste the contents of the next listing as the event. Change the hostname to the hostname of your API and click Save and Test.
The test may take a few seconds to run, but you can scroll down the page to see the results under Log Output. Not all of the results will be captured there, so you can choose the logs link next to the Execution Result heading. If you scroll through the log, you’ll see that most requests did not succeed. After you’ve finished testing, don’t forget to set the Rate and Burst limits back to more sensible numbers or uncheck Enable Throttling.
We strongly recommend that you enable CloudWatch Logs and CloudWatch Metrics for your API. To do this, you need an IAM role that has permissions to write to CloudWatch, and you need to specify the ARN of this role in the API Gateway. In the IAM console, create a new role, call it api-gateway-log, and attach the managed policy AmazonAPIGatewayPushToCloudWatchLogs to it (figure 7.12). Write down the ARN of the role.
To finish, you need to configure the API Gateway:
To check that everything has been correctly set up, follow these steps:
As you begin using your API Gateway, logs will begin to appear in CloudWatch. In fact, these logs will help you in the next section.
The AWS documentation (https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-caching.html) has a great section on caching. Caching can increase the performance of your API by returning a result without calling your back-end service. Enabling the cache is easy, but you also have to know when to invalidate it so that your clients aren’t served stale results. It also costs money.
An API Gateway cache can be as small as 0.5 GB and as large as 237 GB. Amazon charges per hour for a cache, and the price depends on its size. As an example, 0.5 GB is $0.020 per hour, whereas 237 GB is $3.800 per hour. You can find the pricing table at https://aws.amazon.com/api-gateway/pricing/.
There’s a common saying that there are two hard things in computer science: cache invalidation, naming things, and off-by-one errors. Caching is hard to get right regardless of the system, so it will always take a bit of tweaking and experimenting the first time you do it.
To enable caching for your API, choose Enable API Cache in the Stage Editor (figure 7.14).
What difference does caching make? It all depends on how long the endpoint (such as your Lambda function) normally takes to run. As a quick and dirty test, we ran 500 requests against our get-video-list API with and without caching. Without caching, the overall execution time was around 30,000 to 31,000 ms. With caching, the duration was around 15,000 ms. Your mileage will vary, but for heavy-duty systems, caching will make a difference. You can see the results for yourself, including duration and memory consumption, in figure 7.15.
There are some helpful things you should know about caching (if you want to know the details refer to https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-caching.html):
We’ve briefly mentioned stages with regard to the API Gateway but haven’t explored what they are. To refresh your memory, a stage is an environment for your API. You can have stages to represent development, UAT, production, and anything else you want. You can have a stage for each developer if you need to. APIs can be deployed to different stages and each can have its own unique URL.
One of the nice things about stages is that they support stage variables, which are key/value pairs. These act as environment variables. They can be used in mapping templates and passed to Lambda functions, HTTP and AWS integration URIs, or in AWS integration credentials. You can configure different stages of the same API to invoke different Lambda functions or pass the value of a stage variable to another HTTP endpoint.
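For instance, with Lambda proxy integration, API Gateway places all stage variables on the event’s stageVariables property, so a function can adapt its behavior per stage. A minimal sketch (tableName is a hypothetical stage variable; the fallback value is also invented for illustration):

```javascript
// Sketch: reading a stage variable from a proxy integration event.
// With proxy integration, API Gateway puts every stage variable on
// event.stageVariables; the property is absent when none are defined.
function tableForStage(event) {
  const vars = event.stageVariables || {};
  return vars.tableName || 'videos-dev'; // fall back to a default
}
```

A dev stage could set tableName to videos-dev and a production stage to videos-prod, letting the same function version operate against different resources.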
To create a stage variable (figure 7.18), do this:
Each stage maintains its individual variables. If you need to have a variable called function available in three stages, you must create it individually three times.
A stage variable can be referenced in a mapping template or in place of a Lambda function name or an HTTP integration URI. It takes the form of stageVariables.<variable_name>. Often you’ll have to surround the stage variable with $ and {} as per the shorthand notation for references in the Velocity Template Language (see appendix E for more information). The following is an example that works in our case because we created a stage variable called function in the previous section:
${stageVariables.function}
If you wanted to reference this stage variable in place of a Lambda function name in the integration request, you could do it directly (figure 7.19).
Figure 7.20 shows a real example from the production system we discussed in chapter 2 (A Cloud Guru). This system has API Gateway stages such as production, uat (user acceptance test), and development. Lambda functions have aliases such as serverless-join:production and serverless-join:uat. Using a stage variable in the API Gateway allows it to invoke the right Lambda function when the appropriate URI is invoked (for example, myapi/staging and myapi/production).
If you want to roll back to a previous deployment of your API, you can do it via the Deployment History tab in the Stage Editor. Every deployment you make has a date/time stamp (and a description if you entered one) to help you identify earlier revisions. Select the version you want (figure 7.21) and click the Change Deployment button at the bottom of the page to switch to it. This is a useful feature if you make a mistake and need to go back a version while you figure out what went wrong.
In this chapter, we covered a lot of the functionality that the API Gateway offers. Try to complete the following exercises to reinforce your knowledge:
In this chapter we looked at the API Gateway, including how to
In the next chapter you’ll look at storage in more detail. You’ll see how to upload files directly to an S3 bucket and use a Lambda function to grant the uploader the permissions to do so. You’ll also take a look at securing access to files and generating presigned URLs.