Chapter 6. Lambda the orchestrator

This chapter covers

  • Invocation types and programming models
  • Versioning, aliases, and environment variables
  • Usage of the CLI
  • Development practices
  • Testing of Lambda functions

If there’s one thing you take away from this book, it should be an understanding that a compute service such as Lambda is the heart of serverless architecture. You used Lambda in chapters 3 and 5, so you have a feel for it already. This chapter explores Lambda in more detail. It looks at core concepts and investigates the design of functions. We explain features such as versioning and aliases and go over important design patterns such as async waterfall. We also continue to add features to 24-Hour Video as we turn it into a full-fledged application.

6.1. Inside Lambda

Serverless compute services like Lambda are as big a shift for cloud computing as S3 was for cloud storage. If you think about it, the two are similar. S3 deals in objects for storage. You provide an object and S3 stores it. You don’t know how, you don’t know where, and you don’t really care. There are no drives to concern yourself with and no such thing as disk space. You can’t over-provision or under-provision storage capacity in S3.

Likewise, with Lambda you provide function code; Lambda executes it on demand. You don’t know how and you don’t know where. There are no virtual machines to concern yourself with, and there are no such things as server farm capacity, too many idling servers, not enough servers to meet demand, or scaling groups. You can’t over-provision or under-provision execution capacity in Lambda. Capacity is whatever you need it to be, and Amazon charges you only for the time your code executes. This is why Lambda and similar serverless compute services such as Azure Functions, Google Cloud Functions, and IBM OpenWhisk are as big a shift forward for compute as S3 was for storage (http://bit.ly/2jQnlGB).

Function as a service

Some people prefer to use the acronym FaaS (function as a service) to describe technologies like Lambda. In fact, they prefer not to use the term serverless at all. They feel that it’s not accurate enough and that it needs to be constantly explained. In this book, we’ve been using the term serverless not as a synonym for Lambda but as a descriptor for an approach that encourages you to use a compute service, use third-party services and APIs, and employ powerful patterns and architectures (such as having thick front ends that talk directly to services using delegation tokens). Therefore, we say that serverless is an umbrella term that encompasses FaaS and that FaaS is just one aspect (albeit a very important one) of what serverless technologies and architectures have to offer.

6.1.1. Event models and sources

Lambda is a serverless compute service that can execute code in response to the following:

  • Events raised in AWS
  • HTTP requests arriving through the API Gateway
  • API calls made using the AWS SDK
  • Manual user invocation via the AWS console

Lambda functions can also run on a schedule, which makes them suitable for repeatable tasks such as backups or system health checks. Lambda supports functions written in four languages: JavaScript (Node.js), Python, C#, and Java. You’ve been using JavaScript so far, but there’s no reason why you couldn’t use one of the other languages. They’re all first-class citizens.

The two ways to invoke a Lambda function

Lambda supports two invocation types: Event and RequestResponse.

Event invocation takes place when an event (such as a file being created in S3) triggers a Lambda function. You saw event invocations in chapter 3 when you used S3 and SNS to invoke Lambda. Event invocation is asynchronous. A Lambda function that executes because of an event doesn’t send a response back to the event source.

The other model is RequestResponse. It comes into play when Lambda is used with the API Gateway, invoked via the AWS console, or called with the CLI. RequestResponse forces Lambda to execute the function synchronously and return the response to the caller. You used RequestResponse in chapter 5 when you integrated the API Gateway with the user-profile Lambda function. Note that if you invoke a function via the SDK/CLI, you can choose whether to use Event or RequestResponse invocation.

6.1.2. Push and pull event models

Lambda’s event-based invocation is quite interesting. It has two modes: push and pull. In a push model, a service (such as S3) publishes its event to Lambda and directly invokes your function. Figure 6.1 shows what this looks like.

Figure 6.1. Except for the stream-based services (Amazon Kinesis Streams and DynamoDB streams), AWS services use the push model.

In a pull model, Lambda’s runtime polls a streaming event source (such as a DynamoDB stream or a Kinesis stream) and invokes your function when needed. Figure 6.2 shows what this looks like.

Figure 6.2. The pull model applies to Amazon Kinesis Streams and DynamoDB streams only.

In both models, an event source mapping describes how an event source is associated with a Lambda function. One subtle difference between push and pull is this: “With the pull model, you maintain the mappings in AWS Lambda by creating event source mappings using the relevant AWS Lambda API. With the push model, the event sources maintain the mapping and you use the APIs provided by the event sources to maintain the mapping” (http://amzn.to/1Xb78FV).

6.1.3. Concurrent executions

AWS has a cap of 100 concurrent executions across all functions within a region (per account). This cap, however, can be lifted by asking Amazon. The company says that the limit is there to protect developers “from costs due to potential runaway or recursive functions during initial development and testing” (http://amzn.to/29nORER). The number of concurrent executions is calculated differently depending on whether the event source is stream-based (that is, if the event source is Kinesis Streams or a DynamoDB stream) or not.

Stream-based event source

In a stream-based event source, function invocation concurrency is equal to the number of active shards. If, for example, there are 10 shards, there will be 10 Lambda functions executing concurrently. Lambda functions process records off shards in the order in which they arrive. If a function encounters an error processing a record, it retries until it succeeds or the record expires, before going on to the next one.

Non-stream-based event source

Amazon proposes a simple formula for estimating the number of concurrent invocations for non-stream event sources:

events (or requests) per second × average function duration (in seconds)

A simple example is an S3 bucket that publishes 10 events per second with the function taking an average of three seconds to run, which equates to 30 concurrent executions (http://amzn.to/29nORER). If a Lambda function is throttled and continues to be invoked synchronously, Lambda will respond with a 429 error. It’s then up to the event source (for example, your application) to try invoking the function again. If a function was invoked asynchronously, AWS will automatically retry the throttled event for up to six hours with delays in between every invocation (http://amzn.to/29c7Bar).

6.1.4. Container reuse

Lambda functions execute in a container (sandbox), which provides isolation from other functions and an allocation of resources such as memory, disk space, and CPU. Container reuse is important to understand for more advanced uses of Lambda. When a function is instantiated for the first time, a new container is initialized and the code for the function is loaded (we say that the function is cold when this is done for the first time). If a function is rerun (within a certain period), Lambda may reuse the same container and skip the initialization process (we say that the function is now warm), thus making it available to execute code quicker.

Tim Wagner, the general manager of Lambda at AWS (http://amzn.to/237CWCk), makes an important point: “Remember, you can’t depend on a container being reused, since it’s Lambda’s prerogative to create a new one instead.” This means that every time you run a function, you should assume that you have a new container. But if you use the /tmp folder or touch the filesystem in other ways, your files or changes from the previous invocation may still be there. We’ve experienced this many times. If it happens to you, you’ll have to clean the /tmp directory manually.

Another important detail is what Wagner calls the freeze/thaw cycle. You can run a function and launch a background thread or a process. When the function finishes executing, the background process will become frozen. Lambda may reuse the container the next time you invoke the function and thaw the background process, thus resuming its execution. The background process will continue to run as though nothing happened. Keep this in mind if you decide to run background processes.

6.1.5. Cold and warm Lambda

Here’s an experiment. Create any simple Hello World function in the AWS console and run it. You can easily do this by using the hello-world blueprint and then clicking the Test button in the console. Have a look at the duration in the summary in the bottom-left corner (figure 6.3).

Figure 6.3. The time it takes to run a cold function is nearly 90 ms.

Then run the test again and look at the duration in the summary (figure 6.4).

Figure 6.4. A warm function runs much quicker than a cold one.

If you compare the duration of both executions, you’ll see that the time it takes to run a function for the first time is a lot longer than running it a second time. This is the result of the container reuse we described in the previous section. The first time the function is run (when it’s cold), the container needs to be created and the environment needs to be initialized. A lengthy initialization time may be especially noticeable in complex functions that have multiple dependencies. Reusing a container and running a function again is almost always much quicker.

You should try to reduce cold starts (when a function hasn’t been run for a long time and needs to fully initialize) to make the application appear more responsive. If you experience many cold starts, you can try a few steps to increase performance:

  1. Schedule the function (using scheduled events) to run periodically to keep it warm (http://amzn.to/29AZsuX).
  2. Move initialization and setup code out of the event handler; if the container is warm, that code won’t run again (see the sketch after this list).
  3. Increase the amount of memory allocated to the Lambda function. The CPU share is (proportionally) based on the amount of memory allocated to the function. AWS gives an example: “If you allocate 256 MB to your Lambda function, it will receive twice the CPU share than if you allocated 128 MB” (http://amzn.to/23aFKif). The more memory and CPU share the function has, the quicker it will initialize.
  4. Reduce your code as much as possible. Remove unnecessary modules and require() statements. Fewer modules to include and initialize will help startup performance.
  5. Experiment with other languages. Java has the longest cold start. This may change in the future, but if you notice long cold starts using Java, try one of the other languages.
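
To illustrate item 2, here’s a minimal sketch (the bucket name and the S3 call are just examples): everything above the handler runs only on a cold start, so keep require() calls and client construction there.

var AWS = require('aws-sdk');   // loaded once per container, on the cold start
var s3 = new AWS.S3();          // the client is created once and reused while the container stays warm

exports.handler = function(event, context, callback) {
    // only per-request work belongs here
    s3.listObjects({Bucket: 'my-upload-bucket'}, function(err, data) {
        if (err) {
            return callback(err);
        }
        callback(null, data.Contents.length + ' objects found');
    });
};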

6.2. Programming model

We touched on Lambda’s programming model back in chapter 3. Let’s look at it now in more detail from the perspective of the Node.js 4.3 runtime you’ve been using. These are the important elements to consider:

  • Function handler
  • Callback function
  • Context object
  • Event object
  • Logging

6.2.1. Function handler

You’ve seen that the function handler is what the Lambda runtime calls to run your function. It’s the entry point. Lambda passes the event data to the handler as the first parameter, a context object as the second, and a callback function as the third. The syntax for the function handler is as follows:

exports.handler = function(event, context, callback) { //code }

The callback is optional and is used if you want to return information to the caller of the function or to log an error. Over the next three sections we’ll describe the event, context, and callback parameters in more detail.

6.2.2. Event object

You already saw the event object in action when you invoked Lambda functions in previous chapters. The event object contains information about the event and the source that triggered the Lambda function. It’s just a JSON object with an arbitrary number of properties that are specified by the event source.

You can look at sample event objects in Lambda’s console by following this process:

  1. Click into a Lambda function.
  2. Click Actions.
  3. Select Configure Test Event.
  4. Select a template from the Sample Event Template drop-down (figure 6.5).
    Figure 6.5. The available event templates provided through the AWS console. You can customize a template or create your own from scratch.

If you invoke a Lambda function via the AWS console, via the CLI, or through the API Gateway, you can create your own event object and customize the way it’s structured.
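
For reference, the S3 Put sample event looks roughly like the following (abbreviated; the bucket and key names are placeholders):

{
  "Records": [
    {
      "eventSource": "aws:s3",
      "eventName": "ObjectCreated:Put",
      "s3": {
        "bucket": { "name": "serverless-video-upload" },
        "object": { "key": "my-video.mp4", "size": 1048576 }
      }
    }
  ]
}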

6.2.3. Context object

The context object provides a number of useful properties for getting information about Lambda’s runtime. You can call a number of methods on the context object, such as done(), succeed(), and fail(). These methods were important in the Node.js 0.10 version of the Lambda runtime but aren’t needed in the Node.js 4.3 version. You can review appendix D if you want to know what these are. The other method on the context object that you might find useful is getRemainingTimeInMillis(). Calling this method returns the approximate remaining execution time. This method is valuable if you need to check how much time is left before a timeout (a Lambda function can run for a maximum of five minutes).
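
For example, a function that processes a batch of items could use getRemainingTimeInMillis() to stop cleanly before the timeout hits. The following is only a sketch; the 1,000 ms safety margin is arbitrary.

exports.handler = function(event, context, callback) {
    var items = event.items || [];

    for (var i = 0; i < items.length; i++) {
        if (context.getRemainingTimeInMillis() < 1000) {
            return callback(new Error('Stopping early: processed ' + i + ' of ' + items.length + ' items'));
        }
        // process items[i] here
    }

    callback(null, 'Processed ' + items.length + ' items');
};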

The context object also has these useful properties:

  • functionName—Returns the name of the Lambda function currently executing.
  • functionVersion—The function version that is executing.
  • invokedFunctionArn—The ARN used to invoke the function.
  • memoryLimitInMB—The configured memory limit of the function.
  • awsRequestId—The AWS request ID.
  • logGroupName—The CloudWatch log group to which the function will write.
  • logStreamName—The CloudWatch log stream to which the function will write.
  • identity—The Amazon Cognito identity if available.
  • clientContext—Information about client application and device when invoked via the AWS Mobile SDK. It can contain additional information such as platform version, make, model, and locale.

See http://amzn.to/1UK9eib for more information about the methods and properties available via the context object.

6.2.4. Callback function

The callback function is an optional third parameter in the handler function. It’s used to return information to the caller in the RequestResponse invocation type, such as when a function is invoked via the API Gateway. The syntax for using the callback object is as follows:

callback(Error error, Object result)

The Error parameter is optional and is used when you want to specify information about a failed execution. The second parameter, also optional, is used to provide information to the caller when the function succeeds. Note that you need to pass null as the first parameter if you’re going to specify the second parameter and there’s no error. The following are examples of valid uses of callback:

  • callback(null, "Success");
  • callback("Error");
  • callback(); //This is the same as callback(null);

You don’t need to specify any parameters in the callback if you don’t want to return information to the caller. You don’t even need to add callback() to your code if you don’t want to return a response or log an error. Lambda will call it for you implicitly if you don’t include it in your code. For more information on using the callback function, see the section titled “Using the Callback Parameter” at http://amzn.to/1NeqXM5.

6.2.5. Logging

Logging to CloudWatch can be done using console.log("message"). The other supported ways of logging are console.error(), console.warn(), and console.info(), but there’s no real distinction between them in terms of CloudWatch. If you invoke a Lambda function programmatically (see section 6.4 for more on this), you can add a LogType parameter to receive the last 4 KB of log data (it’s returned in the x-amz-log-results header of the response). The callback function will also log to a CloudWatch log stream if you provide a non-null value as the first parameter. At the end of the day, we highly recommend that you adopt a proper logging framework that manages alert levels and log objects (for example, have a look at the log module at http://bit.ly/1VHIxuA).
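
For example, invoking a function from the CLI with the log type set to Tail returns the log tail along with the response (the function name here is a placeholder):

aws lambda invoke --function-name get-video-list --log-type Tail output.txt

The response printed to the terminal includes a LogResult property; it’s base64-encoded, so decode it to read the last 4 KB of log output.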

6.3. Versioning, aliases, and environment variables

When Lambda was originally released, it didn’t have support for versioning, aliases, or environment variables. But it’s now hard to imagine building and running a real production system without these features.

6.3.1. Versioning

Versioning allows developers to create new versions of functions without overwriting previous ones. Once a new version of a function is published, the old version can still be accessed but it can’t be changed. Importantly, each version of a function has its own unique ARN, and each version can be invoked. To create a new version of a function, follow these steps:

  1. Open the Lambda console in AWS and click a function.
  2. Choose Actions and choose Publish New Version.
  3. Type a description in the dialog box. This description will be added to the version you’re about to create.
  4. Choose Publish to close the dialog box.

If you click the Qualifiers drop-down and then select the Versions tab, you’ll see all current versions of the function (figure 6.6). The most recent version is always identified as $LATEST. If you don’t specify a version number when invoking a function, this is the function that’s invoked.

Figure 6.6. Versions are easy to create and invoke through the console and the CLI.

Your next question might be how to invoke a specific version of a function. That depends on where you’re trying to invoke the function from. If it’s the API Gateway, you can specify the function name and version with a colon in between, as you can see in figure 6.7 (for example, my-special-function:3).

Figure 6.7. Setting the right version of the Lambda function to invoke is trivial in the API Gateway. If you don’t specify a version, the API Gateway will invoke the $LATEST version.

If it’s S3, you can specify the function ARN, which, as we mentioned before, is unique for every version of the function (figure 6.8).

Figure 6.8. S3 uses the ARN to invoke the right version of the function. You can look up the ARN in Lambda’s console.
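
If you’re invoking from the CLI or the SDK, you pass the version (or, as you’ll see next, an alias) as a qualifier instead. Here’s a quick sketch; the function name and version number are examples:

aws lambda invoke --function-name my-special-function --qualifier 3 output.txt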

6.3.2. Aliases

An alias is a pointer or a shortcut to a specific version of a Lambda function. It has an ARN, just like a function, and it can be mapped to any version of a function but not to another alias. An alias makes things easier when you need to switch from one version of a function to another. Imagine the following scenario:

  • You have three versions of a function:

    • Version 1 is in production.
    • Version 2 is being tested in a staging/UAT environment.
    • $LATEST is the current development version.

  • You’ve finished testing version 2 and want to promote it to production.
  • Without aliases, you’d have to update every event source that references version 1 (the current production version) to reference version 2. This isn’t ideal because it may mean a redeployment of your code and multiple updates throughout your system.

With an alias, this scenario becomes easier to manage:

  1. Create three aliases called dev, staging, and production.
  2. Assign the right alias to the right version of the function:

    • The production alias points to version 1.
    • The staging alias points to version 2.
    • The dev alias points to $LATEST.
  3. Configure event sources to point to an alias instead of a specific version of a function.

Whenever you need to update the system to use a new version of a function, change the alias to point to that new version instead (figure 6.9). Event sources remain ignorant of the fact that an alias now points to a new version of a function and continue to operate as normal.

Figure 6.9. Initially an alias called production points to version 1 of a Lambda function. After an update, it’s remapped to point to version 2. The staging alias is also remapped to point to the $LATEST version of the function.

To create an alias for a function, follow these steps:

  1. Choose Lambda in the AWS console and choose any function.
  2. Choose Actions.
  3. Choose Create alias.
  4. In the dialog box, enter a name for the alias (such as dev or production) and a description, and select the version that the alias should point to.
  5. Choose Submit to create the alias and close the dialog box.

To view aliases for a function, use the Qualifiers drop-down, just as you did with versions (figure 6.10). A tab in the drop-down allows you to switch the view between versions and aliases. To delete an alias, choose Actions and select Delete Alias. Doing this deletes the alias and any related event source mappings that point to it. Everything else, including function versions, is left intact.

Figure 6.10. You can switch between aliases and versions in the sidebar.
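
Aliases can also be managed from the CLI. As a sketch (the function name is an example), the following creates a production alias against version 1 and later repoints it to version 2:

aws lambda create-alias --function-name my-special-function --name production --function-version 1
aws lambda update-alias --function-name my-special-function --name production --function-version 2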

6.3.3. Environment variables

You already saw environment variables in chapter 5 when you created the user-profile Lambda function. Environment variables are key-value pairs that can be set using the Lambda console, the CLI, or the SDK. They can be referenced by the function’s source code and accessed during function execution.

By using environment variables for settings and secrets, you’ll avoid having to bake this information into the function’s code. And you’ll be able to change variables without having to modify and redeploy the function. Environment variables work with function versioning, which we’ve discussed in this section. A development version of a function can use a variable that points to the connection string of a development database. The same environment variable can point to the production version of a database for the production version of the function.

Basic usage

Figure 6.11 shows the part of the Lambda console (the Code tab) where you can set environment variables. You may notice that in this book we uppercase the names (keys) of environment variables (for example, UPLOAD_BUCKET). That’s a convention we chose to follow; you don’t have to uppercase your environment variable names if you don’t like it.

Figure 6.11. The key must start with a letter and contain only letters, numbers, and underscores. Values don’t have such restrictions, but at the time of this writing you shouldn’t use commas in them. You’d have to either use a different delimiter or encrypt the value.

You can set environment variables using the AWS CLI. The CreateFunction and UpdateFunctionConfiguration APIs allow you to do that (more on these APIs in the next section).
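
As a sketch, setting the UPLOAD_BUCKET variable from figure 6.11 with update-function-configuration looks like this (the function and bucket names are placeholders):

aws lambda update-function-configuration --function-name my-function --environment Variables={UPLOAD_BUCKET=serverless-video-upload}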

Note

Some environment variable key names are reserved. You can’t, for example, set a key called AWS_REGION or AWS_ACCESS_KEY. To see a full list of reserved variables, have a look at this page: http://amzn.to/2jDCgBa.

Environment variables can be accessed through process.env (for Node.js functions). If you wanted to print the value of the UPLOAD_BUCKET variable shown in figure 6.11, you’d add the following line to your function:

console.log(process.env.UPLOAD_BUCKET);

Encryption

For sensitive data, you can choose to encrypt environment variables. You can do it in the console by enabling the Enable Encryption Helpers check box (figure 6.12). The first time you enable it, you’ll get a chance to create an encryption key using the AWS Key Management Service (KMS). You’ll then have this key available for encryption across all Lambda functions (in each region). You can, of course, create multiple keys too.

Figure 6.12. Use encryption whenever you’re handling sensitive data.

Having created a key, you’ll be able to encrypt all or some of the variables. In the console, you’ll see a button labeled Encrypt next to each environment variable. Use this button to encrypt the variable. The value will be immediately replaced with an encrypted string. You’ll also see a button called Code. You can click this button to get a snippet of code that shows how to decrypt the variable within the function.
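
The generated snippet is based on the AWS SDK’s KMS client. A minimal sketch of the idea looks like this (the ENCRYPTED_DB_PASSWORD variable name is an example):

var AWS = require('aws-sdk');
var kms = new AWS.KMS();

var params = {
    CiphertextBlob: new Buffer(process.env.ENCRYPTED_DB_PASSWORD, 'base64')
};

kms.decrypt(params, function(err, data) {
    if (err) {
        console.log('Decryption error:', err);
        return;
    }

    var decrypted = data.Plaintext.toString('ascii');
    // use the decrypted value here
});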

Our advice is to use environment variables for settings and secrets whenever you can. Don’t bake these into your function. Use what the platform offers you and you’ll find life to be a lot easier.

6.4. Using the CLI

So far, you’ve primarily used the AWS console to create and configure Lambda functions. It’s likely, though, that you’ll have to use the CLI at some point to create, update, configure, and delete functions, especially if you start thinking about automation.

6.4.1. Invoking commands

If you followed chapter 3, you installed the AWS CLI (http://amzn.to/1XCoTOC). The CLI allows you to issue commands in the following form:

aws lambda <command> <command-options>

The page at https://docs.aws.amazon.com/cli/latest/reference/lambda/index.html describes available CLI commands. Let’s take delete alias (delete-alias) as an example. It has a few optional parameters, but at its core it’s as simple as running the following (where the --name flag is the name of the alias):

aws lambda delete-alias --function-name return-response --name production

Remember that if you invoke CLI commands, you must also have the right IAM permissions configured. If you were to try running the delete-alias command right now, you’d receive an error message, such as “Client error (AccessDeniedException) occurred when calling the DeleteAlias operation.” To make it work, you’d need to add the lambda:DeleteAlias permission to the user’s policy.

6.4.2. Creating and deploying functions

In chapters 3 and 5 you deployed functions to AWS using the UpdateFunctionCode API. You did it by adding the following script to package.json:

aws lambda update-function-code --function-name arn:aws:lambda:
     us-east-1:038221756127:function:transcode-video --zip-file
     fileb://Lambda-Deployment.zip

But you had to create the function initially in the AWS console before you could use update-function-code. That’s a manual step and doesn’t fit with your ethos of complete automation. How would you go about creating and deploying a function entirely from the command line? Let’s step through an exercise to see how it’s done.

First, you need to update the lambda-upload IAM user so that it can create functions. Back in chapter 4, you created a group called Lambda-Upload-Policy and assigned the user lambda-upload to it. Now you need to edit the group’s policy and add a new permission:

  1. In the IAM console, open Groups.
  2. Click the Lambda-Upload-Policy group.
  3. Select the Permissions tab if it’s not already selected.
  4. In the Inline Policies section, click Edit Policy on the right of the policy name.
  5. Add lambda:CreateFunction to the Action array (figure 6.13).
    Figure 6.13. A simple update to the policy will allow the user to create functions.

  6. Click Apply Policy to save.

You should also double-check that the user lambda-upload is in the Lambda-Upload-Policy group:

  1. Click the Users tab in the Lambda-Upload-Policy group.
  2. Check to see if the lambda-upload user is listed in the table.
  3. If the user isn’t listed, click the Add Users to Group button, find lambda-upload in the list, put a checkmark next to the user, and then click the Add Users button.

To create a function using the CLI, you need to provide a zip file with the source of the function or point to an S3 bucket that has the source. It’s easy to create a function locally and zip it up:

  1. Create a file named index.js.
  2. Copy the contents of the following listing to the file.
  3. Zip the file to create index.zip.
Listing 6.1. Basic function
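
A minimal handler along these lines will do (a sketch; any valid handler works for this exercise):

exports.handler = function(event, context, callback) {
    console.log('Received event:', JSON.stringify(event));
    callback(null, 'Hello from a function created with the CLI');
};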

In the same directory as the zip file of the function, run the command given in the next listing (remember to update the role ARN; it should be the ARN of your own lambda-s3-execution-role). Then take a look in the Lambda console to make sure the function is there.

Listing 6.2. Working example of the create-function command
aws lambda create-function --function-name cli-function
     --handler index.handler --memory-size 128 --runtime nodejs4.3
     --role arn:aws:iam::038221756127:role/lambda-s3-execution-role
     --timeout 3 --zip-file fileb://index.zip --publish

This next listing shows a subset of the syntax you used with the create-function command, together with an explanation of each of the options. (Lambda supports many more settings and flags than shown in listing 6.3; refer to http://amzn.to/2jeCOfR if you want to see all of the options.)

Listing 6.3. Syntax for the create-function command
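
In outline, the options used in listing 6.2 mean the following (a sketch; see the AWS CLI reference for the full set of options):

aws lambda create-function
  --function-name <value>   Name of the function to create
  --handler <value>         Entry point, as file-name.exported-function (for example, index.handler)
  --memory-size <value>     Memory in MB; the CPU share scales with this value
  --runtime <value>         Runtime to use, such as nodejs4.3
  --role <value>            ARN of the IAM role Lambda assumes when it executes the function
  --timeout <value>         Maximum execution time in seconds
  --zip-file <value>        Deployment package, for example fileb://index.zip
  --publish                 Publish a version immediately after creating the function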

Naturally, there are many other useful commands, including list-functions, get-function, get-function-configuration, update-function-code, update-function-configuration, delete-function, and invoke.

6.5. Lambda patterns

If you’re using JavaScript (Node.js) to write your Lambda functions, you’ll have to deal with asynchronous callbacks. You’ve already seen these in action in chapter 3 and especially in the third Lambda function that you created in that chapter. Having multiple callbacks is frustrating and complex because following the logic of the program becomes difficult. If your function naturally leads to a series of sequential steps, you can adopt an async waterfall pattern and reduce the complexity of managing multiple asynchronous callbacks.

Not the only game in town

Async waterfall is a good pattern, but it’s by no means the only way you can deal with callback hell. ES6 supports promises, generators, and yield (http://bit.ly/2k70Zge), which you can use with Node.js 4.3 and Lambda. You could even try ES7 features such as async/await, transpiling your code down to promise chains, but that could make debugging harder. In other words, read the next section and think about whether the async waterfall pattern is right for you. In many cases, especially if you’re dealing with legacy code that hasn’t been ported to Node.js 4+, knowing and applying a pattern like this is a good idea.

6.5.1. Async waterfall

Async (http://bit.ly/23RfWVe) is a JavaScript library that can be installed as an npm module. It has several powerful features, one of them being support for the waterfall pattern. This pattern allows you to run a set of functions one after another, passing the result of one function into the next one using a callback function. If one of the functions passes an error into the callback, the execution of the waterfall is stopped, and the next task is not invoked (figure 6.14).

Figure 6.14. The async waterfall pattern allows you to invoke and pass results from one function to another. It makes handling of asynchronous methods easier than using callbacks.

The following listing shows a general example of an async waterfall pattern (this listing is adopted from an example given in http://bit.ly/1WaSNui).

Listing 6.4. Async waterfall example
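
A generic waterfall looks like this (a sketch; the task names and values are made up):

var async = require('async');

async.waterfall([
    function(callback) {
        callback(null, 'one', 'two');        // pass two results to the next task
    },
    function(arg1, arg2, callback) {
        // arg1 is 'one' and arg2 is 'two'
        callback(null, 'three');
    },
    function(arg1, callback) {
        // arg1 is 'three'
        callback(null, 'done');
    }
], function(err, result) {
    // err is null unless a task passed an error; result is 'done'
});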

In listing 6.4, take note of the callback function, which is used often. This function must be called on the completion of each task. The first parameter in the callback represents an error. If there’s no error, use a null. The other parameters can be whatever you want. They’re passed on to the next task.

This callback function is similar to the callback you’ve already seen in Lambda. You must not confuse the two, however, so we recommend giving the callback used in the async waterfall a different name (such as next).

24-Hour Video list

As a clone of YouTube, 24-Hour Video needs to list videos that users can click and view. You don’t have a database at the moment to store URLs to your videos, but you can create a Lambda function to make a list of files in the S3 bucket. This function can be invoked via the API Gateway, and it can return a list of URLs to your videos. You can use async waterfall for this example because you need to take a few steps in series.

Basic setup

Create a new function on your system and name it get-video-list. Here’s how to do this:

  1. Copy one of the previous functions (such as transcode-video) to a new folder and name it get-video-list.
  2. Remove all contents in index.js.
  3. Update package.json to resemble the next listing, adding or modifying whatever differs from your existing file.
Listing 6.5. Package.json for the get-video-list function
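
Here’s a sketch of what this file might look like (the role ARN and the exact script bodies are placeholders; model them on the package.json you built in chapter 3):

{
  "name": "get-video-list",
  "version": "1.0.0",
  "description": "Lists transcoded videos stored in S3",
  "main": "index.js",
  "scripts": {
    "precreate": "zip -r Lambda-Deployment.zip * -x *.zip *.log",
    "create": "aws lambda create-function --function-name get-video-list --handler index.handler --memory-size 128 --runtime nodejs4.3 --role arn:aws:iam::YOUR_ACCOUNT_ID:role/lambda-s3-execution-role --timeout 3 --zip-file fileb://Lambda-Deployment.zip --publish",
    "predeploy": "zip -r Lambda-Deployment.zip * -x *.zip *.log",
    "deploy": "aws lambda update-function-code --function-name get-video-list --zip-file fileb://Lambda-Deployment.zip"
  },
  "dependencies": {
    "aws-sdk": "^2.3.2"
  }
}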

Add the async module using npm. In the terminal, change to the directory of the function and run the following:

npm install async --save

You should also run npm install to make sure that the AWS SDK is installed. If you look at package.json, there should be two dependencies: async and aws-sdk.

Now (if you followed section 6.4.2) you can create the required Lambda function in AWS using the command npm run create. If you skipped over section 6.4.2, you’ll have to create the get-video-list function in the AWS console yourself.

Implementation

This function has a fairly simple implementation, as shown in listing 6.6. Notably, it doesn’t take into account some scenarios, such as what happens when there are many files in the S3 bucket (the S3 listObjects operation returns up to 1000 objects in the bucket). The function is also not very efficient. But it’s a good temporary measure until we introduce a proper database, and it’s a good way to show how the waterfall pattern is used.

Listing 6.6. The get-video-list function
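
The following is a sketch of an implementation that matches this description; it assumes the BUCKET and BASE_URL environment variables discussed next and keeps error handling minimal.

'use strict';

var AWS = require('aws-sdk');
var async = require('async');

var s3 = new AWS.S3();

exports.handler = function(event, context, callback) {
    async.waterfall([
        function listFiles(next) {
            // step 1: list objects in the transcoded-videos bucket
            s3.listObjects({Bucket: process.env.BUCKET}, next);
        },
        function createUrls(data, next) {
            // step 2: keep only mp4 files and turn each key into a URL
            var urls = data.Contents
                .filter(function(file) {
                    return file.Key && file.Key.substr(-4) === '.mp4';
                })
                .map(function(file) {
                    return process.env.BASE_URL + '/' + process.env.BUCKET + '/' + file.Key;
                });

            next(null, {urls: urls});
        }
    ], function(err, result) {
        if (err) {
            callback(err);
        } else {
            callback(null, result);
        }
    });
};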

Deploy the function by executing npm run deploy from the directory of the function.

Environment variables

The code in listing 6.6 uses two environment variables: BUCKET and BASE_URL. The BUCKET variable is the name of the second S3 bucket with transcoded files. BASE_URL is a base address for S3 buckets, which is https://s3.amazonaws.com. You must add these two variables for the function to work. In the Lambda console, click the get-video-list function, and add these two environment variables at the bottom of the Code tab (figure 6.15).

Figure 6.15. You must add the BUCKET and BASE_URL environment variables for the function to run.

Testing

The simplest way to test this function is to jump into the AWS console, click Lambda, and then click get-video-list. From there click the Test button. If you get the Input Test Event dialog, click Save And Test to proceed. You should see a list of URLs (if you have mp4s in the bucket) under the Execution Result heading toward the bottom of the page (figure 6.16).

Figure 6.16. You can preview the response in the AWS console, making it easy to test your Lambda function.

Invoking functions from the command line

The AWS CLI can invoke Lambda functions from the command line. It supports both invocation types, RequestResponse and Event. The syntax for the command can be found at http://amzn.to/269Z2U2. If you decide to try a synchronous RequestResponse invocation, you need to provide at least two parameters: the function name and the output file that will contain the response from the function.

To invoke the get-video-list function, you need to run the following from your terminal:

aws lambda invoke --function-name get-video-list output.txt

Remember to grant the right permission (lambda:InvokeFunction) to your IAM user if you decide to use this.

6.5.2. Series and parallel

In addition to waterfall, the async library supports series and parallel patterns of execution. The series pattern is similar to waterfall; it invokes a series of functions one by one. Values (results) are passed into the optional callback function at the very end once the series has finished (figure 6.17).

Figure 6.17. The async series pattern can help if you have a series of independent calculations and then get all the results at the end.

The parallel pattern is used to run functions in parallel without waiting for other functions to finish. Once all functions have completed, the results are passed into the optional (final) callback (figure 6.18).

Figure 6.18. The parallel pattern allows functions to execute at the same time and pass their results to the optional callback function at the end.
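
A quick sketch of both (made-up tasks; each result lands in the results array in task order):

var async = require('async');

async.series([
    function(callback) { callback(null, 'first'); },
    function(callback) { callback(null, 'second'); }
], function(err, results) {
    // results is ['first', 'second']; the tasks ran one after another
});

async.parallel([
    function(callback) { callback(null, 'a'); },
    function(callback) { callback(null, 'b'); }
], function(err, results) {
    // results is ['a', 'b']; the tasks ran at the same time
});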

6.5.3. Using libraries

This advice is a given for most developers: identify code that’s repeated in multiple Lambda functions and move it to a separate file so that it’s written only once (the Don’t Repeat Yourself principle). You can then import your libraries using Node’s require().

Let’s practice what we preach and build a library for sending emails using Amazon’s Simple Email Service (SES). You can use it to send email when a user uploads a new video or as a way to enable messaging between users. There are two ways you can include this library with the rest of your code:

  • You can build a module, deploy it to npm, and then use npm install --save to add it. It may take a little more overhead, but it’s a good way to manage dependencies and libraries. If having your code publicly available isn’t what you want, there’s a way to set up and use a private npm repository (see http://bit.ly/1MOsyIF).
  • Another approach is to create a lib directory and place your libraries there. Every function can reference the lib directory and import what’s needed. This approach may work perfectly well for small applications but it has major disadvantages. Maintaining, sharing, and using different versions of the library will become difficult once you begin to grow. Having two or more versions of the library can become problematic as you try to remember which version goes where. So use this approach sparingly, for experiments or for very simple systems. Go with a proper package management system such as npm if you decide to build anything substantial.

We’ll go with the second method in this example, but as part of an exercise later on, we’ll ask you to create an npm module for your library, deploy it to the npm repository, and then install it using npm install.

Get the code

Create a directory called lib for your library. In this directory make a file called email.js and copy the following listing into it.

Listing 6.7. Adding email support
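
Here’s a sketch of such a library. The parameter order follows the description below; the use of async.waterfall and the default SES region are assumptions, so adapt it to your setup.

'use strict';

var AWS = require('aws-sdk');
var async = require('async');

var ses = new AWS.SES();

exports.send = function(toList, from, subject, message) {
    async.waterfall([
        function createParams(next) {
            // build the SES request from the caller's arguments
            var params = {
                Source: from,
                Destination: {ToAddresses: toList},
                Message: {
                    Subject: {Data: subject},
                    Body: {Text: {Data: message}}
                }
            };
            next(null, params);
        },
        function sendEmail(params, next) {
            ses.sendEmail(params, next);
        }
    ], function(err, result) {
        if (err) {
            console.log('Failed to send email:', err);
        } else {
            console.log('Email sent:', result.MessageId);
        }
    });
};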

The code in listing 6.7 has a function called send, which can be invoked by external code. It takes four parameters: an array of receiver emails, the sender’s own email, a subject, and a message. To use this library in your Lambda functions, follow these steps:

  1. Copy the file (email.js) over to the directory of the function.
  2. Use require() to load the library in the function:
    var email = require('./email');
  3. Invoke send and pass the required parameters to it:
    email.send(['[email protected]'], '[email protected]', 'Subject', 'Body');

You will have noticed that in this library you imported async but didn’t bother to run npm install. This is because the library you created will be shipped with a Lambda function that will hopefully have the right npm modules, including async, installed. To be on the safe side, however, you could write a package.json for this library and npm install needed dependencies. In fact, you should do this if you decide to build libraries properly.

Here are a few notes about sending email using SES:

  • The role used to execute the function and send email needs the ses:SendEmail permission:

    • Create a new role or modify an existing one that you wish to use for the function.
    • Add a new inline policy and select Policy Generator.
    • From the AWS Service drop-down, select Amazon SES.
    • From the Action drop-down, select SendEmail and SendRawEmail.
    • Click the Add Statement button.
    • Apply the policy and exit.
  • The from email address needs to be verified in the SES console before email can be sent (figure 6.19). To do it, click SES in the AWS console, click Email Addresses, and click Verify a New Email Address. Follow the wizard to verify your email.
    Figure 6.19. Basic email sending is easy via SES. Try to move code and features that are used often (such as email sending) into libraries.

6.5.4. Move logic to another file

Building on advice in the previous section, we recommend moving all of your domain/business logic to a different file/library. Your Lambda handler should be a thin wrapper that executes code stored in another file. Having the bulk of your logic in a separate file will lead to an implementation that’s more testable and is much more decoupled from Lambda. If one day you decide to move away from Lambda, you’ll find it easier to port your code to a new serverless compute service.
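
In practice the handler ends up being a couple of lines, as in this sketch (the lib/videos module and its listVideos function are made-up names):

var videos = require('./lib/videos');

exports.handler = function(event, context, callback) {
    // the handler only adapts the Lambda interface to your own code
    videos.listVideos(process.env.BUCKET, callback);
};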

6.6. Testing Lambda functions

There are two main ways you can test a Lambda function. You can run tests locally (or during continuous integration/deployment) and you can test the function once it’s deployed to AWS.

Back in chapter 3, you installed an npm module called run-local-lambda. That package allowed you to invoke a Lambda function locally on the computer and pass in an event, a context, and a callback function. Going forward, however, you need to set up a much more rigorous and robust system for executing tests. You need to have a way to mock dependencies, spy on variables and functions, and manage setup and teardown procedures. This section looks at how to put together a good approach to testing.

6.6.1. Testing locally

Let’s look at how to test Lambda functions on your computer (later, you’ll make these tests run as part of your continuous integration/deployment pipeline). To make this more interesting, you’ll write tests for the get-video-list function you created in section 6.5.1. You’ll use Mocha, Chai, Sinon, and rewire to help you write and run your tests.

In your terminal, change to the directory of the get-video-list function and run the following npm install commands to download the required components:

  • npm install mocha -g. Mocha (http://bit.ly/1VKV1lY) is a JavaScript test framework.
  • npm install chai --save-dev. Chai (http://bit.ly/1pu2xmq) is a test-driven development/behavior-driven development assertion library.
  • npm install sinon --save-dev. Sinon (http://bit.ly/1NIhN5q) is a mocking framework that provides spies, stubs, and mocks.
  • npm install rewire --save-dev. Rewire (http://bit.ly/1YNPj05) is a framework for monkey-patching and overriding dependencies in Node.js unit tests.

We should note that Lambda functions are no different from any other regular Node.js application. You can use a different JavaScript framework or assertion library if you prefer.

6.6.2. Writing tests

Having installed the modules, create a new subdirectory called test. In this subdirectory create a file called test.js and open it in your favorite text editor. Copy the next listing into this file.

Listing 6.8. Tests for the get-video-list function
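
Here’s a sketch of what test.js could contain. It assumes an implementation of get-video-list along the lines of the earlier sketch (a module-level s3 client and a callback that receives an object with a urls array); if your implementation differs, adjust the stubs and assertions accordingly.

'use strict';

var chai = require('chai');
var sinon = require('sinon');
var rewire = require('rewire');
var expect = chai.expect;

describe('get-video-list', function() {
    var lambda;
    var callbackSpy;

    var sampleData = {
        Contents: [
            {Key: 'file1.mp4'},
            {Key: 'file2.mp4'}
        ]
    };

    function getModule() {
        // load a fresh copy of the Lambda function for the test run
        return rewire('../index.js');
    }

    before(function() {
        process.env.BUCKET = 'test-bucket';
        process.env.BASE_URL = 'https://s3.amazonaws.com';

        callbackSpy = sinon.spy();

        lambda = getModule();
        lambda.__set__('s3', {
            listObjects: sinon.stub().yields(null, sampleData)   // fake S3 response
        });

        lambda.handler({}, {}, callbackSpy);
    });

    it('should invoke the callback exactly once', function() {
        expect(callbackSpy.calledOnce).to.equal(true);
    });

    it('should return two mp4 URLs to the caller', function() {
        var error = callbackSpy.firstCall.args[0];
        var result = callbackSpy.firstCall.args[1];

        expect(error).to.equal(null);
        expect(result.urls).to.have.length(2);
    });
});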

Having implemented listing 6.8, run mocha from the directory of the function. You should see output similar to figure 6.20.

Figure 6.20. You can run tests locally and as part of your continuous deployment pipeline.

Let’s do a quick review of what happens in your test file:

  • You import Chai, Sinon, and rewire and create a sampleData data object that has test data.
  • The before hook creates stubs and spies and calls the getModule function to get a copy of your Lambda function. You use rewire to monkey-patch your Lambda function so that when a request to S3 is made (s3.listObjects()) you return a list of objects defined previously in your test file.
  • You declare two tests. Both tests check your spy. The first test checks that Lambda’s callback function was invoked only once. The second test checks that the arguments passed into Lambda’s callback function are what you expect. The second test is a great way to check that a response from a Lambda function is valid in a RequestResponse invocation.

Testing is a broad and complex subject that will take time to master and get right. Thankfully, testing Lambda functions is straightforward compared to more complex Node.js applications. If you want another example, similar to the one you just completed, have a look at http://bit.ly/1MQc4zO.

Our recommendation is to write tests for each Lambda function you create. Once you have a template for wiring up and mocking dependencies (and you can begin by taking listing 6.8 and repurposing it for your needs), creating new tests should become relatively easy.

6.6.3. Testing in AWS

You’ve written a number of wonderful tests on your system and deployed a function to AWS. It’s a good idea to test your function in AWS to make sure that it works as you expect. An obvious way to do it is to click the Test button in the console and provide an event for your function to consume. Luckily, there’s an even better way to do testing using the unit and load test harness blueprint provided by AWS. This blueprint creates a Lambda function for you that can invoke a Lambda function you wish to test and record the results in a DynamoDB table.

Tim Wagner originally wrote about this in a blog post (http://amzn.to/1Nq37Nx) titled “A Simple Serverless Test Harness using AWS Lambda.” You can configure this test harness as follows:

  1. Create a new role called lambda-dynamo and add an inline policy to it. Add lambda:InvokeFunction and dynamodb:PutItem actions to that policy. For the lambda:InvokeFunction action, set the ARN as arn:aws:lambda:*:*:*.
  2. Create a new table in DynamoDB and name it unit-test-results. Set the Partition Key to testId. Accept all other default settings.
  3. In the Lambda’s console, click Create a Lambda Function and search for lambda-test-harness among the available blueprints (figure 6.21).
    Figure 6.21. Look through the various blueprints on offer. Some of them will give you ideas, and others might save you time.

  4. Create a new function from the lambda-test-harness blueprint.
  5. Set lambda-dynamo as the role for the function and leave the timeout at one minute.

To correctly run the lambda-test-harness function, you need to create an event object that describes a test configuration and pass it to the test harness function. The next listing shows an example test configuration for the get-video-list function you developed earlier in this chapter. Note that the event is empty because get-video-list doesn’t use it during execution.

Listing 6.9. Example unit test configuration
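
A configuration along these lines should work. The key names here follow the blueprint’s documented event format at the time of writing, so double-check them against the blueprint code you deployed; the testId value is arbitrary.

{
  "operation": "unit",
  "function": "get-video-list",
  "resultsTable": "unit-test-results",
  "testId": "get-video-list-unit-001",
  "event": {}
}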

To run the test harness function, click the Test button in the AWS console (assuming you have the function open) and type the configuration from listing 6.9 into the input test event dialog. Then click Save and Test.

Look at the unit-test-results table in DynamoDB for information after executing the function to see the results. To perform a load test, modify the configuration slightly by changing the operation from unit to load and setting a desired number of iterations. The next listing shows a configuration you can use to execute the function 50 times.

Listing 6.10. Example load test configuration
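
With the same caveat about key names, a load test configuration might look like this:

{
  "operation": "load",
  "iterations": 50,
  "function": "get-video-list",
  "resultsTable": "unit-test-results",
  "testId": "get-video-list-load-001",
  "event": {}
}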

Having a mix of tests to run in your local environment and in AWS will give you confidence to make changes and improvements. Don’t neglect testing, and do it from the very start.

6.7. Exercises

You learned a lot about Lambda in this chapter, but there’s no better way to test your knowledge than trying a few exercises. See if you can do the following:

  1. Create a Lambda function to check if a given string is a palindrome. Your function should get the string via an environment variable.
  2. Implement the email-sending library given in section 6.5.3 and include it in the transcode-video Lambda function. Modify the transcode-video function to send you an email whenever a new transcoding job is created.
  3. Package the email-sending library as an npm module and deploy it to the npm repository (see http://bit.ly/1r6heOf for more information on how to do this). When it’s deployed, install it into a Lambda function of your choice using npm install <your-module> --save. Update the Lambda function so that it sends email and test it.
  4. Create a new Lambda function to send yourself an email every 24 hours. Can you automate it?
  5. Add a breaking test to the tests you implemented in section 6.6.2 and run mocha to see what it looks like. Either fix this test or remove it before you proceed to the next question.
  6. Write a test for each of the Lambda functions you created in chapters 3 and 5.
  7. Create a unit and load-test harness in AWS to test existing Lambda functions. Create a new Lambda function that triggers when new DynamoDB records are inserted and sends you an email with the results of those records.

6.8. Summary

A compute service such as Lambda is the heart of serverless architecture. It’s the glue that holds everything together. It can serve as a back end for your application or act as a coordinator between other services in your system. In this chapter, we looked at the following:

  • Lambda’s core principles including invocation types and event models
  • Programming model
  • Versioning, aliases, and environment variables
  • Usage of the CLI
  • Patterns such as async waterfall and creation of libraries
  • Testing of Lambda functions locally and in AWS

In the next chapter, we’ll look at the API Gateway and discuss how to create robust back ends for web and mobile applications.
