Reuse and optimization

In this section, we are going to learn how to leverage the execution model to our advantage. We'll also find out about cold starts and why we need to be mindful of them.

In the early days of the AWS Lambda service, we could make educated estimates about the underlying implementation of the infrastructure. From this introspection, we were able to postulate that our Lambda code was running within some sort of container. Each time we would trigger a function to be executed, an invisible orchestration service would have to find a physical host with capacity, schedule the container for deployment, download and deploy our .zip package from S3, start the container, and manage it until it was terminated. While the code had stopped executing for that one invocation (it was, in fact, frozen), we found that the underlying container was still available to be reused. Subsequent invocations of the same Lambda function had a high chance of being scheduled in the existing container environment.

We could infer this from a few factors:

  • Any temporary files we had created in our /tmp filesystem were still available in the next invocation.
  • Variables that we defined outside of our handler function were still in memory (see the sketch after this list).
  • Background processes and callbacks started in a previous invocation might not yet have completed.
  • The function executed a whole lot quicker.
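
As a quick illustration of the second point, here is a minimal Python sketch (not taken from any particular service) that counts invocations using a module-level variable; the count only climbs above 1 because the same container, and therefore the same module state, is being reused:

import os

# Module-level state: initialized once when the container is created, then
# reused by every invocation that is scheduled into the same container.
invocation_count = 0

def handler(event, context):
    global invocation_count
    invocation_count += 1

    # A count greater than 1 tells us this container has been reused. The
    # /tmp listing will also show any files left behind by earlier invocations.
    return {
        "invocation_count": invocation_count,
        "tmp_contents": os.listdir("/tmp"),
    }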

In addition to this, we found that the reuse of the same container was not guaranteed. The container would exist for an indeterminate amount of time until it was either cleaned up by the invisible orchestration service or the invocations of our function reached a point where the Lambda service needed to scale up the number of deployed containers.

So, while this gave us developers an opportunity to take advantage of some persistence, we couldn't count on it all the time. This leads us to our first pro tip.

Declare your database connections outside the scope of the handler function so that they can potentially be reused in subsequent invocations, saving the time it takes to establish a connection on every invocation.
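
As a sketch of what this looks like in Python, assuming a hypothetical PostgreSQL database with connection details held in environment variables (the psycopg2 usage here is illustrative; any database client follows the same pattern):

import os
import psycopg2  # illustrative dependency

# Created once per container during the cold start, then reused by every
# invocation that lands in the same container.
connection = psycopg2.connect(
    host=os.environ["DB_HOST"],          # hypothetical environment variables
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
    dbname=os.environ["DB_NAME"],
)

def handler(event, context):
    # No connection setup here, so warm invocations skip that cost entirely.
    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
        return cursor.fetchone()

Because reuse is never guaranteed, production code would typically check that the connection is still healthy before using it and reconnect if necessary.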

The same goes for the /tmp directory. If you have a file that you are processing or working on, saving it here is a good way to enable processing to resume in a later invocation, should the function reach its timeout period.
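
A minimal sketch of that idea, assuming a hypothetical job that works through a list of items and records how far it has gotten in a checkpoint file:

import json
import os

CHECKPOINT_FILE = "/tmp/checkpoint.json"  # hypothetical file name

def load_checkpoint():
    # /tmp lives as long as the container does, so a checkpoint written by an
    # earlier invocation may still be here if the container was reused.
    if os.path.exists(CHECKPOINT_FILE):
        with open(CHECKPOINT_FILE) as f:
            return json.load(f)
    return {"last_processed": 0}

def handler(event, context):
    state = load_checkpoint()
    for index in range(state["last_processed"], event.get("total_items", 0)):
        # ... process item number `index` here ...
        state["last_processed"] = index + 1
        with open(CHECKPOINT_FILE, "w") as f:
            json.dump(state, f)

Since container reuse is never guaranteed, treat this as an optimization; anything you truly cannot afford to lose belongs in durable storage such as S3.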

The amount of time it takes for a new container to be scheduled and readied for your code to start running is known as a cold start. In this scenario, you incur a time penalty for every new container. The length of the penalty varies between languages, but it can be reduced in all cases by making your deployment package as small as possible. This reduces the time that's needed to download the package from S3 and load it into the new container. One other way to avoid this penalty is to always keep your function warm. We know that containers can last anywhere from 5 to 60 minutes before they are cleaned up, which means we just need to invoke our own functions with an event every 5 or so minutes to make it less likely for our end users to be impacted by the penalty.
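
A sketch of that warming pattern, assuming a scheduled rule (for example, a CloudWatch Events rule firing every 5 minutes) that sends a hypothetical {"warmup": true} event to the function:

def handler(event, context):
    # Short-circuit warm-up events so they stay cheap and fast, and so they
    # don't trigger any real processing.
    if isinstance(event, dict) and event.get("warmup"):
        return {"warmed": True}

    # ... normal processing for real events ...
    return {"status": "processed"}

Note that a single scheduled event only keeps one container warm; concurrent requests beyond that will still hit cold containers.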

I did preface this section by saying that this is what happened in the early days of Lambda. Sometime between 2017 and 2018, the Lambda service moved to a new underlying implementation using micro virtual machines, or microVMs. Dubbed Firecracker, this new virtualization technology is tipped to be the next evolution in how to run massively multi-tenanted functions and serverless workloads on hardware. Firecracker is a virtual machine manager that essentially replaces QEMU, presenting kernel resources to microVMs via KVM. The benefits of using microVMs over containers are as follows:

  • Higher levels of security isolation
  • Greater density of VMs on the same physical resources
  • Efficient startup times
  • Specifically designed for short-lived transient processes

While this technology is pretty new, we expect the same best practice patterns for Lambda development to still apply.

The next thing to talk about is more focused on optimization once the Lambda function is running. Once we have a good baseline for how our function performs and how long it takes to complete execution, we can start playing with the memory configuration. At the time of writing, you can allocate between 128 MB and 3,008 MB of memory for your function to run with. The Lambda service also allocates CPU proportional to the amount of memory allocated, which means that, if you allocate more memory, you get more CPU to play with as a bonus. More memory also means a higher cost point per execution, but what if the extra CPU means the function can complete sooner? There is no silver bullet here, but I recommend that you thoroughly test your function with different memory configurations to find the best setting. You might end up saving heaps on your bill!
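
If you want to automate that testing, here is a rough sketch using boto3 (the function name and test payload are placeholders) that re-runs a hypothetical function at several memory sizes and pulls the timing out of the REPORT line in the log tail:

import base64
import json

import boto3

lambda_client = boto3.client("lambda")
FUNCTION_NAME = "my-function"  # placeholder function name

for memory_mb in (128, 256, 512, 1024):
    # Change the memory allocation and wait for the update to finish.
    lambda_client.update_function_configuration(
        FunctionName=FUNCTION_NAME, MemorySize=memory_mb
    )
    lambda_client.get_waiter("function_updated").wait(FunctionName=FUNCTION_NAME)

    # Invoke synchronously and ask for the tail of the execution log.
    response = lambda_client.invoke(
        FunctionName=FUNCTION_NAME,
        LogType="Tail",
        Payload=json.dumps({"test": True}).encode(),
    )
    log_tail = base64.b64decode(response["LogResult"]).decode()

    # The REPORT line contains Duration, Billed Duration, and Max Memory Used.
    for line in log_tail.splitlines():
        if line.startswith("REPORT"):
            print(memory_mb, "MB:", line)

Run each configuration more than once so that cold starts don't skew the comparison.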

This next section is all about using the ecosystem—but what do I mean by this? Let's take a look.
