Concurrency and error handling

The next pieces of information you should be aware of before putting any Lambda-backed solutions into production are how to manage execution concurrency and what happens when things go wrong.

Quick definition check: concurrency is the number of executions of the same Lambda function that are in flight at any given time.

Lambda comes with an out-of-the-box limit of 1,000 concurrent executions per account. While this might seem low, remember that when you have something that is potentially limitless in scale, all you're doing is shifting the performance bottleneck to the next component. There aren't many RDBMSes that can handle millions of transactions per second, so let's step back and look at how we're managing our data, queries, and throughput before raising that concurrency limit (which can be done by raising a support request). Our options are to either match our Lambda execution rate to the throughput of the downstream resources or enforce more appropriate batch sizes upstream. Both options are examples of rate limiting; they differ only in where in the stack the limit is applied.
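As a concrete illustration of the second option, here is a minimal sketch using boto3 that shrinks the batch size of an existing event source mapping so each invocation pushes less work downstream. The mapping UUID and batch size are placeholders, not values from this chapter:

    import boto3

    lambda_client = boto3.client("lambda")

    # Upstream rate limiting: lower the batch size on an existing event
    # source mapping so each invocation processes fewer records.
    response = lambda_client.update_event_source_mapping(
        UUID="14e0db71-0000-0000-0000-000000000000",  # hypothetical mapping UUID
        BatchSize=10,  # fewer records per invocation means gentler downstream load
    )
    print(response["State"])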

On our Lambda function's configuration page in the console, we can reserve some of that concurrency for the function to use, since the limit is account-wide and applies to all the functions running in that account. If another team or application runs a particularly busy function in the same account, your function could be throttled when the account reaches the service limit, unless you have reserved some capacity. Be aware, though: if we set a reserved concurrency of 200 executions, the total number of concurrent executions for that function can't exceed 200. So, treat this function configuration setting as a new per-function limit (one that you can change at any time through configuration):

Setting the reserved concurrency for a Lambda function
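If you would rather script this than click through the console, the same setting can be applied with the AWS SDK. A minimal sketch with boto3, where the function name is a placeholder:

    import boto3

    lambda_client = boto3.client("lambda")

    # Reserve 200 concurrent executions for this function. Remember that
    # this also caps the function at 200, so treat it as a new limit.
    lambda_client.put_function_concurrency(
        FunctionName="my-busy-function",  # hypothetical function name
        ReservedConcurrentExecutions=200,
    )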

Now that we understand our limits and how to tune them to our advantage, what we need is a way to monitor concurrency so that we can take action if it approaches the limit.

To do this, Lambda gives you two CloudWatch metrics to keep an eye on:

  • ConcurrentExecutions: This is available as a metric for each function, and also as an aggregate of the execution counts for all the functions in the account. If this is running close to the service limit, you can expect some functions to be throttled when you hit it (see the alarm sketch after this list).
  • UnreservedConcurrentExecutions: Again, this is an aggregate of execution counts, but, this time, the functions with reserved concurrency are excluded. 
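To take action before you reach the ceiling, you can alarm on these metrics. Here is a minimal sketch with boto3 that raises a CloudWatch alarm when the account-wide ConcurrentExecutions metric comes within roughly 10% of the default 1,000 limit; the alarm name and threshold are illustrative, not prescribed:

    import boto3

    cloudwatch = boto3.client("cloudwatch")

    # Alarm when account-wide concurrency nears the default 1,000 limit.
    # Omitting the FunctionName dimension gives the account-level aggregate.
    cloudwatch.put_metric_alarm(
        AlarmName="lambda-concurrency-near-limit",  # illustrative name
        Namespace="AWS/Lambda",
        MetricName="ConcurrentExecutions",
        Statistic="Maximum",
        Period=60,
        EvaluationPeriods=1,
        Threshold=900,  # illustrative: ~90% of the default account limit
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        TreatMissingData="notBreaching",
    )

In practice, you would also supply an AlarmActions entry pointing at an SNS topic so somebody actually hears about it.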

Okay, let's look at what happens when we reach our limits and invocations are being throttled. What do I mean by throttle? In Lambda, the term refers to a function being invoked but not executed, usually because a service limit has been exceeded. What actually happens when a function is throttled differs depending on the source of the event.

Here are the three throttling behaviors for each invocation type:

  • For functions that are invoked synchronously and subsequently throttled, the client will receive an HTTP 429 error code with a TooManyRequestsException message. The client is then responsible for managing retries from there (see the retry sketch after this list). 
  • If functions are invoked using the asynchronous invocation type and are throttled, Lambda will automatically retry the invocation twice more, with random delays between retries. On the third failure, the event can be redirected to a Dead Letter Queue.
  • Things get more complex if you are throttled during stream-based invocations, so check the documentation for the latest handling logic.
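For the synchronous case, the retry logic lives in your client. Here is a minimal sketch with boto3 that catches the throttling error and backs off before trying again; the function name and retry policy are illustrative:

    import json
    import time

    import boto3
    from botocore.exceptions import ClientError

    lambda_client = boto3.client("lambda")

    def invoke_with_backoff(payload, max_attempts=5):
        """Synchronously invoke a function, backing off when throttled."""
        for attempt in range(max_attempts):
            try:
                return lambda_client.invoke(
                    FunctionName="my-busy-function",  # hypothetical name
                    InvocationType="RequestResponse",
                    Payload=json.dumps(payload),
                )
            except ClientError as err:
                # A throttled synchronous invoke surfaces as HTTP 429 with
                # the TooManyRequestsException error code.
                if err.response["Error"]["Code"] != "TooManyRequestsException":
                    raise
                time.sleep(2 ** attempt)  # simple exponential backoff
        raise RuntimeError("still throttled after retries")

Note that the AWS SDKs already perform some automatic retries on throttling errors; this sketch just makes the handling explicit so you can see where the responsibility sits.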

There is another metric to watch as well, called Throttles, which counts invocation requests that never made it to execution, specifically because of a throttling event. If this metric goes above zero, your clients may be experiencing an impact on the service.
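A quick way to spot-check this is to pull the recent sum of the Throttles metric for a function. A minimal sketch, with the function name as a placeholder:

    from datetime import datetime, timedelta, timezone

    import boto3

    cloudwatch = boto3.client("cloudwatch")

    # Sum the Throttles metric for one function over the last hour; anything
    # above zero means some invocation requests never reached execution.
    now = datetime.now(timezone.utc)
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/Lambda",
        MetricName="Throttles",
        Dimensions=[{"Name": "FunctionName", "Value": "my-busy-function"}],
        StartTime=now - timedelta(hours=1),
        EndTime=now,
        Period=300,
        Statistics=["Sum"],
    )
    total = sum(point["Sum"] for point in stats["Datapoints"])
    print(f"Throttles in the last hour: {total}")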

Regarding Dead Letter Queues (DLQs), these are either an SQS queue or an SNS topic that you create and configure yourself, and then assign as the DLQ of a particular function in that function's configuration. For asynchronous invocations, if none of the retries succeed, the event is sent to the DLQ. This gives you a record of all the failed invocations and their error details. The idea behind having this is that your unprocessed events aren't lost; you have the record and can choose what action to take from there.
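Assigning the DLQ is a small configuration change. A minimal sketch with boto3, assuming the SQS queue already exists and the function's execution role is allowed to send to it; the function name and queue ARN are placeholders:

    import boto3

    lambda_client = boto3.client("lambda")

    # Route failed asynchronous events to an existing SQS queue.
    # An SNS topic ARN would work here as well.
    lambda_client.update_function_configuration(
        FunctionName="my-busy-function",  # hypothetical name
        DeadLetterConfig={
            "TargetArn": "arn:aws:sqs:us-east-1:123456789012:my-function-dlq"
        },
    )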

While concurrency is a characteristic of Lambda that we should call out specifically, observability is a larger objective to meet. Observability means looking at the service or solution you are providing as a whole and pinpointing measurable characteristics. We'll learn more about observability in the next section.
