Environment
Extension version: latest
Runtime: Node.js
reserved_concurrent_executions: 1
Invocation type: synchronous (AWS Step Functions)
Description
When a Lambda function with reserved_concurrent_executions=1 returns its response, the Datadog extension enters the post-runtime flush phase to send telemetry data to Datadog. During this phase, the execution environment is still considered "busy" by AWS Lambda, consuming the single reserved concurrency slot.
If a second invocation arrives immediately after the function has returned its response (but while the extension is still flushing), AWS throttles it with a Lambda.TooManyRequestsException (HTTP 429).
Steps to reproduce
1. Deploy a Lambda with reserved_concurrent_executions=1 and the Datadog extension enabled.
2. Invoke it synchronously from a Step Function, with strictly sequential (never concurrent) invocations.
3. When a second invocation is triggered right after the first one completes, a 429 is returned.
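The sequential invocation pattern can be sketched as a minimal Step Functions state machine. This is a hypothetical definition; the function name is a placeholder, and the real workflow may pass payloads between states:

```json
{
  "Comment": "Two back-to-back synchronous invocations of the same Lambda (sketch)",
  "StartAt": "InvokeFirst",
  "States": {
    "InvokeFirst": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": { "FunctionName": "my-instrumented-fn" },
      "Next": "InvokeSecond"
    },
    "InvokeSecond": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke",
      "Parameters": { "FunctionName": "my-instrumented-fn" },
      "End": true
    }
  }
}
```

Even though InvokeSecond only starts after InvokeFirst's response is received, it can still hit the throttle if the single execution environment is held by the extension's post-runtime flush.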
Expected behavior
The 429 should not occur between two sequential (non-concurrent) invocations.
Observed behavior
Lambda.TooManyRequestsException is raised on the second call because the extension's post-runtime flush keeps the execution environment busy beyond the function's response time.
Workarounds
- Setting reserved_concurrent_executions=2 absorbs the overlap between the post-runtime flush of invocation N and the start of invocation N+1.
- Setting DD_SERVERLESS_FLUSH_STRATEGY=periodically defers the flush to a periodic interval. While this reduces the frequency of the issue, it is not acceptable in our case: we require complete telemetry coverage for every Lambda invocation. With a periodic strategy, invocations that complete between two flush intervals may have their telemetry dropped or delayed, making observability unreliable.
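For reference, both workarounds can be expressed in deployment configuration. The excerpt below is a hypothetical Serverless Framework fragment (function name and flush interval are placeholders; the periodically,&lt;interval-ms&gt; value format is assumed from Datadog's flush-strategy setting):

```yaml
# Hypothetical serverless.yml excerpt illustrating the two workarounds.
functions:
  worker:
    handler: src/handler.main
    # Workaround 1: allow one extra slot to absorb the post-runtime
    # flush of invocation N overlapping the start of invocation N+1.
    reservedConcurrency: 2
    environment:
      # Workaround 2: defer flushing to a periodic interval (value in ms).
      # Rejected in our case because per-invocation telemetry may be
      # delayed or dropped between intervals.
      DD_SERVERLESS_FLUSH_STRATEGY: "periodically,20000"
```

Neither option is satisfactory: the first doubles the reserved concurrency for a workload that is strictly sequential, and the second sacrifices per-invocation telemetry guarantees.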
Requested solution
We are looking for a solution that allows the extension to flush telemetry without blocking the concurrency slot, so that reserved_concurrent_executions=1 remains usable for sequential workloads. Ideally, the extension would either:
- Release the execution environment to AWS before completing its flush, or
- Expose a configuration option to cap the post-runtime flush duration, so the slot is not held beyond an acceptable threshold, without dropping telemetry.