2 changes: 2 additions & 0 deletions docs/patterns/best-practices/index.md
@@ -14,3 +14,5 @@ Best practices for your Durable Execution SDK workflow code.
and polling external services with backoff.
- [Code organization](code-organization.md) Separating business logic from
orchestration, using child contexts, and grouping configuration.
- [Saga pattern](saga-pattern.md) Implementing sagas with compensation logic
for distributed transactions.
140 changes: 140 additions & 0 deletions docs/patterns/best-practices/saga-pattern.md
@@ -0,0 +1,140 @@
# Saga Pattern

The Saga pattern lets a multi-step workflow roll back gracefully to a consistent state when one of its steps fails.

The core idea is to record a compensating step for each forward step that succeeds. If any step in the workflow fails, execute the compensating steps for all previously completed steps in reverse order. This ensures that the system returns to a consistent state even when something goes wrong.
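The record-then-unwind idea can be sketched in plain Java without the SDK. This is a minimal illustration, not the SDK's implementation: `SagaSketch`, `runOrder`, and the `log` list are hypothetical names, and the forward steps are simulated by log entries.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class SagaSketch {
    // Records every action so the order of operations can be inspected.
    static final List<String> log = new ArrayList<>();

    static void runOrder(boolean failAtShipment) {
        // Stack of compensations: the most recently registered one runs first.
        Deque<Runnable> compensations = new ArrayDeque<>();
        try {
            log.add("reserve inventory");
            compensations.push(() -> log.add("cancel reservation"));

            log.add("charge payment");
            compensations.push(() -> log.add("refund payment"));

            if (failAtShipment) {
                throw new IllegalStateException("shipment failed");
            }
            log.add("fulfill shipment");
        } catch (RuntimeException e) {
            // Undo completed steps in reverse order of completion.
            // A real workflow would typically rethrow the error afterwards.
            while (!compensations.isEmpty()) {
                compensations.pop().run();
            }
        }
    }

    public static void main(String[] args) {
        runOrder(true);
        // prints [reserve inventory, charge payment, refund payment, cancel reservation]
        System.out.println(log);
    }
}
```

A stack is used here only because popping it yields reverse order for free; a list iterated backwards works equally well.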

Durable functions are well suited to sagas because:

- Each step is automatically retried on transient failures
- Compensating actions that run as durable steps are checkpointed and retried in the same way
- If Lambda crashes mid-compensation, execution resumes from the last checkpoint

## Example

Consider an order processing workflow with three steps: reserve inventory, charge payment, and fulfill shipment. If the shipment fulfillment step fails, the workflow must refund the payment and cancel the inventory reservation so that stock is not held indefinitely.

```
Forward: reserve inventory → charge payment → fulfill shipment
Failure at fulfill shipment:
Compensate: refund payment → cancel reservation
```

=== "TypeScript"

    ```typescript
    --8<-- "examples/typescript/patterns/saga-pattern/saga-pattern.ts"
    ```
=== "Python"

    ```python
    --8<-- "examples/python/patterns/saga-pattern/saga-pattern.py"
    ```
=== "Java"

    ```java
    --8<-- "examples/java/patterns/saga-pattern/saga-pattern.java"
    ```

## Parallel Steps

If steps are independent, they can be executed concurrently using the `context.parallel()` method.
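To illustrate the shape of a saga with independent forward steps, here is a minimal sketch that uses the JDK's `CompletableFuture` in place of `context.parallel()`. It is not the SDK API: `ParallelSagaSketch`, `process`, and the mock service methods are hypothetical, and nothing here is checkpointed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class ParallelSagaSketch {
    // Mock services for demonstration purposes.
    static String reserveInventory(String orderId) {
        return "RES-" + orderId;
    }

    static boolean verifyAddress(String address) {
        return address != null && !address.isEmpty();
    }

    static List<String> process(String orderId, String address) {
        List<String> log = new ArrayList<>();
        Deque<Runnable> compensations = new ArrayDeque<>();
        try {
            // The two pre-checks do not depend on each other, so run them concurrently.
            CompletableFuture<String> reservation =
                CompletableFuture.supplyAsync(() -> reserveInventory(orderId));
            CompletableFuture<Boolean> addressValid =
                CompletableFuture.supplyAsync(() -> verifyAddress(address));

            // Wait for both branches before continuing.
            String reservationId = reservation.join();
            boolean valid = addressValid.join();

            // Only the reservation has a side effect that needs undoing.
            compensations.push(() -> log.add("cancel " + reservationId));

            if (!valid) {
                throw new IllegalArgumentException("Invalid shipping address");
            }
            log.add("charged " + orderId);
        } catch (RuntimeException e) {
            // Run compensations in reverse order of registration.
            while (!compensations.isEmpty()) {
                compensations.pop().run();
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(process("42", ""));          // prints [cancel RES-42]
        System.out.println(process("42", "1 Main St")); // prints [charged 42]
    }
}
```

The compensation is registered only after both branches have joined, so the log is written from a single thread and the stateless address check contributes nothing to the rollback.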

=== "TypeScript"

    ```typescript
    --8<-- "examples/typescript/patterns/saga-pattern/saga-pattern-parallel.ts"
    ```
=== "Python"

    ```python
    --8<-- "examples/python/patterns/saga-pattern/saga-pattern-parallel.py"
    ```
=== "Java"

    ```java
    --8<-- "examples/java/patterns/saga-pattern/saga-pattern-parallel.java"
    ```

## Idempotency of Compensating Steps

Compensating steps must be idempotent. The SDK retries a durable compensation step if it
fails due to a transient error, so the step can execute more than once. Design
compensation logic to handle duplicate execution gracefully.

For example, a step that cancels an inventory reservation should succeed even if the reservation was already cancelled.
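One way to get this behavior is to make the cancellation a conditional state transition. The sketch below assumes a hypothetical in-memory inventory store (`IdempotentCancel`, `reservations`, `stockReleased`); a real service would apply the same check against its own database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class IdempotentCancel {
    enum Status { RESERVED, CANCELLED }

    // Hypothetical in-memory stand-in for an inventory service.
    static final Map<String, Status> reservations = new ConcurrentHashMap<>();
    // Counts how many times stock was actually returned to inventory.
    static final AtomicInteger stockReleased = new AtomicInteger();

    /** Cancels a reservation; repeating the call has no further effect. */
    static void cancelReservation(String reservationId) {
        // Atomic RESERVED -> CANCELLED transition. Returns false when the
        // reservation is already cancelled or unknown, so stock is released once.
        boolean transitioned =
            reservations.replace(reservationId, Status.RESERVED, Status.CANCELLED);
        if (transitioned) {
            stockReleased.incrementAndGet();
        }
    }

    public static void main(String[] args) {
        reservations.put("RES-1", Status.RESERVED);
        cancelReservation("RES-1");
        cancelReservation("RES-1"); // duplicate execution, for example after a retry
        System.out.println(stockReleased.get()); // prints 1
    }
}
```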

!!! warning
    Durable compensation steps can execute more than once, so design them to be idempotent. Undoing a forward step that has already been undone must not raise an error or produce further side effects.

=== "TypeScript"

    ```typescript
    --8<-- "examples/typescript/patterns/saga-pattern/idempotent-compensation.ts"
    ```
=== "Python"

    ```python
    --8<-- "examples/python/patterns/saga-pattern/idempotent-compensation.py"
    ```
=== "Java"

    ```java
    --8<-- "examples/java/patterns/saga-pattern/idempotent-compensation.java"
    ```


## Common Pitfalls

**1. Non-durable compensations**

Always wrap each compensation in a durable step. A compensation that runs outside a durable step is neither checkpointed nor retried when it fails.

!!! warning
    If a compensation fails outside a durable step, the exception aborts the loop. The failed
    compensation is not retried and all remaining compensations are skipped, leaving
    the system partially rolled back.

=== "TypeScript"

    ```typescript
    --8<-- "examples/typescript/patterns/saga-pattern/pitfall1-non-durable.ts"
    ```
=== "Python"

    ```python
    --8<-- "examples/python/patterns/saga-pattern/pitfall1-non-durable.py"
    ```
=== "Java"

    ```java
    --8<-- "examples/java/patterns/saga-pattern/pitfall1-non-durable.java"
    ```

**2. Compensations that depend on non-deterministic data**

The compensations list lives outside any step, so it is rebuilt on every replay. It is always rebuilt in the same order because the steps that add to it replay in the same order. However, a compensation closure must never capture non-deterministic data, that is, values computed outside a step such as timestamps or random IDs, because those values change on replay. Capture step return values instead, which are stable on replay.
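The difference between a value computed outside a step and a step result can be seen in a small replay simulation. This sketch is not the SDK: `ReplaySketch`, its `step` helper, and the `checkpoints` map are hypothetical stand-ins for checkpointing, and a fake counter replaces the system clock so the output is reproducible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

public class ReplaySketch {
    // Hypothetical stand-in for a checkpoint store: step name -> saved result.
    static final Map<String, Object> checkpoints = new HashMap<>();
    // Fake clock that advances on every read, like wall-clock time between replays.
    static final AtomicLong clock = new AtomicLong(1000);

    // Runs fn the first time a step name is seen; afterwards returns the saved result.
    @SuppressWarnings("unchecked")
    static <T> T step(String name, Supplier<T> fn) {
        return (T) checkpoints.computeIfAbsent(name, k -> fn.get());
    }

    // One pass over the workflow code. Returns the two values a compensation
    // closure might capture: { value computed outside a step, step result }.
    static String[] runWorkflow() {
        long outside = clock.incrementAndGet(); // recomputed on every pass
        String reservationId = step("reserve", () -> "RES-" + clock.incrementAndGet());
        return new String[] { String.valueOf(outside), reservationId };
    }

    public static void main(String[] args) {
        String[] first = runWorkflow();
        String[] replay = runWorkflow(); // simulates a replay of the same execution
        System.out.println(first[0] + " " + first[1]);   // prints 1001 RES-1002
        System.out.println(replay[0] + " " + replay[1]); // prints 1003 RES-1002
    }
}
```

The step result is identical across both passes, while the value computed outside the step differs, which is why only the former is safe to capture in a compensation.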

=== "TypeScript"

    ```typescript
    --8<-- "examples/typescript/patterns/saga-pattern/pitfall2-in-memory-state.ts"
    ```
=== "Python"

    ```python
    --8<-- "examples/python/patterns/saga-pattern/pitfall2-in-memory-state.py"
    ```
=== "Java"

    ```java
    --8<-- "examples/java/patterns/saga-pattern/pitfall2-in-memory-state.java"
    ```

## See Also

- [Steps](../../sdk-reference/operations/step.md)
- [Parallel](../../sdk-reference/operations/parallel.md)
- [Error Handling](../../sdk-reference/error-handling/errors.md)
- [Determinism and Replay](./determinism.md)
- [Idempotency and Retries](./idempotency.md)
3 changes: 3 additions & 0 deletions docs/patterns/index.md
@@ -22,3 +22,6 @@ Best practices for deterministic workflows.
timeouts, heartbeats, and polling external services with backoff.
- [Code organization](best-practices/code-organization.md) Separating business logic
from orchestration, using child contexts, and grouping configuration.
- [Saga pattern](best-practices/saga-pattern.md) Implementing sagas with compensation logic
for distributed transactions.

22 changes: 22 additions & 0 deletions examples/java/patterns/saga-pattern/idempotent-compensation.java
@@ -0,0 +1,22 @@
try {
    Map<String, Object> reservation = context.step("reserve-inventory", Map.class,
        ctx -> reserveInventory(orderId));
    compensations.add(new Compensation(
        "cancel-reservation",
        // The reservation ID is the idempotency key; the service uses it to deduplicate
        () -> cancelReservation((String) reservation.get("id"))
    ));

    // ... more steps ...

} catch (Exception error) {
    // Run compensations in reverse order, each as its own durable step
    for (int i = compensations.size() - 1; i >= 0; i--) {
        Compensation comp = compensations.get(i);
        try {
            context.step(comp.name, Void.class, ctx -> { comp.fn.run(); return null; });
        } catch (Exception compError) {
            context.getLogger().error("Compensation failed: " + comp.name);
        }
    }
    throw error;
}
14 changes: 14 additions & 0 deletions examples/java/patterns/saga-pattern/pitfall1-non-durable.java
@@ -0,0 +1,14 @@
// WRONG - compensations run outside a durable step
for (int i = compensations.size() - 1; i >= 0; i--) {
    compensations.get(i).fn.run(); // if this throws, the remaining compensations never run
}

// CORRECT - each compensation is a durable step, and one failure does not stop the rest
for (int i = compensations.size() - 1; i >= 0; i--) {
    Compensation comp = compensations.get(i);
    try {
        context.step(comp.name, Void.class, ctx -> { comp.fn.run(); return null; });
    } catch (Exception e) {
        context.getLogger().error("Compensation failed: " + comp.name);
    }
}
12 changes: 12 additions & 0 deletions examples/java/patterns/saga-pattern/pitfall2-in-memory-state.java
@@ -0,0 +1,12 @@
// WRONG - timestamp outside a step, changes on replay
long timestamp = System.currentTimeMillis();
compensations.add(new Compensation("cancel",
    () -> cancelWithTimestamp((String) reservation.get("id"), timestamp)
));

// CORRECT - data from step return value, stable on replay
Map<String, Object> reservation = context.step("reserve", Map.class,
    ctx -> reserveInventory(orderId));
compensations.add(new Compensation("cancel-reservation",
    () -> cancelReservation((String) reservation.get("id"))
));
123 changes: 123 additions & 0 deletions examples/java/patterns/saga-pattern/saga-pattern-parallel.java
@@ -0,0 +1,123 @@
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

import software.amazon.lambda.durable.DurableContext;
import software.amazon.lambda.durable.DurableHandler;
import software.amazon.lambda.durable.DurableFuture;
import software.amazon.lambda.durable.TypeToken;

public class ParallelSaga extends DurableHandler<Map<String, Object>, Map<String, Object>> {

    static class Compensation {
        final String name;
        final Runnable fn;

        Compensation(String name, Runnable fn) {
            this.name = name;
            this.fn = fn;
        }
    }

    @Override
    public Map<String, Object> handleRequest(Map<String, Object> event, DurableContext context) {
        List<Compensation> compensations = new ArrayList<>();

        try {
            String orderId = (String) event.get("order_id");
            double amount = ((Number) event.get("amount")).doubleValue();
            Map<?, ?> address = (Map<?, ?>) event.get("address");

            // Reserve inventory AND verify address in parallel
            DurableFuture<Map<String, Object>> reservationFuture;
            DurableFuture<Map<String, Object>> addressFuture;

            try (var parallel = context.parallel("pre-checks")) {
                reservationFuture = parallel.branch("reserve-inventory",
                    new TypeToken<Map<String, Object>>() {},
                    ctx -> ctx.step("reserve", Map.class, s -> reserveInventory(orderId))
                );
                addressFuture = parallel.branch("verify-address",
                    new TypeToken<Map<String, Object>>() {},
                    ctx -> ctx.step("address", Map.class, s -> verifyAddress(address))
                );
            }

            List<Map<String, Object>> results = DurableFuture.allOf(reservationFuture, addressFuture);
            Map<String, Object> reservation = results.get(0);
            Map<String, Object> addressVerification = results.get(1);

            // Only reserve-inventory needs a compensation, verify-address is stateless
            compensations.add(new Compensation(
                "cancel-reservation",
                () -> cancelReservation((String) reservation.get("id"))
            ));

            // Stop execution if address is invalid, catch block will cancel reservation
            if (!"true".equals(addressVerification.get("valid"))) {
                throw new IllegalArgumentException("Invalid shipping address");
            }

            // Charge payment
            Map<String, Object> payment = context.step("charge-payment", Map.class,
                ctx -> chargePayment(amount)
            );
            compensations.add(new Compensation(
                "refund-payment",
                () -> refundPayment((String) payment.get("id"))
            ));

            // Create shipment
            Map<String, Object> shipment = context.step("create-shipment", Map.class,
                ctx -> createShipment(orderId)
            );

            context.getLogger().info("Order completed: " + orderId);
            return Map.of("success", true, "tracking_id", shipment.get("tracking_id"));

        } catch (Exception error) {
            context.getLogger().error("Order failed, running compensations: " + error.getMessage());

            // Run compensations in reverse order, each as its own durable step
            for (int i = compensations.size() - 1; i >= 0; i--) {
                Compensation comp = compensations.get(i);
                try {
                    context.step(comp.name, Void.class, ctx -> {
                        comp.fn.run();
                        return null;
                    });
                } catch (Exception compError) {
                    context.getLogger().error("Compensation failed: " + comp.name + " - " + compError.getMessage());
                }
            }

            throw error;
        }
    }

    // Mock APIs for demonstration purposes

    private Map<String, Object> reserveInventory(String orderId) {
        return Map.of("id", "RES-" + orderId);
    }

    private void cancelReservation(String reservationId) {
        System.out.println("Reservation " + reservationId + " cancelled");
    }

    private Map<String, Object> verifyAddress(Map<?, ?> address) {
        // Stateless: no compensation needed
        return Map.of("valid", "true");
    }

    private Map<String, Object> chargePayment(double amount) {
        return Map.of("id", "PAY-" + amount);
    }

    private void refundPayment(String paymentId) {
        System.out.println("Payment " + paymentId + " refunded");
    }

    private Map<String, Object> createShipment(String orderId) {
        return Map.of("tracking_id", "TRACK-" + orderId);
    }
}