Step telemetry writeToStream 409 "Run is already completed" after a successful workflow eager-closes its writable mid-run #1949

@tomcgs

Summary

In @workflow/world-vercel@4.1.2 (transitive dep of workflow@4.2.4), step telemetry's internal writeToStream call rejects with HTTP 409 {"success":false,"error":"conflict","message":"Run \"wrun_…\" is already completed"}. This happens when a workflow function calls writable.close() on the writable obtained from getWritable() before the function returns, and "use step" bodies then run after that close.

The rejection becomes an unhandled promise rejection (mechanism: auto.node.onunhandledrejection), which Sentry captures as an error event. We've seen 6 events across 4 production deployments in the last 4 days. At a sample rate of 0.1, that implies roughly 60 actual occurrences over those 4 days for our chat traffic.

Reproduction shape (real workflow)

import { getWritable, getWorkflowMetadata } from "workflow";

export async function chatTurn(input: { sessionId: string; promptParts: any[] }) {
  "use workflow";
  const writable = getWritable<UIMessageChunk>();
  const { workflowRunId } = getWorkflowMetadata();

  try {
    const turn = await runAgentTurn(input.sessionId, input.promptParts);

    await clearActiveStream({ sessionId: input.sessionId, streamId: workflowRunId });

    // Eager-close the durable stream INLINE (no step hop) right after the
    // CAS claim is released. Drops 30-50s of "Generating…" UI sparkle on
    // the success path because client thread.isRunning flips to false
    // immediately instead of waiting for the post-finish step fanout.
    if (turn.status === "success") {
      await writable.close();
    }

    // These run as steps AFTER the writable is closed.
    // Their step telemetry's writeToStream() then rejects with 409.
    await persistTurnMessages({ ... });   // "use step" inside
    await runPostFinish({ ... });          // Promise.allSettled of steps inside

    return { sessionId: input.sessionId, status: turn.status };
  } finally {
    await sendFinish(writable);
    await closeStream(writable);
  }
}

Stack trace from production

Error: Stream write failed: HTTP 409: {"success":false,"error":"conflict","message":"Run \"wrun_01KQKXJQGJ78X9YSF34QG3GHFX\" is already completed"}

at @workflow/world-vercel@4.1.2/src/streamer.ts:119:25 (writeToStream)
   if (!response.ok) {
     throw new Error(`Stream write failed: HTTP ${response.status}: …`);
   }

at @workflow/world-vercel@4.1.2/src/telemetry.ts:50:23 (tg)
   const [tracer, otel] = await Promise.all([getTracer(), getOtelApi()]);

at @workflow/core@4.2.4/src/serialization.ts:502:11 (o)
   await world.writeToStream(name, runId, chunk);

mechanism: auto.node.onunhandledrejection. The 409 propagates up through world.writeToStream (called from serialization.ts), through the telemetry wrapper (telemetry.ts), and out of the framework as an unhandled rejection.

What we expect

Either:

  1. The framework's step telemetry should treat a 409 from a completed run as a benign post-completion noop (the run's lifecycle has progressed past where telemetry writes make sense), suppressing the rejection internally with a debug log; or
  2. Calling writable.close() mid-workflow should be documented as unsupported (calling it should throw or be a no-op), with the user expected to rely on the framework's exit-time auto-close.

Today's behavior is the worst of both: writable.close() mid-workflow appears to "work" (the client gets the close, the UI experience is improved by 30-50s), but it leaks framework-level rejections that surface as unhandled promise rejections in the application's process.
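
To make option (1) concrete, here is a minimal sketch of the kind of guard we have in mind. It is not the framework's code: `isRunAlreadyCompletedError` and `writeIgnoringCompletedRun` are hypothetical names, and the match relies only on the error message format shown in the stack trace above. Where such a guard would actually live inside the framework is for the maintainers to decide.

```typescript
// Hypothetical helper, NOT part of workflow or @workflow/world-vercel.
// It recognises the specific rejection from this report by its message text.
export function isRunAlreadyCompletedError(err: unknown): boolean {
  return (
    err instanceof Error &&
    err.message.includes("HTTP 409") &&
    err.message.includes("is already completed")
  );
}

// Hypothetical wrapper: runs a stream write and treats the "run already
// completed" conflict as a benign no-op. Every other error is rethrown.
export async function writeIgnoringCompletedRun(
  write: () => Promise<void>,
  debug: (message: string) => void = () => {},
): Promise<void> {
  try {
    await write();
  } catch (err) {
    if (isRunAlreadyCompletedError(err)) {
      // The run's lifecycle has moved past the point where this write matters.
      debug(`stream write skipped: ${(err as Error).message}`);
      return;
    }
    throw err;
  }
}
```

With a guard like this around the telemetry write, a late 409 would be logged at debug level instead of escaping as an unhandled rejection, while genuine write failures would still surface.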

Workaround we are applying

We must keep the eager-close — the UX win is large (~30-50s of "Generating…" sparkle removed). So we are filtering this specific error pattern in our application's Sentry beforeSend hook with a comment linking back to this issue + a 60-day re-evaluation date. The filter matches the exact 409 message format and the mechanism: auto.node.onunhandledrejection tag.
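
For anyone who needs the same stopgap, the filter is roughly the following. This is a sketch rather than our exact code: `dropCompletedRun409` is a hypothetical name, the `EventLike` type is a minimal stand-in for the Sentry event shape, and it assumes the mechanism tag is exposed as `mechanism.type` on each exception value. Check that against the events your SDK version produces.

```typescript
// Minimal structural types covering only the event fields this filter reads.
interface ExceptionValue {
  value?: string;
  mechanism?: { type?: string };
}
interface EventLike {
  exception?: { values?: ExceptionValue[] };
}

// Matches the 409 message from this issue. The run id is matched loosely,
// and the inner quotes may or may not be backslash-escaped.
const COMPLETED_RUN_409 =
  /^Stream write failed: HTTP 409: .*Run \\?"wrun_[A-Za-z0-9]+\\?" is already completed/;

// Returns null to drop the event, or the event unchanged to keep it.
export function dropCompletedRun409<E extends EventLike>(event: E): E | null {
  const values = event.exception?.values ?? [];
  const isKnownNoise = values.some(
    (v) =>
      v.mechanism?.type === "auto.node.onunhandledrejection" &&
      COMPLETED_RUN_409.test(v.value ?? ""),
  );
  return isKnownNoise ? null : event;
}
```

It is wired in through the `beforeSend` option of `Sentry.init`, for example `beforeSend: (event) => dropCompletedRun409(event)`. Requiring both the message pattern and the mechanism keeps the filter narrow, so a handled 409 or any other unhandled rejection still reaches Sentry.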

We'd very much prefer (1) above as the upstream fix.

Versions

  • workflow@4.2.4
  • @workflow/world-vercel@4.1.2 (transitive)
  • Node 24.14.1 on Vercel (vercel-production environment)
  • Region: iad1
