fix(rivetkit): temporarily disable HWS for serverless runners#4596

Open
NathanFlurry wants to merge 5 commits into v2.1.x from fix/disable-hws-serverless

Conversation

@NathanFlurry
Member

Summary

  • Disables hibernating WebSockets (HWS) for action/event connections (PATH_CONNECT) by making #hwsCanHibernate() return false
  • Non-hibernatable connections still keep the actor awake via CanSleep.ActiveConns check
  • This is a temporary workaround until the engine implements ToRunnerPing/ToGatewayPong for serverless TS runners
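
As a rough illustration of the first two bullets, here is a minimal sketch of how a connection that can never hibernate still blocks sleep. It is not the rivetkit source: the Conn shape, openConn, and canSleep are hypothetical names, and only hwsCanHibernate, PATH_CONNECT, and CanSleep.ActiveConns come from this PR.

```typescript
// Hypothetical model of the workaround; not the actual actor-driver.ts code.
type ConnKind = "PATH_CONNECT" | "OTHER";

interface Conn {
  kind: ConnKind;
  hibernatable: boolean;
}

enum CanSleep {
  Yes = "Yes",
  ActiveConns = "ActiveConns",
}

// Workaround: no WebSocket is reported as hibernatable for now.
function hwsCanHibernate(_kind: ConnKind): boolean {
  return false;
}

// Assumed helper: records the hibernation decision when a connection opens.
function openConn(kind: ConnKind): Conn {
  return { kind, hibernatable: hwsCanHibernate(kind) };
}

// Assumed rule: any open non-hibernatable connection keeps the actor awake.
function canSleep(conns: Conn[]): CanSleep {
  return conns.some((c) => !c.hibernatable)
    ? CanSleep.ActiveConns
    : CanSleep.Yes;
}
```

Under these assumptions, an actor with one open PATH_CONNECT connection reports CanSleep.ActiveConns, and an actor with no connections is free to sleep.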

Root Cause

The gateway sends ToRunnerPing via NATS and expects ToGatewayPong within 30s. For non-serverless runners, the Rust pegboard-runner bridge responds. For serverless TS runners (e.g. pax on Rivet Cloud), there is no bridge — nobody responds, so the gateway kills every WebSocket with ws.downstream_closed every ~30s. Disabling HWS avoids the broken hibernation/restore cycle.
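
The failure mode can be modeled with a small sketch. This is an assumption about the gateway's behavior reconstructed from the paragraph above, not the engine's actual code: TunnelState, checkTunnel, and the exact comparison are hypothetical, while the 30s window and the ws.downstream_closed reason come from the description.

```typescript
// Hypothetical model of the gateway's ping/pong deadline; not engine code.
const PONG_TIMEOUT_MS = 30_000; // 30s window described above

interface TunnelState {
  lastPingSentAt: number; // ms timestamp of the last ToRunnerPing
  lastPongAt: number | null; // ms timestamp of the last ToGatewayPong, if any
}

type Verdict =
  | { close: false }
  | { close: true; reason: "ws.downstream_closed" };

function checkTunnel(state: TunnelState, now: number): Verdict {
  // A ping counts as answered only if a pong arrived at or after it was sent.
  const answered =
    state.lastPongAt !== null && state.lastPongAt >= state.lastPingSentAt;
  if (!answered && now - state.lastPingSentAt >= PONG_TIMEOUT_MS) {
    return { close: true, reason: "ws.downstream_closed" };
  }
  return { close: false };
}
```

With a bridge answering (lastPongAt set after the ping) the tunnel stays open. With a serverless TS runner, lastPongAt stays null, so in this model every check at or past 30s closes the WebSocket.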

Test plan

  • Deploy to pax production and verify WebSocket connections are stable (no more ~30s disconnect cycle)
  • Verify actor stays awake while connections exist
  • Verify connections are cleanly closed during runner drain (no stale_metadata errors)

🤖 Generated with Claude Code

The gateway tunnel ping (ToRunnerPing) is not handled for serverless TS
runners since there is no Rust pegboard-runner bridge to respond with
ToGatewayPong. This causes the gateway to close every WebSocket with
ws.downstream_closed after 30s, triggering a reconnect cycle.

Disable hibernation on PATH_CONNECT WebSockets until the engine
implements ping/pong for serverless runners.
@claude

claude bot commented Apr 9, 2026

PR Review: fix(rivetkit): temporarily disable HWS for serverless runners

Core Fix (actor-driver.ts) - Looks Good

The workaround is correct and well-reasoned. Making #hwsCanHibernate() return false for PATH_CONNECT avoids the broken 30s disconnect cycle caused by the gateway expecting a ToGatewayPong that serverless TS runners never send. Non-hibernatable connections still keep the actor awake via CanSleep.ActiveConns, so there is no regression in actor lifetime behavior.

Suggestion: Add a tracking issue reference in the comment so this workaround does not get forgotten:

// Temporarily disable HWS until the engine implements ToRunnerPing/ToGatewayPong for serverless TS runners.
return false;

Sandbox Changes - Two Concerns

1. Client-side build removed from package.json

The old build ran both a client and server vite build. The new build is only vite build --mode server. The client-side bundle is no longer built. The Dockerfile also dropped the COPY step for the dist/public directory. If the sandbox example serves any frontend assets, they will be missing. Please confirm this is intentional, or restore the client build step.

2. Inlined SQL in migrations.js creates maintenance overhead

Switching from a .sql file import to an inline template literal means the migration SQL now lives in two places. If the schema changes, both the .sql source file and the inline copy need to stay in sync. Consider using a raw import approach or removing the original .sql file to eliminate the duplication.
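
If both copies have to stay for now, a third option (not one of the two suggested above) is a drift check in the test suite. A minimal sketch, assuming the test can load both strings; normalizeSql and sqlInSync are hypothetical helpers, not part of the sandbox example:

```typescript
// Hypothetical drift check between the .sql source and its inline copy.

// Collapse whitespace so formatting-only differences do not count as drift.
// Comparison only: this also collapses whitespace inside SQL string literals.
function normalizeSql(sql: string): string {
  return sql.replace(/\s+/g, " ").trim();
}

// True when both copies contain the same SQL after normalization.
function sqlInSync(fileSql: string, inlineSql: string): boolean {
  return normalizeSql(fileSql) === normalizeSql(inlineSql);
}
```

A test would pass the contents of the .sql file and the inline template literal to sqlInSync and fail the build when it returns false.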

Minor Notes

  • src/server-runner.ts is clean and minimal.
  • The journal import type assertion is the correct modern syntax.
  • The Dockerfile conditional CMD using SANDBOX_MODE is a reasonable pattern for dual-mode containers.
  • Building rivetkit before sandbox in the Dockerfile is more correct. The old filter may have silently succeeded only because rivetkit was already cached.

@pkg-pr-new

pkg-pr-new bot commented Apr 9, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@4596

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@4596

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@4596

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@4596

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@4596

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@4596

@rivetkit/sqlite-vfs

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sqlite-vfs@4596

@rivetkit/traces

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/traces@4596

@rivetkit/workflow-engine

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/workflow-engine@4596

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@4596

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@4596

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@4596

commit: 192bbe4

@railway-app railway-app bot temporarily deployed to kitchen-sink / production April 10, 2026 22:22 Inactive
@railway-app railway-app bot temporarily deployed to kitchen-sink / production April 10, 2026 22:26 Inactive
@railway-app railway-app bot temporarily deployed to kitchen-sink / production April 10, 2026 22:27 Inactive