
reconcile provider session state and settle stuck turns #2666

Open
justsomelegs wants to merge 17 commits into pingdotgg:main from justsomelegs:t3code/4b95a799

Conversation

@justsomelegs
Contributor

@justsomelegs justsomelegs commented May 12, 2026

What Changed

Fixes several provider session / turn lifecycle cases where T3 Code could leave a thread in the wrong state after provider runtime events, interrupts, stop failures, or server restart.

Main changes:

  • Treat active turn.completed events as lifecycle-ending events:

    • successful turns move the session back to ready
    • failed turns move the session to error
    • activeTurnId is cleared
  • Treat active turn.aborted events as lifecycle-ending events:

    • interrupted/aborted turns move the session back to ready
    • activeTurnId is cleared
  • Ignore completion/abort events for non-active turns so auxiliary provider work, like OpenCode title generation, does not incorrectly stop the main running turn.

  • Reconcile projected running sessions on server startup:

    • if the provider still has a live session, mirror the live provider state
    • if the provider no longer has the session, mark the projected session stopped
    • if live session listing fails, leave projected state alone instead of guessing
  • Make thread.session.stop non-destructive on provider stop failure:

    • generic stop failures keep the current session and record lastError
    • typed “session already gone” errors are treated as a successful idempotent stop
  • Preserve failed turn state in server and web projections:

    • sessions entering error settle the latest running turn as error
    • stopped/interrupted/ready inactive sessions settle the latest running turn as interrupted
  • Fix OpenCode adapter interrupt handling:

    • interrupt clears active adapter state and returns the session to ready
    • interrupted sessions can then be stopped cleanly
    • prompt-start failures no longer emit a misleading turn.aborted event
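
The completion and abort rules above can be sketched as a small pure function. This is a minimal illustration rather than the code in this PR: `SessionState`, `TurnEvent`, and `applyTurnEvent` are hypothetical names, and the real projection types may differ.

```typescript
// Hypothetical shapes for illustration only; not the actual T3 Code types.
type SessionStatus = "ready" | "running" | "error" | "stopped";

interface SessionState {
  status: SessionStatus;
  activeTurnId: string | null;
  lastError: string | null;
}

type TurnEvent =
  | { type: "turn.completed"; turnId: string; ok: boolean; error?: string }
  | { type: "turn.aborted"; turnId: string };

function applyTurnEvent(session: SessionState, event: TurnEvent): SessionState {
  // Events for non-active turns (for example auxiliary title generation)
  // must not settle the main running turn, so the session is returned as-is.
  if (session.activeTurnId === null || event.turnId !== session.activeTurnId) {
    return session;
  }
  // A failed completion moves the session to error and records why.
  if (event.type === "turn.completed" && !event.ok) {
    return {
      status: "error",
      activeTurnId: null,
      lastError: event.error ?? "turn failed",
    };
  }
  // Successful completion and abort both return the session to ready.
  return { ...session, status: "ready", activeTurnId: null };
}
```

In this sketch, returning the same object for ignored events makes the "nothing changed" case cheap to detect for callers that compare by reference.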

Why

A few different provider lifecycle paths were being treated too loosely.

The main symptom was that a thread could stay stuck as “working” even after the provider had already finished, aborted, or lost the active session. This was especially visible with OpenCode, but the underlying fixes are in shared provider orchestration and projection code, so they apply more broadly than just OpenCode.

This approach keeps the existing event model, but makes the lifecycle rules stricter:

  • only the active turn can settle the active session
  • provider runtime state is reconciled instead of blindly overwritten
  • stop failures do not pretend the session stopped
  • failed turns remain failed instead of being flattened into interrupted turns
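
As a rough sketch of the stop rule in this list, the outcome of a provider stop call can be modelled as a tagged value and applied to the session. The names `StopOutcome` and `applyStopOutcome` are assumptions made for illustration, not identifiers from this PR.

```typescript
// Hypothetical shapes for illustration only; not the actual T3 Code types.
type SessionStatus = "ready" | "running" | "error" | "stopped";

interface Session {
  status: SessionStatus;
  activeTurnId: string | null;
  lastError: string | null;
}

type StopOutcome =
  | { kind: "stopped" }
  | { kind: "session-gone" } // typed "session already gone" error
  | { kind: "failed"; message: string }; // any other stop failure

function applyStopOutcome(session: Session, outcome: StopOutcome): Session {
  switch (outcome.kind) {
    case "stopped":
    case "session-gone":
      // A session that is already gone counts as a successful, idempotent stop.
      return { status: "stopped", activeTurnId: null, lastError: null };
    case "failed":
      // A generic failure keeps the current session and only records the error.
      return { ...session, lastError: outcome.message };
  }
}
```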

Fixes #2644
Fixes #2633
Fixes #2573

UI Changes

No UI changes.

Testing

  • bun run test src/orchestration/Layers/ProviderRuntimeIngestion.test.ts src/orchestration/Layers/ProjectionPipeline.test.ts src/orchestration/Layers/ProviderCommandReactor.test.ts src/provider/Layers/OpenCodeAdapter.test.ts
  • bun run test src/store.test.ts
  • bun fmt
  • bun lint
  • bun typecheck

bun lint passes with existing unrelated warnings.

Checklist

  • This PR is small and focused
  • I explained what changed and why
  • I included before/after screenshots for any UI changes
  • I included a video for animation/interaction changes

Note

Reconcile provider session state and settle stuck running turns on stop or error

  • On reactor startup, projected running sessions are reconciled against live provider sessions; sessions with no live counterpart are marked stopped with a descriptive lastError, while live ready sessions are mirrored and their latest turn settled.
  • When a session transitions to a non-running status (stopped, interrupted, error), the most recent running turn is now settled to interrupted or error with completedAt stamped from session.updatedAt, both in the server projection and the web UI store.
  • turn.aborted provider runtime events now mark the thread session as ready and clear the active turn, consistent with turn.completed handling.
  • Prompt-start failures in OpenCodeAdapter.sendTurn no longer emit a spurious turn.aborted event; interruptTurn now reliably emits turn.aborted for the correct turn and immediately reflects session state as ready.
  • Session stop requests that fail with "session not found" are treated as benign; other stop failures record a provider.session.stop.failed activity, preserve the running state, and surface lastError.
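
The startup reconciliation described in the first bullet could look roughly like the following. It is a sketch under stated assumptions: `ProjectedSession`, `LiveSession`, `reconcileOnStartup`, and the error message text are hypothetical, and `null` stands in for "live session listing failed".

```typescript
// Hypothetical shapes for illustration only; not the actual T3 Code types.
type SessionStatus = "ready" | "running" | "error" | "stopped";

interface ProjectedSession {
  threadId: string;
  status: SessionStatus;
  lastError: string | null;
}

interface LiveSession {
  threadId: string;
  status: SessionStatus;
}

function reconcileOnStartup(
  projected: ProjectedSession[],
  live: LiveSession[] | null,
): ProjectedSession[] {
  // Listing failed: leave the projected state alone instead of guessing.
  if (live === null) return projected;

  const liveByThread = new Map<string, LiveSession>();
  for (const session of live) liveByThread.set(session.threadId, session);

  return projected.map((session): ProjectedSession => {
    // Only sessions projected as running need reconciling.
    if (session.status !== "running") return session;
    const match = liveByThread.get(session.threadId);
    if (match === undefined) {
      // The provider no longer has the session: mark it stopped and say why.
      return {
        ...session,
        status: "stopped",
        lastError: "Provider session not found after restart (hypothetical message)",
      };
    }
    // The provider still has the session: mirror its live status.
    return { ...session, status: match.status };
  });
}
```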

Macroscope summarized c497da5.

@reactreview

reactreview Bot commented May 12, 2026


React Review found 0 ❌ errors and 1 ⚠️ warning. This PR leaves the React health score unchanged.

Check if these React Review issues are valid. If so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.

Run this before and after your changes to verify the result:
npx react-doctor@latest --verbose --diff

Do not modify the react-doctor configuration unless explicitly asked.
Fix the underlying code issues instead of changing or suppressing the rules.

React Review found 0 errors and 1 warning. This PR leaves the React health score unchanged.

<file name="apps/server/src/orchestration/Layers/ProviderCommandReactor.test.ts">

<violation number="1" location="apps/server/src/orchestration/Layers/ProviderCommandReactor.test.ts:369">
Severity: Warning

Sequential `await` without a data dependency on the previous result — wrap the independent calls in `Promise.all([...])` so they race instead of waterfalling

Wrap independent awaits in `Promise.all([...])` so they race instead of waterfalling — second call doesn't depend on the first

Rule: `server-sequential-independent-await`
</violation>

</file>

Reviewed by react-review for commit d2742fc. Configure here.
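
For reference, the pattern this rule suggests looks roughly like the sketch below. `loadSession` and `loadTurns` are hypothetical stand-ins, not the calls at line 369 of the test file, and the rewrite only applies if the two calls really are independent; a test that relies on ordering should keep its sequential awaits.

```typescript
// Hypothetical stand-ins for two independent async calls.
async function loadSession(): Promise<string> {
  return "session";
}

async function loadTurns(): Promise<number> {
  return 3;
}

// Sequential: the second call does not start until the first resolves.
async function sequential(): Promise<{ session: string; turns: number }> {
  const session = await loadSession();
  const turns = await loadTurns();
  return { session, turns };
}

// Concurrent: both calls start immediately and are awaited together.
async function concurrent(): Promise<{ session: string; turns: number }> {
  const [session, turns] = await Promise.all([loadSession(), loadTurns()]);
  return { session, turns };
}
```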

@github-actions github-actions Bot added the vouch:trusted label (PR author is trusted by repo permissions or the VOUCHED list) May 12, 2026
Comment thread apps/server/src/orchestration/Layers/ProviderCommandReactor.test.ts
@coderabbitai

coderabbitai Bot commented May 12, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3a06bc76-10df-4980-8a0c-ebfe7bcd8531

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@justsomelegs justsomelegs changed the title from "Handle stale provider runtime state" to "reconcile provider session state and settle stuck turns" May 12, 2026
@justsomelegs justsomelegs marked this pull request as ready for review May 13, 2026 12:26
@macroscopeapp
Contributor

macroscopeapp Bot commented May 13, 2026

Approvability

Verdict: Needs human review

This PR introduces new runtime behavior for reconciling provider session state on startup and automatically settling stuck turns. Changes to state machine transitions and startup recovery logic warrant human review.

You can customize Macroscope's approvability policy. Learn more.

@eersnington

@justsomelegs I've built and run the dist:desktop:dmg version and it still hangs on waiting, while the session on opencode is completed

image

but I am able to end a chat session now so I think your change only fixed a local state of event cycle ingestion

image

@justsomelegs
Contributor Author

@eersnington it was subscribing to the wrong event stream, as client.subscribe.event() has been broken upstream and the current fix is to subscribe to the global.event.subscribe() method. I've made the change and it should now work.

@justsomelegs
Contributor Author

tbh as it's an upstream issue it will most likely be fixed, as the client event stream seems to just drop after the subscribed message. but because t3code treats each thread as a new opencode instance/server, not multiple threads per server, it's not too much of an issue, as the global event subscription will be scoped to that thread only, plus it's filtered on session id

@justsomelegs
Contributor Author

i've undone the event stream change as it's an upstream provider issue, so i assume the issue will get fixed by opencode. it's out of scope for my PR and there's no point adding a temp workaround if it's gonna be fixed

@eersnington

tbh as it's an upstream issue it will most likely be fixed, as the client event stream seems to just drop after the subscribed message. but because t3code treats each thread as a new opencode instance/server, not multiple threads per server, it's not too much of an issue, as the global event subscription will be scoped to that thread only, plus it's filtered on session id

@justsomelegs Apologies Mr Legs, but I've gotten a bit busy while also looking into the issue myself. And yes, they seem to have changed the event stream from a repo scoped event.stream to a global.events that has { directory, payload }. OpenCode itself moved its own CLI to client.global.event(), so client.event.subscribe() may be legacy. Will test out your change and compare it to mine at eersnington#1

i've undone the event stream change as it's an upstream provider issue, so i assume the issue will get fixed by opencode. it's out of scope for my PR and there's no point adding a temp workaround if it's gonna be fixed

Ig, but right now users with OpenCode >0.14.40 will not be able to use T3 code

@justsomelegs
Contributor Author

@eersnington no problem at all :)

Ig, but right now users with OpenCode >0.14.40 will not be able to use T3 code

yeah true but there is already a PR open (#2673) for it so felt best to keep this PR scoped to bad session recovery states

@eersnington

@eersnington no problem at all :)

Ig, but right now users with OpenCode >0.14.40 will not be able to use T3 code

yeah true but there is already a PR open (#2673) for it so felt best to keep this PR scoped to bad session recovery states

I agree, which is also why I'm keeping that PR in my fork, as I've run into a bunch more issues with steering prompts and stale recovery. That way I can use a local build version of this.

@eersnington

#2673 is great but IMO, it should also handle the older /event stream consumption so that people on older versions of OpenCode (<=0.14.40) won't be majorly surprised when they update T3 Code and their adapter breaks.

Due to this bug, a lot of people within the issue threads have downgraded their OpenCode install too
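
One way to picture handling both streams is a small normalizer that accepts either shape and then filters on session id. This is only a sketch: `ProviderEvent`, `GlobalEnvelope`, and `normalizeEvent` are hypothetical, and the event shapes are inferred from the `{ directory, payload }` description in this thread rather than taken from the OpenCode SDK.

```typescript
// Hypothetical event shapes; the real OpenCode payloads may differ.
interface ProviderEvent {
  type: string;
  sessionId?: string;
}

// Assumed shape of an event from the newer global stream.
interface GlobalEnvelope {
  directory: string;
  payload: ProviderEvent;
}

function isEnvelope(raw: ProviderEvent | GlobalEnvelope): raw is GlobalEnvelope {
  return "payload" in raw;
}

// Unwraps the global envelope when present, then keeps only events
// that belong to the given session. Returns null for everything else.
function normalizeEvent(
  raw: ProviderEvent | GlobalEnvelope,
  sessionId: string,
): ProviderEvent | null {
  const event = isEnvelope(raw) ? raw.payload : raw;
  return event.sessionId === sessionId ? event : null;
}
```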

@justsomelegs
Contributor Author

could be a good PR to submit 👀

not at my computer at the moment otherwise i would submit it as a PR

@eersnington

could be a good PR to submit 👀

not at my computer at the moment otherwise i would submit it as a PR

I've gone ahead and done it Mr Legs #2704


Labels

size:L (100-499 changed lines, additions + deletions)
vouch:trusted (PR author is trusted by repo permissions or the VOUCHED list)


2 participants