Merged
Allow callers to provide an explicit checkpointer for graph compilation. The executor uses this to enable state snapshots on all graphs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add RunManager for tracking active runs with per-key and global limits.
Execute graphs via astream with state snapshots after each node.
Sequential event IDs for duplicate-free SSE reconnection replay.
Emit node_started before node_completed for each node.
Derive condition_result in edge_traversed from schema branches.
Human-in-the-loop resume with buffered replay (no SSE-listener wait).
Run timeout (5min default) and cancellation via asyncio.Event.
Safe DB updates in exception handlers via _safe_update_run.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
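For readers who want a picture of the per-key and global limits, here is a minimal sketch. It is not the code from this PR: `RunLimitError`, `register`, `release`, the `RunContext` fields shown, and the default limit values are all assumptions made for illustration. Only the names `RunManager`, `RunContext`, and `cancel_all` appear in the commit messages.

```python
import asyncio
import uuid
from dataclasses import dataclass, field


class RunLimitError(Exception):
    """Hypothetical error raised when a run limit would be exceeded."""


@dataclass
class RunContext:
    # Field set is an assumption; the real RunContext carries more state.
    run_id: str
    owner_key: str
    cancel_event: asyncio.Event = field(default_factory=asyncio.Event)


class RunManager:
    """Tracks active runs and enforces per-key and global limits (sketch)."""

    def __init__(self, max_per_key: int = 3, max_global: int = 10) -> None:
        # Default limits are placeholders, not the project's values.
        self._runs: dict[str, RunContext] = {}
        self._max_per_key = max_per_key
        self._max_global = max_global

    def register(self, owner_key: str) -> RunContext:
        # Check the global limit first, then the limit for this owner.
        if len(self._runs) >= self._max_global:
            raise RunLimitError("global run limit reached")
        active = sum(1 for r in self._runs.values() if r.owner_key == owner_key)
        if active >= self._max_per_key:
            raise RunLimitError(f"run limit reached for key {owner_key!r}")
        ctx = RunContext(run_id=uuid.uuid4().hex, owner_key=owner_key)
        self._runs[ctx.run_id] = ctx
        return ctx

    def release(self, run_id: str) -> None:
        # Called when a run finishes, fails, or is cancelled.
        self._runs.pop(run_id, None)

    def cancel_all(self) -> int:
        # Signal every active run to stop; returns how many were signalled.
        for ctx in self._runs.values():
            ctx.cancel_event.set()
        return len(self._runs)
```

In this sketch, different owners are limited independently, and a shutdown hook can call `cancel_all()` without reaching into the private `_runs` mapping.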
POST /v1/graphs/{id}/run starts execution and returns run_id.
GET /v1/runs/{id}/stream opens SSE with Last-Event-ID reconnection.
POST /v1/runs/{id}/resume accepts any JSON type as human input.
GET /v1/runs/{id}/status supports reconnection with DB fallback.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
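The reconnection behaviour described above (sequential event IDs plus `Last-Event-ID`) can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `SSEEvent`, `ReplayBuffer`, `to_sse`, and `KEEPALIVE_FRAME` are hypothetical names, and sending the keepalive as an SSE comment frame is an assumption. The PR only states that keepalives carry no `id` field and are excluded from the replay buffer.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SSEEvent:
    id: int
    event: str
    data: dict[str, Any]


@dataclass
class ReplayBuffer:
    """Stores events with sequential IDs so a client can resume (sketch)."""

    events: list[SSEEvent] = field(default_factory=list)
    _next_id: int = 1

    def append(self, event: str, data: dict[str, Any]) -> SSEEvent:
        # IDs increase by one per event, so "id > last seen" means "unseen".
        evt = SSEEvent(self._next_id, event, data)
        self._next_id += 1
        self.events.append(evt)
        return evt

    def replay_after(self, last_event_id: Optional[int]) -> list[SSEEvent]:
        # No Last-Event-ID header: replay everything buffered so far.
        if last_event_id is None:
            return list(self.events)
        return [e for e in self.events if e.id > last_event_id]


def to_sse(evt: SSEEvent) -> str:
    # One SSE frame: id, event name, JSON payload, then a blank line.
    return f"id: {evt.id}\nevent: {evt.event}\ndata: {json.dumps(evt.data)}\n\n"


# Assumed keepalive shape: an SSE comment line. It has no id field and is
# never appended to the buffer, so it cannot show up in a replay.
KEEPALIVE_FRAME = ": keepalive\n\n"
```

A client that reconnects with `Last-Event-ID: 1` would receive only the events with IDs 2 and up, which is what keeps the replay free of duplicates.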
- Add paused_node_id/paused_prompt fields to RunContext instead of fragile ctx.events[-1] access in status endpoint
- Add RunManager.cancel_all() to avoid accessing private _runs in shutdown
- Document db lifetime intent in start_run route comment
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace monolithic code-reviewer with 3 focused agents:
- security-reviewer (opus): auth, ownership, secrets, SSRF; CRITICAL/WARNING only
- logic-reviewer (opus): correctness, edge cases, race conditions; with confidence levels
- quality-reviewer (sonnet): tests, conventions, readability; capped at 5 suggestions
code-reviewer.md becomes an orchestrator that launches all 3 in parallel.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add 12 manual test scripts (07-18) covering Phase 3 executor and SSE streaming features. Move all manual tests from scripts/ to tests/manual/ for better organization.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Routes: POST /graphs/{id}/run, GET /runs/{id}/stream, POST /runs/{id}/resume, GET /runs/{id}/status
- Explicit checkpointer in build_graph for interrupt/resume support
- Manual tests moved from scripts/ to tests/manual/
Test plan
All 18 manual tests pass (bash tests/manual/run_all.sh):
Phase 2 builder (existing, moved to tests/manual/)
- test_01: Linear graph (FakeListChatModel)
- test_02: Real Gemini LLM integration
- test_03: Branching with field_equals condition
- test_04: Tool node + tool_error condition routing
- test_05: Human input interrupt & resume
- test_06: Full pipeline (tool + LLM + condition)
Phase 3 executor + SSE (new)
- test_07: SSE event sequence: run_started → node_started → node_completed → edge_traversed → graph_completed
- test_08: State snapshots in node_completed events evolve across nodes
- test_09: Run status transitions (running → completed), duration_ms, final_state
- test_10: Human input pause/resume via executor: graph_paused event, submit_resume, completion
- test_11: SSE reconnection: replay buffer stores sequential IDs, Last-Event-ID skips seen events
- test_12: Concurrent run limit: MAX_RUNS_PER_KEY enforced, different owners independent
- test_13: Run timeout: RUN_TIMEOUT_SECONDS triggers error event, loop terminates
- test_14: Condition routing SSE: edge_traversed shows correct condition_result per branch
- test_15: Tool error routing SSE: tool_error routes success→END, error→llm_err with deferred edge
- test_16: Keepalive during pause: no id field, excluded from replay buffer
- test_17: DB fallback: format_sse produces correct terminal events for completed/lost runs
- test_18: Cancel run: cancel_event terminates execution, error event emitted
Notable findings during testing
- No edge_traversed event when routing to END (not a real node): test_15 verifies via the completed node list instead
- Cancelling a paused run needs both cancel_event + resume_event set to unblock _wait_for_resume: test_18 documents this
- stream_run_sse queue is single-reader; reconnection test_11 validates the replay buffer directly
🤖 Generated with Claude Code
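The cancel_event + resume_event finding can be illustrated with a small asyncio sketch. It reflects one reading of that note and is not the PR's code: `wait_for_resume` and `demo` are hypothetical stand-ins, and the assumption is that the paused run blocks only on the resume event, so setting the cancel event alone does not wake it.

```python
import asyncio


async def wait_for_resume(
    resume_event: asyncio.Event, cancel_event: asyncio.Event
) -> str:
    # Assumed behaviour: the paused run waits on resume_event only and
    # checks cancel_event after it wakes up.
    await resume_event.wait()
    return "cancelled" if cancel_event.is_set() else "resumed"


async def demo() -> str:
    resume_event = asyncio.Event()
    cancel_event = asyncio.Event()
    waiter = asyncio.create_task(wait_for_resume(resume_event, cancel_event))
    await asyncio.sleep(0)  # let the waiter start blocking

    cancel_event.set()
    await asyncio.sleep(0)
    # Setting cancel_event alone leaves the waiter blocked.
    assert not waiter.done()

    # Setting resume_event as well wakes the waiter, which then sees the cancel.
    resume_event.set()
    return await waiter
```

Running `asyncio.run(demo())` returns `"cancelled"` under these assumptions. An alternative design would wait on both events at once with `asyncio.wait(..., return_when=asyncio.FIRST_COMPLETED)`, which would remove the need to set `resume_event` on cancel.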