Update task management flow and add integration tests#21
Merged
Conversation
Extends ForemanConfig with a new QueueConfig model matching the queue-mediated agent protocol spec. Adds corresponding tests and documents the new section in config.example.yaml. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SQLite-backed task queue with enqueue, claim_next (concurrency-safe via BEGIN IMMEDIATE), complete, heartbeat, drain_completed, requeue_stale, and fail_exhausted. 21 tests cover all methods including concurrent claim. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Refactored claim_next to use threading lock in conjunction with `BEGIN IMMEDIATE` for same-process thread serialization. Updated related tests and improved cleanup with explicit resource management using close().
… architecture Adds detailed problem statement, design rationale, MVP scope, key assumptions, and open questions for implementing a robust task queue backed by SQLite. Documents at-least-once delivery, claim/requeue logic, and API adjustments. Addresses gaps in current synchronous dispatch handling.
- foreman/routers/queue.py: POST /queue/next (claim task or 204), POST /queue/complete (store decision + signal drain), POST /queue/heartbeat - foreman/routers/result.py: POST /harness/result (drain-loop nudge) - server.py: register both new routers on the FastAPI app - tests/test_queue_router.py, tests/test_result_router.py: HTTP contract tests using FastAPI TestClient with dependency_overrides (no SQLite in router tests) - pyproject.toml: per-file-ignores for FastAPI router B008/TC001/TC003 patterns Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Creates the standalone `foreman-client/` package that agent authors install to communicate with the harness queue. Exposes `next_task()`, `complete_task()`, and `heartbeat()` over synchronous httpx, with structlog events and `ForemanClientError` on non-2xx responses. 100% line and branch coverage via respx HTTP mocks. Also excludes `foreman-client/` and `agents/` from root pytest collection, and excludes `foreman-client/` from the root mypy pre-commit hook to prevent duplicate module name conflicts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e-phase-3 Implement Phase 3: foreman-client package with ForemanClient
…p tests - Add close(), __enter__, __exit__ to ForemanClient to prevent httpx connection pool leak - Export LLMBackendRef and TaskContext from foremanclient package __init__ - Move import json to module level in test_client.py; remove misleading call-ordering comment - Add TestForemanClientLifecycle tests for close() and context manager behaviour - Mark Phase 3 plan tasks and checkpoint complete; add phase-3-review.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…artifacts - Introduce `.superset/config.json` with an empty setup, teardown, and run configuration. - Add `.memsearch/memory/2026-04-26.md` for session logging and transcript retention.
…e-phase-3 Create foreman-client package and enhance Phase 3 functionality
updates: - [github.com/astral-sh/ruff-pre-commit: v0.15.11 → v0.15.12](astral-sh/ruff-pre-commit@v0.15.11...v0.15.12) - [github.com/pre-commit/mirrors-mypy: v1.20.1 → v1.20.2](pre-commit/mirrors-mypy@v1.20.1...v1.20.2) - [github.com/rvben/rumdl-pre-commit: v0.1.78 → v0.1.83](rvben/rumdl-pre-commit@v0.1.78...v0.1.83)
Replace synchronous POST→parse dispatch with durable enqueue:
- Dispatcher.dispatch() now enqueues the TaskMessage in TaskQueue and
sends a fire-and-forget nudge ({"task_id": ...}) to the agent endpoint.
- DecisionMessage parsing and executor.execute() are removed from dispatch();
those belong to the drain loop (Task 11).
- Dispatcher.__init__ gains a required task_queue: TaskQueue parameter.
- __main__.py creates TaskQueue from config.queue and passes it to Dispatcher.
- Integration and server tests updated to reflect new enqueue-based protocol.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…span Add two background asyncio tasks started in a FastAPI lifespan context manager: - _drain_loop: wakes on drain_event or drain_interval_seconds; calls TaskQueue.drain_completed(), executor.execute(), and memory.upsert_memory_summary() for each completed task. - _requeue_loop: runs every requeue_interval_seconds; calls requeue_stale() and fail_exhausted(max_retries=config.queue.max_retries). Both tasks cancel cleanly on shutdown. The lifespan also initialises app.state.drain_event so /harness/result and /queue/complete can signal it. __main__.py wires app.state.executor, .memory, and .config for the lifespan. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add --queue-db argument to the start subcommand so users can override the queue database path without changing config. Priority: --queue-db > config db_path > ~/.agent-harness/queue.db default. Update plan.md to mark Tasks 11, 12, 13 and Phase 4 checkpoint complete. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wrap TaskQueue in a with-block in _run_start so the connection is closed on all exit paths, including sys.exit() from container startup errors. Replace five `with sqlite3.connect(...) as conn:` patterns in test_executor.py with explicit open/close — the with-form only manages transactions, leaving connections open until GC. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
POST /task now returns 202 immediately and fires a background task that claims the pending task via ForemanClient.next_task(), runs triage, and reports back via complete_task(). Lifespan startup poll picks up any tasks queued while the agent was down. Inline protocol models removed; foremanclient.models is the single source of truth for TaskMessage / DecisionMessage across agent and tests. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the foreman-client SDK for agent authors: install, ForemanClient constructor args, next_task/complete_task/heartbeat methods, claim timeout, heartbeat cadence, idempotency contract, and a ≤30-line minimal example. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the MVP acceptance criterion: zero task loss under a simulated agent restart. The test uses a minimal in-process harness (real TaskQueue, real MemoryStore) and exercises the actual ForemanClient + agent startup- poll code path without live network sockets. Also adds --run-integration pytest flag and integration marker so the test is skipped in CI by default. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Conflicts: # uv.lock
Contributor
|
Version hint: minor Comment ID: Display version hint-auto-generated |
Contributor
|
[pre-commit.ci] pre-commit autoupdate
Exposes `timeout: float = 5.0` on `ForemanClient.__init__` and forwards it to `httpx.Client`, so callers can tune per-deployment latency requirements without monkey-patching the transport. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Warns readers that task_id and decision.task_id must match; a silent mismatch causes the drain loop to miss the result. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ndlers Tasks 2+1 from pr-21-fixes.md: - drain_completed() no longer marks rows done; rows stay 'completed' - New mark_done(task_id) transitions completed→done after successful execute - _drain_loop wraps drain_completed() in outer try/except (loop never dies) - _drain_loop wraps per-task execute+memory+mark_done in inner try/except (one bad task does not abort others in the same batch) - _lifespan finally uses suppress(CancelledError, Exception) for drain_task Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Task 4 from pr-21-fixes.md: - requeue_stale() + fail_exhausted() wrapped in try/except Exception so one bad cycle does not kill the requeue loop permanently - _lifespan finally uses suppress(CancelledError, Exception) for requeue_task Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Task 3 from pr-21-fixes.md: - _lifespan startup poll now loops calling next_task() until it returns None, processing each task before moving to the next; previously only one task was claimed, leaving N-1 accumulated tasks permanently stuck - New test: startup poll with 3 queued tasks drains all 3 before yield Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…access) Task 5 from pr-21-fixes.md: - Rename Dispatcher._executor → Dispatcher.executor (public attribute) - __main__.py updated to use dispatcher.executor instead of dispatcher._executor Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Task 6 from pr-21-fixes.md: - _process_task now starts a daemon threading.Thread that calls client.heartbeat(task.task_id) every _HEARTBEAT_INTERVAL (25 s) while triage runs so the harness does not re-queue the task mid-flight - threading.Event stops the heartbeat thread in a finally block after triage returns or raises - import threading added; _HEARTBEAT_INTERVAL module constant added Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Task 7 from pr-21-fixes.md: - Minimal example now uses @asynccontextmanager lifespan: creates ForemanClient, drains queued tasks via while-loop, yields, closes client - FastAPI(lifespan=lifespan) used instead of bare FastAPI() - Startup Poll section updated from single next_task() call to the correct loop-until-None pattern with an explanation of why a single call is wrong Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wrap `_drain_loop` and `_requeue_loop` bodies in exception handlers to ensure background loops do not terminate on errors. - Split `drain_completed` into a new `mark_done` method for per-task completion after successful execution. - Update startup poll to drain all queued tasks on agent boot. - Add heartbeat thread to `_process_task` to prevent requeue during long-running LLM calls. - Publicize `Dispatcher.executor` to remove private attribute access between modules.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This pull request enhances the task management flow and implements the following:
ForemanClientfor robust task handling and startup recovery.foreman-clientSDK, documenting all necessary methods and usage details.TaskQueueand database connections.TaskQueuetests to use context managers for better management of resources.Checklist
Additional context
These changes address Phase 5 and parts of Phase 6 of the project, ensuring seamless agent task handling and robustness against interruptions.