Skip to content

Update task management flow and add integration tests#21

Merged
coordt merged 35 commits intomainfrom
noble-cupcake
May 5, 2026
Merged

Update task management flow and add integration tests#21
coordt merged 35 commits intomainfrom
noble-cupcake

Conversation

@coordt
Copy link
Copy Markdown
Member

@coordt coordt commented May 4, 2026

Description

This pull request enhances the task management flow and implements the following:

  • Marked Phase 5 and Phase 6 tasks as complete in the project plan.
  • Updated the issue-triage agent to utilize ForemanClient for robust task handling and startup recovery.
  • Introduced an integration test to verify agent resilience during restarts, ensuring zero task loss.
  • Added a comprehensive guide for writing agents with the foreman-client SDK, documenting all necessary methods and usage details.
  • Fixed resource warnings related to proper closure of TaskQueue and database connections.
  • Improved testing by converting TaskQueue tests to use context managers for better management of resources.

Checklist

  • Tested changes locally.
  • Updated documentation where applicable.
  • Verified compatibility with existing workflows.

Additional context

These changes address Phase 5 and parts of Phase 6 of the project, ensuring seamless agent task handling and robustness against interruptions.

coordt and others added 23 commits April 22, 2026 14:18
Extends ForemanConfig with a new QueueConfig model matching the
queue-mediated agent protocol spec. Adds corresponding tests and
documents the new section in config.example.yaml.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SQLite-backed task queue with enqueue, claim_next (concurrency-safe via
BEGIN IMMEDIATE), complete, heartbeat, drain_completed, requeue_stale,
and fail_exhausted. 21 tests cover all methods including concurrent claim.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Refactored claim_next to use threading lock in conjunction with `BEGIN IMMEDIATE` for same-process thread serialization. Updated related tests and improved cleanup with explicit resource management using close().
… architecture

Adds detailed problem statement, design rationale, MVP scope, key assumptions, and open questions for implementing a robust task queue backed by SQLite. Documents at-least-once delivery, claim/requeue logic, and API adjustments. Addresses gaps in current synchronous dispatch handling.
- foreman/routers/queue.py: POST /queue/next (claim task or 204),
  POST /queue/complete (store decision + signal drain), POST /queue/heartbeat
- foreman/routers/result.py: POST /harness/result (drain-loop nudge)
- server.py: register both new routers on the FastAPI app
- tests/test_queue_router.py, tests/test_result_router.py: HTTP contract tests
  using FastAPI TestClient with dependency_overrides (no SQLite in router tests)
- pyproject.toml: per-file-ignores for FastAPI router B008/TC001/TC003 patterns

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Creates the standalone `foreman-client/` package that agent authors install
to communicate with the harness queue. Exposes `next_task()`, `complete_task()`,
and `heartbeat()` over synchronous httpx, with structlog events and
`ForemanClientError` on non-2xx responses. 100% line and branch coverage
via respx HTTP mocks.

Also excludes `foreman-client/` and `agents/` from root pytest collection,
and excludes `foreman-client/` from the root mypy pre-commit hook to prevent
duplicate module name conflicts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e-phase-3

Implement Phase 3: foreman-client package with ForemanClient
…p tests

- Add close(), __enter__, __exit__ to ForemanClient to prevent httpx connection pool leak
- Export LLMBackendRef and TaskContext from foremanclient package __init__
- Move import json to module level in test_client.py; remove misleading call-ordering comment
- Add TestForemanClientLifecycle tests for close() and context manager behaviour
- Mark Phase 3 plan tasks and checkpoint complete; add phase-3-review.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…artifacts

- Introduce `.superset/config.json` with an empty setup, teardown, and run configuration.
- Add `.memsearch/memory/2026-04-26.md` for session logging and transcript retention.
…e-phase-3

Create foreman-client package and enhance Phase 3 functionality
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.15.11 → v0.15.12](astral-sh/ruff-pre-commit@v0.15.11...v0.15.12)
- [github.com/pre-commit/mirrors-mypy: v1.20.1 → v1.20.2](pre-commit/mirrors-mypy@v1.20.1...v1.20.2)
- [github.com/rvben/rumdl-pre-commit: v0.1.78 → v0.1.83](rvben/rumdl-pre-commit@v0.1.78...v0.1.83)
Replace synchronous POST→parse dispatch with durable enqueue:
- Dispatcher.dispatch() now enqueues the TaskMessage in TaskQueue and
  sends a fire-and-forget nudge ({"task_id": ...}) to the agent endpoint.
- DecisionMessage parsing and executor.execute() are removed from dispatch();
  those belong to the drain loop (Task 11).
- Dispatcher.__init__ gains a required task_queue: TaskQueue parameter.
- __main__.py creates TaskQueue from config.queue and passes it to Dispatcher.
- Integration and server tests updated to reflect new enqueue-based protocol.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…span

Add two background asyncio tasks started in a FastAPI lifespan context manager:
- _drain_loop: wakes on drain_event or drain_interval_seconds; calls
  TaskQueue.drain_completed(), executor.execute(), and
  memory.upsert_memory_summary() for each completed task.
- _requeue_loop: runs every requeue_interval_seconds; calls
  requeue_stale() and fail_exhausted(max_retries=config.queue.max_retries).

Both tasks cancel cleanly on shutdown. The lifespan also initialises
app.state.drain_event so /harness/result and /queue/complete can signal it.
__main__.py wires app.state.executor, .memory, and .config for the lifespan.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add --queue-db argument to the start subcommand so users can override the
queue database path without changing config. Priority: --queue-db > config
db_path > ~/.agent-harness/queue.db default. Update plan.md to mark Tasks
11, 12, 13 and Phase 4 checkpoint complete.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wrap TaskQueue in a with-block in _run_start so the connection is closed
on all exit paths, including sys.exit() from container startup errors.

Replace five `with sqlite3.connect(...) as conn:` patterns in
test_executor.py with explicit open/close — the with-form only manages
transactions, leaving connections open until GC.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
POST /task now returns 202 immediately and fires a background task that
claims the pending task via ForemanClient.next_task(), runs triage, and
reports back via complete_task(). Lifespan startup poll picks up any
tasks queued while the agent was down.

Inline protocol models removed; foremanclient.models is the single
source of truth for TaskMessage / DecisionMessage across agent and tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the foreman-client SDK for agent authors: install, ForemanClient
constructor args, next_task/complete_task/heartbeat methods, claim timeout,
heartbeat cadence, idempotency contract, and a ≤30-line minimal example.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the MVP acceptance criterion: zero task loss under a simulated
agent restart. The test uses a minimal in-process harness (real TaskQueue,
real MemoryStore) and exercises the actual ForemanClient + agent startup-
poll code path without live network sockets.

Also adds --run-integration pytest flag and integration marker so the test
is skipped in CI by default.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 4, 2026

Version hint: minor
Current version: 0.3.0
New version (when merged): 0.4.0

Comment ID: Display version hint-auto-generated

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 4, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-05-05 15:32 UTC

coordt and others added 5 commits May 4, 2026 10:20
Exposes `timeout: float = 5.0` on `ForemanClient.__init__` and
forwards it to `httpx.Client`, so callers can tune per-deployment
latency requirements without monkey-patching the transport.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Warns readers that task_id and decision.task_id must match;
a silent mismatch causes the drain loop to miss the result.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ndlers

Tasks 2+1 from pr-21-fixes.md:

- drain_completed() no longer marks rows done; rows stay 'completed'
- New mark_done(task_id) transitions completed→done after successful execute
- _drain_loop wraps drain_completed() in outer try/except (loop never dies)
- _drain_loop wraps per-task execute+memory+mark_done in inner try/except
  (one bad task does not abort others in the same batch)
- _lifespan finally uses suppress(CancelledError, Exception) for drain_task

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
coordt and others added 7 commits May 5, 2026 09:35
Task 4 from pr-21-fixes.md:

- requeue_stale() + fail_exhausted() wrapped in try/except Exception so
  one bad cycle does not kill the requeue loop permanently
- _lifespan finally uses suppress(CancelledError, Exception) for requeue_task

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Task 3 from pr-21-fixes.md:

- _lifespan startup poll now loops calling next_task() until it returns None,
  processing each task before moving to the next; previously only one task
  was claimed, leaving N-1 accumulated tasks permanently stuck
- New test: startup poll with 3 queued tasks drains all 3 before yield

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…access)

Task 5 from pr-21-fixes.md:

- Rename Dispatcher._executor → Dispatcher.executor (public attribute)
- __main__.py updated to use dispatcher.executor instead of dispatcher._executor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Task 6 from pr-21-fixes.md:

- _process_task now starts a daemon threading.Thread that calls
  client.heartbeat(task.task_id) every _HEARTBEAT_INTERVAL (25 s) while
  triage runs so the harness does not re-queue the task mid-flight
- threading.Event stops the heartbeat thread in a finally block after
  triage returns or raises
- import threading added; _HEARTBEAT_INTERVAL module constant added

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Task 7 from pr-21-fixes.md:

- Minimal example now uses @asynccontextmanager lifespan: creates
  ForemanClient, drains queued tasks via while-loop, yields, closes client
- FastAPI(lifespan=lifespan) used instead of bare FastAPI()
- Startup Poll section updated from single next_task() call to the correct
  loop-until-None pattern with an explanation of why a single call is wrong

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wrap `_drain_loop` and `_requeue_loop` bodies in exception handlers to ensure background loops do not terminate on errors.
- Split `drain_completed` into a new `mark_done` method for per-task completion after successful execution.
- Update startup poll to drain all queued tasks on agent boot.
- Add heartbeat thread to `_process_task` to prevent requeue during long-running LLM calls.
- Publicize `Dispatcher.executor` to remove private attribute access between modules.
@coordt coordt merged commit 47cd678 into main May 5, 2026
7 checks passed
@coordt coordt deleted the noble-cupcake branch May 5, 2026 15:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant