Add directive-driven improvement and prompt surface optimization by Born14 · Pull Request #46 · Born14/verify

Born14 · 2026-04-05T23:14:08Z

Summary

Introduces three major enhancements to the autonomous improvement engine:

Directive-Driven Improvement — Operators can now guide the improvement loop through an improve-directive.md file instead of modifying TypeScript. This follows AutoAgent's "program the meta-agent" pattern.
Prompt Surface Optimization — Extends the improvement loop to recognize and optimize LLM prompts and tunable thresholds within gate files (e.g., vision.ts, triangulation.ts), allowing the LLM to prefer prompt edits over logic changes when appropriate.
Continuous Mode — Implements hill-climbing iteration support, allowing the improvement engine to re-baseline and iterate after each accepted improvement.

Key Changes

improve-directive.ts (new)
- loadDirective() — Loads and parses improve-directive.md with structured fields (priority gates, focus mode, edit style) and custom instructions
- formatDirectiveForPrompt() — Injects directive context into LLM prompts
- applyDirectiveToBundles() — Prioritizes evidence bundles based on directive's priority gates
improve-prompt-surface.ts (new)
- Defines known prompt regions in gate files (vision.ts, triangulation.ts, hallucination.ts)
- extractPromptRegion() — Extracts actual prompt text from source files
- formatPromptSurfaceContext() — Provides LLM with prompt region metadata and tuning advice
- isPromptRegion() — Checks if a file/function is a tunable prompt surface
improve.ts (modified)
- Refactored runImproveLoop() into runSingleIteration() to support continuous mode
- Loads and applies directive at start of each iteration
- Injects directive and prompt surface context into bundle processing
- Tracks cumulative LLM usage and accepted improvements across iterations
- Early termination when no improvements found
self-test.ts (modified)
- Added CLI flags: --continuous, --max-iterations=N, --directive=PATH, --prompt-surface
- Updated help text with examples for all new modes
types.ts (modified)
- Extended ImproveConfig with maxIterations, directivePath, promptSurface fields
improve-directive.md (new)
- Template file with commented examples showing how to configure improvement priorities
improve-directive.test.ts (new)
- Unit tests for directive parsing, prompt formatting, and bundle prioritization
.gitignore (modified)
- Added .verify/ directory (created by test runs)
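
To illustrate what loadDirective() might do with the structured fields listed above, here is a minimal sketch of lenient directive parsing. The actual file format and field names are not shown in this PR description, so the `Directive` shape, the `parseDirective` name, and the accepted key spellings are all assumptions made for illustration.

```typescript
// Hypothetical directive shape; field names are guesses based on the PR text.
interface Directive {
  priorityGates: string[];
  focusMode: string | null;
  editStyle: string | null;
  instructions: string;
}

function parseDirective(markdown: string): Directive {
  const directive: Directive = {
    priorityGates: [],
    focusMode: null,
    editStyle: null,
    instructions: "",
  };
  const freeText: string[] = [];

  // Drop HTML comments so commented-out template examples are ignored.
  const body = markdown.replace(/<!--[\s\S]*?-->/g, "");

  for (const rawLine of body.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;

    // Accept "Key: value" or "key = value", with an optional list bullet.
    const match = /^(?:[-*]\s*)?([A-Za-z][A-Za-z _-]*?)\s*[:=]\s*(.+)$/.exec(line);
    // Normalize "Priority Gates", "priority_gates", "PRIORITY-GATES" to one key.
    const key = (match?.[1] ?? "").toLowerCase().replace(/[\s_-]+/g, "");
    const value = (match?.[2] ?? "").trim();

    if (key === "prioritygates") {
      directive.priorityGates = value
        .split(/[,\s]+/)
        .filter((gate) => gate.length > 0)
        .map((gate) => gate.toLowerCase());
    } else if (key === "focusmode") {
      directive.focusMode = value.toLowerCase();
    } else if (key === "editstyle") {
      directive.editStyle = value.toLowerCase();
    } else {
      // Anything that is not a known field is kept as a custom instruction.
      freeText.push(line);
    }
  }

  directive.instructions = freeText.join("\n");
  return directive;
}
```

Treating unrecognized lines as custom instructions keeps the parser forgiving: a misspelled field name degrades into free text passed to the LLM rather than a hard failure.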

Notable Implementation Details

Directive parsing is lenient (case-insensitive, flexible delimiters) to reduce friction
Prompt regions use start/end markers for robust extraction even if code changes
Directive context is injected into both diagnosis and fix generation prompts
Continuous mode re-baselines after each accepted improvement, enabling iterative refinement
Cumulative LLM usage is tracked and reported at the end of continuous runs
Early termination prevents wasted iterations when the improvement frontier is reached
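
The start/end marker approach to prompt regions could look roughly like the following. This is a sketch under assumptions: the marker strings, the `extractRegion` name, and the null-on-missing-marker behavior are illustrative and not taken from improve-prompt-surface.ts.

```typescript
// Returns the text between two marker strings, or null if either marker
// is missing. Marker values are supplied by the caller (hypothetical here).
function extractRegion(
  source: string,
  startMarker: string,
  endMarker: string,
): string | null {
  const start = source.indexOf(startMarker);
  if (start === -1) return null;

  // Search for the end marker only after the start marker.
  const contentStart = start + startMarker.length;
  const end = source.indexOf(endMarker, contentStart);
  if (end === -1) return null;

  return source.slice(contentStart, end).trim();
}

// Usage with made-up markers embedded in a gate file's source text.
const gateSource = [
  "const threshold = 0.8;",
  "// PROMPT:START",
  "You are a vision gate.",
  "// PROMPT:END",
  "export {};",
].join("\n");

const region = extractRegion(gateSource, "// PROMPT:START", "// PROMPT:END");
```

Returning null instead of throwing lets the caller detect markers that no longer match the gate source and skip that region.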

https://claude.ai/code/session_01SJkfKmU2V83UrCvgyH2JAD

… surface optimization

Three AutoAgent-inspired concepts integrated into the evidence-centric improve loop:

1. Continuous mode (--continuous / --max-iterations=N): Re-baselines after each accepted improvement and iterates, compounding small wins. Stops when an iteration produces no accepted candidates.
2. Directive-driven improvement (improve-directive.md): Externalizes improvement strategy into a human-editable Markdown file. Operators can specify priority gates, focus mode (false positives vs negatives), edit style preferences, and custom instructions — all injected into LLM diagnosis/fix prompts.
3. Prompt surface optimization (--prompt-surface): Extends the bounded surface to include LLM prompts within gates (vision.ts prompt, triangulation weights). The fix generator gets context about which regions are prompts vs logic, preferring prompt edits for prompt-related failures.

https://claude.ai/code/session_01SJkfKmU2V83UrCvgyH2JAD
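
One way the priority gates from a directive could reorder evidence bundles is a stable sort by gate rank. The sketch below assumes each bundle carries a `gate` name; that field and the `prioritizeBundles` function are hypothetical and stand in for whatever applyDirectiveToBundles() does in the PR.

```typescript
// Moves bundles whose gate appears in priorityGates to the front, in the
// order the gates are listed. Other bundles keep their relative order.
function prioritizeBundles<T extends { gate: string }>(
  bundles: T[],
  priorityGates: string[],
): T[] {
  const wanted = priorityGates.map((gate) => gate.toLowerCase());
  const rank = (gate: string): number => {
    const index = wanted.indexOf(gate.toLowerCase());
    return index === -1 ? wanted.length : index;
  };

  return bundles
    .map((bundle, position) => ({ bundle, position }))
    // Tie-break on original position so the sort is explicitly stable.
    .sort((a, b) => rank(a.bundle.gate) - rank(b.bundle.gate) || a.position - b.position)
    .map((entry) => entry.bundle);
}

const ordered = prioritizeBundles(
  [
    { id: 1, gate: "hallucination" },
    { id: 2, gate: "vision" },
    { id: 3, gate: "triangulation" },
    { id: 4, gate: "Vision" },
  ],
  ["triangulation", "vision"],
);
```

The input array is left unmodified, so a caller can still inspect the original evidence order.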

Born14 · 2026-04-05T23:17:46Z

Deferring until dirty count reaches 0 and the basic improve loop is stable.

What's merge-ready:

--continuous (hill-climbing iteration) — will cherry-pick when dirty = 0

What needs more work:

Directive file — not needed until discovery mode is active and we need to steer priorities
Prompt surface — marker strings are unvalidated against current gate code, needs design pass

Good research. Just not the priority right now. The priority is clearing the last 7 dirty scenarios and publishing v0.8.0 with clean sensors.

Extracts runSingleIteration() from runImproveLoop(). When maxIterations > 1, the loop re-baselines after each accepted fix and runs again. Stops early if an iteration produces no improvements.

Usage (disabled by default — single pass):

    bun run src/cli.ts improve --llm=gemini

Enable with --continuous:

    bun run src/cli.ts improve --llm=gemini --continuous
    bun run src/cli.ts improve --llm=gemini --continuous --max-iterations=10

Nightly.sh does NOT use --continuous (single pass per run, as before). Enable when ready by adding --continuous to nightly.sh improve command.

Directive system and prompt surface optimization from PR #46 intentionally NOT cherry-picked — deferred until needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
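
The hill-climbing behavior described in this commit message can be sketched as a bounded loop around a single-iteration callback. The types and the `runContinuous` name below are assumptions for illustration, and the callback is synchronous for brevity; the real runSingleIteration() presumably awaits LLM calls and re-baselines internally.

```typescript
// Hypothetical result of one improve pass.
interface IterationResult {
  accepted: number; // improvements accepted in this pass
  llmCalls: number; // LLM requests made in this pass
}

interface ContinuousSummary {
  iterations: number;
  totalAccepted: number;
  totalLlmCalls: number;
}

function runContinuous(
  runSingleIteration: (iteration: number) => IterationResult,
  maxIterations: number,
): ContinuousSummary {
  const summary: ContinuousSummary = {
    iterations: 0,
    totalAccepted: 0,
    totalLlmCalls: 0,
  };

  // Always run at least one pass, matching the single-pass default.
  const limit = Math.max(1, maxIterations);
  for (let i = 1; i <= limit; i++) {
    const result = runSingleIteration(i);
    summary.iterations = i;
    summary.totalAccepted += result.accepted;
    summary.totalLlmCalls += result.llmCalls;

    // Nothing accepted means another pass is unlikely to help: stop early.
    if (result.accepted === 0) break;
  }
  return summary;
}
```

With maxIterations set to 1 this reduces to the original single-pass behavior, which fits the description of continuous mode as opt-in.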

Born14 · 2026-04-07T13:24:08Z

Continuous mode cherry-picked to main (997babb). Directive system and prompt surface optimization deferred — not needed. Closing.

claude added 2 commits April 4, 2026 22:50

gitignore: add fixtures/demo-app/.verify/ state directory

39879e8

Born14 closed this Apr 7, 2026

Born14 deleted the claude/evaluate-self-improvement-loop-3jCG4 branch April 7, 2026 13:24

Born14 wants to merge 2 commits into main from claude/evaluate-self-improvement-loop-3jCG4
