feat: add MEMCTX_LLM_MODEL env, async agent_end, model attribution by ersintarhan · Pull Request #71 · weauratech/pi-memctx

ersintarhan · 2026-05-14T15:17:32Z

Closes #70

Motivation

Heavy host models (Claude Opus, etc.) make pi-memctx's agent_end hook block the parent agent turn for 10-30s on rich turns, degrading the coding UX. There is also no way to delegate pi-memctx's internal LLM work (curate, gateway judge, query expansion, pack-generate) to a faster/cheaper model while keeping the host on a premium model for reasoning.

Changes (three orthogonal, opt-in improvements)

1. Async `agent_end` — highest impact

The hook now wraps the curate logic in void (async () => { ... })() so the parent turn returns immediately. Curate runs in the background, and errors are logged through ctx.logger?.error. Validated in production: with Opus as the host model, rich turns now feel instant.
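For illustration, a minimal sketch of the fire-and-forget shape described above. The types and the `curateTurn` stand-in are hypothetical, not taken from pi-memctx; only the `void (async () => { ... })()` wrapper and the `ctx.logger?.error` call come from this description.

```typescript
// Hypothetical minimal types; the real pi-memctx hook context is richer.
type Logger = { error: (msg: string, err?: unknown) => void };
type HookContext = { logger?: Logger };
type TurnEvent = { messages?: unknown[] };

// Stand-in for the slow, LLM-backed curate step (assumption).
async function curateTurn(messages: unknown[]): Promise<number> {
	await new Promise((resolve) => setTimeout(resolve, 50));
	return messages.length;
}

// Fire-and-forget: the handler returns before curation finishes.
function onAgentEnd(event: TurnEvent, ctx: HookContext): void {
	void (async () => {
		try {
			const messages = event.messages ?? [];
			if (messages.length === 0) return;
			await curateTurn(messages);
		} catch (err) {
			// Errors never propagate to the host turn; they are only logged.
			ctx.logger?.error("[pi-memctx] curate failed (async)", err);
		}
	})();
}
```

The trade-off is that the caller can no longer observe completion or failure, so logging (or some other reporting channel) becomes the only signal that background curation broke.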

2. `MEMCTX_LLM_MODEL` env override

export MEMCTX_LLM_MODEL=anthropic/claude-haiku-4-5
# or plain id
export MEMCTX_LLM_MODEL=claude-haiku-4-5

New resolveLlmModel(ctx) helper:

- Looks up env value in ctx.modelRegistry by id or provider/model
- Warns + falls back to ctx.model if invalid (no hard fail)
- Replaces direct ctx.model reads in 10 call sites
- completeJsonWithLlm and completeJsonWithModel also accept an explicit model param
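A rough sketch of the lookup order listed above, under stated assumptions: the `Model`, `ModelRegistry`, and `Ctx` shapes are simplified stand-ins, the env value is passed in as a parameter instead of being read from `process.env`, and the guard for a missing registry is an addition for safety. Only `find(provider, modelId)` and `getAll()` are taken from the quoted diff.

```typescript
// Simplified, hypothetical shapes; the real ExtensionContext is richer.
type Model = { id: string; provider: string };
type ModelRegistry = {
	find(provider: string, modelId: string): Model | undefined;
	getAll(): Model[];
};
type Ctx = { model?: Model; modelRegistry?: ModelRegistry };

function resolveLlmModelSketch(ctx: Ctx, envModel: string | undefined): Model | undefined {
	// Unset override (or no registry to look in): keep the existing behavior.
	if (!envModel || !ctx.modelRegistry) return ctx.model;

	// Try the provider/model format first.
	const slashIdx = envModel.indexOf("/");
	if (slashIdx > 0) {
		const provider = envModel.slice(0, slashIdx);
		const modelId = envModel.slice(slashIdx + 1);
		const resolved = modelId ? ctx.modelRegistry.find(provider, modelId) : undefined;
		if (resolved) return resolved;
	}

	// Then try a plain id match across all registered models.
	const found = ctx.modelRegistry.getAll().find((m) => m.id === envModel);
	if (found) return found;

	// Not found: warn and fall back instead of failing hard.
	console.warn(`[pi-memctx] MEMCTX_LLM_MODEL=${envModel} not found in registry, falling back to ctx.model`);
	return ctx.model;
}
```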

3. Optional `model:` frontmatter attribution

buildNote and llmArchitectureNote accept an optional model parameter. When provided, frontmatter gets:

---
type: observation
model: claude-haiku-4-5
...
---

Audit trail for which model wrote each memory. Old notes without the field continue to work.
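As a sketch of the conditional rendering, assuming a simplified note shape (the real buildNote takes more fields and its signature is not shown here; `NoteInput` and `buildNoteSketch` are hypothetical names):

```typescript
// Hypothetical, reduced note shape for illustration only.
type NoteInput = { type: string; body: string; model?: string };

function buildNoteSketch(note: NoteInput): string {
	const frontmatter = [`type: ${note.type}`];
	// The model line is emitted only when a model id was supplied,
	// so notes written without attribution keep their old shape.
	if (note.model) frontmatter.push(`model: ${note.model}`);
	return ["---", ...frontmatter, "---", "", note.body, ""].join("\n");
}
```

Readers of old notes need no migration under this approach, since the absence of the field simply means "unattributed".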

Tests

10 new unit tests in test/unit.test.ts:
- 5 × resolveLlmModel (unset, by id, provider/model, invalid → warn + fallback, no model → undefined)
- 2 × buildNote model attribution (with model, without)
- 2 × llmArchitectureNote model attribution (with model, without)
- 1 × memctx_save integration test (saved note carries model: field)
Total: 111 tests passing (101 baseline + 10 new)
bun run ci (typecheck + tests + e2e) green

Backward compatibility

Zero breaking changes:

- MEMCTX_LLM_MODEL unset + no explicit model param → identical to v0.13.1
- agent_end hook signature unchanged
- model: frontmatter only rendered when supplied

Real-world validation

Patch deployed to a Sentirum dev workspace via symlink (~/.pi/agent/extensions/pi-memctx) and run against a 49-doc pack with Claude Opus as the host model. Observed:

✅ Parent turn ends immediately on rich turns (async hook proven)
✅ Generated memories carry model: <id> field (attribution proven)
✅ Curate quality unchanged (English, structured, taxonomy-correct: action / context / observation / runbook)
✅ Three pi-memctx packs running side-by-side (cli-tools, pi-memctx self-pack, senti-sportsbook) with no conflict

Notes

- No build step changes (still raw TS via files: ['index.ts', ...])
- No new dependencies
- ~/.pi/agent/extensions/*/index.ts symlink loader confirmed working for local dev iteration

Use case for the env override

A typical setup that benefits:

# Host model = premium reasoning
# pi-memctx model = fast + cheap for curate volume
MEMCTX_LLM_MODEL=anthropic/claude-haiku-4-5 pi

This effectively splits the model budget: keep premium reasoning for the user-facing agent, delegate background memory work to a cheaper tier.

Three orthogonal improvements; fully backward compatible.

1. MEMCTX_LLM_MODEL env override
- New resolveLlmModel(ctx) helper checks process.env.MEMCTX_LLM_MODEL
- Supports plain 'id' and 'provider/model' formats via ctx.modelRegistry
- Warns + falls back to ctx.model if env value is invalid
- Refactored 10 call sites that previously used ctx.model directly
- completeJsonWithLlm and completeJsonWithModel accept optional model param

2. Async agent_end (fire-and-forget curate)
- agent_end hook now wraps curate in 'void (async () => { ... })()'
- Parent turn returns immediately; curate runs in the background
- Errors logged via ctx.logger?.error, never thrown to host
- Significantly improves UX with slow host models (opus, etc.)

3. Optional 'model:' frontmatter attribution
- buildNote and llmArchitectureNote accept optional model parameter
- Renders 'model: <id>' field only when supplied (backward compatible)
- saveMemoryCandidate threads resolveLlmModel(ctx)?.id through
- Enables audit trail of which model curated each memory

Tests: 10 new unit tests (5 resolveLlmModel + 2 buildNote + 2 llmArchitectureNote + 1 memctx_save integration). Total: 111 passing. 'bun run ci' green.

Backward compat: with MEMCTX_LLM_MODEL unset and no model passed, behavior is identical to v0.13.1.

Copilot

Pull request overview

Adds opt-in LLM model selection and model attribution for pi-memctx memory generation, while making agent_end curation run asynchronously to reduce user-facing latency.

Changes:

- Adds resolveLlmModel and routes LLM call sites through MEMCTX_LLM_MODEL fallback logic.
- Adds optional model: frontmatter to generated notes and LLM architecture notes.
- Converts agent_end memory curation to fire-and-forget background execution and adds related unit tests.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `index.ts` | Implements model resolution, model attribution, and async `agent_end` curation. |
| `test/unit.test.ts` | Adds tests for model resolution and model frontmatter attribution. |
| `.gitignore` | Ignores local `.brv/` test artifact directory. |

Comments suppressed due to low confidence (5)

index.ts:5314

Because the fire-and-forget task continues after this handler returns while activePack/activePackPath remain global mutable state, a pack switch or /memctx-init during the background LLM call can cause the turn's candidates to be saved or queued against the wrong pack. Snapshot the active pack/path for this task and ensure the save path uses that snapshot instead of the current globals.

		void (async () => {
			try {
				const messages = (event as any).messages ?? [];
				const { snippets, hasWrites, richDiscovery } = collectTurnSnippets(messages);
				if (snippets.length === 0) return;

index.ts:5403

The PR description says async curate errors are logged through ctx.logger?.error, but this implementation writes to console.error instead. If the host only captures extension logs via ctx.logger, background failures may be invisible to users/operators despite the advertised behavior.

			} catch (err) {
				console.error("[pi-memctx] curate failed (async)", err);
			}

index.ts:913

An invalid MEMCTX_LLM_MODEL logs a warning every time this helper is called. Several request paths call resolveLlmModel multiple times (and pack refresh can call it per repository), so one bad env value can flood logs; consider caching the resolution/warn result or warning once per value.

		const allModels = ctx.modelRegistry.getAll();
		const found = allModels.find((m) => m.id === envModel);
		if (found) return found;
		// Not found — warn and fall back to ctx.model
		console.warn(`[pi-memctx] MEMCTX_LLM_MODEL=${envModel} not found in registry, falling back to ctx.model`);
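One way to keep a bad value from flooding logs is to warn once per distinct value. The following is only a sketch of that idea; `warnedValues` and `warnOnce` are hypothetical names, not part of the PR.

```typescript
// Values that have already produced a warning in this process.
const warnedValues = new Set<string>();

// Returns true if a warning was emitted, false if it was suppressed.
function warnOnce(value: string, warn: (msg: string) => void = console.warn): boolean {
	if (warnedValues.has(value)) return false;
	warnedValues.add(value);
	warn(`[pi-memctx] MEMCTX_LLM_MODEL=${value} not found in registry, falling back to ctx.model`);
	return true;
}
```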

index.ts:5314

This newly fire-and-forget agent_end path is not covered by the added tests, even though it changes execution ordering and error handling. Please add a hook test that verifies the handler returns without awaiting curation and that background failures are captured/logged, so regressions in the async behavior are caught.

		void (async () => {
			try {
				const messages = (event as any).messages ?? [];
				const { snippets, hasWrites, richDiscovery } = collectTurnSnippets(messages);
				if (snippets.length === 0) return;

index.ts:905

When MEMCTX_LLM_MODEL is set, this assumes ctx.modelRegistry is always present. Several existing tests invoke changed call paths such as memctx_save with an empty context object, so running the suite with the new env var set will throw here instead of falling back; guard the registry access or update the affected contexts/tests to keep the env override truly opt-in.

	if (envModel) {
		// Try provider/model format first
		const slashIdx = envModel.indexOf("/");
		if (slashIdx > 0) {
			const provider = envModel.slice(0, slashIdx);
			const modelId = envModel.slice(slashIdx + 1);
			if (provider && modelId) {
				const resolved = ctx.modelRegistry.find(provider, modelId);
				if (resolved) return resolved;

+/**
+ * Resolve the LLM model to use for pi-memctx operations.
+ *
+ * Priority:
+ *   1. MEMCTX_LLM_MODEL env var (if set and found in registry)
+ *   2. ctx.model (pi's active model — existing behavior)
+ */
+export function resolveLlmModel(ctx: ExtensionContext): ExtensionContext["model"] {


magnonta

Hey, I did a review of this alongside #70. The direction is solid and the problem is real — I just found a few things that need fixing before we can merge.

I posted design direction on the five open questions in #70, please check that first.

Blocking

1. `resolveLlmModel` needs to go through `MemctxConfig` / `envOrConfig`

The way I see it, this is the main architectural issue. The PR reads process.env.MEMCTX_LLM_MODEL directly on every call to resolveLlmModel(), but we already have the MemctxConfig / applyMemctxConfig() / envOrConfig() pattern for all 15 existing settings. Bypassing that creates three problems:

- Inconsistent with how everything else is configured
- No config.json support — user can't persist llmModel without env vars
- console.warn fires on every call when the env value is invalid (flood)

The fix is straightforward: add llmModel?: string to MemctxConfig, resolve once in applyMemctxConfig() via envOrConfig(), and have resolveLlmModel use the already-resolved module-level variable.
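A sketch of the resolve-once shape suggested here. The real `envOrConfig` and `applyMemctxConfig` signatures are not shown in this thread, so everything below is an assumption: the env map is passed in explicitly, and the env value is assumed to take precedence over the config value.

```typescript
// Hypothetical shapes: the real MemctxConfig has many more fields.
type MemctxConfig = { llmModel?: string };
type Env = Record<string, string | undefined>;

// Assumed precedence: a non-empty env value wins over the config file value.
function envOrConfigSketch(env: Env, name: string, configValue?: string): string | undefined {
	const fromEnv = env[name]?.trim();
	return fromEnv ? fromEnv : configValue;
}

// Module-level value, resolved once when configuration is applied.
let configuredLlmModel: string | undefined;

function applyMemctxConfigSketch(config: MemctxConfig, env: Env): void {
	configuredLlmModel = envOrConfigSketch(env, "MEMCTX_LLM_MODEL", config.llmModel);
}

// Call sites read the cached value instead of touching the environment.
function getConfiguredLlmModel(): string | undefined {
	return configuredLlmModel;
}
```

With this shape, validation and any warning can also happen once at apply time, which addresses the log-flood point as a side effect.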

2. `confirm` mode inside fire-and-forget

This one is a real bug. The entire agent_end body is wrapped in void (async () => { ... })(), but when autosaveMode === "confirm" it still calls ctx.ui.confirm() inside that block — after the turn already returned. ctx.ui.confirm needs an active turn to prompt the user, so this will either deadlock, throw, or silently fail.

The confirm path needs to run synchronously (before the void wrapper), or we skip curate entirely when in confirm mode.
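A minimal sketch of the second option (skip detached curate in confirm mode). `scheduleAutosave` is a hypothetical helper, and `"auto"` is a placeholder for any non-interactive mode; only the `"confirm"` mode name comes from this thread.

```typescript
// "auto" is a hypothetical non-interactive mode used for illustration.
type AutosaveMode = "confirm" | "auto";

// Returns true if background curation was started, false if it was skipped.
function scheduleAutosave(
	mode: AutosaveMode,
	curate: () => Promise<void>,
	onError: (err: unknown) => void,
): boolean {
	// A detached task has no active turn, so it must never reach a UI prompt.
	if (mode === "confirm") return false;
	void curate().catch(onError);
	return true;
}
```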

3. Global state race condition

The fire-and-forget closure reads activePack and activePackPath by reference. If the user switches packs while background curate is still running, candidates get saved to the wrong pack. Snapshot both values at the top:

void (async () => {
    const pack = activePack;
    const packPath = activePackPath;
    // use these instead of globals
})();

4. `memctx_save` model attribution is wrong

memctx_save passes resolveLlmModel(ctx)?.id as the model field, but memctx_save is called by the host agent. The content was generated by the host model (e.g. Opus), not the internal memctx model (e.g. Haiku). It should be ctx.model?.id here. resolveLlmModel(ctx)?.id is correct for curate-generated and pack-generate notes — just not for memctx_save.
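The rule can be stated as a small function. This is a sketch only; `NoteSource` and `attributedModelId` are hypothetical names, and the two ids are passed in rather than read from `ctx`.

```typescript
// Where a note's content originated (names are illustrative).
type NoteSource = "memctx_save" | "curate" | "pack-generate";

function attributedModelId(
	source: NoteSource,
	hostModelId: string | undefined,
	memctxModelId: string | undefined,
): string | undefined {
	// memctx_save content is written by the host agent, so credit the host model;
	// curate and pack-generate content comes from the internal memctx model.
	return source === "memctx_save" ? hostModelId : memctxModelId;
}
```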

Important

5. `ctx.signal` in fire-and-forget

completeJsonWithModel passes ctx.signal to the LLM call. After the turn returns, the host may abort the signal and background curate fails silently. Maybe the fire-and-forget path should use its own AbortController, or at least handle signal abort gracefully.
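One possible shape for a task-owned signal, sketched with the standard `AbortController` API. `createBackgroundSignal` and `isAbortError` are hypothetical helpers, and the timeout is an assumption added so a detached task cannot run forever.

```typescript
type BackgroundSignal = {
	signal: AbortSignal;
	cancel: () => void; // abort the background work explicitly
	dispose: () => void; // clear the timeout once the work has finished
};

// Creates a signal owned by the background task rather than by the host turn.
function createBackgroundSignal(timeoutMs: number): BackgroundSignal {
	const controller = new AbortController();
	const timer = setTimeout(() => controller.abort(), timeoutMs);
	return {
		signal: controller.signal,
		cancel: () => {
			clearTimeout(timer);
			controller.abort();
		},
		dispose: () => clearTimeout(timer),
	};
}

// Lets the catch block treat aborts as expected rather than as failures.
function isAbortError(err: unknown): boolean {
	return typeof err === "object" && err !== null && (err as { name?: unknown }).name === "AbortError";
}
```

The background task would pass `signal` to the LLM call in place of `ctx.signal` and call `dispose()` in a `finally` block.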

6. Double `resolveLlmModel` calls

Several spots resolve the model for a guard check, then call completeJsonWithLlm which resolves it again internally. Example in judgeGatewayMemory:

const resolvedModel = resolveLlmModel(ctx);        // first
if (precisionRepoQuestion && resolvedModel) {
    await completeJsonWithModel(ctx, ...);          // resolves again inside
}

Same in buildRetrievalQueries, curateMemoryCandidatesFromTurn, and others. Each call hits process.env + getAll(). With the envOrConfig fix this becomes a non-issue, but worth cleaning up.
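A sketch of the cleanup: resolve once at the call site and hand the result to the completion helper through its explicit model parameter. The function bodies and names below are hypothetical stand-ins, not the real judgeGatewayMemory or completeJsonWithModel.

```typescript
type Model = { id: string };
type Ctx = { model?: Model };

// Stand-in for an LLM completion helper that accepts an explicit model.
async function completeJsonSketch(ctx: Ctx, prompt: string, model: Model | undefined = ctx.model): Promise<string> {
	if (!model) throw new Error("no model available");
	return JSON.stringify({ model: model.id, prompt });
}

async function judgeSketch(
	ctx: Ctx,
	resolve: (ctx: Ctx) => Model | undefined,
	precisionRepoQuestion: boolean,
): Promise<string | undefined> {
	const resolvedModel = resolve(ctx); // resolved exactly once
	if (!precisionRepoQuestion || !resolvedModel) return undefined;
	// Reuse the resolved model; the helper does not need to look it up again.
	return completeJsonSketch(ctx, "judge", resolvedModel);
}
```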

7. No tests for the async behavior

The 10 new tests cover resolveLlmModel and attribution well, but nothing verifies that agent_end actually returns without awaiting, that background errors are captured, or the confirm mode edge case.
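A sketch of how such a check could be structured, using a manually controlled promise so the test decides when curation "finishes". All names are hypothetical, and the handler factory is a simplified stand-in for the real agent_end hook.

```typescript
// A promise whose settlement is controlled by the test.
function deferred<T>() {
	let resolve!: (value: T) => void;
	let reject!: (err: unknown) => void;
	const promise = new Promise<T>((res, rej) => {
		resolve = res;
		reject = rej;
	});
	return { promise, resolve, reject };
}

// Simplified stand-in for the fire-and-forget hook.
function makeAgentEndHandler(curate: () => Promise<void>, onError: (err: unknown) => void) {
	return function onAgentEnd(): void {
		void (async () => {
			try {
				await curate();
			} catch (err) {
				onError(err);
			}
		})();
	};
}

async function checkAsyncBehavior(): Promise<{ returnedBeforeCurate: boolean; errorCaptured: boolean }> {
	const gate = deferred<void>();
	const errors: unknown[] = [];
	let finished = false;
	const handler = makeAgentEndHandler(async () => {
		await gate.promise;
		finished = true;
	}, (err) => errors.push(err));

	handler(); // must return while curation is still pending
	const returnedBeforeCurate = !finished;

	gate.reject(new Error("boom")); // simulate a background failure
	await new Promise((resolve) => setTimeout(resolve, 0)); // let the catch run
	return { returnedBeforeCurate, errorCaptured: errors.length === 1 };
}
```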

8. `/memctx-review approve` attribution timing

The model is resolved at approve time, but the candidate was curated earlier — possibly in a different session with a different model config. The model should be captured at curate time and stored with the queued candidate, not resolved when the user approves.
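As a sketch, capturing the model at curate time could look like the following. `QueuedCandidate`, `queueCandidate`, and `approveCandidate` are hypothetical names; the real queue format is not shown in this thread.

```typescript
// The model id travels with the candidate from the moment it is curated.
type QueuedCandidate = { content: string; model?: string };

function queueCandidate(queue: QueuedCandidate[], content: string, curateModelId?: string): QueuedCandidate {
	const candidate: QueuedCandidate = { content, model: curateModelId };
	queue.push(candidate);
	return candidate;
}

// Approval reads the stored attribution and does not consult the current config.
function approveCandidate(candidate: QueuedCandidate): { content: string; model?: string } {
	return { content: candidate.content, model: candidate.model };
}
```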

Minor

9. `.brv/` in `.gitignore`

This looks like a local test artifact from your workspace. Should go in .git/info/exclude rather than the shared .gitignore.

10. Indentation

The first few lines inside void (async () => { are re-indented, but the rest of the body keeps old indentation. Mixed indentation in the same block — not a runtime issue but makes the code inconsistent.

11. `ctx.modelRegistry` guard

If MEMCTX_LLM_MODEL is set but ctx.modelRegistry is absent (some test contexts), resolveLlmModel throws. Worth guarding.

12. Curate errors should show in `/memctx-status`

console.error alone for background failures is not enough. If curate is failing systematically, the user has no way to know. A counter in llmStats (e.g. curateErrors) that shows in /memctx-status would help.
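A sketch of the counter idea. `llmStats` and `curateErrors` are the names suggested above; the `lastCurateError` field and both helper functions are hypothetical additions, and the status line format is only an example.

```typescript
type LlmStats = { curateErrors: number; lastCurateError?: string };

const llmStats: LlmStats = { curateErrors: 0 };

// Called from the background task's catch block.
function recordCurateError(stats: LlmStats, err: unknown): void {
	stats.curateErrors += 1;
	stats.lastCurateError = err instanceof Error ? err.message : String(err);
}

// One line a status command could print.
function formatCurateStatus(stats: LlmStats): string {
	if (stats.curateErrors === 0) return "curate errors: 0";
	return `curate errors: ${stats.curateErrors} (last: ${stats.lastCurateError})`;
}
```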

Happy to review a v2 once these are addressed. The core ideas are good — just need to fit them into the existing patterns.

ersintarhan · 2026-05-15T08:36:42Z

Addressed the review feedback in a5aaf8f.

What changed

- Moved the internal LLM model override into the memctx config flow:
  - llmModel in config
  - MEMCTX_LLM_MODEL env override
  - model registry guard
  - one-time warning for invalid configured models
- Fixed model attribution:
  - memctx_save now attributes to the host model (ctx.model?.id)
  - reviewed candidates preserve curate-time model attribution
- Hardened async agent_end autosave:
  - snapshots pack/path/model before background work
  - skips detached UI confirm prompts
  - uses a separate abort signal
  - tracks background curate errors in /memctx-status
- Removed the shared .brv/ ignore entry and moved it to local .git/info/exclude.
- Updated docs and added unit coverage for:
  - registry fallback
  - one-time invalid-model warnings
  - host-model attribution

Validation

bun run ci ✅

Copilot AI review requested due to automatic review settings May 14, 2026 15:17

Copilot started reviewing on behalf of ersintarhan May 14, 2026 15:18

Copilot AI reviewed May 14, 2026

magnonta mentioned this pull request May 14, 2026

Async agent_end + MEMCTX_LLM_MODEL env override + model attribution #70

Open

magnonta requested changes May 14, 2026

fix: address llm async autosave review feedback

a5aaf8f

ersintarhan wants to merge 2 commits into weauratech:main from ersintarhan:feat/memctx-llm-model-async-agent-end

3 participants
