feat(trask): quality-first free LLM failover chain #92

Open
th3w1zard1 wants to merge 2 commits into main from feat/trask-free-llm-quality-failover

th3w1zard1 (Contributor) commented May 29, 2026

Summary

  • Add CURATED_OPENROUTER_FREE_PRIORITY in @openkotor/config so free-profile compose tries quality-ranked OpenRouter :free models before scanning vendor/llm_fallbacks list order.
  • Raise rewrite compose attempts to 8 in research-wizard and web-research, matching primary + fallback budget.
  • Harden grounded compose for cached-index queries: passagesAnchoredForQuery backfill, preserveDistinctPassagePool, template-first fast path (REQ-C ≤30s), and markdown artifact stripping in answers.
  • Add weekly corpus refresh schedulers: Cloudflare cron Worker (infra/trask-reindex-scheduler) + GitHub Actions fallback (REQ-A).
  • Export TRASK_LLM_PROFILE=free and TRASK_RESEARCH_BUDGET_MS=30000 in trask_live_stack.sh for local Holocron/Discord parity.
  • Add a KB crosswalk for REQ-A/B/C and document the retrieval caveat: retrieval is a bounded top-k hybrid search, not an exhaustive corpus scan.
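
The curated-first ordering described in the first bullet can be pictured with a minimal sketch. It assumes both lists are plain arrays of model IDs and that curated entries missing from the vendor list are skipped. `orderFreeModels` and the model IDs are hypothetical and are not the code in @openkotor/config.

```typescript
// Hypothetical sketch: the real CURATED_OPENROUTER_FREE_PRIORITY contents
// and the vendor/llm_fallbacks format are not shown in this PR text.
const orderFreeModels = (
  curatedPriority: readonly string[],
  vendorOrder: readonly string[],
): string[] => {
  const available = new Set(vendorOrder);
  const seen = new Set<string>();
  const ordered: string[] = [];

  // Quality-ranked models first, but only those the vendor list still offers (assumption).
  for (const id of curatedPriority) {
    if (available.has(id) && !seen.has(id)) {
      seen.add(id);
      ordered.push(id);
    }
  }
  // Remaining models keep their original vendor list order.
  for (const id of vendorOrder) {
    if (!seen.has(id)) {
      seen.add(id);
      ordered.push(id);
    }
  }
  return ordered;
};

// Placeholder IDs, not real OpenRouter model names.
const candidates = orderFreeModels(
  ["vendor-b/model:free", "vendor-z/model:free"],
  ["vendor-a/model:free", "vendor-b/model:free", "vendor-c/model:free"],
);
console.log(candidates);
```

Keeping the vendor order as the tail means a stale curated list degrades to the previous behaviour rather than shrinking the candidate pool.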

Manual verification (Holocron :4010, browser + HTTP)

All five expert queries from data/trask/eval/verification-queries.json:

| Query | Grounding | Sources | Latency |
|---|---|---|---|
| TSLPatcher 2DA/TLK | grounded | ≥2 | 8.3s |
| MDLOps / Blender | grounded | ≥2 | 6.6s |
| Widescreen HUD / ini | grounded | ≥2 | 5.1s |
| Save games / Windows | grounded | ≥2 | 6.7s |
| reone runtime | grounded | ≥2 | 8.3s |

Stack: bash scripts/trask_live_stack.sh (Worker :8787, indexer :8790, live crawl off).

Test plan

  • pnpm build
  • node --test packages/trask/dist/grounded-evidence.test.js (20/20)
  • Live stack health (4010, 8787, 8790)
  • Browser manual gate on Holocron (saves query: Done · grounded, 2 https sources)
  • HTTP /api/trask/ask poll for all 5 expert queries ≤30s
  • Full pnpm holocron:e2e green on CI
  • pnpm verify:trask-discord (requires live bot token + stack)

Residual Review Findings

  • medium packages/trask/src/grounded-evidence.ts: BRIEF_MAX_CLAIM_LINES raised 2→5 — confirm Discord brief UX still wants ~2 visible lines vs pool size 5.
  • low packages/trask/src/research-wizard.ts: LLM claim replacement guard uses URL count only; optional hardening to require grounding improvement.
  • low Missing unit tests for tryGroundedCompose thin-citation → template fallback and brief fallback dual-URL path.

Curate OpenRouter :free model priority before vendor scan order, raise
rewrite compose attempts to 8, and harden Discord brief fallbacks when LLM
rewrite fails. Export TRASK_LLM_PROFILE=free in live stack scripts for
Holocron/Discord parity.
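
A bounded failover loop with a template fallback, as this commit message describes, might look roughly like the sketch below. The function name, the callback signature, and the return shape are all assumptions made for illustration. Only the attempt cap of 8 and the existence of a fallback come from the PR text.

```typescript
// Hypothetical sketch: names and signatures are illustrative, not the PR's code.
type ComposeCall = (model: string) => Promise<string | null>;

interface ComposeResult {
  text: string;
  model: string | null; // null means the template fallback was used
  attempts: number;
}

const composeWithFailover = async (
  models: readonly string[],
  maxAttempts: number,
  call: ComposeCall,
  templateFallback: () => string,
): Promise<ComposeResult> => {
  let attempts = 0;
  for (const model of models.slice(0, maxAttempts)) {
    attempts += 1;
    try {
      const text = await call(model);
      if (text !== null && text.trim() !== "") {
        return { text, model, attempts };
      }
    } catch {
      // Treat a provider error like an empty result and move to the next model.
    }
  }
  // Every attempt failed or the cap was reached: fall back to the template answer.
  return { text: templateFallback(), model: null, attempts };
};
```

A caller would pass the ordered free-model list, the cap (8 in this PR), a function that performs the rewrite call, and a function that renders the grounded template.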
@chatgpt-codex-connector
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Align Holocron grounding with cached-index retrieval: backfill anchored
passages for multi-URL compose, preserve upstream passage pools, and skip
LLM retries when the grounded template already meets citation bars so expert
queries finish under the 30s budget. Add Cloudflare/GitHub weekly reindex
schedulers and KB crosswalk for REQ-A/B/C.
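
The template-first fast path can be read as a simple gate: if the grounded template already cites enough distinct sources, no LLM rewrite is attempted. The sketch below is one possible form of that gate. The helper names are hypothetical, and the default threshold of 2 is taken from the "≥2 sources" figures in the verification table rather than from the PR's code.

```typescript
// Hypothetical helpers: the PR's actual citation bar is not shown in this text.
const countDistinctHttpsUrls = (text: string): number => {
  const urls = text.match(/https:\/\/[^\s)\]>"']+/gu) ?? [];
  // Drop trailing sentence punctuation so "…/x." and "…/x" count as one URL.
  const normalized = urls.map((u) => u.replace(/[.,;:!?]+$/u, ""));
  return new Set(normalized).size;
};

const templateMeetsCitationBar = (
  templateAnswer: string,
  minDistinctUrls = 2,
): boolean => countDistinctHttpsUrls(templateAnswer) >= minDistinctUrls;

const answer = "Back up saves first (https://a.example/saves). See also https://b.example/paths.";
if (templateMeetsCitationBar(answer)) {
  console.log("template is sufficiently cited; skip LLM rewrite");
}
```

Checking the template before any model call is what keeps the common case inside the 30s budget, since the retries are the expensive part.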
* vector index, so "in Cloudflare" means scheduling here, storage on the host.
*/

const trimTrailingSlashes = (value: string): string => value.replace(/\/+$/, "");
Comment on lines +485 to +486
value
.replace(/!\[([^\]]*)\]\([^)]*\)/gu, "$1")
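
For reference, the `replace` call quoted above swaps Markdown image syntax for its alt text. A standalone sketch of just that step follows. The wrapper name `stripImageMarkdown` is hypothetical, and the rest of the chain in the PR is not shown here.

```typescript
// Hypothetical wrapper around the single replace call quoted in the review snippet.
const stripImageMarkdown = (value: string): string =>
  // ![alt](target) becomes alt; an empty alt text leaves nothing behind.
  value.replace(/!\[([^\]]*)\]\([^)]*\)/gu, "$1");

console.log(stripImageMarkdown("See ![HUD screenshot](https://example.com/hud.png) here"));
```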
export const extractNumberedSourceUrls = (sourceLines: readonly string[]): Map<number, string> => {
const map = new Map<number, string>();
for (const line of sourceLines) {
const match = line.match(/^\s*(\d+)\.\s+.+\s-\s+(https?:\S+)/u);
if (!match) continue;
const url = match[2]!.replace(/[.,;:!?)]+$/u, "");
};
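
The `extractNumberedSourceUrls` excerpt above is cut off before the map is populated. The sketch below completes it under one stated assumption: that each matched URL is stored under its list number. It uses a different name to make clear it is not the PR's code.

```typescript
// Sketch only: the tail of the real function is not visible in the diff excerpt.
const extractNumberedSourceUrlsSketch = (
  sourceLines: readonly string[],
): Map<number, string> => {
  const map = new Map<number, string>();
  for (const line of sourceLines) {
    // Expected shape: "<n>. <title> - <url>"
    const match = line.match(/^\s*(\d+)\.\s+.+\s-\s+(https?:\S+)/u);
    if (!match) continue;
    // Trim trailing punctuation that is unlikely to be part of the URL.
    const url = match[2]!.replace(/[.,;:!?)]+$/u, "");
    // Assumption: the elided remainder stores the URL under its list number.
    map.set(Number(match[1]), url);
  }
  return map;
};

const sources = extractNumberedSourceUrlsSketch([
  "1. TSLPatcher readme - https://example.com/tslpatcher.",
  "not a source line",
  " 2. reone wiki - https://example.com/reone)",
]);
console.log(sources);
```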

const json = (status: number, body: unknown): Response =>
new Response(JSON.stringify(body), {