refactor(chat): shared history builder with model/plan/source-aware budgets by ABB65 · Pull Request #52 · Contentrain/studio

ABB65 · 2026-05-15T19:28:57Z

Summary

Studio chat and the Conversation API carried two near-identical copies of the sliding-window history builder, each with a hard-coded 8K token ceiling, a magic loadConversationMessages(..., 50) row cap, and divergent tool_calls / toolCalls casing handling. The 8K ceiling was left over from earlier model defaults and is undersized for Claude 4 models, which have 200K-token context windows.

This PR collapses both into server/utils/conversation-history.ts and makes the budget model/plan/source-aware.

Budget table (review-agreed)

// Model — capability + pricing tier
Haiku 4.5            12K
Sonnet 4 / 4.5 / 4.6 32K / 40K / 48K
Opus 4 / 4.1 / 4.7   32K / 32K / 48K
fallback             16K

// Plan — Contentrain's per-message margin posture
free          0      // defensive backstop, gated upstream
starter       0.75x
pro           1x
enterprise    1.25x
community     1x

// Source — who pays for the input tokens
studio / api  1x
byoa          1.5x   // user pays Anthropic directly

maxTokens = base × plan × source. Sonnet/Opus values are conservative on purpose — once prompt caching lands (cache reads cost ~10% of base input) the model table can safely grow. Sources: models overview, pricing.
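
The arithmetic above can be sketched as follows. This is a minimal illustration, not the code in server/utils/conversation-history.ts: the model keys, the function name sketchMaxTokens, the reading of "K" as 1,000 tokens, and the rounding are all assumptions, and the derived rowLimit is left out because the PR text does not say how it is computed.

```typescript
// Hypothetical sketch of maxTokens = base × plan × source.
// Model keys and names are illustrative, not the repo's identifiers.
type Plan = "free" | "starter" | "pro" | "enterprise" | "community";
type Source = "studio" | "api" | "byoa";

// Base budgets from the table, reading "K" as 1,000 tokens (assumption).
const MODEL_BASE: Record<string, number> = {
  "haiku-4.5": 12000,
  "sonnet-4": 32000,
  "sonnet-4.5": 40000,
  "sonnet-4.6": 48000,
  "opus-4": 32000,
  "opus-4.1": 32000,
  "opus-4.7": 48000,
};
const FALLBACK_BASE = 16000;

const PLAN_MULTIPLIER: Record<Plan, number> = {
  free: 0, // defensive backstop, gated upstream
  starter: 0.75,
  pro: 1,
  enterprise: 1.25,
  community: 1,
};

const SOURCE_MULTIPLIER: Record<Source, number> = {
  studio: 1,
  api: 1,
  byoa: 1.5, // user pays Anthropic directly
};

function sketchMaxTokens(model: string, plan: Plan, source: Source): number {
  // Unknown models fall back to the 16K base.
  const base = MODEL_BASE[model] ?? FALLBACK_BASE;
  // Flooring is an assumption; the table values multiply out to integers anyway.
  return Math.floor(base * PLAN_MULTIPLIER[plan] * SOURCE_MULTIPLIER[source]);
}
```

Under these assumptions, Sonnet 4.5 on an enterprise plan with BYOA works out to 40,000 × 1.25 × 1.5 = 75,000 tokens, while Haiku 4.5 on a starter plan through Studio gets 12,000 × 0.75 = 9,000.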

Changes

New server/utils/conversation-history.ts — selectHistoryBudget() + buildPromptMessages(). Pure functions; no DB, no provider.
chat.post.ts — model selection moved before history (model drives budget); IIFE + push loop replaced with two helper calls.
ee/enterprise/conversation-api.ts — same pattern; duplicate loadConversationMessages wrapper renamed loadConversationHistoryForResponse and now serves only the /history.get route's enriched JSON response shape (the runtime chat path doesn't need usage/createdAt fields).
Integration test mocks — added vi.mock('~~/server/utils/conversation-history') to chat-route and overage-soft-cap integration files. One ownership-flow assertion relaxed from hard-coded 50 to expect.any(Number) (rowLimit is derived; budget arithmetic covered by the new unit tests).

Test plan

pnpm test — 605 passed (590 + 15 new unit tests in conversation-history.test.ts)
pnpm typecheck clean
pnpm lint — 0 errors on changed files

The 15 new unit tests cover the matrix: per-model budget, fallback, plan multipliers (starter/enterprise/community/unknown), free→0 defensive backstop, BYOA 1.5x, API no-multiplier, rowLimit scaling, empty history, fits-in-budget, exceeds-budget, zero-budget, snake_case tool_calls, camelCase toolCalls, chronological ordering after cutoff.

Out of scope

Tokenization accuracy. Still using length/4 heuristic — accurate Anthropic countTokens integration is a separate PR.
Prompt cache breakpoints. Next PR (perf(ai): system-block array + cache_control); once cached, the Sonnet/Opus budget values in the table here can grow with margin headroom.

Net diff

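+379 / −64; most of the size is unit test coverage. Net runtime behavior change: conversations now keep more history before cutoff. At the 1x plan and source multipliers the budget is 1.5x–6x the old 8K ceiling depending on model (12K–48K), and the plan and source multipliers scale it further. Cost stays bounded by the multiplier table.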
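
The 1.5x–6x figure corresponds to the model table at 1x plan and source multipliers. A quick sketch of the arithmetic, reading "K" as 1,000 tokens and using hypothetical names, shows the spread once the multipliers are applied:

```typescript
// Hypothetical arithmetic check; constant and function names are illustrative.
const OLD_CEILING = 8000; // previous hard-coded budget ("8K" read as 8,000)

// Smallest and largest base budgets in the model table.
const MIN_BASE = 12000; // Haiku 4.5
const MAX_BASE = 48000; // Sonnet 4.6 / Opus 4.7

// New budget expressed as a multiple of the old ceiling.
function ratioToOld(base: number, plan: number, source: number): number {
  return (base * plan * source) / OLD_CEILING;
}

// pro or community plan, studio or api source
const unitLow = ratioToOld(MIN_BASE, 1, 1);
const unitHigh = ratioToOld(MAX_BASE, 1, 1);

// starter plan, studio or api source, smallest model
const smallestPaid = ratioToOld(MIN_BASE, 0.75, 1);

// enterprise plan, byoa source, largest model
const largest = ratioToOld(MAX_BASE, 1.25, 1.5);
```

Here unitLow and unitHigh are 1.5 and 6, smallestPaid is 1.125, and largest is 11.25, so the full matrix (excluding the free backstop at 0) spans roughly 1.1x to 11x the old ceiling.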

…udgets

`chat.post.ts` and `ee/enterprise/conversation-api.ts` carried two copies of the same sliding-window history builder: a hard-coded 8K token ceiling, a magic `loadConversationMessages(..., 50)` row cap, and divergent `tool_calls`/`toolCalls` casing handling. Both also ignored the fact that Claude 4 model windows are now 200K; the 8K ceiling was left over from earlier model defaults and was undersized for any non-toy conversation.

This PR collapses both into `server/utils/conversation-history.ts`:

- `selectHistoryBudget({ plan, model, source })` returns the per-call token ceiling and a derived `rowLimit` for DB pagination. Budget is decomposed along three axes:

    Model (capability and pricing tier):
      Haiku 4.5            12K
      Sonnet 4 / 4.5 / 4.6 32K / 40K / 48K
      Opus 4 / 4.1 / 4.7   32K / 32K / 48K
      fallback             16K

    Plan (Contentrain's per-message margin posture):
      free          0      (defensive backstop, should never reach chat)
      starter       0.75x
      pro           1x
      enterprise    1.25x
      community     1x

    Source (who pays for the input tokens):
      studio / api  1x
      byoa          1.5x   (user pays Anthropic directly)

  Computed budget = base × plan × source. Sonnet/Opus values are intentionally conservative; once prompt caching lands (cache reads cost ~10% of base input) the model table can grow safely.

- `buildPromptMessages({ history, newUserMessage, budget })` walks rows newest→oldest under the token cap, then takes the kept slice in chronological order and appends the current user message. Handles both `tool_calls` (snake_case from DB) and `toolCalls` (legacy EE wrapper) for the same content.

Studio chat (`chat.post.ts`) now picks model before history (model drives budget), then calls the two helpers in place of the IIFE + push loop. Conversation API mirrors the change and drops its duplicate `loadConversationMessages` wrapper: that helper is renamed `loadConversationHistoryForResponse` and kept only for the `/history.get` route's JSON response shape, which still needs the enriched `{ usage, createdAt }` projection.

Integration test mocks were missing the `~~/server/utils/conversation-history` entry; added to both `chat-route` and `overage-soft-cap` integration files. One assertion in `chat-route` previously hard-coded the magic `50` row limit; it is relaxed to `expect.any(Number)` since rowLimit is now derived (covered by the new unit tests).

Net: −18 lines, +228 (mostly tests). No DB or schema changes; no runtime behavior change beyond "more history is retained in long conversations" along the budget table.
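
The sliding-window walk described for `buildPromptMessages` can be sketched roughly as below. This is an illustration under stated assumptions, not the merged implementation: the row and message shapes, the name `sketchBuildPromptMessages`, the oldest-first input order, and the choice to count only `content` against the budget (and not the new user message) are all hypothetical. The token estimate uses the length/4 heuristic the PR says is still in place; rounding up is an assumption.

```typescript
// Hypothetical shapes; the real types in the repo may differ.
interface HistoryRow {
  role: "user" | "assistant";
  content: string;
  tool_calls?: unknown; // snake_case, as stored in the DB
  toolCalls?: unknown; // camelCase, from the legacy EE wrapper
}

interface PromptMessage {
  role: "user" | "assistant";
  content: string;
  toolCalls?: unknown;
}

// length/4 heuristic; Math.ceil is an assumption.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function sketchBuildPromptMessages(
  history: HistoryRow[], // assumed to be ordered oldest first
  newUserMessage: string,
  maxTokens: number,
): PromptMessage[] {
  const kept: PromptMessage[] = [];
  let used = 0;

  // Walk newest to oldest and stop at the first row that would exceed the cap.
  for (const row of [...history].reverse()) {
    const cost = estimateTokens(row.content); // simplification: content only
    if (used + cost > maxTokens) break;
    used += cost;

    // Accept either casing and normalize to one field.
    const toolCalls = row.tool_calls ?? row.toolCalls;
    kept.push({
      role: row.role,
      content: row.content,
      ...(toolCalls != null ? { toolCalls } : {}),
    });
  }

  // Restore chronological order, then append the current user message.
  kept.reverse();
  kept.push({ role: "user", content: newUserMessage });
  return kept;
}
```

With a zero budget (the free-plan backstop) the loop keeps nothing and only the new user message is returned, which lines up with the zero-budget case listed in the test plan.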

ABB65 merged commit e61711e into main May 15, 2026
1 check passed

ABB65 deleted the refactor/chat-history-builder branch May 15, 2026 19:46

ABB65 mentioned this pull request May 15, 2026

perf(ai): anthropic prompt cache + cache token accounting #53

Merged

3 tasks

ABB65 merged 1 commit into main from refactor/chat-history-builder

ABB65 commented May 15, 2026
