Switch to Sonnet 4.6 and deduplicate prompt#10
Merged
mattgodbolt merged 2 commits intomainfrom Feb 21, 2026
Merged
Conversation
- Upgrade model from Haiku 4.5 to Sonnet 4.6 for improved accuracy (correctly analyses complex optimisations like partial tail-recursion elimination where Haiku made factual errors about complexity) - Deduplicate system prompt: 13KB → 5KB (63% reduction) — same guidance said once instead of 3-4 times across sections - Remove assistant prefill (unsupported by Sonnet 4.6), with conditional logic so models that support it can still use it via prompt.yaml - Add Sonnet 4.6, Sonnet 4.5, Opus 4.5, Opus 4.6 to model cost table - Update tests for optional assistant prefill Cost is ~3.4x higher per request vs Haiku ($0.01-0.02 vs $0.003-0.005) but accuracy on complex cases is meaningfully better. 🤖 Generated by LLM (Claude, via OpenClaw)
There was a problem hiding this comment.
Pull request overview
This PR upgrades the AI model from Claude Haiku 4.5 to Claude Sonnet 4.6 for improved accuracy on complex reasoning tasks, and significantly reduces prompt redundancy from 13KB to 5KB. The changes include updating the model configuration, deduplicating repetitive guidance across prompt sections, making assistant prefill conditional (omitted when empty), and adding cost information for newer Claude model versions (Sonnet/Opus 4.5 and 4.6).
Changes:
- Model upgrade from
claude-haiku-4-5toclaude-sonnet-4-6for better accuracy on complex compiler optimization explanations - Prompt deduplication consolidating repetitive guidance into clear, concise "Core principles" (63% size reduction)
- Conditional assistant prefill logic that only includes the assistant message when prefill text is non-empty
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| app/prompt.yaml | Updates model to Sonnet 4.6, consolidates redundant prompt guidance, sets assistant_prefill to empty string |
| app/prompt.py | Adds conditional logic to only include assistant prefill message when non-empty |
| app/test_explain.py | Updates test to accept optional assistant prefill (1 or 2 messages instead of exactly 2) |
| app/model_costs.py | Adds cost entries for Sonnet 4.5, 4.6, Opus 4.5, and 4.6 model families |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
🤖 Generated by LLM (Claude, via OpenClaw)
This was referenced Feb 21, 2026
Closed
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Upgrades the model from Haiku 4.5 to Sonnet 4.6 and cleans up the prompt.
Changes
Why Sonnet?
Tested across 5 cases (simple/complex code, beginner/experienced audience, optimised/unoptimised). Key finding: Haiku makes factual errors on complex reasoning that Sonnet gets right.
Example: On a fibonacci function where GCC partially eliminates tail recursion, Haiku incorrectly claims the complexity reduces from O(2^n) to O(n). Sonnet correctly identifies it remains O(2^n) with only half the call depth eliminated:
For a tool teaching people about compilers, this kind of accuracy matters.
Cost impact
Still very affordable — roughly 1-2 cents per explanation.
Prompt deduplication
The old system prompt said things like "trace through inputs and outputs step-by-step" and "verify whether
leaperforms address calculation vs memory access" 3-4 times in different sections. The new version says each thing once. This saves ~1,400 input tokens per request (further reducing the cost gap).Testing
(I'm Molty, an AI assistant acting on behalf of @mattgodbolt)