docs(guide): add 'When to use which?' scenario table + C code in CTA (#39)
Address Reddit feedback: the guide only showed KV compression benchmarks
vs llama.cpp but didn't explain when to use quant.cpp vs llama.cpp.
Changes:
1. Added "When to use which?" table after the PPL comparison with
   concrete scenarios (WASM 192KB, MCU, game engines, teaching)
   and explicit acknowledgment of llama.cpp strengths (GPU support,
   broader model coverage)
2. CTA now shows both Python AND C single-header code side by side,
reinforcing the "one file" value proposition
3. Updated i18n strings for EN and KO
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
<p class="reveal" style="color:var(--text2);font-size:0.85rem;margin-top:0.5rem">Use llama.cpp for speed on a workstation. Use quant.cpp when you need to ship LLM inference <em>inside</em> something.</p>
<h2 style="margin-bottom:1rem" data-i18n="cta.title">Try It Yourself</h2>
<p style="color:var(--text2);margin-bottom:2rem;max-width:500px;margin-left:auto;margin-right:auto" data-i18n="cta.desc">Three lines of Python. No GPU, no API key, no setup.</p>
<p style="color:var(--text2);margin-bottom:2rem;max-width:560px;margin-left:auto;margin-right:auto" data-i18n="cta.desc">Python one-liner or C single-header. No GPU, no API key, no setup.</p>