
Commit a6475d0

unamedkr and claude authored
i18n: complete EN/KO coverage for Verification + Beyond RAG + footer (#40)
Previously missing data-i18n attributes on:

- Beyond RAG blockquote (rag.quote)
- Beyond RAG "It didn't..." paragraph (rag.para2)
- Verification section (entire): section label, title, intro, all 3 bar labels, hallucination problem heading/description/examples/summary, 3 info cards, CTA button (14 new keys)
- Footer (footer.text)

All 185 HTML keys now match 185 EN keys and 185 KO keys exactly. Language toggle (EN ↔ KO) swaps every visible string.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
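The "185 EN keys match 185 KO keys" claim implies a key-coverage check across the two dictionaries. A minimal sketch of such a check (the function and sample dictionaries here are illustrative, not from the repo):

```javascript
// Hedged sketch: report keys present in one locale dictionary but
// missing from another, as the commit message's key counts imply.
function missingKeys(reference, candidate) {
  // Keys defined in `reference` that `candidate` does not define.
  return Object.keys(reference).filter((key) => !(key in candidate));
}

const en = { "verify.label": "Measured Result", "footer.text": "quant.cpp" };
const ko = { "verify.label": "측정 결과" }; // footer.text still untranslated

console.log(missingKeys(en, ko)); // → [ 'footer.text' ]
console.log(missingKeys(ko, en)); // → []
```

Running this in both directions over the real `en`/`ko` objects would catch exactly the kind of gap this commit closes.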
1 parent e4e0141 commit a6475d0

File tree

1 file changed (+70, −24 lines)


site/index.html

Lines changed: 70 additions & 24 deletions
@@ -522,7 +522,7 @@ <h3 class="reveal" data-i18n="ch5.context.title">Context Length on 8GB Mac</h3>
 <div class="section-label" data-i18n="rag.label">Movement</div>
 <h2 class="reveal" data-i18n="rag.title">Beyond RAG</h2>
 
-<blockquote class="reveal" style="border-left:3px solid var(--accent);padding:1rem 1.5rem;margin:1.5rem 0;background:rgba(108,92,231,.05);font-size:1.1rem;line-height:1.6;color:var(--text)">
+<blockquote class="reveal" style="border-left:3px solid var(--accent);padding:1rem 1.5rem;margin:1.5rem 0;background:rgba(108,92,231,.05);font-size:1.1rem;line-height:1.6;color:var(--text)" data-i18n-html="rag.quote">
 <strong>Chunking RAG was a workaround for small context windows.</strong><br>
 The workaround became dogma.<br>
 Now context windows are big enough that we don't need the workaround.<br>
@@ -531,7 +531,7 @@ <h2 class="reveal" data-i18n="rag.title">Beyond RAG</h2>
 
 <p class="reveal" data-i18n-html="rag.intro">Traditional RAG splits documents into 512-token chunks, embeds them in a vector database, and retrieves fragments. This was a reasonable engineering compromise when LLMs had 2K context windows. <strong>Now they have 128K. The compromise should have started disappearing.</strong></p>
 
-<p class="reveal">It didn't. The infrastructure became dogma. Vector DBs became billion-dollar companies. "RAG pipeline" became something every AI engineer was expected to build, regardless of whether their use case actually needed one.</p>
+<p class="reveal" data-i18n="rag.para2">It didn't. The infrastructure became dogma. Vector DBs became billion-dollar companies. "RAG pipeline" became something every AI engineer was expected to build, regardless of whether their use case actually needed one.</p>
 
 <div class="viz reveal">
 <div class="viz-title" data-i18n="rag.viz.title">Chunk-Level RAG vs Document-Level RAG</div>
@@ -600,34 +600,34 @@ <h4 data-i18n="rag.card3.t">Read Once, Query Forever</h4>
 <!-- ===== Verification Box ===== -->
 <section id="verification">
 <div class="container">
-<div class="section-label">Measured Result</div>
-<h2 class="reveal">7/7 vs 0/7 — Verified</h2>
-<p class="reveal">We compared three approaches on a synthetic 5-section document with 7 questions (4 single-hop, 3 multi-hop). Tested with <strong>Llama 3.2 3B Q8_0</strong>:</p>
+<div class="section-label" data-i18n="verify.label">Measured Result</div>
+<h2 class="reveal" data-i18n="verify.title">7/7 vs 0/7 — Verified</h2>
+<p class="reveal" data-i18n-html="verify.intro">We compared three approaches on a synthetic 5-section document with 7 questions (4 single-hop, 3 multi-hop). Tested with <strong>Llama 3.2 3B Q8_0</strong>:</p>
 
 <div class="viz reveal">
-<div class="viz-title">Fact Extraction Accuracy</div>
+<div class="viz-title" data-i18n="verify.viz.title">Fact Extraction Accuracy</div>
 
 <div class="mem-bar-container">
-<div class="mem-bar-label"><span>Chunk-RAG (wrong section retrieved)</span><span style="color:var(--red)">0/7 — all hallucinated</span></div>
+<div class="mem-bar-label"><span data-i18n="verify.bar1.label">Chunk-RAG (wrong section retrieved)</span><span style="color:var(--red)" data-i18n="verify.bar1.val">0/7 — all hallucinated</span></div>
 <div class="mem-bar"><div class="mem-bar-fill bar-fp32" style="--w:0%">0%</div></div>
 </div>
 
 <div class="mem-bar-container">
-<div class="mem-bar-label"><span>Full Document (FP32 KV)</span><span style="color:var(--green)">7/7</span></div>
+<div class="mem-bar-label"><span data-i18n="verify.bar2.label">Full Document (FP32 KV)</span><span style="color:var(--green)">7/7</span></div>
 <div class="mem-bar"><div class="mem-bar-fill bar-aggr" style="--w:100%">100%</div></div>
 </div>
 
 <div class="mem-bar-container">
-<div class="mem-bar-label"><span><strong>Full Document (6.4x KV compression)</strong></span><span style="color:var(--green)"><strong>7/7</strong></span></div>
-<div class="mem-bar"><div class="mem-bar-fill bar-aggr" style="--w:100%">100% — same as FP32</div></div>
+<div class="mem-bar-label"><span data-i18n-html="verify.bar3.label"><strong>Full Document (6.4x KV compression)</strong></span><span style="color:var(--green)"><strong>7/7</strong></span></div>
+<div class="mem-bar"><div class="mem-bar-fill bar-aggr" style="--w:100%" data-i18n="verify.bar3.inner">100% — same as FP32</div></div>
 </div>
 </div>
 
-<h3 class="reveal">The Hallucination Problem</h3>
-<p class="reveal">When chunk-RAG retrieved the wrong section, the model didn't say "I don't know" — it generated <strong>plausible-sounding lies</strong>:</p>
+<h3 class="reveal" data-i18n="verify.halluc.title">The Hallucination Problem</h3>
+<p class="reveal" data-i18n-html="verify.halluc.desc">When chunk-RAG retrieved the wrong section, the model didn't say "I don't know" — it generated <strong>plausible-sounding lies</strong>:</p>
 
 <div class="viz reveal">
-<div style="font-family:monospace;font-size:.85rem;line-height:2;color:var(--text2)">
+<div style="font-family:monospace;font-size:.85rem;line-height:2;color:var(--text2)" data-i18n-html="verify.halluc.examples">
 <div><span style="color:var(--accent2)">Q:</span> Who is the CTO?</div>
 <div><span style="color:var(--red)">Chunk-RAG:</span> "John Smith" &emsp; <span style="color:var(--text3)">→ truth: Maria Santos</span></div>
 <br>
@@ -639,28 +639,28 @@ <h3 class="reveal">The Hallucination Problem</h3>
 </div>
 </div>
 
-<p class="reveal" style="color:var(--text);font-weight:500;font-size:1.1rem">This is the fundamental danger of chunk-RAG: <strong>retrieval failure becomes silent hallucination</strong>. KV compression makes it possible to load the entire document into context, eliminating this failure mode on consumer hardware.</p>
+<p class="reveal" style="color:var(--text);font-weight:500;font-size:1.1rem" data-i18n-html="verify.halluc.summary">This is the fundamental danger of chunk-RAG: <strong>retrieval failure becomes silent hallucination</strong>. KV compression makes it possible to load the entire document into context, eliminating this failure mode on consumer hardware.</p>
 
 <div class="card-grid stagger" style="margin-top:2rem">
 <div class="info-card">
 <div class="card-icon">&#x2705;</div>
-<h4>KV Compression = Zero Quality Loss</h4>
-<p>FP32 7/7 = 6.4x compressed 7/7. The 6.4x memory savings cost nothing in fact extraction quality.</p>
+<h4 data-i18n="verify.card1.t">KV Compression = Zero Quality Loss</h4>
+<p data-i18n="verify.card1.d">FP32 7/7 = 6.4x compressed 7/7. The 6.4x memory savings cost nothing in fact extraction quality.</p>
 </div>
 <div class="info-card">
 <div class="card-icon">&#x1F517;</div>
-<h4>Multi-Hop Reasoning Works</h4>
-<p>"What risk affects the growth region?" requires linking Section 3 (Asia growth) with Section 5 (Asia currency risk). Full-doc: ✓. Chunk-RAG: impossible.</p>
+<h4 data-i18n="verify.card2.t">Multi-Hop Reasoning Works</h4>
+<p data-i18n="verify.card2.d">"What risk affects the growth region?" requires linking Section 3 (Asia growth) with Section 5 (Asia currency risk). Full-doc: ✓. Chunk-RAG: impossible.</p>
 </div>
 <div class="info-card">
 <div class="card-icon">&#x1F4BB;</div>
-<h4>Runs on 16GB Mac</h4>
-<p>Llama 3.2 3B Q8_0, no GPU. 6.4x KV compression makes this practical on consumer hardware.</p>
+<h4 data-i18n="verify.card3.t">Runs on 16GB Mac</h4>
+<p data-i18n="verify.card3.d">Llama 3.2 3B Q8_0, no GPU. 6.4x KV compression makes this practical on consumer hardware.</p>
 </div>
 </div>
 
 <div style="text-align:center;margin-top:3rem">
-<a href="https://github.com/quantumaikr/quant.cpp/blob/main/docs/beyond-rag-manifesto.md" class="cta-btn cta-primary" style="font-size:.95rem">Read the Beyond RAG Manifesto &rarr;</a>
+<a href="https://github.com/quantumaikr/quant.cpp/blob/main/docs/beyond-rag-manifesto.md" class="cta-btn cta-primary" style="font-size:.95rem" data-i18n-html="verify.cta">Read the Beyond RAG Manifesto &rarr;</a>
 </div>
 </div>
 </section>
@@ -743,7 +743,7 @@ <h2 style="margin-bottom:1rem" data-i18n="cta.title">Try It Yourself</h2>
 <!-- ===== Footer ===== -->
 <footer>
 <div class="container">
-<p>quant.cpp &middot; Apache 2.0 &middot; <a href="https://github.com/quantumaikr/quant.cpp">GitHub</a> &middot; Made by <a href="https://github.com/quantumaikr">quantumaikr</a></p>
+<p data-i18n-html="footer.text">quant.cpp &middot; Apache 2.0 &middot; <a href="https://github.com/quantumaikr/quant.cpp">GitHub</a> &middot; Made by <a href="https://github.com/quantumaikr">quantumaikr</a></p>
 </div>
 </footer>
 
@@ -913,7 +913,30 @@ <h2 style="margin-bottom:1rem" data-i18n="cta.title">Try It Yourself</h2>
 "rag.card2.d": "Can't fit 100K documents in context. Prefill is slow. RAG narrows the search to 2-3 relevant documents that DO fit.",
 "rag.card3.t": "Read Once, Query Forever",
 "rag.card3.d": "Pre-process documents into .kv files (GPU, once). Load instantly on any laptop (0.5s). Query offline, unlimited, private.",
-"rag.pipeline.title": "Pre-computed KV Library Pattern"
+"rag.pipeline.title": "Pre-computed KV Library Pattern",
+"rag.quote": "<strong>Chunking RAG was a workaround for small context windows.</strong><br>The workaround became dogma.<br>Now context windows are big enough that we don't need the workaround.<br><em style=\"color:var(--accent2)\">— Welcome to Beyond RAG.</em>",
+"rag.para2": "It didn't. The infrastructure became dogma. Vector DBs became billion-dollar companies. \"RAG pipeline\" became something every AI engineer was expected to build, regardless of whether their use case actually needed one.",
+"verify.label": "Measured Result",
+"verify.title": "7/7 vs 0/7 — Verified",
+"verify.intro": "We compared three approaches on a synthetic 5-section document with 7 questions (4 single-hop, 3 multi-hop). Tested with <strong>Llama 3.2 3B Q8_0</strong>:",
+"verify.viz.title": "Fact Extraction Accuracy",
+"verify.bar1.label": "Chunk-RAG (wrong section retrieved)",
+"verify.bar1.val": "0/7 — all hallucinated",
+"verify.bar2.label": "Full Document (FP32 KV)",
+"verify.bar3.label": "<strong>Full Document (6.4x KV compression)</strong>",
+"verify.bar3.inner": "100% — same as FP32",
+"verify.halluc.title": "The Hallucination Problem",
+"verify.halluc.desc": "When chunk-RAG retrieved the wrong section, the model didn't say \"I don't know\" — it generated <strong>plausible-sounding lies</strong>:",
+"verify.halluc.examples": "<div><span style=\"color:var(--accent2)\">Q:</span> Who is the CTO?</div><div><span style=\"color:var(--red)\">Chunk-RAG:</span> \"John Smith\" &emsp; <span style=\"color:var(--text3)\">→ truth: Maria Santos</span></div><br><div><span style=\"color:var(--accent2)\">Q:</span> What is the revenue?</div><div><span style=\"color:var(--red)\">Chunk-RAG:</span> \"$1,000,000\" &emsp; <span style=\"color:var(--text3)\">→ truth: 847 million</span></div><br><div><span style=\"color:var(--accent2)\">Q:</span> What percent is R&D?</div><div><span style=\"color:var(--red)\">Chunk-RAG:</span> \"15% of net income\" &emsp; <span style=\"color:var(--text3)\">→ truth: 14% of revenue</span></div>",
+"verify.halluc.summary": "This is the fundamental danger of chunk-RAG: <strong>retrieval failure becomes silent hallucination</strong>. KV compression makes it possible to load the entire document into context, eliminating this failure mode on consumer hardware.",
+"verify.card1.t": "KV Compression = Zero Quality Loss",
+"verify.card1.d": "FP32 7/7 = 6.4x compressed 7/7. The 6.4x memory savings cost nothing in fact extraction quality.",
+"verify.card2.t": "Multi-Hop Reasoning Works",
+"verify.card2.d": "\"What risk affects the growth region?\" requires linking Section 3 (Asia growth) with Section 5 (Asia currency risk). Full-doc: ✓. Chunk-RAG: impossible.",
+"verify.card3.t": "Runs on 16GB Mac",
+"verify.card3.d": "Llama 3.2 3B Q8_0, no GPU. 6.4x KV compression makes this practical on consumer hardware.",
+"verify.cta": "Read the Beyond RAG Manifesto &rarr;",
+"footer.text": "quant.cpp &middot; Apache 2.0 &middot; <a href=\"https://github.com/quantumaikr/quant.cpp\">GitHub</a> &middot; Made by <a href=\"https://github.com/quantumaikr\">quantumaikr</a>"
 },
 ko: {
 "nav.problem": "\uBB38\uC81C\uC810",
@@ -1077,7 +1100,30 @@ <h2 style="margin-bottom:1rem" data-i18n="cta.title">Try It Yourself</h2>
 "rag.card2.d": "100K 문서를 한 번에 컨텍스트에 넣을 수 없습니다. Prefill이 느립니다. RAG는 검색을 2-3개 관련 문서로 좁혀줍니다.",
 "rag.card3.t": "한 번 읽고, 영원히 질문",
 "rag.card3.d": "문서를 .kv 파일로 사전 처리 (GPU, 1회). 어떤 노트북에서든 즉시 로드 (0.5초). 오프라인, 무제한, 프라이빗 질문.",
-"rag.pipeline.title": "사전 계산된 KV 라이브러리 패턴"
+"rag.pipeline.title": "사전 계산된 KV 라이브러리 패턴",
+"rag.quote": "<strong>청킹 RAG는 작은 컨텍스트 윈도우에 대한 임시방편이었습니다.</strong><br>그 임시방편이 정설이 됐습니다.<br>이제 컨텍스트 윈도우가 충분히 커져서 임시방편이 필요 없습니다.<br><em style=\"color:var(--accent2)\">— Beyond RAG에 오신 것을 환영합니다.</em>",
+"rag.para2": "사라지지 않았습니다. 인프라가 정설이 됐습니다. 벡터 DB는 수십억 달러 기업이 됐습니다. \"RAG 파이프라인\"은 실제 용도가 필요하든 아니든 모든 AI 엔지니어가 구축해야 할 무언가가 됐습니다.",
+"verify.label": "측정 결과",
+"verify.title": "7/7 vs 0/7 — 검증됨",
+"verify.intro": "5개 섹션의 합성 문서와 7개 질문(4개 단일-hop, 3개 multi-hop)으로 세 가지 접근법을 비교했습니다. <strong>Llama 3.2 3B Q8_0</strong>으로 테스트:",
+"verify.viz.title": "사실 추출 정확도",
+"verify.bar1.label": "Chunk-RAG (잘못된 섹션 검색)",
+"verify.bar1.val": "0/7 — 전부 환각",
+"verify.bar2.label": "전체 문서 (FP32 KV)",
+"verify.bar3.label": "<strong>전체 문서 (6.4배 KV 압축)</strong>",
+"verify.bar3.inner": "100% — FP32와 동일",
+"verify.halluc.title": "환각 문제",
+"verify.halluc.desc": "Chunk-RAG가 잘못된 섹션을 검색했을 때, 모델은 \"모르겠습니다\"라고 말하지 않고 <strong>그럴듯한 거짓말</strong>을 생성했습니다:",
+"verify.halluc.examples": "<div><span style=\"color:var(--accent2)\">Q:</span> CTO는 누구인가요?</div><div><span style=\"color:var(--red)\">Chunk-RAG:</span> \"John Smith\" &emsp; <span style=\"color:var(--text3)\">→ 정답: Maria Santos</span></div><br><div><span style=\"color:var(--accent2)\">Q:</span> 매출은 얼마인가요?</div><div><span style=\"color:var(--red)\">Chunk-RAG:</span> \"$1,000,000\" &emsp; <span style=\"color:var(--text3)\">→ 정답: 8억 4,700만</span></div><br><div><span style=\"color:var(--accent2)\">Q:</span> R&D는 몇 퍼센트인가요?</div><div><span style=\"color:var(--red)\">Chunk-RAG:</span> \"순이익의 15%\" &emsp; <span style=\"color:var(--text3)\">→ 정답: 매출의 14%</span></div>",
+"verify.halluc.summary": "이것이 chunk-RAG의 근본적 위험입니다: <strong>검색 실패가 조용한 환각이 됩니다</strong>. KV 압축은 전체 문서를 컨텍스트에 로드할 수 있게 하여, 소비자 하드웨어에서 이 실패 모드를 제거합니다.",
+"verify.card1.t": "KV 압축 = 품질 손실 0",
+"verify.card1.d": "FP32 7/7 = 6.4배 압축 7/7. 6.4배 메모리 절감이 사실 추출 품질에 아무런 비용도 들이지 않습니다.",
+"verify.card2.t": "Multi-Hop 추론 작동",
+"verify.card2.d": "\"성장 지역에 영향을 미치는 위험은?\"은 섹션 3(아시아 성장)과 섹션 5(아시아 통화 위험)를 연결해야 합니다. 전체 문서: ✓. Chunk-RAG: 불가능.",
+"verify.card3.t": "16GB Mac에서 실행",
+"verify.card3.d": "Llama 3.2 3B Q8_0, GPU 없음. 6.4배 KV 압축으로 소비자 하드웨어에서 실용적이 됩니다.",
+"verify.cta": "Beyond RAG 선언문 읽기 &rarr;",
+"footer.text": "quant.cpp &middot; Apache 2.0 &middot; <a href=\"https://github.com/quantumaikr/quant.cpp\">GitHub</a> &middot; 제작 <a href=\"https://github.com/quantumaikr\">quantumaikr</a>"
 }
 };
 
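The diff uses two attribute flavors: `data-i18n` for plain-text swaps and `data-i18n-html` for values that contain markup (e.g. `rag.quote` with `<strong>`/`<br>`). A minimal sketch of what the language toggle has to do with them; the element objects below are plain stand-ins for DOM nodes (the real page would use `querySelectorAll`, `textContent`, and `innerHTML`), and the function name is illustrative, not the site's actual toggle code:

```javascript
// Hedged sketch: swap strings on elements tagged with i18n keys.
// data-i18n -> textContent (plain text), data-i18n-html -> innerHTML
// (the translation value may contain markup).
function applyLanguage(elements, dict) {
  for (const el of elements) {
    if (el.i18nKey && el.i18nKey in dict) {
      el.textContent = dict[el.i18nKey]; // plain-text swap
    } else if (el.i18nHtmlKey && el.i18nHtmlKey in dict) {
      el.innerHTML = dict[el.i18nHtmlKey]; // markup-preserving swap
    }
  }
}

const ko = {
  "verify.title": "7/7 vs 0/7 — 검증됨",
  "rag.quote": "<strong>청킹 RAG는 임시방편이었습니다.</strong>",
};
const els = [
  { i18nKey: "verify.title", textContent: "7/7 vs 0/7 — Verified" },
  { i18nHtmlKey: "rag.quote", innerHTML: "<strong>Chunking RAG…</strong>" },
];
applyLanguage(els, ko);
console.log(els[0].textContent); // → 7/7 vs 0/7 — 검증됨
```

The split matters for safety: only keys explicitly marked `data-i18n-html` are ever injected as HTML, so plain-text keys can never introduce markup.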