CHANGELOG and ROADMAP updates documenting the Variant G outlier
handling work. Headline table shows full Pareto landscape across
turbo_kv_3b/4b/3bo/5b/4bo with their bytes/block, compression,
PPL on Llama 3.2 3B, and production/research status.
Closes the per-channel outlier handling item from issue #15.
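
For readers unfamiliar with the term, the "Pareto landscape" mentioned above compares each type on two lower-is-better axes: storage cost (bytes/block) and quality loss (PPL gap). A minimal sketch of the dominance check follows. The candidate names and numbers are hypothetical placeholders, not measurements from this project, and `Candidate`, `dominates`, and `pareto_front` are illustrative names, not part of the codebase.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    bytes_per_block: float  # lower is better
    ppl_gap_pct: float      # PPL increase vs. baseline in percent, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = (a.bytes_per_block <= b.bytes_per_block
                and a.ppl_gap_pct <= b.ppl_gap_pct)
    strictly_better = (a.bytes_per_block < b.bytes_per_block
                       or a.ppl_gap_pct < b.ppl_gap_pct)
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Hypothetical values, chosen only to show one dominated point.
candidates = [
    Candidate("type_a", 60.0, 6.0),  # smallest, lowest quality
    Candidate("type_b", 80.0, 2.0),  # larger, best quality
    Candidate("type_c", 90.0, 2.5),  # larger AND worse than type_b
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy data `type_c` is "slightly bigger and slightly worse" than `type_b`, so it drops off the front, which is the sense in which the changelog calls a type dominated.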
CHANGELOG.md: 27 additions, 0 deletions
@@ -1,5 +1,32 @@
 # Changelog
 
+## [0.6.2] — 2026-04-08
+
+### Highlights
+
+- **🆕 `turbo_kv_4bo` / `turbo_kv_3bo`** — Per-block outlier handling research types. Each block stores the K=8 channels with the largest |rotated[i]| as exact FP16 values that overwrite the codebook reconstruction at dequant time. This is a simpler local form of the per-channel outlier handling described in the Google TurboQuant paper.
+- **Karpathy-loop validation**: per-channel outliers cut the PPL gap **by more than half** on Llama 3.2 3B (4b: +5.3% → 4bo: +2.2%). Effect is model-dependent — see notes below.
+- **Issue #15 progress**: closes the per-channel outlier handling exploration item. 5b remains the recommended quality option; 4bo/3bo ship as experimental.
+Per-channel outlier handling is **data-dependent**:
+- On Llama 3.2 3B (head_dim=128, heavier tails), `3bo` Pareto-improves over `4b`
+- On SmolLM2 135M (smaller dimensions), `3bo` regresses past `4b` because the 3-bit base is too coarse
+- `4bo` is dominated by `5b` on both models — slightly bigger and slightly worse
+
+Until per-model auto-selection is implemented, the Pareto-optimal recommendations remain `turbo_kv_4b` (default) and `turbo_kv_5b` (quality). The outlier types are exposed for researchers and benchmarking.
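
The per-block outlier scheme described in the Highlights entry (keep the K largest-magnitude rotated channels as exact FP16 and let them overwrite the codebook reconstruction at dequant time) can be sketched roughly as below. This is an illustration only and not the project's implementation: the function names, the tiny 2-bit `codebook`, and the list-based block layout are all assumptions, and a real kernel would also pack the codes and store per-block scales.

```python
import struct
from typing import List, Tuple

K_OUTLIERS = 8  # K=8 as stated in the changelog entry


def to_fp16(x: float) -> float:
    """Round to the nearest IEEE 754 half-precision value.

    struct raises OverflowError for magnitudes beyond the FP16 range.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def nearest_code(x: float, codebook: List[float]) -> int:
    """Index of the codebook entry closest to x."""
    return min(range(len(codebook)), key=lambda i: abs(codebook[i] - x))


def quantize_block(rotated: List[float], codebook: List[float],
                   k: int = K_OUTLIERS) -> Tuple[List[int], List[Tuple[int, float]]]:
    """Return codebook indices for every channel plus k (index, fp16) outliers."""
    by_magnitude = sorted(range(len(rotated)),
                          key=lambda i: abs(rotated[i]), reverse=True)
    outlier_idx = sorted(by_magnitude[:k])
    outliers = [(i, to_fp16(rotated[i])) for i in outlier_idx]
    codes = [nearest_code(x, codebook) for x in rotated]
    return codes, outliers


def dequantize_block(codes: List[int], outliers: List[Tuple[int, float]],
                     codebook: List[float]) -> List[float]:
    """Codebook reconstruction, then overwrite the outlier channels."""
    out = [codebook[c] for c in codes]
    for i, value in outliers:
        out[i] = value  # exact FP16 value replaces the coarse reconstruction
    return out


# Hypothetical 2-bit codebook and an 8-channel block with two large values.
codebook = [-1.5, -0.5, 0.5, 1.5]
block = [0.1, -0.4, 6.0, 0.7, -5.0, 0.2, 1.4, -0.9]
codes, outliers = quantize_block(block, codebook, k=2)
restored = dequantize_block(codes, outliers, codebook)
print(outliers)
print(restored)
```

Without the overwrite step, channels 2 and 4 would be clamped to the extreme codebook entries (1.5 and -1.5), which is the kind of heavy-tail error the outlier types are meant to remove. The cost is the extra storage for K index/value pairs per block.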