Commit 53b3323

unamedkr and claude committed
fix(deltanet): restore L2 norm (removal causes output collapse) + decay fix kept
Karpathy loop results on Qwen3.5-4B short prompts:
  Loop 1: decay formula aligned with llama.cpp → still fails (0/5)
  Loop 2: L2 norm removed → WORSE (doc QA also breaks)
          → L2 norm restored (REQUIRED for Qwen3.5)

The decay formula fix (sk before decay) is kept as it's mathematically correct per llama.cpp reference.

Remaining suspect: Q scaling timing or state shape/layout mismatch. The reference analysis found llama.cpp uses [S_v, S_v] square state while quant.cpp uses [dk, dv] rectangular; this needs investigation.

Refs #95

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent d26ca5e commit 53b3323

1 file changed

+2 -0 lines changed

quant.h

Lines changed: 2 additions & 0 deletions
@@ -13890,6 +13890,8 @@ static void deltanet_forward(tq_model_t* model, tq_state_t* s, int l) {
     float* K_all = s->delta_qkv + dn_kv * dk;
     float* V_all = s->delta_qkv + 2 * dn_kv * dk;
 
+    /* L2 normalization of Q/K: REQUIRED for Qwen3.5-4B.
+     * Removing this causes complete output collapse. */
     for (int h = 0; h < dn_kv; h++) {
         l2_normalize(Q_all + h * dk, dk);
         l2_normalize(K_all + h * dk, dk);
