fix: preserve mtp.* tensors for dense Qwen3.5/Qwen3.6 models #2869

Open

erm14254 wants to merge 1 commit into ModelCloud:main from erm14254:fix-qwen35-mtp-preservation

Conversation

@erm14254 erm14254 commented May 8, 2026

Summary

Fixes MTP tensor preservation for dense Qwen3.5/Qwen3.6 model definitions.

Dense Qwen3.5/Qwen3.6 checkpoints can include auxiliary mtp.* tensors, but qwen3_5.py and qwen3_5_text.py did not preserve them during GPTQModel save. The MoE definition already preserves these tensors via out_of_model_tensors = {"prefixes": ["mtp"]}.

What Changed

  • Added out_of_model_tensors = {"prefixes": ["mtp"]} to Qwen3_5QModel (a sketch follows this list).
  • Added out_of_model_tensors = {"prefixes": ["mtp"]} to Qwen3_5TextQModel.
  • Added a targeted unit test checking that both dense Qwen3.5/Qwen3.6 definitions preserve mtp.* auxiliary tensors.
  • No API changes.
  • No migration required.
  • Follow-up work out of scope: broader runtime validation across every Qwen3.5/Qwen3.6 checkpoint variant.
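
For illustration, here is a minimal sketch of what the change amounts to in the two dense definitions. The stand-in base class is hypothetical; the real classes inherit from GPTQModel's own base model class:

class _BaseQModelStub:  # hypothetical stand-in for GPTQModel's base class
    out_of_model_tensors = None  # default: no auxiliary tensors are preserved

class Qwen3_5QModel(_BaseQModelStub):
    # Preserve auxiliary mtp.* tensors at save time, mirroring qwen3_5_moe.py.
    out_of_model_tensors = {"prefixes": ["mtp"]}

class Qwen3_5TextQModel(_BaseQModelStub):
    out_of_model_tensors = {"prefixes": ["mtp"]}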

Tests

Every working PR must include at least one new simple, fast, targeted unit test when the change affects behavior, fixes a bug, or covers a regression path.

  • I added a new simple/fast unit test for this change, or documented why that is not applicable.
  • I ran the new targeted test locally before opening this PR.
  • I ran any other directly relevant local tests.

Paste the exact test commands and results here:

python -m pytest tests/unit/models/definitions/test_qwen3_5_mtp.py

Result:

================================================= test session starts =================================================
platform win32 -- Python 3.12.10, pytest-9.0.3, pluggy-1.6.0 -- D:\q35-gptq\Scripts\python.exe
cachedir: .pytest_cache
rootdir: D:\GPTQModel-mtp-fix\tests
configfile: pytest.ini
plugins: anyio-4.13.0
collected 2 items

tests\unit\models\definitions\test_qwen3_5_mtp.py::test_qwen3_5_preserves_mtp_out_of_model_tensors PASSED
tests\unit\models\definitions\test_qwen3_5_mtp.py::test_qwen3_5_text_preserves_mtp_out_of_model_tensors PASSED

=========================================== 2 passed, 16 warnings in 13.75s ===========================================

Additional local observation:

The Qwen3.6-27B dense source checkpoint contains 15 mtp.* tensors in model-auxiliary.safetensors.

Before this change, the dense model loaded as:
gptqmodel.models.definitions.qwen3_5.Qwen3_5QModel

and had:
out_of_model_tensors = None

The resulting GPTQ output had 0 mtp.* tensors.

By comparison, the Qwen3.6-35B-A3B MoE model already preserved 19 mtp.* tensors because qwen3_5_moe.py defines:
out_of_model_tensors = {"prefixes": ["mtp"]}
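
A quick way to reproduce the tensor counts above (illustrative only; assumes the safetensors package and a local shard path):

from safetensors import safe_open

def count_mtp_tensors(shard_path: str) -> int:
    # Count tensors in one shard whose names carry the mtp. prefix.
    with safe_open(shard_path, framework="pt") as f:
        return sum(1 for name in f.keys() if name.startswith("mtp."))

print(count_mtp_tensors("model-auxiliary.safetensors"))  # 15 for Qwen3.6-27B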

Review Requirements

AI-assisted code is welcome.

Every changed file must still be properly reviewed by a human before the PR is opened as ready for review.

We will not accept PRs that are effectively unreviewed AI output. Non-human-reviewed changes often introduce obscure structure, mismatched APIs, project-inconsistent code patterns, or unnecessary monkeypatching instead of a correct fix or clean feature expansion.

  • I personally reviewed every file in this diff.
  • I checked that the code matches existing project structure, APIs, and conventions.
  • I avoided unnecessary monkeypatching and used the project's normal extension points where possible.

Notes

This mirrors the existing behavior in qwen3_5_moe.py, which already preserves auxiliary MTP tensors with:

out_of_model_tensors = {"prefixes": ["mtp"]}

The change uses the existing out_of_model_tensors save/export path and does not add any new special-case save logic.

The intended effect is that dense Qwen3.5/Qwen3.6 models with mtp.* auxiliary tensors keep those tensors during save, matching the existing MoE behavior.
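
As a hedged sketch (not GPTQModel's actual save code; the function name and auxiliary-file layout are assumptions), a prefix-based merge along that path might look like:

from safetensors.torch import load_file

def merge_out_of_model_tensors(state_dict, aux_path, prefixes):
    # Copy tensors whose names match a preserved prefix into the state dict.
    merged = 0
    for name, tensor in load_file(aux_path).items():
        if any(name == p or name.startswith(f"{p}.") for p in prefixes):
            state_dict[name] = tensor
            merged += 1
    return merged  # count reported in logs, e.g. "Merged 19 tensors with prefixes ['mtp.']"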

@erm14254 erm14254 force-pushed the fix-qwen35-mtp-preservation branch from 4265779 to dfab1c6 on May 8, 2026 21:44
@Qubitium Qubitium (Collaborator) commented May 9, 2026

@erm14254 Looks good! But can you improve the unit test? Right now it actually does not test anything.

Comment on lines +1 to +10 of tests/unit/models/definitions/test_qwen3_5_mtp.py
from gptqmodel.models.definitions.qwen3_5 import Qwen3_5QModel
from gptqmodel.models.definitions.qwen3_5_text import Qwen3_5TextQModel


def test_qwen3_5_preserves_mtp_out_of_model_tensors():
    assert Qwen3_5QModel.out_of_model_tensors == {"prefixes": ["mtp"]}


def test_qwen3_5_text_preserves_mtp_out_of_model_tensors():
    assert Qwen3_5TextQModel.out_of_model_tensors == {"prefixes": ["mtp"]}
Qubitium (Collaborator) commented:

These two tests are testing code you have written, not the actual execution effects of the code. We need to test that the code actually fixes the bug.

@erm14254 erm14254 (Author) replied May 9, 2026:

Okay, here is some context. With GPTQModel 7.1.0-dev, here is what happens:

Qwen3.6-27B: MTP tensors get dropped and you end up with a GPTQ output containing 0 mtp.* tensors.

Qwen3.6-35B-A3B: MTP tensors are detected and imported successfully, with an informational message that says:

INFO Model: Merged 19 tensors with prefixes ['mtp.'] into the state

So basically, whatever qwen3_5_moe.py and qwen3_5_moe_text.py are doing that qwen3_5.py and qwen3_5_text.py aren't, it is clearly working:

According to model.safetensors.index.json, this is the distribution of mtp.* tensors in Qwen3.6-35B-A3B:

`"mtp.fc.weight": "model-00005-of-00006.safetensors",
"mtp.layers.0.input_layernorm.weight": "model-00005-of-00006.safetensors",
"mtp.layers.0.mlp.experts.down_proj": "model-00005-of-00006.safetensors",
"mtp.layers.0.mlp.experts.gate_up_proj": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.gate.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert.down_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert.gate_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert.up_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert_gate.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.k_norm.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.k_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.q_norm.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.q_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
"mtp.norm.weight": "model-00006-of-00006.safetensors",
"mtp.pre_fc_norm_embedding.weight": "model-00006-of-00006.safetensors",
"mtp.pre_fc_norm_hidden.weight": "model-00006-of-00006.safetensors"

But besides that: okay, sure, I'll update the test to use a tiny synthetic checkpoint folder with a model-auxiliary.safetensors file containing both an mtp.* tensor and a non-MTP tensor, then run the same prefix-normalization and merge helper used by the save path. That should verify that the dense Qwen3.5/Qwen3.6 definitions actually cause mtp.* tensors to be merged into the saved state dict, instead of just checking that the attribute exists.

It covers both Qwen3_5QModel and Qwen3_5TextQModel. Here is the new test file: test_qwen3_5_mtp.py
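
A condensed sketch of that approach (illustrative only: the actual file drives the project's own merge helper, whereas this version inlines an equivalent prefix filter; the tensor names are made up):

import torch
from safetensors.torch import load_file, save_file


def test_dense_definitions_merge_mtp_tensors(tmp_path):
    # Build a tiny synthetic auxiliary checkpoint shard.
    aux_file = tmp_path / "model-auxiliary.safetensors"
    save_file(
        {"mtp.fc.weight": torch.zeros(2, 2), "other.weight": torch.ones(2)},
        str(aux_file),
    )

    # Apply the same prefix filter the save path applies for ["mtp"].
    prefixes = ["mtp"]
    merged = {
        name: tensor
        for name, tensor in load_file(str(aux_file)).items()
        if any(name == p or name.startswith(f"{p}.") for p in prefixes)
    }

    # The mtp.* tensor survives; the unrelated tensor does not.
    assert set(merged) == {"mtp.fc.weight"}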

Download it and run whichever tests you want. Let me know if you need anything else or if you have more questions. Thanks!
