fix: preserve mtp.* tensors for dense Qwen3.5/Qwen3.6 models #2869
erm14254 wants to merge 1 commit into ModelCloud:main
Conversation
4265779 to dfab1c6
@erm14254 Looks good! But can you improve the unit test? Right now it actually does not test anything.
```python
from gptqmodel.models.definitions.qwen3_5 import Qwen3_5QModel
from gptqmodel.models.definitions.qwen3_5_text import Qwen3_5TextQModel


def test_qwen3_5_preserves_mtp_out_of_model_tensors():
    assert Qwen3_5QModel.out_of_model_tensors == {"prefixes": ["mtp"]}


def test_qwen3_5_text_preserves_mtp_out_of_model_tensors():
    assert Qwen3_5TextQModel.out_of_model_tensors == {"prefixes": ["mtp"]}
```
These two tests are testing the code you have written, not the actual execution effects of the code. We need to test that the code actually fixes the bug.
Okay, here is some context. With GPTQModel 7.1.0-dev, here is what happens:
Qwen3.6-27B: the MTP tensors get dropped, and you end up with a GPTQ checkpoint containing 0 MTP tensors.
Qwen3.6-35B-A3B: the MTP tensors get detected and imported successfully, with the informational message:
INFO Model: Merged 19 tensors with prefixes ['mtp.'] into the state
So whatever it is that `qwen3_5_moe.py` and `qwen3_5_moe_text.py` are doing that `qwen3_5.py` and `qwen3_5_text.py` aren't, it is clearly working.
According to model.safetensors.index.json this is the distribution of MTPs in Qwen3.6-35B-A3B:
`"mtp.fc.weight": "model-00005-of-00006.safetensors",
"mtp.layers.0.input_layernorm.weight": "model-00005-of-00006.safetensors",
"mtp.layers.0.mlp.experts.down_proj": "model-00005-of-00006.safetensors",
"mtp.layers.0.mlp.experts.gate_up_proj": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.gate.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert.down_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert.gate_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert.up_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert_gate.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.k_norm.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.k_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.q_norm.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.q_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
"mtp.norm.weight": "model-00006-of-00006.safetensors",
"mtp.pre_fc_norm_embedding.weight": "model-00006-of-00006.safetensors",
"mtp.pre_fc_norm_hidden.weight": "model-00006-of-00006.safetensors"
But besides that, okay, sure: I'll update the test to use a tiny synthetic checkpoint folder with a `model-auxiliary.safetensors` file containing both an `mtp.*` tensor and a non-MTP tensor, then run the same prefix-normalization and merge helper used by the save path. That should verify that the dense Qwen3.5/Qwen3.6 definitions actually cause `mtp.*` tensors to be merged into the saved state dict, instead of just checking that the attribute exists.
It covers both `Qwen3_5QModel` and `Qwen3_5TextQModel`. Here is the new test file: test_qwen3_5_mtp.py
Download it and run the tests you want; let me know if you need anything else or if you have more questions. Thanks!
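As a rough illustration of the behavior that test exercises, here is a minimal, self-contained sketch. The function name `merge_out_of_model_tensors` and the dict shapes are assumptions for illustration only; the real merge helper lives inside GPTQModel's save path and is not reproduced here:

```python
# Hedged sketch of the prefix-based merge the save path performs.
# merge_out_of_model_tensors is a hypothetical stand-in, not GPTQModel's API.
def merge_out_of_model_tensors(state_dict, aux_tensors, out_of_model_tensors):
    """Merge auxiliary tensors whose names match a configured prefix."""
    if not out_of_model_tensors:
        # Dense Qwen3.5/Qwen3.6 before the fix: attribute is None, nothing merged.
        return dict(state_dict)
    prefixes = tuple(out_of_model_tensors.get("prefixes", []))
    merged = dict(state_dict)
    for name, tensor in aux_tensors.items():
        if name.startswith(prefixes):
            merged[name] = tensor
    return merged

# Toy "auxiliary file": one mtp.* tensor and one unrelated tensor.
aux = {"mtp.fc.weight": [0.1], "other.extra.weight": [0.2]}

# With the fixed dense config, the mtp.* tensor survives the merge...
fixed = merge_out_of_model_tensors({"model.embed.weight": [1.0]}, aux,
                                   {"prefixes": ["mtp"]})
assert "mtp.fc.weight" in fixed
assert "other.extra.weight" not in fixed

# ...while the old dense behavior (out_of_model_tensors = None) drops everything.
old = merge_out_of_model_tensors({"model.embed.weight": [1.0]}, aux, None)
assert "mtp.fc.weight" not in old
```

The real test replaces the toy dicts with tensors loaded from the synthetic `model-auxiliary.safetensors` file, but the pass/fail logic is the same: matching prefixes are merged, everything else is left out.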
Summary
Fixes MTP tensor preservation for dense Qwen3.5/Qwen3.6 model definitions.
Dense Qwen3.5/Qwen3.6 checkpoints can include auxiliary `mtp.*` tensors, but `qwen3_5.py` and `qwen3_5_text.py` did not preserve them during GPTQModel save. The MoE definition already preserves these tensors via `out_of_model_tensors = {"prefixes": ["mtp"]}`.
What Changed
- Added `out_of_model_tensors = {"prefixes": ["mtp"]}` to `Qwen3_5QModel`.
- Added `out_of_model_tensors = {"prefixes": ["mtp"]}` to `Qwen3_5TextQModel`.
- Dense Qwen3.5/Qwen3.6 saves now keep `mtp.*` auxiliary tensors.
Tests
Every working PR must include at least one new simple, fast, targeted unit test when the change affects behavior, a bug fix, or a regression path.
Paste the exact test commands and results here:
python -m pytest tests/unit/models/definitions/test_qwen3_5_mtp.py
Result:
```
================================================= test session starts =================================================
platform win32 -- Python 3.12.10, pytest-9.0.3, pluggy-1.6.0 -- D:\q35-gptq\Scripts\python.exe
cachedir: .pytest_cache
rootdir: D:\GPTQModel-mtp-fix\tests
configfile: pytest.ini
plugins: anyio-4.13.0
collected 2 items

tests\unit\models\definitions\test_qwen3_5_mtp.py::test_qwen3_5_preserves_mtp_out_of_model_tensors PASSED
tests\unit\models\definitions\test_qwen3_5_mtp.py::test_qwen3_5_text_preserves_mtp_out_of_model_tensors PASSED

=========================================== 2 passed, 16 warnings in 13.75s ===========================================
```
Additional local observation:
- The Qwen3.6-27B dense source checkpoint contains 15 `mtp.*` tensors in `model-auxiliary.safetensors`.
- Before this change, the dense model loaded as `gptqmodel.models.definitions.qwen3_5.Qwen3_5QModel` and had `out_of_model_tensors = None`, so the resulting GPTQ output had 0 `mtp.*` tensors.
- By comparison, Qwen3.6-35B-A3B (MoE) already preserved 19 `mtp.*` tensors, because `qwen3_5_moe.py` already defines `out_of_model_tensors = {"prefixes": ["mtp"]}`.
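Counts like the 15 and 19 above can be reproduced from a checkpoint's `model.safetensors.index.json`. A minimal sketch, with a tiny inlined stand-in for the real index (the actual Qwen3.6 index is far larger):

```python
import json

# Toy stand-in for model.safetensors.index.json; real indexes map every
# tensor name to the shard file that stores it.
index_text = json.dumps({
    "weight_map": {
        "model.embed_tokens.weight": "model-00001-of-00006.safetensors",
        "mtp.fc.weight": "model-00005-of-00006.safetensors",
        "mtp.norm.weight": "model-00006-of-00006.safetensors",
    }
})

weight_map = json.loads(index_text)["weight_map"]
mtp_keys = sorted(k for k in weight_map if k.startswith("mtp."))
print(len(mtp_keys))  # prints 2 for this toy index
```

Pointing the same loop at a real checkpoint directory gives the MTP tensor count before and after quantization, which is how the "0 vs 19" discrepancy was spotted.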
Review Requirements
AI-assisted code is welcome.
Every changed file must still be properly reviewed by a human before the PR is opened as ready for review.
We will not accept PRs that are effectively unreviewed AI output. Non-human-reviewed changes often introduce obscure structure, mismatched APIs, project-inconsistent code patterns, or unnecessary monkeypatching instead of a correct fix or clean feature expansion.
Notes
This mirrors the existing behavior in `qwen3_5_moe.py`, which already preserves auxiliary MTP tensors with:
`out_of_model_tensors = {"prefixes": ["mtp"]}`
The change uses the existing `out_of_model_tensors` save/export path and does not add any new special-case save logic. The intended effect is that dense Qwen3.5/Qwen3.6 models with `mtp.*` auxiliary tensors keep those tensors during save, matching the existing MoE behavior.
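The shape of the change itself is tiny. A sketch with illustrative stand-in classes (only the `out_of_model_tensors` attribute and the two class names come from the PR; the base class and inheritance here are assumptions):

```python
# Stand-in for GPTQModel's base model definition; the real base class differs.
class BaseQModel:
    out_of_model_tensors = None  # default: no auxiliary tensors preserved on save

class Qwen3_5QModel(BaseQModel):
    # The fix: mirror qwen3_5_moe.py so mtp.* tensors survive the save path.
    out_of_model_tensors = {"prefixes": ["mtp"]}

class Qwen3_5TextQModel(BaseQModel):
    out_of_model_tensors = {"prefixes": ["mtp"]}

assert Qwen3_5QModel.out_of_model_tensors == {"prefixes": ["mtp"]}
assert Qwen3_5TextQModel.out_of_model_tensors == {"prefixes": ["mtp"]}
```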