fix: preserve mtp.* tensors for dense Qwen3.5/Qwen3.6 models #2869

Open

erm14254 wants to merge 1 commit into ModelCloud:main from erm14254:fix-qwen35-mtp-preservation

Conversation

@erm14254 erm14254 commented May 8, 2026

Summary

Fixes MTP tensor preservation for dense Qwen3.5/Qwen3.6 model definitions.

Dense Qwen3.5/Qwen3.6 checkpoints can include auxiliary mtp.* tensors, but qwen3_5.py and qwen3_5_text.py did not preserve them during GPTQModel save. The MoE definition already preserves these tensors via out_of_model_tensors = {"prefixes": ["mtp"]}.

What Changed

  • Added out_of_model_tensors = {"prefixes": ["mtp"]} to Qwen3_5QModel (a sketch follows this list).
  • Added out_of_model_tensors = {"prefixes": ["mtp"]} to Qwen3_5TextQModel.
  • Added a targeted unit test checking that both dense Qwen3.5/Qwen3.6 definitions preserve mtp.* auxiliary tensors.
  • No API changes.
  • No migration required.
  • Follow-up work out of scope: broader runtime validation across every Qwen3.5/Qwen3.6 checkpoint variant.
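
For illustration, here is a minimal sketch of what the change amounts to in the two dense definitions. The stand-in base class is hypothetical; the real classes inherit from GPTQModel's own base model class:

class _BaseQModelStub:  # hypothetical stand-in for GPTQModel's base class
    out_of_model_tensors = None  # default: no auxiliary tensors are preserved

class Qwen3_5QModel(_BaseQModelStub):
    # Preserve auxiliary mtp.* tensors at save time, mirroring qwen3_5_moe.py.
    out_of_model_tensors = {"prefixes": ["mtp"]}

class Qwen3_5TextQModel(_BaseQModelStub):
    out_of_model_tensors = {"prefixes": ["mtp"]}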

Tests

Every working PR must include at least one new simple, fast, targeted unit test when the change affects behavior, fixes a bug, or covers a regression path.

  • I added a new simple/fast unit test for this change, or documented why that is not applicable.
  • I ran the new targeted test locally before opening this PR.
  • I ran any other directly relevant local tests.

Paste the exact test commands and results here:

python -m pytest tests/unit/models/definitions/test_qwen3_5_mtp.py

Result:

================================================= test session starts =================================================
platform win32 -- Python 3.12.10, pytest-9.0.3, pluggy-1.6.0 -- D:\q35-gptq\Scripts\python.exe
cachedir: .pytest_cache
rootdir: D:\GPTQModel-mtp-fix\tests
configfile: pytest.ini
plugins: anyio-4.13.0
collected 2 items

tests\unit\models\definitions\test_qwen3_5_mtp.py::test_qwen3_5_preserves_mtp_out_of_model_tensors PASSED
tests\unit\models\definitions\test_qwen3_5_mtp.py::test_qwen3_5_text_preserves_mtp_out_of_model_tensors PASSED

=========================================== 2 passed, 16 warnings in 13.75s ===========================================

Additional local observation:

The Qwen3.6-27B dense source checkpoint contains 15 mtp.* tensors in model-auxiliary.safetensors.

Before this change, the dense model loaded as:
gptqmodel.models.definitions.qwen3_5.Qwen3_5QModel

and had:
out_of_model_tensors = None

The resulting GPTQ output had 0 mtp.* tensors.

By comparison, the Qwen3.6-35B-A3B MoE model already preserved 19 mtp.* tensors because qwen3_5_moe.py defines:
out_of_model_tensors = {"prefixes": ["mtp"]}
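
A quick way to reproduce the tensor counts above (illustrative only; assumes the safetensors package and a local shard path):

from safetensors import safe_open

def count_mtp_tensors(shard_path: str) -> int:
    # Count tensors in one shard whose names carry the mtp. prefix.
    with safe_open(shard_path, framework="pt") as f:
        return sum(1 for name in f.keys() if name.startswith("mtp."))

print(count_mtp_tensors("model-auxiliary.safetensors"))  # 15 for Qwen3.6-27B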

Review Requirements

AI-assisted code is welcome.

Every changed file must still be properly reviewed by a human before the PR is opened as ready for review.

We will not accept PRs that are effectively unreviewed AI output. Non-human-reviewed changes often introduce obscure structure, mismatched APIs, project-inconsistent code patterns, or unnecessary monkeypatching instead of a correct fix or clean feature expansion.

  • I personally reviewed every file in this diff.
  • I checked that the code matches existing project structure, APIs, and conventions.
  • I avoided unnecessary monkeypatching and used the project's normal extension points where possible.

Notes

This mirrors the existing behavior in qwen3_5_moe.py, which already preserves auxiliary MTP tensors with:

out_of_model_tensors = {"prefixes": ["mtp"]}

The change uses the existing out_of_model_tensors save/export path and does not add any new special-case save logic.

The intended effect is that dense Qwen3.5/Qwen3.6 models with mtp.* auxiliary tensors keep those tensors during save, matching the existing MoE behavior.
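
As a hedged sketch (not GPTQModel's actual save code; the function name and auxiliary-file layout are assumptions), a prefix-based merge along that path might look like:

from safetensors.torch import load_file

def merge_out_of_model_tensors(state_dict, aux_path, prefixes):
    # Copy tensors whose names match a preserved prefix into the state dict.
    merged = 0
    for name, tensor in load_file(aux_path).items():
        if any(name == p or name.startswith(f"{p}.") for p in prefixes):
            state_dict[name] = tensor
            merged += 1
    return merged  # count reported in logs, e.g. "Merged 19 tensors with prefixes ['mtp.']"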

@erm14254 erm14254 force-pushed the fix-qwen35-mtp-preservation branch from 4265779 to dfab1c6 on May 8, 2026 21:44
@Qubitium Qubitium (Collaborator) commented May 9, 2026

@erm14254 Looks good! But can you improve the unit test? Right now it actually does not test anything.

Comment on lines +1 to +10 of tests/unit/models/definitions/test_qwen3_5_mtp.py
from gptqmodel.models.definitions.qwen3_5 import Qwen3_5QModel
from gptqmodel.models.definitions.qwen3_5_text import Qwen3_5TextQModel


def test_qwen3_5_preserves_mtp_out_of_model_tensors():
    assert Qwen3_5QModel.out_of_model_tensors == {"prefixes": ["mtp"]}


def test_qwen3_5_text_preserves_mtp_out_of_model_tensors():
    assert Qwen3_5TextQModel.out_of_model_tensors == {"prefixes": ["mtp"]}
Qubitium (Collaborator) commented:

These two tests are testing code you have written, not the actual execution effects of the code. We need to test that the code actually fixes the bug.

@erm14254 erm14254 (Author) replied May 9, 2026:

Okay, here is some context. With GPTQModel 7.1.0-dev, here is what happens:

Qwen3.6-27B: MTP tensors get dropped and you end up with a GPTQ output containing 0 mtp.* tensors.

Qwen3.6-35B-A3B: MTP tensors are detected and imported successfully, with an informational message that says:

INFO Model: Merged 19 tensors with prefixes ['mtp.'] into the state

So basically, whatever qwen3_5_moe.py and qwen3_5_moe_text.py are doing that qwen3_5.py and qwen3_5_text.py aren't, it is clearly working:

According to model.safetensors.index.json, this is the distribution of mtp.* tensors in Qwen3.6-35B-A3B:

`"mtp.fc.weight": "model-00005-of-00006.safetensors",
"mtp.layers.0.input_layernorm.weight": "model-00005-of-00006.safetensors",
"mtp.layers.0.mlp.experts.down_proj": "model-00005-of-00006.safetensors",
"mtp.layers.0.mlp.experts.gate_up_proj": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.gate.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert.down_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert.gate_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert.up_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.mlp.shared_expert_gate.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.post_attention_layernorm.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.k_norm.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.k_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.o_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.q_norm.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.q_proj.weight": "model-00006-of-00006.safetensors",
"mtp.layers.0.self_attn.v_proj.weight": "model-00006-of-00006.safetensors",
"mtp.norm.weight": "model-00006-of-00006.safetensors",
"mtp.pre_fc_norm_embedding.weight": "model-00006-of-00006.safetensors",
"mtp.pre_fc_norm_hidden.weight": "model-00006-of-00006.safetensors"

But besides that: okay, sure, I'll update the test to use a tiny synthetic checkpoint folder with a model-auxiliary.safetensors file containing both an mtp.* tensor and a non-MTP tensor, then run the same prefix-normalization and merge helper used by the save path. That should verify that the dense Qwen3.5/Qwen3.6 definitions actually cause mtp.* tensors to be merged into the saved state dict, instead of just checking that the attribute exists.

It covers both Qwen3_5QModel and Qwen3_5TextQModel. Here is the new test file: test_qwen3_5_mtp.py
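
A condensed sketch of that approach (illustrative only: the actual file drives the project's own merge helper, whereas this version inlines an equivalent prefix filter; the tensor names are made up):

import torch
from safetensors.torch import load_file, save_file


def test_dense_definitions_merge_mtp_tensors(tmp_path):
    # Build a tiny synthetic auxiliary checkpoint shard.
    aux_file = tmp_path / "model-auxiliary.safetensors"
    save_file(
        {"mtp.fc.weight": torch.zeros(2, 2), "other.weight": torch.ones(2)},
        str(aux_file),
    )

    # Apply the same prefix filter the save path applies for ["mtp"].
    prefixes = ["mtp"]
    merged = {
        name: tensor
        for name, tensor in load_file(str(aux_file)).items()
        if any(name == p or name.startswith(f"{p}.") for p in prefixes)
    }

    # The mtp.* tensor survives; the unrelated tensor does not.
    assert set(merged) == {"mtp.fc.weight"}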

Download it and run whichever tests you want. Let me know if you need anything else or if you have more questions. Thanks!
