[OpenVINO] Support Qwen3.5 and Qwen3.5-MoE by rkazants · Pull Request #1634 · huggingface/optimum-intel

rkazants · 2026-03-08T18:48:12Z

What does this PR do?

Fixes 181271, 181280, 182003

Installation instructions:

pip install git+https://github.com/rkazants/optimum-intel.git@support_qwen3_5
pip install --pre -U openvino openvino-tokenizers nncf --extra-index-url https://storage.openvinotoolkit.org/simple/wheels/nightly
pip install transformers==5.2.0
pip install requests torchvision opencv-python

Export command:

optimum-cli export openvino -m Qwen/Qwen3.5-0.8B Qwen3.5-0.8B

Inference script:

from transformers import AutoProcessor
from transformers.video_utils import load_video
from huggingface_hub import hf_hub_download
from optimum.intel.openvino import OVModelForVisualCausalLM

model_dir = "Qwen/Qwen3.5-0.8B"

processor = AutoProcessor.from_pretrained(model_dir)
model = OVModelForVisualCausalLM.from_pretrained(model_dir)

# Prepare video input
video_path = hf_hub_download(
    repo_id="raushan-testing-hf/videos-test",
    filename="sample_demo_1.mp4",
    repo_type="dataset",
)
input_video, _ = load_video(video_path, num_frames=10, backend="opencv")

messages = [
    {"role": "user", "content": [
        {"type": "video"},
        {"type": "text", "text": "Why is this video funny?"},
    ]}
]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], videos=[input_video], return_tensors="pt")

# Run inference
output_ids = model.generate(**inputs, max_new_tokens=100)
output_text = processor.decode(output_ids[0], skip_special_tokens=True)

print(output_text)
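
The export command above writes the converted model to a local `Qwen3.5-0.8B` directory, while the script loads the Hub ID `Qwen/Qwen3.5-0.8B`. A minimal sketch of a helper that prefers an already exported directory could look like this. The name `resolve_model_dir` and the check for `*.xml` IR files are illustrative assumptions, not part of this PR.

```python
from pathlib import Path


def resolve_model_dir(local_dir: str, hub_id: str) -> str:
    """Return local_dir if it looks like an exported OpenVINO model, else hub_id.

    Assumption: an exported directory contains at least one OpenVINO IR
    file (*.xml). The exact file names are not checked here.
    """
    path = Path(local_dir)
    if path.is_dir() and any(path.glob("*.xml")):
        return str(path)
    return hub_id


if __name__ == "__main__":
    model_dir = resolve_model_dir("Qwen3.5-0.8B", "Qwen/Qwen3.5-0.8B")
    print(model_dir)  # pass this value to from_pretrained(...)
```

The returned string can replace the hard-coded `model_dir` in the script, so a previously exported model is reused when present.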

Before submitting

[N/A] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?


savvadesogle · 2026-03-09T13:06:04Z

Thank you!! 🙏♥️😊


ikirsh · 2026-03-13T18:25:31Z

Can we ensure this PR includes a hardware compatibility check for the Core Ultra 200 series (245 through 285) and other Xe platforms?

Previous OpenVINO MoE optimizations have caused kernel-level failures on these platforms without any documented warnings. We need to verify that this PR either provides full support or, at a minimum, documents and implements a graceful exit or error message rather than a system crash.

See this issue:

gpt-oss-20b-int4-ov runs on CPU but triggers OOM on iGPU #34416

and related issues:

qwen3-30b-a3b on ovms, works on CPU, crashs with out of memory on iGPU #34187
Qwen3-Coder-30B-A3B-Instruct-int4-ov runs on CPU but triggers OOM on iGPU #34415
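
The request above asks for a graceful error instead of a crash when a device cannot run the model. A minimal sketch of one such pattern, trying devices in order and falling back on failure, is given here. `load_with_fallback` and the `loader` callable are hypothetical names; in practice `loader` might wrap a call such as `OVModelForVisualCausalLM.from_pretrained(model_dir, device=device)`. This only helps when the failure surfaces as a Python exception; a kernel-level crash or driver hang would not be caught.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def load_with_fallback(
    loader: Callable[[str], T],
    devices: Sequence[str] = ("GPU", "CPU"),
) -> tuple[T, str]:
    """Try each device in order and return (model, device) for the first success."""
    errors = []
    for device in devices:
        try:
            return loader(device), device
        except Exception as exc:  # e.g. an out-of-memory error raised at compile time
            errors.append(f"{device}: {exc}")
    # No device worked: raise one readable error that lists every failure.
    raise RuntimeError("Model failed to load on all devices:\n" + "\n".join(errors))


if __name__ == "__main__":
    def fake_loader(device: str) -> str:
        # Stand-in for a real loader; pretends the GPU runs out of memory.
        if device == "GPU":
            raise MemoryError("out of memory")
        return f"model-on-{device}"

    model, used = load_with_fallback(fake_loader)
    print(used)
```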


lzhu41 · 2026-03-27T08:35:27Z

Hi all, when will this PR be merged? Qwen3.5 dense and MoE are currently the most important models in the PRC, so this has a big business impact for our BU. Thank you!

malasy · 2026-04-05T06:14:07Z

ValueError: Asked to export a qwen3_5 model for the task text-generation-with-past, but the Optimum OpenVINO exporter only supports the tasks image-text-to-text for qwen3_5. Please use a supported task. Please open an issue at https://github.com/huggingface/optimum-intel/issues if you would like the task text-generation-with-past to be supported in the OpenVINO export for qwen3_5.
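
The error above indicates that, in this branch, the exporter supports only the `image-text-to-text` task for `qwen3_5`. A minimal sketch that builds the export command with the task set explicitly through the `--task` option of `optimum-cli export openvino` follows. The helper name `build_export_cmd` is illustrative, and whether passing the task explicitly avoids this error for a given checkpoint is an assumption, not something confirmed in this thread.

```python
import shlex


def build_export_cmd(model_id: str, output_dir: str, task: str = "image-text-to-text") -> list[str]:
    """Build the optimum-cli argv for an OpenVINO export with an explicit task."""
    return ["optimum-cli", "export", "openvino", "-m", model_id, "--task", task, output_dir]


if __name__ == "__main__":
    cmd = build_export_cmd("Qwen/Qwen3.5-0.8B", "Qwen3.5-0.8B")
    print(shlex.join(cmd))
    # To run the export (requires the packages from the installation instructions):
    # import subprocess; subprocess.run(cmd, check=True)
```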


WizardlyBump17 · 2026-04-15T14:57:33Z

Hello. I see this Pull Request and the Gemma 4 one were closed. Any reasons? Will we be seeing those models on OpenVINO any time soon?

sund00bie · 2026-04-16T10:28:33Z

> Hello. I see this Pull Request and the Gemma 4 one were closed. Any reasons? Will we be seeing those models on OpenVINO any time soon?

These are the PRs you'll need to follow now:

openvinotoolkit/openvino.genai#3644
openvinotoolkit/openvino.genai#3717

Copilot AI and others added 4 commits March 8, 2026 22:37

Add _ov_ops.py with RecurrentAttentionCellOp conversion rule

8574954

Add conversion rule for the RecurrentAttentionCellOp operation used for GatedDeltaNet patching in OpenVINO PyTorch frontend. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add initial Qwen3.5 model support with VLM and hybrid text model

050d14f

Co-authored-by: rkazants <35459624+rkazants@users.noreply.github.com>

Fix Qwen3.5 model patcher and config for VLM text embeddings access

4cbb25e

Co-authored-by: rkazants <35459624+rkazants@users.noreply.github.com>

Fix comment grammar in test_decoder.py

b660200

Co-authored-by: rkazants <35459624+rkazants@users.noreply.github.com>

rkazants mentioned this pull request Mar 9, 2026

[OpenVINO] Add Qwen3.5 (Gated Delta Networks) export and inference support #1635

Closed

4 tasks

rkazants added 3 commits March 11, 2026 10:28

Use Qwen3VLOpenVINOConfig

07d943d

Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>

Merge remote-tracking branch 'upstream/transformers-v5' into add-support-qwen-3-5

9a91793

Remove redundant functions

d8864c4

Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>

This was referenced Mar 11, 2026

Add support qwen 3 5 #1633

Closed

Qwen3.5 Family Support ❤️ #1628

Open

🙏 [Feature Request]: Add support for GLM-4.7-Flash openvinotoolkit/openvino#33776

Open

blairducrayoppat mentioned this pull request Mar 13, 2026

[Bug][GPU]: ScatterUpdate precision loss (fp16 down-cast) and compilation crash inside Loop body on GPU openvinotoolkit/openvino#34532

Closed

3 tasks

rkazants mentioned this pull request Mar 16, 2026

Feature/qwen3.5 export support #1638

Closed

rkazants added 5 commits March 18, 2026 22:00

Merge remote-tracking branch 'upstream/transformers-v5' into support_qwen3_5

057ce12

Correct patching for vlm

934b32e

Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>

Fix bf16 patching

e1f8c28

Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>

Merge remote-tracking branch 'upstream/transformers-v5' into support_qwen3_5

ea94354

Support Qwen3.5-MoE

4602e00

Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>

rkazants changed the title from "[OpenVINO] Support Qwen3.5" to "[OpenVINO] Support Qwen3.5 and Qwen3.5-MoE" on Mar 22, 2026

rkazants linked an issue Mar 23, 2026 that may be closed by this pull request

Qwen3.5 Family Support ❤️ #1628

Open

MissLostCodes mentioned this pull request Mar 27, 2026

[OpenVINO] Support Kimi 2.5 (kimi_k25) Export and Inference Resolves issue #1647 #1648

Open

zhaohb mentioned this pull request Apr 7, 2026

Add Qwen3.5 hybrid model support openvinotoolkit/openvino.genai#3592

Closed

zjhmax777 mentioned this pull request Apr 8, 2026

[Feature Request]: Qwen3.5 Model Support openvinotoolkit/openvino#35171

Open

1 task

GH-Jo mentioned this pull request Apr 8, 2026

[Bug]: Inconsistent tensor shape when running Qwen3.5 on NPU openvinotoolkit/openvino#35209

Open

3 tasks

Add position_ids input and its preparation for inference

cbe127e

Signed-off-by: Kazantsev, Roman <roman.kazantsev@intel.com>

echarlaix deleted the branch huggingface:transformers-v5 April 15, 2026 07:21

echarlaix closed this Apr 15, 2026

rkazants mentioned this pull request Apr 15, 2026

[OpenVINO] Support Qwen3.5, Qwen3.5-MoE and Qwen3.6 #1689

Open

1 task

yatarkan mentioned this pull request Apr 15, 2026

[VLM] Enable Qwen3.5 (SDPA only) openvinotoolkit/openvino.genai#3717

Open

4 tasks

rkazants wants to merge 13 commits into huggingface:transformers-v5 from rkazants:support_qwen3_5

9 participants
