Fix weight layout detection for MatMul with transpose in OpenVINO backend #3958

Open
naraen-ram wants to merge 1 commit into openvinotoolkit:develop from naraen-ram:fix-matmul-transpose-compression

Fixes #3230

Description

Fix incorrect weight layout detection for MatMul layers when the transpose is applied by a node in the OpenVINO graph rather than through the constant's attributes.

Previously, constant_layer_attrs["transpose"] did not reflect graph-level Transpose nodes, so the channel axis could be detected incorrectly during weight compression.

With this change, the input_attributes metadata is checked to determine the transpose state before the layout is computed.
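
As a rough illustration of why the transpose state matters for layout, here is a minimal sketch. It is not the code from this PR or from NNCF: the helper names `resolve_transpose` and `weight_axes`, the plain-dict shape of the two metadata sources, and the "input metadata wins when present" precedence are all assumptions made for the example.

```python
from typing import Any, Mapping, Tuple


def resolve_transpose(constant_attrs: Mapping[str, Any],
                      input_attrs: Mapping[str, Any]) -> bool:
    """Hypothetical helper: pick the transpose flag used for layout detection.

    Assumption: a flag recorded on the MatMul input takes precedence, and the
    flag stored with the constant is only a fallback.
    """
    if "transpose" in input_attrs:
        return bool(input_attrs["transpose"])
    return bool(constant_attrs.get("transpose", False))


def weight_axes(ndim: int, transpose: bool) -> Tuple[int, int]:
    """Return (reduction_axis, channel_axis) for a weight on MatMul's second input.

    Not transposed: weight is [..., K, N], so the reduction axis K is ndim - 2.
    Transposed:     weight is [..., N, K], so the reduction axis K is ndim - 1.
    """
    if ndim < 2:
        raise ValueError("MatMul weight must have at least 2 dimensions")
    if transpose:
        return ndim - 1, ndim - 2
    return ndim - 2, ndim - 1


# A stale constant-level flag and an up-to-date input-level flag disagree:
stale = weight_axes(2, transpose=False)
fixed = weight_axes(2, resolve_transpose({"transpose": False}, {"transpose": True}))
```

In this sketch `stale` and `fixed` have their two axes swapped, which is the kind of mismatch the description refers to.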

Testing

Reproduced the issue using a custom Transpose → MatMul OpenVINO model.

Verified using:
pytest tests/openvino/native/quantization/test_weights_compression.py -k matmul

Result:
12 passed, 0 failed

Impact

Fixes weight compression correctness for:

  • AWQ
  • Mixed precision
  • Scale estimation
  • LoRA correction
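
To show why a wrong channel axis affects these methods, the following toy sketch computes one max-abs statistic per channel of a small 2D weight. The helper `per_channel_absmax` and the sample values are hypothetical and use plain Python lists; they are not taken from NNCF.

```python
from typing import List


def per_channel_absmax(weight: List[List[float]], channel_axis: int) -> List[float]:
    """Return one max-abs value per channel of a 2D weight.

    The reduction runs over the axis that is NOT the channel axis.
    """
    if channel_axis == 0:
        return [max(abs(v) for v in row) for row in weight]
    if channel_axis == 1:
        return [max(abs(v) for v in col) for col in zip(*weight)]
    raise ValueError("channel_axis must be 0 or 1 for a 2D weight")


# Weight stored as [K=2, N=3]: the output channels lie along axis 1.
weight = [
    [1.0, -8.0, 0.5],
    [-2.0, 4.0, 0.25],
]

correct = per_channel_absmax(weight, channel_axis=1)  # [2.0, 8.0, 0.5]
wrong = per_channel_absmax(weight, channel_axis=0)    # [8.0, 4.0]
```

With the wrong axis, both the number of statistics and their values change, so anything derived from them per channel would be computed against the wrong groups of weights.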

Development

Successfully merging this pull request may close these issues.

[Good First Issue][NNCF]: Support transposed input for data-aware weight compression methods