
[quantization] Add gptq_use_orig_model_inference #702

Merged
mhs4670go merged 1 commit into Samsung:main from stamalakhov:gptq_use_orig_model
May 13, 2026


@stamalakhov
Contributor

This PR adds `gptq_use_orig_model_inference` to stabilize GPTQ for deep models.

Comparison of GPTQ quantization stability across torch versions for HuggingFaceTB/SmolLM2-135M-Instruct (PPL = perplexity):

| torch | original_PPL | gptq_use_orig_model_inference_PPL |
| --- | --- | --- |
| torch_2_6 | 22.76 | 22.95 |
| torch_2_9 | 22.83 | 22.95 |
| torch_2_10 | 22.81 | 22.95 |
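
One way to read the table is to look at how much each column varies across torch versions. The snippet below is a minimal sketch of that arithmetic using only the values listed above; the dictionary layout and the `spread` helper are illustrative names chosen here and are not part of the PR or of the library.

```python
# PPL values copied from the table above, keyed by torch version.
# The dict layout and the helper name are hypothetical, for illustration only.
ppl = {
    "torch_2_6": {"original_PPL": 22.76, "gptq_use_orig_model_inference_PPL": 22.95},
    "torch_2_9": {"original_PPL": 22.83, "gptq_use_orig_model_inference_PPL": 22.95},
    "torch_2_10": {"original_PPL": 22.81, "gptq_use_orig_model_inference_PPL": 22.95},
}


def spread(column: str) -> float:
    """Return max minus min of one table column across all torch versions."""
    values = [row[column] for row in ppl.values()]
    return max(values) - min(values)


for column in ("original_PPL", "gptq_use_orig_model_inference_PPL"):
    print(f"{column}: spread = {spread(column):.2f}")
```

With these numbers the `original_PPL` column varies by 0.07 across the three torch versions, while the `gptq_use_orig_model_inference_PPL` column does not vary at all.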

Please see the additional experiments in draft #670.

Draft: #670
Related: #656

TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>

@stamalakhov stamalakhov requested a review from mhs4670go May 13, 2026 11:53
@stamalakhov stamalakhov self-assigned this May 13, 2026
This PR adds `gptq_use_orig_model_inference` to stabilize GPTQ for deep models.

TICO-DCO-1.0-Signed-off-by: s.malakhov <s.malakhov@partner.samsung.com>
@stamalakhov stamalakhov force-pushed the gptq_use_orig_model branch from 6e91b79 to d28dae4 May 13, 2026 12:30
Contributor

@mhs4670go mhs4670go left a comment

LGTM

@mhs4670go mhs4670go merged commit 7e9b3e5 into Samsung:main May 13, 2026
7 checks passed
@stamalakhov stamalakhov deleted the gptq_use_orig_model branch May 13, 2026 12:46