
[FIX] test_hymba #2872

Open
ZX-ModelCloud wants to merge 5 commits into main from zx_fix_hymba

@ZX-ModelCloud
Collaborator

Summary

Fix test_hymba

What Changed

1. shared_kv_cache_dict was only populated when reuse_kv=True.

That breaks models like Hymba, where not every decoder layer has reuse_kv=True: an earlier layer may need to publish its KV for later layers even when that layer itself does not consume kv_last_layer. In those cases, prev_kv is still empty by the time a later layer actually needs it.

This change adds a model-level write_shared_kv_cache switch on BaseQModel, keeps the default behavior unchanged, and enables it for Hymba.
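The failure mode and the effect of the switch can be illustrated with a toy loop. This is a minimal sketch, not the code in this PR: ToyLayer, run_layers, and the string stand-ins for KV tensors are hypothetical, and only the names shared_kv_cache_dict, reuse_kv, prev_kv, and write_shared_kv_cache come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ToyLayer:
    """Hypothetical stand-in for a decoder layer."""
    index: int
    reuse_kv: bool  # True when this layer consumes KV from the previous layer


def run_layers(
    layers: List[ToyLayer], write_shared_kv_cache: bool
) -> List[Tuple[int, Optional[str]]]:
    """Return (layer index, prev_kv) for every layer that has reuse_kv=True."""
    shared_kv_cache_dict: Dict[int, Optional[str]] = {}
    received: List[Tuple[int, Optional[str]]] = []

    for layer in layers:
        # KV published by the previous layer, if any.
        prev_kv = shared_kv_cache_dict.get(layer.index - 1)

        if layer.reuse_kv:
            received.append((layer.index, prev_kv))
            kv = prev_kv  # reuse the earlier layer's KV
        else:
            kv = f"kv{layer.index}"  # placeholder for freshly computed KV

        # Assumed old behavior: publish only when reuse_kv is True.
        # With the switch on, every layer publishes its KV.
        if layer.reuse_kv or write_shared_kv_cache:
            shared_kv_cache_dict[layer.index] = kv

    return received


# Layer 0 produces KV but does not reuse any; layer 1 wants layer 0's KV.
layers = [ToyLayer(index=0, reuse_kv=False), ToyLayer(index=1, reuse_kv=True)]

print(run_layers(layers, write_shared_kv_cache=False))  # layer 1 gets no KV
print(run_layers(layers, write_shared_kv_cache=True))   # layer 1 gets layer 0's KV
```

With the switch off, layer 0 never writes to shared_kv_cache_dict, so layer 1 receives None. With it on, layer 1 receives the KV that layer 0 published, which is the case Hymba needs.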

2. Hymba is now compatible with transformers v5.

Signed-off-by: ZX-ModelCloud <zx@modelcloud.ai>
Comment thread gptqmodel/utils/hf.py Fixed