Add cuda compatibility check for using grouped_mm #45001
Sai-Suraj-27 wants to merge 5 commits into huggingface:main
cc @IlyasMoutawwakil: something we discovered elsewhere, but it definitely makes sense to add these imo. Will properly check it out tomorrow.
Hi! Thanks for investigating this! If I understand correctly, these are not hard constraints (i.e. not breaking); they are just the conditions for the optimised triton/cutedsl paths, right? Otherwise torch just uses the fallback path, no? Or does it actually fail?
It can actually fail on lower torch versions, e.g. iirc 2.8. While it is available there, some SM compute capabilities won't be able to use it, hence the extra guarding here. We discovered those during some qwen moe tests, where @Sai-Suraj-27 ran on torch 2.8, see #44848.
I see, yeah, originally we used to just raise an error if grouped_mm is requested and the version is less than 2.9.
    if hasattr(torch, "_grouped_mm"):
        return torch.cuda.get_device_capability(weight.device) >= (9, 0)
will trigger the manual fallback on torch 2.9 + A100, which is slower than
Thanks for the review @IlyasMoutawwakil @vasqu. Made it more explicit now.
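The guard discussed in this thread can be sketched as a small pure-Python function. This is only an illustration, not the code in the PR: the names `min_capability_for_grouped_mm` and `can_use_grouped_mm` are hypothetical, the 2.10 and 2.8 thresholds come from the PR description, and the torch 2.9 threshold is an assumption based on the comment above about torch 2.9 + A100. Treating versions below 2.8 as unsupported is also an assumption.

```python
from typing import Optional, Tuple

Version = Tuple[int, int]      # (major, minor) of the installed torch
Capability = Tuple[int, int]   # (major, minor) CUDA compute capability


def min_capability_for_grouped_mm(torch_version: Version) -> Optional[Capability]:
    """Minimum compute capability for grouped_mm, or None if assumed unavailable."""
    if torch_version >= (2, 10):
        return (8, 0)  # stated in the PR description for torch>=2.10.0
    if torch_version == (2, 9):
        return (8, 0)  # ASSUMPTION: inferred from the torch 2.9 + A100 remark
    if torch_version == (2, 8):
        return (9, 0)  # stated in the PR description for torch==2.8.0
    return None        # ASSUMPTION: older versions are treated as unsupported


def can_use_grouped_mm(torch_version: Version, device_capability: Capability) -> bool:
    """True if the grouped_mm path may be used, False if a fallback is needed."""
    required = min_capability_for_grouped_mm(torch_version)
    # Tuples compare lexicographically, so (9, 0) >= (8, 0) is True.
    return required is not None and device_capability >= required


print(can_use_grouped_mm((2, 8), (8, 0)))   # A100 on torch 2.8 -> False
print(can_use_grouped_mm((2, 10), (8, 0)))  # A100 on torch 2.10 -> True
```

In real code the capability tuple would come from `torch.cuda.get_device_capability(weight.device)`, as in the snippet quoted above. Keeping the decision in a function that takes plain tuples makes it testable without a GPU.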
vasqu left a comment:
LGTM, but let's wait on @IlyasMoutawwakil to confirm that it's what he meant
    # issue: https://github.com/pytorch/pytorch/issues/172440
    return False

    if weight.device.type == "cuda":
I think we should add a small comment here for clarification
What does this PR do?
For torch>=2.10.0, the minimum CUDA compute capability requirement for torch.nn.functional.grouped_mm is 8.0.
For torch==2.8.0, the minimum CUDA compute capability requirement for torch._grouped_mm() is 9.0.
Code Agent Policy
The Transformers repo is currently being overwhelmed by a large number of PRs and issue comments written by
code agents. We are currently bottlenecked by our ability to review and respond to them. As a result,
we ask that new users do not submit pure code agent PRs at this time.
You may use code agents in drafting or to help you diagnose issues. We'd also ask autonomous "OpenClaw"-like agents
not to open any PRs or issues for the moment.
PRs that appear to be fully agent-written will probably be closed without review, and we may block users who do this
repeatedly or maliciously.
This is a rapidly-evolving situation that's causing significant shockwaves in the open-source community. As a result,
this policy is likely to be updated regularly in the near future. For more information, please read
CONTRIBUTING.md.
Who can review?
@vasqu