[attention backends] remove non-hub attention backends. #13171

Open
sayakpaul wants to merge 1 commit into main from remove-non-hub-attn-backends
@sayakpaul
Member

What does this PR do?

Now that the Hub-based attention backends (Flash, Flash 3, SAGE) are feature-complete (torch.compile, CP, etc.), we can confidently remove the non-Hub variants.

This has several advantages:

  • More Hub-centric
  • Eliminates complex installation nightmares
  • Lower maintenance burden (otherwise we would have to maintain two code paths for every feature of every attention backend)
  • Helps us improve kernels as well
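
For callers that still pass a non-Hub backend name, migration could look roughly like the sketch below. The helper `to_hub_backend` and the name mapping are hypothetical, written only to illustrate the idea. The identifiers your diffusers version accepts may differ, so check the attention backend docs before relying on them.

```python
# Hypothetical mapping from removed local backends to their Hub-based
# counterparts. These names are assumptions for illustration only.
HUB_EQUIVALENTS = {
    "flash": "flash_hub",
    "_flash_3": "_flash_3_hub",
    "sage": "sage_hub",
}


def to_hub_backend(name: str) -> str:
    """Return the Hub-based backend name for `name` (hypothetical helper)."""
    if name.endswith("_hub"):
        return name  # already a Hub backend, nothing to translate
    # Backends without a Hub counterpart in the table are returned unchanged.
    return HUB_EQUIVALENTS.get(name, name)
```

The result could then be passed to something like `model.set_attention_backend(to_hub_backend("flash"))`, assuming the model exposes that method.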

Cc: @danieldk @LysandreJik

@sayakpaul
Member Author

I will run the tests in test_attention_backends.py as well.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
