fix(testing): Fix Parakeet, Evolla, Pi0, and Phi-3 test failures on main CI (#45004)
Conversation
Overall looks good, with one comment!
Force-pushed 22fd2b1 to bac66bf
P.S. @Rocketknight1 thanks for all the reviews over the past months! I've absolutely loved working on Transformers. Please do let me know if you're open to connecting outside of GH (no stress if not!). Looking forward to future PRs and hoping that the model I've been heads-down on gets its final core review soon lol :)
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi @harshaljanjani, thank you. I will also take a look. BTW, it would be nice to give the (full) test names that fail, like
ydshieh left a comment
Also good from my side, thanks again!
I will push some updates to the expected output values, so we have more tests fixed.
run-slow: phi3
This comment contains models: ["models/phi3"]
Good day @ydshieh, thanks for your time! I'll keep that in mind for future PRs. I usually attach screenshots from local runs, but that makes sense; I'll include the text along with them in forthcoming PRs :)
It's probably not flaky; we just have different environments (hardware etc.) :-)
CI Results / Commit Info
Model CI Report: ❌ 1 new failed test from this PR 😭
[For maintainers] Suggested jobs to run (before merge): run-slow: phi3
Only one, but it's already failing on
…ain CI (huggingface#45004)

* fix: Guard sdpa flash test and fix phi3/pi0 tests
* fix: Narrow scope by adding it to the skip list
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
What does this PR do?
The following failing tests were identified and fixed in this PR (grouped together since they share related root causes, or the code changes were minimal enough not to warrant separate PRs):
→ Phi-3: I made a similar fix for LongCat-Flash in another PR; it's essentially the same pattern. The PR "[V5] Return a BatchEncoding dict from apply_chat_template by default" again changed `apply_chat_template` to return a `BatchEncoding` dict instead of a tensor. The test was passing this dict directly to `model.generate` and then accessing `.shape`; this fixes that.

→ Pi0 / Parakeet / Evolla: `test_sdpa_can_dispatch_on_flash` forces only the Flash kernel, which rejects any non-null attention mask. Pi0 wraps PaliGemma, which creates a causal mask mapping even when `attention_mask=None` (PaliGemma is already skipped for this very reason, so Pi0 should follow suit); Parakeet always passes a relative-position bias as `attention_mask`, so the mask is never `None` even when the test removes it; and Evolla's protein encoder generates an attention mask internally when none is provided, which then reaches SDPA as a non-null mask. Added the missing three to the skip list.

cc: @Rocketknight1
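To make the Phi-3 failure mode concrete, here is a minimal, dependency-free sketch. The `FakeTensor` class and the `apply_chat_template` stub below are stand-ins I'm introducing for illustration, not the real Transformers API: a dict return value has no `.shape`, so the old test code raises, while indexing `input_ids` first works.

```python
# Sketch of the Phi-3 failure: apply_chat_template now returns a
# BatchEncoding-style dict, not a tensor, so `inputs.shape` breaks.

class FakeTensor:
    """Stand-in for a tensor of token ids (hypothetical helper)."""
    def __init__(self, ids):
        self.ids = ids

    @property
    def shape(self):
        return (1, len(self.ids))

def apply_chat_template(messages):
    # Post-V5 behaviour per the PR description: a dict, not a tensor.
    return {"input_ids": FakeTensor([1, 2, 3, 4])}

inputs = apply_chat_template([{"role": "user", "content": "hi"}])

# Old test code accessed .shape directly -> AttributeError on a dict.
try:
    _ = inputs.shape
    broke = False
except AttributeError:
    broke = True

# Fixed test code: unpack the dict, then read the sequence length.
seq_len = inputs["input_ids"].shape[1]
print(broke, seq_len)  # True 4
```

In the real test the same unpacking applies before `model.generate(**inputs, ...)` and before slicing off the prompt tokens.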
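The skip-list change for Pi0 / Parakeet / Evolla can be illustrated with plain `unittest`. This is a hedged sketch: the actual Transformers suite uses its own skip mechanism and class names, so `Pi0ModelTest` here is hypothetical.

```python
import unittest

class Pi0ModelTest(unittest.TestCase):
    # Pi0 wraps PaliGemma, which builds a causal mask even when
    # attention_mask=None, and flash-only dispatch rejects any
    # non-null mask, so the test is skipped rather than failing.
    @unittest.skip("Pi0 always produces a non-null attention mask")
    def test_sdpa_can_dispatch_on_flash(self):
        raise AssertionError("should never run")

# Run the suite and confirm the test is skipped, not failed.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(Pi0ModelTest)
result = unittest.TestResult()
suite.run(result)
print(len(result.skipped), len(result.failures))  # 1 0
```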
CI Failures
Before the fixes (feel free to cross-check; these errors are reproducible):
After the fixes (feel free to cross-check):
Before submitting

* Did you read the contributor guideline, Pull Request section?
* Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
* Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.