Skip to content

skip 2 invalid test cases for pi0 model#45011

Merged
ydshieh merged 4 commits intohuggingface:mainfrom
kaixuanliu:pi0-tests
Apr 10, 2026
Merged

skip 2 invalid test cases for pi0 model#45011
ydshieh merged 4 commits intohuggingface:mainfrom
kaixuanliu:pi0-tests

Conversation

@kaixuanliu
Copy link
Copy Markdown
Contributor

@kaixuanliu kaixuanliu commented Mar 26, 2026

@ydshieh Hi, can you help review? Thx!

Current error (before this PR):

_ PI0ForConditionalGenerationModelTest.test_flash_attn_2_inference_equivalence _

self = <tests.models.pi0.test_modeling_pi0.PI0ForConditionalGenerationModelTest testMethod=test_flash_attn_2_inference_equivalence>

    @require_flash_attn
    @require_torch_accelerator
    @mark.flash_attn_test
    @slow
    @is_flaky()
    def test_flash_attn_2_inference_equivalence(self):
>       self.flash_attn_inference_equivalence(attn_implementation="flash_attention_2", padding_side="left")

tests/test_modeling_common.py:3426: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/test_modeling_common.py:3348: in flash_attn_inference_equivalence
    outputs = model(**first_inputs)
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1773: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1784: in _call_impl
    return forward_call(*args, **kwargs)
src/transformers/utils/generic.py:876: in wrapper
    output = func(self, *args, **kwargs)
src/transformers/models/pi0/modeling_pi0.py:300: in forward
    outputs = self.model(
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1773: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1784: in _call_impl
    return forward_call(*args, **kwargs)
src/transformers/utils/generic.py:876: in wrapper
    output = func(self, *args, **kwargs)
src/transformers/models/pi0/modeling_pi0.py:177: in forward
    inputs_embeds = self.embed_prefix(input_ids, pixel_values, pixel_attention_mask)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = PI0Model(
  (dit): GemmaModel(
    (embed_tokens): GemmaTextScaledWordEmbedding(128, 32, padding_idx=0)
    (layers): ...)
        )
      )
      (norm): GemmaRMSNorm((16,), eps=1e-06)
      (rotary_emb): GemmaRotaryEmbedding()
    )
  )
)
input_ids = tensor([[127, 127, 127, 127, 127, 127, 127, 127, 105,  39,  96,  66,   5]],
       device='cuda:0')
pixel_values = tensor([[[[[0.6680, 0.4375, 0.6758, 0.8281, 0.4551, 0.5586, 0.7383, 0.1289],
           [0.5039, 0.5781, 0.1074, 0.281...    [0.0013, 0.1455, 0.2227, 0.1113, 0.2793, 1.0000, 0.5508, 0.7812]]]]],
       device='cuda:0', dtype=torch.bfloat16)
pixel_attention_mask = None, attention_mask = None

    def embed_prefix(self, input_ids, pixel_values, pixel_attention_mask, attention_mask=None):
>       max_num_cameras = pixel_attention_mask.shape[1]
E       AttributeError: 'NoneType' object has no attribute 'shape'

src/transformers/models/pi0/modeling_pi0.py:128: AttributeError
----------------------------- Captured stderr call -----------------------------

Writing model shards:   0%|          | 0/1 [00:00<?, ?it/s]
Writing model shards: 100%|██████████| 1/1 [00:00<00:00, 461.17it/s]

Loading weights:   0%|          | 0/73 [00:00<?, ?it/s]
Loading weights: 100%|██████████| 73/73 [00:00<00:00, 5831.86it/s]
Test failed with 'NoneType' object has no attribute 'shape' at try 1/5.

Writing model shards:   0%|          | 0/1 [00:00<?, ?it/s]
Writing model shards: 100%|██████████| 1/1 [00:00<00:00, 481.44it/s]

Loading weights:   0%|          | 0/73 [00:00<?, ?it/s]
Loading weights: 100%|██████████| 73/73 [00:00<00:00, 5590.26it/s]
Test failed with 'NoneType' object has no attribute 'shape' at try 2/5.

Writing model shards:   0%|          | 0/1 [00:00<?, ?it/s]
Writing model shards: 100%|██████████| 1/1 [00:00<00:00, 494.49it/s]

Loading weights:   0%|          | 0/73 [00:00<?, ?it/s]
Loading weights: 100%|██████████| 73/73 [00:00<00:00, 5831.41it/s]
Test failed with 'NoneType' object has no attribute 'shape' at try 3/5.

Writing model shards:   0%|          | 0/1 [00:00<?, ?it/s]
Writing model shards: 100%|██████████| 1/1 [00:00<00:00, 502.07it/s]

Loading weights:   0%|          | 0/73 [00:00<?, ?it/s]
Loading weights: 100%|██████████| 73/73 [00:00<00:00, 5600.28it/s]
Test failed with 'NoneType' object has no attribute 'shape' at try 4/5.

Writing model shards:   0%|          | 0/1 [00:00<?, ?it/s]
Writing model shards: 100%|██████████| 1/1 [00:00<00:00, 502.67it/s]

Loading weights:   0%|          | 0/73 [00:00<?, ?it/s]
Loading weights: 100%|██████████| 73/73 [00:00<00:00, 5730.67it/s]
------------------------------ Captured log call -------------------------------
ERROR    transformers.testing_utils:testing_utils.py:2640 Test failed with 'NoneType' object has no attribute 'shape' at try 1/5.
ERROR    transformers.testing_utils:testing_utils.py:2640 Test failed with 'NoneType' object has no attribute 'shape' at try 2/5.
ERROR    transformers.testing_utils:testing_utils.py:2640 Test failed with 'NoneType' object has no attribute 'shape' at try 3/5.
ERROR    transformers.testing_utils:testing_utils.py:2640 Test failed with 'NoneType' object has no attribute 'shape' at try 4/5.
_ PI0ForConditionalGenerationModelTest.test_flash_attn_2_inference_equivalence_right_padding _

self = <tests.models.pi0.test_modeling_pi0.PI0ForConditionalGenerationModelTest testMethod=test_flash_attn_2_inference_equivalence_right_padding>

    @require_flash_attn
    @require_torch_accelerator
    @mark.flash_attn_test
    @slow
    @is_flaky()
    def test_flash_attn_2_inference_equivalence_right_padding(self):
>       self.flash_attn_inference_equivalence(attn_implementation="flash_attention_2", padding_side="right")

tests/test_modeling_common.py:3434: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tests/test_modeling_common.py:3348: in flash_attn_inference_equivalence
    outputs = model(**first_inputs)
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1773: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1784: in _call_impl
    return forward_call(*args, **kwargs)
src/transformers/utils/generic.py:876: in wrapper
    output = func(self, *args, **kwargs)
src/transformers/models/pi0/modeling_pi0.py:300: in forward
    outputs = self.model(
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1773: in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py:1784: in _call_impl
    return forward_call(*args, **kwargs)
src/transformers/utils/generic.py:876: in wrapper
    output = func(self, *args, **kwargs)
src/transformers/models/pi0/modeling_pi0.py:177: in forward
    inputs_embeds = self.embed_prefix(input_ids, pixel_values, pixel_attention_mask)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = PI0Model(
  (dit): GemmaModel(
    (embed_tokens): GemmaTextScaledWordEmbedding(128, 32, padding_idx=0)
    (layers): ...)
        )
      )
      (norm): GemmaRMSNorm((16,), eps=1e-06)
      (rotary_emb): GemmaRotaryEmbedding()
    )
  )
)
input_ids = tensor([[127, 127, 127, 127, 127, 127, 127, 127,  40,   5,  51,  81, 105]],
       device='cuda:0')
pixel_values = tensor([[[[[0.3086, 0.1680, 0.2061, 0.2676, 0.5469, 0.6992, 0.1865, 0.8711],
           [0.6875, 0.2734, 0.2734, 0.045...    [0.0479, 0.7617, 0.5703, 0.2578, 0.6523, 0.0449, 0.5156, 0.6992]]]]],
       device='cuda:0', dtype=torch.bfloat16)
pixel_attention_mask = None, attention_mask = None

    def embed_prefix(self, input_ids, pixel_values, pixel_attention_mask, attention_mask=None):
>       max_num_cameras = pixel_attention_mask.shape[1]
E       AttributeError: 'NoneType' object has no attribute 'shape'

src/transformers/models/pi0/modeling_pi0.py:128: AttributeError

Comment on lines +216 to +223
@unittest.skip("PI0 model requires pixel_attention_mask to be provided")
def test_flash_attn_2_inference_equivalence(self):
pass

@unittest.skip("PI0 model requires pixel_attention_mask to be provided")
def test_flash_attn_2_inference_equivalence_right_padding(self):
pass

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the fix.

Copy link
Copy Markdown
Collaborator

@ydshieh ydshieh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏 👍

@ydshieh
Copy link
Copy Markdown
Collaborator

ydshieh commented Apr 9, 2026

I will let @anton-l to have a look too before I merge.

Copy link
Copy Markdown
Contributor

@vasqu vasqu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, lgtm

Comment on lines +216 to +222
@unittest.skip("PI0 model requires pixel_attention_mask to be provided")
def test_flash_attn_2_inference_equivalence(self):
pass

@unittest.skip("PI0 model requires pixel_attention_mask to be provided")
def test_flash_attn_2_inference_equivalence_right_padding(self):
pass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I investigated a bit and even if we pass pixel_attention_mask in that test, it will fail due to some size mismatches as the vlm cache is reused apparently (and some model dimensions don't match there). So it will need some deeper investigation to fix properly

Sadly, that means all FA tests are affected. Can you skip them all, i.e. FA3 and FA4 as well

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Done.

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@github-actions
Copy link
Copy Markdown
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: pi0

pass

@unittest.skip("PI0 model requires pixel_attention_mask to be provided")
def test_flash_attn_3_inference_equivalence(self):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your effort. There are a lot of such tests ....

@ydshieh ydshieh merged commit f6ff4ed into huggingface:main Apr 10, 2026
17 checks passed
Comment on lines +276 to +278
@unittest.skip("PI0 model requires pixel_attention_mask to be provided")
def test_flash_attn_4_from_config(self):
pass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could have done something like in Dia instead:

def skip_non_greedy_generate(self):
skippable_tests = [
"test_sample_generate_dict_output", # return sequences > 1
"test_beam",
"test_contrastive",
"test_assisted",
"test_prompt_lookup",
"test_model_parallel_beam_search",
"test_generate_without_input_ids",
]
for test in skippable_tests:
if self._testMethodName.startswith(test):
self.skipTest(reason="Dia only supports greedy search / sampling with one sequence.")

Would make it easier to maintain and can be a bit more regex based as well

@kaixuanliu kaixuanliu deleted the pi0-tests branch April 10, 2026 10:41
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026
* skip 2 invalid test cases for pi0 model

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* skip all FA related test cases

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

---------

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants