Skip to content

[BugFix][Relax] Add structural_equal verification to subroutine cache lookup#18930

Open
3em0 wants to merge 2 commits intoapache:mainfrom
3em0:fix/subroutine-hash-collision-verification
Open

[BugFix][Relax] Add structural_equal verification to subroutine cache lookup#18930
3em0 wants to merge 2 commits intoapache:mainfrom
3em0:fix/subroutine-hash-collision-verification

Conversation

@3em0
Copy link

@3em0 3em0 commented Mar 25, 2026

Summary

  • SubroutineMixin._get_subroutine() used structural_hash as the sole cache key without structural_equal verification. If two different arg_sinfo values produced the same 64-bit hash (collision), the cache would return a previously compiled function with mismatched parameter shapes, leading to silently incorrect compiled output.
  • Changed the cache to store a list of (arg_sinfo, result) pairs per hash bucket and verify with structural_equal on lookup, consistent with the pattern in block_builder.cc.
  • Added a security advisory document and regression test.

Root Cause

The subroutine cache (cls._gvar) was keyed by (structural_hash(arg_sinfo), is_dataflow). A hash match was treated as proof of structural equality, skipping the necessary structural_equal check. This is a hash-only lookup anti-pattern — hash determines the bucket, but equality must confirm the match.

For comparison, block_builder.cc correctly uses StructuralHash + StructuralEqual together as the hash and equality functions for std::unordered_map.

Test plan

  • Existing test test_linear passes (no regression)
  • New test test_different_shapes_produce_distinct_subroutines passes — verifies that the same Module class with different input shapes generates distinct subroutines

🤖 Generated with Claude Code

… lookup

SubroutineMixin._get_subroutine() used structural_hash as the sole
cache key without structural_equal verification. If two different
arg_sinfo values produced the same 64-bit hash, the cache would
return a previously compiled function with mismatched parameter
shapes, leading to silently incorrect compiled output.

The fix changes the cache to store a list of (arg_sinfo, result)
pairs per hash bucket and verifies each candidate with
structural_equal before returning. This follows the same pattern
used in block_builder.cc (StructuralHash + StructuralEqual).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical correctness defect in the Relax frontend's subroutine caching mechanism. Previously, the cache relied solely on structural_hash for lookups, which could lead to silent errors if hash collisions occurred between different input shapes. The changes implement a robust cache lookup strategy by incorporating structural_equal verification, ensuring that cached subroutines are always structurally identical to the requested ones. This significantly improves the reliability of compiled output and includes a new security advisory and a regression test to validate the fix.

Highlights

  • Cache Correction: Modified the SubroutineMixin._get_subroutine() cache to use structural_equal verification alongside structural_hash to prevent incorrect cache hits due to hash collisions.
  • Cache Structure Update: Updated the cache to store a list of (arg_sinfo, result) pairs per hash bucket, ensuring proper equality checks.
  • Security Advisory: Added a new security advisory document detailing the hash collision vulnerability and its implications.
  • Regression Test: Introduced a new test case to confirm that distinct subroutines are generated for different input shapes, preventing the reintroduction of the bug.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a security advisory concerning a subroutine cache hash collision in SubroutineMixin._get_subroutine(). The fix modifies the caching mechanism to use structural_equal verification after an initial hash lookup, preventing incorrect compiled output when different input shapes produce the same hash. A new test case has been added to validate this behavior. The review comments suggest minor readability improvements, such as using a temporary variable for a tuple and refactoring a loop into a list comprehension.

- Extract result tuple into variable and break down setdefault/append
- Use list comprehension for subroutine collection in test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant