
Conversation

@andygrove (Member) commented Jan 12, 2026

Which issue does this PR close?

Closes #.

Rationale for this change

This is needed for the following features:

  • Support complex types as partition keys in hash partitioning in native shuffle
  • Support round-robin partitioning in native shuffle

What changes are included in this PR?

How are these changes tested?

codecov-commenter commented Jan 12, 2026

Codecov Report

❌ Patch coverage is 75.00000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.50%. Comparing base (f09f8af) to head (078b204).
⚠️ Report is 845 commits behind head on main.

Files with missing lines                               | Patch % | Lines
...k/src/main/scala/org/apache/comet/serde/hash.scala | 75.00%  | 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3077      +/-   ##
============================================
+ Coverage     56.12%   59.50%   +3.37%     
- Complexity      976     1381     +405     
============================================
  Files           119      167      +48     
  Lines         11743    15560    +3817     
  Branches       2251     2586     +335     
============================================
+ Hits           6591     9259    +2668     
- Misses         4012     5002     +990     
- Partials       1140     1299     +159     

☔ View full report in Codecov by Sentry.

// Hash each element in sequence, chaining the hash values
for elem_idx in 0..len {
    let elem_array = values.slice(start + elem_idx, 1);
    let mut single_hash = [*hash];
Contributor:

why is single_hash an array?

Member Author (andygrove):

single_hash is an array because the recursive hash method interface expects a slice of hashes, and this lets us reuse that code rather than add another version of it.
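As a rough illustration of that reuse (a minimal sketch with hypothetical function names, not the actual Comet code): the shared routine takes a mutable slice with one hash per row, so wrapping the current chained hash in a one-element array lets the same routine hash a single sliced element.

use arrow::array::Array;

// Hypothetical stand-in for the slice-based hash routine: one hash seed per
// row, each updated in place (the real code applies murmur3 recursively).
fn hash_array(values: &dyn Array, hashes: &mut [u32]) {
    for (i, h) in hashes.iter_mut().enumerate() {
        // placeholder mixing, only to keep the sketch self-contained
        *h = h.wrapping_mul(31).wrapping_add(values.is_valid(i) as u32);
    }
}

// Hashing one element of a nested array: slice out a length-1 view and wrap
// the current chained hash in a one-element array so the same slice-based
// routine can be reused instead of adding a scalar variant.
fn hash_list_element(values: &dyn Array, start: usize, elem_idx: usize, hash: &mut u32) {
    let elem_array = values.slice(start + elem_idx, 1); // length-1 view of the child array
    let mut single_hash = [*hash];                      // [u32; 1] borrows as &mut [u32]
    hash_array(elem_array.as_ref(), &mut single_hash);
    *hash = single_hash[0];                             // carry the chained value forward
}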

/**
 * These tests verify that Comet's native implementation of murmur3 hash produces identical
 * results to Spark's implementation for all supported data types.
 */
class CometHashExpressionSuite extends CometTestBase with AdaptiveSparkPlanHelper {
Contributor:

Should we add this expression to one of the fuzz suites?

Member Author (andygrove):

Sure. Are you thinking more about the fuzz data generation aspect, or testing across different scans/shuffles, or both?

Contributor:

Mostly just interested in the fuzz data generation; it seems like a good schema to throw at this.

Member Author (andygrove):

yup, that already uncovered a bug 💪
