perf: Optimize split_part for scalar args#21238

Open
neilconway wants to merge 3 commits into apache:main from neilconway:neilc/optimize-split-part-scalar

neilconway (Contributor) commented Mar 29, 2026

Which issue does this PR close?

Rationale for this change

In practice, split_part(string, delimiter, position) is usually invoked with constant values for delimiter and position. We can take advantage of that to hoist some conditional branches out of the per-row hot loop; more importantly, we can switch from using str::split to building one memchr::memmem::Finder and using it for each row. Building a Finder is relatively expensive but it's a clear win when we can amortize that one-time cost over an entire input batch.
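The branch-hoisting half of this idea can be illustrated with a minimal, dependency-free sketch. This is not the code in the PR: `split_part_batch`, `nth_from_start`, and `nth_from_end` are hypothetical names, the input is a plain slice of `&str` rather than an Arrow array, and `str::find` / `str::rfind` stand in for the reusable `memchr::memmem::Finder`, which would be built once before the loop. The sketch also assumes a non-empty delimiter and ignores self-overlapping delimiters such as `"aa"`, where scanning from the right can pick different boundaries than a left-to-right split.

```rust
/// Hypothetical sketch: apply split_part to every row of a batch when the
/// delimiter and position are constants. Checks that depend only on the
/// scalar arguments run once, before the per-row loop.
fn split_part_batch<'a>(
    rows: &[&'a str],
    delim: &str,
    position: i64,
) -> Result<Vec<&'a str>, String> {
    // Scalar-only validation, done once per batch instead of once per row.
    if position == 0 {
        return Err("position must not be zero".to_string());
    }
    if delim.is_empty() {
        // Assumption of this sketch only; real semantics are not modeled here.
        return Err("this sketch does not handle an empty delimiter".to_string());
    }

    // Number of delimiters to skip before reaching the requested field.
    let skip = position.unsigned_abs() - 1;

    // The direction is decided once; each branch runs a tight per-row loop.
    let out = if position > 0 {
        rows.iter().map(|&s| nth_from_start(s, delim, skip)).collect()
    } else {
        rows.iter().map(|&s| nth_from_end(s, delim, skip)).collect()
    };
    Ok(out)
}

/// Field reached after skipping `skip` delimiters from the left, or "" if
/// the row has too few fields.
fn nth_from_start<'a>(s: &'a str, delim: &str, skip: u64) -> &'a str {
    let mut rest = s;
    for _ in 0..skip {
        match rest.find(delim) {
            Some(i) => rest = &rest[i + delim.len()..],
            None => return "",
        }
    }
    match rest.find(delim) {
        Some(i) => &rest[..i],
        None => rest,
    }
}

/// Field reached after skipping `skip` delimiters from the right, or "" if
/// the row has too few fields.
fn nth_from_end<'a>(s: &'a str, delim: &str, skip: u64) -> &'a str {
    let mut rest = s;
    for _ in 0..skip {
        match rest.rfind(delim) {
            Some(i) => rest = &rest[..i],
            None => return "",
        }
    }
    match rest.rfind(delim) {
        Some(i) => &rest[i + delim.len()..],
        None => rest,
    }
}
```

Calling `split_part_batch(&["a,b,c", "x,y"], ",", 2)` walks each row without re-checking the position's sign or validity per row. In the approach described above, the per-row `find` call is where a prebuilt `Finder` would be reused.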

Benchmarks (M4 Max):

  • scalar_utf8_single_char/pos_first: 105 µs → 41 µs, -61%
  • scalar_utf8_single_char/pos_middle: 358 µs → 97 µs, -73%
  • scalar_utf8_single_char/pos_negative: 110 µs → 46 µs, -58%
  • scalar_utf8_multi_char/pos_middle: 355 µs → 132 µs, -63%
  • scalar_utf8_long_strings/pos_middle: 1.97 ms → 1.11 ms, -43%
  • scalar_utf8view_long_parts/pos_middle: 467 µs → 169 µs, -63%
  • array_utf8_single_char/pos_middle: 351 µs → 357 µs, no change
  • array_utf8_multi_char/pos_middle: 366 µs → 357 µs, -2.6%

What changes are included in this PR?

  • Add benchmarks for split_part with scalar delimiter and position
  • Add new fast-path for split_part with scalar delimiter and position
  • Add SLT tests for split_part with scalar delimiter and position

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) and functions (Changes to functions implementation) labels on Mar 29, 2026

Development

Successfully merging this pull request may close these issues.

Optimize split_part for scalar args

1 participant