KAFKA-20249: Optimize raw value extraction in headers-aware deserializers #21706

Open
zheguang wants to merge 4 commits into apache:trunk from zheguang:zheguang-KAFKA-20249

@zheguang

This patch implements two optimizations and adds JMH benchmarks for them.

  1. Skipping
    Previously, raw value extraction in headers-aware deserializers deserialized and/or copied the headers, even though the headers only need to be skipped. This happened for both empty and nonempty headers.

  2. Empty headers copying
    Empty headers have a constant metadata footprint: the headers size is varint-encoded as a single byte of 0, and the headers themselves consume no bytes. Based on this invariant, the ByteBuffer-based extraction can be replaced with a direct System.arraycopy, a native Java method that is optimized for specific platforms.
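
The two ideas can be illustrated with a minimal, self-contained sketch. It is not the code in this patch: the class name `RawValueSketch`, the method `rawValue`, and the byte layout `[varint headersSize][headers bytes][raw value]` are assumptions made for illustration, and the size is decoded here as a plain unsigned varint, which may differ from the encoding Kafka actually uses. In both encodings an empty headers section is the single byte 0.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the patch.
// Assumed layout: [varint headersSize][headers bytes][raw value]
final class RawValueSketch {

    static byte[] rawValue(byte[] data) {
        if (data == null || data.length == 0) {
            throw new IllegalArgumentException("expected at least the headers-size byte");
        }

        // Optimization 2: empty headers are exactly one size byte equal to 0,
        // so the raw value is everything after that byte.
        if (data[0] == 0) {
            byte[] value = new byte[data.length - 1];
            System.arraycopy(data, 1, value, 0, value.length);
            return value;
        }

        // Optimization 1: decode only the headers size, then skip the headers
        // without deserializing or copying them.
        int pos = 0;
        int headersSize = 0;
        int shift = 0;
        int b = 0;
        do {
            if (pos >= data.length || shift > 28) {
                throw new IllegalArgumentException("malformed headers size");
            }
            b = data[pos++];
            headersSize |= (b & 0x7F) << shift; // assumed unsigned varint
            shift += 7;
        } while ((b & 0x80) != 0);

        if (headersSize < 0 || headersSize > data.length - pos) {
            throw new IllegalArgumentException("headers size exceeds input length");
        }
        return Arrays.copyOfRange(data, pos + headersSize, data.length);
    }
}
```

For example, under the assumed layout, `rawValue(new byte[] {0, 10, 20})` takes the empty-headers path, while `rawValue(new byte[] {3, 1, 2, 3, 42, 43})` reads the size 3 and skips three header bytes before copying the rest.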

Benchmark:
This patch also includes JMH benchmarks to measure the speedup. On my local machine, Optimization 1 yields a 2-6x speedup and Optimization 2 yields about 1.35x.

Below is the throughput comparison of a recorded JMH run:

Benchmark                                                Mode  Cnt      Score     Error  Units
RawBytesExtraction.testRawAggregationWithHeaders        thrpt   15   1481.854 ±  31.448  ops/s
RawBytesExtraction.testRawAggregationWithHeadersOpt     thrpt   15  11797.165 ± 103.432  ops/s

RawBytesExtraction.testRawAggregationWithoutHeaders     thrpt   15   8359.080 ±  47.918  ops/s
RawBytesExtraction.testRawAggregationWithoutHeadersOpt  thrpt   15  15298.827 ± 452.741  ops/s

RawBytesExtraction.testRawValueWithoutHeaders           thrpt   15  11329.997 ± 260.399  ops/s
RawBytesExtraction.testRawValueWithoutHeadersOpt        thrpt   15  15372.816 ± 184.651  ops/s
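
As a quick way to read the table, the throughput ratio of each `Opt` benchmark to its baseline can be computed directly from the scores. The sketch below is only arithmetic on the numbers in this recorded run; the class name `SpeedupSketch` is hypothetical and the error margins are ignored.

```java
// Hypothetical helper: divides the optimized throughput by the baseline throughput.
final class SpeedupSketch {

    static double speedup(double baselineOpsPerSec, double optimizedOpsPerSec) {
        return optimizedOpsPerSec / baselineOpsPerSec;
    }

    public static void main(String[] args) {
        // Scores copied from the recorded JMH run (ops/s).
        System.out.printf("testRawAggregationWithHeaders:    %.2fx%n",
                speedup(1481.854, 11797.165));
        System.out.printf("testRawAggregationWithoutHeaders: %.2fx%n",
                speedup(8359.080, 15298.827));
        System.out.printf("testRawValueWithoutHeaders:       %.2fx%n",
                speedup(11329.997, 15372.816));
    }
}
```

For this particular run the ratios come out to roughly 7.96x, 1.83x, and 1.36x respectively; other runs on other machines will differ.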

Test:

  • Existing unit test covering nonempty headers
  • Added test case for empty headers

Benchmark                                                     Mode  Cnt      Score     Error  Units
RawBytesExtraction.testRawAggregationWithoutHeaders          thrpt   15   7850.891 ± 307.428  ops/s
RawBytesExtraction.testRawAggregationWithoutHeadersFastPath  thrpt   15  14957.556 ± 517.450  ops/s

Benchmark                                                     Mode  Cnt      Score     Error  Units
RawBytesExtraction.testRawAggregationWithHeaders             thrpt   15   1411.338 ± 110.527  ops/s
RawBytesExtraction.testRawAggregationWithHeadersFastPath     thrpt   15   6106.665 ± 218.032  ops/s
RawBytesExtraction.testRawAggregationWithoutHeaders          thrpt   15   7734.538 ± 525.487  ops/s
RawBytesExtraction.testRawAggregationWithoutHeadersFastPath  thrpt   15  14300.408 ± 212.519  ops/s

@github-actions bot added the triage PRs from the community, streams, and performance labels on Mar 11, 2026