Specialize Top1Monoid for thread-local aggregation #17058
antiguru wants to merge 3 commits into MaterializeInc:main
Introduce a top-1 monoid that shares the column order among all peers and retains buffers for efficient row unpacking. This allows for allocation-free row comparisons where otherwise the row needs to be unpacked into a new vector. The implementation relies on reference-counted shared state, which makes it unsuitable for sharing across thread boundaries.
Signed-off-by: Moritz Hoffmann <mh@materialize.com>
… exchange for per-worker pre-aggregation
This PR extends #17056 with improvements for monotonic top-1 aggregation, namely reuse of the column order across top-1 monoids and per-worker pre-aggregation. However, in a local development environment, these improvements have not yet shown performance benefits, as documented in an internal spreadsheet.
The settings used for the evaluation are the same as those described in #17056 (comment). Here, the time and record factors are relative to main, and they show no real benefit from the added techniques in the setting evaluated. After discussion with @antiguru, we felt that the draft PR #17058 adds quite a bit of complexity for a relatively small (at this point) benefit. So we suggest leaving the top-1 improvement in #17058 in the icebox, perhaps until we can actually see some benefit from per-worker pre-aggregation for top-1, e.g., in distributed dataflow executions.
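For readers unfamiliar with the term, per-worker pre-aggregation for a monotonic top-1 can be sketched roughly as below. This is only an illustration under simplifying assumptions, not the code in this PR: records are hypothetical `(key, value)` pairs, "top-1" means the minimum value per key, and `pre_aggregate` and `merge` are made-up names standing in for the work done before and after the exchange.

```rust
use std::collections::HashMap;

/// Reduce one worker's local records to at most one record per key,
/// keeping the minimum value. (Hypothetical helper for illustration.)
fn pre_aggregate(local: &[(u32, i64)]) -> HashMap<u32, i64> {
    let mut best: HashMap<u32, i64> = HashMap::new();
    for &(key, value) in local {
        best.entry(key)
            .and_modify(|v| *v = (*v).min(value))
            .or_insert(value);
    }
    best
}

/// Combine the per-worker partial results, as the receiving side would
/// after the exchange. (Hypothetical helper for illustration.)
fn merge(partials: Vec<HashMap<u32, i64>>) -> HashMap<u32, i64> {
    let mut out: HashMap<u32, i64> = HashMap::new();
    for partial in partials {
        for (key, value) in partial {
            out.entry(key)
                .and_modify(|v| *v = (*v).min(value))
                .or_insert(value);
        }
    }
    out
}
```

Because taking the minimum is associative and commutative, merging the partial results gives the same answer as aggregating all records in one place, while each worker sends at most one record per key. Any benefit depends on how many records share a key on the same worker, which is consistent with the measurements above showing little gain in a local setting.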
Similarly to #17056, use a non-allocating monoid for all top-k aggregations.
This depended on a change to Differential (TimelyDataflow/differential-dataflow#375) before it could land; that change has since landed.
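The idea from the commit message (a shared column order plus retained unpacking buffers behind reference-counted state) can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual `Top1Monoid`: `PackedRow`, `Shared`, `Top1`, and `plus_equals` are hypothetical stand-ins, and rows are modeled as fixed-width `i64` columns rather than Materialize's real row encoding.

```rust
use std::cell::RefCell;
use std::cmp::Ordering;
use std::rc::Rc;

/// Hypothetical packed row: columns stored as little-endian i64 bytes.
#[derive(Clone, Debug, PartialEq, Eq)]
struct PackedRow(Vec<u8>);

impl PackedRow {
    fn pack(cols: &[i64]) -> Self {
        let mut bytes = Vec::with_capacity(cols.len() * 8);
        for c in cols {
            bytes.extend_from_slice(&c.to_le_bytes());
        }
        PackedRow(bytes)
    }

    /// Unpack into a caller-provided buffer instead of a new vector.
    fn unpack_into(&self, buf: &mut Vec<i64>) {
        buf.clear();
        for chunk in self.0.chunks_exact(8) {
            let mut a = [0u8; 8];
            a.copy_from_slice(chunk);
            buf.push(i64::from_le_bytes(a));
        }
    }
}

/// State shared by all monoid instances on one thread.
struct Shared {
    /// (column index, descending?) pairs defining the comparison order.
    order: Vec<(usize, bool)>,
    /// Reusable unpacking buffers; they keep their capacity across calls.
    left: RefCell<Vec<i64>>,
    right: RefCell<Vec<i64>>,
}

/// `Rc` makes this type `!Send`, so it cannot cross thread boundaries.
#[derive(Clone)]
struct Top1 {
    row: PackedRow,
    shared: Rc<Shared>,
}

impl Top1 {
    fn cmp_row(&self, other: &PackedRow) -> Ordering {
        let mut l = self.shared.left.borrow_mut();
        let mut r = self.shared.right.borrow_mut();
        self.row.unpack_into(&mut l);
        other.unpack_into(&mut r);
        for &(col, desc) in &self.shared.order {
            let ord = l[col].cmp(&r[col]);
            let ord = if desc { ord.reverse() } else { ord };
            if ord != Ordering::Equal {
                return ord;
            }
        }
        Ordering::Equal
    }

    /// Keep whichever row sorts first under the shared order.
    /// Only replacing the row clones; the comparison reuses the buffers.
    fn plus_equals(&mut self, other: &Top1) {
        if self.cmp_row(&other.row) == Ordering::Greater {
            self.row = other.row.clone();
        }
    }
}
```

Once the buffers have grown to the row width, a comparison in this sketch does not allocate. The trade-off is the one the commit message names: the `Rc` and `RefCell` state ties every instance to a single thread.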
Checklist
This PR has adequate test coverage / QA involvement has been duly considered.
This PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way) and therefore is tagged with a T-proto label.
If this PR will require changes to cloud orchestration, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
This PR includes the following user-facing behavior changes: