
Optimization: Fuse filtered aggregates into parent GROUP BY #479

Merged
snaumenko-st merged 3 commits into master-servicetitan from aggregate-fuse on Apr 23, 2026

@snaumenko-st

Eliminates a common "GroupBy N+1" pattern in the LINQ translator where filtered aggregates on a grouping parameter were emitted as per-group correlated subqueries.

What changes

The translator now rewrites filtered aggregates over a grouping so they fuse into the parent AggregateProvider and emit as a single SELECT ... GROUP BY:

LINQ shape                   Rewrite
g.Count(p)                   g.Sum(x => p(x) ? 1 : 0)
g.Where(p).Count()           same as above (Where chain peeled first)
g.Where(p).Sum(s)            g.Sum(x => p(x) ? s(x) : 0)
g.Where(p).Min/Max/Avg(s)    g.Min/Max/Avg(x => p(x) ? s(x) : null)

Non-nullable Sum uses 0 in the ELSE branch to preserve LINQ's empty-set=0 contract; Min/Max/Avg and nullable Sum use NULL so SQL's NULL-skipping semantics match LINQ.
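The ELSE-branch choice can be checked against LINQ to Objects, which follows the same empty-set rules. The snippet below is only an illustrative sketch: the `orders` data and its property names are hypothetical, and it runs in memory rather than through the translator. Group "b" has no matching rows, so it exercises both the 0 case (non-nullable Sum) and the NULL case (nullable Min).

```csharp
using System;
using System.Linq;

// Hypothetical in-memory rows; the real change operates on the SQL translator.
var orders = new[]
{
    new { Customer = "a", Amount = 10, Paid = true  },
    new { Customer = "a", Amount = 5,  Paid = false },
    new { Customer = "b", Amount = 7,  Paid = false }, // no paid rows in group "b"
};

// Original shape: filter the grouping first, then aggregate.
var filtered = orders.GroupBy(o => o.Customer)
    .Select(g => new
    {
        g.Key,
        Sum = g.Where(o => o.Paid).Sum(o => o.Amount),
        Min = g.Where(o => o.Paid).Min(o => (int?)o.Amount),
    })
    .ToList();

// Fused shape: the predicate moves into the selector as a conditional.
var fused = orders.GroupBy(o => o.Customer)
    .Select(g => new
    {
        g.Key,
        Sum = g.Sum(o => o.Paid ? o.Amount : 0),          // ELSE 0 keeps the empty-set = 0 result
        Min = g.Min(o => o.Paid ? o.Amount : (int?)null), // ELSE null is skipped by Min
    })
    .ToList();

// Anonymous types compare structurally, so this checks every group and column.
bool sameResults = filtered.SequenceEqual(fused);
Console.WriteLine(sameResults);
```

Both shapes give Sum = 0 and Min = null for group "b", which is the behaviour the zero-vs-NULL regression tests are described as guarding.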

Secondary cleanups

  • Peels chained Wheres in front of a non-fusable aggregate into a single FilterProvider (one WHERE f1 AND ... AND fn instead of stacked filters).
  • Guards the peel against the indexed Where((x, i) => ...) overload, whose index binding would silently change if AND-combined.
  • Factors the logic into PeelWhereChain / CombinePredicates / IsFusableGroupingSource / BuildFusedAggregateSelector helpers shared by Count and Sum/Min/Max/Avg paths.
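The reason for the indexed-Where guard can be reproduced in LINQ to Objects. This is a small sketch with made-up data, not the translator's code: when the predicates are stacked, the index counts positions in the already-filtered sequence, whereas a naive AND-combination makes it count positions in the original sequence.

```csharp
using System;
using System.Linq;

var xs = new[] { 1, 2, 3, 4, 5, 6 };

// Stacked: i indexes the filtered sequence (4, 5, 6), so positions 0 and 2 survive.
var stacked = xs.Where(x => x > 3).Where((x, i) => i % 2 == 0).ToArray();

// Naively AND-combined: i now indexes the original sequence, so only 5 (index 4) survives.
var combined = xs.Where((x, i) => x > 3 && i % 2 == 0).ToArray();

Console.WriteLine(string.Join(",", stacked));   // 4,6
Console.WriteLine(string.Join(",", combined));  // 5
```

Because the two forms return different rows, a Where chain that contains the indexed overload cannot be collapsed into one predicate without changing results.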

Tests

New e2e fixture AggregateFusionTest (17 tests) covers: Count/Where-Count fusion, Sum/Min/Max/Avg fusion, empty-group zero-vs-NULL regressions, root-level Where-chain collapse, and the indexed-Where guard (gated on ProviderFeatures.RowNumber). Broader LINQ suites (GroupBy / Aggregate / Subquery / Where / Indexed) showed no regressions locally.

Count(predicate) evaluated on a grouping parameter produced a per-group
correlated subquery because the Where introduced by the predicate wrapped
the grouping's AggregateProvider with a FilterProvider that
ChooseSourceForAggregate cannot fuse. For queries like

  GroupBy(k).Select(g => new { g.Key, A = g.Count(p1), B = g.Count(p2) })

this yielded one extra SELECT per aggregate per group (GroupBy N+1).

In VisitAggregateSource, when the source is a grouping-parameter bound
to an AggregateProvider, rewrite Count(predicate) into
Sum(predicate ? 1 : 0). The conditional becomes a CalculateProvider,
which ChooseSourceForAggregate does fuse, collapsing all aggregates
into a single SELECT ... GROUP BY. aggregateType is passed by ref so
the caller sees the Count -> Sum switch.
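The Count -> Sum rewrite relies on the two forms being interchangeable per group. A minimal in-memory sketch of that equivalence follows; the `rows` data is hypothetical and the SQL in the comment is only an assumed illustration of the fused shape, not captured translator output.

```csharp
using System;
using System.Linq;

// Hypothetical rows; team "y" has no row matching either predicate.
var rows = new[]
{
    new { Team = "x", Open = true,  Urgent = false },
    new { Team = "x", Open = true,  Urgent = true  },
    new { Team = "y", Open = false, Urgent = false },
};

// Shape written by the user: one filtered Count per predicate.
var viaCount = rows.GroupBy(r => r.Team)
    .Select(g => new { g.Key, A = g.Count(r => r.Open), B = g.Count(r => r.Urgent) })
    .ToList();

// Shape after the rewrite: Sum over a 1/0 conditional.
// Assumed SQL shape, for illustration only:
//   SELECT Team, SUM(CASE WHEN Open THEN 1 ELSE 0 END), SUM(CASE WHEN Urgent THEN 1 ELSE 0 END)
//   FROM ... GROUP BY Team
var viaSum = rows.GroupBy(r => r.Team)
    .Select(g => new { g.Key, A = g.Sum(r => r.Open ? 1 : 0), B = g.Sum(r => r.Urgent ? 1 : 0) })
    .ToList();

bool countEqualsSum = viaCount.SequenceEqual(viaSum);
Console.WriteLine(countEqualsSum);
```

Team "y" yields 0 for both columns in either form, which matches the requirement that a group with zero matching rows still materializes 0.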

Parameterless Count(), Count(predicate) outside a fusable grouping,
and explicit Sum(predicate ? 1 : 0) keep their existing codepaths.

Add Orm.Tests.Linq.Optimization fixture with shared test base, small
model, and end-to-end tests covering fusion shape, multi-aggregate
fusion, predicate composition, the already-working Sum(CASE) form,
root-level Count preservation, and regression guards that a group with
zero matching rows still materializes 0 (not NULL) for both Count and
LongCount.

Made-with: Cursor

g.Where(p).Count() on a grouping parameter is semantically equivalent to
g.Count(p), but the translator saw the outer Count as parameterless
and the inner Where as just another provider, so the fusion rewrite
introduced for g.Count(p) never fired and the query still produced a
per-group correlated subquery.

In VisitAggregateSource, before the Count -> Sum(CASE) rewrite, peel
Queryable.Where / Enumerable.Where calls off the source and combine
their predicates with AndAlso (rebasing each peeled lambda onto the
outer predicate's parameter via ExpressionReplacer). Chained Wheres
collapse into a single AndAlso-combined predicate, so
g.Where(p1).Where(p2).Count() fuses the same way.
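The predicate-combining step can be sketched with standard expression-tree APIs. This is not the PR's CombinePredicates or ExpressionReplacer: `PredicateSketch`, `Combine` and `ParameterSwap` are hypothetical names, and placing the peeled predicate on the left of the AndAlso is an assumption made here so that it is evaluated first, as the inner Where would be.

```csharp
using System;
using System.Linq.Expressions;

Expression<Func<int, bool>> peeledWhere = a => a > 2;     // from g.Where(a => ...)
Expression<Func<int, bool>> countPredicate = b => b % 2 == 0; // from .Count(b => ...)

var combinedPredicate = PredicateSketch.Combine(countPredicate, peeledWhere);
Func<int, bool> compiled = combinedPredicate.Compile();
Console.WriteLine(compiled(4)); // 4 > 2 and even

static class PredicateSketch
{
    // Rebases the peeled lambda's body onto the outer lambda's parameter,
    // then joins the two bodies with AndAlso under that single parameter.
    public static Expression<Func<T, bool>> Combine<T>(
        Expression<Func<T, bool>> outer, Expression<Func<T, bool>> peeled)
    {
        var rebased = new ParameterSwap(peeled.Parameters[0], outer.Parameters[0])
            .Visit(peeled.Body);
        return Expression.Lambda<Func<T, bool>>(
            Expression.AndAlso(rebased, outer.Body), outer.Parameters);
    }

    // Minimal stand-in for a parameter replacer.
    private sealed class ParameterSwap : ExpressionVisitor
    {
        private readonly ParameterExpression _source;
        private readonly ParameterExpression _target;

        public ParameterSwap(ParameterExpression source, ParameterExpression target)
        {
            _source = source;
            _target = target;
        }

        protected override Expression VisitParameter(ParameterExpression node) =>
            node == _source ? _target : node;
    }
}
```

Without the rebasing step the combined body would still reference the peeled lambda's own parameter, which is not in scope in the new lambda, so compilation would fail.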

Add e2e tests covering the single Where, chained Wheres, and
LongCount variants. Each new test asserts the fused shape by
requiring zero occurrences of both "(SELECT COUNT" and "(SELECT SUM"
in the emitted SQL, catching the subtler case where the rewrite
applies but the aggregate still lands in a correlated subselect.
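A shape check of this kind could look roughly like the sketch below. The helper names and both SQL strings are hypothetical stand-ins; the actual tests and the translator's real output are not shown in this PR text.

```csharp
using System;

// Hypothetical SQL text standing in for what the translator might emit.
string fusedSql = "SELECT k, SUM(CASE WHEN p THEN 1 ELSE 0 END) FROM t GROUP BY k";
string unfusedSql = "SELECT k, (SELECT COUNT(*) FROM t i WHERE i.k = o.k AND p) FROM t o GROUP BY k";

// Counts non-overlapping, case-insensitive occurrences of fragment in text.
int CountOccurrences(string text, string fragment)
{
    int count = 0, index = 0;
    while ((index = text.IndexOf(fragment, index, StringComparison.OrdinalIgnoreCase)) >= 0)
    {
        count++;
        index += fragment.Length;
    }
    return count;
}

// Fused means neither aggregate appears inside a parenthesised subselect.
bool IsFusedShape(string sql) =>
    CountOccurrences(sql, "(SELECT COUNT") == 0 &&
    CountOccurrences(sql, "(SELECT SUM") == 0;

Console.WriteLine(IsFusedShape(fusedSql));   // True
Console.WriteLine(IsFusedShape(unfusedSql)); // False
```

Checking for "(SELECT SUM" as well as "(SELECT COUNT" is what catches the case where the Count -> Sum rewrite fired but the aggregate still ended up in a correlated subselect.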

Made-with: Cursor

Extends the existing Count(predicate) GroupBy-fusion rewrite to Sum, Min,
Max and Average by pulling a peeled Where chain (plus the call's own
predicate where applicable) into the aggregate selector as a CASE:

  g.Where(p).Sum(s)        -> g.Sum(x => p(x) ? s(x) : 0)
  g.Where(p).Min/Max/Avg(s)-> g.Min/Max/Avg(x => p(x) ? s(x) : null)

Non-nullable Sum uses 0 in the ELSE branch to preserve LINQ's empty-set=0
contract; Min/Max/Avg and nullable Sum use NULL so SQL's NULL-skipping
semantics match LINQ.

Refactors VisitAggregateSource to share the Where-peel, fusability check
and fused-selector construction between Count and Sum/Min/Max/Avg via
PeelWhereChain, CombinePredicates, IsFusableGroupingSource and
BuildFusedAggregateSelector helpers.

Guards PeelWhereChain against the indexed Where overload
(Func<T,int,bool>) - its position binding would silently change if AND-
combined with another predicate - and extends the non-fusable Where
collapse to Sum/Min/Max/Avg so root-level q.Where(f1).Where(f2).Sum(...)
materialises one FilterProvider instead of a stack, matching the
non-fusable Count path.

Adds end-to-end fusion tests for each new operator, a zero-matching-rows
regression for Sum, a guard test for indexed Where, and a root-level
Where-chain collapse test for Sum.

Made-with: Cursor

@snaumenko-st snaumenko-st requested a review from hzargaryan April 23, 2026 18:37
@snaumenko-st snaumenko-st merged commit ab7a5d0 into master-servicetitan Apr 23, 2026
88 checks passed
@snaumenko-st snaumenko-st deleted the aggregate-fuse branch April 23, 2026 21:29