perf: change singleevents default order to occurreddate desc with supporting indices DHIS2-20991 #23093
What are our options? Is the cost of additional round trip to the db worth thinking about? Like prefetching the orgunit for SELECTED/CHILDREN cases?

Even if PG actually respected our I did a bit of digging and found the planner parameter
…-20991 Change default order from ev_id desc to occurreddate desc, eventid desc to match what clients always request. Introduce OrderJdbcClause utility for clean order clause building with tie-breaker direction matching the first user-specified order. Extend the V2_43_50 singleevent occurreddate index with eventid as trailing column.

Change the default single event order from `eventid desc` to `occurreddate desc, eventid desc` and introduce `OrderJdbcClause` for consistent order clause building across all tracker JDBC stores.

The previous default order (`eventid desc`) is the primary key, which is not an orderable field in the API -- clients could not explicitly request it. The Capture app and Android SDK always send `order=occurredAt:desc`, so every request paid for an explicit order that should have been the default. Making `occurreddate desc, eventid desc` the default means requests without an explicit `order` param get the same indexed scan. The `eventid` tie-breaker ensures deterministic pagination.

Default order SQL
Before (master):
After (PR):
Tie-breaker direction matching (`OrderJdbcClause`)

`OrderJdbcClause` is a shared utility used by all four tracker JDBC stores. When a user specifies an order like `occurredAt:asc`, the tie-breaker column (`eventid`) now matches that direction (`asc`). This enables a single forward or backward index scan on the composite index.

Before (`order=occurredAt:asc`):

After:
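The direction-matching rule can be sketched in a few lines of Java. This is a minimal illustration, not the `OrderJdbcClause` class from the PR: the class name `OrderClauseSketch`, the nested `Order` type, and the `build` method are hypothetical, and real column aliases, quoting, and multi-store handling are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of tie-breaker direction matching; not the PR's OrderJdbcClause. */
final class OrderClauseSketch {

    /** One requested order: a column name plus a direction. */
    static final class Order {
        final String column;
        final boolean ascending;

        Order(String column, boolean ascending) {
            this.column = column;
            this.ascending = ascending;
        }
    }

    private OrderClauseSketch() {}

    private static String dir(boolean ascending) {
        return ascending ? "asc" : "desc";
    }

    /**
     * Builds an ORDER BY clause. The tie-breaker column takes the direction of
     * the first requested order, so all directions agree when one order is given.
     * With no requested order, the default described in the PR is used.
     */
    static String build(List<Order> orders, String tieBreaker) {
        if (orders.isEmpty()) {
            return "order by occurreddate desc, " + tieBreaker + " desc";
        }
        List<String> parts = new ArrayList<>();
        for (Order order : orders) {
            parts.add(order.column + " " + dir(order.ascending));
        }
        // Match the first order's direction instead of hard-coding "desc".
        parts.add(tieBreaker + " " + dir(orders.get(0).ascending));
        return "order by " + String.join(", ", parts);
    }
}
```

Under these assumptions, a single `occurreddate` ascending order produces `order by occurreddate asc, eventid asc`, and an empty order list produces the new default `order by occurreddate desc, eventid desc`.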
PostgreSQL B-tree indices support scanning in both directions. An index on `(a, b)` can serve both `order by a asc, b asc` (forward scan) and `order by a desc, b desc` (backward scan). But mixed directions like `a asc, b desc` require a sort -- the index cannot deliver rows in that order.

Index change (V2_43_56)
Extends the V2_43_50 `occurreddate` index with `eventid` as trailing column to cover the new default order without a sort:

`singleevent` indices after this PR:

| Index | Columns |
| --- | --- |
| `singleevent_pkey` | `(eventid)` |
| `unique_singleevent_uid` | `(uid)` |
| `in_singleevent_programstageid_occurreddate` | `(programstageid, occurreddate, eventid)` (was `(programstageid, occurreddate)`) |
| `in_singleevent_programstageid_organisationunitid` | `(programstageid, organisationunitid)` |
| `in_singleevent_programstageid_assigneduserid` | `(programstageid, assigneduserid)` |

Test Data
Sierra Leone database with generated test data:

`singleevent`

Generated by inserting additional data into the Sierra Leone base database for the Malaria programme (`VBqh0ynB2wv`, single events).

Test Users

- `F_TRACKED_ENTITY_INSTANCE_SEARCH_IN_ALL_ORGUNITS` authority, COC sharing enforced
- `ALL` authority (superuser), COC sharing skipped

Performance
EXPLAIN ANALYZE execution time (ms), warmup=2. 10M single events in one programme stage.
Summary
Default order (the change in this PR) -- fast for all users. Admin/super have no org unit filter (10M events), restricted/district have a search scope filter (17K/1M events). The composite index delivers rows in `occurreddate, eventid` order without a sort:

Default order + date range -- improved. `occurredAfter=2024-07-01&occurredBefore=2024-12-31` narrows to ~8,800 events. The composite index covers both the range filter and the sort:

- `orgUnit=DiszpKrYNg8&orgUnitMode=SELECTED&occurredAfter=2024-07-01&occurredBefore=2024-12-31`

Non-indexed orders, broad scope -- pre-existing, unchanged. No index covers `createdAt`, `updatedAt`, data element, or `assignedUserDisplayName` ordering. PG must scan and sort all events in scope. Restricted user (17K events) is fine:

- `order=createdAt:desc`
- `order=updatedAt:desc`
- `order=qrur9Dvnyt5:asc` (data element)
- `orgUnitMode=ACCESSIBLE&order=updatedAt:desc`
- `orgUnitMode=ACCESSIBLE&order=updatedAt:desc`

Org unit filter, few/zero events + broad user access -- pre-existing, unchanged. Index competition: planner picks the `occurreddate` index expecting to find matches quickly, scans all 10M rows. See Limitations > Index competition:

- `orgUnit=O6uvpzGd5pu&orgUnitMode=SELECTED&order=occurredAt:desc`
- `orgUnit=O6uvpzGd5pu&orgUnitMode=CHILDREN&order=occurredAt:desc`

Full benchmark results (93 queries)
All timings: EXPLAIN ANALYZE execution time (ms), warmup=2.
Single Events (`/api/tracker/events?program=VBqh0ynB2wv`, 10M events)

Orders benchmarked: `order=occurredAt:desc` (admin, super, restricted, district), `order=occurredAt:asc` (admin), `order=createdAt:desc` (admin), `order=updatedAt:desc` (admin), `order=qrur9Dvnyt5:asc` (admin). Filters and options benchmarked: `status=ACTIVE`, `status=COMPLETED`, `status=VISITED`, `updatedWithin=30d`, `assignedUserMode=NONE`, `assignedUserMode=ANY`, `assignedUserMode=PROVIDED`, `includeDeleted=true`, `totalPages=true`, `dataElementIdScheme=CODE`, `status=ACTIVE` + date range.

Working Lists (programme stage `pTo4uMt3xur`, Ngelehun CHC)

Orders benchmarked: `occurredAt:desc`, `updatedAt:desc`, `assignedUserDisplayName:desc`.

Limitations
Index competition
Two indices compete for the planner's attention:

- `(programstageid, occurreddate, eventid)` -- avoids a sort on `occurreddate` but must scan all rows in the programme stage, filtering by org unit after the fact.
- `(programstageid, organisationunitid)` -- filters by org unit first, then sorts the result.

The first index is better for queries where the org unit filter is unselective (ALL, ACCESSIBLE with a large scope) -- scan in order, stop after `limit` rows. The second is better when the org unit filter is selective (SELECTED on a single facility, DESCENDANTS of a small subtree) -- narrow the scan first, then sort within the result.

The planner doesn't always choose correctly. When the target org units have few or zero events, the planner may still pick the `occurreddate` index expecting to find matching rows early. It then scans all 10M rows, discarding each one, before returning an empty result. This is the root cause of the slow SELECTED and CHILDREN queries for admin (Bo district) -- a large org unit where the index scan must visit many rows.

Non-indexed ordering
All orderable fields except `occurredAt` require a full sort of all matching rows regardless of page size -- no index covers these orderings. `createdAt`, `updatedAt`, and data element orders are 6-9s on 10M events.

Total pages

`totalPages=true` always runs a full count query. Cost is O(N) in matching rows.

What we tried
The original goal of DHIS2-20991 was to fix the slow SELECTED/CHILDREN queries for single events (admin: SELECTED 11.7s, CHILDREN 16s on 10M events). Several approaches were explored and rejected because they all hit the index competition problem described above:
Denormalize `organisationunitpath` onto `singleevent`

Store `organisationunit.path` directly on `singleevent` so org unit filters (SELECTED, DESCENDANTS) become predicates on the event row without joining `organisationunit`. Maintained by two triggers: one on `singleevent` insert/update, one cascading `organisationunit.path` changes. Added a composite index `(programstageid, organisationunitpath, occurreddate, eventid)`.

Regression: the planner must choose between the path-based index (filters by org unit first, then sorts) and the `(programstageid, occurreddate, eventid)` index (scans in order, stops at `limit`). For org units with few or zero events in the programme, the planner picks the `occurreddate` index expecting to find matches quickly, then scans all 10M rows. The path denormalization just shifts which index the planner over-estimates. In benchmarking it did not improve the slow cases and added trigger/sync complexity.

Denormalize `hierarchylevel` onto `singleevent`

Extended the path denormalization to also store `organisationunit.hierarchylevel` on `singleevent`, eliminating the OU join for CHILDREN mode (which needs the level check). Same index competition problem as above.

Composite index `(programstageid, organisationunitid, occurreddate, eventid)`

Extend the existing `(programstageid, organisationunitid)` index with `occurreddate, eventid` trailing columns so org unit-scoped queries can both filter and sort from the same index.

Regression: the planner now has three competing indices and makes worse choices in some cases. For enrollments the same approach was tried with a materialized CTE to force plan selection (PR #22982, closed) -- the CTE defeats LIMIT pushdown, causing multi-second regressions when the org unit scope is large (e.g. national hospital owning 50% of events).
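Why a materialized CTE hurts when the scope is large can be shown with a toy simulation. The sketch below is an analogy only and does not model the PostgreSQL executor: the class `LimitPushdownSketch`, its methods, and the "every n-th row matches" distribution are all assumptions made for illustration.

```java
/** Hypothetical toy model of LIMIT pushdown versus materialising first. */
final class LimitPushdownSketch {

    private LimitPushdownSketch() {}

    /**
     * Lazy plan: walk rows that are already in sort order, keep the matching
     * ones, and stop as soon as the page is full. Returns rows visited.
     */
    static int rowsVisitedLazy(int totalRows, int matchEvery, int limit) {
        int visited = 0;
        int found = 0;
        for (int row = 0; row < totalRows && found < limit; row++) {
            visited++;
            if (row % matchEvery == 0) {
                found++;
            }
        }
        return visited;
    }

    /**
     * Materialised plan: every matching row is collected before the limit is
     * applied, so the work grows with the size of the scope, not the page.
     * Returns the number of rows materialised.
     */
    static int rowsMaterialised(int totalRows, int matchEvery) {
        int materialised = 0;
        for (int row = 0; row < totalRows; row++) {
            if (row % matchEvery == 0) {
                materialised++;
            }
        }
        return materialised;
    }
}
```

With 1,000,000 rows, every second row matching (the "50% of events" case), and an assumed page size of 50, the lazy plan visits 99 rows while the materialised plan collects 500,000 rows before it can return the first page.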
Why this PR is the safe subset
All approaches that try to fix the SELECTED/CHILDREN slow path introduce the same fundamental tension: the planner must choose between an index that filters by org unit (good for narrow scopes) and an index that scans in sort order (good for broad scopes or no org unit filter). There is no single plan that handles both well. The planner's cost estimates for `LIMIT` queries are unreliable when data distribution varies between org units.

This PR avoids that problem entirely by only changing the default order and tie-breaker direction -- changes that affect the `ORDER BY` clause but not the `WHERE` clause. The SELECTED/CHILDREN slow path remains pre-existing on master, unchanged by this PR.
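The tension between the two plans can be put into rough numbers. The following is a back-of-the-envelope sketch, not the planner's actual cost model: the class `IndexCompetitionSketch`, the even-spread assumption, and the page size of 50 are hypothetical, while the 10M, 5M (50%), and 17K row counts come from the figures quoted in this description.

```java
/** Hypothetical row-count model for the two competing index plans. */
final class IndexCompetitionSketch {

    private IndexCompetitionSketch() {}

    /**
     * Rows an ordered scan on (programstageid, occurreddate, eventid) visits to
     * fill one page, assuming matching rows are spread evenly through the index.
     * If the scope holds fewer matches than the page size, the page never fills
     * and the scan runs to the end.
     */
    static long orderedScanRows(long totalRows, long matchingRows, long limit) {
        if (matchingRows < limit) {
            return totalRows;
        }
        // Ceiling of limit * totalRows / matchingRows.
        long rows = (limit * totalRows + matchingRows - 1) / matchingRows;
        return Math.min(rows, totalRows);
    }

    /**
     * Rows fetched via (programstageid, organisationunitid): every matching row
     * is read and then sorted before the limit applies.
     */
    static long filterFirstRows(long matchingRows) {
        return matchingRows;
    }
}
```

Under these assumptions, a scope owning 5M of 10M events fills a page after about 100 rows with the ordered scan, against 5M rows for filter-first. A scope of 17K events needs roughly 29,412 rows in order against 17,000 filter-first, so the two plans are close. A scope with zero events costs the ordered scan all 10M rows while filter-first reads nothing, which is the slow case the planner's estimate misses.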