ooptimize IoU matching: sparse strtree output + 1:1 pre-filtering#1361
ooptimize IoU matching: sparse strtree output + 1:1 pre-filtering#1361musaqlain wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR addresses the performance/memory bottleneck in polygon IoU matching by keeping STRtree overlap results sparse and reducing the size of the assignment problem before running Hungarian matching.
Changes:
- Change
_overlap_all()to return sparse parallel arrays (overlap indices + intersection/union areas) instead of dense(n_truth × n_pred)matrices. - Update
match_polygons()to pre-resolve unambiguous 1:1 overlaps and runlinear_sum_assignment()only on the remaining ambiguous subset. - Compute union areas via
area(A) + area(B) - area(intersection)instead ofshapely.union().
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
7932be5 to
5afa786
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1361 +/- ##
==========================================
- Coverage 86.87% 86.75% -0.13%
==========================================
Files 24 24
Lines 3064 3202 +138
==========================================
+ Hits 2662 2778 +116
- Misses 402 424 +22
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Traced through the diff the two-stage matching logic checks out. If a truth has exactly one overlapping pred and that pred has exactly one overlapping truth (bincount==1 both sides) with positive intersection, pulling them out can't affect the optimal assignment for what remains. The inter_areas > 0 guard matters too STRtree "intersects" includes boundary-touching pairs with zero-area overlap that you'd never want to lock in as a match. Couple of things I noticed: The early return adds a new path that didn't exist before when truths and preds both exist but nothing overlaps, the original ran linear_sum_assignment on an all-zeros matrix and assigned some truths to predictions with IoU=0. The new code short-circuits all truths to unmatched (prediction_id=None, IoU=0). More sensible, but it is a behavioral change a test for truths-present-zero-overlaps would pin down the contract. setdefault in Stage 2 is technically safe since Stage 1 and Stage 2 touch disjoint truth indices by construction, but a direct assignment would make that invariant visible instead of silently swallowing a violation if it ever breaks. Since #1345 had the tracemalloc script, running it on this branch |
|
Thanks for the progress on this @musaqlain, it would be interesting to see the same memory test you shared in the issue thread. I can also have a look on a larger dataset to see what sort of improvement we're likely to see in practice. I don't think we have any numbers for what sort of sparsity we'd actually see (like how many ambiguous matches do we end up with for a typical eval). In the case where there is no overlap, this shouldn't affect final scoring within the eval path because we threshold on IoU ( |
jveitchmichaelis
left a comment
There was a problem hiding this comment.
Some comments, I think some complexity can be reduced.
|
thanks @jveitchmichaelis, here are the tracemalloc numbers (n_truth=5000, n_pred=10000): after (sparse arrays): memory drops ~1,200x. typical sparsity looks to be well under 0.1% so the pre-filter should help in practice. On the zero-overlap case: agree, |
|
since the number of function parameters changed, ( |
5afa786 to
9183c2d
Compare
|
Thanks for this, I'm going to run some tests on a big dataset on our cluster (since that's probably worst-case). I'll also run an eval check against our benchmark dataset to check for any regression there, but in theory this approach should be a pure speedup. |
Resolves #1345
_overlap_all() used the STRtree to find overlapping pairs but then discarded that sparse result by filling dense (n_truth × n_pred) matrices. These were passed directly to
linear_sum_assignment(), which runs inO(n²m).Following @jveitchmichaelis's suggestion, this PR:
_overlap_all() returns sparse parallel arrays directly from the STRtree.
match_polygons() first identify unambiguous 1:1 matches {using
np.bincounton the STRtree indices} and resolves them immediately. Only the remaining ambiguous pairs go tolinear_sum_assignment()via a reduced sub-matrix.1 improvement I made: union areas are computed arithmetically (
area(A) + area(B) - area(intersection)) instead of callingshapely.union(), this is more efficient as per my findings.Existing tests pass as-is.....
#AI disclosure