optimize IoU matching: sparse strtree output + 1:1 pre-filtering by musaqlain · Pull Request #1361 · weecology/DeepForest

musaqlain · 2026-03-25T19:10:38Z

Resolves #1345

_overlap_all() used the STRtree to find overlapping pairs but then discarded that sparse result by filling dense (n_truth × n_pred) matrices. These were passed directly to linear_sum_assignment(), which runs in O(n²m).

Following @jveitchmichaelis's suggestion, this PR:

_overlap_all() returns sparse parallel arrays directly from the STRtree.
match_polygons() first identifies unambiguous 1:1 matches (using np.bincount on the STRtree indices) and resolves them immediately. Only the remaining ambiguous pairs go to linear_sum_assignment() via a reduced sub-matrix.
One further improvement: union areas are computed arithmetically (area(A) + area(B) - area(intersection)) instead of calling shapely.union(), which was more efficient in my testing.
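The 1:1 pre-filter described above can be sketched with NumPy alone. This is a minimal illustration, not the code in the PR: the arrays t_idx, p_idx and inter_areas stand in for whatever _overlap_all() returns, and the toy values are made up.

```python
import numpy as np

# Hypothetical sparse overlap pairs: pair k says truth t_idx[k]
# overlaps prediction p_idx[k] with intersection area inter_areas[k].
t_idx = np.array([0, 1, 1, 2, 3])
p_idx = np.array([0, 1, 2, 2, 4])
inter_areas = np.array([4.0, 2.0, 1.0, 3.0, 0.0])
n_truth, n_pred = 4, 5

# Count how many overlap pairs each truth and each prediction appears in.
truth_counts = np.bincount(t_idx, minlength=n_truth)
pred_counts = np.bincount(p_idx, minlength=n_pred)

# A pair is unambiguous when both of its endpoints appear in exactly one
# pair and the overlap has positive area.
one_to_one = (
    (truth_counts[t_idx] == 1)
    & (pred_counts[p_idx] == 1)
    & (inter_areas > 0)
)

# Pairs that can be locked in immediately.
resolved = list(zip(t_idx[one_to_one].tolist(), p_idx[one_to_one].tolist()))
# Everything else would still need further handling (for example an
# assignment solver on a reduced sub-matrix).
remaining = list(zip(t_idx[~one_to_one].tolist(), p_idx[~one_to_one].tolist()))

print(resolved)
print(remaining)
```

In this toy input only the pair (truth 0, prediction 0) is resolved up front. Truth 1 overlaps two predictions, prediction 2 overlaps two truths, and the pair (3, 4) has zero intersection area, so none of those qualify.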

Existing tests pass as-is.

#AI disclosure

AI was used for final improvements; snippets were generated by Copilot while I was coding.
AI was also used to gather background knowledge.

Copilot

Pull request overview

This PR addresses the performance/memory bottleneck in polygon IoU matching by keeping STRtree overlap results sparse and reducing the size of the assignment problem before running Hungarian matching.

Changes:

Change _overlap_all() to return sparse parallel arrays (overlap indices + intersection/union areas) instead of dense (n_truth × n_pred) matrices.
Update match_polygons() to pre-resolve unambiguous 1:1 overlaps and run linear_sum_assignment() only on the remaining ambiguous subset.
Compute union areas via area(A) + area(B) - area(intersection) instead of shapely.union().
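The arithmetic union is the inclusion-exclusion identity for areas, so it needs only the two polygon areas and the intersection area that is already computed. A small worked example follows, using plain axis-aligned boxes as (xmin, ymin, xmax, ymax) tuples. The helper names are hypothetical and are not part of DeepForest or shapely.

```python
def box_area(b):
    """Area of an axis-aligned box given as (xmin, ymin, xmax, ymax)."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def box_intersection_area(a, b):
    """Area of the overlap between two axis-aligned boxes (0 if disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


a = (0.0, 0.0, 2.0, 2.0)  # area 4
b = (1.0, 1.0, 3.0, 3.0)  # area 4, overlaps `a` in a 1 x 1 square

inter = box_intersection_area(a, b)
# Inclusion-exclusion: no union geometry has to be constructed.
union = box_area(a) + box_area(b) - inter
iou = inter / union

print(inter, union, iou)
```

Here the intersection is 1, the union is 4 + 4 - 1 = 7, and the IoU is 1/7.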

codecov · 2026-03-25T20:05:36Z

Codecov Report

❌ Patch coverage is 84.78261% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.75%. Comparing base (408e150) to head (9183c2d).
⚠️ Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
src/deepforest/IoU.py	84.78%	7 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1361      +/-   ##
==========================================
- Coverage   86.87%   86.75%   -0.13%     
==========================================
  Files          24       24              
  Lines        3064     3202     +138     
==========================================
+ Hits         2662     2778     +116     
- Misses        402      424      +22

Flag	Coverage Δ
unittests	`86.75% <84.78%> (-0.13%)`	⬇️

vickysharma-prog · 2026-03-27T04:54:26Z

Traced through the diff, and the two-stage matching logic checks out. If a truth has exactly one overlapping pred and that pred has exactly one overlapping truth (bincount == 1 on both sides) with positive intersection, pulling them out can't affect the optimal assignment for what remains. The inter_areas > 0 guard matters too: STRtree "intersects" includes boundary-touching pairs with zero-area overlap that you'd never want to lock in as a match.
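The boundary-touching case is easy to reproduce with shapely directly. The sketch below uses two unit boxes that share an edge; the geometry is invented for illustration and the STRtree itself is not involved, only the same "intersects" predicate.

```python
from shapely.geometry import box

left = box(0, 0, 1, 1)
right = box(1, 0, 2, 1)  # shares the edge x = 1 with `left`

# The "intersects" predicate is true for geometries that only touch.
touching = bool(left.intersects(right))
# The shared part is a line segment, so its area is zero.
overlap_area = left.intersection(right).area

print(touching, overlap_area)
```

A pair like this would show up in the candidate list with an intersection area of 0, which is why a positive-area check is needed before treating it as a match.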

Couple of things I noticed:

The early return adds a new path that didn't exist before. When truths and preds both exist but nothing overlaps, the original ran linear_sum_assignment on an all-zeros matrix and assigned some truths to predictions with IoU=0. The new code short-circuits all truths to unmatched (prediction_id=None, IoU=0). More sensible, but it is a behavioral change; a test for truths-present-zero-overlaps would pin down the contract.
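The old behaviour on an all-zeros matrix can be seen in isolation with SciPy. This is a sketch under the assumption that the solver was called with maximize=True on an IoU matrix; the exact call in the original code may differ.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Two truths, three predictions, and no overlap at all.
iou = np.zeros((2, 3))

# The solver still returns a complete assignment for the smaller side.
rows, cols = linear_sum_assignment(iou, maximize=True)
matched_ious = iou[rows, cols]

print(rows, cols, matched_ious)
```

Every truth is paired with some prediction even though each paired IoU is 0. Which predictions get picked is an arbitrary tie-break, so a test should assert on the IoU values or the final match flags rather than on the specific indices.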

setdefault in Stage 2 is technically safe since Stage 1 and Stage 2 touch disjoint truth indices by construction, but a direct assignment would make that invariant visible instead of silently swallowing a violation if it ever breaks.
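The difference between the two styles can be shown with plain dictionaries. The helper merge_disjoint and the stage dictionaries below are hypothetical and only illustrate the point; they are not taken from the PR.

```python
def merge_disjoint(matches, new_matches):
    """Add new_matches to matches, failing loudly if a truth index repeats."""
    for truth_i, pred_i in new_matches.items():
        if truth_i in matches:
            raise ValueError(f"truth {truth_i} was already matched in an earlier stage")
        matches[truth_i] = pred_i
    return matches


# Hypothetical truth index -> prediction index results from each stage.
stage1 = {0: 10, 3: 12}
stage2 = {1: 11, 2: 14}
matches = merge_disjoint(dict(stage1), stage2)

# With setdefault, a colliding key keeps the old value and raises nothing.
silent = {0: 10}
silent.setdefault(0, 99)

print(matches)
print(silent)
```

With disjoint stages both approaches give the same result. They differ only when the invariant is broken: setdefault keeps the earlier value without complaint, while the explicit check raises.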

Since #1345 had the tracemalloc script, running it on this branch would give concrete before/after numbers.

jveitchmichaelis · 2026-03-30T17:15:54Z

Thanks for the progress on this @musaqlain, it would be interesting to see the same memory test you shared in the issue thread. I can also have a look on a larger dataset to see what sort of improvement we're likely to see in practice. I don't think we have any numbers for what sort of sparsity we'd actually see (like how many ambiguous matches do we end up with for a typical eval).

In the case where there is no overlap, this shouldn't affect final scoring within the eval path because we threshold on IoU (result["match"] = result.IoU > iou_threshold). A completely non-overlapping assignment would get dropped here?
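A small pandas example of that thresholding step follows. The column names other than IoU and match are assumptions, and the threshold of 0.4 is a placeholder value.

```python
import pandas as pd

iou_threshold = 0.4  # placeholder value for illustration

result = pd.DataFrame(
    {
        "truth_id": [0, 1, 2],
        # Row 1 mimics a zero-IoU pairing from the dense solver,
        # row 2 mimics a truth left unmatched by an early return.
        "prediction_id": [5, 7, None],
        "IoU": [0.82, 0.0, 0.0],
    }
)

result["match"] = result.IoU > iou_threshold

print(result)
```

Both zero-IoU rows end up with match set to False, whether or not a prediction id was attached, so the two behaviours agree once the threshold is applied.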

jveitchmichaelis

Some comments, I think some complexity can be reduced.

musaqlain · 2026-04-03T19:14:16Z

thanks @jveitchmichaelis, here are the tracemalloc numbers (n_truth=5000, n_pred=10000):
before (dense matrices):

Shape: (5000, 10000)
Sparsity: 0.0001
Peak memory: 763.9 MB
Time: 0.97s

after (sparse arrays):

STRtree pairs (M): 5070 of 50,000,000 possible (0.01% non-zero)
Peak memory: 0.6 MB  |  Time: 0.17s

memory drops ~1,200x. typical sparsity looks to be well under 0.1% so the pre-filter should help in practice.

On the zero-overlap case: agree, IoU > iou_threshold would drop those matches downstream anyway so final scoring is unaffected.

musaqlain · 2026-04-03T19:28:29Z

Since the return signature of _overlap_all() changed, here is the updated memory test script. I ran it against the updated code and the results are satisfactory:

import time
import tracemalloc
import geopandas as gpd
import numpy as np
from shapely.geometry import box

from deepforest.IoU import _overlap_all


def make_dummy_data(n, img_size=10000, box_size=50):
    xs = np.random.uniform(0, img_size - box_size, n)
    ys = np.random.uniform(0, img_size - box_size, n)
    geoms = [box(x, y, x + box_size, y + box_size) for x, y in zip(xs, ys)]
    return gpd.GeoDataFrame({"score": np.random.uniform(0.5, 1.0, n)}, geometry=geoms)


np.random.seed(42)
N_TRUTH = 5000
N_PRED = 10000
truth = make_dummy_data(N_TRUTH)
preds = make_dummy_data(N_PRED)

tracemalloc.start()
t0 = time.time()

t_idx, p_idx, inter_areas, union_areas, truth_ids, pred_ids = _overlap_all(preds, truth)

t1 = time.time()
_, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

M = t_idx.size
dense_size_mb = (N_TRUTH * N_PRED * 8 * 2) / 1024 / 1024  # two float64 matrices
sparse_size_mb = (M * 8 * 4) / 1024 / 1024  # four float64/intp arrays of length M
sparsity = M / (N_TRUTH * N_PRED)

print(f"STRtree pairs (M):  {M} of {N_TRUTH * N_PRED} possible ({sparsity:.4%} non-zero)")
print(f"Dense matrix would: {dense_size_mb:.1f} MB")
print(f"Sparse arrays use:  {sparse_size_mb:.1f} MB")
print(f"Peak memory (measured): {peak / 1024 / 1024:.1f} MB")
print(f"Time: {t1 - t0:.2f}s")

jveitchmichaelis · 2026-04-07T17:21:02Z

Thanks for this, I'm going to run some tests on a big dataset on our cluster (since that's probably worst-case).

I'll also run an eval check against our benchmark dataset to check for any regression there, but in theory this approach should be a pure speedup.

Copilot AI review requested due to automatic review settings March 25, 2026 19:10

Copilot started reviewing on behalf of musaqlain March 25, 2026 19:11

Copilot AI reviewed Mar 25, 2026

musaqlain force-pushed the IoU_performance branch from 7932be5 to 5afa786 Compare March 25, 2026 19:26