Skip to content

fix(routing/http): cap records after filtering#1157

Open
lidel wants to merge 1 commit into
mainfrom
fix/post-filter-records-limit
Open

fix(routing/http): cap records after filtering#1157
lidel wants to merge 1 commit into
mainfrom
fix/post-filter-records-limit

Conversation

@lidel
Copy link
Copy Markdown
Member

@lidel lidel commented May 19, 2026

Problem

The Delegated Routing server applied the records limit before filter-addrs/filter-protocols ran, so records the filters dropped shrank /routing/v1/providers and /routing/v1/peers responses below the configured limit.

This Fix

The server now calls the delegate FindProviders/FindPeers with a limit of 0 (unbounded), reads the result iterator lazily, and applies the cap itself after filtering. A delegate that used the limit to end its walk early will now end it on Close instead.

See it in action in:

Changes

  • routing/http/types/iter: add Limit, an iterator that caps another iterator at a fixed number of values
  • routing/http/server: apply the records limit post-filter; pass 0 to the delegate; update tests and DelegatedRouter docs

The Delegated Routing server applied the records limit before
filter-addrs/filter-protocols ran, so records the filters dropped
shrank /routing/v1/providers and /routing/v1/peers responses below
the configured limit.

The server now calls the delegate FindProviders/FindPeers with a
limit of 0 (unbounded), reads the result iterator lazily, and
applies the cap itself after filtering. A delegate that used the
limit to end its walk early will now end it on Close instead.

- routing/http/types/iter: add Limit, an iterator that caps
  another iterator at a fixed number of values
- routing/http/server: apply the records limit post-filter; pass
  0 to the delegate; update tests and DelegatedRouter docs
@codecov
Copy link
Copy Markdown

codecov Bot commented May 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 63.21%. Comparing base (4ea1540) to head (7fa3c6c).

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1157      +/-   ##
==========================================
+ Coverage   63.15%   63.21%   +0.05%     
==========================================
  Files         267      268       +1     
  Lines       26867    26884      +17     
==========================================
+ Hits        16969    16994      +25     
+ Misses       8173     8163      -10     
- Partials     1725     1727       +2     
Files with missing lines Coverage Δ
routing/http/server/server.go 77.83% <100.00%> (+1.67%) ⬆️
routing/http/types/iter/limit.go 100.00% <100.00%> (ø)

... and 12 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lidel lidel marked this pull request as ready for review May 19, 2026 22:33
@lidel lidel requested a review from a team as a code owner May 19, 2026 22:33
Copy link
Copy Markdown
Contributor

@guillaumemichel guillaumemichel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a better approach than ipfs/someguy#149 👍🏻

Would it make sense to use FindProvidersAsync() instead of FindProviders()? It is the more standard way to find provider (go-libp2p interface), and gives caller more control.

FindProvidersAsync(ctx, cid, 0) would allow to cancel the request (cancel context) after we are happy with the results (e.g enough providers with addresses), so that we don't solicit the content router more than strictly needed.

@lidel
Copy link
Copy Markdown
Member Author

lidel commented May 20, 2026

Hm.. the goal here (cancel once we have enough, don't over-ask the router) is right, and it's already what this PR does.

@guillaumemichel lmk if i misunderstood, but the DelegatedRouter.FindProviders is slightly different from go-libp2p interface:

  • go-libp2p ContentDiscovery.FindProvidersAsync returns <-chan peer.AddrInfo a value channel, no error channel
  • boxo's DelegatedRouter is a separate interface, returning iter.ResultIter[types.Record] (richer type than peer.AddrInfo, and is lazy: the server pulls record by record, stops once iter.Limit is satisfied, and the deferred Close() cancels the rest

Is the idea to lift FindProvidersAsync to DelegatedRouter? If so, it wouldn't add control we lack, and it has costs:

  • <-chan peer.AddrInfo has no slot for per-item errors; iter.Result does.
  • a channel forces a goroutine even for synchronous delegates. boxo's own HTTP-delegated client is a goroutine-free streaming decoder today; a channel interface would spawn a pump goroutine per request.
  • ApplyFiltersToIter, Limit, and the ndjson writer are all iter-based, so a channel would be converted right back into an iterator.

So I'd keep FindProviders as-is. I did expand the DelegatedRouter doc comment to spell out the lazy + Close() contract, so implementers know to stream results and stop work on Close.

Copy link
Copy Markdown
Contributor

@guillaumemichel guillaumemichel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I was missing a piece of the puzzle in someguy.

Everything looks good, just 2 nits inline

Comment on lines 321 to +325
defer provIter.Close()

filteredIter := filters.ApplyFiltersToIter(provIter, filterAddrs, filterProtocols)
providers, err := iter.ReadAllResults(filteredIter)
var limitedIter iter.ResultIter[types.Record] = iter.Limit(filteredIter, recordsLimit)
providers, err := iter.ReadAllResults(limitedIter)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provIter could be unconditionally closed after iter.ReadAllResults(), which would cancel request on router faster. No need to wait for response to be sent.

Suggested change
defer provIter.Close()
filteredIter := filters.ApplyFiltersToIter(provIter, filterAddrs, filterProtocols)
providers, err := iter.ReadAllResults(filteredIter)
var limitedIter iter.ResultIter[types.Record] = iter.Limit(filteredIter, recordsLimit)
providers, err := iter.ReadAllResults(limitedIter)
filteredIter := filters.ApplyFiltersToIter(provIter, filterAddrs, filterProtocols)
var limitedIter iter.ResultIter[types.Record] = iter.Limit(filteredIter, recordsLimit)
providers, err := iter.ReadAllResults(limitedIter)
provIter.Close()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better to close limitedIter?

Comment on lines 461 to +466
defer peersIter.Close()

peersIter = filters.ApplyFiltersToPeerRecordIter(peersIter, filterAddrs, filterProtocols)
filteredIter := filters.ApplyFiltersToPeerRecordIter(peersIter, filterAddrs, filterProtocols)
var limitedIter iter.ResultIter[*types.PeerRecord] = iter.Limit(filteredIter, recordsLimit)

peers, err := iter.ReadAllResults(peersIter)
peers, err := iter.ReadAllResults(limitedIter)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reasoning as above, no need to keep query longer than required.

Suggested change
defer peersIter.Close()
peersIter = filters.ApplyFiltersToPeerRecordIter(peersIter, filterAddrs, filterProtocols)
filteredIter := filters.ApplyFiltersToPeerRecordIter(peersIter, filterAddrs, filterProtocols)
var limitedIter iter.ResultIter[*types.PeerRecord] = iter.Limit(filteredIter, recordsLimit)
peers, err := iter.ReadAllResults(peersIter)
peers, err := iter.ReadAllResults(limitedIter)
filteredIter := filters.ApplyFiltersToPeerRecordIter(peersIter, filterAddrs, filterProtocols)
var limitedIter iter.ResultIter[*types.PeerRecord] = iter.Limit(filteredIter, recordsLimit)
peers, err := iter.ReadAllResults(limitedIter)
peersIter.Close()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better to close limitedIter?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants