
fix(api,agent): Move anycast prefixes filter to routing-profiles #1780

Merged: bcavnvidia merged 1 commit into NVIDIA:main from bcavnvidia:routing-profile-prefixes on May 20, 2026

Conversation

bcavnvidia (Contributor) commented May 18, 2026

Description

This PR moves control of the anycast prefix list filter from the site level to the routing profile, which is effectively the tenant level.

anycast_site_prefixes is now deprecated for non-legacy virtualization. Initially, an empty allowed_anycast_prefixes list in a routing profile will trigger a fallback to anycast_site_prefixes.
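As a rough illustration of the fallback described above (a sketch, not the code in this PR): the function name `effective_anycast_prefixes` and its arguments are hypothetical, and the only behavior assumed is that a non-empty profile list wins and an empty one falls back to the deprecated site-level list.

```python
def effective_anycast_prefixes(profile_allowed, site_prefixes):
    """Return the prefix list a routing profile would end up filtering on.

    Hypothetical helper: models only the transitional fallback rule.
    """
    # A non-empty allowed_anycast_prefixes list on the profile takes precedence.
    if profile_allowed:
        return list(profile_allowed)
    # An empty profile list falls back to the deprecated anycast_site_prefixes.
    return list(site_prefixes)


site = ["10.10.10.0/24"]
print(effective_anycast_prefixes([], site))
print(effective_anycast_prefixes(["192.0.2.0/24"], site))
```

Under this rule, a profile blocks all announcements only when both lists are empty, which is why emptying anycast_site_prefixes matters.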

A future PR will remove rendering of anycast_site_prefixes in DPU network config completely, making site-level prefixes apply only to legacy deployments.

To prepare for the change, deployments that have anycast_site_prefixes configured should opt out by removing or emptying the list in the NICo API TOML config. Any prefixes listed there should be moved to the allowed_anycast_prefixes list in the routing-profile config used for tenants who are expected to announce prefixes to their DPUs.
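The migration step above could be sketched roughly as follows. The dictionary layout (a top-level `anycast_site_prefixes` list and a `routing_profiles` mapping) is an assumption made for illustration and is not the actual NICo API TOML schema; `migrate_site_prefixes` is a hypothetical helper.

```python
import copy


def migrate_site_prefixes(config, announcing_profiles):
    """Copy site-level prefixes into the named profiles, then empty the site list.

    `config` uses an assumed layout, not the real NICo config schema.
    The input is left untouched; a migrated copy is returned.
    """
    migrated = copy.deepcopy(config)
    site_prefixes = migrated.get("anycast_site_prefixes", [])
    for name in announcing_profiles:
        profile = migrated["routing_profiles"][name]
        allowed = profile.setdefault("allowed_anycast_prefixes", [])
        for prefix in site_prefixes:
            # Avoid duplicating a prefix the profile already allows.
            if prefix not in allowed:
                allowed.append(prefix)
    # Opt out of the deprecated site-level list.
    migrated["anycast_site_prefixes"] = []
    return migrated


example = {
    "anycast_site_prefixes": ["10.10.10.0/24"],
    "routing_profiles": {
        "INT": {"allowed_anycast_prefixes": []},
        "EXT": {"allowed_anycast_prefixes": []},
    },
}
print(migrate_site_prefixes(example, ["INT"]))
```

Only the profiles named in `announcing_profiles` receive the prefixes, so the operator has to decide explicitly which tenants may announce.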

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

@bcavnvidia bcavnvidia requested a review from a team as a code owner May 18, 2026 20:05
@bcavnvidia bcavnvidia changed the title Routing profile prefixes fix(api,agent): Add anycast prefixes filter to routing-profiles May 18, 2026
@bcavnvidia bcavnvidia force-pushed the routing-profile-prefixes branch from 0c1d86a to 72bec1f Compare May 18, 2026 20:11
@bcavnvidia bcavnvidia changed the title fix(api,agent): Add anycast prefixes filter to routing-profiles fix(api,agent): Move anycast prefixes filter to routing-profiles May 18, 2026
Comment thread crates/rpc/proto/forge.proto
ajf (Collaborator) commented May 19, 2026

> This is a breaking change because existing routing-profiles would need the site-level prefixes, if any, moved to the routing profiles used for tenants.
>
> Mitigation can be done by updating the nico API config, if necessary, before rolling out this change. Only affects deployments that use host<->DPU BGP peering.

What's at risk of breaking? How do deployments know what they have to do?

bcavnvidia (Contributor, Author) commented May 19, 2026

> > This is a breaking change because existing routing-profiles would need the site-level prefixes, if any, moved to the routing profiles used for tenants.
> >
> > Mitigation can be done by updating the nico API config, if necessary, before rolling out this change. Only affects deployments that use host<->DPU BGP peering.
>
> What's at risk of breaking? How do deployments know what they have to do?

@ajf Good point. I used too few words.

What would "break":

  • Any announcements from host to DPU ("BYOIP"). Non-updated profiles would have a policy list that blocks any announcements. This does not include site-controllers, only managed hosts.

The "fix":

  • Take anything in anycast_site_prefixes and list it in allowed_anycast_prefixes for any routing-profile that should allow announcements.
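To make the break concrete, here is a minimal sketch of a per-profile allow-list check, assuming the strict behavior discussed in this comment (no fallback). The function `announcement_allowed` is hypothetical, and treating any more-specific prefix inside an allowed prefix as a match is an assumption; the real DPU policy may match differently.

```python
import ipaddress


def announcement_allowed(announced, allowed_prefixes):
    """Return True if `announced` falls inside any prefix in the allow list.

    Hypothetical check: an empty allow list blocks every announcement.
    """
    net = ipaddress.ip_network(announced)
    for prefix in allowed_prefixes:
        allowed = ipaddress.ip_network(prefix)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        if net.version == allowed.version and net.subnet_of(allowed):
            return True
    return False


# A profile that was never updated has an empty list and blocks the host's route.
print(announcement_allowed("10.10.10.5/32", []))
# After copying the site-level prefix into the profile, the same route is accepted.
print(announcement_allowed("10.10.10.5/32", ["10.10.10.0/24"]))
```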

ajf (Collaborator) commented May 19, 2026

> Take anything in anycast_site_prefixes prefixes and list that in allowed_anycast_prefixes for any routing-profile that should allow announcements.

Can we merge these things automatically to prevent a breaking change? If not, then we'll need to make sure this shows up in the release notes obviously, cc @sabbanis

bcavnvidia (Contributor, Author) commented

> > Take anything in anycast_site_prefixes prefixes and list that in allowed_anycast_prefixes for any routing-profile that should allow announcements.
>
> Can we merge these things automatically to prevent a breaking change? If not, then we'll need to make sure this shows up in the release notes obviously, cc @sabbanis

Looks like GitHub lost my previous reply somehow. 🤔

We could merge/fallback, but I intentionally didn't because then it looks a bit like "default allow."

If there are two profiles, INT and EXT, and anycast_site_prefixes=[10.10.10.0/24], then tenants under either profile can currently announce from that prefix.

In this PR, if allowed_anycast_prefixes is made optional and the fallback is anycast_site_prefixes, it seems too easy for users to keep unintentionally creating profiles that all allow announcements.

An alternative might be a deprecation process for anycast_site_prefixes, but then users would be forced to define the optional field to prevent announcements, and it seems like that would just allow more time to pass before having to do the same break anyway. It feels like this is a good time to do it.
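The INT/EXT scenario above can be sketched as follows. This is an illustration of the "default allow" concern only; `profiles_that_can_announce` and its `fallback` switch are hypothetical and do not correspond to a real option.

```python
def profiles_that_can_announce(profiles, site_prefixes, fallback):
    """Return the sorted names of profiles that end up with a non-empty allow list.

    `profiles` maps a profile name to its allowed_anycast_prefixes list.
    `fallback` models whether an empty list falls back to the site-level list.
    """
    result = []
    for name, allowed in profiles.items():
        effective = allowed or (site_prefixes if fallback else [])
        if effective:
            result.append(name)
    return sorted(result)


site = ["10.10.10.0/24"]
# Only INT was deliberately given a prefix; EXT was left empty.
profiles = {"INT": ["10.10.10.0/24"], "EXT": []}

print(profiles_that_can_announce(profiles, site, fallback=True))   # ['EXT', 'INT']
print(profiles_that_can_announce(profiles, site, fallback=False))  # ['INT']
```

With the fallback in place, EXT can announce even though nobody configured it to, which is the behavior the comment is wary of.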

ajf (Collaborator) commented May 19, 2026

Just thinking out loud, is it possible to have a config option to merge these two things, default to true, then in some future release we default to false, then remove it? That'd at least give each deployment control over how it behaves and not create a flag day where one deployment will break their users.

I could go either way, since backward compatibility is only going to get harder. I do know we have impacted people internally that will need to be considered when doing deployment of this change.

Your call if it's "breaking change" + documentation or make it configurable at runtime.

bcavnvidia (Contributor, Author) commented May 19, 2026

> Just thinking out loud, is it possible to have a config option to merge these two things, default to true, then in some future release we default to false, then remove it?

I thought about that, but then at some future date we make a change that would have the same breaking effect we're talking about right now, possibly twice (first when we change the default and hit everyone who hadn't set it, and again later when we remove it entirely and hit all the people who had explicitly set it to true). 🤔

If we're going to take it away at some point, but not right away, then it seems simpler to just call out anycast_site_prefixes as deprecated and explain what that means, which is fine. I just worry that we're unintentionally committing to default-allow (and also that changing will be harder later).
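The "possibly twice" argument can be checked with a small simulation. The merge option discussed here was never named or implemented in this thread, so the setting values and release labels below are hypothetical.

```python
def merges(setting, release):
    """Return True if site-level prefixes are still merged for this deployment.

    `setting` is the deployment's explicit value for the hypothetical option
    (True, False, or None when unset). `release` is one of the three stages.
    """
    if release == "removed":
        return False  # The option no longer exists, so nothing is merged.
    if setting is None:
        return release == "default_true"  # Unset deployments follow the default.
    return setting


def newly_broken(deployments, releases):
    """Map each release to the deployments that lose merging when it rolls out."""
    broken = {}
    for previous, current in zip(releases, releases[1:]):
        broken[current] = sorted(
            name
            for name, setting in deployments.items()
            if merges(setting, previous) and not merges(setting, current)
        )
    return broken


deployments = {"unset": None, "explicit_true": True, "explicit_false": False}
releases = ["default_true", "default_false", "removed"]
print(newly_broken(deployments, releases))
# {'default_false': ['unset'], 'removed': ['explicit_true']}
```

Flipping the default breaks deployments that never set the option, and removing it later breaks those that pinned it to true, so the flag spreads the break over two releases rather than avoiding it.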

@bcavnvidia bcavnvidia force-pushed the routing-profile-prefixes branch 2 times, most recently from d8e6a9e to 3acdca9 Compare May 19, 2026 21:35
bcavnvidia (Contributor, Author) commented May 19, 2026

Spiritually conflicted, but I made it more of a deprecation process. @sabbanis should probably still make sure the details in the description are in the release notes.

Updated the README in /cfg/

@bcavnvidia bcavnvidia force-pushed the routing-profile-prefixes branch from 3acdca9 to b6f23f8 Compare May 19, 2026 21:44
@bcavnvidia bcavnvidia force-pushed the routing-profile-prefixes branch from b6f23f8 to f484470 Compare May 19, 2026 21:50
@bcavnvidia bcavnvidia merged commit baae1d7 into NVIDIA:main May 20, 2026
43 checks passed
3 participants