BFD and falcon-lab cleanup [Spring Cleaning 1/N] #717
Merged
Use mg-common's ManagedThread so we retain JoinHandles in a structured fashion in case we want to make explicit use of these later. Also introduce AddPeerRequest to group arguments defining a peer config. This avoids clippy's too many arguments warning and provides some rough structure for the peer add flow.
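As a rough illustration of the argument grouping described above, a request struct might look like the following. Only the name `AddPeerRequest` comes from this PR; every field and the `describe_peer` helper are hypothetical stand-ins, not the actual mgd types.

```rust
use std::net::IpAddr;
use std::time::Duration;

/// Groups the arguments that define a peer config.
/// All fields are illustrative assumptions, not the real definition.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AddPeerRequest {
    pub peer: IpAddr,
    pub listen: IpAddr,
    pub required_rx: Duration,
    pub detection_multiplier: u8,
}

/// Hypothetical consumer: takes one struct instead of a long
/// positional argument list (the pattern clippy's
/// `too_many_arguments` lint nudges towards).
pub fn describe_peer(req: &AddPeerRequest) -> String {
    format!(
        "peer={} listen={} rx={}ms mult={}",
        req.peer,
        req.listen,
        req.required_rx.as_millis(),
        req.detection_multiplier
    )
}
```

Passing one struct also means a new peer parameter can be added later without touching every call site's signature.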
After switching from thread::spawn() to a thread Builder (Builder::spawn returns a Result, whereas thread::spawn panics if the thread cannot be created), the Result should be propagated back out to the callers so they can do proper error handling.
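A minimal sketch of that pattern using only the standard library follows. It does not use mg-common's ManagedThread, whose API is not shown here; `spawn_named` and `start_session` are hypothetical names chosen for illustration.

```rust
use std::io;
use std::thread::{self, JoinHandle};

/// Spawn a named thread and hand any spawn failure back to the caller.
/// `thread::Builder::spawn` returns `io::Result<JoinHandle<T>>`,
/// unlike `thread::spawn`, which panics if the OS refuses the thread.
pub fn spawn_named<F, T>(name: &str, f: F) -> io::Result<JoinHandle<T>>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new().name(name.to_string()).spawn(f)
}

/// Hypothetical caller: the `?` propagates the spawn error upward
/// so an admin handler could turn it into a proper error response.
pub fn start_session(peer: &str) -> io::Result<JoinHandle<Option<String>>> {
    let handle = spawn_named(&format!("bfd-egress-{peer}"), || {
        // The worker just reports its own thread name.
        thread::current().name().map(str::to_owned)
    })?;
    Ok(handle)
}
```

Naming the thread has a side benefit: the name shows up in panic messages and in debugger thread listings.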
ensure() was returning a Result<UdpSocket> but the caller just dropped it, so it now returns Result<()>. Removes the redundant presence check for the peer in the add_peer call path. Updates egress() to use recv_timeout() to avoid possible deadlocks where recv() blocks indefinitely on a silent channel and we miss the kill switch (AtomicBool).
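The recv_timeout() change can be sketched roughly as below, assuming a std::sync::mpsc channel. The function name `egress_loop`, the packet type, and the returned counter are assumptions made for the example; the real egress() sends BFD packets rather than counting them.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::sync::Arc;
use std::time::Duration;

/// Handle packets from `rx` until the kill switch is set or every
/// sender has been dropped. Returns how many packets were handled.
pub fn egress_loop(rx: Receiver<Vec<u8>>, kill: Arc<AtomicBool>, poll: Duration) -> usize {
    let mut handled = 0;
    loop {
        // Checked at least once per `poll` interval, even when the
        // channel is silent; a plain `recv()` would block here forever.
        if kill.load(Ordering::Relaxed) {
            return handled;
        }
        match rx.recv_timeout(poll) {
            Ok(_packet) => handled += 1, // real code would transmit the packet
            Err(RecvTimeoutError::Timeout) => continue,
            Err(RecvTimeoutError::Disconnected) => return handled,
        }
    }
}
```

The poll interval bounds how long shutdown can take: with a 1s timeout, the thread notices the kill switch within about a second of it being set.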
Remove the workaround for Illumos #17853 from falcon-lab.sh
Adds new falcon-lab test that makes use of the pre-existing trio topo. Generalizes the trio topology creation to reduce boilerplate. Adds a dual-stack static routing config with BFD enabled. Adds a test to ensure BFD comes up for v4 and v6, each next-hop can be disabled independently, and that each path is installed into the RIB as expected as BFD sessions are brought up/down.
Illumos complains if you try to add a static ipv6 address via ipadm before that datalink already has an addrconf address. Make it so! Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>
Since mgd images in falcon-lab do not ship with the control plane, we have to manually poke and prod dendrite to install entries into softnpu. This includes uplink addresses we intend to use for unicast, e.g. BFD and BGP (although unnumbered BGP works because softnpu has a punt-to-CPU catch-all for link-local traffic). Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>
Fix the syntax to enable BFD for static routes on EOS. Also fix up the parsing of EOS "show bfd peers | json" output. Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>
Adds falcon-lab/falcon-lab symlink so there's a standard directory to run the tests from locally. Adds a falcon-lab/cargo-bay directory for dumping binaries to be used by falcon-lab tests. Adds a falcon-lab/cargo-bay/.gitignore that ignores everything except for the .gitignore itself, so nothing gets tracked by git inadvertently. Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>
Adds validation of routes in loc_rib ("selected") and ASIC (dpd).
Checks FSM state of both peers, not just the 0th one.
Check for specific prefixes when looking in RIB (on both local/peer).
Bumps bestpath fanout to 2 so we can validate ECMP in loc_rib/dpd.
Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>
Cleans up the FRR config a little bit by replacing a bunch of ALLOW-ALL boilerplate with "no bgp ebgp-requires-policy". Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>
rcgoodfellow approved these changes on Apr 24, 2026
    Ok(())
}

pub async fn run_trio_bfd_static_test(
Collaborator
This is awesome. Really stoked to see this e2e test come together.
First round of splits from #648
BFD daemon cleanup
- Use ManagedThread + named Builder::spawn; propagate spawn Results up to admin callers.
- Introduce AddPeerRequest; add AddPeerError::PeerExists so mgd returns 409 instead of 500 on duplicates.
- Switch egress() from blocking rx.recv() to recv_timeout(1s) with a kill-switch poll, avoiding the silent-channel deadlock.

New falcon-lab test: trio-bfd-static-routing
- boot_trio() / BootedTrio.
- pkill -STOP bfdd on FRR and docker pause ceos on EOS: reversible, daemon-state-preserving.
- link_ipv{4,6}_create + link_ipv6_enabled_set, run ddmd (mg-lower depends on it), bump bestpath_fanout to 2, pre-seed each tfport with an addrconf link-local before static v6, and tolerate "already"-style ipadm errors for --no-cleanup re-runs.

Trio-unnumbered assertion tightening
- Bump bestpath_fanout to 2 so ECMP is actually exercised.
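For the duplicate-peer handling listed under the BFD daemon cleanup, a sketch of how a dedicated error variant lets the admin layer pick 409 over 500 might look like this. Only `AddPeerError::PeerExists` is named in this PR; the `Spawn` variant, the `Daemon` type, and `http_status` are hypothetical and stand in for whatever mgd actually uses.

```rust
use std::collections::HashSet;
use std::net::IpAddr;

#[derive(Debug, PartialEq, Eq)]
pub enum AddPeerError {
    /// The peer is already configured (named in the PR).
    PeerExists(IpAddr),
    /// Hypothetical catch-all for thread spawn failures.
    Spawn(String),
}

impl AddPeerError {
    /// Hypothetical mapping an admin handler could apply.
    pub fn http_status(&self) -> u16 {
        match self {
            AddPeerError::PeerExists(_) => 409, // Conflict
            AddPeerError::Spawn(_) => 500,      // Internal Server Error
        }
    }
}

/// Illustrative stand-in for the daemon's peer table.
#[derive(Default)]
pub struct Daemon {
    peers: HashSet<IpAddr>,
}

impl Daemon {
    /// A single presence check: `insert` returns false when the
    /// peer was already in the set.
    pub fn add_peer(&mut self, peer: IpAddr) -> Result<(), AddPeerError> {
        if self.peers.insert(peer) {
            Ok(())
        } else {
            Err(AddPeerError::PeerExists(peer))
        }
    }
}
```

Keeping the status decision in one match means a new error variant forces an explicit choice of response code at compile time.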