145 changes: 145 additions & 0 deletions docs/config_bgp.md
@@ -1037,3 +1037,148 @@ admin@branchoffice1.seattlesite1 (routing-protocol[type=bgp])# confederation mem
admin@branchoffice1.seattlesite1 (routing-protocol[type=bgp])# confederation member-as 2200
admin@branchoffice1.seattlesite1 (routing-protocol[type=bgp])# exit
```

## Viewing Filtered BGP Routes

When an inbound BGP policy rejects prefixes received from a neighbor, those routes do not appear in the BGP table or the FIB. The `filtered-routes` option exposes exactly which prefixes were suppressed by the inbound policy for a given neighbor, making it straightforward to troubleshoot why expected routes are absent from the routing table.

:::note
This feature is available in SSR version 7.2.0-r1 and above.
:::

### PCLI

The `filtered-routes` option is available as a third choice alongside `received-routes` and `advertised-routes` in the `show bgp neighbors` command:

```
show bgp neighbors [vrf <vrf_name>] <neighbor_ip> filtered-routes [ipv4 | ipv4-vpn | ipv6 | ipv6-vpn]
```

**Examples**

Display filtered routes for a neighbor in the default VRF using IPv4 unicast (the default address family):

```text
admin@router1.site1# show bgp neighbors 172.16.3.3 filtered-routes
```

Review comment (Contributor): Please provide at least a CLI output for IPv4.


Display filtered IPv6 routes for a neighbor in a named VRF:

```text
admin@router1.site1# show bgp neighbors vrf vrfA fd00:5::3 filtered-routes ipv6
```

When no routes have been filtered, the command returns an empty table. When routes are present, the output format mirrors that of `received-routes` and `advertised-routes`. If the neighbor address is unknown, the VRF does not exist, or the address family is invalid, the PCLI surfaces the underlying error string describing the problem.

### REST API

A new endpoint mirrors the PCLI functionality:

```
GET /api/v1/routing/bgp/neighbors/filtered-routes
```

**Query Parameters**

| Parameter | Required | Default | Description |
|---|---|---|---|
| `neighborAddress` | Yes | — | IP address of the BGP neighbor |
| `vrf` | No | `default` | VRF name |
| `addressFamily` | No | `ipv4` | Address family: `ipv4`, `ipv4-vpn`, `ipv6`, or `ipv6-vpn` |
| `firstIndex` | No | `0` | Zero-based starting index for paginated results |
| `elementCount` | No | all | Maximum number of routes to return (range: 1–5000) |

:::note
The REST endpoint does not support `vrf all` or `addressFamily all`. Each VRF and address family must be queried individually.
:::
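
The parameter rules in the table above can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of the SSR software: the function name `build_filtered_routes_query` is hypothetical, and only the endpoint path, parameter names, and value ranges come from the table. It omits optional parameters that are not supplied so that the server-side defaults apply.

```python
from urllib.parse import urlencode

# Values taken from the Query Parameters table above.
ADDRESS_FAMILIES = {"ipv4", "ipv4-vpn", "ipv6", "ipv6-vpn"}
ENDPOINT = "/api/v1/routing/bgp/neighbors/filtered-routes"


def build_filtered_routes_query(neighbor_address, vrf=None, address_family=None,
                                first_index=None, element_count=None):
    """Return the request path and query string for the filtered-routes endpoint.

    Hypothetical helper: it only validates arguments and builds the URL path.
    """
    if not neighbor_address:
        raise ValueError("neighborAddress is required")
    params = {"neighborAddress": neighbor_address}
    if vrf is not None:
        if vrf == "all":
            raise ValueError("'vrf all' is not supported; query each VRF individually")
        params["vrf"] = vrf
    if address_family is not None:
        if address_family not in ADDRESS_FAMILIES:
            raise ValueError(f"unsupported addressFamily: {address_family}")
        params["addressFamily"] = address_family
    if first_index is not None:
        if first_index < 0:
            raise ValueError("firstIndex is zero-based and cannot be negative")
        params["firstIndex"] = first_index
    if element_count is not None:
        if not 1 <= element_count <= 5000:
            raise ValueError("elementCount must be between 1 and 5000")
        params["elementCount"] = element_count
    # urlencode percent-encodes the colons in IPv6 addresses.
    return f"{ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_filtered_routes_query("172.16.3.3", first_index=0, element_count=1))
```

Rejecting `all` on the client mirrors the note above: each VRF and address family needs its own request.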

**Example: IPv4, default VRF**

```bash
curl --unix-socket /var/run/128technology/speakeasy.sock -i -XGET \
'http://localhost/api/v1/routing/bgp/neighbors/filtered-routes?neighborAddress=172.16.3.3&firstIndex=0&elementCount=1'
```

Response:

```json
{
  "bgpTableVersion": 14,
  "bgpLocalRouterId": "2.1.1.1",
  "defaultLocPrf": 100,
  "localAS": 2,
  "bgpStatusCodes": {
    "suppressed": "s", "damped": "d", "history": "h",
    "valid": "*", "best": ">", "multipath": "=",
    "internal": "i", "ribFailure": "r", "stale": "S", "removed": "R"
  },
  "bgpOriginCodes": { "igp": "i", "egp": "e", "incomplete": "?" },
  "filteredRoutes": [
    {
      "prefix": "10.99.1.0/24",
      "network": "10.99.1.0/24",
      "nextHop": "172.16.3.2",
      "metric": 0,
      "weight": 0,
      "path": "3",
      "bgpOriginCode": "?",
      "valid": true,
      "best": true
    }
  ],
  "totalPrefixCounter": 1,
  "filteredPrefixCounter": 0,
  "nextEntry": 1
}
```

**Example: IPv6, named VRF**

```bash
curl --unix-socket /var/run/128technology/speakeasy.sock -i -XGET \
'http://localhost/api/v1/routing/bgp/neighbors/filtered-routes?neighborAddress=fd00:5::3&firstIndex=0&elementCount=1&addressFamily=ipv6&vrf=vrfA'
```

Response:

```json
{
  "bgpTableVersion": 1,
  "bgpLocalRouterId": "2.1.1.1",
  "defaultLocPrf": 100,
  "localAS": 2,
  "filteredRoutes": [
    {
      "prefix": "2001:db8:5::1/128",
      "network": "2001:db8:5::1/128",
      "nextHopGlobal": "fd00:5::3",
      "metric": 0,
      "weight": 0,
      "path": "3",
      "bgpOriginCode": "?",
      "valid": true,
      "best": true
    }
  ],
  "totalPrefixCounter": 1,
  "filteredPrefixCounter": 0,
  "nextEntry": 1
}
```
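
The two sample responses differ in the next-hop key: the IPv4 entry uses `nextHop`, while the IPv6 entry uses `nextHopGlobal`. A consumer that handles both address families needs to account for that. The sketch below is illustrative only; `summarize_filtered_routes` is a hypothetical name, and the code assumes nothing about the response beyond the fields shown in the examples above.

```python
import json


def summarize_filtered_routes(body):
    """Return (prefix, next_hop, as_path) tuples from a filtered-routes response.

    `body` is the JSON text of the response. Only fields that appear in the
    sample responses are read; anything missing falls back to None or "".
    """
    data = json.loads(body)
    rows = []
    for route in data.get("filteredRoutes", []):
        # IPv4 entries carry "nextHop"; IPv6 entries carry "nextHopGlobal".
        next_hop = route.get("nextHop") or route.get("nextHopGlobal")
        rows.append((route["prefix"], next_hop, route.get("path", "")))
    return rows


if __name__ == "__main__":
    sample = '{"filteredRoutes": [{"prefix": "10.99.1.0/24", "nextHop": "172.16.3.2", "path": "3"}]}'
    for prefix, next_hop, as_path in summarize_filtered_routes(sample):
        print(prefix, next_hop, as_path)
```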

### Troubleshooting

| Failure | PCLI behavior | REST behavior |
|---|---|---|
| `bgpd` not running | Surfaces vty error string | Returns standard upstream failure with informative status code |
| Unknown neighbor IP, or neighbor not in the specified VRF/address family | Surfaces vty error string with neighbor details | Returns `200 OK` with a `warning` key in the JSON body |
| Invalid `addressFamily` or `vrf` argument | Surfaces vty error string | Returns `200 OK` with a `warning` key in the JSON body |
| vty call timeout (120 s) | Surfaces timeout error string | Returns `HTTP 400` with timeout exception message |

PCLI and REST activity is logged in `routingManager.log`. FRR vty-level logs are in `routingEngine.log`.
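
Because two of the failure cases above return `200 OK`, a REST client cannot rely on the status code alone. The following Python sketch shows one way to classify a reply. The function name is hypothetical, and the format of the `warning` value is not specified in this document, so the sketch simply converts it to a string.

```python
def check_filtered_routes_response(status_code, body):
    """Classify a filtered-routes reply as "ok", "warning", or "error".

    `status_code` is the HTTP status; `body` is the decoded JSON object
    (a dict) or None when the body could not be decoded.
    """
    if status_code != 200:
        # Covers the upstream-failure and timeout rows of the table.
        return ("error", f"HTTP {status_code}")
    if isinstance(body, dict) and "warning" in body:
        # Unknown neighbor, VRF, or address family: still 200 OK.
        return ("warning", str(body["warning"]))
    return ("ok", "")


if __name__ == "__main__":
    print(check_filtered_routes_response(200, {"filteredRoutes": []}))
    print(check_filtered_routes_response(200, {"warning": "example text"}))
    print(check_filtered_routes_response(400, None))
```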

### Version History

| Release | Modification |
|---|---|
| 7.2.0 | Feature introduced. |
209 changes: 209 additions & 0 deletions docs/config_pmtu.mdx
@@ -0,0 +1,209 @@
---
title: Path MTU Discovery Enhancements
sidebar_label: Path MTU Discovery Enhancements
---

Review comment (Contributor): Since no Path MTU document exists for the SSR, I would suggest calling this document Path MTU Discovery and leaving off "Enhancements". We should use this document to describe overall PMTU behavior, including the enhancements.

The only area in our docs that covers PMTU is concepts_machine_communication.

I would leave that content there, but provide a link to this new document. It might be worthwhile to duplicate some or all of the other content here.

#### Version History

| Release | Modification |
| ------- | ------------ |
| 7.2.0 | Feature introduced |

Review comment (Contributor): Untrue. PMTU has been around for a long time. Perhaps from the beginning? Might be better off removing this version history table.


## Overview

The SSR performs Path MTU Discovery (PMTUD) along the overlay to determine the correct maximum transmission unit (MTU) for each peer path. By default, this test runs every ten minutes. If a change in the underlay reduces the available path MTU between two SSRs, the new value is not discovered until the next PMTUD cycle. Additionally, existing sessions continue to use the previous MTU value until the next time those sessions are rebuilt.

Devices in the underlay may report an ICMP Destination Unreachable / Fragmentation Needed (type 3, code 4) error, referred to here as a _TooBig_ packet, to indicate they could not forward a packet due to an undersized MTU. Prior to SSR 7.2.0, these messages were forwarded to the correct endpoint, but the SSR itself did not act on the MTU value contained in the message, leaving existing sessions with an incorrect PMTU.

SSR 7.2.0 introduces two complementary enhancements to address these gaps:

1. **Underlay ICMP reaction:** When the SSR receives a TooBig packet from the underlay, it updates the affected overlay flow and generates a corrected TooBig packet toward the original packet sender, allowing the sender to adjust its segment size.

2. **Session Refresh:** The flow that was traversed to trigger the TooBig response from the underlay is updated to use the MTU reported in the TooBig packet.

For TCP flows, setting `enforced-mss automatic` on the egress `network-interface` is the recommended complement to these features. It adjusts the TCP MSS advertised at the interface boundary to avoid fragmentation in the first place. See [Configuration](#configuration) for details.

## How the SSR Reacts to Underlay ICMP TooBig Messages

The following sequence illustrates what happens when the underlay path MTU changes after a session is already established.

### Initial State

```mermaid
sequenceDiagram
participant Client
participant Hub as Hub SSR
participant R1 as Spoke SSR
participant Server

Client->>Hub: Data (MTU 1500)
Hub->>R1: SVR overlay packet (MTU 1500)
R1->>Server: Data (MTU 1500)
Note over Hub,R1: Underlay MTU = 1500. Session PMTU on both SSRs = 1500.
```

The client and server are communicating through two peering SSRs over the overlay. The PMTU is consistent at 1500 across all hops, and both SSRs have applied an MTU of 1500 to the forward flow actions for this session.

### Underlay MTU Drops — First TooBig Received by Hub

```mermaid
sequenceDiagram
participant Client
participant Hub as Hub SSR
participant R2 as Underlay Device
participant R1 as Spoke SSR
participant Server

Note over R2,R1: Underlay MTU between Hub and R2 drops to 1300
Hub->>R2: SVR packet hub-WAN to spoke-WAN (over 1300 bytes)
R2-->>Hub: ICMP TooBig type 3 code 4, reported MTU = 1300
Note over Hub: DivertedPacketHandler finds reverse flow. Updates Hub-to-Spoke PMTU to 1300.
Hub-->>Server: New TooBig toward Server
Note over Server: Server adjusts MSS if TCP-capable
```

Review comment: We should emphasize that this only affects this particular flow. A different session using the same path does not learn the PMTU reported to this session. The update also does not persist: if the session is modified for some other reason (FIB update, device flap, etc.), the egress MTU is reset to the latest value at that time, based on the most recent BFD-based PMTU detection test, or to the interface MTU if no such test has been run.

When R2 (an underlay device) cannot forward an oversized packet, it sends a TooBig packet to the Hub's WAN interface. The SSR processes this message and does the following:

1. It extracts the encapsulated IP header from the TooBig body to identify the affected overlay session.
2. It finds the reverse flow using that header and updates the Hub → Spoke forward flow's PMTU to the value reported by the underlay.
3. It constructs a new TooBig packet directed toward the original packet sender (the Server), so the server's TCP stack can reduce its MSS.

Review comment (@tjbresnahan, May 13, 2026): Pedantic, but it doesn't have to be a reverse flow. It's whatever flow was traversed to elicit the TooBig response. In the context of the TCP download used throughout this doc, it would likely be a reverse flow.
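
The first step above, reading the reported MTU and the encapsulated IP header out of a TooBig message, follows the ICMPv4 layout defined in RFC 792 and RFC 1191. The Python sketch below illustrates that layout only; it is not the SSR's `DivertedPacketHandler`, the function name `parse_too_big` is hypothetical, and it does not verify the ICMP checksum or handle IPv6.

```python
import ipaddress
import struct

ICMP_DEST_UNREACHABLE = 3   # ICMP type
ICMP_FRAG_NEEDED = 4        # ICMP code


def parse_too_big(icmp_message):
    """Parse an ICMPv4 type 3 / code 4 message.

    `icmp_message` is the raw bytes starting at the ICMP header. Returns
    (next_hop_mtu, inner_src, inner_dst, inner_protocol), or None when the
    message is too short or is not a Fragmentation Needed message.
    """
    # 8-byte ICMP header followed by at least a 20-byte embedded IPv4 header.
    if len(icmp_message) < 28:
        return None
    # type, code, checksum, unused, next-hop MTU (RFC 1191 layout).
    icmp_type, icmp_code, _checksum, _unused, next_hop_mtu = struct.unpack(
        "!BBHHH", icmp_message[:8])
    if (icmp_type, icmp_code) != (ICMP_DEST_UNREACHABLE, ICMP_FRAG_NEEDED):
        return None
    inner = icmp_message[8:]
    if inner[0] >> 4 != 4:          # embedded header must be IPv4
        return None
    protocol = inner[9]
    src = str(ipaddress.IPv4Address(inner[12:16]))
    dst = str(ipaddress.IPv4Address(inner[16:20]))
    return next_hop_mtu, src, dst, protocol
```

The embedded source and destination addresses identify the packet that could not be forwarded, which is what allows a receiver to map the error back to a flow.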

:::note
The MTU value propagated in the new TooBig packet reflects the underlay-reported value. On paths with encryption, HMAC, FEC, or BFD tunneling overhead, the effective usable MTU will be lower than the raw underlay value. The SSR accounts for these overheads when setting the MTU on forward flow actions.
:::

## Fabric Fragmentation and Oversize Packet Behavior

When the PMTU on an overlay (SVR/fabric) path is lower than the MTU of the segment immediately preceding the Hub, packets larger than the PMTU will require fragmentation along the overlay. The SSR always fragments fabric packets when necessary, even when the incoming packet carries the Don't Fragment (DF) bit. This preserves packet delivery but prevents the sender from learning about the smaller path MTU and adjusting its segment size.

Review comment: IPv6 traffic is also never supposed to be fragmented by a router, but, as with IPv4 packets that carry the DF bit, we fragment anyway when it's an SVR path.


:::note
For TCP traffic, setting `enforced-mss automatic` on the egress `network-interface` is the most reliable way to avoid this scenario. When set, the SSR rewrites the TCP MSS at the interface boundary to match the session MTU (including the path MTU for SVR sessions). This technique is commonly known as MSS clamping. It is not enabled by default and must be explicitly configured.
:::

## Configuration

### Enabling Oversize Fabric Packet Behavior

#### On a `network-interface`

```
config
    authority
        router <router-name>
            node <node-name>
                device-interface <device-interface-name>
                    network-interface <network-interface-name>
                        oversize-fabric-packet-behavior true
                    exit
                exit
            exit
        exit
    exit
exit
```

Review comment (Contributor): Is `oversize-fabric-packet-behavior true` a new configuration attribute? What is the default?

#### On a `service-policy`

```
config
    authority
        service-policy <policy-name>
            oversize-fabric-packet-behavior true
        exit
    exit
exit
```

### Configuring `enforced-mss` (Recommended for TCP)

Set `enforced-mss` to `automatic` on egress interfaces to avoid fabric fragmentation for TCP traffic. The SSR calculates the MSS from the interface MTU or, for SVR sessions, from the path MTU.

```
config
    authority
        router <router-name>
            node <node-name>
                device-interface <device-interface-name>
                    network-interface <network-interface-name>
                        enforced-mss automatic
                    exit
                exit
            exit
        exit
    exit
exit
```
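
To make the relationship between MTU and MSS concrete, the Python sketch below applies the standard arithmetic: the MSS is the MTU minus the IP and TCP headers (20 + 20 bytes for IPv4, 40 + 20 bytes for IPv6, with no options). This is a generic illustration under those assumptions. The function names are hypothetical, and the SSR's own calculation may subtract additional overhead on SVR paths.

```python
IPV4_HEADER = 20   # bytes, no IP options
IPV6_HEADER = 40   # bytes, no extension headers
TCP_HEADER = 20    # bytes, no TCP options


def mss_for_mtu(mtu, ipv6=False):
    """Return the largest TCP payload that fits in one packet of size `mtu`."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    mss = mtu - ip_header - TCP_HEADER
    if mss <= 0:
        raise ValueError(f"MTU {mtu} is too small to carry TCP")
    return mss


def clamp_mss(advertised_mss, mtu, ipv6=False):
    """Lower an advertised MSS to fit `mtu`; never raise it."""
    return min(advertised_mss, mss_for_mtu(mtu, ipv6))


if __name__ == "__main__":
    # An endpoint on a 1500-byte LAN advertises 1460; the path MTU is 1300.
    print(clamp_mss(1460, 1300))
```

Clamping only ever lowers the advertised value, so an endpoint that already advertises a smaller MSS is left unchanged.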

### Configuring PMTUD Interval

The PMTUD interval (how frequently the SSR probes each overlay path) is configurable at the router level and can be overridden per neighborhood or per adjacency.

```
config
    authority
        router <router-name>
            path-mtu-discovery
                enabled true
                interval 600
            exit
        exit
    exit
exit
```

| Field | Default | Description |
| ----- | ------- | ----------- |
| `enabled` | `true` | Enables or disables PMTUD for this router. |
| `interval` | `600` | Seconds between PMTUD tests. Valid range: 1–86400. |

To override the interval for a specific adjacency:

```
config
    authority
        router <router-name>
            node <node-name>
                device-interface <device-interface-name>
                    network-interface <network-interface-name>
                        adjacency <ip-address>
                            path-mtu-discovery
                                enabled true
                                interval 300
                            exit
                        exit
                    exit
                exit
            exit
        exit
    exit
exit
```
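
The override behavior can be pictured as a lookup from the most specific setting to the least specific one. The Python sketch below is an illustration, not SSR code: the function name is hypothetical, and the precedence order (adjacency, then neighborhood, then router) is an assumption, since this document only states that the router-level value can be overridden at either level. The default of 600 and the 1 to 86400 range come from the table above.

```python
def effective_pmtud_interval(router_interval=600, neighborhood_interval=None,
                             adjacency_interval=None):
    """Return the PMTUD interval, in seconds, that applies to one adjacency.

    Assumed precedence (not confirmed by this document): adjacency override,
    then neighborhood override, then the router-level value.
    """
    for value in (adjacency_interval, neighborhood_interval, router_interval):
        if value is not None:
            if not 1 <= value <= 86400:
                raise ValueError(f"interval {value} is outside 1-86400 seconds")
            return value
    raise ValueError("no interval configured")


if __name__ == "__main__":
    # Router default of 600 s with a 300 s override on one adjacency.
    print(effective_pmtud_interval(router_interval=600, adjacency_interval=300))
```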

## Verification

Use `show peers` to confirm the currently discovered path MTU for each peer path:

```text
admin@node1.router1# show peers
Peer                     Node      Network Interface   Destination     Status   Hostname     Path MTU
------------------------ --------- ------------------- --------------- -------- ------------ ----------
router2                  node1     wan0                192.0.2.10      Up       router2.lab  1300
```

Review comment: Nothing to correct here, just clarifying that this IS correct today. The `show peers` command inspects the peer path, which shows the PMTU at that moment in time. It's the session-level part that only updates when a TooBig packet comes in, and only for that single session.

I don't think there's visibility into that from a show command perspective.

A new stat, stats/icmp/flow-mtu-updates, is a count of flows that have had their MTU updated at runtime via a TooBig packet. This counter is reset when the system resets (not persisted).

A `Path MTU` value of `0` indicates PMTUD is disabled or has not yet completed a test cycle.
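
When many peer paths are present, the zero-value check can be scripted. The Python sketch below scans captured `show peers` text and reports peers whose `Path MTU` is `0`. It assumes the seven-column layout shown in the sample above, with no spaces inside any data field, and the function name is hypothetical. Output from a real system may use a different layout.

```python
def peers_without_pmtu(show_peers_output):
    """Return (peer, destination) pairs whose Path MTU column is 0."""
    flagged = []
    for line in show_peers_output.splitlines():
        fields = line.split()
        # Data rows have seven whitespace-separated fields ending in a number;
        # this skips the prompt, the header, and the dashed separator.
        if len(fields) != 7 or not fields[-1].isdigit():
            continue
        peer, _node, _interface, destination, _status, _hostname, path_mtu = fields
        if int(path_mtu) == 0:
            flagged.append((peer, destination))
    return flagged


if __name__ == "__main__":
    sample = """admin@node1.router1# show peers
Peer Node Network Interface Destination Status Hostname Path MTU
---- ---- ----------------- ----------- ------ -------- --------
router2 node1 wan0 192.0.2.10 Up router2.lab 1300
router3 node1 wan0 192.0.2.20 Up router3.lab 0
"""
    print(peers_without_pmtu(sample))
```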

A new stat, `stats/icmp/flow-mtu-updates`, provides a count of flows that have had their MTU updated at runtime via a TooBig packet. This counter is reset when the system resets (not persisted).

**TODO:** Add an example of the `stats/icmp/flow-mtu-updates` output.

## Troubleshooting

- If the path MTU shown by `show peers` does not reflect the expected value, verify that `path-mtu-discovery > enabled` is `true` on both sides of the adjacency.
- If TCP sessions continue to fragment after configuring `enforced-mss automatic`, confirm the setting is applied to the correct egress interface and that both peers have completed a PMTUD cycle.

## Related Topics

- [Concepts: Machine to Machine Communication](concepts_machine_communication.md) — path MTU discovery protocol details and BFD traffic patterns.
- [Configuration Reference Guide](config_reference_guide.md) — full parameter reference for `path-mtu-discovery`, `enforced-mss`, and `session-resiliency`.
- [Configuring Session Recovery Detection](config_session_recovery.md) — session health-check and flow rebuild mechanisms.
- [Configuring Forward Error Correction](config_forward_error_correction.md) — complementary resiliency feature for packet loss.