Stop using Router-ID to distinguish between BGP sessions #398

taspelund · 2024-10-25T17:01:50Z

BGP-ID is insufficient to distinguish between BGP sessions, as it's perfectly valid/reasonable to have multiple sessions to the same peer. Or an operator can (incorrectly or otherwise) assign the same BGP-ID to multiple routers.
This ensures that Session-A to Peer-A going down does not impact Session-B to Peer-A.

Fixes: #241

rdb/src/types.rs

taspelund · 2024-10-26T00:26:47Z

I'd be interested in some feedback on whether or not it would be worth destructuring the Option wrapping the local sockaddr.
e.g.

➜  maghemite git:(trey/unique_neigh) ✗ rustfmt $(git status | grep modified | grep -v 'json\|Cargo.lock\|Cargo.toml' | awk '{print $2}' | tr '\n' diff --git a/rdb/src/types.rs b/rdb/src/types.rs
index 7c1211c..db3d0a4 100644
--- a/rdb/src/types.rs
+++ b/rdb/src/types.rs
@@ -26,13 +26,19 @@ pub struct Path {

 #[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, Eq, PartialEq)]
 pub struct SocketAddrPair {
-    pub local: Option<SocketAddr>,
+    pub local: SocketAddr,
     pub peer: SocketAddr,
 }

 impl SocketAddrPair {
     pub fn new(local: Option<SocketAddr>, peer: SocketAddr) -> Self {
-        Self { local, peer }
+        Self {
+            local: local.unwrap_or(SocketAddr::new(
+                IpAddr::from_str("0.0.0.0").unwrap(),
+                0,
+            )),
+            peer,
+        }
     }
 }

I'm not sure if this is a reasonable approach or if there's a more idiomatic way to do this.
The TcpStream type has a local_addr() method to query the kernel for the local SockAddr, but that's already used by BgpConnectionTcp to populate that field + store the return as an Option.

I'm just not sure if there's a better way to handle this, or if leaving it as an Option makes the most sense.

internet-diglett · 2024-10-28T16:26:53Z

I'd be interested in some feedback on whether or not it would be worth destructuring the Option wrapping the local sockaddr. e.g.
➜  maghemite git:(trey/unique_neigh) ✗ rustfmt $(git status | grep modified | grep -v 'json\|Cargo.lock\|Cargo.toml' | awk '{print $2}' | tr '\n' diff --git a/rdb/src/types.rs b/rdb/src/types.rs
index 7c1211c..db3d0a4 100644
--- a/rdb/src/types.rs
+++ b/rdb/src/types.rs
@@ -26,13 +26,19 @@ pub struct Path {

 #[derive(Debug, Clone, Serialize, Deserialize, JsonSchema, Eq, PartialEq)]
 pub struct SocketAddrPair {
-    pub local: Option<SocketAddr>,
+    pub local: SocketAddr,
     pub peer: SocketAddr,
 }

 impl SocketAddrPair {
     pub fn new(local: Option<SocketAddr>, peer: SocketAddr) -> Self {
-        Self { local, peer }
+        Self {
+            local: local.unwrap_or(SocketAddr::new(
+                IpAddr::from_str("0.0.0.0").unwrap(),
+                0,
+            )),
+            peer,
+        }
     }
 }
I'm not sure if this is a reasonable approach or if there's a more idiomatic way to do this. The TcpStream type has a local_addr() method to query the kernel for the local SockAddr, but that's already used by BgpConnectionTcp to populate that field + store the return as an Option.

I'm just not sure if there's a better way to handle this, or if leaving it as an Option makes the most sense.

This could make sense. In what situations do we expect the local SocketAddr to be None?

taspelund · 2024-10-28T19:32:21Z

This could make sense. In what situations do we expect the local SocketAddr to be None?

It looks like local_addr() is the top of a bunch of layers wrapping libc getsockname():
https://stdrs.dev/nightly/x86_64-pc-windows-gnu/src/std/sys_common/net.rs.html#309-311

So if the illumos man page is to be trusted on this:

ERRORS

       The call succeeds unless:

       EBADF
                   The argument s is not a valid file descriptor.


       ENOMEM
                   There was insufficient memory available for the operation to
                   complete.


       ENOSR
                   There were insufficient STREAMS resources available for the
                   operation to complete.


       ENOTSOCK
                   The argument s is not a socket.

It seems like the only time we'd realistically expect this to fail for reasons within our control is if we call local_addr() for an invalid fd/socket... which I'd guess would only happen if the connect() call fails or if the socket gets closed out from under us?

From the API handler side, mgd checks whether or not there's a valid SessionRunner<Cnx: BgpConnection> for the peer we're attempting to delete before grabbing the local/peer sockaddr:

    pub(crate) async fn remove_neighbor(
        ctx: Arc<HandlerContext>,
        asn: u32,
        addr: IpAddr,
    ) -> Result<HttpResponseDeleted, Error> {
        info!(ctx.log, "remove neighbor: {}", addr);

        let session = get_router!(ctx, asn)?
            .get_session(addr)
            .ok_or(Error::NotFound("session for bgp peer not found".into()))?;
        let conn =
            SocketAddrPair::new(session.bind_addr, session.neighbor.host);
        ctx.db.remove_bgp_peer_prefixes(conn);
        ctx.db.remove_bgp_neighbor(addr)?;
        get_router!(&ctx, asn)?.delete_session(addr);

        Ok(HttpResponseDeleted())
    }

It looks like the only time mgd calls into the kernel to query the local sockaddr is during Update generation -- to figure out what next-hop to set on routes we originate.

I don't imagine we need to do this more than once (store the local sockaddr after a successful connect() call), so long as we properly handle the socket dying underneath us. I'm not sure if/how we handle those scenarios today (maybe we just need to check for an err every time we interact with the socket/BgpConnection?) but maybe that's an indication I need to get a better understanding of our socket/error handling.

BGP-ID is insufficient to distinguish between BGP sessions, as it's perfectly valid/reasonable to have multiple sessions to the same peer. Or an operator can (incorrectly or otherwise) assign the same BGP-ID to multiple routers. This ensures that Session-A to Peer-A going down does not impact Session-B to Peer-A. Fixes: #241 Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>

Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>

Add test case to simulate multiple BGP sessions to same peer Also refactor rdb's test_rib() to be a bit more clean and clear. Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>

Switch from SocketAddrPair to IpAddr for distinguishing between BGP sessions. A BGP peer is identified by the IP we are connecting to, and multiple TCP sessions via the same IP isn't something we're looking to support. So instead of trying to handle all the cases where the local SocketAddr may or may not be available, simplify things by using the configured peer IP which will always be present regardless of the peer's FSM State. Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>

taspelund · 2024-11-01T04:33:29Z

@rcgoodfellow apparently I just needed a week or so to let the thoughts percolate, but I think you were right about using the peer IP as the distinguisher...

The biggest benefit I see to this is that the IP a stateless object (it's either configured/present or it's not) so we won't have to think about socket state changing. No error handling if we can't query the socket state from the kernel. No changes to the hash input, possibly leaving us unable to cleanup entries inserted with the old socket state (mainly the ephemeral port).

I've updated the code to use an IpAddr instead of a SocketAddrPair, which I think has simplified things quite a bit.
I've also cleaned up logging in the rdb tests.

Another review over the latest changes would be appreciated.

move rdb over to using mg_common logging functions, rather than its own re-implementation. Also add a .gitignore for *.log so nextest doesn't pollute the repo. Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>

taspelund · 2024-11-01T07:28:50Z

One other comment I'll leave regarding the session discriminator: if we ever decide to support peering to the same address multiple times (e.g. via explicit ipv6 link-local addresses), this will likely need to be updated again to include a "scope" in addition to the peer address

taspelund added Bug needs testing bgp Border Gateway Protocol labels Oct 25, 2024

taspelund requested review from zeeshanlakhani, rcgoodfellow and internet-diglett October 25, 2024 17:01

taspelund self-assigned this Oct 25, 2024

taspelund force-pushed the trey/unique_neigh branch from 35e9257 to 8e5d38f Compare October 25, 2024 17:55

zeeshanlakhani reviewed Oct 25, 2024

View reviewed changes

rdb/src/types.rs Outdated Show resolved Hide resolved

rdb/src/types.rs Outdated Show resolved Hide resolved

taspelund force-pushed the trey/unique_neigh branch from 8735a28 to 8e5d38f Compare October 25, 2024 18:09

taspelund marked this pull request as ready for review October 25, 2024 21:56

taspelund added 4 commits October 28, 2024 14:10

remove inadvertent comment

4e52b58

Add new() method for SocketAddrPair

d20095a

Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>

Add test case for multiple BGP sessions to same peer

75b03e1

Add test case to simulate multiple BGP sessions to same peer Also refactor rdb's test_rib() to be a bit more clean and clear. Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>

rcgoodfellow approved these changes Oct 30, 2024

View reviewed changes

taspelund force-pushed the trey/unique_neigh branch from 774c5e6 to 6f41d48 Compare November 1, 2024 04:20

rdb: test logging improvements

da93056

move rdb over to using mg_common logging functions, rather than its own re-implementation. Also add a .gitignore for *.log so nextest doesn't pollute the repo. Signed-off-by: Trey Aspelund <trey@oxidecomputer.com>

taspelund force-pushed the trey/unique_neigh branch from 6f41d48 to da93056 Compare November 1, 2024 04:34

taspelund changed the title ~~Use socket addr pair to distinguish between BGP sessions~~ Stop using Router-ID to distinguish between BGP sessions Nov 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stop using Router-ID to distinguish between BGP sessions #398

Stop using Router-ID to distinguish between BGP sessions #398

taspelund commented Oct 25, 2024

taspelund commented Oct 26, 2024

internet-diglett commented Oct 28, 2024

taspelund commented Oct 28, 2024 •

edited

Loading

taspelund commented Nov 1, 2024

taspelund commented Nov 1, 2024

Stop using Router-ID to distinguish between BGP sessions #398

Are you sure you want to change the base?

Stop using Router-ID to distinguish between BGP sessions #398

Conversation

taspelund commented Oct 25, 2024

taspelund commented Oct 26, 2024

internet-diglett commented Oct 28, 2024

taspelund commented Oct 28, 2024 • edited Loading

taspelund commented Nov 1, 2024

taspelund commented Nov 1, 2024

taspelund commented Oct 28, 2024 •

edited

Loading