Skip to content

Networking Communication Protocol #2

Open
majdnasra2 wants to merge 5 commits into
feat/bidirectional-pipefrom
majdnasra2/comm-protocol
Open

Networking Communication Protocol #2
majdnasra2 wants to merge 5 commits into
feat/bidirectional-pipefrom
majdnasra2/comm-protocol

Conversation

@majdnasra2
Copy link
Copy Markdown

@majdnasra2 majdnasra2 commented Apr 22, 2026

This adds the whole control protocol end to end — everything the design in PROTOCOL.md covers: wire framing, message catalog, arca-control (ArcaSession, codec), arca-monitor (Listen / Connect / accept flow), and how data pipes get handed off after ConnectionReady.

Core behavior: Arca talks to the monitor over one control pipe; listen and connect are request/reply; inbound TCP uses AcceptRequest so IncomingConnection is always a reply with the same request_id (no unsolicited events, no Arca stash). The monitor keeps a per-listener pending-accept queue and only runs kernel accept when a wait is queued; poll_accepts, pump_once, FrameReadBuf, and serve_one implement the driver side. ConnectRequest blocks on the real connect.

PROTOCOL.md describes the full spec; tests cover the crate APIs and monitor paths.

Not compatible with a monitor that only pushed IncomingConnection with request_id 0 and never AcceptRequest.

@majdnasra2 majdnasra2 changed the base branch from main to feat/bidirectional-pipe April 22, 2026 22:34
@majdnasra2 majdnasra2 marked this pull request as ready for review May 13, 2026 22:23
@majdnasra2 majdnasra2 changed the title [IN PROGRESS] Networking Communication Protocol Networking Communication Protocol May 13, 2026
@majdnasra2 majdnasra2 force-pushed the majdnasra2/comm-protocol branch from 3363f73 to 09d57e3 Compare May 13, 2026 22:34
Copy link
Copy Markdown

@Akshay-Srivatsan Akshay-Srivatsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't gone through this fully, but I think in general this is more code than necessary and some structural changes would clean it up a lot. Namely, modelling the protocol directly in Rust structs [1], and more clearly separating the different layers (like the CS 144 codebase). The protocol docs kind of reflect this; they don't have a separation between layers right now, and describe the entire thing as "the protocol" even though a bunch of the parts are independent.

There's also a scattered cases where errors are handled inconsistently (a mix of panicking, returning Results, and busy-waiting), which should probably be cleaned up.

[1]: e.g., the messages could look something like:

  enum ControlRequest {
      Listen { request_id: u32, endpoint: Endpoint },
      Connect { request_id: u32, endpoint: Endpoint },
      Accept { request_id: u32, listener_id: u32 },
  }

  enum ControlReply {
      ListenOk { request_id: u32, listener_id: u32 },
      ListenErr { request_id: u32, code: u32 },
      ConnectOk { request_id: u32, ready: ConnectionReady },
      ConnectErr { request_id: u32, code: u32 },
      AcceptOk { request_id: u32, ready: ConnectionReady },
      AcceptErr { request_id: u32, code: u32 },
  }

And then parsing a request could be ControlRequest::try_from(bytes)? or similar.

Comment thread monitor/src/relay.rs
match pipe.write(&buf[off..n]) {
Ok(0) => return Ok(off),
Ok(m) => off += m,
Err(PipeError::WouldBlock) => {}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid busy-waiting here?

Comment thread monitor/src/lib.rs
MessageType::AcceptRequest => {
let lid = AcceptListenerId::decode(frame.payload()).listener_id;
if !self.listeners.contains_key(&lid) {
return Err(MonitorError::UnknownListener(lid));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should something be communicated back to the guest here?

Comment thread control/src/protocol.rs
Comment on lines +140 to +144
pub fn decode(payload: &[u8]) -> Self {
assert!(payload.len() == 6, "endpoint payload must be 6 bytes");
let host = [payload[0], payload[1], payload[2], payload[3]];
let port = u16::from_le_bytes([payload[4], payload[5]]);
Endpoint::new(host, port)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code structure in this file (of using decode as an infallible constructor that panics) is a bit weird. Can this use the TryFrom trait instead?

Comment thread control/src/protocol.rs
Comment on lines +33 to +116
/// Catalog of message kinds carried on the control pipe.
///
/// Keep this list **small** and add new variants only when there's a real
/// need. Each variant is a single byte on the wire.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum MessageType {
/// Arca → Linux: please `bind`+`listen` on the given [`Endpoint`].
ListenRequest = 1,
/// Linux → Arca: listener is ready; payload is [`ListenerReady`].
ListenOk = 2,
/// Arca → Linux: please `connect` outbound to the given [`Endpoint`].
ConnectRequest = 3,
/// Linux → Arca: outbound connect succeeded; payload is [`ConnectionReady`].
ConnectOk = 4,
/// Linux → Arca: reply to [`MessageType::AcceptRequest`]; payload is
/// [`ConnectionReady`] with `listener_id` set.
IncomingConnection = 5,
/// Linux → Arca: bind/listen failed; payload is [`ErrPayload`].
ListenErr = 6,
/// Linux → Arca: outbound connect failed; payload is [`ErrPayload`].
ConnectErr = 7,
/// Arca → Linux: block until the next inbound TCP connection on this
/// listener; payload is [`AcceptListenerId`] (4 B).
///
/// The matching reply is [`MessageType::IncomingConnection`] with the same
/// `request_id`.
AcceptRequest = 8,
}

impl MessageType {
/// Reverse of `as u8`, returning `None` for unknown bytes so the codec
/// can reject garbage instead of silently misinterpreting it.
pub fn from_u8(v: u8) -> Option<Self> {
match v {
1 => Some(Self::ListenRequest),
2 => Some(Self::ListenOk),
3 => Some(Self::ConnectRequest),
4 => Some(Self::ConnectOk),
5 => Some(Self::IncomingConnection),
6 => Some(Self::ListenErr),
7 => Some(Self::ConnectErr),
8 => Some(Self::AcceptRequest),
_ => None,
}
}
}

/// One control message in memory (header + inline payload buffer).
///
/// The `payload` array is fixed-size so `ControlFrame` is `Copy` and lives
/// happily in `no_std`. Only the first `payload_len` bytes are valid; use
/// [`ControlFrame::payload`] to get the live slice.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ControlFrame {
pub message_type: MessageType,
pub request_id: u32,
pub payload_len: u16,
pub payload: [u8; MAX_FRAME_PAYLOAD],
}

impl ControlFrame {
/// Build a frame, copying `payload` into the inline buffer.
///
/// Panics if `payload.len() > MAX_FRAME_PAYLOAD`. Callers always own the
/// payload, so this is a programming bug, not a runtime input error.
pub fn new(message_type: MessageType, request_id: u32, payload: &[u8]) -> Self {
assert!(payload.len() <= MAX_FRAME_PAYLOAD, "payload too large");

let mut out = Self {
message_type,
request_id,
payload_len: payload.len() as u16,
payload: [0u8; MAX_FRAME_PAYLOAD],
};
out.payload[..payload.len()].copy_from_slice(payload);
out
}

/// The valid prefix of `self.payload`.
pub fn payload(&self) -> &[u8] {
&self.payload[..self.payload_len as usize]
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels weirdly structured. It seems like there should be an enum Message that contains the message type and payload (parsed into semantically meaningful structures), and that enum should implement all the encode/decode logic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants