Networking Communication Protocol #2
Conversation
3363f73 to
09d57e3
Compare
Akshay-Srivatsan
left a comment
There was a problem hiding this comment.
I haven't gone through this fully, but I think in general this is more code than necessary and some structural changes would clean it up a lot. Namely, modelling the protocol directly in Rust structs [1], and more clearly separating the different layers (like the CS 144 codebase). The protocol docs kind of reflect this; they don't have a separation between layers right now, and describe the entire thing as "the protocol" even though a bunch of the parts are independent.
There's also a scattered cases where errors are handled inconsistently (a mix of panicking, returning Results, and busy-waiting), which should probably be cleaned up.
[1]: e.g., the messages could look something like:
enum ControlRequest {
Listen { request_id: u32, endpoint: Endpoint },
Connect { request_id: u32, endpoint: Endpoint },
Accept { request_id: u32, listener_id: u32 },
}
enum ControlReply {
ListenOk { request_id: u32, listener_id: u32 },
ListenErr { request_id: u32, code: u32 },
ConnectOk { request_id: u32, ready: ConnectionReady },
ConnectErr { request_id: u32, code: u32 },
AcceptOk { request_id: u32, ready: ConnectionReady },
AcceptErr { request_id: u32, code: u32 },
}And then parsing a request could be ControlRequest::try_from(bytes)? or similar.
| match pipe.write(&buf[off..n]) { | ||
| Ok(0) => return Ok(off), | ||
| Ok(m) => off += m, | ||
| Err(PipeError::WouldBlock) => {} |
There was a problem hiding this comment.
Can we avoid busy-waiting here?
| MessageType::AcceptRequest => { | ||
| let lid = AcceptListenerId::decode(frame.payload()).listener_id; | ||
| if !self.listeners.contains_key(&lid) { | ||
| return Err(MonitorError::UnknownListener(lid)); |
There was a problem hiding this comment.
Should something be communicated back to the guest here?
| pub fn decode(payload: &[u8]) -> Self { | ||
| assert!(payload.len() == 6, "endpoint payload must be 6 bytes"); | ||
| let host = [payload[0], payload[1], payload[2], payload[3]]; | ||
| let port = u16::from_le_bytes([payload[4], payload[5]]); | ||
| Endpoint::new(host, port) |
There was a problem hiding this comment.
The code structure in this file (of using decode as an infallible constructor that panics) is a bit weird. Can this use the TryFrom trait instead?
| /// Catalog of message kinds carried on the control pipe. | ||
| /// | ||
| /// Keep this list **small** and add new variants only when there's a real | ||
| /// need. Each variant is a single byte on the wire. | ||
| #[derive(Debug, Clone, Copy, PartialEq, Eq)] | ||
| #[repr(u8)] | ||
| pub enum MessageType { | ||
| /// Arca → Linux: please `bind`+`listen` on the given [`Endpoint`]. | ||
| ListenRequest = 1, | ||
| /// Linux → Arca: listener is ready; payload is [`ListenerReady`]. | ||
| ListenOk = 2, | ||
| /// Arca → Linux: please `connect` outbound to the given [`Endpoint`]. | ||
| ConnectRequest = 3, | ||
| /// Linux → Arca: outbound connect succeeded; payload is [`ConnectionReady`]. | ||
| ConnectOk = 4, | ||
| /// Linux → Arca: reply to [`MessageType::AcceptRequest`]; payload is | ||
| /// [`ConnectionReady`] with `listener_id` set. | ||
| IncomingConnection = 5, | ||
| /// Linux → Arca: bind/listen failed; payload is [`ErrPayload`]. | ||
| ListenErr = 6, | ||
| /// Linux → Arca: outbound connect failed; payload is [`ErrPayload`]. | ||
| ConnectErr = 7, | ||
| /// Arca → Linux: block until the next inbound TCP connection on this | ||
| /// listener; payload is [`AcceptListenerId`] (4 B). | ||
| /// | ||
| /// The matching reply is [`MessageType::IncomingConnection`] with the same | ||
| /// `request_id`. | ||
| AcceptRequest = 8, | ||
| } | ||
|
|
||
| impl MessageType { | ||
| /// Reverse of `as u8`, returning `None` for unknown bytes so the codec | ||
| /// can reject garbage instead of silently misinterpreting it. | ||
| pub fn from_u8(v: u8) -> Option<Self> { | ||
| match v { | ||
| 1 => Some(Self::ListenRequest), | ||
| 2 => Some(Self::ListenOk), | ||
| 3 => Some(Self::ConnectRequest), | ||
| 4 => Some(Self::ConnectOk), | ||
| 5 => Some(Self::IncomingConnection), | ||
| 6 => Some(Self::ListenErr), | ||
| 7 => Some(Self::ConnectErr), | ||
| 8 => Some(Self::AcceptRequest), | ||
| _ => None, | ||
| } | ||
| } | ||
| } | ||
|
|
||
| /// One control message in memory (header + inline payload buffer). | ||
| /// | ||
| /// The `payload` array is fixed-size so `ControlFrame` is `Copy` and lives | ||
| /// happily in `no_std`. Only the first `payload_len` bytes are valid; use | ||
| /// [`ControlFrame::payload`] to get the live slice. | ||
| #[derive(Debug, Clone, Copy, PartialEq, Eq)] | ||
| pub struct ControlFrame { | ||
| pub message_type: MessageType, | ||
| pub request_id: u32, | ||
| pub payload_len: u16, | ||
| pub payload: [u8; MAX_FRAME_PAYLOAD], | ||
| } | ||
|
|
||
| impl ControlFrame { | ||
| /// Build a frame, copying `payload` into the inline buffer. | ||
| /// | ||
| /// Panics if `payload.len() > MAX_FRAME_PAYLOAD`. Callers always own the | ||
| /// payload, so this is a programming bug, not a runtime input error. | ||
| pub fn new(message_type: MessageType, request_id: u32, payload: &[u8]) -> Self { | ||
| assert!(payload.len() <= MAX_FRAME_PAYLOAD, "payload too large"); | ||
|
|
||
| let mut out = Self { | ||
| message_type, | ||
| request_id, | ||
| payload_len: payload.len() as u16, | ||
| payload: [0u8; MAX_FRAME_PAYLOAD], | ||
| }; | ||
| out.payload[..payload.len()].copy_from_slice(payload); | ||
| out | ||
| } | ||
|
|
||
| /// The valid prefix of `self.payload`. | ||
| pub fn payload(&self) -> &[u8] { | ||
| &self.payload[..self.payload_len as usize] | ||
| } | ||
| } |
There was a problem hiding this comment.
This feels weirdly structured. It seems like there should be an enum Message that contains the message type and payload (parsed into semantically meaningful structures), and that enum should implement all the encode/decode logic.
This adds the whole control protocol end to end — everything the design in
PROTOCOL.mdcovers: wire framing, message catalog,arca-control(ArcaSession, codec),arca-monitor(Listen/Connect/ accept flow), and how data pipes get handed off afterConnectionReady.Core behavior: Arca talks to the monitor over one control pipe;
listenandconnectare request/reply; inbound TCP usesAcceptRequestsoIncomingConnectionis always a reply with the samerequest_id(no unsolicited events, no Arca stash). The monitor keeps a per-listener pending-accept queue and only runs kernelacceptwhen a wait is queued;poll_accepts,pump_once,FrameReadBuf, andserve_oneimplement the driver side.ConnectRequestblocks on the real connect.PROTOCOL.mddescribes the full spec; tests cover the crate APIs and monitor paths.Not compatible with a monitor that only pushed
IncomingConnectionwithrequest_id0 and neverAcceptRequest.