fix(pb-envoy): reduce round trips for ws connection#4603
Conversation
|
PR Review: fix(pb-envoy): reduce round trips for ws connection This is a well-structured optimization that eliminates a blocking round trip during WebSocket connection establishment by moving version and envoy_key from an initial protocol message to URL query parameters. The concurrent tokio::try_join! is clean. A few issues are worth addressing. BUG 1: is_serverless hardcoded to false in Conn The computed is_serverless value is correctly used to call report_success, but never propagated to the Conn struct (is_serverless: false is hardcoded). This means the guard in ws_to_tunnel_task.rs that calls pegboard::ops::envoy::expire only for serverful envoys on ToRivetStopping will incorrectly fire for serverless envoys too, since conn.is_serverless is always false. Fix: use is_serverless (the computed variable) in the struct literal. BUG 2: Double util::timestamp::now() call for create_ts In the new-envoy branch of the DB transaction, create_ts is assigned from one call to timestamp::now(), but tx.write calls timestamp::now() again for the stored value. These two timestamps will differ. Since create_ts is later used to construct ActiveEnvoyKey, ActiveEnvoyByNameKey, and EnvoyLoadBalancerIdxKey, those keys will not match what was written to create_ts_key. MINOR 1: Misleading metric name RECEIVE_INIT_PACKET_DURATION now measures namespace resolution time rather than the time to receive an init packet (no longer a blocking step). The name should be updated (e.g. NAMESPACE_RESOLVE_DURATION) to keep dashboards and alerts accurate. MINOR 2: prepopulate_actor_names always Some in client prepopulate_actor_names: Some(prepopulate_map) is sent even when the map is empty, triggering the write loop in handle_metadata unconditionally. Sending None when the map is empty avoids the unnecessary DB transaction. NOTE: envoy_key now in URL query string Moving envoy_key from a binary protocol message to a URL query parameter means it may appear in access logs on proxies or load balancers. Since these are server-to-server connections over TLS this is a low-risk trade-off, but worth auditing log retention and scrubbing policies to ensure keys are not persisted in plain-text logs. Summary: The core optimization is sound and the protocol rename (ToRivetInit to ToRivetMetadata) is cleaner semantically. BUG 1 (is_serverless hardcode) and BUG 2 (double timestamp::now()) should be followed up on. |
cd89bc9 to
f4762f2
Compare
f171f60 to
db25562
Compare
db25562 to
c694cb5
Compare
002100d to
fc3149f
Compare
c694cb5 to
91328b0
Compare
91328b0 to
2a54f2e
Compare
fc3149f to
26f63c1
Compare
Merge activity
|
26f63c1 to
cfda19b
Compare

Description
Please include a summary of the changes and the related issue. Please also include relevant motivation and context.
Type of change
How Has This Been Tested?
Please describe the tests that you ran to verify your changes.
Checklist: