Support http2 upstream phase2#30
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 58ac29be44
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Code Review
This pull request introduces several H2 upstream enhancements, including deterministic rejection of the CONNECT method, detection of response truncation due to Content-Length mismatches, and support for te: trailers negotiation. It also implements a more granular timeout model for H2 streams, distinguishing between send stalls and response wait times. A comprehensive suite of tests has been added to verify these protocol behaviors. Review feedback identifies an opportunity to improve the te header tokenizer to handle parameters like weights, ensuring more robust detection of trailer support.
|
LGTM |
HTTP/2 upstream Phase-2: 9 correctness gates + 30 new tests
Summary
Closes 9 correctness gaps in the HTTP/2 upstream path that were deferred from PR #27 (Phase-1 MVP). Touches the codec / per-stream / sink layer only — no checkout-flow refactor (that's PR-3 / wait-queue work).
X-H2-Limitation: connect-not-supportedheader. Primary gate inProxyTransaction::DispatchH2, secondary defense-in-depth gate inUpstreamH2Connection::SubmitRequest.RESULT_TRUNCATED_RESPONSE(terminal — no retry; held-fallback machinery is deferred to PR-3).te: trailerscapture-and-re-emit for gRPC clients — header rewriter stripsteper RFC 7230 hop-by-hop rules; the H2 nv-array re-emits the synthetic token from a per-transaction flag set by an inline locale-safe ASCII tokenizer.OnRequestSubmitted()to the sink interface (default no-op virtual), wired through nghttp2'son_frame_send_callbackfiltered on HEADERS/DATA + END_STREAM. Per-stream send-stall closure is killed by generation bump when END_STREAM clears the wire; response-timeout arms via an arm-once flag coordinated between OnHeaders and OnRequestSubmitted (whichever fires first wins).Why
Phase-1 (PR #27) shipped the H2 codec and connection table but left correctness gaps that operators would only notice as silent misbehavior:
:scheme/:path, which RFC 9113 §8.5 forbids on CONNECT pseudo-headers).te: trailershad it stripped without re-emit, breaking trailer negotiation.Scope
What ships
New result codes (
include/upstream/proxy_transaction.h)RESULT_TRUNCATED_RESPONSE = -10— terminal in this PR (held-fallback retry deferred to PR-3). Maps to 502 BadGateway inMakeErrorResponse.RESULT_H2_METHOD_NOT_SUPPORTED = -11— terminal. DedicatedMakeErrorResponsebranch emitsX-H2-Limitation: connect-not-supported. Pre-routing hook inOnErrorcallsReleaseBreakerAdmissionNeutralso the rejection isn't counted as an upstream failure.// -12 reserved for PR-3's RESULT_GOAWAY_UNPROCESSED.SEND_STALL_FALLBACK_MS = 30000lifted from a function-local in the H1 send loop to a class-levelstatic constexprso both H1 and H2 paths share the sameresponse_timeout_ms == 0zero-disable semantic.CONNECT rejection (criterion 20)
ProxyTransaction::DispatchH2— early-return withRESULT_H2_METHOD_NOT_SUPPORTEDbefore any pool / lease / breaker work.UpstreamH2Connection::SubmitRequest— explicitreturn -1after warn log + sink->OnError dispatch (defense-in-depth for any future caller bypassing DispatchH2).Truncation + NO_BODY validation (criteria 47, 55, 78)
request_methodandbody_bytes_receivedfields onUpstreamH2Stream.OnHeadersCompleteextends NO_BODY framing to include status 204/304 + HEAD requests.OnDataChunkRecvCallbackStep 1.5 validates pre-dispatch: rejects body bytes on NO_BODY framing; rejects DATA whose length exceedsexpected_length - body_bytes_received. Rejection routes throughself->ResetStream(stream_id)so the existingin_receive_data_guard defers the inline FlushSend.OnStreamClose(NO_ERROR)branch surfaces CL short-reads asRESULT_TRUNCATED_RESPONSEinstead of dispatchingOnComplete.Defense-in-depth caveat (called out in
docs/http2_upstream.mdand the internal reference doc): nghttp2's HTTP messaging enforcement (default-on) intercepts CL/NO_BODY violations BEFORE invoking user callbacks, routing them throughOnStreamClose's non-NO_ERROR branch asRESULT_UPSTREAM_DISCONNECT. The application-level Step 1.5 and CL-short-read-on-NO_ERROR checks are unreachable on the standardnghttp2_session_client_newpath; they are retained as a backstop for future code paths that opt out of enforcement (e.g. vianghttp2_option_set_no_http_messagingfor streaming protocols). Operator dashboards see truncation in the disconnect bucket today, not a dedicated truncation bucket. Comments at the two callsites flag this clearly so future contributors know which layer they're working under.te: trailerscapture (criteria 69, 72)ProxyTransactionconstructor body adds an inline tokenizer that runs BEFOREHeaderRewriter::RewriteRequeststrips alltevalues:if (c >= 'A' && c <= 'Z') c |= 0x20) —std::tolowerwould corrupt Turkish locale ('I' → 'ı').client_te_trailers_bool when any token equals "trailers".The H2 outbound nv-array build re-emits a synthetic
te: trailersfrom this flag AFTER the rewriter strip pass:UpstreamH2Connection::SubmitRequestsignature gainsbool client_te_trailers = falseas the last positional param (defaulted to keep test callers compiling).Send-stall + response-timeout handoff (criteria 23, 70)
New
OnRequestSubmitted()virtual onUpstreamResponseSink— default no-op so non-H2 sinks ignore. Fires fromOnFrameSendCallbackinupstream_h2_connection.cc(registered vianghttp2_session_callbacks_set_on_frame_send_callback) on HEADERS/DATA frames carrying END_STREAM.ProxyTransaction::DispatchH2initializes ALL H2 state BEFORE callingh2->SubmitRequest:This resolves the synchronous-fire race for bodyless requests where nghttp2 inline-flushes HEADERS+END_STREAM during
SubmitRequest's internalFlushSend. If state init were deferred to after submit,OnRequestSubmitted'sif (!h2_path_) return;guard would silently drop the kill of the send-stall closure — spuriousOnError(UPSTREAM_DISCONNECT)~30s later.OnRequestSubmitteditself:h2_send_stall_generation_(kills the closure queued in DispatchH2).SENDING_REQUEST → AWAITING_RESPONSEif state hasn't already advanced (early peer reply).ArmResponseTimeout()if!h2_response_timeout_armed_.OnHeaders(H2 branch — distinct from H1) does NOT poison the connection on early-final-headers (per-stream isolation) and arms response-timeout via the same arm-once flag.CleanupH2 branch enforces strict ordering:h2_send_stall_generation_(invalidates queued closure).ClearResponseTimeout()— keys onh2_path_to bumph2_response_timeout_generation_; MUST run withh2_path_still true.h2_response_timeout_armed_ = false.h2_path_ = false(last).Submit-failure rollback
If
SubmitRequestreturns failure AFTER the inline FlushSend has already firedOnRequestSubmittedsynchronously (and armed the response-timeout closure), the rollback bumps BOTH generation counters and resets the arm-once flag:State stays untouched —
AttemptCheckout(called byMaybeRetry's deferred-retry timer and immediate-retry branch) resetsstate_before the next attempt.Tests (commits 8 from the implementation sequence)
test/h2_upstream_test.hgrows from 63 to 93 tests:All tests follow the dtor-defends-FailAllStreams lifetime pitfall:
RecordingSinkdeclared BEFOREUpstreamH2Connectionso reverse-destruction order is correct.CI updates
.github/workflows/weekly-valgrind.yml—h2_upstreamadded to the suite list (it's a protocol/lifecycle suite — sink lifecycle, generation counters, nghttp2 heap; weekly valgrind catches reads-of-uninitialized that ASan misses).ci.yml::build-linux-tsan-restandbuild-macosalready hadh2_upstream(TSan and OS-sensitivity coverage — no edit needed).Public docs (
docs/http2_upstream.md)Two new entries under
## Caveats:RESULT_UPSTREAM_DISCONNECTbucket (nghttp2 messaging enforcement intercepts before our checks); guidance to correlate with upstream-side response logs if they need to distinguish "peer reset / TCP drop" from "Content-Length short read".Test plan
make -j4builds clean../test_runner— 1257/1257 passing../test_runner h2_upstream— 93/93 passing (was 63/63).build-linux-gcc,build-linux-clang,build-linux-asan,build-linux-tsan-heavy,build-linux-tsan-rest,build-macos. (The weekly valgrind job will exerciseh2_upstreamon its next Sunday run.)What's deferred (PR-3)
RESULT_TRUNCATED_RESPONSEis terminal. Retry requires the partial-response replay buffer (paused_buffer_ + live-mirrorpending_retryable_5xx_body_+ drain backpressure + deferred terminal dispatch +OnBodyChunk == CLOSEDterminal). Coupled to PR-3's wait-queue / replacement-connect machinery.RESULT_GOAWAY_UNPROCESSEDretry within one tick (criterion 54) — reserved result code -12.