esi: replace _findOpeningTag with memchr by pixitha · Pull Request #13173 · apache/trafficserver

pixitha · 2026-05-18T22:08:18Z

Problem

EsiParser::_findOpeningTag uses a hand-rolled two-state-machine byte loop
to scan response bodies for <esi: and <!--esi opening tags. On a CDN
edge server profiled under production load, this function appeared as a
top-5 CPU leaf, ahead of OpenSSL EC math and zlib.

The root cause is that the loop is unconditionally O(n) at ~2 ns/byte — it
checks every byte even when the response body contains no ESI tags at all,
which is the common case on a CDN edge.

There is also a correctness issue in the original: the KMP-failure comment
in the source notes that the state machine mishandles sequences like
<e<esi: — the first <e advances a partial-match counter but then
fails, and the recovery doesn't rewind to the second <. The new
implementation avoids this entirely.

Solution

Replace the loop with memchr + memcmp:

memchr locates the next < — delegates to the platform's optimized
libc implementation (e.g. __memchr_avx2 on glibc x86-64), which
processes 128 bytes per iteration in an AVX2 unrolled loop.
memcmp verifies the candidate anchor (<esi: or <!--esi).
On no-match, advance past the < and repeat.

Partial-match (chunk boundary) paths are preserved: when the buffer ends
mid-prefix, the function returns PARTIAL_MATCH so the caller can
accumulate more data before deciding.

The only interface change is #include <cstring> — all call sites, return
types, and semantics are identical.

Performance

Note: synthetic microbenchmark. Benchmark runs each scenario over a
256 KiB body for 1.0 s. Real response bodies are shorter and vary in
tag density; treat these as directional numbers.

Measured on E5-2683 v4 Broadwell, 2.10 GHz, RHEL 8, glibc 2.28, gcc -O2:

Scenario	Baseline GB/s	memchr GB/s	Speedup
text-only (no `<`)	0.503	36.871	73×
typical HTML (~1.5% `<`, no ESI)	0.475	19.241	40×
html-sparse	0.470	18.896	40×
html-dense	0.437	14.853	34×
`<!--esi` comment nodes	0.431	18.362	43×
pathological (5% bare `<`)	0.290	1.512	5×

The pathological case is the worst case for this approach — every false <
triggers a memcmp. Real CDN response bodies fall in the typical HTML range.

Testing

Unit tests: 870 assertions pass in parser_test. Four new SECTION
blocks cover boundary conditions specific to this implementation:
exact-length prefix at chunk end for both <esi: and <!--esi,
<!--esi without required trailing whitespace, and multiple false <
anchors before a valid tag.
Autests: esi, esi_304, and esi_nested_include all pass (built
inside ci.trafficserver.apache.org/ats/fedora:42, CMake preset
ci-fedora-autest).
Differential correctness check: 10 million random inputs were run
through both the old and new implementations — 0 mismatches across all
MATCH_TYPE returns, positions, and is_html_comment_node flags.

Replace the hand-rolled two-state-machine loop with memchr + memcmp. memchr delegates scanning to the platform's optimized implementation (e.g. __memchr_avx2 on glibc x86-64) to skip non-'<' bytes, then memcmp verifies each candidate anchor. This eliminates the KMP-failure limitation noted in the original comment: the old implementation could mishandle opening sequences like '<e<esi'; the new one finds the correct anchor unconditionally. Adds four parser_test sections covering boundary cases specific to the new implementation: exact-length prefix at chunk end for both <esi: and <!--esi, <!--esi without required trailing whitespace, and multiple false '<' anchors before a valid tag.

Copilot

Pull request overview

Replaces the hand-rolled byte-by-byte state machine in EsiParser::_findOpeningTag with a memchr+memcmp based scan that delegates the common case (skipping non-< bytes) to libc's optimized implementation. This also fixes a known correctness limitation of the previous state machine (mishandling of repeated opening sequences like <e<esi).

Changes:

Rewrite _findOpeningTag to use memchr to locate < and memcmp to verify the <esi: / <!--esi anchors, preserving the COMPLETE_MATCH / PARTIAL_MATCH / NO_MATCH semantics and partial-match-at-chunk-boundary behavior.
Add <cstring> include for memchr/memcmp.
Add four new parser test SECTIONs covering exact-length prefixes at chunk boundaries, <!--esi without trailing whitespace, and multiple false < anchors before a valid tag.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
plugins/esi/lib/EsiParser.cc	Replaces the two-state-machine scan loop with a memchr/memcmp-based implementation and updates the function's leading comment.
plugins/esi/test/parser_test.cc	Adds boundary-condition test sections for the new implementation.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

zwoop · 2026-05-19T20:31:11Z

-      if (++i_esi == ESI_TAG_PREFIX_LEN) {
-        is_html_comment_node = false;
-        opening_tag_pos      = i_data - i_esi + 1;
+  const char *const buf     = data.data();


One thought here, and I asked Claude to make an example: Did you consider pulling in and using libswoc TextView instead? It excels at things like "string parsing". Not a show stopper, and could be a future consideration.

I asked Claude to write a suggested use with TextView (not tested ...)

swoc::TextView view{data.data() + start_pos, data.size() - start_pos}; const swoc::TextView esi_prefix{ESI_TAG_PREFIX, ESI_TAG_PREFIX_LEN}; const swoc::TextView html_prefix{HTML_COMMENT_NODE_INFO.tag_suffix, size_t(HTML_COMMENT_NODE_INFO.tag_suffix_len)}; while (!view.empty()) { size_t pos = view.find('<'); if (pos == swoc::TextView::npos) return NO_MATCH; view.remove_prefix(pos); size_t abs_pos = data.size() - view.size(); if (view.starts_with(esi_prefix)) { /* COMPLETE_MATCH esi */ } if (view.size() > html_prefix.size() && view.starts_with(html_prefix)) { char ch = view[html_prefix.size()]; if (ch==' '||ch=='\t'||ch=='\r'||ch=='\n') { /* COMPLETE_MATCH html */ } } if (view.size() < esi_prefix.size() && esi_prefix.starts_with(view)) { /* PARTIAL */ } if (view.size() <= html_prefix.size() && html_prefix.starts_with(view)) { /* PARTIAL */ } view.remove_prefix(1); }

I was mostly unaware of what was offered in libswoc, so that's def why I didn't use it. I did some quick benchmarks with a synthetic comparison between this vs TextView and libswoc is as fast if not slightly faster sometimes. Its within margin of error, so Its more up to you guys if you want it re-written that way now, or something we should tackle later?

zwoop · 2026-05-19T20:33:41Z

+  const char *const buf     = data.data();
+  const size_t      total   = data.size();
+  const size_t      esi_len = ESI_TAG_PREFIX_LEN;
+  const size_t      hlen    = HTML_COMMENT_NODE_INFO.tag_suffix_len;


Do you really need this hlen local? The compiler will optimize it away, but I honestly would just like to see HTML_COMMENT_NODE_INFO.tag_suffix_len where you need this, specially since you memcmp from HTML_COMMENT_NODE_INFO.tag_suffix already (no shadow variable for the string).

Same with the esi_len I think, neither are blockers.

corrected in latest commit

Remove the hlen and esi_len local aliases so the length argument to each memcmp call is read from the same named constant as the string argument, consistent with how ESI_TAG_PREFIX and HTML_COMMENT_NODE_INFO.tag_suffix were already written in full. Addresses review feedback from zwoop on PR apache#13173. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

bneradt · 2026-05-19T22:46:22Z

[approve ci]

shukitchan requested a review from Copilot May 18, 2026 22:11

shukitchan self-assigned this May 18, 2026

shukitchan added the esi esi plugin label May 18, 2026

Copilot started reviewing on behalf of shukitchan May 18, 2026 22:11 View session

shukitchan added this to the 11.0.0 milestone May 18, 2026

Copilot AI reviewed May 18, 2026

View reviewed changes

Comment thread plugins/esi/lib/EsiParser.cc Outdated

fixing style issue

fdf09bf

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

bryancall requested a review from shukitchan May 18, 2026 22:37

bryancall assigned pixitha and unassigned shukitchan May 18, 2026

bryancall requested a review from zwoop May 18, 2026 22:38

apache deleted a comment from shukitchan May 18, 2026

zwoop previously approved these changes May 19, 2026

View reviewed changes

pixitha dismissed zwoop’s stale review via 6f60f18 May 19, 2026 21:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

esi: replace _findOpeningTag with memchr#13173

esi: replace _findOpeningTag with memchr#13173
pixitha wants to merge 3 commits into
apache:masterfrom
pixitha:esi-find-opening-tag-fast-path

pixitha commented May 18, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

zwoop May 19, 2026 •

edited

Loading

Uh oh!

pixitha May 21, 2026

Uh oh!

zwoop May 19, 2026

Uh oh!

pixitha May 19, 2026

Uh oh!

bneradt commented May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Conversation

pixitha commented May 18, 2026

Problem

Solution

Performance

Testing

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

zwoop May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pixitha May 21, 2026

Choose a reason for hiding this comment

Uh oh!

zwoop May 19, 2026

Choose a reason for hiding this comment

Uh oh!

pixitha May 19, 2026

Choose a reason for hiding this comment

Uh oh!

bneradt commented May 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

zwoop May 19, 2026 •

edited

Loading